
1

Multiple Correlation versus Multiple Regression.

Describes differences between multiple correlation analysis (MCA) and multiple regression analysis (MRA), showing how these approaches involve different research questions and study designs, different inferential approaches, different analysis strategies, and different reported information. (SLD)

Huberty, Carl J.

2003-01-01

2

MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM

Directory of Open Access Journals (Sweden)

Full Text Available This paper analyses the relationship between the dependent variable, GDP in the hotels and restaurants sector, and the following independent variables: overnight stays in the establishments of touristic reception, arrivals in the establishments of touristic reception, and investments in the hotels and restaurants sector, over the analysis period 1995-2007. With the multiple regression analysis I found that investments and tourist arrivals are significant predictors of the GDP dependent variable. Based on these results, I identified those components of the marketing mix which, in my opinion, require investment and which could contribute to the positive development of tourist arrivals in the establishments of touristic reception.

Erika KULCSÁR

2009-12-01

3

Coding for Direct Interpretation in Multiple Regression Analysis.

A general procedure is presented for generating code values for a qualitative variable in multiple linear regression analyses that result in directly interpretable estimates of interest. The basic approach, in viewing ANOVA as a multiple regression problem, is to derive quantitative code values for the various levels of the qualitative ANOVA…

Serlin, Ronald C.; Levin, Joel R.

4

Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis

Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis, however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…

Williams, Ryan T.

2012-01-01

5

Analysis of γ spectra in airborne radioactivity measurements using multiple linear regressions

International Nuclear Information System (INIS)

This paper describes the calculation of the net peak counts of the nuclide 137Cs at 662 keV in γ spectra from airborne radioactivity measurements using multiple linear regression. A mathematical model is founded by analysing every factor that contributes to the Cs peak counts in the spectra, and a multiple linear regression function is established. The calculation adopts stepwise regression, and insignificant factors are eliminated by the F test. The regression results and their uncertainty are calculated using least squares estimation, from which the Cs peak net counts and their uncertainty are obtained. The analysis results for an experimental spectrum are displayed. The influence of energy shift and energy resolution on the result is discussed. In comparison with the spectrum stripping method, the multiple linear regression method needs no stripping ratios, the result depends only on the counts in the Cs peak, and the calculated uncertainty is reduced. (authors)
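The least squares estimation step described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the authors' spectrum model; the `ols` helper and the example values are assumptions.

```python
# Minimal sketch of multiple linear regression by least squares:
# solve the normal equations (X^T X) beta = X^T y with Gaussian elimination.
def ols(X, y):
    n, p = len(X), len(X[0])
    # Build X^T X and X^T y.
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)] for i in range(p)]
    b = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back-substitution.
    beta = [0.0] * p
    for i in range(p - 1, -1, -1):
        beta[i] = (b[i] - sum(A[i][j] * beta[j] for j in range(i + 1, p))) / A[i][i]
    return beta

# Synthetic example: y = 2 + 3*x1 - x2, intercept via a leading column of ones.
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
y = [2 + 3 * x1 - x2 for _, x1, x2 in X]
beta = ols(X, y)
print([round(v, 6) for v in beta])  # → [2.0, 3.0, -1.0]
```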

6

Quantitative electron microscope autoradiography: application of multiple linear regression analysis

International Nuclear Information System (INIS)

A new method for the analysis of high resolution EM autoradiographs is described. It identifies labelled cell organelle profiles in sections on a strictly statistical basis and provides accurate estimates for their radioactivity without the need to make any assumptions about their size, shape and spatial arrangement. (author)

7

Multiple regression analysis of Jominy hardenability data for boron treated steels

Energy Technology Data Exchange (ETDEWEB)

The relations between the chemical composition and the hardenability of boron-treated steels have been investigated using multiple regression analysis. A linear regression model was chosen. The free boron content that is effective for hardenability was calculated using a model proposed by Jansson. The regression analysis of 1261 steel heats provided equations that were statistically significant at the 95% level. All heats met the specification according to the Nordic producers' classification. The variation in chemical composition typically explained 80 to 90% of the variation in hardenability. Elements that did not contribute significantly to the calculated hardness according to the F test were eliminated from the regression. Carbon, silicon, manganese, phosphorus and chromium were important at all Jominy distances; nickel, vanadium, boron and nitrogen at distances above 6 mm. The regression analysis showed that very few outliers were present in the data set, i.e. data points outside four times the standard deviation. The model has been used successfully in industrial practice, replacing some of the otherwise necessary Jominy tests. (orig.)

Komenda, J. [Swedish Inst. for Metals Research, Stockholm (Sweden); Sandstroem, R. [Royal Inst. of Tech., Stockholm (Sweden); Tukiainen, M. [Fundia Wire Oy Ab, Lappohja (Finland)

1997-03-01

8

Multiple regression analysis of Jominy hardenability data for boron treated steels

International Nuclear Information System (INIS)

The relations between the chemical composition and the hardenability of boron-treated steels have been investigated using multiple regression analysis. A linear regression model was chosen. The free boron content that is effective for hardenability was calculated using a model proposed by Jansson. The regression analysis of 1261 steel heats provided equations that were statistically significant at the 95% level. All heats met the specification according to the Nordic producers' classification. The variation in chemical composition typically explained 80 to 90% of the variation in hardenability. Elements that did not contribute significantly to the calculated hardness according to the F test were eliminated from the regression. Carbon, silicon, manganese, phosphorus and chromium were important at all Jominy distances; nickel, vanadium, boron and nitrogen at distances above 6 mm. The regression analysis showed that very few outliers were present in the data set, i.e. data points outside four times the standard deviation. The model has been used successfully in industrial practice, replacing some of the otherwise necessary Jominy tests. (orig.)

9

Directory of Open Access Journals (Sweden)

Full Text Available Budiman, Arisoesilaningsih E. 2012. Predictive model of Amorphophallus muelleri growth in some agroforestry in East Java by multiple regression analysis. Biodiversitas 13: 18-22. The aim of this research was to determine multiple regression models of vegetative and corm growth of Amorphophallus muelleri Blume across age variations and habitat conditions of agroforestry in East Java. A descriptive exploratory research method was conducted by systematic random sampling at five agroforestries on four plantations in East Java: Saradan, Bojonegoro, Nganjuk and Blitar. In each agroforestry, we observed A. muelleri vegetative and corm growth at four growing ages (1, 2, 3 and 4 years old), as well as environmental variables such as altitude, vegetation, climate and soil conditions. Data were analyzed using descriptive statistics to compare A. muelleri habitats in the five agroforestries. The influence and contribution of each environmental variable to the growth of A. muelleri vegetative parts and corms were determined using multiple regression analysis in SPSS 17.0. The multiple regression models of A. muelleri vegetative and corm growth, generated from agroforestry characteristics and age, showed high validity with R2 = 88-99%. The models showed that age, monthly temperature, percentage of radiation and soil calcium (Ca) content, either simultaneously or partially, determined the growth of A. muelleri vegetative parts and corms. Based on these models, A. muelleri corms reach optimal growth after four years of cultivation and are then ready to be harvested. Additionally, the soil Ca content should reach 25.3 me.hg-1, as in Sugihwaras agroforestry, with maximal radiation of 60%.

BUDIMAN

2012-01-01

10

Data collected from the sensory test score evaluation of bottled lager beer, together with chemical components related to aging, including carbonyl compounds, higher alcohols, unsaturated fatty acids, organic acids, alpha-amino acids and dissolved oxygen, and staling evaluation indices, including the lag time of the electron spin resonance (ESR) curve, 1,1'-diphenyl-2-picrylhydrazyl (DPPH) scavenged amounts, and thiobarbituric acid (TBA) values, were used to predict the extent of aging in bottled lager beer, using both multiple linear regression and principal component analysis methods. Carbonyl compounds, higher alcohols, and the TBA value were significantly and positively correlated with the sensory evaluation of staling flavor, while lag time and DPPH scavenged amount were negatively correlated with the taste test score. Multiple regression analysis was used to fit the sensory test data using the above aging-related chemical parameters and evaluation indices as predictors. A variable selection method based on high loadings of varimax-rotated principal components was used to obtain subsets of the predominant predictor variables to be included in the regression model of beer aging, so as to eliminate the multicollinearity of the original nine variables. It was found that staling extent was influenced significantly by higher alcohols, TBA value, and DPPH scavenged amount, and the multicollinearity of the regression model was found to be weak by examining the variance inflation factors of the new predictor variables. A mathematical model of the organoleptic test score for beer aging using these three predictors was obtained by multiple linear regression, showing that the major contributors to the sensory taste of beer aging were higher alcohols, TBA index, and DPPH scavenged amount, with the adjusted R(2) of the model being 0.62. PMID:18624409
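The variance-inflation-factor check mentioned above can be sketched as follows. The data and column layout are invented for illustration, not the paper's beer-chemistry measurements: column 2 is built to be nearly collinear with columns 0 and 1, while column 3 is independent.

```python
import numpy as np

# Sketch of a multicollinearity check via variance inflation factors (VIF).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 2] = X[:, 0] + X[:, 1] + 0.01 * rng.normal(size=200)  # near-collinear column

def vif(X, j):
    """Regress column j on the remaining columns (plus intercept); VIF = 1/(1 - R^2)."""
    y = X[:, j]
    Z = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    r2 = 1 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(4)]
# Columns involved in the near-collinearity get VIFs far above the usual
# rule-of-thumb cutoff of 10; the independent column stays near 1.
print([round(v, 1) for v in vifs])
```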

Liu, Jing; Li, Qi; Dong, Jianjun; Chen, Jian; Gu, Guoxian

2008-08-27

11

REVAAM Model to determine a company's value by multiple valuation and linear regression analysis

Directory of Open Access Journals (Sweden)

Full Text Available This paper shows an alternative model to the widely used method of multiple valuation (or relative valuation) for calculating the value of a company using the Price Earnings (PE) multiple and/or the Enterprise Value to Earnings Before Interest, Taxes, Depreciation and Amortization (EV/EBITDA) multiple. When calculating multiples, analysts tend to take average multiples within an industry and apply them directly to the target company; however, we believe this practice does not consider differences among the companies being compared, even though they belong to the same sector or industry. The REVAAM Model uses linear regression to calculate adjusted PE and EV/EBITDA multiples, taking into consideration profitability factors for each multiple in order to differentiate the companies in the samples. Calculations are based on public data for US companies, but could be extended to other markets. Not only does the REVAAM Model provide a better estimate than relative valuation using simple average multiples; it can also be used to compare under/overvalued companies or sectors, and to analyze changes in multiples over time as intrinsic fundamentals change.
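A minimal sketch of the adjustment idea, under assumed inputs: regress observed PE multiples on a profitability measure (here ROE) across comparable firms, then read the target firm's multiple off the fitted line instead of using the sector average. All figures and the `roe`/`pe` names are hypothetical, not REVAAM's actual specification.

```python
import numpy as np

# Hypothetical comparables: observed PE multiples and their ROE.
roe = np.array([0.05, 0.08, 0.10, 0.12, 0.15, 0.18])
pe  = np.array([8.0, 10.5, 12.0, 13.8, 16.2, 18.5])

# Fit PE = intercept + slope * ROE across the comparables.
slope, intercept = np.polyfit(roe, pe, 1)

target_roe = 0.11                       # the target company's profitability
adjusted_pe = intercept + slope * target_roe
naive_pe = pe.mean()                    # plain sector-average multiple

# The adjusted multiple reflects the target firm's own profitability,
# rather than the undifferentiated industry average.
print(round(adjusted_pe, 2), round(naive_pe, 2))
```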

Luis G. Acosta-Calzado

2010-07-01

12

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

Directory of Open Access Journals (Sweden)

Full Text Available The growing volume of data creates an interesting challenge for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, namely factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two techniques is compared on parameters such as mean square error (MSE), R-square, adjusted R-square, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and a test of normality. These parameters have been computed using various data mining tools such as SPSS, XLstat, Stata, and MS-Excel. It is seen that for all the given datasets, factor analysis outperforms multiple linear regression. But the absolute value of prediction accuracy varied between the three datasets, indicating that the data distribution and data characteristics play a major role in choosing the correct prediction technique.
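Several of the comparison metrics listed above (MSE, R-square, adjusted R-square, RMSE) can be computed for an OLS fit as sketched below, on synthetic data rather than the study's three datasets.

```python
import numpy as np

# Synthetic regression data: known coefficients plus small noise.
rng = np.random.default_rng(1)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=n)

# OLS fit and the standard goodness-of-fit metrics.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
mse = np.mean(resid ** 2)                       # mean square error
rmse = np.sqrt(mse)                             # root mean square error
r2 = 1 - resid.var() / y.var()                  # R-square
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)   # adjusted R-square
print(round(r2, 3), round(adj_r2, 3), round(rmse, 3))
```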

R.K.Chauhan

2011-04-01

13

International Nuclear Information System (INIS)

Various types of ultrasonic techniques have been used for estimating the compressive strength of concrete structures. However, the conventional ultrasonic velocity method, which uses only the longitudinal wave, cannot determine the compressive strength of concrete structures with accuracy. In this paper, by introducing multiple parameters, e.g. shear wave velocity, longitudinal wave velocity, shear wave attenuation coefficient, longitudinal wave attenuation coefficient, mix conditions, age and curing method, multiple regression analysis was applied to the determination of the compressive strength of concrete structures. The experimental results show that shear wave velocity can estimate the compressive strength of concrete more accurately than longitudinal wave velocity, and that the error range of the estimated compressive strength can be kept within approximately ±10%.

14

Leaf pigments are key elements for plant photosynthesis and growth. Traditional manual sampling of these pigments is labor-intensive and costly, and also has difficulty capturing their temporal and spatial characteristics. The aim of this work is to estimate photosynthetic pigments at large scale by remote sensing. For this purpose, an inverse model was proposed with the aid of stepwise multiple linear regression (SMLR) analysis. Furthermore, a leaf radiative transfer model (the PROSPECT model) was employed to simulate leaf reflectance at wavelengths from 400 to 780 nm at 1 nm intervals, and these values were treated as data from remote sensing observations. Simulated chlorophyll concentration (Cab), carotenoid concentration (Car) and their ratio (Cab/Car) were each taken as targets to build regression models. In this study, a total of 4000 samples were simulated via PROSPECT with different Cab, Car and leaf mesophyll structures; 70% of these samples were used for training and the remaining 30% for model validation. Reflectance (r) and its mathematical transformations (1/r and log(1/r)) were each employed to build regression models. Results showed fair agreement between pigments and simulated reflectance, with all adjusted coefficients of determination (R2) larger than 0.8 when 6 wavebands were selected to build the SMLR model. The largest values of R2 for Cab, Car and Cab/Car were 0.8845, 0.876 and 0.8765, respectively. Meanwhile, the mathematical transformations of reflectance showed little influence on regression accuracy. We conclude that it is feasible to estimate chlorophyll, carotenoids and their ratio from a statistical model with leaf reflectance data.
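The stepwise selection idea can be sketched as a greedy forward search: add, at each step, the waveband that most improves R-square, and stop when the gain becomes negligible. The data below are random draws, not PROSPECT-simulated reflectance, and the stopping threshold is an illustrative choice.

```python
import numpy as np

# Synthetic "reflectance" matrix with 10 candidate wavebands; only
# columns 3 and 7 actually drive the target variable.
rng = np.random.default_rng(2)
n, p = 300, 10
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 3] - 1.5 * X[:, 7] + 0.05 * rng.normal(size=n)

def r2_of(cols):
    """R-square of an OLS fit of y on the given columns plus an intercept."""
    Z = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return 1 - resid.var() / y.var()

selected, best = [], 0.0
while True:
    gains = {c: r2_of(selected + [c]) for c in range(p) if c not in selected}
    c, r2 = max(gains.items(), key=lambda kv: kv[1])
    if r2 - best < 1e-3:      # stop when the improvement is negligible
        break
    selected.append(c)
    best = r2

print(sorted(selected), round(best, 4))  # recovers the two informative wavebands
```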

Liu, Pudong; Shi, Runhe; Wang, Hong; Bai, Kaixu; Gao, Wei

2014-10-01

15

Purpose: To explore variables associated with self-reported communicative participation in a sample (n = 498) of community-dwelling adults with multiple sclerosis (MS). Method: A battery of questionnaires was administered online or on paper per participant preference. Data were analyzed using multiple linear backward stepwise regression. The…

Baylor, Carolyn; Yorkston, Kathryn; Bamer, Alyssa; Britton, Deanna; Amtmann, Dagmar

2010-01-01

16

International Nuclear Information System (INIS)

In this study, thermodynamic and statistical analyses were performed on a gas turbine system, to assess the impact of some important operating parameters like CIT (Compressor Inlet Temperature), PR (Pressure Ratio) and TIT (Turbine Inlet Temperature) on its performance characteristics such as net power output, energy efficiency, exergy efficiency and fuel consumption. Each performance characteristic was enunciated as a function of operating parameters, followed by a parametric study and optimization. The results showed that the performance characteristics increase with an increase in the TIT and a decrease in the CIT, except fuel consumption which behaves oppositely. The net power output and efficiencies increase with the PR up to certain initial values and then start to decrease, whereas the fuel consumption always decreases with an increase in the PR. The results of exergy analysis showed the combustion chamber as a major contributor to the exergy destruction, followed by stack gas. Subsequently, multiple regression models were developed to correlate each of the response variables (performance characteristic) with the predictor variables (operating parameters). The regression model equations showed a significant statistical relationship between the predictor and response variables. (author)

17

In a typical randomized clinical trial, a continuous variable of interest (e.g., bone density) is measured at baseline and fixed postbaseline time points. The resulting longitudinal data, often incomplete due to dropouts and other reasons, are commonly analyzed using parametric likelihood-based methods that assume multivariate normality of the response vector. If the normality assumption is deemed untenable, then semiparametric methods such as (weighted) generalized estimating equations are considered. We propose an alternate approach in which the missing data problem is tackled using multiple imputation, and each imputed dataset is analyzed using robust regression (M-estimation; Huber, 1973, Annals of Statistics 1, 799-821.) to protect against potential non-normality/outliers in the original or imputed dataset. The robust analysis results from each imputed dataset are combined for overall estimation and inference using either the simple Rubin (1987, Multiple Imputation for Nonresponse in Surveys, New York: Wiley) method, or the more complex but potentially more accurate Robins and Wang (2000, Biometrika 87, 113-124.) method. We use simulations to show that our proposed approach performs at least as well as the standard methods under normality, but is notably better under both elliptically symmetric and asymmetric non-normal distributions. A clinical trial example is used for illustration. PMID:22994905
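Rubin's combining rules mentioned above reduce to pooling the per-imputation point estimates and variances. The numbers below are invented for illustration, not results from the trial in the abstract.

```python
import statistics

# Per-imputation results from m analyses of imputed datasets (hypothetical).
estimates = [1.02, 0.97, 1.05, 0.99, 1.01]      # point estimates
variances = [0.04, 0.05, 0.045, 0.042, 0.048]   # squared standard errors

m = len(estimates)
q_bar = statistics.mean(estimates)        # pooled point estimate
u_bar = statistics.mean(variances)        # within-imputation variance
b = statistics.variance(estimates)        # between-imputation variance
total_var = u_bar + (1 + 1 / m) * b       # Rubin's total variance

print(round(q_bar, 3), round(total_var, 4))  # → 1.008 0.0461
```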

Mehrotra, Devan V; Li, Xiaoming; Liu, Jiajun; Lu, Kaifeng

2012-12-01

18

Anomalous particle pinch and scaling of v_in/D based on transport analysis and multiple regression

Predictions of density profiles in current tokamaks and ITER require a validated scaling relation for v_in/D, where v_in is the anomalous inward drift velocity and D is the anomalous diffusion coefficient. Transport analysis is necessary for determining the anomalous particle pinch from measured density profiles and for separating the impact of particle sources. A set of discharges in ASDEX Upgrade, DIII-D, JET and ASDEX is analysed using a special version of the 1.5-D BALDUR transport code. Profiles of ρ_s v_in/D, with ρ_s the effective separatrix radius, five other dimensionless parameters and many further quantities in the confinement zone are compiled, resulting in the dataset VIND1.dat, which covers a wide parameter range. Weighted multiple regression is applied to the ASDEX Upgrade subset, which leads to a two-term scaling ρ_s v_in(x')/D(x') = 0.0432 [ (L_Te(x̄')/ρ_s)^(-2.58) + 7.13 U_L^1.55 …

Becker, G.; Kardaun, O.

2007-01-01

19

Investigations upon the indefinite rolls quality assurance in multiple regression analysis

International Nuclear Information System (INIS)

The quality of rolling rolls has been enhanced mainly through improvements in the chemical composition of roll materials. Achieving an optimal chemical composition is a technically efficient way to assure the exploitation properties, the material from which rolling mill rolls are manufactured being of great importance in this sense. This paper continues the presentation of the scientific results of our experimental research in the area of rolling rolls. The basic research contains concrete elements of immediate practical utility for metallurgical enterprises, aimed at improving roll quality and, ultimately, at increasing durability and safety in exploitation. The paper presents an analysis of the chemical composition and its influence on the mechanical properties of indefinite cast iron rolls. We present mathematical correlations and graphical interpretations between the hardness (on the working surface and on the necks) and the chemical composition. The double and triple correlations are really helpful in foundry practice, as they allow us to determine variation boundaries for the chemical composition with a view to obtaining optimal hardness values. We suggest a mathematical interpretation of the influence of the chemical composition on the hardness of these indefinite rolling rolls. To this end we use multiple regression analysis, which can be an important statistical tool for investigating relationships between variables. The mathematical modelling results can be described through a number of multi-component equations determined for spaces with 3 and 4 dimensions. The regression surfaces, level curves and volumes of variation can also be represented and interpreted by technologists as correlation diagrams between the analyzed variables.
These research results can be used by the engineering teams of foundries and rolling mill sectors for quality assurance of rolls from the production phase onwards, as well as during their exploitation, leading inevitably to the quality assurance of the produced laminates. (Author) 16 refs.

20

A number of parameters of the solar wind and magnetosphere are correlated with the production of relativistic electrons. These include the level of relativistic electrons at storm onset, seed electron flux, solar wind velocity and number density, IMF Bz, the AE and Kp indices, and ULF and VLF wave power. However, as all these variables may also be intercorrelated with each other, simple correlations between each predictor variable and electron flux may not tell the whole story. We identified 166 storms and substorms (1992-2002) with at least 72 storm-free hours after the minimum Dst. We obtained hourly averaged electron fluxes for relativistic electrons (>1.5 MeV) and seed electrons (100 keV) from several spacecraft (Los Alamos National Laboratory geosynchronous energetic particle instruments). For each storm or substorm event, we found the log10 maximum relativistic electron flux for each satellite following the end of the main phase. No spacecraft was in operation for the entire period, so we averaged over all available satellites in each hour. As each satellite was calibrated differently, we first converted each observation to a standardized score with mean 0 and standard deviation 1. We performed a stepwise multiple regression using solar wind velocity and flow angles (both latitude and longitude), number density, the standard deviations of velocity and number density, a ULF index (Kozyreva et al., 2007, Planet. Space Sci., 55, 755-769), VLF power (0.5-1.0 kHz), AE, Kp, Dst, IMF Bz, Bz RMS standard deviation, and log10 of seed electron and onset relativistic electron fluxes as predictor variables. We also performed regressions entering physical variables first (e.g., solar wind velocity) and adding indices second (e.g., Dst) to determine whether physical variables were more predictive. We subsequently performed a path analysis, showing the relationships between the predictor variables as well as their influence on electron flux following each event.
The rise in relativistic electron flux following storms and substorms is best explained by a set of variables rather than by one or two factors. Vsw, ULF, main phase seed electron flux, and either IMF Bz or Dst are the most significant explanatory variables. AE (relating to substorm activity) and Kp show somewhat less influence.
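The standardization step described above, z-scoring each satellite's series before averaging across satellites, can be sketched as follows. The flux values are made up; only the mechanics of the transformation are illustrated.

```python
import statistics

# Two satellites observing the same storms with different calibrations
# (hypothetical values in arbitrary flux units).
sat_a = [2.1, 2.5, 3.0, 2.8, 2.6]
sat_b = [10.0, 12.5, 15.1, 14.0, 13.2]

def zscores(xs):
    """Convert a series to standardized scores with mean 0 and sd 1."""
    mu, sd = statistics.mean(xs), statistics.stdev(xs)
    return [(x - mu) / sd for x in xs]

za, zb = zscores(sat_a), zscores(sat_b)
# After standardization the differently calibrated instruments are
# comparable, so an hourly average across satellites is meaningful.
combined = [(a + b) / 2 for a, b in zip(za, zb)]
print([round(v, 2) for v in combined])
```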

Simms, L. E.; Engebretson, M. J.; Pilipenko, V.; Reeves, G. D.

2012-12-01

21

Regression analysis by example

Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association. Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

Chatterjee, Samprit

2012-01-01

22

Investigations upon the indefinite rolls quality assurance in multiple regression analysis

Directory of Open Access Journals (Sweden)

Full Text Available The quality of rolling rolls has been enhanced mainly through improvements in the chemical composition of roll materials. Achieving an optimal chemical composition is a technically efficient way to assure the exploitation properties, the material from which rolling mill rolls are manufactured being of great importance in this sense. This paper continues the presentation of the scientific results of our experimental research in the area of rolling rolls. The basic research contains concrete elements of immediate practical utility for metallurgical enterprises, aimed at improving roll quality and, ultimately, at increasing durability and safety in exploitation. The paper presents an analysis of the chemical composition and its influence on the mechanical properties of indefinite cast iron rolls. We present mathematical correlations and graphical interpretations between the hardness (on the working surface and on the necks) and the chemical composition. The double and triple correlations are really helpful in foundry practice, as they allow us to determine variation boundaries for the chemical composition with a view to obtaining optimal hardness values. We suggest a mathematical interpretation of the influence of the chemical composition on the hardness of these indefinite rolling rolls. To this end we use multiple regression analysis, which can be an important statistical tool for investigating relationships between variables. The mathematical modelling results can be described through a number of multi-component equations determined for spaces with 3 and 4 dimensions. The regression surfaces, level curves and volumes of variation can also be represented and interpreted by technologists as correlation diagrams between the analyzed variables.
These research results can be used by the engineering teams of foundries and rolling mill sectors for quality assurance of rolls from the production phase onwards, as well as during their exploitation, leading inevitably to the quality assurance of the produced laminates.

With this work, the quality assurance of rolling rolls has been achieved, mainly owing to the contribution of a specific chemical composition to the roll materials. This improved chemical composition can effectively develop the exploitation properties with which these rolling rolls can be manufactured, offering better results. The work is presented scientifically, reporting the results of experimental research in the area of rolling rolls. This research contains sufficient elements of immediate practical utility for metallurgical companies, so as to improve the quality of rolling rolls. The main objective is to increase durability and safety in exploitation. The paper presents an analysis of the chemical composition and of its influence on the mechanical properties of indefinite rolling rolls. We present some mathematical correlations, together with a graphical interpretation, between the hardness (on the working surface and the neck) and the chemical composition. Determining the double and triple correlations, which are really useful in foundry practice, allows us to establish the limits of variation of the chemical composition with a view to obtaining optimal hardness values. A mathematical interpretation of the influence of the chemical composition on the hardness of these rolling rolls can be observed. In this sense, we perform the multiple regression analysis, which can provide an important statistical instrument for investigating the relationships between the variables. The mathematically modelled results can be described by means of a ser

Kiss, I.

2012-04-01

23

Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelets and multiple linear regression (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first selected to decompose an original time series into several subseries at different scales. Then, principal component analysis (PCA) is used in processing...

Ani Shabri; Ruhaidah Samsudin

2014-01-01

24

Understanding logistic regression analysis

Logistic regression is used to obtain odds ratios in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using ex...
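The odds-ratio interpretation can be illustrated with a single binary exposure, where the slope a logistic model would recover equals the log of the 2x2-table odds ratio. The counts below are hypothetical.

```python
import math

# Hypothetical 2x2 table: events and non-events by exposure status.
exposed_events, exposed_nonevents = 30, 70
unexposed_events, unexposed_nonevents = 10, 90

odds_exposed = exposed_events / exposed_nonevents
odds_unexposed = unexposed_events / unexposed_nonevents
odds_ratio = odds_exposed / odds_unexposed

# With one binary predictor, the logistic-regression coefficient for the
# exposure is exactly the log odds ratio.
beta1 = math.log(odds_ratio)
print(round(odds_ratio, 3), round(beta1, 3))  # → 3.857 1.35
```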

Sperandei, Sandro

2014-01-01

25

Multiple Regressions in Analysing House Price Variations

Directory of Open Access Journals (Sweden)

Full Text Available The application of rigorous statistical analysis to aid investment decision making is gaining momentum in the United States of America as well as the United Kingdom. Nonetheless, in Malaysia the response from local academics has been rather slow, and slower still among practitioners. This paper illustrates how Multiple Regression Analysis (MRA) and its extension, Hedonic Regression Analysis, have been used to explain price variation for selected houses in Malaysia. Each attribute theoretically identified as a price determinant is priced, and the perceived contribution of each is explicitly shown. The paper demonstrates how statistical analysis is capable of analyzing property investment by considering multiple determinants. The consideration of various characteristics, being more rigorous, enables better investment decision making.
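A hedonic regression of the kind described can be sketched as follows: each coefficient is the implicit (marginal) price of one attribute. The attributes, coefficients, and data are invented for illustration, not the Malaysian housing data.

```python
import numpy as np

# Synthetic housing data with a known hedonic price structure plus noise.
rng = np.random.default_rng(3)
n = 200
floor_area = rng.uniform(800, 3000, n)     # square feet
bedrooms = rng.integers(2, 6, n)           # 2 to 5 bedrooms
has_garage = rng.integers(0, 2, n)         # 0/1 dummy attribute
price = (50_000 + 120 * floor_area + 15_000 * bedrooms + 20_000 * has_garage
         + rng.normal(scale=5_000, size=n))

# OLS fit: price = b0 + b1*area + b2*bedrooms + b3*garage.
X = np.column_stack([np.ones(n), floor_area, bedrooms, has_garage])
beta, *_ = np.linalg.lstsq(X, price, rcond=None)

# beta[1] is the implicit price per square foot, beta[3] the premium
# the market pays for a garage, and so on.
print(np.round(beta, 1))
```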

Aminah Md Yusof

2012-03-01

26

Many solar wind and magnetosphere parameters correlate with relativistic electron flux following storms. These include relativistic electron flux before the storm; seed electron flux; solar wind velocity and number density (and their variation); interplanetary magnetic field Bz, AE and Kp indices; and ultra low frequency (ULF) and very low frequency (VLF) wave power. However, as all these variables are intercorrelated, we use multiple regression analyses to determine which are the most predictive of flux when other variables are controlled. Using 219 storms (1992-2002), we obtained hourly averaged electron fluxes for outer radiation belt relativistic electrons (>1.5 MeV) and seed electrons (100 keV) from Los Alamos National Laboratory spacecraft (geosynchronous orbit). For each storm, we found the log10 maximum relativistic electron flux 48-120 h after the end of the main phase of each storm. Each predictor variable was averaged over the 12 h before the storm, the main phase, and the 48 h following minimum Dst. High levels of flux following storms are best modeled by a set of variables. In decreasing influence, ULF, seed electron flux, Vsw and its variation, and after-storm Bz were the most significant explanatory variables. Kp can be added to the model, but it adds no further explanatory power. Although we included ground-based VLF power from Halley, Antarctica, it shows little predictive ability. We produced predictive models using the coefficients from the regression models and assessed their effectiveness in predicting novel observations. The correlation between observed values and those predicted by these empirical models ranged from 0.645 to 0.795.

Simms, Laura E.; Pilipenko, Viacheslav; Engebretson, Mark J.; Reeves, Geoffrey D.; Smith, A. J.; Clilverd, Mark

2014-09-01

27

Application of multiple regression analysis to forecasting South Africa's electricity demand

Scientific Electronic Library Online (English)

Full Text Available In a developing country such as South Africa, understanding the expected future demand for electricity is very important in various planning contexts. It is specifically important to understand how expected scenarios regarding population or economic growth can be translated into corresponding future electricity usage patterns. This paper discusses a methodology for forecasting long-term electricity demand that was specifically developed for application to such scenarios. The methodology uses a series of multiple regression models to quantify historical patterns of electricity usage per sector in relation to patterns observed in certain economic and demographic variables, and uses these relationships to derive expected future electricity usage patterns. The methodology has been used successfully to derive forecasts used for strategic planning within a private company as well as to provide forecasts to aid planning in the public sector. This paper discusses the development of the modelling methodology, provides details regarding the extensive data collection and validation processes followed during the model development, and reports on the relevant model fit statistics. The paper also shows that the forecasting methodology has to some extent been able to match the actual patterns, and therefore concludes that the methodology can be used to support planning by translating changes relating to economic and demographic growth, for a range of scenarios, into a corresponding electricity demand. The methodology therefore fills a particular gap within the South African long-term electricity forecasting domain.

Renee, Koen; Jennifer, Holloway.

2014-11-01

28

Factor and multiple regression analysis were carried out on morphological traits (body length, body width, bill length, bill width, bill height, shank length, body height, head length, head width, neck length, wing length, chest circumference and body weight) of male and female muscovy ducks. Obvious sexual dimorphism was exhibited between the sexes; relationships between body measurements and body weight were examined through factor and multiple linear regression analysis. Three factors had positi...

Ogah, D. M.; Alaga, A. A.; Momoh, M. O.

2009-01-01

29

Practical Session: Multiple Linear Regression

Three exercises are proposed to illustrate linear regression. The first investigates the influence of several factors on atmospheric pollution. It was proposed by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data from 20 cities of the U.S. Exercise 2 is an introduction to model selection, whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 were proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).

Clausel, M.; Grégoire, G.

2014-01-01

30

Reliability and Regression Analysis

This applet, by David M. Lane of Rice University, demonstrates how the reliability of X and Y affect various aspects of the regression of Y on X. Java 1.1 is required and a full set of instructions is given in order to get the full value from the applet. Exercises and definitions to key terms are also given to help students understand reliability and regression analysis.

Lane, David M.

31

Snow is an important component of the hydrological cycle in central Europe. A large quantity of water is accumulated as snow during the winter period, and this water runs off into rivers in a relatively short time during the spring period. An increased risk of floods in central Europe exists mainly in alpine and pre-alpine catchments, which have a pluvio-nival flow regime. Research on snow accumulation and snowmelt processes is important for runoff forecasting and reservoir management. The research is carried out in small mountain catchments in the Czech Republic. The experimental catchments differ in elevation range, aspect, slope and type of vegetation cover. Automatic and field measurements of snow depth and snow water equivalent (SWE) have been carried out at specific localities since 2008. Each locality is characterized by elevation, aspect, slope and vegetation type (open area, clearing, young forest, sparse mature forest and dense mature forest). Measurements of snow depth and SWE are carried out at 19 localities during both the snow accumulation and snowmelt periods. Snow depth and SWE data were assessed using simple statistical analysis as well as multiple regression and cluster analysis in order to describe the spatial distribution of snow accumulation and snowmelt. The correlation of SWE with vegetation type, elevation, aspect and slope was tested. The main findings show that vegetation type has the most significant influence on snowpack distribution and on snow accumulation and snowmelt dynamics. Significant correlations were also found for aspect (especially for southern slopes). The study complements similar work carried out in different study areas and climatic conditions, and moreover shows how the importance of the governing factors changes between the snow accumulation and snowmelt periods. The results demonstrate the good applicability of cluster analysis and multiple regression for describing snowpack distribution.

Pevná, Hana; Jeníček, Michal

2014-05-01

32

Multiple linear regression analysis was used to determine an equation for estimating hot corrosion attack for a series of Ni-base cast turbine alloys. The U transform, i.e., sin⁻¹((%A/100)^(1/2)), was shown to give the best estimate of the dependent variable, y. A complete second-degree equation is described for the "centered" weight chemistries of the elements Cr, Al, Ti, Mo, W, Cb, Ta, and Co. In addition, linear terms for the minor elements C, B, and Zr were added for a basic 47-term equation. The best reduced equation was determined by the stepwise selection method with essentially 13 terms. The Cr term was found to be the most important, accounting for 60 percent of the explained variability in hot corrosion attack.

Barrett, C. A.

1985-01-01

33

International Nuclear Information System (INIS)

A two-layer perceptron with backpropagation of error is used for quantitative analysis in ICP-AES. The network was trained on emission spectra of two interfering lines of Cd and As, and the concentrations of both elements were subsequently estimated from mixture spectra. The spectra of the Cd and As lines were also used to perform multiple linear regression (MLR) via the calculation of the pseudoinverse S+ of the sensitivity matrix S. In the present paper it is shown that close relations exist between the operation of the perceptron and the MLR procedure. These are most clearly apparent in the correlation between the weights of the backpropagation network and the elements of the pseudoinverse. Using MLR, the confidence intervals of the predictions are exploited to correct for the wavelength shift of the optical device. (orig.)
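The MLR step described above amounts to multiplying the measured mixture spectrum by the pseudoinverse S+ of the sensitivity matrix. A small sketch with an invented 3-channel, 2-element sensitivity matrix (not the paper's actual Cd/As data):

```python
import numpy as np

# Hypothetical sensitivity matrix S: rows = wavelength channels, columns = elements.
# S[i, j] is the signal of element j at channel i for unit concentration.
S = np.array([[5.0, 1.0],
              [2.0, 4.0],
              [0.5, 3.0]])

true_c = np.array([0.8, 1.5])                    # true concentrations
rng = np.random.default_rng(2)
spectrum = S @ true_c + rng.normal(0, 0.01, 3)   # mixture spectrum with noise

# MLR estimate via the pseudoinverse S+ (least-squares spectral unmixing).
S_pinv = np.linalg.pinv(S)
c_hat = S_pinv @ spectrum
print(c_hat)
```

The rows of `S_pinv` play the role attributed above to the trained network's weights: each row is a linear filter that extracts one element's concentration from the overlapping spectra.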

34

Directory of Open Access Journals (Sweden)

Full Text Available The aim of this study was to forecast the returns of the Stock Exchange of Thailand (SET) Index by adding explanatory variables and stationary Autoregressive Moving-Average terms of order p and q (ARMA(p, q)) to the mean equation of returns. In addition, we used Principal Component Analysis (PCA) to remove possible complications caused by multicollinearity. Afterwards, we forecast the volatility of the returns for the SET Index. Results showed that the ARMA(1,1) model, which includes multiple regression based on PCA, has the best performance. In forecasting the volatility of returns, the GARCH model performs best for one day ahead, and the EGARCH model performs best for five days, ten days and twenty-two days ahead.
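A hedged sketch of the PCA-before-regression idea on invented data: two nearly collinear predictors are replaced by their leading principal component before fitting, which removes the multicollinearity that would otherwise inflate the coefficient estimates.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500

# Two highly collinear explanatory variables driven by a common factor z.
z = rng.normal(size=n)
x1 = z + 0.05 * rng.normal(size=n)
x2 = z + 0.05 * rng.normal(size=n)
y = 1.0 * x1 + 1.0 * x2 + rng.normal(0, 0.5, n)

X = np.column_stack([x1, x2])
Xc = X - X.mean(axis=0)

# PCA via SVD; keep only the leading component to remove the collinearity.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
if Vt[0, 0] < 0:            # fix the SVD sign ambiguity for reproducibility
    Vt = -Vt
scores = Xc @ Vt[0]         # first principal component scores
b, *_ = np.linalg.lstsq(scores[:, None], y - y.mean(), rcond=None)
print(b)
```

With both predictors loading equally on the first component, the estimated slope on the component scores should land near sqrt(2), the value implied by the simulated coefficients.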

Nop Sopipan

2013-01-01

35

Concise, mathematically clear, and comprehensive treatment of the subject.
* Expanded coverage of diagnostics and methods of model fitting.
* Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models.
* More than 200 problems throughout the book plus outline solutions for the exercises.
* This revision has been extensively class-tested.

Seber, George A F

2012-01-01

36

Directory of Open Access Journals (Sweden)

Full Text Available In this article the authors show the influence of technological parameters and modification treatment on the structural properties of closed skeleton castings, with the aim of obtaining maximal refinement of the structure and minimal structural diversification. Skeleton castings were manufactured in accordance with the elaborated production technology under variable technological conditions: pouring temperature of 953 ÷ 1013 K, mould temperature of 293 ÷ 373 K, and height of the gating system above the casting level of 105 ÷ 175 mm. Analysis of metallographic specimens, quantitative analysis of silicon crystals, and secondary dendrite-arm spacing analysis of the α solution were performed. Average values of the stereological parameters for all castings were determined, together with the (B/L) and (P/A) factors. On the basis of the microstructural analysis the authors compared the samples, selecting those with the least diversification of the degree of structure refinement and the smallest silicon crystals. Sample 5 (AlSi11, Tpour 1013 K, Tmould 333 K, h = 265 mm) showed the best structural properties (least diversification of the degree of structure refinement and the finest silicon crystals). Statistical analysis of the structural results indicated that the best structural properties are obtained for the technological parameters Tpour = 1013 K, Tmould = 373 K and h = 230 mm [4]. The results of the statistical analysis are a prerequisite for optimization studies.

M. Cholewa

2011-07-01

37

Correlation Weights in Multiple Regression

A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps toward developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting…
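The comparison can be illustrated on simulated data: standardize the predictors, use the zero-order correlations r(x_j, y) as weights, and compare the resulting composite against the OLS composite. The population values below are invented; OLS necessarily attains the higher in-sample correlation, and the question studied in the paper is how little is lost by the correlation weights.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000

# Two correlated predictors with modest validity (invented population model).
z = rng.normal(size=n)
x1 = 0.7 * z + 0.71 * rng.normal(size=n)
x2 = 0.7 * z + 0.71 * rng.normal(size=n)
y = 0.4 * x1 + 0.3 * x2 + rng.normal(size=n)

def standardize(a):
    return (a - a.mean()) / a.std()

X = np.column_stack([standardize(x1), standardize(x2)])
ys = standardize(y)

ols_w, *_ = np.linalg.lstsq(X, ys, rcond=None)  # OLS regression weights
corr_w = X.T @ ys / n                            # correlation weights r(x_j, y)

# Compare the predictive correlation of each weighted composite with y.
r_ols = np.corrcoef(X @ ols_w, ys)[0, 1]
r_corr = np.corrcoef(X @ corr_w, ys)[0, 1]
print(r_ols, r_corr)
```

Because the two predictors are positively intercorrelated, the two composites are nearly proportional here and the correlation weights give up almost nothing; the gap widens when predictor intercorrelations and validities are less uniform.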

Waller, Niels G.; Jones, Jeff A.

2010-01-01

38

Crude oil prices do play significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scale. Then, the principal component analysis (PCA) is used in processing subseries data in MLR for crude oil price forecasting. The particle swarm optimization (PSO) is used to adopt the optimal parameters of the MLR model. To assess the effectiveness of this model, daily crude oil market, West Texas Intermediate (WTI), has been used as the case study. Time series prediction capability performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistics measures. The experimental results show that the proposed model outperforms the individual models in forecasting of the crude oil prices series. PMID:24895666

Shabri, Ani; Samsudin, Ruhaidah

2014-01-01

39

We develop a new method for estimating the biochemistry of plant material using spectroscopy. Normalized band depths calculated from the continuum-removed reflectance spectra of dried and ground leaves were used to estimate their concentrations of nitrogen, lignin, and cellulose. Stepwise multiple linear regression was used to select wavelengths in the broad absorption features centered at 1.73 μm, 2.10 μm, and 2.30 μm that were highly correlated with the chemistry of samples from eastern U.S. forests. Band depths of absorption features at these wavelengths were found to also be highly correlated with the chemistry of four other sites. A subset of data from the eastern U.S. forest sites was used to derive linear equations that were applied to the remaining data to successfully estimate their nitrogen, lignin, and cellulose concentrations. Correlations were highest for nitrogen (R² from 0.75 to 0.94). The consistent results indicate the possibility of establishing a single equation capable of estimating the chemical concentrations in a wide variety of species from the reflectance spectra of dried leaves. The extension of this method to remote sensing was investigated. The effects of leaf water content, sensor signal-to-noise and bandpass, atmospheric effects, and background soil exposure were examined. Leaf water was found to be the greatest challenge to extending this empirical method to the analysis of fresh whole leaves and complete vegetation canopies. The influence of leaf water on reflectance spectra must be removed to within 10%. Other effects were reduced by continuum removal and normalization of band depths.
If the effects of leaf water can be compensated for, it might be possible to extend this method to remote sensing data acquired by imaging spectrometers to give estimates of nitrogen, lignin, and cellulose concentrations over large areas for use in ecosystem studies.
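Stepwise wavelength selection, as used above, can be sketched as a greedy forward search over candidate band depths: at each step, add the predictor that most reduces the residual sum of squares. The data are simulated and the "true" band indices invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 200, 10

# Ten candidate band depths; only bands 2 and 7 truly carry the chemistry signal.
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + rng.normal(0, 0.5, n)

def rss(cols):
    """Residual sum of squares of an OLS fit on the chosen columns."""
    A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return r @ r

# Greedy forward stepwise selection: add the band that most reduces RSS.
selected, remaining = [], list(range(p))
for _ in range(2):
    best = min(remaining, key=lambda c: rss(selected + [c]))
    selected.append(best)
    remaining.remove(best)
print(sorted(selected))
```

Real stepwise procedures also apply an F-to-enter/F-to-remove criterion to decide when to stop; the fixed two-step loop here is a simplification.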

Kokaly, R.F.; Clark, R.N.

1999-01-01

40

Regression Analysis A Constructive Critique

Regression Analysis: A Constructive Critique identifies a wide variety of problems with regression analysis as it is commonly used and then provides a number of ways in which practice could be improved. Regression is most useful for data reduction, leading to relatively simple but rich and precise descriptions of patterns in a data set. The emphasis on description provides readers with an insightful rethinking from the ground up of what regression analysis can do, so that readers can better match regression analysis with useful empirical questions and improved policy-related research.

Berk, Richard A

2003-01-01

41

Swine waste land application has increased due to organic fertilization, but excess application in an arable system can cause environmental risk. Therefore, in situ characterizations of such resources are important prior to application. To explore this, 41 swine slurry samples were collected from Korea, and wide differences were observed in their physico-biochemical properties. Significant correlations were found, and hydrometer, EC meter, drying oven and pH meter measurements proved useful to estimate Mn, Fe, Ca, K, Al, Na, N and 5-day biochemical oxygen demand (BOD5) at improved R² values of 0.83, 0.82, 0.77, 0.75, 0.67, 0.47, 0.88 and 0.70, respectively. The results from this study suggest that multiple-property regressions can facilitate the prediction of micronutrients and organic matter much better than a single-property regression for livestock waste. PMID:21767950

Suresh, Arumuganainar; Choi, Hong Lim

2011-10-01

42

A Dirty Model for Multiple Sparse Regression

Sparse linear regression -- finding an unknown vector from linear measurements -- is now known to be possible with fewer samples than variables, via methods like the LASSO. We consider the multiple sparse linear regression problem, where several related vectors -- with partially shared support sets -- have to be recovered. A natural question in this setting is whether one can use the sharing to further decrease the overall number of samples required. A line of recent research has studied the use of \\ell_1/\\ell_q norm block-regularizations with q>1 for such problems; however these could actually perform worse in sample complexity -- vis a vis solving each problem separately ignoring sharing -- depending on the level of sharing. We present a new method for multiple sparse linear regression that can leverage support and parameter overlap when it exists, but not pay a penalty when it does not. A very simple idea: we decompose the parameters into two components and regularize these differently. We show both theore...

Jalali, Ali; Sanghavi, Sujay

2011-01-01

43

Energy Technology Data Exchange (ETDEWEB)

The effects of proximate and ultimate analysis, maceral content, and coal rank (R{sub max}) on the Hardgrove Grindability Index (HGI) have been investigated for a wide range of Kentucky coal samples, with calorific values from 4320 to 14960 BTU/lb (10.05 to 34.80 MJ/kg), using multivariable regression and artificial neural network (ANN) methods. The stepwise least squares method shows that linear relationships between HGI and the input sets (a) moisture, ash, volatile matter, and total sulfur; (b) ln (total sulfur), hydrogen, ash, ln ((oxygen + nitrogen)/carbon) and moisture; and (c) ln (exinite), semifusinite, micrinite, macrinite, resinite, and R{sub max} achieve correlation coefficients (R{sup 2}) of 0.77, 0.75, and 0.81, respectively. The ANN, which adequately recognized the characteristics of the coal samples, can predict HGI with correlation coefficients of 0.89, 0.89 and 0.95, respectively, in the testing process. It was determined that ln (exinite), semifusinite, micrinite, macrinite, resinite, and R{sub max} are the best predictors for the estimation of HGI by both multivariable regression (R{sup 2} = 0.81) and artificial neural network methods (R{sup 2} = 0.95). The ANN-based prediction method, as used in this paper, can be further employed as a reliable and accurate method for Hardgrove Grindability Index prediction. (author)

Chelgani, S. Chehreh; Jorjani, E.; Mesroghli, Sh.; Bagherieh, A.H. [Department of Mining Engineering, Research and Science Campus, Islamic Azad University, Poonak, Hesarak Tehran (Iran); Hower, James C. [Center for Applied Energy Research, University of Kentucky, 2540 Research Park Drive, Lexington, KY 40511 (United States)

2008-01-15

44

Directory of Open Access Journals (Sweden)

Full Text Available The thermal inactivation of Enterococcus faecium under isothermal conditions in tryptic soy broth of different pH (4.0, 5.5 and 7.4) was studied. The bacterial cells were more sensitive at higher temperature and in media of low pH. Decimal reduction times at 71°C were 2.56, 0.39 and 0.03 min at pH 7.4, 5.5 and 4.0, respectively. At all temperatures and pH values assayed, the survival curves obtained were linear. A mathematical model based on first-order kinetics accurately described these survival curves. The relationship between DT values and temperature was also linear. A mean z-value of 5°C was established. A multiple linear regression model using four predictor variables (pH, T, pH² and T²) related the log of the DT value to pH and treatment temperature. The developed tertiary model satisfactorily predicted the heat inactivation of Enterococcus faecium under the treatment conditions investigated.

S. CONDON

2014-06-01

45

International Nuclear Information System (INIS)

A novel statistical method, namely Regression-Estimated Input Function (REIF), is proposed in this study for the purpose of non-invasive estimation of the input function for fluorine-18 2-fluoro-2-deoxy-d-glucose positron emission tomography (FDG-PET) quantitative analysis. We collected 44 patients who had undergone a blood sampling procedure during their FDG-PET scans. First, we generated tissue time-activity curves of the grey matter and the whole brain with a segmentation technique for every subject. Summations of different intervals of these two curves were used as a feature vector, which also included the net injection dose. Multiple linear regression analysis was then applied to find the correlation between the input function and the feature vector. After a simulation study with in vivo data, the data of 29 patients were used to calculate the regression coefficients, which were then used to estimate the input functions of the other 15 subjects. Comparing the estimated input functions with the corresponding real input functions, the averaged error percentages of the area under the curve and the cerebral metabolic rate of glucose (CMRGlc) were 12.13±8.85 and 16.60±9.61, respectively. Regression analysis of the CMRGlc values derived from the real and estimated input functions revealed a high correlation (r=0.91). No significant difference was found between the real CMRGlc and that derived from our regression-estimated input function (Student's t test, P>0.05). The proposed REIF method demonstrated good abilities for input function and CMRGlc estimation, and represents a reliable replacement for the blood sampling procedures in FDG-PET quantification. (orig.)

46

Scientific Electronic Library Online (English)

Full Text Available SciELO Brazil | Language: English Abstract in English This paper joins the main properties of joint regression analysis (JRA), a model based on the Finlay-Wilkinson regression to analyse multi-environment trials, and of the additive main effects and multiplicative interaction (AMMI) model. The study compares JRA and AMMI with particular focus on robustness with increasing amounts of randomly selected missing data. The application is made using a data set from a breeding program of durum wheat (Triticum turgidum L., Durum Group) conducted in Portugal. The two models identify similar dominant cultivars (JRA) and mega-environment winners (AMMI) for the same environments. However, JRA gave more stable results as the incidence rate of missing values increased.

Paulo Canas, Rodrigues; Dulce Gamito Santinhos, Pereira; João Tiago, Mexia.

2011-12-01

47

Directory of Open Access Journals (Sweden)

Full Text Available The objective of this research is to compare methods of estimating multiple regression coefficients when multicollinearity exists among the independent variables. The estimation methods are the Ordinary Least Squares method (OLS), the Restricted Least Squares method (RLS), the Restricted Ridge Regression method (RRR) and the Restricted Liu method (RL), examined both when the restrictions are true and when they are not. The study used the Monte Carlo simulation method, repeating the experiment 1,000 times under each situation. The results are as follows.

CASE 1: The restrictions are true. In all cases, the RRR and RL methods have a smaller Average Mean Square Error (AMSE) than the OLS and RLS methods, respectively. The RRR method provides the smallest AMSE when the level of correlation is high, and also for all levels of correlation and all sample sizes when the standard deviation equals 5. The RL method provides the smallest AMSE when the level of correlation is low or middle, except for a standard deviation of 3 with small sample sizes, where the RRR method provides the smallest AMSE. The AMSE varies with, from most to least, the level of correlation, the standard deviation and the number of independent variables, and varies inversely with the sample size.

CASE 2: The restrictions are not true. In all cases, the RRR method provides the smallest AMSE, except when the standard deviation equals 1 and the error of the restrictions equals 5%: there the OLS method provides the smallest AMSE for low or medium correlation with large sample sizes, while the RL method does so for small sample sizes. In addition, when the error of the restrictions is increased, the OLS method provides the smallest AMSE for all levels of correlation and all sample sizes, except when the level of correlation is high and the sample size is small. Moreover, where the OLS method provides the smallest AMSE, the RLS method mostly has a smaller AMSE than the RRR and RL methods when the level of correlation is low or medium and sample sizes are large. The AMSE varies with, from most to least, the error of the restrictions, the level of correlation, the standard deviation and the number of independent variables, and varies inversely with the sample size, except that the error of the restrictions does not affect the AMSE of the OLS method.
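The flavor of such a Monte Carlo comparison can be sketched for the unrestricted case: under strong multicollinearity, ridge-type shrinkage typically yields a smaller average mean square error of the coefficients than OLS. The design, ridge penalty, and replication count below are invented and much simpler than the study's full RLS/RRR/RL setup.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps, lam = 30, 500, 2.0
beta_true = np.array([1.0, 1.0])

# Strongly collinear design (pairwise correlation about 0.99).
cov = np.array([[1.0, 0.99], [0.99, 1.0]])
L = np.linalg.cholesky(cov)

mse_ols = mse_ridge = 0.0
for _ in range(reps):
    X = rng.normal(size=(n, 2)) @ L.T
    y = X @ beta_true + rng.normal(size=n)
    XtX = X.T @ X
    b_ols = np.linalg.solve(XtX, X.T @ y)
    b_ridge = np.linalg.solve(XtX + lam * np.eye(2), X.T @ y)
    mse_ols += np.sum((b_ols - beta_true) ** 2)
    mse_ridge += np.sum((b_ridge - beta_true) ** 2)

amse_ols, amse_ridge = mse_ols / reps, mse_ridge / reps
print(amse_ols, amse_ridge)
```

The collinearity leaves one tiny eigenvalue in X'X, which inflates the OLS variance; the ridge penalty stabilizes exactly that direction at the cost of a small bias, driving its AMSE well below the OLS value in this setup.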

Hukharnsusatrue, A.

2005-11-01

48

Survival analysis and regression models.

Time-to-event outcomes are common in medical research as they offer more information than simply whether or not an event occurred. To handle these outcomes, as well as censored observations where the event was not observed during follow-up, survival analysis methods should be used. Kaplan-Meier estimation can be used to create graphs of the observed survival curves, while the log-rank test can be used to compare curves from different groups. If it is desired to test continuous predictors or to test multiple covariates at once, survival regression models such as the Cox model or the accelerated failure time model (AFT) should be used. The choice of model should depend on whether or not the assumption of the model (proportional hazards for the Cox model, a parametric distribution of the event times for the AFT model) is met. The goal of this paper is to review basic concepts of survival analysis. Discussions relating the Cox model and the AFT model will be provided. The use and interpretation of the survival models are illustrated using an artificially simulated dataset. PMID:24810431
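The Kaplan-Meier estimator mentioned above can be computed by hand: at each observed event time, multiply the running survival probability by (1 - d/n), where d is the number of events at that time and n the number still at risk. A self-contained sketch with invented times (libraries such as lifelines provide this with confidence intervals, but the arithmetic is just this):

```python
# Follow-up times with event indicator (1 = event observed, 0 = censored).
data = [(2, 1), (3, 1), (3, 0), (5, 1), (7, 0), (8, 1)]

def kaplan_meier(data):
    """Return [(time, survival probability)] at each observed event time."""
    n_at_risk = len(data)
    surv, out = 1.0, []
    for t in sorted({t for t, _ in data}):
        d = sum(1 for ti, e in data if ti == t and e == 1)  # events at t
        c = sum(1 for ti, e in data if ti == t and e == 0)  # censored at t
        if d > 0:
            surv *= 1 - d / n_at_risk   # KM product-limit step
            out.append((t, surv))
        n_at_risk -= d + c              # both events and censorings leave the risk set
    return out

print(kaplan_meier(data))
```

Censored subjects drop out of the risk set without forcing the curve down, which is exactly how survival analysis uses the partial information they carry.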

George, Brandon; Seals, Samantha; Aban, Inmaculada

2014-08-01

49

Dissociation is defined as the disruption of the usually integrated functions of consciousness, such as memory, identity, and perceptions of the environment. Causes include various psychological, neurological and neurobiological mechanisms, none of which have been consistently supported. To our knowledge, the role of gene-environment interactions in dissociative experiences in obsessive-compulsive disorder (OCD) has not previously been investigated. Eighty-three Caucasian patients (29 male, 54 female) with a principal diagnosis of OCD were included. The Dissociative Experiences Scale was used to assess dissociation. The role of childhood trauma (assessed with the Childhood Trauma Questionnaire), and a functional 44-bp insertion/deletion polymorphism in the promoter region of the serotonin transporter, or 5-HTT, in mediating dissociation, was investigated using multiple regression analysis and path analysis using the partial least squares model. Both analyses indicated that an interaction between physical neglect and the S/S genotype of the 5-HTT gene significantly predicted dissociation in patients with OCD. Dissociation may be a predictor of poorer treatment outcome in patients with OCD; therefore, a better understanding of the mechanisms that underlie this phenomenon may be useful. Here, two different but related statistical techniques (multiple regression and partial least squares), confirmed that physical neglect and the 5-HTT genotype jointly play a role in predicting dissociation in OCD. PMID:17943026

Lochner, Christine; Seedat, Soraya; Hemmings, Sian M J; Moolman-Smook, Johanna C; Kidd, Martin; Stein, Dan J

2007-01-01

50

Polynomial regression analysis and significance test of the regression function

International Nuclear Information System (INIS)

In order to analyze the decay heating power of a certain radioactive isotope per kilogram with the polynomial regression method, the paper first demonstrates the broad usage of the polynomial function and derives its parameters with the ordinary least squares estimate. Then a significance test method for the polynomial regression function is derived, exploiting the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and a significance test of the polynomial function are applied to the decay heating power of the isotope per kilogram in accordance with the authors' real work. (authors)
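A minimal sketch of the procedure: fit the polynomial by ordinary least squares and test the overall significance of the regression with an F statistic, exactly as for a multivariable linear model. The data below are simulated, not the authors' decay-heat measurements.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 40
t = np.linspace(0, 10, n)
# Hypothetical decay-heat-like data with a genuine quadratic trend plus noise.
y = 5.0 - 0.8 * t + 0.04 * t**2 + rng.normal(0, 0.1, n)

k = 2                                    # polynomial degree
X = np.vander(t, k + 1)                  # design matrix columns: t^2, t, 1
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Overall F test: does the polynomial explain more than the mean alone?
rss = np.sum((y - X @ beta) ** 2)
tss = np.sum((y - y.mean()) ** 2)
F = ((tss - rss) / k) / (rss / (n - k - 1))
print(beta, F)
```

Treating the powers of t as separate regressors is what makes the multivariable-regression F test applicable to polynomial fits, which is the similarity the abstract relies on.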

51

Multiple linear regression (MLR), radial basis network (RB), and multilayer perceptron (MLP) neural network (NN) models have been explored for the estimation of toxicity of ammonium, imidazolium, morpholinium, phosphonium, piperidinium, pyridinium, pyrrolidinium and quinolinium ionic liquid salts in the Leukemia Rat Cell Line (IPC-81) and Acetylcholinesterase (AChE) using only their empirical formulas (elemental composition) and molecular weights. The toxicity values were estimated by means of decadic logarithms of the half maximal effective concentration (EC(50)) in microM (log(10)EC(50)). The models' performances were analyzed by statistical parameters, analysis of residuals, and central tendency and statistical dispersion tests. The MLP model estimates the log(10)EC(50) in IPC-81 and AChE with a mean prediction error of less than 2.2 and 3.8%, respectively. PMID:18805639

Torrecilla, José S; García, Julián; Rojo, Ester; Rodríguez, Francisco

2009-05-15

52

The relationship between maceral content plus mineral matter and gross calorific value (GCV) for a wide range of West Virginia coal samples (from 6518 to 15330 BTU/lb; 15.16 to 35.66 MJ/kg) has been investigated by multivariable regression and an adaptive neuro-fuzzy inference system (ANFIS). The stepwise least square mathematical comparison between liptinite, vitrinite, plus mineral matter as input data sets and measured GCV reported a nonlinear correlation coefficient (R2) of 0.83. Using the same data set, the correlation between the GCV predicted from the ANFIS model and the actual GCV reported an R2 value of 0.96. It was determined that the GCV-based prediction methods, as used in this article, can provide a reasonable estimation of GCV. Copyright © Taylor & Francis Group, LLC.

Chelgani, S.C.; Hart, B.; Grady, W.C.; Hower, J.C.

2011-01-01

53

Polylinear regression analysis in radiochemistry

International Nuclear Information System (INIS)

A number of radiochemical problems have been formulated in the framework of polylinear regression analysis, which permits the use of conventional mathematical methods for their solution. The authors have considered features of the use of polylinear regression analysis for estimating the contributions of various sources to atmospheric pollution, for studying irradiated nuclear fuel, for estimating concentrations from spectral data, for measuring neutron fields of a nuclear reactor, for estimating crystal lattice parameters from X-ray diffraction patterns, for interpreting data of X-ray fluorescence analysis, for estimating complex formation constants, and for analyzing results of radiometric measurements. The problem of estimating the target parameters can be ill-posed for certain properties of the system under study. The authors showed the possibility of regularization by adding a fictitious set of data "obtained" from the orthogonal design. To estimate only a part of the parameters under consideration, the authors used incomplete rank models. In this case, it is necessary to take into account the possibility of confounding estimates. An algorithm for evaluating the degree of confounding is presented, realized using standard software for regression analysis.

54

Retail sales forecasting with application the multiple regression

Directory of Open Access Journals (Sweden)

Full Text Available The article begins with a formulation for predictive learning called the multiple regression model. A theoretical approach to the construction of regression models is described. The key information of the article is the mathematical formulation of the forecast linear equation that estimates the multiple regression model. Calculation of the quantitative forecast value of the dependent variable under the influence of the independent variables is explained. The paper presents retail sales forecasting with multiple-regression estimation; some of the most important decisions a retailer makes can draw on information obtained by multiple regression. The changing retail environment is driven by expected consumer income and advertising costs. Checking the model for goodness of fit and statistical significance is explored in the article. Finally, the quantitative value of the retail sales forecast based on the multiple regression model is calculated.

Kuzhda, Tetyana

2012-05-01
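The forecast equation the article builds, sales regressed on expected income and advertising costs, can be sketched as below. All figures (the six periods of income, advertising and sales, and the next-period values) are made up for illustration.

```python
import numpy as np

# Sketch: fit sales = b0 + b1*income + b2*advertising, then forecast.
income = np.array([20.0, 22.0, 25.0, 27.0, 30.0, 33.0])        # invented
advertising = np.array([1.0, 1.2, 1.1, 1.5, 1.6, 1.8])         # invented
sales = np.array([50.0, 54.0, 58.0, 63.0, 68.0, 74.0])         # invented

X = np.column_stack([np.ones(len(sales)), income, advertising])
b, *_ = np.linalg.lstsq(X, sales, rcond=None)

# Point forecast for next period's expected income and advertising budget
forecast = b @ np.array([1.0, 35.0, 2.0])
print(forecast)
```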

55

International Nuclear Information System (INIS)

The problem of performing process capability analysis when autocorrelations are present is discussed. It is shown that when the systematic nonrandom phenomenon induced by autocorrelation is ignored, the variance estimate obtained from the original data is no longer an appropriate estimate for use in process capability analyses. A remedial measure based on an autoregressive integrated moving average model is proposed. It is also shown that the process variance estimated from the residual analysis yields appropriate results for the process capability indices.
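The remedy described above can be illustrated with the simplest case: when observations follow an AR(1) process, the raw standard deviation is inflated, while the residual standard deviation from a fitted time-series model recovers the innovation sigma. The AR(1) choice and all numbers are illustrative, not the paper's ARIMA model.

```python
import numpy as np

# Simulate autocorrelated process data: x[t] = phi*x[t-1] + e[t]
rng = np.random.default_rng(1)
n, phi = 2000, 0.8
e = rng.normal(0, 1.0, n)
x = np.empty(n)
x[0] = e[0]
for t in range(1, n):
    x[t] = phi * x[t - 1] + e[t]

raw_sd = x.std(ddof=1)                  # inflated by autocorrelation

# Fit AR(1) by regressing x[t] on x[t-1]; use the residual sd instead
phi_hat = np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1])
resid = x[1:] - phi_hat * x[:-1]
resid_sd = resid.std(ddof=1)
print(raw_sd > resid_sd)                # residual sd is near the true 1.0
```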

56

Estimation of transport airplane aerodynamics using multiple stepwise regression

This paper presents an application of multiple stepwise regression to the flight test data of a typical transport airplane. The flight test data was carefully preprocessed to eliminate aliasing, time skews and high frequency noise. The data consisted both of basic certification maneuvers, such as wind-up-turns and maneuvers suitable for parameter estimation, such as responses to elevator pulses and doublets. It is shown that the results of multiple stepwise regression techniques compare favorably with the results obtained from maximum likelihood estimation. Finally, it is concluded that multiple stepwise regression could be a fast economical way to estimate transport airplane aerodynamics.

Keskar, D. A.; Klein, V.; Batterson, J. G.

1985-01-01
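A minimal forward-stepwise sketch of the selection idea above: at each step, add the candidate regressor that most reduces the residual sum of squares. The data are synthetic (only columns 1 and 3 matter), not the flight-test records, and a real implementation would use F-to-enter/remove thresholds.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = rng.normal(size=(n, 5))             # candidate regressors
y = 2.0 * X[:, 1] - 1.5 * X[:, 3] + rng.normal(0, 0.5, n)

def rss(cols):
    # Residual sum of squares of an OLS fit on the chosen columns
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ b
    return r @ r

selected, remaining = [], list(range(5))
for _ in range(2):                      # pick the two best regressors
    best = min(remaining, key=lambda j: rss(selected + [j]))
    selected.append(best)
    remaining.remove(best)
print(sorted(selected))                 # the truly active columns, 1 and 3
```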

57

Gaussian process regression analysis for functional data

Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically the methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables.Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high dime...

Shi, Jian Qing

2011-01-01

58

Regression Commonality Analysis: A Technique for Quantitative Theory Building

When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominantly on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…

Nimon, Kim; Reio, Thomas G., Jr.

2011-01-01
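For the two-predictor case, commonality analysis decomposes the full model's R-squared into each predictor's unique contribution and their common (shared) contribution, using the R-squared values of the full and one-predictor models. A sketch on synthetic data:

```python
import numpy as np

# Two correlated predictors that both contribute to y (invented data)
rng = np.random.default_rng(3)
n = 500
shared = rng.normal(size=n)
x1 = shared + rng.normal(0, 1, n)
x2 = shared + rng.normal(0, 1, n)
y = x1 + x2 + rng.normal(0, 1, n)

def r2(*cols):
    # R-squared of an OLS fit of y on the given columns
    A = np.column_stack([np.ones(n), *cols])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ b
    return 1 - (r @ r) / np.sum((y - y.mean()) ** 2)

r2_full, r2_1, r2_2 = r2(x1, x2), r2(x1), r2(x2)
unique1 = r2_full - r2_2                # x1's unique contribution
unique2 = r2_full - r2_1                # x2's unique contribution
common12 = r2_1 + r2_2 - r2_full        # shared (common) contribution
print(unique1 > 0 and unique2 > 0 and common12 > 0)
```

By construction the three components sum exactly to the full model's R-squared.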

59

Directory of Open Access Journals (Sweden)

Full Text Available Wire Electrical Discharge Machining (WEDM) is a specialized thermal machining process capable of accurately machining parts with varying hardness or complex shapes, which have sharp edges that are very difficult to machine by mainstream machining processes. In WEDM a specific wire run-off speed is applied to compensate for wear and avoid wire breakage. Since the workpiece generally stays stationary and short discharge durations are applied, the relative displacement between wire and workpiece during one single discharge is very small. This study outlines the development of a model and its application to optimize WEDM machining parameters using the Taguchi technique, which is based on robust design. The present study outlines electrode wear estimation in wire EDM. EN-8 and EN-19 were machined using different process parameters based on an L16 orthogonal array. Among the process parameters, voltage and flush rate were kept constant, while bed speed, current, pulse-on and pulse-off were varied. Molybdenum wire having a diameter of 0.18 mm was used as the electrode. Electrode wear was measured using a universal measuring machine. Estimation and comparison of electrode wear were done using multiple regression analysis (MRA) and the group method of data handling (GMDH) technique. From the results it was observed that measured and estimated electrode wear correlate better with MRA than with GMDH.

G. Ugrasen

2014-05-01

60

Energy Technology Data Exchange (ETDEWEB)

Study is under way for more accurate solar radiation prediction to enhance solar energy utilization efficiency. Using a technique for roughly estimating the day's clearness index from the weather forecast, the forecast weather (constituted of weather conditions such as 'clear' and 'cloudy,' and adverbs or adjectives such as 'afterward,' 'temporary,' and 'intermittent') has been quantified relative to the clearness index. This index is named the 'weather index' for the purpose of this article. Error rates in the weather index are highest for cloudy days, i.e., for weather index values of 0.2-0.5. It has also been found that there is a high correlation between the clearness index and the north-south wind direction component. A multiple regression analysis has therefore been carried out for the estimation of the clearness index from the maximum temperature and the north-south wind direction component. Compared with estimating the clearness index from the weather index alone, estimation using the weather index and maximum temperature achieves a 3% improvement throughout the year. It has also been learned that estimation using the weather index and north-south wind direction component enables a 2% improvement for summer and a 5% or higher improvement for winter. 2 refs., 6 figs., 4 tabs.

Nakagawa, S. [Maizuru National College of Technology, Kyoto (Japan); Kenmoku, Y.; Sakakibara, T. [Toyohashi University of Technology, Aichi (Japan); Kawamoto, T. [Shizuoka University, Shizuoka (Japan). Faculty of Engineering

1996-10-27

61

International Nuclear Information System (INIS)

Various statistical techniques were used on five-year data from 1998-2002 of average humidity, rainfall, and maximum and minimum temperatures, respectively. Regression analysis time series (RATS) relationships were developed for determining the overall trend of these climate parameters, on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit for our polynomial regression analysis time series (PRATS). The correlations for multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, respectively, which yielded homoscedasticity. We also employed Bartlett's test for homogeneity of variances on five-year data of rainfall and humidity, respectively, which showed that the variances in the rainfall data were not homogeneous, while those for humidity were homogeneous. Our results on regression and regression analysis time series show the best fit for prediction modeling on climatic data of Quetta, Pakistan. (author)

62

Application of Partial Least-Squares Regression Model on Temperature Analysis and Prediction of RCCD

This study, based on the temperature monitoring data of the Jiangya RCCD, uses the principle and method of partial least-squares regression to analyze and predict the temperature variation of RCCD. By building a partial least-squares regression model, the multicollinearity among independent variables is overcome, and an organic combination of multiple linear regression and canonical correlation analysis is achieved. Compared with the general least-squares regression model result, it is more ...

Yuqing Zhao; Zhenxian Xing

2013-01-01
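How PLS copes with the multicollinearity mentioned above can be shown with a minimal one-component PLS fit in plain numpy: project the (nearly identical) predictors onto a single latent score and regress on that. Everything below is illustrative synthetic data, not the dam-monitoring records.

```python
import numpy as np

# Two highly collinear "temperature" inputs driven by a common signal
rng = np.random.default_rng(4)
n = 300
base = rng.normal(size=n)
X = np.column_stack([base + rng.normal(0, 0.01, n),
                     base + rng.normal(0, 0.01, n)])
y = base + rng.normal(0, 0.1, n)

Xc, yc = X - X.mean(0), y - y.mean()
w = Xc.T @ yc
w /= np.linalg.norm(w)                  # first PLS weight vector
t = Xc @ w                              # latent score (one component)
q = (t @ yc) / (t @ t)                  # regress y on the score
y_hat = t * q + y.mean()
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum(yc ** 2)
print(round(r2, 3))
```

Ordinary least squares on these two predictors would produce unstable coefficients; the latent-score fit does not.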

63

A number of clinical trials and single-subject studies have been published measuring the effectiveness of long-term, comprehensive applied behavior analytic (ABA) intervention for young children with autism. However, the overall appreciation of this literature through standardized measures has been hampered by the varying methods, designs, treatment features and quality standards of published studies. In an attempt to fill this gap in the literature, state-of-the-art meta-analytical methods were implemented, including quality assessment, sensitivity analysis, meta-regression, dose-response meta-analysis and meta-analysis of studies of different metrics. Results suggested that long-term, comprehensive ABA intervention leads to (positive) medium to large effects in terms of intellectual functioning, language development, acquisition of daily living skills and social functioning in children with autism. Although favorable effects were apparent across all outcomes, language-related outcomes (IQ, receptive and expressive language, communication) were superior to non-verbal IQ, social functioning and daily living skills, with effect sizes approaching 1.5 for receptive and expressive language and communication skills. Dose-dependent effect sizes were apparent by levels of total treatment hours for language and adaptation composite scores. Methodological issues relating to ABA clinical trials for autism are discussed. PMID:20223569

Virués-Ortega, Javier

2010-06-01

64

In the first part of this work [1] a field operational test (FOT) on micro-HEVs (hybrid electric vehicles) and conventional vehicles was introduced. Valve-regulated lead-acid (VRLA) batteries in absorbent glass mat (AGM) technology and flooded batteries were applied. The FOT data were analyzed by kernel density estimation. In this publication multiple regression analysis is applied to the same data. Square regression models without interdependencies are used. Hereby, capacity loss serves as dependent parameter and several battery-related and vehicle-related parameters as independent variables. Battery temperature is found to be the most critical parameter. It is proven that flooded batteries operated in the conventional power system (CPS) degrade faster than VRLA-AGM batteries in the micro-hybrid power system (MHPS). A smaller number of FOT batteries were applied in a vehicle-assigned test design where the test battery is repeatedly mounted in a unique test vehicle. Thus, vehicle category and specific driving profiles can be taken into account in multiple regression. Both parameters have only secondary influence on battery degradation, instead, extended vehicle rest time linked to low mileage performance is more serious. A tear-down analysis was accomplished for selected VRLA-AGM batteries operated in the MHPS. Clear indications are found that pSoC-operation with periodically fully charging the battery (refresh charging) does not result in sulphation of the negative electrode. Instead, the batteries show corrosion of the positive grids and weak adhesion of the positive active mass.

Schaeck, S.; Karspeck, T.; Ott, C.; Weirather-Koestner, D.; Stoermer, A. O.

2011-03-01

65

Vehicle Travel Time Prediction based on Multiple Kernel Regression

Directory of Open Access Journals (Sweden)

Full Text Available With the rapid development of transportation and the logistics economy, vehicle travel time prediction and planning have become an important topic in logistics. Travel time prediction, which is indispensable for traffic guidance, has become a key issue for researchers in this field. At present, travel time prediction is mainly short-term prediction, and the prediction methods include artificial neural networks, the Kalman filter and support vector regression (SVR), etc. However, these algorithms still have some shortcomings, such as high computation complexity and slow convergence rate. This paper exploits the learning ability of multiple kernel learning regression (MKLR) for nonlinear prediction, basing logistics planning on MKLR for vehicle travel time prediction. The method includes the following steps: (1) preprocessing historical data; (2) selecting an appropriate kernel function, training on the historical data and performing analysis; (3) predicting the vehicle travel time based on the trained model. The experimental results, comparing different prediction methods, show that the vehicle travel time prediction method proposed in this paper achieves higher accuracy than other methods. They also illustrate the feasibility and effectiveness of the proposed prediction method.

Wenjing Xu

2014-07-01
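The multiple-kernel idea can be sketched as kernel ridge regression with a fixed weighted sum of two kernels (an RBF and a linear kernel). This is a stand-in for MKLR, which would additionally learn the kernel weights; the travel times, the 0.7/0.3 weights and the ridge penalty 0.1 are all invented.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 120
x = np.sort(rng.uniform(0, 10, n))                       # route feature
y = 30 + 2 * x + 5 * np.sin(x) + rng.normal(0, 0.5, n)   # travel time (min)

def rbf(a, b, gamma=0.5):
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

K = 0.7 * rbf(x, x) + 0.3 * np.outer(x, x)   # combined kernel matrix
alpha = np.linalg.solve(K + 0.1 * np.eye(n), y)

y_hat = K @ alpha                            # in-sample predictions
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(r2, 3))
```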

66

Teasing out the effect of tutorials via multiple regression

We transformed an upper-division physics course using a variety of elements, including homework help sessions, tutorials, clicker questions with peer instruction, and explicit learning goals. Overall, the course transformations improved student learning, as measured by our conceptual assessment. Since these transformations were multi-faceted, we would like to understand the impact of individual course elements. Attendance at tutorials and homework help sessions was optional, and occurred outside the class environment. In order to identify the impact of these optional out-of-class sessions, given self-selection effects in student attendance, we performed a multiple regression analysis. Even when background variables are taken into account, tutorial attendance is positively correlated with student conceptual understanding of the material - though not with performance on course exams. Other elements that increase student time-on-task, such as homework help sessions and lectures, do not achieve the same impacts.

Chasteen, Stephanie V.

2012-02-01

67

Analysing Conjoint Analysis Data by a Random Coefficient Regression Model

Since the late 1960s, conjoint analysis has been applied in estimating consumer preferences in marketing research. This article discusses how to model the data coming from a full or a fractional factorial design within a unique regression model, as an alternative to estimation by n independent multiple linear regression models, one for each subject. The advantage of the method presented here resides in the possibility of computing correct standard errors for the conjoint analysis utility ...

Furlan, Roberto; Corradetti, Roberto

2005-01-01

68

A variable group Y is assumed to depend upon R thematic variable groups X1, ..., XR. We assume that components in Y depend linearly upon components in the Xr's. In this work, we propose a multiple covariance criterion which extends that of PLS regression to this multiple-predictor-group situation. On this criterion, we build a PLS-type exploratory method - Structural Equation Exploratory Regression (SEER) - that allows us to simultaneously perform dimension reduction in groups and investigate the linear model of the components. SEER uses the multidimensional structure of each group. An application example is given.

Bry, Xavier; Cazes, Pierre

2008-01-01

69

Directory of Open Access Journals (Sweden)

Full Text Available In the last few decades, techniques such as Artificial Neural Networks and Fuzzy Inference Systems have been used for developing predictive models to estimate required parameters. In the recent past, Soft Computing techniques have been used as alternative statistical tools. Determination of the nature of financial time series data is difficult, expensive, time consuming and involves complex tests. In this paper, we use the Multi Layer Perceptron and Radial Basis Functions of Artificial Neural Networks, and the Adaptive Neuro Fuzzy Inference System, for prediction of S% (Financial Stress percent) of financial time series data, and compare them with the traditional statistical tool of Multiple Regression. The accuracies of the Artificial Neural Network and Adaptive Neuro Fuzzy Inference System techniques are evaluated as relatively similar. It is found that the Radial Basis Functions constructed exhibit higher performance than the Multi Layer Perceptron, Adaptive Neuro Fuzzy Inference System and Multiple Regression for predicting S%. The performance comparison shows that the Soft Computing paradigm is a promising tool for minimizing uncertainties in financial time series data. Soft Computing further minimizes the potential inconsistency of correlations.

Arindam Chaudhuri

2012-09-01

70

Interpreting Multiple Linear Regression: A Guidebook of Variable Importance

Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to typically reflect overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…

Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim

2012-01-01

71

Directory of Open Access Journals (Sweden)

Full Text Available This study explores the relationship between student performance and instructional design. The research was conducted at the E-Learning School at a university in Turkey. A list of design factors that had potential influence on student success was created through a review of the literature and interviews with relevant experts. From this, the five most important design factors were chosen. The experts scored 25 university courses on the extent to which they demonstrated the chosen design factors. Multiple-regression and supervised artificial neural network (ANN) models were used to examine the relationship between student grade point averages and the scores on the five design factors. The results indicated that there is no statistical difference between the two models. Both models identified the use of examples and applications as the most influential factor. The ANN model provided more information and was used to predict the course-specific factor values required for a desired level of success.

Halil Ibrahim Cebeci

2009-12-01

72

Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis

When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…

Kim, Rae Seon

2011-01-01
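The conversion at issue above, turning unstandardized slopes into scale-free standardized beta weights via beta = b * sd(x) / sd(y), can be sketched as follows on synthetic data (all coefficients and scales invented).

```python
import numpy as np

# Two predictors on very different scales contributing to y
rng = np.random.default_rng(10)
n = 300
x1 = rng.normal(0, 2, n)
x2 = rng.normal(0, 5, n)
y = 1.0 * x1 + 0.2 * x2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Standardized beta weights: comparable, scale-free effect sizes
beta1 = b[1] * x1.std(ddof=1) / y.std(ddof=1)
beta2 = b[2] * x2.std(ddof=1) / y.std(ddof=1)
print(round(beta1, 2), round(beta2, 2))
```

The raw slopes (about 1.0 vs 0.2) are not comparable across predictors, while the betas are.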

73

Predicting share price by using Multiple Linear Regression.

The aim of the project was to design a multiple linear regression model and use it to predict the share’s closing price for 44 companies listed on the OMX Stockholm stock exchange’s Large Cap list. The model is intended to be used as a day trading guideline i.e. today’s information is used to predict tomorrow’s closing price. The regression was done in Microsoft Excel 2010[18] by using its built-in function LINEST. The LINEST-function uses the dependent variable y and all the covariat...

Forslund, Gustaf; Åkesson, David

2013-01-01

74

Moderation analysis using a two-level regression model.

Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model. PMID:24337935

Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

2014-10-01
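The standard MMR setup referenced above can be sketched in a few lines: the product term carries the moderation effect, and its t statistic tests it. The data below are simulated with an invented moderation effect; a real analysis following the paper would estimate the two-level model by NML instead of LS.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 400
educ = rng.normal(0, 1, n)              # predictor
race = rng.integers(0, 2, n)            # moderator (binary, illustrative)
prestige = (1 + 0.5 * educ + 0.3 * race
            + 0.6 * educ * race + rng.normal(0, 1, n))

# MMR design: intercept, predictor, moderator, product term
X = np.column_stack([np.ones(n), educ, race, educ * race])
b, *_ = np.linalg.lstsq(X, prestige, rcond=None)

# LS standard errors and the t statistic of the product term
resid = prestige - X @ b
s2 = resid @ resid / (n - X.shape[1])
cov = s2 * np.linalg.inv(X.T @ X)
t_interaction = b[3] / np.sqrt(cov[3, 3])
print(abs(t_interaction) > 2)           # moderation detected at ~5% level
```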

75

Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…

Li, Spencer D.

2011-01-01

76

Sensitization of stress-responsive neurobiological systems as a possible consequence of early adverse experience has been implicated in the pathophysiology of mood and anxiety disorders. In addition to early adversities, adulthood stressors are also known to precipitate the manifestation of these disorders. The present study sought to evaluate the relative role of early adverse experience vs. stress experiences in adulthood in the prediction of neuroendocrine stress reactivity in women. A total of 49 women (normal volunteers, depressed patients, and women with a history of early abuse) underwent a battery of interviews and completed dimensional rating scales on stress experiences and psychopathology, and were subsequently exposed to a standardized psychosocial laboratory stressor. Outcome measures were plasma adrenocorticotropin (ACTH) and cortisol responses to the stress test. Multiple linear regression analyses were performed to identify the impact of demographic variables, childhood abuse, adulthood trauma, major life events in the past year, and daily hassles in the past month, as well as psychopathology on hormonal stress responsiveness. Peak ACTH responses to psychosocial stress were predicted by a history of childhood abuse, the number of separate abuse events, the number of adulthood traumas, and the severity of depression. Similar predictors were identified for peak cortisol responses. Although abused women reported more severe negative life events in adulthood than controls, life events did not affect neuroendocrine reactivity. The regression model explained 35% of the variance of ACTH responses. The interaction of childhood abuse and adulthood trauma was the most powerful predictor of ACTH responsiveness. Our findings suggest that a history of childhood abuse per se is related to increased neuroendocrine stress reactivity, which is further enhanced when additional trauma is experienced in adulthood. PMID:12001180

Heim, Christine; Newport, D Jeffrey; Wagner, Dieter; Wilcox, Molly M; Miller, Andrew H; Nemeroff, Charles B

2002-01-01

77

Outlier Detection for Multivariate Multiple Regression in Y-direction

This study focuses on outlier detection for multivariate multiple regression in the Y-direction; we propose an alternative method based on the squared distances of the residuals. The proposed method uses robust estimates of location and covariance matrices derived from the squared distances of the residuals. It is compared to the Mahalanobis Distance method, the Minimum Covariance Determinant method and the Minimum Volume Ellipsoid met...

Paweena Tangjuang; Pachitjanut Siripanich

2014-01-01
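The core idea, flagging Y-direction outliers from squared distances of the multivariate regression residuals, can be sketched with a plain (non-robust) Mahalanobis version; the paper's variant would use robust location/scatter estimates instead. Data and the planted outlier are invented.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
X = rng.normal(size=(n, 2))
B = np.array([[1.0, 0.5], [2.0, -1.0]])
Y = X @ B + rng.normal(0, 0.3, (n, 2))
Y[0] += np.array([5.0, -5.0])           # plant one Y-direction outlier

A = np.column_stack([np.ones(n), X])
Bhat, *_ = np.linalg.lstsq(A, Y, rcond=None)
E = Y - A @ Bhat                        # residual matrix

# Squared Mahalanobis distances of the residual vectors
S = np.cov(E, rowvar=False)
d2 = np.einsum('ij,jk,ik->i', E, np.linalg.inv(S), E)
print(int(np.argmax(d2)))               # index of the planted outlier
```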

78

International Nuclear Information System (INIS)

The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy
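The incremental-yield idea above, asking how much a noninvasive test adds to clinical variables, can be sketched by fitting nested logistic models and comparing log-likelihoods. The data, coefficients and variable names below are simulated, not the study's records.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 171
age = rng.normal(55, 8, n)
thallium = rng.integers(0, 2, n)        # abnormal scan indicator (invented)
logit = -14 + 0.2 * age + 2.0 * thallium
disease = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

def fit_loglik(cols):
    # Fit a logistic model by Newton-Raphson, return its log-likelihood
    X = np.column_stack([np.ones(n)] + cols)
    b = np.zeros(X.shape[1])
    for _ in range(25):
        mu = 1 / (1 + np.exp(-X @ b))
        W = mu * (1 - mu)
        b += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (disease - mu))
    mu = 1 / (1 + np.exp(-X @ b))
    return np.sum(disease * np.log(mu) + (1 - disease) * np.log(1 - mu))

ll_clinical = fit_loglik([age])
ll_full = fit_loglik([age, thallium])
print(ll_full > ll_clinical)            # the added test carries information
```

In practice the improvement would be judged with a likelihood-ratio test, not just the raw comparison.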

79

Introducing Evolutionary Computing in Regression Analysis

A typical upper level undergraduate or first year graduate level regression course syllabus treats model selection with various stepwise regression methods. Here we implement evolutionary computing for subset model selection and accomplish two goals: (i) introduce students to the powerful optimization method of genetic algorithms, and (ii) transform a regression analysis course into a regression and modeling course without requiring any additional time or software commitment. Furthermore, we employ the Akaike Information Criterion (AIC) as a measure of model fitness instead of the commonly used R-squared. The model selection tool uses Excel, which makes the procedure accessible to a very wide spectrum of interdisciplinary students with no specialized software requirement. An Excel macro, to be used as an instructional tool, is freely available through the author's website.

Olcay Akman
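The described approach (a genetic algorithm searching predictor subsets with AIC as fitness) can be sketched outside Excel as a toy numpy implementation. The data are synthetic (only columns 0 and 2 truly matter), and the GA settings are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 150, 6
X = rng.normal(size=(n, p))
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.normal(0, 1, n)

def aic(mask):
    # AIC of the OLS model using the columns flagged in the 0/1 mask
    cols = np.flatnonzero(mask)
    A = np.column_stack([np.ones(n), X[:, cols]])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ b) ** 2)
    return n * np.log(rss / n) + 2 * (len(cols) + 1)

pop = rng.integers(0, 2, (20, p))       # random initial population of masks
for _ in range(30):
    scores = np.array([aic(m) for m in pop])
    parents = pop[np.argsort(scores)[:10]]          # elitist selection
    idx_a = rng.integers(0, 10, 10)
    idx_b = rng.integers(0, 10, 10)
    kids = parents[idx_a].copy()
    cut = rng.integers(1, p, 10)
    for i in range(10):                 # one-point crossover
        kids[i, cut[i]:] = parents[idx_b[i], cut[i]:]
    flip = rng.random((10, p)) < 0.1    # mutation
    kids = np.where(flip, 1 - kids, kids)
    pop = np.vstack([parents, kids])

best = pop[np.argmin([aic(m) for m in pop])]
print(sorted(np.flatnonzero(best).tolist()))
```

The AIC penalty usually, but not always, excludes the pure-noise columns, which is part of the pedagogical point.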

80

Survival analysis and regression models

Time-to-event outcomes are common in medical research as they offer more information than simply whether or not an event occurred. To handle these outcomes, as well as censored observations where the event was not observed during follow-up, survival analysis methods should be used. Kaplan-Meier estimation can be used to create graphs of the observed survival curves, while the log-rank test can be used to compare curves from different groups. If it is desired to test continuous predictors or t...

George, Brandon; Seals, Samantha; Aban, Inmaculada

2014-01-01
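The Kaplan-Meier estimator mentioned above can be computed by hand: at each observed event time, multiply the running survival probability by (1 - deaths/at-risk), with censored subjects leaving the risk set without contributing an event. The follow-up times and censoring flags below are invented.

```python
import numpy as np

times = np.array([2, 3, 3, 5, 6, 7, 9, 11])      # follow-up times
event = np.array([1, 1, 0, 1, 0, 1, 1, 0])       # 1 = event, 0 = censored

surv = 1.0
curve = {}
for t in np.unique(times[event == 1]):
    at_risk = np.sum(times >= t)                 # subjects still in follow-up
    deaths = np.sum((times == t) & (event == 1))
    surv *= 1 - deaths / at_risk
    curve[int(t)] = surv
print(curve)
```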

81

Multiple predictor smoothing methods for sensitivity analysis.

Energy Technology Data Exchange (ETDEWEB)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (1) locally weighted regression (LOESS), (2) additive models, (3) projection pursuit regression, and (4) recursive partitioning regression. The indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present.

Helton, Jon Craig; Storlie, Curtis B.

2006-08-01
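The first technique listed, locally weighted regression (LOESS), can be sketched in numpy: at each target point, fit a weighted line using a tricube kernel over a neighborhood. The sine data and the span are illustrative, not the performance-assessment inputs.

```python
import numpy as np

rng = np.random.default_rng(9)
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + rng.normal(0, 0.1, 100)

def loess_at(x0, span=0.15):
    # Weighted local-linear fit at x0 with tricube weights
    d = np.abs(x - x0)
    h = span * np.ptp(x)
    w = np.clip(1 - (d / h) ** 3, 0, 1) ** 3
    A = np.column_stack([np.ones_like(x), x - x0])
    AtW = A.T * w
    b = np.linalg.solve(AtW @ A, AtW @ y)
    return b[0]                          # local intercept = fit at x0

smooth = np.array([loess_at(v) for v in x])
print(np.max(np.abs(smooth - np.sin(x))) < 0.2)
```

Unlike a single global linear fit, the smoother tracks the nonlinear relationship, which is the motivation given in the abstract.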

82

Multiple predictor smoothing methods for sensitivity analysis

International Nuclear Information System (INIS)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (1) locally weighted regression (LOESS), (2) additive models, (3) projection pursuit regression, and (4) recursive partitioning regression. The indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

83

Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity

While multicollinearity may increase the difficulty of interpreting multiple regression results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret multiple regression effects include...

Kraha, Amanda; Zientek, Linda

2012-01-01

84

Directory of Open Access Journals (Sweden)

Full Text Available Problem statement: This study presents a novel method for the determination of the average winding temperature rise of transformers under predetermined field operating conditions. The rise in winding temperature was determined from the estimated values of winding resistance during the heat run test conducted as per the IEC standard. Approach: The estimation of hot resistance was modeled using Multiple Variable Regression (MVR), Multiple Polynomial Regression (MPR) and soft computing techniques such as the Artificial Neural Network (ANN) and the Adaptive Neuro Fuzzy Inference System (ANFIS). The modeled hot resistance helps to find the load losses in any load situation without a complicated measurement setup in transformers. Results: These techniques were applied to hot resistance estimation for a dry-type transformer using the input variables cold resistance, ambient temperature and temperature rise. The results are compared and show good agreement between measured and computed values. Conclusion: The proposed methods were verified using experimental results obtained from a temperature rise test performed on a 55 kVA dry-type transformer.
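As a rough illustration of the MVR-versus-MPR distinction, the sketch below fits both a purely linear model and a polynomial model (squares plus pairwise products) to entirely synthetic data generated from the standard copper resistance-temperature relation; the value ranges are invented and this is not the paper's measurement data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
r_cold = rng.uniform(0.5, 1.5, n)     # cold winding resistance (ohm), assumed range
t_amb = rng.uniform(15.0, 35.0, n)    # ambient temperature (deg C), assumed range
t_rise = rng.uniform(20.0, 60.0, n)   # winding temperature rise (K), assumed range

# synthetic "true" hot resistance from the copper relation
# R_hot = R_cold * (234.5 + t_amb + t_rise) / (234.5 + t_amb), plus noise
r_hot = r_cold * (234.5 + t_amb + t_rise) / (234.5 + t_amb)
r_hot += rng.normal(0, 0.002, n)

def fit(design, y):
    A = np.column_stack([np.ones(len(y)), design])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    yhat = A @ beta
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

X = np.column_stack([r_cold, t_amb, t_rise])
r2_mvr = fit(X, r_hot)                       # MVR: linear terms only
poly = np.column_stack([X, X**2,             # MPR: add squares and products
                        X[:, 0] * X[:, 1], X[:, 0] * X[:, 2], X[:, 1] * X[:, 2]])
r2_mpr = fit(poly, r_hot)
print(r2_mvr, r2_mpr)
```

The polynomial model nests the linear one, so its fit can only improve; the gain here comes from the resistance-by-temperature-rise interaction that the linear model cannot represent.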

M. Srinivasan

2012-01-01

85

Bayesian latent variable models for median regression on multiple outcomes.

Often a response of interest cannot be measured directly and it is necessary to rely on multiple surrogates, which can be assumed to be conditionally independent given the latent response and observed covariates. Latent response models typically assume that residual densities are Gaussian. This article proposes a Bayesian median regression modeling approach, which avoids parametric assumptions about residual densities by relying on an approximation based on quantiles. To accommodate within-subject dependency, the quantile response categories of the surrogate outcomes are related to underlying normal variables, which depend on a latent normal response. This underlying Gaussian covariance structure simplifies interpretation and model fitting, without restricting the marginal densities of the surrogate outcomes. A Markov chain Monte Carlo algorithm is proposed for posterior computation, and the methods are applied to single-cell electrophoresis (comet assay) data from a genetic toxicology study. PMID:12926714

Dunson, David B; Watson, M; Taylor, Jack A

2003-06-01

86

LOGISTIC REGRESSION ANALYSIS WITH STANDARDIZED MARKERS

Two different approaches to analysis of data from diagnostic biomarker studies are commonly employed. Logistic regression is used to fit models for probability of disease given marker values, while ROC curves and risk distributions are used to evaluate classification performance. In this paper we present a method that simultaneously accomplishes both tasks. The key step is to standardize markers relative to the nondiseased population before including them in the logistic reg...

Huang, Ying; Pepe, Margaret S.; Feng, Ziding

2013-01-01

87

In real-time analysis and forecasting of time series data, it is important to detect structural change as immediately, correctly, and simply as possible, and it is necessary to rebuild the next prediction model as soon as possible after the change point. For this kind of time series analysis, multiple linear regression models are generally used. In this paper, we present two methods, i.e., the Sequential Probability Ratio Test (SPRT) and the Chow test that is well known in economics, and describe experimental evaluations of their effectiveness in change detection using multiple regression models. Moreover, we extend the definition of the detected change point in the SPRT method, and show the improvement in change detection accuracy.
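The Chow test mentioned above compares a pooled regression against separate fits before and after a candidate change point. A minimal sketch on synthetic data with a slope change (assumed data, not the authors' experiments):

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def chow_statistic(X, y, split, k):
    """Chow F statistic for a structural break at index `split`;
    k = number of estimated coefficients (intercept + slopes)."""
    n = len(y)
    s_pool = rss(X, y)
    s1 = rss(X[:split], y[:split])
    s2 = rss(X[split:], y[split:])
    return ((s_pool - (s1 + s2)) / k) / ((s1 + s2) / (n - 2 * k))

rng = np.random.default_rng(2)
x = rng.normal(size=100)
# regime change at t = 50: slope jumps from 1.0 to 3.0
y = np.where(np.arange(100) < 50, 1.0 * x, 3.0 * x) + rng.normal(0, 0.5, 100)
f_break = chow_statistic(x.reshape(-1, 1), y, split=50, k=2)

# same test on a series with no break, for comparison
y_stable = 1.0 * x + rng.normal(0, 0.5, 100)
f_stable = chow_statistic(x.reshape(-1, 1), y_stable, split=50, k=2)
print(f_break, f_stable)
```

The statistic is referred to an F(k, n-2k) distribution; a large value, as for the broken series here, signals that separate pre- and post-break models fit far better than one pooled model.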

Takeda, Katsunori; Hattori, Tetsuo; Kawano, Hiromichi

88

Multiple Regression Model for Compressive Strength Prediction of High Performance Concrete

A mathematical model for the prediction of the compressive strength of high performance concrete was developed using statistical analysis of the concrete data obtained from the experimental work done in this study. The multiple non-linear regression model yielded an excellent correlation coefficient for the prediction of compressive strength at different ages (3, 7, 14, 28 and 91 days). The coefficient of correlation was 99.99% for each strength (at each age). Also, the model gives high correlat...

Zain, M. F. M.; Abd, S. M.

2009-01-01

89

Crime is an important determinant of public health outcomes, including quality of life, mental well-being, and health behavior. A body of research has documented the association between community social capital and crime victimization. The association between social capital and crime victimization has been examined at multiple levels of spatial aggregation, ranging from entire countries, to states, metropolitan areas, counties, and neighborhoods. In multilevel analysis, the spatial boundaries at level 2 are most often drawn from administrative boundaries (e.g., Census tracts in the U.S.). One problem with adopting administrative definitions of neighborhoods is that it ignores spatial spillover. We conducted a study of social capital and crime victimization in one ward of Tokyo city, using a spatial Durbin model with an inverse-distance weighting matrix that assigned each respondent a unique level of "exposure" to social capital based on all other residents' perceptions. The study is based on a postal questionnaire sent to 20-69 years old residents of Arakawa Ward, Tokyo. The response rate was 43.7%. We examined the contextual influence of generalized trust, perceptions of reciprocity, two types of social network variables, as well as two principal components of social capital (constructed from the above four variables). Our outcome measure was self-reported crime victimization in the last five years. In the spatial Durbin model, we found that neighborhood generalized trust, reciprocity, supportive networks and two principal components of social capital were each inversely associated with crime victimization. By contrast, a multilevel regression performed with the same data (using administrative neighborhood boundaries) found generally null associations between neighborhood social capital and crime. Spatial regression methods may be more appropriate for investigating the contextual influence of social capital in homogeneous cultural settings such as Japan. 
PMID:22901675
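The inverse-distance weighting scheme described above, in which each respondent's "exposure" to social capital is computed from all other residents' scores, can be sketched as follows. Coordinates and trust scores are made up, and the full spatial Durbin model fit is omitted; only the weight matrix and the spatial lag are shown.

```python
import numpy as np

rng = np.random.default_rng(10)
n = 50
coords = rng.uniform(0, 10, size=(n, 2))   # hypothetical respondent locations (km)
trust = rng.normal(0, 1, n)                # hypothetical individual trust scores

# inverse-distance weight matrix: zero diagonal, rows normalized to sum to 1
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
W = np.where(d > 0, 1.0 / np.where(d > 0, d, 1.0), 0.0)
W /= W.sum(axis=1, keepdims=True)

# each respondent's exposure to neighbourhood trust = spatial lag W @ trust
exposure = W @ trust
print(exposure[:3])
```

In a spatial Durbin model this lagged term (and a lag of the outcome) enters the regression alongside the individual-level covariates, replacing the administrative-boundary averages used in conventional multilevel analysis.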

Takagi, Daisuke; Ikeda, Ken'ichi; Kawachi, Ichiro

2012-11-01

90

Functional linear regression analysis for longitudinal data

We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may be modeled as a discrete random number and, accordingly, only a finite and asymptotically nonincreasing number of measurements are available for each subject or experimental unit. We propose a functional regression approach for this situation, using functional principal component analysis, where we estimate the functional principal component scores through conditional expectations. This allows the prediction of an unobserved response trajectory from sparse measurements of a predictor trajectory. The resulting technique is flexible and allow...

Yao, Fang; Müller, Hans-Georg; Wang, Jane-Ling

2005-01-01

91

Assessment method of study program: Results from regression analysis

Assessment is an important part of any university's programs, and various approaches to assessment have been used in determining students' grades for their subjects. This article therefore presents an empirical study aimed at finding the best way to determine student grades. Several predictors of the students' grades, i.e. total marks, were identified: coursework marks, mid-semester marks and final exam marks. Raw data from the database for a particular semester at one university on the east coast of Malaysia were used for this purpose. Correlational analysis was used to determine the strength of the association between the three predictors and the criterion variable, and multiple regression analysis was used to find the best regression model for the purpose of the study. Implications of the study are also discussed.
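The setup described, total marks regressed on component marks, can be sketched with synthetic data (the mark ranges below are hypothetical). Because the criterion here is the exact sum of the components, the fitted coefficients recover unit weights, while the correlations differ according to each component's spread.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 150
coursework = rng.uniform(10, 30, n)   # out of 30 (assumed scale)
midsem = rng.uniform(5, 20, n)        # out of 20 (assumed scale)
final = rng.uniform(15, 50, n)        # out of 50 (assumed scale)
total = coursework + midsem + final   # criterion: total marks

# multiple regression of total marks on the three components
X = np.column_stack([np.ones(n), coursework, midsem, final])
beta, *_ = np.linalg.lstsq(X, total, rcond=None)

# Pearson correlation of each predictor with the criterion
corrs = [np.corrcoef(p, total)[0, 1] for p in (coursework, midsem, final)]
print(beta, corrs)
```

Note that the final exam, having the widest mark range, shows the strongest correlation with the total even though all three components enter the total with the same unit weight, a useful reminder that correlation and regression coefficients answer different questions.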

Hamid, Mohd Rashid Bin Ab; Mohamed, Mohd Rusllim Bin; Mustafa, Zainol

2015-02-01

92

Regression Analysis for the Social Sciences

The book provides graduate students in the social sciences with the basic skills that they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly available social science data or drawn from the literature; thorough integration of teaching statistical theory with teaching data processing and analysis; teaching of both SAS and Stata "side-by-side"; and use of chapter exercises in which students practice programming

Gordon, Rachel A.

2012-01-01

93

An Effect Size for Regression Predictors in Meta-Analysis

A new effect size representing the predictive power of an independent variable from a multiple regression model is presented. The index, denoted as r[subscript sp], is the semipartial correlation of the predictor with the outcome of interest. This effect size can be computed when multiple predictor variables are included in the regression model…
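The index r_sp can be computed as the signed square root of the R² increment obtained when the predictor is entered last into the model. A minimal sketch with two correlated synthetic predictors (the data and effect sizes are invented for illustration):

```python
import numpy as np

def r_squared(X, y):
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    yhat = A @ beta
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

def semipartial(X, y, j):
    """r_sp for predictor j: signed sqrt of the R^2 increment when
    predictor j is added last to the model."""
    others = np.delete(X, j, axis=1)
    inc = r_squared(X, y) - r_squared(others, y)
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sign(beta[j + 1]) * np.sqrt(max(inc, 0.0))

rng = np.random.default_rng(4)
n = 300
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + 0.8 * rng.normal(size=n)   # correlated with x1
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)
X = np.column_stack([x1, x2])

rsp1 = semipartial(X, y, 0)
rsp2 = semipartial(X, y, 1)
print(rsp1, rsp2)
```

Because r_sp measures each predictor's unique contribution over and above the others, it is comparable across studies with different covariate sets, which is what makes it attractive as a meta-analytic effect size.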

Aloe, Ariel M.; Becker, Betsy Jane

2012-01-01

94

Sliced Inverse Regression for big data analysis

Modern advances in computing power have greatly widened scientists' scope in gathering and investigating information from many variables. We describe sliced inverse regression (SIR) for reducing the dimension of the input variable x without going through any parametric or nonparametric model-fitting process. This method exploits the simplicity of the inverse view of regression: instead of regressing the univariate output variable y against the multivariate x, we regress x against y. Forward r...
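A bare-bones SIR estimate (standardize x, slice by the sorted response, and eigen-decompose the covariance of the slice means) can be sketched as follows on synthetic single-index data; this is an illustrative sketch, not the author's code.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Sliced Inverse Regression: estimate dimension-reduction directions
    from the covariance of slice means of the whitened predictors."""
    n, p = X.shape
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T   # cov^{-1/2}
    Z = (X - mu) @ inv_sqrt                               # whitened predictors
    order = np.argsort(y)                                 # slice by sorted y
    M = np.zeros((p, p))
    for s in np.array_split(order, n_slices):
        m = Z[s].mean(axis=0)
        M += (len(s) / n) * np.outer(m, m)
    w, v = np.linalg.eigh(M)
    dirs = inv_sqrt @ v[:, ::-1][:, :n_dirs]  # top eigenvectors, mapped back
    return dirs / np.linalg.norm(dirs, axis=0)

rng = np.random.default_rng(5)
n, p = 500, 5
X = rng.normal(size=(n, p))
true_dir = np.array([1.0, -1.0, 0.0, 0.0, 0.0]) / np.sqrt(2)
y = (X @ true_dir) ** 3 + rng.normal(0, 0.5, n)   # single-index, nonlinear link

est = sir_directions(X, y).ravel()
cosine = abs(est @ true_dir)
print(cosine)
```

With a monotone link and elliptically distributed predictors, the slice means lie along the true direction, so the leading eigenvector recovers it without ever fitting the forward regression of y on x.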

Li, Kevin

2014-01-01

95

Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

Azen, Razia; Traxel, Nicole

2009-01-01

96

Most investigations of the adverse health effects of multiple air pollutants analyse the time series involved by simultaneously entering the multiple pollutants into a Poisson log-linear model. Concerns have been raised about this type of analysis, and it has been stated that new methodology or models should be developed for investigating the adverse health effects of multiple air pollutants. In this paper, we introduce the use of the lasso for this purpose and compare its statistical properties to those of ridge regression and the Poisson log-linear model. Ridge regression has been used in time series analyses on the adverse health effects of multiple air pollutants but its properties for this purpose have not been investigated. A series of simulation studies was used to compare the performance of the lasso, ridge regression, and the Poisson log-linear model. In these simulations, realistic mortality time series were generated with known air pollution mortality effects permitting the performance of the three models to be compared. Both the lasso and ridge regression produced more accurate estimates of the adverse health effects of the multiple air pollutants than those produced using the Poisson log-linear model. This increase in accuracy came at the expense of increased bias. Ridge regression produced more accurate estimates than the lasso, but the lasso produced more interpretable models. The lasso and ridge regression offer a flexible way of obtaining more accurate estimation of pollutant effects than that provided by the standard Poisson log-linear model.
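To illustrate the lasso-versus-ridge contrast on correlated "pollutant" predictors, here is a numpy-only sketch of ridge (closed form) and the lasso (cyclic coordinate descent) on synthetic Gaussian data. Note this simplifies the paper's Poisson log-linear time-series setting to an ordinary linear model, and the penalty values are arbitrary.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimate (no intercept; X assumed centred)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def lasso(X, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent with soft thresholding."""
    n, p = X.shape
    beta = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]        # partial residual
            rho = X[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
    return beta

rng = np.random.default_rng(6)
n, p = 200, 6
base = rng.normal(size=(n, 1))
X = 0.9 * base + 0.3 * rng.normal(size=(n, p))   # strongly correlated columns
X = (X - X.mean(0)) / X.std(0)
true = np.array([1.0, 0.0, 0.0, 0.5, 0.0, 0.0])
y = X @ true + rng.normal(0, 1.0, n)

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_ridge = ridge(X, y, lam=50.0)
b_lasso = lasso(X, y, lam=30.0)
print(b_ols, b_ridge, b_lasso)
```

Ridge shrinks the whole coefficient vector toward zero, stabilizing the collinear estimates, while the lasso sets some coefficients exactly to zero, which is the interpretability advantage the abstract notes.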

Roberts, Steven; Martin, Michael

97

Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity

Directory of Open Access Journals (Sweden)

Full Text Available While multicollinearity may increase the difficulty of interpreting multiple regression results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret multiple regression effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret multiple regression effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses.
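Two of the indices named above, beta weights and structure coefficients, can be computed directly. The sketch below uses synthetic collinear predictors to show how the two can tell different stories; the data are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 400
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # collinear with x1
y = x1 + x2 + rng.normal(size=n)
X = np.column_stack([x1, x2])

# beta weights: standardized regression coefficients
Z = (X - X.mean(0)) / X.std(0)
zy = (y - y.mean()) / y.std()
beta = np.linalg.lstsq(np.column_stack([np.ones(n), Z]), zy, rcond=None)[0][1:]

# structure coefficients: correlation of each predictor with y-hat
A = np.column_stack([np.ones(n), X])
yhat = A @ np.linalg.lstsq(A, y, rcond=None)[0]
structure = [np.corrcoef(X[:, j], yhat)[0, 1] for j in range(2)]
print(beta, structure)
```

Here both predictors have structure coefficients near 0.95 (each correlates strongly with the predicted scores) even though the beta weights split the credit between them, exactly the kind of divergence under multicollinearity that the article argues multiple indices are needed to expose.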

Kraha, Amanda

2012-03-01

98

Throughput Prediction of Fishing Goods Based on the Grey Multiple Linear Regression Method

Based on the grey prediction method and the multiple linear regression method, the grey multiple linear regression method is presented. This method was applied to the throughput prediction of fishing goods using five fishing ports' actual throughput data. Comparing its results with those of the one-dimensional time series linear regression method and the grey prediction method showed that this method of calculation and analysis was more effective and the forecasting precis...
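One ingredient of the combined method, the GM(1,1) grey prediction model, can be sketched as follows; the throughput figures below are made up, not the five ports' actual data.

```python
import numpy as np

def gm11_forecast(x0, steps=1):
    """GM(1,1) grey prediction: fit dx/dt + a*x = b on the accumulated
    (cumulative-sum) series and forecast the next `steps` original values."""
    n = len(x0)
    x1 = np.cumsum(x0)                            # accumulated series
    z = 0.5 * (x1[1:] + x1[:-1])                  # background values
    B = np.column_stack([-z, np.ones(n - 1)])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    k = np.arange(n + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    x0_hat = np.diff(x1_hat, prepend=0.0)         # de-accumulate
    x0_hat[0] = x0[0]
    return x0_hat[n:]

# hypothetical annual port throughput with roughly exponential growth
throughput = np.array([120.0, 131.0, 144.5, 158.0, 174.0])
forecast = gm11_forecast(throughput, steps=2)
print(forecast)
```

In the combined grey multiple linear regression approach, components like this supply the trend structure while the regression part relates throughput to multiple explanatory factors.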

Changping Chen; Changlu Zhou; Xueda Zhao; Yanna Zheng; Xianying Shi

2014-01-01

99

A Software Tool for Regression Analysis and its Assumptions

Nowadays, regression analysis is among the most important forecasting methods. In this method, the aim is to estimate the population regression model as accurately as possible, taking the sample regression function as a basis. Its results are valid under certain assumptions, and violations of these assumptions invalidate some properties of the estimators. In this study, a new object-oriented program concentrated only on regression analysis and its assumptions has be...

Sona Mardikyan; Darcan, Osman N.

2006-01-01

100

MULTIPLE LOGISTIC REGRESSION MODEL TO PREDICT RISK FACTORS OF ORAL HEALTH DISEASES

Directory of Open Access Journals (Sweden)

Full Text Available Purpose: To analyse the dependence of oral health diseases, i.e. dental caries and periodontal disease, on a number of risk factors through the application of a logistic regression model. Method: The cross-sectional study involved a systematic random sample of 1760 people with permanent dentition aged between 18-40 years in Dharwad, Karnataka, India. Dharwad is situated in North Karnataka. The mean age was 34.26±7.28 years. The risk factors of dental caries and periodontal disease were established by a multiple logistic regression model using SPSS statistical software. Results: Factors such as frequency of brushing, timing of cleaning teeth and type of toothpaste are significant persistent predictors of dental caries and periodontal disease. For dental caries, the log likelihood value of the full model is –1013.1364 and Akaike's Information Criterion (AIC) is 1.1752, compared with –1019.8106 and 1.1748 respectively for the reduced regression model. For periodontal disease, the log likelihood value of the full model is –1085.7876 and the AIC is 1.2577, compared with –1019.8106 and 1.1748 respectively for the reduced regression model. The area under the Receiver Operating Characteristic (ROC) curve for dental caries is 0.7509 (full model) and 0.7447 (reduced model); for periodontal disease it is 0.6128 (full model) and 0.5821 (reduced model). Conclusions: The frequency of brushing, timing of cleaning teeth and type of toothpaste are the main significant risk factors for dental caries and periodontal disease. The reduced logistic regression model fits slightly better than the full logistic regression model in identifying these risk factors for both dichotomous outcomes, dental caries and periodontal disease.

Parameshwar V. Pandit

2012-06-01

101

Spatial regression analysis on 32 years total column ozone data

Directory of Open Access Journals (Sweden)

Full Text Available Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1° × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979–2008) and two years of assimilated SCIAMACHY ozone data (2009–2010). The two-dimensionality of this data-set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on non-seasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño (ENSO) and stratospheric alternative halogens (EESC). For several explanatory variables, seasonally adjusted versions are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to those of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at high and mid-latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high Northern latitudes, the effect of the QBO is positive at the tropics and negative at mid to high latitudes, and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific.
The contribution of explanatory variables describing seasonal ozone variation is generally large at mid to high latitudes. We observe ozone contributing effects for potential vorticity and day length, negative effect on ozone for geopotential height and variable ozone effects due to the polar vortex at regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. In particular the recovery rates over Antarctica might not be statistically significant. Furthermore, the results show that there is no spatial homogeneous pattern which regression model and explanatory variables provide the best fit to the data and the most accurate estimates of the recovery rates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.

J. S. Knibbe

2014-02-01

102

Spatial regression analysis on 32 years total column ozone data

Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1° × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and two years of assimilated SCIAMACHY ozone data (2009-2010). The two-dimensionality of this data-set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on non-seasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño (ENSO) and stratospheric alternative halogens (EESC). For several explanatory variables, seasonally adjusted versions are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to those of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at high and mid-latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high Northern latitudes, the effect of the QBO is positive at the tropics and negative at mid to high latitudes, and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific.
The contribution of explanatory variables describing seasonal ozone variation is generally large at mid to high latitudes. We observe ozone contributing effects for potential vorticity and day length, negative effect on ozone for geopotential height and variable ozone effects due to the polar vortex at regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. In particular the recovery rates over Antarctica might not be statistically significant. Furthermore, the results show that there is no spatial homogeneous pattern which regression model and explanatory variables provide the best fit to the data and the most accurate estimates of the recovery rates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.

Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.

2014-02-01

103

Directory of Open Access Journals (Sweden)

Full Text Available Both linear neural network and multiple linear regression models can be used for multi-factor analysis and forecasting, but the data for the multiple linear regression model are required to meet conditions such as independence and normality, while the data for the linear neural network are only required to have a linear relationship. This article uses the same set of data to establish a linear neural network model and a multiple linear regression model, compares the fitting and forecasting abilities of the two kinds of models, and concludes that the linear neural network method has a stronger fitting ability and a more stable predictive ability, so that it can be further applied and promoted in analyzing and forecasting continuous data.

Guoli Wang

2011-10-01

104

Regression models are being used to quantify the effect of an exposure on an outcome, while adjusting for potential confounders. While the type of regression model to be used is determined by the nature of the outcome variable, e.g. linear regression has to be applied for continuous outcome variables, all regression models can handle any kind of exposure variables. However, some fundamentals of representation of the exposure in a regression model and also some potential pitfalls have to be kept in mind in order to obtain meaningful interpretation of results. The objective of this educational paper was to illustrate these fundamentals and pitfalls, using various multiple regression models applied to data from a hypothetical cohort of 3000 patients with chronic kidney disease. In particular, we illustrate how to represent different types of exposure variables (binary, categorical with two or more categories and continuous), and how to interpret the regression coefficients in linear, logistic and Cox models. We also discuss the linearity assumption in these models, and show how wrongly assuming linearity may produce biased results and how flexible modelling using spline functions may provide better estimates. PMID:24366898
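The representation of binary, categorical and continuous exposures can be illustrated with a small synthetic cohort; the variable names, scales and effect sizes below are invented for illustration and are not the paper's hypothetical CKD data.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 3000
smoker = rng.integers(0, 2, n)    # binary exposure (0/1)
stage = rng.integers(0, 3, n)     # categorical exposure with 3 levels
age = rng.uniform(30, 80, n)      # continuous exposure

# synthetic continuous outcome (e.g. a kidney-function measure)
outcome = 90 - 5 * smoker - 10 * stage - 0.5 * (age - 50) + rng.normal(0, 5, n)

# dummy (reference) coding: level 0 of `stage` is the reference category
d1 = (stage == 1).astype(float)
d2 = (stage == 2).astype(float)
X = np.column_stack([np.ones(n), smoker, d1, d2, age - 50])  # age centred at 50
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)

# beta[1]: exposed-vs-unexposed difference, holding stage and age fixed
# beta[2], beta[3]: levels 1 and 2 versus the reference level 0
# beta[4]: change in outcome per additional year of age
print(beta)
```

Centring the continuous exposure makes the intercept interpretable (expected outcome for an unexposed, reference-level subject of age 50), and the two dummies show why a k-level category needs k-1 coefficients, all relative to the chosen reference.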

Leffondré, Karen; Jager, Kitty J; Boucquemont, Julie; Stel, Vianda S; Heinze, Georg

2014-10-01

105

Tucker Tensor Regression and Neuroimaging Analysis

Large-scale neuroimaging studies have been collecting brain images of study individuals, which take the form of two-dimensional, three-dimensional, or higher dimensional arrays, also known as tensors. Addressing scientific questions arising from such data demands new regression models that take multidimensional arrays as covariates. Simply turning an image array into a long vector causes extremely high dimensionality that compromises classical regression methods, and, more s...

Li, Xiaoshan; Zhou, Hua; Li, Lexin

2013-01-01

106

Multivariate regression is a common statistical tool for practical problems. Many multivariate regression techniques are designed for univariate response cases. For problems with multiple response variables available, one common approach is to apply the univariate response regression technique separately on each response variable. Although it is simple and popular, the univariate response approach ignores the joint information among response variables. In this paper, we propose three new meth...

Lee, Wonyul; Liu, Yufeng

2012-01-01

107

Background Removable dentures are subject to plaque and/or staining problems. Denture hygiene habits and risk factors differ among countries and regions. The aims of this study were to assess hygiene habits and denture plaque and staining risk factors in Chinese removable denture wearers aged >40 years in Xi’an through multiple logistic regression analysis (MLRA). Methods Questionnaires were administered to 222 patients whose removable dentures were examined clinically to assess wear status and levels of plaque and staining. Univariate analyses were performed to identify potential risk factors for denture plaque/staining. MLRA was performed to identify significant risk factors. Results Brushing (77.93%) was the most prevalent cleaning method in the present study. Only 16.4% of patients regularly used commercial cleansers. Most (81.08%) patients removed their dentures overnight. MLRA indicated that potential risk factors for denture plaque were the duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 4.155, P = 0.001; >5 years: OR = 7.238, P…) and cleaning method (reference, chemical cleanser; running water: OR = 7.081, P = 0.010; brushing: OR = 3.567, P = 0.005). Potential risk factors for denture staining were female gender (OR = 0.377, P = 0.013), smoking (OR = 5.471, P = 0.031), tea consumption (OR = 3.957, P = 0.002), denture scratching (OR = 4.557, P = 0.036), duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 7.899, P = 0.001; >5 years: OR = 27.226, P…) and cleaning method (reference, chemical cleanser; running water: OR = 29.184, P…). Conclusions Denture hygiene habits need further improvement. An understanding of the risk factors for denture plaque and staining may provide the basis for preventive efforts. PMID:24498369

Chai, Zhiguo; Chen, Jihua; Zhang, Shaofeng

2014-01-01

108

Modeling Lateral and Longitudinal Control of Human Drivers with Multiple Linear Regression Models

In this paper, we describe results to model lateral and longitudinal control behavior of drivers with simple linear multiple regression models. This approach fits into the Bayesian Programming (BP) approach (Bessi

Lenk, Jan; M, Claus

2011-01-01

109

Comparison of Fuzzy Inference System and Multiple Regression to Predict Synthetic Envelopes Clogging

Geo-synthetic materials are being used with acceptable performance in soil and water projects worldwide. Geotextiles are one of the categories of geo-synthetics used in drainage systems. The first generation of geotextiles was used in the late 1950s as an alternative to gravel envelopes. In this research, two methods (multiple regression and a fuzzy inference system) were evaluated to predict synthetic envelope clogging. In the multiple regression method the correlation coefficients for PP450, PP700 ...

Bakhtiar Karimi; Farhad Mirzaei; Mohammad Javad Nahvinia; Behnam Ababaei

2010-01-01

110

Directory of Open Access Journals (Sweden)

Full Text Available Landslides are a natural hazard that causes much damage to the environment. Depending on the landform, several factors can cause landslides. This research addresses a methodology for landslide susceptibility mapping using multiple regression analysis and GIS tools. Based on the initial hypothesis, ten factors were recognized as influential elements for landslides, including geology, slope, aspect, distance from roads, faults and the drainage network, soil capability, land use and rainfall. Crossing the investigated parameters with the observed landslides indicated that three factors, namely distance from the channel network, distance from faults and rainfall, have no major effect on the observed landslides in the Tajan area. In order to quantify the parameters in the form of weighting factors, the coverage of landslides in different observations was determined. The stepwise method was then used for statistical analysis. It was found that slope, aspect, distance from roads and soil capability are the most effective factors for landslides, respectively.

Somayeh Mashari

2012-07-01

111

Neutron multiplicity analysis tool

International Nuclear Information System (INIS)

I describe the capabilities of the EXCOM (EXcel based COincidence and Multiplicity) calculation tool which is used to analyze experimental data or simulated neutron multiplicity data. The input to the program is the count-rate data (including the multiplicity distribution) for a measurement, the isotopic composition of the sample and relevant dates. The program carries out deadtime correction and background subtraction and then performs a number of analyses. These are: passive calibration curve, known alpha and multiplicity analysis. The latter is done with both the point model and with the weighted point model. In the current application EXCOM carries out the rapid analysis of Monte Carlo calculated quantities and allows the user to determine the magnitude of sample perturbations that lead to systematic errors. Neutron multiplicity counting is an assay method used in the analysis of plutonium for safeguards applications. It is widely used in nuclear material accountancy by international (IAEA) and national inspectors. The method uses the measurement of the correlations in a pulse train to extract information on the spontaneous fission rate in the presence of neutrons from (α,n) reactions and induced fission. The measurement is relatively simple to perform and gives results very quickly (≤ 1 hour). By contrast, destructive analysis techniques are extremely costly and time consuming (several days). By improving the achievable accuracy of neutron multiplicity counting, a nondestructive analysis technique, it could be possible to reduce the use of destructive analysis measurements required in safeguards applications. The accuracy of a neutron multiplicity measurement can be affected by a number of variables such as density, isotopic composition, chemical composition and moisture in the material.
In order to determine the magnitude of these effects on the measured plutonium mass a calculational tool, EXCOM, has been produced using VBA within Excel. This program was developed to help speed the analysis of Monte Carlo neutron transport simulation (MCNP) data, and only requires the count-rate data to calculate the mass of material using INCC's analysis methods instead of the full neutron multiplicity distribution required to run analysis in INCC. This paper describes what is implemented within EXCOM, including the methods used, how the program corrects for deadtime, and how uncertainty is calculated. This paper also describes how to use EXCOM within Excel.

112

1. There are two reasons for wanting to compare measurers or methods of measurement. One is to calibrate one method or measurer against another; the other is to detect bias. Fixed bias is present when one method gives higher (or lower) values across the whole range of measurement. Proportional bias is present when one method gives values that diverge progressively from those of the other. 2. Linear regression analysis is a popular method for comparing methods of measurement, but the familiar ordinary least squares (OLS) method is rarely acceptable. The OLS method requires that the x values are fixed by the design of the study, whereas it is usual that both y and x values are free to vary and are subject to error. In this case, special regression techniques must be used. 3. Clinical chemists favour techniques such as major axis regression ('Deming's method'), the Passing-Bablok method or the bivariate least median squares method. Other disciplines, such as allometry, astronomy, biology, econometrics, fisheries research, genetics, geology, physics and sports science, have their own preferences. 4. Many Monte Carlo simulations have been performed to try to decide which technique is best, but the results are almost uninterpretable. 5. I suggest that pharmacologists and physiologists should use ordinary least products regression analysis (geometric mean regression, reduced major axis regression): it is versatile, can be used for calibration or to detect bias and can be executed by hand-held calculator or by using the loss function in popular, general-purpose, statistical software. PMID:20337658
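The ordinary least products (geometric mean, reduced major axis) estimator recommended above has a simple closed form. A minimal sketch in Python, with illustrative data that are not from the paper:

```python
import numpy as np

def olp_regression(x, y):
    """Ordinary least products (geometric mean / reduced major axis) fit,
    appropriate when both x and y are subject to error."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    r = np.corrcoef(x, y)[0, 1]
    # |slope| is the ratio of standard deviations; the sign follows r.
    slope = np.sign(r) * (y.std(ddof=1) / x.std(ddof=1))
    intercept = y.mean() - slope * x.mean()
    return slope, intercept

# Hypothetical paired readings from two measurement methods:
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
slope, intercept = olp_regression(x, y)
```

In the method-comparison setting described above, a slope far from 1 would indicate proportional bias and an intercept far from 0 fixed bias.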

Ludbrook, John

2010-07-01

113

Multinomial Inverse Regression for Text Analysis

Text data, including speeches, stories, and other document forms, are often connected to sentiment variables that are of interest for research in marketing, economics, and elsewhere. Such data are also very high dimensional and difficult to incorporate into statistical analyses. This article introduces a straightforward framework of sentiment-preserving dimension reduction for text data. Multinomial inverse regression is introduced as a general tool for simplifying predictor sets th...

Taddy, Matt

2010-01-01

114

Multiple predictor smoothing methods for sensitivity analysis: Example results

International Nuclear Information System (INIS)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described in the first part of this presentation: (i) locally weighted regression (LOESS), (ii) additive models, (iii) projection pursuit regression, and (iv) recursive partitioning regression. In this, the second and concluding part of the presentation, the indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

115

Multiple predictor smoothing methods for sensitivity analysis: Description of techniques

International Nuclear Information System (INIS)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (i) locally weighted regression (LOESS), (ii) additive models, (iii) projection pursuit regression, and (iv) recursive partitioning regression. Then, in the second and concluding part of this presentation, the indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

116

The Precision Efficacy Analysis for Regression Sample Size Method.

The general purpose of this study was to examine the efficiency of the Precision Efficacy Analysis for Regression (PEAR) method for choosing appropriate sample sizes in regression studies used for precision. The PEAR method, which is based on the algebraic manipulation of an accepted cross-validity formula, essentially uses an effect size to…

Brooks, Gordon P.; Barcikowski, Robert S.

117

A multilevel regression-analysis-based nonlocal means denoising algorithm

This paper focuses on image denoising under the powerful non-local means (NL-means) framework. First, the introduction and development of NL-means is discussed. Second, a scheme based on linear regression analysis for classifying meaningful image parts is proposed. Third, an improved version of NL-means is carried out, which uses a novel patch similarity rule based on quadratic regression analysis. This multilevel regression analysis based algorithm can better describe and smooth the noisy image. Finally, experimental results validate the algorithm in both effectiveness and efficiency.

Xu, Jin; Zheng, Pengcheng; Lv, Rui

2011-06-01

118

Directory of Open Access Journals (Sweden)

Full Text Available The triggers of forest area loss in Cameroon have not been properly understood. The measures used to curb forest area loss have been simplistic and generalized, with no clear-cut knowledge of the specific role of different potential factors. This study investigates the hypothesis that population growth is the main cause of loss in forest area, and seeks to identify which factors are of more significance in the causal equation. The open-source R programming software was used to produce multiple linear regression models. The correlation between the dependent variable and the independent variables was established by a correlation matrix, and the strength of the models was tested by power analysis. The results support the hypothesis that population growth is the most dominant cause of deforestation in Cameroon, while arable production and permanent crop land, and the arable production per capita index, rank second and third respectively.

Epule Terence Epule

2011-08-01

119

Analysis and forecasting of air quality parameters are important topics of atmospheric and environmental research today due to the health impact caused by air pollution. This study examines the transformation of nitrogen dioxide (NO2) into ozone (O3) in an urban environment using time series plots. Data on the concentration of environmental pollutants and meteorological variables were employed to predict the concentration of O3 in the atmosphere. The possibility of employing multiple linear regression models as a tool for prediction of O3 concentration was tested. Results indicated that the presence of NO2 and sunshine influence the concentration of O3 in Malaysia. The influence of the previous hour's ozone on the next hour's concentrations was also demonstrated. PMID:19440846
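A minimal sketch of this kind of multiple linear regression predictor, using simulated data (the variable ranges and coefficients are invented for illustration, not values from the study):

```python
import numpy as np

# Simulate hourly predictors: NO2 concentration and sunshine hours
# (hypothetical ranges), plus an ozone response with known coefficients.
rng = np.random.default_rng(0)
n = 200
no2 = rng.uniform(10.0, 60.0, n)        # NO2 concentration (illustrative)
sunshine = rng.uniform(0.0, 12.0, n)    # sunshine hours (illustrative)
ozone = 5.0 + 0.4 * no2 + 1.5 * sunshine + rng.normal(0.0, 1.0, n)

# Least-squares fit of ozone on an intercept, NO2 and sunshine.
X = np.column_stack([np.ones(n), no2, sunshine])
coef, *_ = np.linalg.lstsq(X, ozone, rcond=None)
```

The fitted `coef` recovers the intercept and the two slopes, which is the form of model the abstract describes for O3 prediction.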

Ghazali, Nurul Adyani; Ramli, Nor Azam; Yahaya, Ahmad Shukri; Yusof, Noor Faizah Fitri M D; Sansuddin, Nurulilyana; Al Madhoun, Wesam Ahmed

2010-06-01

120

Directory of Open Access Journals (Sweden)

Full Text Available Polyethylene glycol (PEG) is the most common preservative in use for bulking and maintaining structural integrity in waterlogged wood. Conservators therefore need to be able to determine PEG concentrations in wood in a non-destructive manner. We present a study highlighting the application of infrared spectroscopy coupled with multivariate analysis techniques to predict the concentration of polyethylene glycol 400 (PEG-400) and water simultaneously. This technique uses attenuated total reflectance (ATR) spectroscopy and unconstrained stepwise multiple linear regression (SMLR) analysis for prediction of multiple components in archaeological wood. Using this model we have calculated the concentration of PEG-400 and water in treated archaeological waterlogged wood samples.

Rohan PATEL

2012-03-01

121

A method for the analysis of capillary column polychlorinated biphenyl (PCB) data using regression analysis with outlier checking and elimination, COMSTAR, is presented and evaluated. This algorithm determines the combination of commercial PCB mixtures that best fits the...

122

In the logistic regression analysis of a small-sized, case-control study on Alzheimer's disease, some of the risk factors exhibited missing values, motivating the use of multiple imputation. Usually, Rubin's rules (RR) for combining point estimates and variances would then be used to estimate (symmetric) confidence intervals (CIs), on the assumption that the regression coefficients were distributed normally. Yet, rarely is this assumption tested, with or without transformation. In analyses of small, sparse, or nearly separated data sets, such symmetric CIs may not be reliable. Thus, RR alternatives have been considered, for example, Bayesian sampling methods, but not yet those that combine profile likelihoods, particularly penalized profile likelihoods, which can remove first order biases and guarantee convergence of parameter estimation. To fill the gap, we consider the combination of penalized likelihood profiles (CLIP) by expressing them as posterior cumulative distribution functions (CDFs) obtained via a chi-squared approximation to the penalized likelihood ratio statistic. CDFs from multiple imputations can then easily be averaged into a combined CDF, CDFc, allowing confidence limits for a parameter β at level 1 − α to be identified as those β* and β** that satisfy CDFc(β*) = α/2 and CDFc(β**) = 1 − α/2. We demonstrate that the CLIP method outperforms RR in analyzing both simulated data and data from our motivating example. CLIP can also be useful as a confirmatory tool, should it show that the simpler RR are adequate for extended analysis. We also compare the performance of CLIP to Bayesian sampling methods using Markov chain Monte Carlo. CLIP is available in the R package logistf. PMID:23873477
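The CDF-averaging step can be sketched as follows. Normal CDFs stand in here for the penalized-likelihood CDFs of the actual method, and the per-imputation estimates and standard errors are invented for illustration:

```python
from math import erf, sqrt

# Hypothetical per-imputation point estimates and standard errors for one
# regression coefficient beta, from three imputed data sets:
estimates = [0.80, 1.10, 0.95]
ses = [0.30, 0.35, 0.28]

def cdf_c(beta):
    """Combined CDF: simple average of the per-imputation CDFs."""
    vals = [0.5 * (1.0 + erf((beta - m) / (s * sqrt(2.0))))
            for m, s in zip(estimates, ses)]
    return sum(vals) / len(vals)

def invert_cdf(target, lo=-10.0, hi=10.0, tol=1e-9):
    """Bisection: find beta with cdf_c(beta) == target."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf_c(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = 0.05
lower = invert_cdf(alpha / 2)        # beta* with CDFc(beta*) = alpha/2
upper = invert_cdf(1 - alpha / 2)    # beta** with CDFc(beta**) = 1 - alpha/2
```

The interval `[lower, upper]` is the combined confidence interval in the sense of the abstract; the real method obtains each per-imputation CDF from a penalized likelihood ratio statistic rather than a normal approximation.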

Heinze, Georg; Ploner, Meinhard; Beyea, Jan

2013-12-20

123

The Synthesis of Regression Slopes in Meta-Analysis

Research on methods of meta-analysis (the synthesis of related study results) has dealt with many simple study indices, but less attention has been paid to the issue of summarizing regression slopes. In part this is because of the many complications that arise when real sets of regression models are accumulated. We outline the complexities involved in synthesizing slopes, describe existing methods of analysis and present a multivariate generalized least squares approach to t...

Becker, Betsy Jane; Wu, Meng-jia

2008-01-01

124

Principal Regression Analysis and the index leverage effect

We revisit the index leverage effect, which can be decomposed into a volatility effect and a correlation effect. We investigate the latter using a matrix regression analysis, which we call 'Principal Regression Analysis' (PRA) and for which we provide some analytical (using Random Matrix Theory) and numerical benchmarks. We find that downward index trends increase the average correlation between stocks (as measured by the most negative eigenvalue of the conditional correlation...

Reigneron, Pierre-alain; Allez, Romain; Bouchaud, Jean-philippe

2010-01-01

125

On two flexible methods of 2-dimensional regression analysis.

Czech Academy of Sciences Publication Activity Database

Roč. 18, č. 4 (2012), s. 154-164. ISSN 1803-9782. Other grant: GA ČR(CZ) GAP209/10/2045. Institutional support: RVO:67985556. Keywords: regression analysis * Gordon surface * prediction error * projection pursuit. Subject RIV: BB - Applied Statistics, Operational Research. http://library.utia.cas.cz/separaty/2013/SI/volf-on two flexible methods of 2-dimensional regression analysis.pdf

Volf, Petr

2012-01-01

126

Overdispersed logistic regression for SAGE: Modelling multiple groups and covariates

Abstract Background Two major identifiable sources of variation in data derived from the Serial Analysis of Gene Expression (SAGE) are within-library sampling variability and between-library heterogeneity within a group. Most published methods for identifying differential expression focus on just the sampling variability. In recent work, the problem of assessing differential expression between two groups of SAGE libraries has been addressed by introducing a beta-binomial hierarchical model th...

Morris Jeffrey S; Deng Li; Baggerly Keith A; Marcelo, Aldaz C.

2004-01-01

127

Benchmark Dose Analysis via Nonparametric Regression Modeling.

Estimation of benchmark doses (BMDs) in quantitative risk assessment traditionally is based upon parametric dose-response modeling. It is a well-known concern, however, that if the chosen parametric model is uncertain and/or misspecified, inaccurate and possibly unsafe low-dose inferences can result. We describe a nonparametric approach for estimating BMDs with quantal-response data based on an isotonic regression method, and also study use of corresponding, nonparametric, bootstrap-based confidence limits for the BMD. We explore the confidence limits' small-sample properties via a simulation study, and illustrate the calculations with an example from cancer risk assessment. It is seen that this nonparametric approach can provide a useful alternative for BMD estimation when faced with the problem of parametric model uncertainty. PMID:23683057
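The isotonic fit at the core of such an approach can be computed with the pool-adjacent-violators algorithm (PAVA). A minimal sketch with invented dose-response proportions (not data from the paper):

```python
def pava(y, w):
    """Pool-adjacent-violators: returns the non-decreasing sequence
    minimizing the weighted squared error to y."""
    y, w = list(map(float, y)), list(map(float, w))
    vals, wts, counts = [], [], []
    for yi, wi in zip(y, w):
        vals.append(yi); wts.append(wi); counts.append(1)
        # Merge adjacent blocks while monotonicity is violated.
        while len(vals) > 1 and vals[-2] > vals[-1]:
            wtot = wts[-2] + wts[-1]
            vals[-2] = (wts[-2] * vals[-2] + wts[-1] * vals[-1]) / wtot
            wts[-2] = wtot
            counts[-2] += counts[-1]
            vals.pop(); wts.pop(); counts.pop()
    out = []
    for v, c in zip(vals, counts):
        out.extend([v] * c)
    return out

# Observed response proportions at six increasing doses (illustrative):
props = [0.05, 0.12, 0.08, 0.30, 0.25, 0.60]
fit = pava(props, [1] * 6)
```

The resulting `fit` is the monotone dose-response estimate from which a benchmark dose could then be read off at a chosen benchmark response level.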

Piegorsch, Walter W; Xiong, Hui; Bhattacharya, Rabi N; Lin, Lizhen

2013-05-17

128

Comparison of Fuzzy Inference System and Multiple Regression to Predict Synthetic Envelopes Clogging

Directory of Open Access Journals (Sweden)

Full Text Available Geo-synthetic materials are being used with acceptable performance in soil and water projects worldwide. Geotextiles are one of the categories of geo-synthetics used in drainage systems. The first generation of geotextiles was used in the late 1950s as an alternative to gravel envelopes. In this research, two methods (multiple regression and a fuzzy inference system) were evaluated for predicting synthetic envelope clogging. With the multiple regression method, the correlation coefficients for PP450, PP700 and PP900 are 62.66%, 79.37% and 90.62%, respectively. The results of the fuzzy inference system with a decision tree showed that this method has high potential in comparison with multiple regression: the total classification accuracies for PP450, PP700 and PP900 are 98.6%, 97.3% and 98%, respectively. The final results of this research showed that fuzzy inference systems using a decision tree have high potential for predicting clogging in envelopes.

Bakhtiar Karimi

2010-07-01

129

Analysis of genome-wide association data by large-scale Bayesian logistic regression.

Single-locus analysis is often used to analyze genome-wide association (GWA) data, but such analysis is subject to severe multiple comparisons adjustment. Multivariate logistic regression is proposed to fit a multi-locus model for case-control data. However, when the sample size is much smaller than the number of single-nucleotide polymorphisms (SNPs) or when correlation among SNPs is high, traditional multivariate logistic regression breaks down. To accommodate the scale of data from a GWA while controlling for collinearity and overfitting in a high dimensional predictor space, we propose a variable selection procedure using Bayesian logistic regression. We explore a connection between Bayesian regression with certain priors and L1- and L2-penalized logistic regression. After analyzing a large number of SNPs simultaneously in a Bayesian regression, we select important SNPs for further consideration. With far fewer SNPs of interest, the problems of multiple comparisons and collinearity are less severe. We conducted simulation studies to examine the probability of correctly selecting disease-contributing SNPs, and applied the developed methods to analyze Genetic Analysis Workshop 16 North American Rheumatoid Arthritis Consortium data. PMID:20018005
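The connection noted above between Bayesian priors and penalized logistic regression can be illustrated with the Gaussian-prior case, which corresponds to an L2 (ridge) penalty. A minimal gradient-descent sketch on simulated SNP-like data (all dimensions and effect sizes are invented):

```python
import numpy as np

# Simulated case-control data: 400 subjects, 10 standardized predictors,
# of which only the first two truly affect disease status.
rng = np.random.default_rng(1)
n, p = 400, 10
X = rng.normal(size=(n, p))
true_beta = np.zeros(p)
true_beta[:2] = [1.5, -2.0]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

# L2-penalized (ridge) logistic regression by gradient descent; the
# penalty weight lam plays the role of the Gaussian prior's precision.
lam = 1.0
beta = np.zeros(p)
for _ in range(3000):
    prob = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (prob - y) + lam * beta
    beta -= 0.1 * grad / n
```

The fitted `beta` concentrates on the two truly associated predictors while the penalty keeps the irrelevant coefficients near zero, which is the behaviour the abstract exploits for SNP screening (a Laplace prior would give the sparser L1 analogue).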

Wang, Yuanjia; Sha, Nanshi; Fang, Yixin

2009-01-01

130

Egg hatchability prediction by multiple linear regression and artificial neural networks

Scientific Electronic Library Online (English)

Full Text Available An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. Network trainings were made by the backpropagation algorithm based on data obtained [...] from industrial incubations. The ANN model was chosen as it produced data that fit the experimental data better than the multiple linear regression model, whose coefficients were determined by the least squares method. The simulation results of these approaches indicate that this ANN can be used for incubation performance prediction.

AC, Bolzan; RAF, Machado; JCZ, Piaia.

2008-06-01

131

A multiple regression model for predicting airwaves in shallow water sea bed logging data

This paper focuses on formulating a multiple regression model, using matrix notation, that can be used to predict the magnitude of airwaves in Shallow Water Sea Bed Logging (SBL) data. The term airwaves refers to the EM signals propagated from the source antenna via the atmosphere, induced along the air/sea surface, that interfere with the subsurface signal. In shallow water, the airwaves can mask other subsurface responses possibly containing valuable information about resistive subsurface structure such as a hydrocarbon reservoir. A fair representation of SBL environments was simulated to generate the airwaves data. The magnitude of airwaves at selected offsets is used as the dependent variable, while the predictor (independent) variables for the proposed multiple regression model are frequency, seawater depth, seawater conductivity, sediment conductivity and offset. Akaike's Information Criterion (AIC) is used for selecting among the multiple regression models. The formulated regression model is benchmarked against the well-known theoretical space-domain expression for airwave estimation. The model shows goodness of fit with an R2 of 0.9561, and the overall statistical significance of the estimated parameters gives an F-value of 19.35. The result indicates that the magnitudes of airwaves predicted by the regression model are approximately consistent with the theoretical model.
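AIC-based selection among candidate least-squares models, as used above, can be sketched as follows (simulated data; the variable names echo the study's predictors but every number is invented):

```python
import numpy as np

# Simulated predictors and a response that truly depends on only two of them.
rng = np.random.default_rng(2)
n = 200
freq = rng.uniform(0.1, 10.0, n)      # source frequency (illustrative)
depth = rng.uniform(10.0, 100.0, n)   # seawater depth (illustrative)
offset = rng.uniform(0.5, 10.0, n)    # source-receiver offset (illustrative)
airwave = 2.0 + 0.8 * freq - 0.05 * depth + rng.normal(0.0, 0.5, n)

def aic(X, y):
    """Gaussian-likelihood AIC up to a constant: n*log(RSS/n) + 2k."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    return n * np.log(rss / n) + 2 * X.shape[1]

ones = np.ones(n)
m1 = aic(np.column_stack([ones, freq]), airwave)
m2 = aic(np.column_stack([ones, freq, depth]), airwave)
m3 = aic(np.column_stack([ones, freq, depth, offset]), airwave)
```

The model with the lowest AIC is preferred; here adding the truly relevant `depth` term lowers the criterion sharply, while the 2-per-parameter penalty discourages keeping an irrelevant term like `offset`.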

Abdulkarim, Muhammad; Shafi, Afza; Razali, Radzuan; Ansari, Adeel

2014-10-01

132

A simple regression model for network meta-analysis

Introduction: The aim of this paper is to propose a transparent, alternative approach for network meta-analysis based on a regression model that allows inclusion of studies with three or more treatment arms. Methodology: Based on the contingency tables describing the frequency distribution of the outcome in the different intervention arms, a data set is constructed. A logistic regression is used to determine the parameters describing the difference in effect between a specific interventio...

Kessels, A. G. H.; Ter Riet, G.; Puhan, Milo A.; Kleijnen, J.; Bachmann, L. M.; Minder, C.

2013-01-01

133

Regression Analysis of Censored Data with Applications in Perimetry

This thesis treats regression analysis when either the dependent or the independent variable is censored. We deal with quantile regression when the dependent variable is censored. Using the independence between the true values and the censoring limits the quantile function for the true values can be rewritten as another quantile function of the observed, censored values, where the quantile value itself is a function of the censoring distribution. The quantile value is estimated non-parametric...

Lindgren, Anna

1999-01-01

134

Robust In-Car Speech Recognition Based on Nonlinear Multiple Regressions

Directory of Open Access Journals (Sweden)

Full Text Available We address issues for improving hands-free speech recognition performance in different car environments using a single distant microphone. In this paper, we propose a nonlinear multiple-regression-based enhancement method for in-car speech recognition. In order to develop a data-driven in-car recognition system, we develop an effective algorithm for adapting the regression parameters to different driving conditions. We also devise a model compensation scheme by synthesizing the training data using the optimal regression parameters and by selecting the optimal HMM for the test speech. Based on isolated word recognition experiments conducted in 15 real car environments, the proposed adaptive regression approach shows an advantage in average relative word error rate (WER) reductions of 52.5% and 14.8%, compared to the original noisy speech and the ETSI advanced front end, respectively.

Itakura Fumitada

2007-01-01

135

We report a case of tumor regression of multiple bone metastases from breast carcinoma after administration of strontium-89 chloride. This case suggests that strontium-89 chloride can not only relieve pain from bone metastases that is unresponsive to analgesics, but may also have a tumoricidal effect on bone metastases.

Heianna, Joichi; Miyauchi, Takaharu; Endo, Wataru; Miura, Naoki; Terui, Kazuyuki; Kamata, Syuichi; Hashimoto, Manabu

2014-01-01

136

Predicting Dropouts of University Freshmen: A Logit Regression Analysis.

Stepwise discriminant analysis coupled with logit regression analysis of freshmen data from Brandon University (Manitoba) indicated that six tested variables drawn from research on university dropouts were useful in predicting attrition: student status, residence, financial sources, distance from home town, goal fulfillment, and satisfaction with…

Lam, Y. L. Jack

1984-01-01

137

Applying Multiple Linear Regression and Neural Network to Predict Bank Performance

Directory of Open Access Journals (Sweden)

Full Text Available Globalization and technological advancement have created a highly competitive market in the banking and finance industry. Performance of the industry depends heavily on the accuracy of the decisions made at managerial level. This study uses the multiple linear regression technique and a feedforward artificial neural network to predict bank performance, and evaluates the two techniques with the goal of finding a powerful tool for the task. Data of thirteen banks for the period 2001-2006 were used. ROA was used as the measure of bank performance, and hence is the dependent variable for the multiple linear regression. Seven variables, including liquidity, credit risk, cost to income ratio, size, concentration ratio, inflation and GDP, were used as independent variables. Under supervised learning, the dependent variable, ROA, was used as the target output for the artificial neural network, with seven inputs corresponding to the seven predictor variables used for pattern recognition at the training phase. Experimental results from the multiple linear regression show that two variables, credit risk and cost to income ratio, are significant in determining bank performance; these two variables explain about 60.9 percent of the total variation in the data, with a mean square error (MSE) of 0.330. The artificial neural network was found to give optimal results using thirteen hidden neurons. Testing results show that the seven inputs explain about 66.9 percent of the total variation in the data with a very low MSE of 0.00687. Performance of both methods is measured by the mean square prediction error (MSPR) at the validation stage. The MSPR value for the neural network is lower than that for multiple linear regression (0.0061 against 0.6190). The study concludes that the artificial neural network is the more powerful tool for predicting bank performance.

Nor Mazlina Abu Bakar

2009-09-01

138

Evaluating Productivity Index in a Gas Well Using Regression Analysis

In this study, a new approach is introduced to augment existing correlations for the analysis of the Productivity Index of a gas well. The Modified Isochronal test method is used in this analysis. The Productivity Index trend of the gas well is evaluated from the test data. Regression analysis is used to develop a correlation, which is then used to evaluate and forecast the future Productivity Index trend. The back pressure equation of the Simplified Analysis method is also used to exa...

Tobuyei Christopher; Osokogwu Uche

2014-01-01

139

Ratio Versus Regression Analysis: Some Empirical Evidence in Brazil

Directory of Open Access Journals (Sweden)

Full Text Available This work compares the traditional methodology for ratio analysis, applied to a sample of Brazilian firms, with the alternative of regression analysis, using both cross-industry and intra-industry samples. The structural validity of the traditional methodology was tested through a model that represents its analogous regression format. The data are from 156 Brazilian public companies in nine industrial sectors for the year 1997. The results provide weak empirical support for the traditional ratio methodology, as it was verified that the validity of this methodology may differ between ratios.

Newton Carneiro Affonso da Costa Jr.

2004-06-01

140

Multivariate regression is a common statistical tool for practical problems. Many multivariate regression techniques are designed for univariate response cases. For problems with multiple response variables available, one common approach is to apply the univariate response regression technique separately on each response variable. Although it is simple and popular, the univariate response approach ignores the joint information among response variables. In this paper, we propose three new methods for utilizing joint information among response variables. All methods are in a penalized likelihood framework with weighted L1 regularization. The proposed methods provide sparse estimators of conditional inverse co-variance matrix of response vector given explanatory variables as well as sparse estimators of regression parameters. Our first approach is to estimate the regression coefficients with plug-in estimated inverse covariance matrices, and our second approach is to estimate the inverse covariance matrix with plug-in estimated regression parameters. Our third approach is to estimate both simultaneously. Asymptotic properties of these methods are explored. Our numerical examples demonstrate that the proposed methods perform competitively in terms of prediction, variable selection, as well as inverse covariance matrix estimation. PMID:22791925

Lee, Wonyul; Liu, Yufeng

2012-01-01

141

Multivariate regression is a common statistical tool for practical problems. Many multivariate regression techniques are designed for univariate response cases. For problems with multiple response variables available, one common approach is to apply the univariate response regression technique separately on each response variable. Although it is simple and popular, the univariate response approach ignores the joint information among response variables. In this paper, we propose three new methods for utilizing joint information among response variables. All methods are in a penalized likelihood framework with weighted L(1) regularization. The proposed methods provide sparse estimators of conditional inverse co-variance matrix of response vector given explanatory variables as well as sparse estimators of regression parameters. Our first approach is to estimate the regression coefficients with plug-in estimated inverse covariance matrices, and our second approach is to estimate the inverse covariance matrix with plug-in estimated regression parameters. Our third approach is to estimate both simultaneously. Asymptotic properties of these methods are explored. Our numerical examples demonstrate that the proposed methods perform competitively in terms of prediction, variable selection, as well as inverse covariance matrix estimation. PMID:22791925

Lee, Wonyul; Liu, Yufeng

2012-10-01

142

Analysis of Sting Balance Calibration Data Using Optimized Regression Models

Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm's two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm's term selection process.

Ulbrich, N.; Bader, Jon B.

2010-01-01

143

Energy Technology Data Exchange (ETDEWEB)

Polarisation curves performed at the Fuel Cell System Laboratory (FC LAB) at Belfort on a PEM fuel cell stack, using a homemade fully instrumented test bench, yielded more than 100 time-dependent variables. Visualising and analysing all the different test variables is complex. In this work, we show how the Principal Component Analysis (PCA) method helps to explore correlations between variables and similarities between measurements at a specific sampling time (individuals). To complement this method, an empirical model of the PEM fuel cell is proposed by linking the different input parameters to the cell voltage using Multiple Linear Regression. (author)

Placca, Latevi [FC LAB., Fuel Cell System Laboratory, Rue Thierry Mieg, 90000 Belfort (France); M3M research laboratory, University of Technology of Belfort-Montbeliard, 90010 Belfort (France); CEA, LITEN, 17, Rue des Martyrs - 38000 Grenoble (France); Kouta, Raed; Charon, Willy [FC LAB., Fuel Cell System Laboratory, Rue Thierry Mieg, 90000 Belfort (France); M3M research laboratory, University of Technology of Belfort-Montbeliard, 90010 Belfort (France); Candusso, Denis [FC LAB., Fuel Cell System Laboratory, Rue Thierry Mieg, 90000 Belfort (France); INRETS, The French National Institute for Transport and Safety Research, Laboratory of New Technologies (LTN), 25 Allee des Marronniers, 78000 Versailles - Satory (France); Blachot, Jean-Francois [FC LAB., Fuel Cell System Laboratory, Rue Thierry Mieg, 90000 Belfort (France); CEA, LITEN, 17, Rue des Martyrs - 38000 Grenoble (France)

2010-05-15

144

Aerosol optical depth (AOD) from AERONET data has a very fine resolution, but the air pollution index (API), visibility, and relative humidity from ground truth measurements are coarse. To obtain the local AOD in the atmosphere, the relationship between these three parameters was determined using multiple regression analysis. Data from the southwest monsoon period (August to September 2012) taken in Penang, Malaysia, were used to establish a quantitative relationship in which the AOD is modeled as a function of API, relative humidity, and visibility. The most highly correlated model was used to predict AOD values during the southwest monsoon period. When aerosol is not uniformly distributed in the atmosphere, the predicted AOD can deviate strongly from the measured values. These deviating data can therefore be removed by comparing the predicted AOD values with the actual AERONET data, which helps to determine whether the non-uniform aerosol source is at the ground surface or at a higher altitude. This model can accurately predict AOD only if the aerosol is uniformly distributed in the atmosphere. However, further study is needed to determine whether the model is suitable for AOD prediction not only in Penang but also in other Malaysian states, or even globally.
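A minimal numpy sketch of the kind of model described above — AOD regressed on API, relative humidity and visibility, with strongly deviating points screened out afterwards. All coefficients and ranges below are invented for illustration, not the Penang measurements.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins for the ground measurements (not AERONET data).
n = 60
api = rng.uniform(20, 120, n)        # air pollution index
rh = rng.uniform(40, 95, n)          # relative humidity, %
vis = rng.uniform(2, 20, n)          # visibility, km

# Assume AOD rises with API and humidity and falls with visibility.
aod = 0.002 * api + 0.003 * rh - 0.01 * vis + 0.1 + 0.02 * rng.normal(size=n)

# Fit AOD = b0 + b1*API + b2*RH + b3*visibility by least squares.
A = np.column_stack([np.ones(n), api, rh, vis])
coef, *_ = np.linalg.lstsq(A, aod, rcond=None)
pred = A @ coef

# Screen out points that deviate strongly from the fitted model, as the
# abstract does when aerosol is not uniformly distributed.
resid = aod - pred
keep = np.abs(resid) < 2 * resid.std()
```

The retained subset `keep` corresponds to times when the uniform-distribution assumption behind the regression is plausible.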

Tan, F.; Lim, H. S.; Abdullah, K.; Yoon, T. L.; Zubir Matjafri, M.; Holben, B.

2014-02-01

145

Application of stepwise multiple regression techniques to inversion of Nimbus 'IRIS' observations.

Exploratory studies with Nimbus-3 infrared interferometer-spectrometer (IRIS) data indicate that, in addition to temperature, such meteorological parameters as geopotential heights of pressure surfaces, tropopause pressure, and tropopause temperature can be inferred from the observed spectra with the use of simple regression equations. The technique of screening the IRIS spectral data by means of stepwise regression to obtain the best radiation predictors of meteorological parameters is validated. The simplicity of application of the technique and the simplicity of the derived linear regression equations - which contain only a few terms - suggest usefulness for this approach. Based upon the results obtained, suggestions are made for further development and exploitation of the stepwise regression analysis technique.

Ohring, G.

1972-01-01

146

Gene-environment (G × E) interactions are biologically important for a wide range of environmental exposures and clinical outcomes. Because of the large number of potential interactions in genomewide association data, the standard approach fits one model per G × E interaction with multiple hypothesis correction (MHC) used to control the type I error rate. Although sometimes effective, using one model per candidate G × E interaction test has two important limitations: low power due to MHC and omitted variable bias. To avoid the coefficient estimation bias associated with independent models, researchers have used penalized regression methods to jointly test all main effects and interactions in a single regression model. Although penalized regression supports joint analysis of all interactions, can be used with hierarchical constraints, and offers excellent predictive performance, it cannot assess the statistical significance of G × E interactions or compute meaningful estimates of effect size. To address the challenge of low power, researchers have separately explored screening-testing, or two-stage, methods in which the set of potential G × E interactions is first filtered and then tested for interactions with MHC only applied to the tests actually performed in the second stage. Although two-stage methods are statistically valid and effective at improving power, they still test multiple separate models and so are impacted by MHC and biased coefficient estimation. To remedy the challenges of both poor power and omitted variable bias encountered with traditional G × E interaction detection methods, we propose a novel approach that combines elements of screening-testing and hierarchical penalized regression. 
Specifically, our proposed method uses, in the first stage, an elastic net-penalized multiple logistic regression model to jointly estimate either the marginal association filter statistic or the gene-environment correlation filter statistic for all candidate genetic markers. In the second stage, a single multiple logistic regression model is used to jointly assess marginal terms and G × E interactions for all genetic markers that pass the first stage filter. A single likelihood-ratio test is used to determine whether any of the interactions are statistically significant. We demonstrate the efficacy of our method relative to alternative G × E detection methods on a bladder cancer data set. PMID:25592580

Frost, H Robert; Andrew, Angeline S; Karagas, Margaret R; Moore, Jason H

2015-01-01

147

International Nuclear Information System (INIS)

Much attention is focused on increasing the energy efficiency to decrease fuel costs and CO2 emissions throughout industrial sectors. The ORC (organic Rankine cycle) is a relatively simple but efficient process that can be used for this purpose by converting low and medium temperature waste heat to power. In this study we propose four linear regression models to predict the maximum obtainable thermal efficiency for simple and recuperated ORCs. A previously derived methodology is able to determine the maximum thermal efficiency among many combinations of fluids and processes, given the boundary conditions of the process. Hundreds of optimised cases with varied design parameters are used as observations in four multiple regression analyses. We analyse the model assumptions, prediction abilities and extrapolations, and compare the results with recent studies in the literature. The models are in agreement with the literature, and they present an opportunity for accurate prediction of the potential of an ORC to convert heat sources with temperatures from 80 to 360 °C, without detailed knowledge or need for simulation of the process. - Highlights: • The maximum thermal efficiency of ORCs in hundreds of cases was analysed. • Multiple regression models were derived to predict the maximum obtainable efficiency of ORCs. • Using only key design parameters, the maximum obtainable efficiency can be evaluated. • The regression models decrease the resources needed to evaluate the maximum potential. • The models are statistically strong and in good agreement with the literature

148

Mass estimation of loose parts in nuclear power plant based on multiple regression

International Nuclear Information System (INIS)

According to the application of the Hilbert–Huang transform to the non-stationary signal and the relation between the mass of loose parts in nuclear power plant and corresponding frequency content, a new method for loose part mass estimation based on the marginal Hilbert–Huang spectrum (MHS) and multiple regression is proposed in this paper. The frequency spectrum of a loose part in a nuclear power plant can be expressed by the MHS. The multiple regression model that is constructed by the MHS feature of the impact signals for mass estimation is used to predict the unknown masses of a loose part. A simulated experiment verified that the method is feasible and the errors of the results are acceptable. (paper)

149

Directory of Open Access Journals (Sweden)

Full Text Available In this study, we propose a Leverage Based Near-Neighbor (LBNN method where prior information on the structure of the heteroscedastic error is not required. In the proposed LBNN method, weights are determined not from the near-neighbor values of the explanatory variables, but from their corresponding leverage values so that it can be readily applied to a multiple regression model. Both the empirical and Monte Carlo simulation results show that the LBNN method offers substantial improvement over the existing methods. The LBNN has significantly reduced the standard errors of the estimates and also the standard errors of residuals for both simple and multiple linear regression models. Hence, the LBNN can be established as one reliable alternative approach to other existing methods that deal with heteroscedastic errors when the form of heteroscedasticity is unknown.
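Since the LBNN method weights observations by their leverage values rather than near-neighbour values of the explanatory variables, it may help to recall how leverages are computed from the hat matrix. This is a generic illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(8)

# Leverage values are the diagonal of the hat matrix H = X (X'X)^{-1} X';
# h_ii measures how far observation i lies in predictor space, which is
# well defined for any number of regressors (hence "readily applied to a
# multiple regression model").
n, p = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)

# Two standard identities: the leverages sum to the number of estimated
# coefficients, and with an intercept each h_ii lies in [1/n, 1].
```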

H. Midi

2009-01-01

150

Variable selection in multiple linear regression: The influence of individual cases

The influence of individual cases in a data set is studied when variable selection is applied in multiple linear regression. Two different influence measures, based on the C_p criterion and Akaike's information criterion, are introduced. The relative change in the selection criterion when an individual case is omitted is proposed as the selection influence of the specific omitted case. Four standard examples from the literature are considered and the selection influence of the cases is calcul...

Sj, Steel; Dw, Uys

2007-01-01

151

Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods’ performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple...

Kim, Yoonsang; Choi, Young-ku; Emery, Sherry

2013-01-01

152

Estimation of Saturation Percentage of Soil Using Multiple Regression, ANN, and ANFIS Techniques

The saturation percentage (SP) of soils is an important index in hydrological studies. In this paper, artificial neural networks (ANNs), multiple regression (MR), and an adaptive neural-based fuzzy inference system (ANFIS) were used for estimation of the saturation percentage of soils collected from the Boukan region in the northwestern part of Iran. Percent clay, silt, sand and organic carbon (OC) were used to develop the applied methods. In addition, contributions of each input variable were asse...

Khaled Ahmad Aali; Masoud Parsinejad; Bizhan Rahmani

2009-01-01

153

We present a new, to our knowledge, method for extracting optical properties from integrating sphere measurements on thin biological samples. The method is based on multivariate calibration techniques involving Monte Carlo simulations, multiple polynomial regression, and a Newton-Raphson algorithm for solving nonlinear equation systems. Prediction tests with simulated data showed that the mean relative prediction error of the absorption and the reduced scattering coefficients within typical b...

Dam, J. S.; Dalgaard, T.; Fabricius, P. E.; Andersson-engels, Stefan

2000-01-01

154

General regression neural network in energy cost analysis

International Nuclear Information System (INIS)

Previous research on energy cost evaluation in industrial processes has been carried out by the authors using variance analysis techniques (MANOVA). The results were satisfactory, and the codes developed with these techniques on process computers were able to handle various factors. Nevertheless, either many hypotheses had to be made about the analytical form of the regression surfaces, or a pure MANOVA model had to be used, losing information on possible interpolation. Moreover, the regression approach was hardly extensible to on-line acquisition of new data. In order to achieve this goal and to simplify the processing of data, we adopted neural network techniques. We tested various types of networks and found empirical evidence that the General Regression Neural Network (GRNN) structure could behave consistently better than back-propagation algorithms
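A GRNN is essentially a normalised radial-basis (Nadaraya-Watson) regression estimate, which is why it needs no assumed analytical form for the regression surface and absorbs new data by simply adding training points. A compact sketch; the kernel width and the toy cost curve are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(9)

def grnn_predict(X_train, y_train, X_query, sigma=0.5):
    """General Regression Neural Network: a normalised radial-basis
    (Nadaraya-Watson) estimate of E[y | x] over the training set."""
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma**2))       # pattern-layer activations
    return (w @ y_train) / w.sum(axis=1)   # summation layer + division

# Toy energy-cost-like curve: unit cost falls and then flattens with load.
X = rng.uniform(0, 4, (120, 1))
y = np.exp(-X[:, 0]) + 2.0 + 0.02 * rng.normal(size=120)
Xq = np.array([[0.0], [2.0], [4.0]])
pred = grnn_predict(X, y, Xq, sigma=0.3)
```

On-line learning amounts to appending new `(x, y)` pairs to the training arrays; no retraining pass is required, unlike back-propagation.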

155

Regression Analysis between Properties of Subgrade Lateritic Soil

The results of a study that used regression analysis to seek correlations between index properties and the California Bearing Ratio (CBR) of some lateritic soils within Osogbo town of South Western Nigeria are presented. For an appreciable conclusion to be established, lateritic soil samples were collected from eight (8) different borrow pits within the town and subjected to various laboratory tests including Atterberg limits, gradation analysis, California Bearing Ratio, compaction...

Bello, Afeez Adefemi

2012-01-01

156

MULTINOMIAL LOGISTIC REGRESSION: USAGE AND APPLICATION IN RISK ANALYSIS

The objective of the article was to explore the usage of multinomial logistic regression (MLR) in risk analysis. In this regard, performing MLR on risk analysis data corrected for the non-linear nature of the binary response and addressed the violation of the equal variance and normality assumptions. Additionally, use of maximum likelihood (-2log) estimation provided a means of working with binary response data. The relationship of independent and dependent variables was also addressed. The data use...

Bayaga, Anass

2010-01-01

157

Directory of Open Access Journals (Sweden)

Full Text Available Abstract A random regression model for daily feed intake and a conventional multiple trait animal model for the four traits average daily gain on test (ADG), feed conversion ratio (FCR), carcass lean content and meat quality index were combined to analyse data from 1 449 castrated male Large White pigs performance tested in two French central testing stations in 1997. Group-housed pigs fed ad libitum with electronic feed dispensers were tested from 35 to 100 kg live body weight. A quadratic polynomial in days on test was used as a regression function for weekly means of daily feed intake and to describe its residual variance. The same fixed (batch) and random (additive genetic, pen and individual permanent environmental) effects were used for regression coefficients of feed intake and single measured traits. Variance components were estimated by means of a Bayesian analysis using Gibbs sampling. Four Gibbs chains were run for 550 000 rounds each, from which 50 000 rounds were discarded as the burn-in period. Estimates of posterior means of covariance matrices were calculated from the remaining two million samples. Low heritabilities of the linear and quadratic regression coefficients and their unfavourable genetic correlations with other performance traits reveal that altering the shape of the feed intake curve by direct or indirect selection is difficult.

Künzi Niklaus

2002-01-01

158

Least Squares Adjustment: Linear and Nonlinear Weighted Regression Analysis

DEFF Research Database (Denmark)

This note primarily describes the mathematics of least squares regression analysis as it is often used in geodesy, including land surveying and satellite positioning applications. In these fields regression is often termed adjustment. The note also contains a couple of typical land surveying and satellite positioning application examples. In these application areas we are typically interested in the parameters of the model, typically 2- or 3-D positions, and not in predictive modelling, which is often the main concern in other regression analysis applications. Adjustment is often used to obtain estimates of relevant parameters in an over-determined system of equations which may arise from deliberately carrying out more measurements than actually needed to determine the set of desired parameters. An example may be the determination of a geographical position based on information from a number of Global Navigation Satellite System (GNSS) satellites, also known as space vehicles (SV). It takes at least four SVs to determine the position (and the clock error) of a GNSS receiver. Often more than four SVs are used, and we use adjustment to obtain a better estimate of the geographical position (and the clock error) and to obtain estimates of the uncertainty with which the position is determined. Regression analysis is used in many other fields of application in the natural, the technical and the social sciences. Examples may be curve fitting, calibration, establishing relationships between different variables in an experiment or in a survey, etc. Regression analysis is probably one of the most used statistical techniques around. Dr. Anna B. O. Jensen provided insight and data for the Global Positioning System (GPS) example. Matlab code and sections that are considered as either traditional land surveying material or as advanced material are typeset with smaller fonts. Comments in general, or on, for example, unavoidable typos, shortcomings and errors, are most welcome.
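The adjustment step described above — solving an over-determined system with observation weights via the normal equations — can be sketched in a few lines of numpy. The design matrix here is random stand-in data, not a GNSS geometry.

```python
import numpy as np

rng = np.random.default_rng(7)

# Over-determined adjustment: estimate parameters x from m > n
# observations b = A x + noise, weighted by observation variances.
A = rng.normal(size=(8, 3))            # 8 observations, 3 unknowns
x_true = np.array([2.0, -1.0, 0.5])
sigma = np.full(8, 0.01)               # assumed observation std. dev.
b = A @ x_true + sigma * rng.normal(size=8)

W = np.diag(1 / sigma**2)              # weight matrix
N = A.T @ W @ A                        # normal-equations matrix
x_hat = np.linalg.solve(N, A.T @ W @ b)
cov_x = np.linalg.inv(N)               # parameter uncertainty estimate
```

The diagonal of `cov_x` gives the variances of the adjusted parameters, which is exactly the "uncertainty with which the position is determined" in the GNSS example.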

Nielsen, Allan Aasbjerg

2007-01-01

159

In this study, the application of Artificial Neural Networks (ANN) and Multiple regression analysis (MR) to forecast long-term seasonal spring rainfall in Victoria, Australia was investigated using lagged El Nino Southern Oscillation (ENSO) and Indian Ocean Dipole (IOD) as potential predictors. The use of dual (combined lagged ENSO-IOD) input sets for calibrating and validating ANN and MR models is proposed to investigate the simultaneous effect of past values of these two major climate modes on long-term spring rainfall prediction. The MR models that did not violate the limits of statistical significance and multicollinearity were selected for future spring rainfall forecast. The ANN was developed in the form of a multilayer perceptron using the Levenberg-Marquardt algorithm. Both MR and ANN modelling were assessed statistically using mean square error (MSE), mean absolute error (MAE), Pearson correlation (r) and Willmott index of agreement (d). The developed MR and ANN models were tested on out-of-sample test sets; the MR models showed very poor generalisation ability for east Victoria, with correlation coefficients of -0.99 to -0.90, compared to ANN with correlation coefficients of 0.42-0.93; ANN models also showed better generalisation ability for central and west Victoria, with correlation coefficients of 0.68-0.85 and 0.58-0.97 respectively. The ability of multiple regression models to forecast out-of-sample sets is comparable to that of ANN for Daylesford in central Victoria and Kaniva in west Victoria (r = 0.92 and 0.67 respectively). The errors of the testing sets for ANN models are generally lower compared to multiple regression models. The statistical analysis suggests the potential of ANN over MR models for rainfall forecasting using large scale climate modes.

Mekanik, F.; Imteaz, M. A.; Gato-Trinidad, S.; Elmahdi, A.

2013-10-01

160

Simultaneous multiple non-crossing quantile regression estimation using kernel constraints.

Quantile regression (QR) is a very useful statistical tool for learning the relationship between the response variable and covariates. For many applications, one often needs to estimate multiple conditional quantile functions of the response variable given covariates. Although one can estimate multiple quantiles separately, it is of great interest to estimate them simultaneously. One advantage of simultaneous estimation is that multiple quantiles can share strength among them to gain better estimation accuracy than individually estimated quantile functions. Another important advantage of joint estimation is the feasibility of incorporating simultaneous non-crossing constraints of QR functions. In this paper, we propose a new kernel-based multiple QR estimation technique, namely simultaneous non-crossing quantile regression (SNQR). We use kernel representations for QR functions and apply constraints on the kernel coefficients to avoid crossing. Both unregularised and regularised SNQR techniques are considered. Asymptotic properties such as asymptotic normality of linear SNQR and oracle properties of the sparse linear SNQR are developed. Our numerical results demonstrate the competitive performance of our SNQR over the original individual QR estimation. PMID:22190842

Liu, Yufeng; Wu, Yichao

2011-06-01

161

Robust regression applied to fractal/multifractal analysis.

Fractal and multifractal are concepts that have grown increasingly popular in soil analysis in recent years, along with the development of fractal models. A common step is to calculate the slope of a linear fit, usually by the least squares method. This should not be a problem in itself; however, with experimental data the researcher often has to select the range of scales at which to work, neglecting the remaining points to achieve the linearity that this type of analysis requires. Robust regression is a form of regression analysis designed to circumvent some limitations of traditional parametric and non-parametric methods. With this method we do not have to assume that an outlier is simply an extreme observation drawn from the tail of a normal distribution that does not compromise the validity of the regression results. In this work we have evaluated the capacity of robust regression to select the points in the experimental data, trying to avoid subjective choices. Based on this analysis we have developed a new working methodology that involves two basic steps: • Evaluation of the improvement in the linear fitting when consecutive points are eliminated, based on the R p-value, thereby considering the implications of reducing the number of points. • Evaluation of the significance of the difference between the slope fitted with the two extreme points included and the slope fitted with the remaining points. We compare the results of applying this methodology and the commonly used least squares one. The data selected for these comparisons come from experimental soil roughness transects and from simulations based on the midpoint displacement method with added trends and noise. The results are discussed, indicating the advantages and disadvantages of each methodology. Acknowledgements: Funding provided by CEIGRAM (Research Centre for the Management of Agricultural and Environmental Risks) and by the Spanish Ministerio de Ciencia e Innovación (MICINN) through project no. AGL2010-21501/AGR is greatly appreciated.

Portilla, F.; Valencia, J. L.; Tarquis, A. M.; Saa-Requejo, A.

2012-04-01

162

What fiscal policy is most effective? A Meta Regression Analysis

We apply meta regression analysis to a unique data set of 104 studies on multiplier effects with 1069 reported multipliers in order to derive stylized facts and to quantify the differing effectiveness of the composition of fiscal impulses, adjusted for the interference of study-design characteristics and sample specifics. As a major result, we find that public spending multipliers are close to one and about 0.3 to 0.4 units larger than tax and transfer multipliers. Public investment multiplie...

Gechert, Sebastian

2013-01-01

163

Risk factors for mortality after bereavement: a logistic regression analysis

A national sample of elderly widowed people was followed up for six years. Excess mortality was found for men aged 75 years and over in the first six months of bereavement compared with men of the same age in the general population. Logistic regression analysis, controlling for age and sex together, demonstrated that the best independent predictors of mortality among the elderly widowed were: interviewer assessment of low happiness level; interviewer assessed and self-reported problems with n...

Bowling, Ann; Charlton, John

1987-01-01

164

A new multivariate concept of quantile, based on a directional version of Koenker and Bassett's traditional regression quantiles, is introduced for multivariate location and multiple-output regression problems. In their empirical version, those quantiles can be computed efficiently via linear programming techniques. Consistency, Bahadur representation and asymptotic normality results are established. Most importantly, the contours generated by those quantiles are shown to coincide with the classical halfspace depth contours associated with the name of Tukey. This relation does not only allow for efficient depth contour computations by means of parametric linear programming, but also for transferring from the quantile to the depth universe such asymptotic results as Bahadur representations. Finally, linear programming duality opens the way to promising developments in depth-related multivariate rank-based inference.

Hallin, Marc; Šiman, Miroslav; 10.1214/09-AOS723

2010-01-01

165

Poisson Regression Analysis of Illness and Injury Surveillance Data

Energy Technology Data Exchange (ETDEWEB)

The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. 
In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra-Poisson variation. The R open source software environment for statistical computing and graphics is used for analysis. Additional details about R and the data that were used in this report are provided in an Appendix. Information on how to obtain R and utility functions that can be used to duplicate results in this report are provided.

Frome E.L., Watkins J.P., Ellis E.D.

2012-12-12

166

Accurate projections of stratospheric ozone are required, because ozone changes impact on exposures to ultraviolet radiation and on tropospheric climate. Unweighted multi-model ensemble mean (uMMM) projections from chemistry-climate models (CCMs) are commonly used to project ozone in the 21st century, when ozone-depleting substances are expected to decline and greenhouse gases are expected to rise. Here, we address the question of whether Antarctic total column ozone projections in October given by the uMMM of CCM simulations can be improved by using a process-oriented multiple diagnostic ensemble regression (MDER) method. This method is based on the correlation between simulated future ozone and selected key processes relevant for stratospheric ozone under present-day conditions. The regression model is built using an algorithm that selects those process-oriented diagnostics which explain a significant fraction of the spread in the projected ozone among the CCMs. The regression model with observed diagnostics is then used to predict future ozone and the associated uncertainty. The precision of our method is tested in a pseudo-reality, i.e. the prediction is validated against an independent CCM projection used to replace unavailable future observations. The test shows that MDER has a higher precision than uMMM, suggesting an improvement in the estimate of future Antarctic ozone. Our method projects that Antarctic total ozone will return to 1980 values around 2060, with the 95% confidence interval ranging from 2040 to 2080. This reduces the range of return dates across the ensemble of CCMs by more than a decade and suggests that the earliest simulated return dates are unlikely. Karpechko, Maraun and Eyring (2013) Improving Antarctic Total Ozone Projections by a Process-Oriented Multiple Diagnostic Ensemble Regression, J. Atmos. Sci. 70: 3959-3976

Karpechko, Aleyey; Maraun, Douglas; Eyring, Veronika

2014-05-01

167

This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
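The approach rests on the fact that, under the null hypothesis of no linear relationship, R² from a fit with k regressors and n observations follows a Beta(k/2, (n−k−1)/2) distribution, which is equivalent to the classical overall F-test. A short scipy check on synthetic null data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Under H0 (all slopes zero), R^2 ~ Beta(k/2, (n - k - 1)/2).
n, k = 50, 3
X = rng.normal(size=(n, k))
y = rng.normal(size=n)                      # no true relationship

A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta
r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

# p-value from the Beta reference distribution ...
p_beta = stats.beta.sf(r2, k / 2, (n - k - 1) / 2)
# ... agrees with the classical overall F-test.
F = (r2 / k) / ((1 - r2) / (n - k - 1))
p_f = stats.f.sf(F, k, n - k - 1)
```

The agreement follows from the identity R² = kF / (kF + n − k − 1), a monotone transformation between the two test statistics.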

Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.

2013-01-01

168

Energy Technology Data Exchange (ETDEWEB)

Multiple linear regression analysis has been used to study bidding and production data for Federal offshore oil and gas leases, and policy conclusions have been stated therefrom. We first address the applicability of the inherent assumptions of normality and homoscedasticity, finding the assumptions unsupported and questioning the statistical inferences which could otherwise be drawn. Second, even granting the legitimacy of the assumptions and the usual statistical inferences from multiple linear regression results, we show that the conclusions are highly sensitive to them. We are led to a strong assertion that quantitative assessment of the assumptions of normality and homoscedasticity be a mandatory requirement for the proper understanding and use, if indeed any is possible, of multiple linear regression analysis results when drawing policy conclusions from data with the statistical behavior of Federal offshore oil and gas lease data.

Berger, P.D.; Lohrenz, J.

1980-06-01

169

Rivers are important systems which provide water to fulfill human needs. However, excessive human use over the years has led to deterioration in river quality, causing health problems from contaminated water. This study focuses on the application of statistical techniques, the Multiple Linear Regression model and MANOVA, to assess health impacts due to pollution in the Cauvery river stretch in Srirangapatna. In this study, using Multiple Linear Regression, it is fou...

Sudevi Basu; Lokesh, K. S.

2014-01-01

170

Reducing construction waste is becoming a key environmental issue in the construction industry. The quantification of waste generation rates in the construction sector is an invaluable management tool in supporting mitigation actions. However, the quantification of waste can be a difficult process because of the specific characteristics and the wide range of materials used in different construction projects. Large variations are observed in the methods used to predict the amount of waste generated because of the range of variables involved in construction processes and the different contexts in which these methods are employed. This paper proposes a statistical model to determine the amount of waste generated in the construction of high-rise buildings by assessing the influence of the design process and the production system, often mentioned as the major culprits behind the generation of waste in construction. Multiple regression was used to conduct a case study based on multiple sources of data on eighteen residential buildings. The resulting statistical model related the dependent variable (the amount of waste generated) to independent variables associated with the design and the production system used. The best regression model obtained from the sample data had an adjusted R² value of 0.694, meaning that it explains approximately 69% of the variation in waste generated in similar constructions. Most independent variables showed a low determination coefficient when assessed in isolation, which emphasizes the importance of assessing their joint influence on the response (dependent) variable. PMID:25704604

Parisi Kern, Andrea; Ferreira Dias, Michele; Piva Kulakowski, Marlova; Paulo Gomes, Luciana

2015-05-01

171

International Nuclear Information System (INIS)

Highlights: • We obtained models for estimation of cetane number of biodiesel. • Twenty-four neural networks using two topologies were evaluated. • The best neural network to predict the cetane number was selected. • The best accuracy was obtained for the selected neural network. - Abstract: Models for estimating the cetane number of biodiesel from its fatty acid methyl ester composition, using multiple linear regression and artificial neural networks, were obtained in this work. To build the models, experimental data from literature reports covering 48 biodiesels in the modelling-training step and 15 in the validation step were used. Twenty-four neural networks using two topologies and different algorithms for the second training step were evaluated. The model obtained using multiple regression was compared with two other models from the literature and was able to predict the cetane number with 89% accuracy, with one outlier observed. A model to predict the cetane number using an artificial neural network was obtained with accuracy better than 92%, except for one outlier. The best neural network to predict the cetane number was a backpropagation network (11:5:1) using the Levenberg–Marquardt algorithm for the second step of network training, showing R = 0.9544 for the validation data.

172

Residential behavioural energy savings : a meta-regression analysis

Energy Technology Data Exchange (ETDEWEB)

Increasing attention is being given to opportunities for residential energy behavioural savings, as developed countries attempt to reduce energy use and greenhouse gas emissions. Several utility companies have undertaken pilot programs aimed at understanding which interventions are most effective in reducing residential energy consumption through behavioural change. This paper presented the first meta-regression analysis of residential energy behavioural savings. This study focused on interventions which affected household energy-related behaviours and, as a result, affected household energy use. The paper described rational choice theory, the theory of planned behaviour, and the integration of rational choice theory and the adjusted expectancy values theory in a simple framework. The paper also discussed the review of various social, psychological and economics journals and databases. The results of the studies were presented. A basic concept in meta-regression analysis is the effect size, which is defined as the program effect divided by the standard error of the program effect. A lengthy review of the literature found twenty-eight treatments from ten experiments for which an effect size could be calculated. The experiments involved classifying treatments according to whether the interventions were information, goal setting, feedback, rewards or combinations of these interventions. The impact of these alternative interventions on the effect size was then modelled using White's robust regression. Five regression models were compared on the basis of Akaike's information criterion. It was found that model 5, which used all of the regressors, was the preferred model. It was concluded that the theory of planned behaviour is more appropriate in the context of analysis of behavioural change and energy use. 21 refs., 4 tabs.
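
The effect-size definition used above (program effect divided by its standard error) and a simple precision-weighted pooling step, a common precursor to meta-regression, can be sketched in plain Python. All numbers below are invented for illustration, not taken from the study.

```python
# Sketch of the effect-size bookkeeping in a meta-analysis of energy savings.
def effect_size(program_effect, std_error):
    # Definition used in the abstract: effect divided by its standard error.
    return program_effect / std_error

# Illustrative treatments: (estimated fractional energy saving, standard error)
treatments = [(-0.05, 0.02), (-0.08, 0.03), (-0.02, 0.01)]
sizes = [effect_size(e, se) for e, se in treatments]

# Inverse-variance (precision) weighted mean of the program effects,
# the usual fixed-effect pooling step before regressing on intervention type.
weights = [1 / se ** 2 for _, se in treatments]
pooled = sum(w * e for (e, _), w in zip(treatments, weights)) / sum(weights)
print(sizes, pooled)
```

A full meta-regression would then regress these effects on indicators for information, goal setting, feedback, and rewards, with robust standard errors.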

Tiedemann, K.H. [BC Hydro, Burnaby, BC (Canada)

2009-07-01

173

International Nuclear Information System (INIS)

Objective: To analyze the correlations between liver lipid level determined in vivo by liver 3.0 T 1H-MRS and its influencing factors, using multiple linear stepwise regression. Methods: The prospective study of liver 1H-MRS was performed with a 3.0 T system and eight-channel torso phased-array coils using the PRESS sequence. Forty-four volunteers were enrolled in this study. Liver spectra were collected with a TR of 1500 ms, TE of 30 ms, volume of interest of 2 cm×2 cm×2 cm, and NSA of 64. The acquired raw proton MRS data were processed by using the software program SAGE. For each MRS measurement, using water as the internal reference, the amplitude of the lipid signal was normalized to the sum of the signal from lipid and water to obtain the percentage lipid within the liver. Height, weight, age, BMI, line width and water suppression were recorded, and Pearson analysis was applied to test their relationships. Multiple linear stepwise regression was used to set up the statistical model for the prediction of liver lipid content. Results: Age (39.1±12.6) years, body weight (64.4±10.4) kg, BMI (23.3±3.1) kg/m2, line width (18.9±4.4) and water suppression (90.7±6.5)% had significant correlations with liver lipid content (0.00 to 0.96%, median 0.02%), with r = 0.11, 0.44, 0.40, 0.52 and -0.73 respectively (P<0.05). However, only age, BMI, line width, and water suppression entered into the multiple linear regression equation. The liver lipid content prediction equation was as follows: Y = 1.395 - (0.021×water suppression) + (0.022×BMI) + (0.014×line width) - (0.004×age); the coefficient of determination was 0.613 and the corrected coefficient of determination was 0.59. Conclusion: The regression model fitted well, since the variables of age, BMI, line width, and water suppression can explain about 60% of liver lipid content changes. (authors)
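
The reported prediction equation can be evaluated directly. The sketch below plugs in the cohort means quoted in the abstract; this is an illustration of the arithmetic, not a validated clinical tool.

```python
# Evaluate the abstract's regression equation for liver lipid content (%):
# Y = 1.395 - 0.021*water_suppression + 0.022*BMI + 0.014*line_width - 0.004*age
def liver_lipid_percent(water_suppression, bmi, line_width, age):
    return (1.395
            - 0.021 * water_suppression
            + 0.022 * bmi
            + 0.014 * line_width
            - 0.004 * age)

# Cohort means from the abstract: water suppression 90.7%, BMI 23.3,
# line width 18.9, age 39.1 years.
y = liver_lipid_percent(water_suppression=90.7, bmi=23.3, line_width=18.9, age=39.1)
print(round(y, 4))
```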

174

Transient sensory, motor or cognitive events elicit not only phase-locked event-related potentials (ERPs) in the ongoing electroencephalogram (EEG), but also induce non-phase-locked modulations of ongoing EEG oscillations. These modulations can be detected when single-trial waveforms are analysed in the time-frequency domain, and consist of stimulus-induced decreases (event-related desynchronization, ERD) or increases (event-related synchronization, ERS) of synchrony in the activity of the underlying neuronal populations. ERD and ERS reflect changes in the parameters that control oscillations in neuronal networks and, depending on the frequency at which they occur, represent neuronal mechanisms involved in cortical activation, inhibition and binding. ERD and ERS are commonly estimated by averaging the time-frequency decomposition of single trials. However, their trial-to-trial variability, which can reflect physiologically important information, is lost by across-trial averaging. Here, we aim to (1) develop novel approaches to explore single-trial parameters (including latency, frequency and magnitude) of ERP/ERD/ERS; (2) disclose the relationship between estimated single-trial parameters and other experimental factors (e.g., perceived intensity). We found that (1) stimulus-elicited ERP/ERD/ERS can be correctly separated using principal component analysis (PCA) decomposition with Varimax rotation on the single-trial time-frequency distributions; (2) time-frequency multiple linear regression with dispersion term (TF-MLRd) enhances the signal-to-noise ratio of ERP/ERD/ERS in single trials, and provides an unbiased estimation of their latency, frequency, and magnitude at the single-trial level; (3) these estimates can be meaningfully correlated with each other and with other experimental factors at the single-trial level (e.g., perceived stimulus intensity and ERP magnitude).
The methods described in this article allow full exploration of non-phase-locked stimulus-induced cortical oscillations, yielding single-trial estimates of response latency, frequency, and magnitude. This permits within-subject statistical comparisons, correlation with pre-stimulus features, and integration of simultaneously recorded EEG and fMRI. PMID:25665966

Hu, L; Zhang, Z G; Mouraux, A; Iannetti, G D

2015-05-01

175

Spontaneous regression of multiple pulmonary metastatic nodules of hepatocarcinoma: a case report

International Nuclear Information System (INIS)

Spontaneous regression of either primary or metastatic malignant tumors in the absence of therapy, or after inadequate therapy, has been well documented. Since the earliest days of this century various malignant tumors have been reported to spontaneously disappear or to have their growth arrested, but cases of hepatocarcinoma have been very rare. From the literature, we were able to find 5 previously reported cases of hepatocarcinoma which showed spontaneous regression at the primary site. Recently we have seen a case of multiple pulmonary metastatic nodules of hepatocarcinoma which completely regressed spontaneously, and this forms the basis of the present case report. The patient was a 55-year-old male admitted to St. Mary's Hospital, Catholic Medical College because of a hard palpable mass in the epigastrium on April 26, 1978. The admission PA chest roentgenogram revealed multiple small nodular densities scattered throughout both lung fields, especially in the lower zones and toward the peripheral portions. A hepatoscintigram revealed a large cold area involving the left lobe and intermediate zone of the liver. Alpha-fetoprotein and hepatitis B serum antigen tests were positive whereas many other standard liver function tests turned out to be negative. A needle biopsy of the tumor revealed well differentiated hepatocellular carcinoma. The patient was put under chemotherapy which consisted of 5-FU 500 mg intravenously for 6 days from April 28 to May 3, 1978. The patient was discharged after this single course of 5-FU treatment and was on a herb medicine, the nature and quantity of which were obscure. No other specific treatment was given. The second admission took place on Dec. 3, 1980 because of irregularity in bowel habits and dyspepsia. A follow-up PA chest roentgenogram obtained on the second admission revealed complete disappearance of the previously noted multiple pulmonary nodular lesions (Fig. 3).
A follow-up liver scan revealed persistence of the cold area in the left lobe with a slight decrease in size. The patient was discharged again without any specific prescription after confirming negative results of various clinical studies including an upper GI series and colon study. At the time of finishing this paper the patient is doing well without apparent medical problems.

176

Spontaneous regression of multiple pulmonary metastatic nodules of hepatocarcinoma: a case report

Energy Technology Data Exchange (ETDEWEB)

Spontaneous regression of either primary or metastatic malignant tumors in the absence of therapy, or after inadequate therapy, has been well documented. Since the earliest days of this century various malignant tumors have been reported to spontaneously disappear or to have their growth arrested, but cases of hepatocarcinoma have been very rare. From the literature, we were able to find 5 previously reported cases of hepatocarcinoma which showed spontaneous regression at the primary site. Recently we have seen a case of multiple pulmonary metastatic nodules of hepatocarcinoma which completely regressed spontaneously, and this forms the basis of the present case report. The patient was a 55-year-old male admitted to St. Mary's Hospital, Catholic Medical College because of a hard palpable mass in the epigastrium on April 26, 1978. The admission PA chest roentgenogram revealed multiple small nodular densities scattered throughout both lung fields, especially in the lower zones and toward the peripheral portions. A hepatoscintigram revealed a large cold area involving the left lobe and intermediate zone of the liver. Alpha-fetoprotein and hepatitis B serum antigen tests were positive whereas many other standard liver function tests turned out to be negative. A needle biopsy of the tumor revealed well differentiated hepatocellular carcinoma. The patient was put under chemotherapy which consisted of 5-FU 500 mg intravenously for 6 days from April 28 to May 3, 1978. The patient was discharged after this single course of 5-FU treatment and was on a herb medicine, the nature and quantity of which were obscure. No other specific treatment was given. The second admission took place on Dec. 3, 1980 because of irregularity in bowel habits and dyspepsia. A follow-up PA chest roentgenogram obtained on the second admission revealed complete disappearance of the previously noted multiple pulmonary nodular lesions (Fig. 3).
A follow-up liver scan revealed persistence of the cold area in the left lobe with a slight decrease in size. The patient was discharged again without any specific prescription after confirming negative results of various clinical studies including an upper GI series and colon study. At the time of finishing this paper the patient is doing well without apparent medical problems.

Bahk, Yong Whee; Park, Seog Hee; Kim, Sun Moo [St. Mary' s Hospital, Catholic Medical College, Seoul (Korea, Republic of)

1981-09-15

177

This contribution introduces a statistically based approach for uncertainty assessment in hydrological modeling, in an optimality context. Indeed, in several real world applications, there is the need for the user to select a model that is deemed to be the best possible choice according to a given goodness-of-fit criterion. In this case, it is extremely important to assess the model uncertainty, intended as the range around the model output within which the measured hydrological variable is expected to fall with a given probability. This indication allows the user to quantify the risk associated with a decision that is based on the model response. The technique proposed here infers the probability distribution of the hydrological model error through a non-linear multiple regression approach, depending on an arbitrary number of selected conditioning variables. These may include the current and previous model output as well as internal state variables of the model. The purpose is to indirectly relate the model error to the sources of uncertainty, through the conditioning variables. The method can be applied to any model of arbitrary complexity, including distributed approaches. The probability distribution of the model error is derived in the Gaussian space, through a meta-Gaussian approach. The normal quantile transform is applied in order to make the marginal probability distributions of the model error and the conditioning variables Gaussian. Then the above marginal probability distributions are related through the multivariate Gaussian distribution, whose parameters are estimated via multiple regression. Application of the inverse of the normal quantile transform allows the user to derive the confidence limits of the model output for an assigned significance level. The proposed technique is valid under statistical assumptions, which are essentially those conditioning the validity of the multiple regression in the Gaussian space.
Statistical tests are proposed and discussed in order to test the reliability of the estimated confidence limits. Applications are shown in validation mode, referring to a rainfall-runoff model applied to an Italian river basin. It is significant to note that the optimality context does not reject the concept of equifinality. The proposed approach is meant to be a tool for quantifying the uncertainty when the use of a fixed model is made necessary by the application requirements.
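
The normal quantile transform step described above can be sketched in a few lines of Python: each value is replaced by the standard-normal quantile of its empirical non-exceedance probability. The "model error" values below are invented for illustration.

```python
# Minimal sketch of the normal quantile transform (NQT).
from statistics import NormalDist

def normal_quantile_transform(values):
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    z = [0.0] * n
    std_normal = NormalDist()
    for rank, idx in enumerate(order, start=1):
        p = rank / (n + 1)              # Weibull plotting position
        z[idx] = std_normal.inv_cdf(p)  # Gaussian image of the value
    return z

errors = [0.4, -1.2, 0.1, 2.5, -0.3]    # hypothetical model errors (m^3/s)
print(normal_quantile_transform(errors))
```

The transformed errors and conditioning variables would then be related by multiple regression in the Gaussian space, and confidence limits mapped back through the inverse transform.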

Montanari, A.

2006-12-01

178

International Nuclear Information System (INIS)

Parameters derived from computer analysis of digital radio-frequency (rf) ultrasound scan data of untreated uveal malignant melanomas were examined for correlations with tumor regression following cobalt-60 plaque therapy. Parameters included tumor height, normalized power spectrum and acoustic tissue type (ATT). Acoustic tissue type was based upon discriminant analysis of tumor power spectra, with spectra of tumors of known pathology serving as a model. Results showed ATT to be correlated with tumor regression during the first 18 months following treatment. Tumors with ATT associated with spindle cell malignant melanoma showed over twice the percentage reduction in height as those with ATT associated with mixed/epithelioid melanomas. Pre-treatment height was only weakly correlated with regression. Additionally, significant spectral changes were observed following treatment. Ultrasonic spectrum analysis thus provides a noninvasive tool for classification, prediction and monitoring of tumor response to cobalt-60 plaque therapy

179

Determination of ventilatory threshold through quadratic regression analysis.

Ventilatory threshold (VT) has been used to measure physiological occurrences in athletes through models via gas analysis, with limited accuracy. The purpose of this study is to establish a mathematical model to more accurately detect the ventilatory threshold using the ventilatory equivalent of carbon dioxide (VE/VCO2) and the ventilatory equivalent of oxygen (VE/VO2). The methodology is primarily a mathematical analysis of data. The raw data used were archived from the cardiorespiratory laboratory in the Department of Kinesiology at Midwestern State University. Procedures for archived data collection included breath-by-breath gas analysis averaged every 20 seconds (ParvoMedics TrueMax 2400). A ramp protocol on a Velotron bicycle ergometer was used, with work increased at 25 W per minute beginning at 150 W, until volitional fatigue. The subjects consisted of 27 healthy, trained cyclists with ages ranging from 18 to 50 years. All subjects signed a university-approved informed consent before testing. Graphic scatterplots and statistical regression analyses were performed to establish the crossover and subsequent dissociation of VE/VO2 from VE/VCO2. A polynomial trend line along the scatterplots for VE/VO2 and VE/VCO2 was used because of the high correlation coefficient and coefficient of determination. The equations derived from the scatterplots and trend lines were quadratic in nature because they have a polynomial degree of 2. A graphing calculator in conjunction with a spreadsheet was used to find the exact point of intersection of the 2 trend lines. After the quadratic regression analysis, the exact point of VE/VO2 and VE/VCO2 crossover was established as the VT. This application will allow investigators to more accurately determine the VT in subsequent research. PMID:20802290
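
The crossover step can be illustrated directly: given two fitted quadratic trend lines y = ax² + bx + c for VE/VO2 and VE/VCO2, the VT is taken as their intersection, found by solving the quadratic difference. The coefficients below are invented for illustration, not fitted to the study's data.

```python
# Sketch: exact intersection of two quadratic trend lines.
import math

def crossover(q1, q2):
    """Return x values where quadratic q1 equals q2 (each given as (a, b, c))."""
    a = q1[0] - q2[0]
    b = q1[1] - q2[1]
    c = q1[2] - q2[2]
    if abs(a) < 1e-12:                       # difference degenerates to a line
        return [-c / b] if abs(b) > 1e-12 else []
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                            # trend lines never cross
    r = math.sqrt(disc)
    return sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])

ve_vo2 = (0.020, -0.50, 28.0)    # hypothetical VE/VO2 trend line
ve_vco2 = (0.012, -0.30, 27.0)   # hypothetical VE/VCO2 trend line
print(crossover(ve_vo2, ve_vco2))
```

With real data only the crossover inside the measured exercise range (and after which VE/VO2 rises above VE/VCO2) would be kept as the VT.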

Gregg, Joey S; Wyatt, Frank B; Kilgore, J Lon

2010-09-01

180

Directory of Open Access Journals (Sweden)

Full Text Available The purpose of this research work is to build a multiple linear regression model for the characteristics of a multicylinder diesel engine using multicomponent blends (diesel–pungamia methyl ester–ethanol) as fuel. Nine blends were tested by varying diesel (100 to 10% by vol.) and biodiesel (80 to 10% by vol.) and keeping ethanol constant at 10%. The brake thermal efficiency, smoke, oxides of nitrogen, carbon dioxide, maximum cylinder pressure, angle of maximum pressure, and angles of 5% and 90% mass burning were predicted based on load, speed, diesel and biodiesel percentage. To validate this regression model another multicomponent fuel comprising diesel–palm methyl ester–ethanol was used in the same engine. Statistical analysis was carried out between predicted and experimental data for both fuels. The performance, emission and combustion characteristics of a multicylinder diesel engine using similar fuel blends can thus be predicted without any expense for experimentation.

Gopal Rajendiran

2014-01-01

181

DEFF Research Database (Denmark)

Much attention is focused on increasing energy efficiency to decrease fuel costs and CO2 emissions throughout industrial sectors. The ORC (organic Rankine cycle) is a relatively simple but efficient process that can be used for this purpose by converting low and medium temperature waste heat to power. In this study we propose four linear regression models to predict the maximum obtainable thermal efficiency for simple and recuperated ORCs. A previously derived methodology is able to determine the maximum thermal efficiency among many combinations of fluids and processes, given the boundary conditions of the process. Hundreds of optimised cases with varied design parameters are used as observations in four multiple regression analyses. We analyse the model assumptions, prediction abilities and extrapolations, and compare the results with recent studies in the literature. The models are in agreement with the literature, and they present an opportunity for accurate prediction of the potential of an ORC to convert heat sources with temperatures from 80 to 360 °C, without detailed knowledge of or need for simulation of the process. © 2013 Elsevier Ltd. All rights reserved

Larsen, Ulrik; Pierobon, Leonardo

2014-01-01

182

The use of weighted multiple linear regression to estimate QTL-by-QTL epistatic effects

Scientific Electronic Library Online (English)

Full Text Available SciELO Brazil | Language: English Abstract in English Knowledge of the nature and magnitude of gene effects, as well as their contribution to the control of metric traits, is important in formulating efficient breeding programs for the improvement of plant genetics. Information concerning a genetic parameter such as the additive-by-additive epistatic effect can be useful in traditional breeding. This report describes the results obtained by applying weighted multiple linear regression to estimate the parameter connected with an additive-by-additive epistatic interaction. Three weight variants were used: (1) standard weights based on estimated variances, (2) different weights for minimal, maximal and other lines, and (3) different weights for extreme and other lines. The approach described here combines two methods of estimation, one based on phenotypic observations and the other using molecular marker data. The comparison was done using Monte Carlo simulations. The results show that the application of weighted regression to the marker data yielded estimates similar to those obtained by phenotypic methods.

Bocianowski, Jan.

183

Directory of Open Access Journals (Sweden)

Full Text Available The objectives of this study were to estimate (co)variance functions for additive genetic and permanent environmental effects, as well as the genetic parameters for milk yield over multiple parities, using random regression models (RRM). Records of 4,757 complete lactations of Murrah breed buffaloes from 12 herds were analyzed. Ages at calving were between 2 and 11 years. The model included the additive genetic and permanent environmental random effects and the fixed effects of contemporary groups (herd, year and calving season) and milking frequency (1 or 2). A cubic regression on Legendre orthogonal polynomials of age was used to model the mean trend. The additive genetic and permanent environmental effects were modeled by Legendre orthogonal polynomials. Residual variances were considered homogeneous or heterogeneous, modeled through variance functions or step functions with 5, 7 or 10 classes. Results from Akaike's information criterion and Schwarz's Bayesian information criterion indicated that an RRM considering a third order polynomial for the additive genetic and permanent environmental effects and a step function with 5 classes for residual variances fitted best. Heritability estimates obtained by this model varied from 0.10 to 0.28. Genetic correlations were high between consecutive ages, but decreased when intervals between ages increased.
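
The Legendre basis underlying such random regression models can be sketched as follows: ages are standardized to [-1, 1] and the first four (cubic) normalized Legendre polynomials are evaluated by recurrence. This is illustrative code, not the authors' software.

```python
# Sketch: normalized Legendre polynomial covariates for a cubic RRM.
import math

def standardize(age, age_min, age_max):
    """Map an age in [age_min, age_max] onto [-1, 1]."""
    return -1 + 2 * (age - age_min) / (age_max - age_min)

def legendre_basis(x, order=3):
    """Normalized Legendre polynomials phi_j(x), j = 0..order, via the
    three-term recurrence (j+1) P_{j+1} = (2j+1) x P_j - j P_{j-1}."""
    p = [1.0, x]                                            # P0, P1
    for j in range(1, order):
        p.append(((2 * j + 1) * x * p[j] - j * p[j - 1]) / (j + 1))
    return [math.sqrt((2 * j + 1) / 2) * p[j] for j in range(order + 1)]

x = standardize(age=6.5, age_min=2, age_max=11)   # mid-range calving age -> x = 0
print(legendre_basis(x))
```

In the RRM, each animal's additive genetic and permanent environmental trajectories are linear combinations of these basis values, with random coefficients.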

H. Tonhati

2010-02-01

184

The aim of this work is to obtain an expression using multiple linear regression (MLR) to evaluate environmental soil quality. We used four forest soils from Alicante province (SE Spain), comprising three Mollisols and one Entisol, developed under natural vegetation with minimum human disturbance and considered as reference soils of high quality. We carried out MLR integrating different soil physical, chemical and biochemical properties, and we searched for those regressions with Kjeldahl nitrogen (Nk), soil organic carbon (SOC) or microbial biomass carbon (MBC) as the predicted parameter. We observed that the Mollisols and the Entisol presented different relationships among their properties. Thus, we searched for different equations for the two groups of soils. The selected equation for the Mollisols was N = 0.448(P) + 0.017(water holding capacity) + 0.410(phosphatase) - 0.567(urease) + 0.001(MBC) + 0.410(beta-glucosidase) - 0.980, and for the Entisol SOC = 4.247(P) + 8.183(beta-glucosidase) - 7.949(urease) + 17.333. The equations were applied to samples from two forest soils in an advanced state of degradation, one for the Mollisols and the other for the Entisol. We observed a clear deviation of the predicted parameter values from the measured properties. The results obtained show that MLR is a good tool for soil quality evaluation, because it seems to be capable of reflecting the balance among soil properties, as well as deviations from it. PMID:17321568

Zornoza, Raúl; Mataix-Solera, Jorge; Guerrero, César; Arcenegui, Victoria; García-Orenes, Fuensanta; Mataix-Beneyto, Jorge; Morugán, Alicia

2007-05-25

185

In the present work, an attempt is made to formulate multiple regression equations using the all-possible-regressions method for groundwater quality assessment of the Ajmer-Pushkar railway line region in pre- and post-monsoon seasons. Correlation studies revealed the existence of linear relationships (r > 0.7) for electrical conductivity (EC), total hardness (TH) and total dissolved solids (TDS) with other water quality parameters. The highest correlation was found between EC and TDS (r = 0.973). EC showed highly significant positive correlation with Na, K, Cl, TDS and total solids (TS). TH showed the highest correlation with Ca and Mg. TDS showed significant correlation with Na, K, SO4, PO4 and Cl. The study indicated that most of the contamination present was water soluble or ionic in nature. Mg was present as MgCl2; K mainly as KCl and K2SO4; and Na was present as the salts of Cl, SO4 and PO4. On the other hand, F and NO3 showed no significant correlations. The r² values and F values (at the 95% confidence limit, alpha = 0.05) for the modelled equations indicated a high degree of linearity among independent and dependent variables. Also, the error percentage between calculated and experimental values was contained within a +/- 15% limit. PMID:21114099
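
The correlation screening step described above amounts to computing a Pearson r for each pair of parameters. A minimal sketch follows; the EC and TDS values are invented for illustration, not the study's data.

```python
# Sketch: Pearson correlation coefficient between two water quality parameters.
def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

ec  = [450, 620, 380, 700, 530]     # electrical conductivity (uS/cm), invented
tds = [290, 400, 250, 455, 345]     # total dissolved solids (mg/L), invented
print(round(pearson_r(ec, tds), 3))
```

Pairs whose r clears the chosen cutoff would then be candidate predictors in the all-possible-regressions search.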

Mathur, Praveen; Sharma, Sarita; Soni, Bhupendra

2010-01-01

186

Estimation of Saturation Percentage of Soil Using Multiple Regression, ANN, and ANFIS Techniques

Directory of Open Access Journals (Sweden)

Full Text Available The saturation percentage (SP) of soils is an important index in hydrological studies. In this paper, artificial neural networks (ANNs), multiple regression (MR), and an adaptive neural-based fuzzy inference system (ANFIS) were used for estimation of the saturation percentage of soils collected from the Boukan region in the northwestern part of Iran. Percent clay, silt, sand and organic carbon (OC) were used to develop the applied methods. In addition, the contributions of each input variable to the estimation of the SP index were assessed. Two performance measures, namely the root mean square error (RMSE) and the determination coefficient (R²), were used to evaluate the adequacy of the models. The ANFIS method was found to be superior to the other methods. It is therefore proposed that the ANFIS model can be used for reasonable estimation of SP values of soils.

Khaled Ahmad Aali

2009-07-01

187

Regression analysis exploring teacher impact on student FCI post scores

High School Modeling Workshops are designed to improve high school physics teachers' understanding of physics and of how to teach using the Modeling method. The basic assumption is that the teacher plays a critical role in their students' physics education. This study investigated teacher impacts on students' Force Concept Inventory (FCI) scores, with the hope of identifying quantitative differences between teachers. This study examined student FCI scores from 18 teachers with at least a year of experience teaching high school physics. These data were then evaluated using a General Linear Model (GLM), which allowed a regression equation to be fitted to the data. This regression equation was used to predict student post FCI scores based on teacher ID, student pre FCI score, gender, and representation. The results show that 12 out of 18 teachers significantly impact their students' post FCI scores. The GLM further revealed that of the 12 teachers only five have a positive impact on student post FCI scores. Given these differences among teachers, it is our intention to extend our analysis to investigate pedagogical differences between them.

Mahadeo, Jonathan V.; Manthey, Seth R.; Brewe, Eric

2013-01-01

188

A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

Energy Technology Data Exchange (ETDEWEB)

New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

Steed, Chad A [ORNL; SwanII, J. Edward [Mississippi State University (MSU); Fitzpatrick, Patrick J. [Mississippi State University (MSU); Jankun-Kelly, T.J. [Mississippi State University (MSU)

2012-02-01

189

Node-Mapping EIT Method Based on Regression Analysis

Directory of Open Access Journals (Sweden)

Full Text Available Medical imaging intuitively shows people the morphology and function of the body's internal organs. Electrical Impedance Tomography (EIT) is an emerging medical imaging technology. It has the advantages of simple structure, low cost, no radiological hazards and non-invasiveness. EIT can not only take advantage of impedance differences between different tissues to reconstruct anatomical images, but can also achieve functional imaging of tissues and organs from impedance changes in different physiological and pathological states, and is suitable for long-term monitoring. Because of the ill-posedness of the inverse problem, the solution is approximate. Owing to the contradiction between image accuracy and computation speed, EIT is still unable to meet the requirements of practical application. By using a regression analysis algorithm, the Node-Mapping Method calculates only the node potentials. The speed of operation and the reconstructed image quality are thereby greatly improved.

Jianjun Zhang

2012-12-01

190

A Quantile Regression Analysis of Micro-lending's Poverty Impact

Directory of Open Access Journals (Sweden)

Full Text Available This paper aims to evaluate the impact of a micro-lending program on ameliorating measured poverty within its client population, with the aim of improving that impact. We analyze a database of over 18,000 women micro-finance clients of the Negros Women for Tomorrow Foundation (NWTF), using the Progress out of Poverty Index (PPI) Scorecard as a measure of poverty. Analysis using both OLS and quantile multivariate regression models shows how observable borrower attributes affect the ability of clients to reduce their measured poverty. Loan size, duration, and the economic activity supported all have strongly identifiable effects. Moreover, the estimates suggest which among the poor are receiving the greatest effective help from the program. The results offer specific advice to the NWTF and other micro-lenders: impact is greatest with fewer, larger loans in particular economic sectors (sari-sari, service and trade), but requires patience, as each additional year increases the client's average change in poverty score.

Stephen W. Polk

2012-07-01

191

A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

Energy Technology Data Exchange (ETDEWEB)

New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel-coordinates-based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. This chapter provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

Steed, Chad A [ORNL; SwanII, J. Edward [Mississippi State University (MSU); Fitzpatrick, Patrick J. [Mississippi State University (MSU); Jankun-Kelly, T.J. [Mississippi State University (MSU)

2013-01-01

192

Logistic regression analysis on the risk factors of radiation pneumonitis

International Nuclear Information System (INIS)

Objective: To identify the risk factors of radiation pneumonitis (RP). Methods: A retrospective study was conducted on 101 patients with radiation pneumonitis using SPSS 8.0 software. Factors evaluated included: gender, age, pathology, clinical stage, irradiation dose, irradiation field size, history of smoking, cardiovascular disease, bronchitis, surgery, chemotherapy, lung infection, atelectasis, obstructive infection and pleural effusion. Univariate analysis was performed using the Chi-square test and multivariate analysis was performed using a logistic regression model. Results: Univariate analysis revealed that pulmonary infection, atelectasis, obstructive infection, cardiovascular disease, bronchitis, chemotherapy, irradiation dose, number of days of radiation and irradiation field size were significantly associated with radiation pneumonitis. Multivariate analysis showed that pulmonary infection, obstructive infection, atelectasis, pleural effusion, bronchitis, cardiovascular disease, chemotherapy, irradiation dose and irradiation field size were independent factors. Conclusion: Comprehensive consideration of accompanying diseases, chemotherapy, dose, field size, etc. during the planning of radiotherapy can minimize the possibility of developing radiation pneumonitis.

193

Gaze tracking technology is a convenient interfacing method for mobile devices. Most previous studies used a large-sized desktop or head-mounted display. In this study, we propose a novel gaze tracking method using an active appearance model (AAM) and multiple support vector regression (SVR) on a mobile device. Our research has four main contributions. First, in calculating the gaze position, the amount of facial rotation and translation based on four feature values is computed using facial feature points detected by AAM. Second, the amount of eye rotation based on two feature values is computed for measuring eye gaze position. Third, to compensate for the fitting error of an AAM in facial rotation, we use the adaptive discrete Kalman filter (DKF), which applies a different velocity of state transition matrix to the facial feature points. Fourth, we obtain gaze position on a mobile device based on multiple SVR by separating the rotation and translation of face and eye rotation. Experimental results show that the root mean square (rms) gaze error is 36.94 pixels on the 4.5-in. screen of a mobile device with a screen resolution of 800×600 pixels.

Lee, Eui Chul; Ko, You Jin; Park, Kang Ryoung

2009-07-01

194

Accounting for data errors discovered from an audit in multiple linear regression.

A data coordinating team performed onsite audits and discovered discrepancies between the data sent to the coordinating center and that recorded at sites. We present statistical methods for incorporating audit results into analyses. This can be thought of as a measurement error problem, where the distribution of errors is a mixture with a point mass at 0. If the error rate is nonzero, then even if the mean of the discrepancy between the reported and correct values of a predictor is 0, naive estimates of the association between two continuous variables will be biased. We consider scenarios where there are (1) errors in the predictor, (2) errors in the outcome, and (3) possibly correlated errors in the predictor and outcome. We show how to incorporate the error rate and magnitude, estimated from a random subset (the audited records), to compute unbiased estimates of association and proper confidence intervals. We then extend these results to multiple linear regression where multiple covariates may be incorrect in the database and the rate and magnitude of the errors may depend on study site. We study the finite sample properties of our estimators using simulations, discuss some practical considerations, and illustrate our methods with data from 2815 HIV-infected patients in Latin America, of whom 234 had their data audited using a sequential auditing plan. PMID:21281274
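A minimal sketch of the underlying attenuation idea, using made-up data and the classical variance-ratio correction rather than the authors' full estimator:

```python
import numpy as np

# A fraction of database records carry additive error in the predictor
# (a mixture with a point mass at 0, as in the abstract). An audited random
# subset reveals the error distribution, and the naive slope is rescaled.
# Synthetic data; this is the textbook correction, not the paper's method.
rng = np.random.default_rng(1)
n = 5000
x = rng.normal(0, 1, n)                 # true predictor values
y = 1.5 * x + rng.normal(0, 1, n)       # true association: slope 1.5

err = rng.normal(0, 1, n) * (rng.random(n) < 0.3)   # errors, point mass at 0
w = x + err                             # predictor as recorded in the database

naive = np.cov(w, y)[0, 1] / np.var(w)  # attenuated (biased toward 0)

audit = rng.choice(n, 500, replace=False)           # audited subset
var_err = np.var(w[audit] - x[audit])   # audit reveals the discrepancies
corrected = naive * np.var(w) / (np.var(w) - var_err)
print(naive, corrected)                 # corrected is near the true 1.5
```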

Shepherd, Bryan E; Yu, Chang

2011-09-01

195

The working capacity of the alcohol abuser. Prognostic multiple regression analyses.

Thirty-four alcohol abusers treated at various rehabilitational locations in Sweden were the subjects of an extensive and interdisciplinary study. Thereafter, the working capacity of each subject was followed over a two-year period. Twelve individuals regained capacity for work, either partially or completely. Thirteen subjects were sick-listed or remained unemployed. The remaining 9 abusers were quickly and unexpectedly pensioned. In order to predict the rehabilitational outcome from the interdisciplinary findings at the onset of the time period, stepwise multiple regression analyses were performed. Those who felt less lonely and had no drinking buddies appeared most likely to be rehabilitated vocationally. This core combination of characteristics accounted for about one third of the variability in the outcome criterion, either trichotomized or dichotomized. Rehabilitational success could be even more strongly predicted by the appearance of such features as less prolonged abuse, social introversion (not cohabiting, reserved attitudes and having only a few friends), orientation to the future and a history of psychiatric care. An elevated level of plasma albumin and a decreased plasma-IgA value raised additionally the multiple correlation coefficient. PMID:3347824

Hörnquist, J O; Hansson, B; Akerlind, I

1988-01-01

196

Low-Cost Housing in Sabah, Malaysia: A Regression Analysis

Directory of Open Access Journals (Sweden)

Full Text Available Low-cost housing plays a vital role in the development process, especially in providing accommodation to those who are less fortunate and the lower income group. This effort is also a step in overcoming the squatter problem which could cripple the competitive drive of the local community, especially in the state of Sabah, Malaysia. This article attempts to look into the factors influencing low-cost housing in Sabah, namely the government's budget (allocation for low-cost housing projects) and Sabah's total population. At the same time, this study will attempt to show the implications of the development and economic crises which occurred during the period 1971 to 2000 for the provision of low-cost houses in Sabah. Empirical analyses were conducted using the multiple linear regression method, stepwise regression and the dummy variable approach in demonstrating the link. The empirical result shows that the government's budget for low-cost housing is the main contributor to the provision of low-cost housing in Sabah. The empirical results also suggest that economic growth, namely Gross Domestic Product (GDP), did not have a significant effect on low-cost housing in Sabah. However, almost all major crises that have beset Malaysia's economy had a significant and consistent effect on low-cost housing in Sabah, especially the financial crisis which occurred in mid-1997.

Dullah Mulok

2009-02-01

197

The aim of the present study was to derive multiple regression equations for in vivo estimation of the carcass lean and fat content in Muscovy ducks. The experimental materials consisted of 240 White Muscovy ducklings (120 male and 120 female). One hundred sixteen females aged 10 wk and 112 males aged 12 wk were slaughtered. Before slaughter the ducks were weighed, and the following body measurements were taken: humerus length, drumstick length, chest girth, breast-bone crest length, width between the humeral bones, chest depth, and breast muscle thickness. The coefficients of simple correlation between carcass tissue components and body measurements were calculated. It was found that live body weight was highly correlated with the weights of all tissue components (r = 0.701 to 0.857). In males a significant interrelation was found between breast muscle weight and all body measurements, whereas in females breast muscle weight was correlated with breast-bone crest length, chest girth, width between the humeral bones, chest depth, and breast muscle thickness only. In both males and females the carcass lean content was closely correlated with drumstick length, breast-bone crest length, chest girth, and width between the humeral bones. In drakes the carcass fat content was closely correlated with all body measurements, whereas in hens significant correlations were observed between the carcass fat content and chest girth, width between the humeral bones, and chest depth only. The coefficients of simple correlation between the percentages of carcass tissue components and body measurements were generally low and statistically nonsignificant. Twelve multiple regression equations formulated based on the body measurements of live ducks were verified with respect to the accuracy of estimation of the content of breast muscles, meat, and fat with skin in the carcass. 
These equations give small SE of the estimate (Sy = 23.3 to 83.8 g), high values of coefficients of multiple correlation between the dependent variable and the set of independent variables, and high values of determination coefficients. PMID:16830875

Kleczek, K; Wawro, K; Wilkiewicz-Wawro, E; Makowski, W

2006-07-01

198

ROC curve regression analysis: the use of ordinal regression models for diagnostic test assessment.

Diagnostic tests commonly are characterized by their true positive (sensitivity) and true negative (specificity) classification rates, which rely on a single decision threshold to classify a test result as positive. A more complete description of test accuracy is given by the receiver operating characteristic (ROC) curve, a graph of the false positive and true positive rates obtained as the decision threshold is varied. A generalized regression methodology, which uses a class of ordinal regre...
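The ROC construction this abstract builds on can be sketched directly; the scores, labels and decision thresholds below are synthetic, and the ordinal-regression machinery of the paper is not reproduced here:

```python
import numpy as np

# An ROC curve traces (false positive rate, true positive rate) pairs as
# the decision threshold sweeps over the observed test scores.
rng = np.random.default_rng(2)
scores = np.concatenate([rng.normal(0, 1, 300),     # non-diseased
                         rng.normal(1.5, 1, 300)])  # diseased
labels = np.concatenate([np.zeros(300), np.ones(300)])

order = np.argsort(-scores)             # threshold from highest score down
tpr = np.cumsum(labels[order] == 1) / 300
fpr = np.cumsum(labels[order] == 0) / 300

# Prepend the (0, 0) corner and integrate by the trapezoid rule for the AUC.
fpr_full = np.concatenate([[0.0], fpr])
tpr_full = np.concatenate([[0.0], tpr])
auc = np.sum(np.diff(fpr_full) * (tpr_full[1:] + tpr_full[:-1]) / 2)
print(round(auc, 3))
```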

Tosteson, A. N.; Weinstein, M. C.; Wittenberg, J.; Begg, C. B.

1994-01-01

199

Directory of Open Access Journals (Sweden)

Full Text Available Multiple regression and neural network models were developed for prediction of particulate matter. The performance of the multiple regression models was assessed. For the development of neural network models, a feed-forward network with a back-propagation learning algorithm was used to train the network. The performance of the neural network was determined in terms of the correlation coefficient (R) and Mean Square Error (MSE). The optimum number of hidden neurons was determined by obtaining the lowest value of MSE and the highest value of R. The results indicated that the network can predict particulate concentrations better than multiple regression models.

T.A. Renaldy

2011-01-01

200

Application of a Bayesian method for optimal subset regression to linkage analysis of Q1 and Q2.

We explore an approach that allows us to consider a trait for which we wish to determine the optimal subset of markers out of a set of p ≥ 3 candidate markers being considered in a linkage analysis. The most effective analysis would find the model that only includes the q markers closest to the q major genes which determine the trait. Finding this optimal model using classical "frequentist" multiple regression techniques would require consideration of all 2^p possible subsets. We apply the work of George and McCulloch [J Am Stat Assoc 88:881-9, 1993], who have developed a Bayesian approach to optimal subset selection regression, to a modification of the Haseman-Elston linkage statistic [Elston et al., Genet Epidemiol 19:1-17, 2000] in the analysis of the two quantitative traits simulated in Problem 2. The results obtained using this Bayesian method are compared to those obtained using (1) multiple regression and (2) the modified Haseman-Elston method (single variable regression analysis). We note upon doing this that for both Q1 and Q2, (1) we have extremely low power with all methods using the samples as given and have to resort to combining several simulated samples in order to have power of 50%, (2) the multivariate analysis does not have greater power than the univariate analysis for these traits, and (3) the Bayesian approach identifies the correct model more frequently than the frequentist approaches but shows no clear advantage over the multivariate approach. PMID:11793765

Suh, Y J; Finch, S J; Mendell, N R

2001-01-01

201

International Nuclear Information System (INIS)

In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and ~200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs
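The core CART step, an exhaustive search for the error-minimizing split, can be sketched as a single-split "stump"; a full tree recurses on each side, and the data and the change point at 0.4 here are invented for illustration:

```python
import numpy as np

# Regression-tree splitting: try every candidate threshold on a predictor
# and keep the one minimizing the total within-node squared error.
rng = np.random.default_rng(8)
x = rng.uniform(0, 1, 400)
y = np.where(x < 0.4, 2.0, 5.0) + rng.normal(0, 0.5, 400)  # step at x = 0.4

def best_split(x, y):
    best = (None, np.inf)
    for s in np.unique(x)[1:]:          # every threshold keeping both sides nonempty
        left, right = y[x < s], y[x >= s]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[1]:
            best = (s, sse)
    return best[0]

split = best_split(x, y)
print(round(split, 2))   # recovers the true change point near 0.4
```

Unlike a fitted slope, the split is invariant to monotone rescaling of x, which is the scale-independence property the abstract highlights.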

202

The analysis of kernel ridge regression learning algorithm.

The paper presents Kernel Ridge Regression, a nonlinear extension of the well known statistical model of ridge regression. New insights on the method are also presented. In particular, the connection between ridge regression and local translation-invariant squared loss minimization algorithm is shown. An iterative training algorithm is proposed, that allows training the KRR for large datasets. The training time is empirically found to scale quadratically with the number of samples. The applic...
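A minimal dual-form KRR sketch with an RBF kernel follows; it uses a direct solve rather than the paper's iterative large-scale trainer, and the data and hyperparameters are arbitrary assumptions:

```python
import numpy as np

# Kernel ridge regression in dual form: alpha = (K + lam*I)^(-1) y,
# prediction f(x) = k(x, X) @ alpha. Synthetic 1-D data.
rng = np.random.default_rng(3)
X = np.linspace(0, 2 * np.pi, 100)[:, None]
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 100)

def rbf(A, B, gamma=1.0):
    # pairwise squared distances -> RBF (Gaussian) kernel matrix
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

lam = 0.1                                # ridge penalty
K = rbf(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)

X_test = np.array([[np.pi / 2]])
pred = rbf(X_test, X) @ alpha            # should be close to sin(pi/2) = 1
print(pred[0])
```

The direct solve costs O(n^3), which is exactly why the abstract's iterative training scheme matters for large datasets.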

Pozdnoukhov, Alexei

2002-01-01

203

Semiparametric regression for periodic longitudinal hormone data from multiple menstrual cycles.

We consider semiparametric regression for periodic longitudinal data. Parametric fixed effects are used to model the covariate effects and a periodic nonparametric smooth function is used to model the time effect. The within-subject correlation is modeled using subject-specific random effects and a random stochastic process with a periodic variance function. We use maximum penalized likelihood to estimate the regression coefficients and the periodic nonparametric time function, whose estimator is shown to be a periodic cubic smoothing spline. We use restricted maximum likelihood to simultaneously estimate the smoothing parameter and the variance components. We show that all model parameters can be easily obtained by fitting a linear mixed model. A common problem in the analysis of longitudinal data is to compare the time profiles of two groups, e.g., between treatment and placebo. We develop a scaled chi-squared test for the equality of two nonparametric time functions. The proposed model and the test are illustrated by analyzing hormone data collected during two consecutive menstrual cycles and their performance is evaluated through simulations. PMID:10783774
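One simple stand-in for a periodic time effect is a low-order Fourier basis inside ordinary least squares; this is only a sketch of the periodicity constraint, not the paper's penalized-spline and mixed-model machinery, and the hormone-like data are synthetic:

```python
import numpy as np

# Fit a smooth 28-day-periodic mean with sine/cosine regressors.
rng = np.random.default_rng(4)
t = rng.uniform(0, 28, 400)                      # day within a 28-day cycle
y = 2 + np.sin(2 * np.pi * t / 28) + rng.normal(0, 0.3, 400)

w = 2 * np.pi * t / 28
B = np.column_stack([np.ones_like(t), np.sin(w), np.cos(w),
                     np.sin(2 * w), np.cos(2 * w)])   # intercept + 2 harmonics
coef, *_ = np.linalg.lstsq(B, y, rcond=None)

def fitted(day):
    w = 2 * np.pi * day / 28
    b = np.array([1, np.sin(w), np.cos(w), np.sin(2 * w), np.cos(2 * w)])
    return b @ coef

# The basis makes the fit periodic by construction: day 0 equals day 28.
print(fitted(0.0), fitted(28.0))
```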

Zhang, D; Lin, X; Sowers, M

2000-03-01

204

Regression Analysis between Properties of Subgrade Lateritic Soil

Directory of Open Access Journals (Sweden)

Full Text Available The results of a study that used regression analysis to establish correlations between index properties and the California Bearing Ratio (CBR) of some lateritic soil within Osogbo town of South Western Nigeria are presented. For an appreciable conclusion to be established, lateritic soil samples were collected from eight (8) different borrow pits within the town and various laboratory tests, including Atterberg limits, gradation analysis, California Bearing Ratio, compaction and specific gravity, were performed on the soil samples. Various linear relationships between index properties and CBR of the samples were investigated and predictive equations estimating CBR from the experimental index values were developed. The findings indicate that good correlation exists between the two groups (i.e., index properties and CBR values). However, the values of the CBR computed from the models are only to be used for preliminary purposes, in view of simplicity and economy, and are not acceptable alternatives to laboratory testing because of the anisotropic nature of lateritic soil and its heterogeneity.
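A predictive equation of this kind reduces to multiple least squares on index properties; the sketch below uses fabricated index values and a made-up generating relation, not the Osogbo laboratory data:

```python
import numpy as np

# Estimate CBR from two hypothetical index properties by least squares.
rng = np.random.default_rng(5)
n = 60
liquid_limit = rng.uniform(25, 60, n)
plasticity_index = rng.uniform(5, 30, n)
# hypothetical "true" relation used only to generate synthetic CBR values
cbr = 40 - 0.3 * liquid_limit - 0.5 * plasticity_index + rng.normal(0, 2, n)

X = np.column_stack([np.ones(n), liquid_limit, plasticity_index])
beta, *_ = np.linalg.lstsq(X, cbr, rcond=None)   # [intercept, b_LL, b_PI]
predicted = X @ beta
r2 = 1 - np.sum((cbr - predicted) ** 2) / np.sum((cbr - cbr.mean()) ** 2)
print(beta, round(r2, 3))
```

The fitted coefficients give exactly the kind of predictive equation the abstract describes, and R^2 quantifies how much laboratory CBR testing such an equation could provisionally replace.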

Afeez Adefemi BELLO

2012-12-01

205

Use of generalized regression models for the analysis of stress-rupture data

International Nuclear Information System (INIS)

The design of components for operation in an elevated-temperature environment often requires a detailed consideration of the creep and creep-rupture properties of the construction materials involved. Techniques for the analysis and extrapolation of creep data have been widely discussed. The paper presents a generalized regression approach to the analysis of such data. This approach has been applied to multiple heat data sets for types 304 and 316 austenitic stainless steel, ferritic 2 1/4 Cr-1 Mo steel, and the high-nickel austenitic alloy 800H. Analyses of data for single heats of several materials are also presented. All results appear good. The techniques presented represent a simple yet flexible and powerful means for the analysis and extrapolation of creep and creep-rupture data

206

International Nuclear Information System (INIS)

Risk associated with power generation must be identified to make intelligent choices between alternate power technologies. Radionuclide air stack emissions for a single coal plant and a single nuclear plant are used to compute the single plant leukemia incidence risk and total industry leukemia incidence risk. Leukemia incidence is the response variable as a function of radionuclide bone dose for the six proposed dose response curves considered. During normal operation a coal plant has higher radionuclide emissions than a nuclear plant and the coal industry has a higher leukemia incidence risk than the nuclear industry, unless a nuclear accident occurs. Variation of nuclear accident size allows quantification of the impact of accidents on the total industry leukemia incidence risk comparison. The leukemia incidence risk is quantified as the number of accidents of a given size for the nuclear industry leukemia incidence risk to equal the coal industry leukemia incidence risk. The general linear model is used to develop equations that relate the accident frequency required for equal industry risks to the magnitude of the nuclear emission. Exploratory data analysis revealed that the relationship between the natural log of accident number versus the natural log of accident size is linear. (Author)

207

DEFF Research Database (Denmark)

Multiple regression and model building with mediator variables was addressed to avoid double counting when economic values are estimated from data simulated with herd simulation modeling (using the SimHerd model). The simulated incidence of metritis was analyzed statistically as the independent variable, while using the traits representing the direct effects of metritis on yield, fertility and occurrence of other diseases as mediator variables. The economic value of metritis was estimated to be €78 per 100 cow-years for each 1% increase of metritis in the period of 1-100 days in milk in multiparous cows. The merit of using this approach was demonstrated since the economic value of metritis was estimated to be 81% higher when no mediator variables were included in the multiple regression analysis.
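The double-counting issue can be illustrated with a toy regression: including the mediator (here a hypothetical yield-loss variable) isolates the disease's direct effect, while omitting it inflates the coefficient. All numbers below are invented, not SimHerd output:

```python
import numpy as np

# Synthetic herd-level data: metritis lowers yield (the mediator), and
# both metritis and yield loss contribute to total cost.
rng = np.random.default_rng(9)
n = 2000
metritis = rng.binomial(1, 0.15, n).astype(float)
yield_loss = 2.0 * metritis + rng.normal(0, 1, n)          # mediator
cost = 3.0 * metritis + 1.0 * yield_loss + rng.normal(0, 1, n)

def ols(X, y):
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

total = ols(metritis[:, None], cost)[1]     # ~5: direct + mediated effect
direct = ols(np.column_stack([metritis, yield_loss]), cost)[1]   # ~3: direct only
print(round(total, 2), round(direct, 2))
```

Valuing metritis by `total` while *also* pricing the yield loss separately would count the mediated pathway twice, which is the mistake the mediator-variable approach avoids.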

Østergaard, SØren; Ettema, Jehan Frans

208

Mapping of multiple quantitative trait loci by simple regression in half-sib designs

Detection of QTL in outbred half-sib family structures has mainly been based on interval mapping of single QTL on individual chromosomes. Methods to account for linked and unlinked QTL have been developed, but most of them are only applicable in designs with inbred species or pose great demands on computing facilities. This study describes a strategy that allows for rapid analysis, involving multiple QTL, of complete genomes. The methods combine information from individual analyses after whic...

Koning, D. J.; Schulman, N.; Elo, K.; Moisio, S.; Kinos, R.; Vilkki, J.; Maki-tanila, A.

2001-01-01

209

Adaptive regression analysis: theory and applications in econometrics

Directory of Open Access Journals (Sweden)

Full Text Available In this work we (a) discuss some theoretical and computational difficulties of regression analysis of dependences describing the behaviour of heterogeneous systems, (b) offer a set of new techniques adapted to regression analysis of heterogeneous dependences and (c) demonstrate the advantages of applying these new techniques in econometrics.

J. Garc\\u00EDa P\\u00E9rez

2003-01-01

210

International Nuclear Information System (INIS)

Several MRI features of supratentorial astrocytomas are associated with high histologic grade by statistically significant p values. We sought to apply this information prospectively to a group of astrocytomas in the prediction of tumor grade. We used 10 MRI features of fibrillary astrocytomas from 52 patient studies to develop neural network and multiple linear regression models for practical use in predicting tumor grade. The models were tested prospectively on MR images from 29 patient studies. The performance of the models was compared against that of a radiologist. Neural network accuracy was 61% in distinguishing between low- and high-grade tumors. Multiple linear regression achieved an accuracy of 59%. Assessment of the images by a radiologist yielded 57% accuracy. We conclude that while certain MRI parameters may be statistically related to astrocytoma histologic grade, neural network and linear regression models cannot reliably use them to predict tumor grade. (orig.)

211

Oil and gas pipeline construction cost analysis and developing regression models for cost estimation

In this study, cost data for 180 pipelines and 136 compressor stations have been analyzed. On the basis of the distribution analysis, regression models have been developed. Material, labor, ROW and miscellaneous costs make up the total cost of a pipeline construction. The pipelines are analyzed based on different pipeline lengths, diameter, location, pipeline volume and year of completion. In a pipeline construction, labor costs dominate the total costs with a share of about 40%. Multiple non-linear regression models are developed to estimate the component costs of pipelines for various cross-sectional areas, lengths and locations. The compressor stations are analyzed based on capacity, year of completion and location. Unlike the pipeline costs, material costs dominate the total costs in the construction of compressor stations, with an average share of about 50.6%. Land costs have very little influence on the total costs. Similar regression models are developed to estimate the component costs of compressor stations for various capacities and locations.

Thaduri, Ravi Kiran

212

By definition, multiple regression (MR) considers more than one predictor variable, and each variable's beta will depend on both its correlation with the criterion and its correlation with the other predictor(s). Despite ad nauseam coverage of this characteristic in organizational psychology and statistical texts, researchers' applications of MR in bivariate hypothesis testing have been the subject of recent and renewed interest. Accordingly, we conducted a targeted survey of the literature by coding articles, covering a five-year span from two top-tier organizational journals, that employed MR for testing bivariate relations. The results suggest that MR coefficients, rather than correlation coefficients, were most common for testing hypotheses of bivariate relations, yet supporting theoretical rationales were rarely offered. Regarding the potential impact on scientific advancement, in almost half of the articles reviewed (44%), at least one conclusion of each study (i.e., that the hypothesis was or was not supported) would have been different, depending on the author's use of correlation or beta to test the bivariate hypothesis. It follows that inappropriate decisions to interpret the correlation versus the beta will affect the accumulation of consistent and replicable scientific evidence. We conclude with recommendations for improving bivariate hypothesis testing. PMID:24142838
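The beta-versus-correlation distinction at issue is easy to reproduce: with correlated predictors, a predictor's zero-order correlation with the criterion and its standardized regression weight can support opposite conclusions. The data below are synthetic:

```python
import numpy as np

# y depends only on x1, but x2 is strongly correlated with x1, so x2's
# zero-order correlation with y is large while its beta is near zero.
rng = np.random.default_rng(6)
n = 10000
x1 = rng.normal(0, 1, n)
x2 = 0.8 * x1 + 0.6 * rng.normal(0, 1, n)   # corr(x1, x2) ~ 0.8
y = x1 + rng.normal(0, 1, n)

r_x2y = np.corrcoef(x2, y)[0, 1]            # zero-order: clearly nonzero

Xs = np.column_stack([x1, x2])
Xs = (Xs - Xs.mean(0)) / Xs.std(0)          # standardize to get betas
ys = (y - y.mean()) / y.std()
betas, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
beta_x2 = betas[1]                          # partial weight: near zero
print(round(r_x2y, 2), round(beta_x2, 2))
```

A bivariate hypothesis about x2 would be "supported" by r but "rejected" by beta, which is exactly the divergence the survey documents.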

O'Neill, Thomas A; McLarnon, Matthew J W; Schneider, Travis J; Gardner, Robert C

2014-09-01

213

Dental malocclusion and body posture in young subjects: A multiple regression study

Scientific Electronic Library Online (English)

Full Text Available SciELO Brazil | Language: English Abstract in english OBJECTIVES: Controversial results have been reported on potential correlations between the stomatognathic system and body posture. We investigated whether malocclusal traits correlate with body posture alterations in young subjects to determine possible clinical applications. METHODS: A total of 122 [...] subjects, including 86 males and 36 females (age range of 10.8-16.3 years), were enrolled. All subjects tested negative for temporomandibular disorders or other conditions affecting the stomatognathic system, except malocclusion. A dental occlusion assessment included phase of dentition, molar class, overjet, overbite, anterior and posterior crossbite, scissorbite, mandibular crowding and dental midline deviation. In addition, body posture was recorded through static posturography using a vertical force platform. Recordings were performed under two conditions, namely, i) mandibular rest position (RP) and ii) dental intercuspidal position (ICP). Posturographic parameters included the projected sway area and velocity and the antero-posterior and right-left load differences. Multiple regression models were run for both recording conditions to evaluate associations between each malocclusal trait and posturographic parameters. RESULTS: All of the posturographic parameters had large variability and were very similar between the two recording conditions. Moreover, a limited number of weakly significant correlations were observed, mainly for overbite and dentition phase, when using multivariate models. CONCLUSION: Our current findings, particularly with regard to the use of posturography as a diagnostic aid for subjects affected by dental malocclusion, do not support the existence of clinically relevant correlations between malocclusal traits and body posture.

Giuseppe, Perinetti; Luca, Contardo; Armando, Silvestrini-Biavati; Lucia, Perdoni; Attilio, Castaldo.

214

In recent years, new techniques such as artificial neural networks and fuzzy inference systems have been employed for developing predictive models to estimate needed parameters. Soft computing techniques are now being used as an alternative statistical tool. Determination of the swell potential of soil is difficult, expensive, time consuming and involves destructive tests. In this paper, the use of MLP and RBF functions of ANN (artificial neural networks) and ANFIS (adaptive neuro-fuzzy inference system) for prediction of S% (swell percent) of soil is described and compared with the traditional statistical model of MR (multiple regression). The accuracies of the ANN and ANFIS models were evaluated as relatively similar. It was found that the constructed RBF network exhibited higher performance than MLP, ANFIS and MR for predicting S%. The performance comparison showed that the soft computing system is a good tool for minimizing the uncertainties in soil engineering projects. The use of soft computing may also provide new approaches and methodologies, and minimize the potential inconsistency of correlations.

Yilmaz, Isik; Kaynar, Oguz

2010-05-01

215

Supply and Demand of Jeneberang River Aggregate Using Multiple Regression Model

Directory of Open Access Journals (Sweden)

Full Text Available Aggregate plays an important role in developing infrastructure because it is the major raw material used in construction such as roads, hospitals, schools, factories, homes and other buildings. Sand and gravel are essential sources of aggregate and are often exploited from the active channels of river systems. Jeneberang River is one of the main rivers in South Sulawesi Province, located at Gowa Regency, and is mined in order to fulfill the aggregate demand of Gowa Regency and Makassar City. Supply and demand are economic occurrences affected by several factors, so this research aims to (1) determine the factors influencing aggregate supply and demand, and (2) develop a supply and demand model. Data was obtained from the Central Bureau of Statistics of Gowa Regency and Makassar City, and the Department of Mines and Energy, Gowa Regency, for eleven years (2001-2011). In this research, aggregate supply and demand were modeled using the multiple regression method. First, the relationship among supply and influencing factors was established, followed by demand and its factors. Second, the supply and demand model was established using SPSS. The result of this research showed that the model can be used to accurately estimate supply and demand of aggregate using the established relationship among the influencing factors. Supply of aggregate was affected by several factors including price, number of trucks, number of mining companies and mining permit area, while price, GDP, income per capita, length of road, number of buildings and economic growth had a strong influence on demand.

Aryanti Virtanti Anas

2013-07-01

216

This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

Le, Huy; Marcus, Justin

2012-01-01

217

A Novel Multiobjective Evolutionary Algorithm Based on Regression Analysis

As is known, the Pareto set of a continuous multiobjective optimization problem with m objective functions is, under some mild conditions, a piecewise continuous (m − 1)-dimensional manifold in the decision space. How to exploit this regularity in the design of multiobjective optimization algorithms has become a research focus. In this paper, based on this regularity, a model-based multiobjective evolutionary algorithm with regression analysis (MMEA-RA) is put forward to solve continuous multiobjective optimization problems with variable linkages. In the algorithm, the promising area in the decision space is modelled by a probability distribution whose centroid is an (m − 1)-dimensional piecewise continuous manifold. The least squares method is used to construct such a model. A selection strategy based on nondominated sorting is used to choose the individuals for the next generation. The new algorithm is tested and compared with NSGA-II and RM-MEDA. The results show that MMEA-RA outperforms RM-MEDA and NSGA-II on the test instances with variable linkages. At the same time, MMEA-RA has higher efficiency than the other two algorithms. A few shortcomings of MMEA-RA are also identified and discussed in this paper.

Song, Zhiming; Wang, Maocai; Dai, Guangming; Vasile, Massimiliano

2015-01-01

218

A simplified procedure of linear regression in a preliminary analysis

Directory of Open Access Journals (Sweden)

Full Text Available The analysis of a large statistical data-set can be led by the study of a particularly interesting variable Y (the regressand) and an explicative variable X, chosen among the remaining variables, conjointly observed. The study gives a simplified procedure to obtain the functional link of the variables y = y(x) by a partition of the data-set into m subsets, in which the observations are synthesized by location indices (mean or median) of X and Y. Polynomial models for y(x) of order r are considered to verify the characteristics of the given procedure; in particular we assume r = 1 and 2. The distributions of the parameter estimators are obtained by simulation, when the fitting is done for m = r + 1. Comparisons of the results, in terms of distribution and efficiency, are made with the results obtained by ordinary least squares. The study also gives some considerations on the consistency of the estimated parameters obtained by the given procedure.

Silvia Facchinetti

2013-05-01
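The partition procedure described above is easy to sketch for the first-order case. The following illustrative Python snippet (not the author's code; names are invented) splits the x-sorted data into m = r + 1 = 2 subsets, summarizes each by its mean point, and passes the line through the two location indices:

```python
# Sketch of the partition-based fit for a first-order (r = 1) model:
# split the x-sorted data into m = r + 1 = 2 groups, summarize each group
# by its mean point, and pass the line through those location indices.
def partition_fit(xs, ys):
    pairs = sorted(zip(xs, ys))
    m = len(pairs) // 2
    g1, g2 = pairs[:m], pairs[m:]
    x1 = sum(p[0] for p in g1) / len(g1)
    y1 = sum(p[1] for p in g1) / len(g1)
    x2 = sum(p[0] for p in g2) / len(g2)
    y2 = sum(p[1] for p in g2) / len(g2)
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return intercept, slope

xs = list(range(10))
ys = [1.0 + 2.0 * x for x in xs]   # exact line y = 1 + 2x
a, b = partition_fit(xs, ys)
```

On exactly linear data the procedure recovers the line through the two group means, matching ordinary least squares.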

219

Heparin, a widely used anticoagulant primarily extracted from animal sources, contains varying amounts of galactosamine impurities. Currently, the United States Pharmacopeia (USP) monograph for heparin purity specifies that the weight percent of galactosamine (%Gal) may not exceed 1%. In the present study, multivariate regression (MVR) analysis of ¹H NMR spectral data obtained from heparin samples was employed to build quantitative models for the prediction of %Gal. MVR analysis was conducted using four separate methods: multiple linear regression, ridge regression, partial least squares regression, and support vector regression (SVR). Genetic algorithms and stepwise selection methods were applied for variable selection. In each case, two separate prediction models were constructed: a global model based on dataset A, which contained the full range (0-10%) of galactosamine in the samples, and a local model based on the subset dataset B, for which the galactosamine level (0-2%) spanned the 1% USP limit. All four regression methods performed equally well for dataset A with low prediction errors under optimal conditions, whereas SVR was clearly superior among the four methods for dataset B. The results from this study show that ¹H NMR spectroscopy, already a USP requirement for the screening of contaminants in heparin, may offer utility as a rapid method for quantitative determination of %Gal in heparin samples when used in conjunction with MVR approaches. PMID:20953772

Zang, Qingda; Keire, David A; Wood, Richard D; Buhse, Lucinda F; Moore, Christine M V; Nasr, Moheb; Al-Hakim, Ali; Trehy, Michael L; Welsh, William J

2011-01-01

220

Scientific Electronic Library Online (English)

Full Text Available SciELO Public Health | Language: English. OBJECTIVE: To systematically review randomized controlled trials comparing the effect of supplementation with multiple micronutrients versus iron and folic acid on pregnancy outcomes in developing countries. METHODS: MEDLINE and EMBASE were searched. Outcomes of interest were birth weight, low birth weight, small size for gestational age, perinatal mortality and neonatal mortality. Pooled relative risks (RRs) were estimated by random effects models. Sources of heterogeneity were explored through subgroup meta-analyses and meta-regression. FINDINGS: Multiple micronutrient supplementation was more effective than iron and folic acid supplementation at reducing the risk of low birth weight (RR: 0.86; 95% confidence interval, CI: 0.79-0.93) and of small size for gestational age (RR: 0.85; 95% CI: 0.78-0.93). Micronutrient supplementation had no overall effect on perinatal mortality (RR: 1.05; 95% CI: 0.90-1.22), although substantial heterogeneity was evident (I² = 58%; P for heterogeneity = 0.008). Subgroup and meta-regression analyses suggested that micronutrient supplementation was associated with a lower risk of perinatal mortality in trials in which >50% of mothers had formal education (RR: 0.93; 95% CI: 0.82-1.06) or in which supplementation was initiated after a mean of 20 weeks of gestation (RR: 0.88; 95% CI: 0.80-0.97). CONCLUSION: Maternal education or gestational age at initiation of supplementation may have contributed to the observed heterogeneous effects on perinatal mortality. The safety, efficacy and effective delivery of maternal micronutrient supplementation require further research.

Kosuke, Kawai; Donna, Spiegelman; Anuraj H, Shankar; Wafaie W, Fawzi.

2011-06-01
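The pooled relative risks above come from random-effects models; the standard machinery is DerSimonian-Laird pooling on the log scale. A minimal sketch follows; the study effects and variances below are invented for illustration and are not the trial data from this review:

```python
from math import log, exp

# DerSimonian-Laird random-effects pooling of log relative risks.
def dersimonian_laird(effects, variances):
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    ybar = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    k = len(effects)
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                     # between-study variance
    wr = [1.0 / (v + tau2) for v in variances]             # random-effect weights
    pooled = sum(wi * yi for wi, yi in zip(wr, effects)) / sum(wr)
    return pooled, tau2

log_rrs = [log(0.86), log(0.85), log(0.90)]   # hypothetical study log-RRs
variances = [0.01, 0.02, 0.015]
pooled, tau2 = dersimonian_laird(log_rrs, variances)
rr = exp(pooled)   # pooled relative risk on the original scale
```

With homogeneous inputs like these, Q falls below k − 1, the between-study variance estimate is zero, and the pooled RR reduces to the inverse-variance-weighted average.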

221

We describe a microcomputer program (COXSURV) for proportional hazards multiple regression analysis of survival and other failure-time data generated in clinical trials and in retrospective clinical epidemiology studies. COXSURV is menu-driven and has powerful variable factoring and data exploratory capabilities for multivariate modeling. A batch mode allows automatic uni- or multivariate analyses for confounder summarization. Model selection for predictive purposes is possible through a step-up algorithm. The partial likelihood method used in the program allows the use of either discrete or continuous time scales by treating tied uncensored observations by either the exact method or by a robust approximation method. The program calculates most standard model fitting statistics for either overall or stratified analyses and uses data layout files compatible with those of other related epidemiologic analysis software. PMID:2335079

Campos-Filho, N; Franco, E L

1990-02-01

222

International Nuclear Information System (INIS)

The gamma/beta TLD badge used by OPPD consists of two TLD-700 chips (Harshaw G7 card), one of which (chip #2) is shielded by a 0.102 cm-thick aluminum filter, and the other (chip #1) is unshielded, as shown in Fig. 1. Standard procedure had been to determine the beta dose to the badge by subtracting the response of chip #2 from that of chip #1 and then dividing by a calibrated beta-sensitivity factor; the gamma dose was taken to be the response of chip #2 divided by the chip's gamma-sensitivity factor, followed by subtraction of the background dose. A problem with this procedure is penetration of energetic beta particles through the aluminum filter on chip #2, which causes an over-response. Due to the technique used to obtain the beta dose, this also results in an under-estimate of the beta dose. This problem has been corrected through application of multiple linear regression analysis on a large data base of pure gamma (137Cs), pure beta (90Sr), and mixed exposures. The outcome of the analysis is an algorithm that automatically corrects for penetration effects. Performance tests using the ANSI N13.11 standard are presented to show the improvement.

223

In this article, an approach using Bayesian Generalised Least Squares (BGLS) regression in a region-of-influence (ROI) framework is proposed for regional flood frequency analysis (RFFA) for ungauged catchments. Using the data from 399 catchments in eastern Australia, the BGLS-ROI is constructed to regionalise the flood quantiles (Quantile Regression Technique (QRT)) and the first three moments of the log-Pearson type 3 (LP3) distribution (Parameter Regression Technique (PRT)). This scheme firstly develops a fixed region model to select the best set of predictor variables for use in the subsequent regression analyses using an approach that minimises the model error variance while also satisfying a number of statistical selection criteria. The identified optimal regression equation is then used in the ROI experiment, where the ROI is chosen for a site in question as the region that minimises the predictive uncertainty. To evaluate the overall performances of the quantiles estimated by the QRT and PRT, a one-at-a-time cross-validation procedure is applied. Results of the proposed method indicate that both the QRT and PRT in a BGLS-ROI framework lead to more accurate and reliable estimates of flood quantiles and moments of the LP3 distribution when compared to a fixed region approach. Also, the BGLS-ROI can deal reasonably well with the heterogeneity in Australian catchments, as evidenced by the regression diagnostics. Based on the evaluation statistics it was found that both BGLS-QRT and PRT-ROI perform similarly well, which suggests that the PRT is a viable alternative to QRT in RFFA. The RFFA methods developed in this paper are based on the database available in eastern Australia.
It is expected that availability of a more comprehensive database (in terms of both quality and quantity) will further improve the predictive performance of both the fixed and ROI based RFFA methods presented in this study, which however needs to be investigated in future when such a database is available.

Haddad, Khaled; Rahman, Ataur

2012-04-01

224

M-quantile regression analysis of temporal gene expression data.

In this paper, we explore the use of M-quantile regression and M-quantile coefficients to detect statistical differences between temporal curves that belong to different experimental conditions. In particular, we consider the application to temporal gene expression data. Here, the aim is to detect genes whose temporal expression is significantly different across a number of biological conditions. We present a new method to approach this problem. Firstly, the temporal profiles of the genes are modelled by a parametric M-quantile regression model. This model is particularly appealing for small-sample gene expression data, as it is very robust against outliers and it does not make any assumption on the error distribution. Secondly, we further increase the robustness of the method by summarising the M-quantile regression models for a large range of quantile values into an M-quantile coefficient. Finally, we fit a polynomial M-quantile regression model to the M-quantile coefficients over time and employ a Hotelling T²-test to detect significant differences of the temporal M-quantile coefficient profiles across conditions. Extensive simulations show the increased power and robustness of M-quantile regression methods over standard regression methods and over some of the previously published methods. We conclude by applying the method to detect differentially expressed genes from time-course microarray data on muscular dystrophy. PMID:19799560

Vinciotti, Veronica; Yu, Keming

2009-01-01

225

Linear Maximum Likelihood Regression Analysis for Untransformed Log-Normally Distributed Data

Medical research data are often skewed and heteroscedastic. It has therefore become common practice to log-transform data in regression analysis, in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large si...

Gustavsson, Sara M.; Sandra Johannesson; Gerd Sallsten; Andersson, Eva M.

2012-01-01

226

Correlation analysis, in conjunction with principal-component and multiple-regression analyses, was applied to laboratory chemical and petrographic data to assess the usefulness of these techniques in evaluating selected physical and hydraulic properties of carbonate-rock aquifers in central Pennsylvania. Correlation and principal-component analyses were used to establish relations and associations among variables, to determine dimensions of property variation of samples, and to filter the variables containing similar information. Principal-component and correlation analyses showed that porosity is related to other measured variables and that permeability is most related to porosity and grain size. Four principal components are found to be significant in explaining the variance of data. Stepwise multiple-regression analysis was used to see how well the measured variables could predict porosity and (or) permeability for this suite of rocks. The variation in permeability and porosity is not totally predicted by the other variables, but the regression is significant at the 5% significance level. © 1993.

Brown, C.E.

1993-01-01

227

The Kaplan-Meier and the Cox regression methods are the most widely used statistical techniques for performing "time to event analysis" in epidemiological and clinical research. The Kaplan-Meier analysis allows one to build up one or more survival curves describing the occurrence of the outcome of interest over time according to the presence/absence of one or more exposures. The Cox regression method models the relationship between a specific exposure (either a continuous one, like age and systolic blood pressure, or a categorical one, like diabetes, degree of obesity, etc.) and the occurrence of a given outcome, taking into account multiple confounders and/or predictors. PMID:23114547

Abd ElHafeez, Samar; Torino, Claudia; D'Arrigo, Graziella; Bolignano, Davide; Provenzano, Fabio; Mattace-Raso, Francesco; Zoccali, Carmine; Tripepi, Giovanni

2012-06-01
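The Kaplan-Meier estimator mentioned in the abstract multiplies, at each observed event time, the conditional probability of surviving past that time. A minimal pure-Python sketch (illustrative only, with invented data):

```python
# Minimal Kaplan-Meier product-limit estimator; event == 1 means the
# outcome occurred at that time, event == 0 means the subject was censored.
def kaplan_meier(times, events):
    data = sorted(zip(times, events))
    n = len(data)
    at_risk = n
    surv = 1.0
    curve = []  # (event time, survival probability just after that time)
    i = 0
    while i < n:
        t = data[i][0]
        d = sum(e for (tt, e) in data if tt == t)   # deaths at time t
        c = sum(1 for (tt, e) in data if tt == t)   # subjects leaving at t
        if d > 0:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= c
        i += c
    return curve

# Five subjects; those at t = 3 and t = 5 are censored.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 1, 0, 1, 0])
```

Censored subjects drop out of the risk set without contributing a step, which is exactly how censoring enters the product-limit formula.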

228

Spatial regression analysis on 32 years of total column ozone data

Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1 × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and 2 years of assimilated SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) ozone data (2009-2010). The two-dimensionality in this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on nonseasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño-Southern Oscillation (ENSO) and stratospheric alternative halogens which are parameterized by the effective equivalent stratospheric chlorine (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to that of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. 
The EESC has a significant depleting effect on ozone at mid- and high latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high northern latitudes, the effect of QBO is positive and negative in the tropics and mid- to high latitudes, respectively, and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific. The contribution of explanatory variables describing seasonal ozone variation is generally large at mid- to high latitudes. We observe ozone increases with potential vorticity and day length and ozone decreases with geopotential height and variable ozone effects due to the polar vortex in regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. The application of several trend models, each with their own pros and cons, yields a large range of recovery rate estimates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.

Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.

2014-08-01

229

Polymer micelles are promising drug delivery vehicles for the delivery of anticancer agents to tumors. Often, anticancer drugs display potent cytotoxic effects towards cancer cells but are too hydrophobic to be administered in the clinic as a free drug. To address this problem, a polymer micelle was designed using a triblock copolymer (ITP-101) that enables hydrophobic drugs to be encapsulated. An SN-38 encapsulated micelle, IT-141, was prepared that exhibited potent in vitro cytotoxicity against a wide array of cancer cell lines. In a mouse model, pharmacokinetic analysis revealed that IT-141 had a much longer circulation time and greater plasma and tumor exposure compared to irinotecan. IT-141 was also superior to irinotecan in terms of antitumor activity, exhibiting greater tumor inhibition in HT-29 and HCT116 colorectal cancer xenograft models at half the dose of irinotecan. The antitumor effect of IT-141 was dose-dependent and caused complete growth inhibition and tumor regression at well-tolerated doses. Varying the specific concentration of SN-38 within the IT-141 micelle had no detectable effect on this antitumor activity, indicating no differences in activity between different IT-141 formulations. In summary, IT-141 is a potent micelle-based chemotherapy that holds promise for the treatment of colorectal cancer. PMID:22187652

Carie, Adam; Rios-Doria, Jonathan; Costich, Tara; Burke, Brian; Slama, Richard; Skaff, Habib; Sill, Kevin

2011-01-01

230

Derailments are the most common type of freight-train accidents in the United States. Derailments cause damage to infrastructure and rolling stock, disrupt services, and may cause casualties and harm the environment. Accordingly, derailment analysis and prevention has long been a high priority in the rail industry and government. Despite the low probability of a train derailment, the potential for severe consequences justifies the need to better understand the factors influencing train derailment severity. In this paper, a zero-truncated negative binomial (ZTNB) regression model is developed to estimate the conditional mean of train derailment severity. Recognizing that the mean is not the only statistic describing data distribution, a quantile regression (QR) model is also developed to estimate derailment severity at different quantiles. The two regression models together provide a better understanding of the train derailment severity distribution. Results of this work can be used to estimate train derailment severity under various operational conditions and by different accident causes. This research is intended to provide insights regarding development of cost-efficient train safety policies. PMID:23770389

Liu, Xiang; Saat, M Rapik; Qin, Xiao; Barkan, Christopher P L

2013-10-01
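A zero-truncated negative binomial simply conditions the ordinary NB distribution on a positive outcome, reflecting that severity is only recorded for derailments that actually occurred. A hedged sketch of the pmf (parameter values are arbitrary, not from the paper):

```python
from math import comb

# Ordinary negative binomial pmf: number of failures k before the r-th
# success, with success probability p.
def nb_pmf(k, r, p):
    return comb(k + r - 1, k) * ((1 - p) ** k) * (p ** r)

# Zero-truncated NB: condition on Y > 0 by renormalizing away the mass at 0.
def ztnb_pmf(k, r, p):
    if k < 1:
        return 0.0
    return nb_pmf(k, r, p) / (1.0 - nb_pmf(0, r, p))

# The truncated pmf should put zero mass at 0 and (numerically) total mass 1.
total = sum(ztnb_pmf(k, 3, 0.4) for k in range(1, 200))
```

A ZTNB regression model then links the NB parameters to covariates; the truncation only changes the likelihood, not the link structure.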

231

Grades, Gender, and Encouragement: A Regression Discontinuity Analysis

The author employs a regression discontinuity design to provide direct evidence on the effects of grades earned in economics principles classes on the decision to major in economics and finds a differential effect for male and female students. Specifically, for female students, receiving an A for a final grade in the first economics class is…

Owen, Ann L.

2010-01-01
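A regression discontinuity estimate of the kind used here compares fitted outcomes just above and below a cutoff in the running variable. The toy sketch below (synthetic data, not the study's) fits a separate OLS line on each side and reads off the jump at the cutoff:

```python
# Simple OLS line fit: slope = cov(x, y) / var(x).
def ols_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b  # intercept, slope

# Sharp RD: fit each side of the cutoff separately and take the gap
# between the two fitted values at the cutoff as the estimated effect.
def rd_effect(xs, ys, cutoff):
    left = [(x, y) for x, y in zip(xs, ys) if x < cutoff]
    right = [(x, y) for x, y in zip(xs, ys) if x >= cutoff]
    a_l, b_l = ols_line(*zip(*left))
    a_r, b_r = ols_line(*zip(*right))
    return (a_r + b_r * cutoff) - (a_l + b_l * cutoff)

xs = [x / 10 for x in range(-50, 50)]
ys = [x + (2.0 if x >= 0 else 0.0) for x in xs]  # jump of 2 at cutoff 0
effect = rd_effect(xs, ys, 0.0)
```

On this noiseless example the estimated discontinuity recovers the built-in jump of 2 exactly.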

232

Stepwise Regression as an Exploratory Data Analysis Procedure.

This paper identifies specific problems with stepwise regression, notes criticisms of stepwise methods by statisticians, suggests appropriate ways in which stepwise procedures can be used, and gives examples of how this can be done. Although the stepwise method has been routinely criticized by statisticians, it is still frequently used in the…

Thayer, Jerome D.

233

Analysis on Train Stopping Accuracy based on Regression Algorithms

Directory of Open Access Journals (Sweden)

Full Text Available Stopping accuracy is one of the most important indexes of the efficiency of automatic train operation (ATO) systems. Traditional stopping control algorithms in ATO systems have some drawbacks, as many factors have not been taken into account. In the large amount of field-collected data about stopping accuracy there are many factors (e.g. system delays, stopping time, net pressure) affecting stopping accuracy. In this paper, three popular data mining methods are applied to analyze train stopping accuracy. Firstly, we identify fifteen factors which have an impact on stopping accuracy. Then, ridge regression, lasso regression and elastic net regression are employed to build models reflecting the relationship between the fifteen factors and the stopping accuracy. The three models are compared using the Akaike information criterion (AIC), a model selection criterion that considers the trade-off between accuracy and complexity. The computational results show that the elastic net regression model has the best performance in terms of AIC value. Finally, we obtain the parameters which make the train stop more accurately, providing a reference for improving stopping accuracy in ATO systems.

Lin Ma

2014-05-01
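To give the flavor of scoring penalized regressions by AIC, here is a deliberately reduced sketch: ridge regression with a single centered predictor (closed form) plus a Gaussian AIC. The paper's models use fifteen factors and also lasso and elastic net; this one-variable version (with made-up data) only illustrates the shrinkage and the AIC comparison:

```python
from math import log

# Closed-form ridge slope for one centered predictor:
# beta = sum(xc*yc) / (sum(xc^2) + lambda); centering removes the intercept.
def ridge_slope(xs, ys, lam):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    xc = [x - mx for x in xs]
    yc = [y - my for y in ys]
    return sum(a * b for a, b in zip(xc, yc)) / (sum(a * a for a in xc) + lam)

# Gaussian AIC: n*ln(RSS/n) + 2k, with k counting estimated parameters.
def aic(xs, ys, lam, k=2):
    b = ridge_slope(xs, ys, lam)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    rss = sum((y - (my + b * (x - mx))) ** 2 for x, y in zip(xs, ys))
    n = len(xs)
    return n * log(rss / n) + 2 * k

xs = list(range(20))
ys = [0.5 * x + ((-1) ** x) * 0.1 for x in xs]  # slope 0.5 plus small ripple
b0 = ridge_slope(xs, ys, 0.0)        # ordinary least squares slope
b_big = ridge_slope(xs, ys, 1000.0)  # heavily shrunk slope
```

Larger penalties shrink the slope toward zero and, here, worsen the fit enough that the unpenalized model wins on AIC.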

234

A Bayesian Quantile Regression Analysis of Potential Risk Factors for Violent Crimes in USA

Directory of Open Access Journals (Sweden)

Full Text Available Bayesian quantile regression has drawn increasing attention in a wide range of applications. Yu and Moyeed (2001) proposed an asymmetric Laplace distribution to provide a likelihood-based mechanism for Bayesian inference of quantile regression models. In this work, the primary objective is to evaluate the performance of Bayesian quantile regression compared with simple regression and quantile regression, through simulation and through application to a crime dataset from 50 USA states, for assessing the effect of potential risk factors on the violent crime rate. This paper also explores improper priors and conducts sensitivity analysis on the parameter estimates. The data analysis reveals that the percentage of the population who are single parents always has a significant positive influence on the occurrence of violent crimes, and Bayesian quantile regression provides a more comprehensive statistical description of this association.

Ming Wang

2012-12-01
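The asymmetric Laplace likelihood of Yu and Moyeed is built on the quantile "check" loss: minimising it over a constant recovers the τ-th sample quantile. A small grid-search illustration (toy data, not the paper's MCMC machinery):

```python
# The "check" loss rho_tau(u) = u * (tau - 1{u < 0}) at the heart of
# quantile regression, classical and Bayesian alike.
def check_loss(u, tau):
    return u * (tau - (1.0 if u < 0 else 0.0))

# Minimise the total check loss over a grid of candidate constants;
# the minimiser is the tau-th sample quantile.
def quantile_by_grid(ys, tau, grid):
    return min(grid, key=lambda q: sum(check_loss(y - q, tau) for y in ys))

ys = [1, 2, 3, 4, 5, 6, 7, 8, 9]
grid = [g / 10 for g in range(0, 101)]  # candidate values 0.0 .. 10.0
med = quantile_by_grid(ys, 0.5, grid)   # tau = 0.5 -> median
q9 = quantile_by_grid(ys, 0.9, grid)    # tau = 0.9 -> upper quantile
```

In the Bayesian version, exponentiating the negative check loss yields the asymmetric Laplace likelihood, so posterior modes coincide with classical quantile regression fits.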

235

The prevalence of hepatitis C virus (HCV) infection varies across the world, with the highest percentage of infections reported in the Middle East, particularly in Egypt. The current study aimed at examining the bio-statistical correlation and multiple regression analyses of pituitary growth hormone (GH) and liver activities among HCV genotype-4 patients treated with PEG-IFN-α plus RBV therapy. Herein, the current study was conducted on 100 HCV genotype-4 infected patients and 50 healthy controls. ...

Eskander, Emad F.; Abd-rabou, Ahmed A.; Yahya, Shaymaa M. M.; El Sherbini, Ashraf; Mohamed, Mervat S.; Shaker, Olfat G.

2013-01-01

236

Application of a multiple least-squares regression program to dual energy NaI-CsI(T1) measurements

International Nuclear Information System (INIS)

In conjunction with the development of an optimum background subtraction routine, a multiple least-squares regression program for simultaneous utilization of both the NaI(T1) and CsI(T1) energy ranges of a dual anti-coincidence detection system was applied. To experimentally evaluate the program for whole body counting purposes, an Am-241 contaminated subject was measured in the whole body counter using the standard three phoswich detector array surrounding the head

237

Rock mass classification systems are one of the most common ways of determining rock mass excavatability and related equipment assessment. However, the strength and weak points of such rating-based classifications have always been questionable. Such classification systems assign quantifiable values to predefined classified geotechnical parameters of rock mass. This causes particular ambiguities, leading to the misuse of such classifications in practical applications. Recently, intelligence system approaches such as artificial neural networks (ANNs) and neuro-fuzzy methods, along with multiple regression models, have been used successfully to overcome such uncertainties. The purpose of the present study is the construction of several models by using an adaptive neuro-fuzzy inference system (ANFIS) method with two data clustering approaches, including fuzzy c-means (FCM) clustering and subtractive clustering, an ANN and non-linear multiple regression to estimate the basic rock mass diggability index. A set of data from several case studies was used to obtain the real rock mass diggability index and compared to the predicted values by the constructed models. In conclusion, it was observed that ANFIS based on the FCM model shows higher accuracy and correlation with actual data compared to that of the ANN and multiple regression. As a result, one can use the assimilation of ANNs with fuzzy clustering-based models to construct such rigorous predictor tools.

Saeidi, Omid; Torabi, Seyed Rahman; Ataei, Mohammad

2014-03-01

238

Aims: The aim of the study was to assess the relationships between the remaining shelf-life (RSL) of cold-smoked salmon and various microbiological and physico-chemical parameters, using a multivariate data analysis in the form of stepwise forward multiple regression. Methods and Results: Thirteen batches of French cold-smoked salmon were analysed weekly during vacuum-packed storage at 5°C for their lipid, water, salt, phenol, pH, total volatile basic nitrogen (TVBN) and trimethyla...

Leroi, Francoise; Joffraud, Jean-jacques; Chevalier, Frederique; Cardinal, Mireille

2001-01-01

239

REGRESSION ANALYSIS OF PRODUCTIVITY USING MIXED EFFECT MODEL

Directory of Open Access Journals (Sweden)

Full Text Available Production plants of a company are located in several areas spread across Middle and East Java. As the production process employs mostly manpower, we suspected that each location has different characteristics affecting productivity. Thus, the production data may have a spatial and hierarchical structure. For fitting a linear regression using ordinary techniques, we are required to make some assumptions about the nature of the residuals, i.e. that they are independent and identically normally distributed. However, these assumptions are rarely fulfilled, especially for data that have a spatial and hierarchical structure. We worked out the problem using a mixed effect model. This paper discusses the construction of a model of productivity and several characteristics in the production line, taking location as a random effect. A simple model with high utility that satisfies the necessary regression assumptions was built using the free statistical software R, version 2.6.1.

Siana Halim

2007-01-01

240

Regression analysis of censored data using pseudo-observations

DEFF Research Database (Denmark)

We draw upon a series of articles in which a method based on pseudovalues is proposed for direct regression modeling of the survival function, the restricted mean, and the cumulative incidence function in competing risks with right-censored data. The models, once the pseudovalues have been computed, can be fit using standard generalized estimating equation software. Here we present Stata procedures for computing these pseudo-observations. An example from a bone marrow transplantation study is used to illustrate the method.

Parner, Erik T.; Andersen, Per Kragh

2010-01-01
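Pseudo-observations are jackknife quantities θ_i = n·θ̂ − (n−1)·θ̂₍₋ᵢ₎ for a summary such as the survival probability S(t). The sketch below (illustrative Python, not the authors' Stata code) uses a Kaplan-Meier estimate of S(t); with no censoring the pseudo-observation collapses to the indicator 1(T_i > t), which makes a convenient sanity check:

```python
# Kaplan-Meier survival probability at time t (distinct event times assumed).
def surv_at(times, events, t):
    data = sorted(zip(times, events))
    at_risk = len(data)
    s = 1.0
    for tt, e in data:
        if tt > t:
            break
        if e == 1:
            s *= 1.0 - 1.0 / at_risk
        at_risk -= 1
    return s

# Jackknife pseudo-observations for S(t): theta_i = n*full - (n-1)*leave-one-out.
def pseudo_obs(times, events, t):
    n = len(times)
    full = surv_at(times, events, t)
    out = []
    for i in range(n):
        loo_t = times[:i] + times[i + 1:]
        loo_e = events[:i] + events[i + 1:]
        out.append(n * full - (n - 1) * surv_at(loo_t, loo_e, t))
    return out

times = [1, 2, 3, 4, 5]
events = [1, 1, 1, 1, 1]        # no censoring
po = pseudo_obs(times, events, 2.5)
```

The resulting pseudo-observations can then be fed to ordinary generalized estimating equation software as if they were complete-data responses.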

241

Bayesian analysis of logistic regression with an unknown change point

We discuss Bayesian estimation of a logistic regression model with an unknown threshold limiting value (TLV). In these models it is assumed that there is no effect of a covariate on the response under a certain unknown TLV. The estimation of these models with a focus on the TLV in a Bayesian context by Markov chain Monte Carlo (MCMC) methods is considered. We extend the model by accounting for measurement error in the covariate. The Bayesian solution is compared with the likelihood solution...

Gössl, Christoff; Küchenhoff, Helmut

1999-01-01

242

Model performance analysis and model validation in logistic regression

Directory of Open Access Journals (Sweden)

Full Text Available In this paper a new model validation procedure for a logistic regression model is presented. First, we give a brief review of different techniques of model validation. Next, we define a number of properties required for a model to be considered "good", and a number of quantitative performance measures. Lastly, we describe a methodology for the assessment of the performance of a given model by using an example taken from a management study.

Rosa Arboretti Giancristofaro

2007-10-01

243

Analysis of apoptosis during hair follicle regression (catagen)

Keratinocyte apoptosis is a central element in the regulation of hair follicle regression (catagen), yet the exact location and the control of follicular keratinocyte apoptosis remain obscure. To generate an "apoptomap" of the hair follicle, we have studied selected apoptosis-associated parameters in the C57BL/6 mouse model for hair research during normal and pharmacologically manipulated, pathological catagen development. As assessed by terminal deoxynucleotide transferase dUTP fluorescein n...

Lindner, G.; Botchkarev, V. A.; Botchkareva, N. V.; Ling, G.; Veen, C.; Paus, R.

1997-01-01

244

BRGLM, Interactive Linear Regression Analysis by Least Square Fit

International Nuclear Information System (INIS)

1 - Description of program or function: BRGLM is an interactive program written to fit general linear regression models by least squares and to provide a variety of statistical diagnostic information about the fit. Stepwise and all-subsets regression can be carried out also. There are facilities for interactive data management (e.g. setting missing value flags, data transformations) and tools for constructing design matrices for the more commonly-used models such as factorials, cubic splines, and auto-regressions. 2 - Method of solution: The least squares computations are based on the orthogonal (QR) decomposition of the design matrix, obtained using the modified Gram-Schmidt algorithm. 3 - Restrictions on the complexity of the problem: The current release of BRGLM allows maxima of 1000 observations, 99 variables, and 3000 words of main memory workspace. For a problem with N observations and P variables, the number of words of main memory storage required is MAX(N*(P+6), N*P+P*P+3*N, 3*P*P+6*N). Any linear model may be fit, although the in-memory workspace will have to be increased for larger problems.
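The solution method in point 2 — least squares through a QR factorization computed with modified Gram-Schmidt, then back substitution — can be sketched in a few lines of numpy (an illustrative reconstruction, not the BRGLM code itself):

```python
import numpy as np

def mgs_qr(A):
    """Thin QR factorization via modified Gram-Schmidt."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].astype(float).copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ v      # project against an already-built column...
            v -= R[i, j] * Q[:, i]     # ...and remove that projection immediately
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

def lstsq_mgs(X, y):
    """Solve min ||X b - y|| via R b = Q^T y (back substitution)."""
    Q, R = mgs_qr(X)
    c = Q.T @ y
    b = np.zeros(R.shape[1])
    for j in range(R.shape[1] - 1, -1, -1):
        b[j] = (c[j] - R[j, j + 1:] @ b[j + 1:]) / R[j, j]
    return b

rng = np.random.default_rng(3)
X = rng.normal(size=(60, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + rng.normal(0, 0.1, 60)
b = lstsq_mgs(X, y)
```

The modified (as opposed to classical) Gram-Schmidt update is the numerically stable choice here, since it orthogonalizes against the already-deflated vector rather than the original column.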

245

Multi-Class Sparse Bayesian Regression for Neuroimaging Data Analysis

The use of machine learning tools is gaining popularity in neuroimaging, as it provides a sensitive assessment of the information conveyed by brain images. In particular, finding regions of the brain whose functional signal reliably predicts some behavioral information makes it possible to better understand how this information is encoded or processed in the brain. However, such a prediction is performed through regression or classification algorithms that suffer from the curse of dimensionality, because a huge number of features (i.e. voxels) are available to fit some target, with very few samples (i.e. scans) to learn the informative regions. A commonly used solution is to regularize the weights of the parametric prediction function. However, model specification needs a careful design to balance adaptiveness and sparsity. In this paper, we introduce a novel method, Multi-Class Sparse Bayesian Regression (MCBR), that generalizes classical approaches such as Ridge regression and Automatic Relevance Determination. Our approach is based on a grouping of the features into several classes, where each class is regularized with specific parameters. We apply our algorithm to the prediction of a behavioral variable from brain activation images. The method presented here achieves prediction accuracies similar to those of reference methods, and yields more interpretable feature loadings.
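The full MCBR sampler is beyond a snippet, but its central idea — regularizing different feature classes with different strengths, generalizing plain Ridge — has a closed-form analogue that can be sketched directly. The grouping and penalty values below are invented for illustration:

```python
import numpy as np

def groupwise_ridge(X, y, group_ids, lambdas):
    """Ridge regression with one penalty per feature class:
    beta = (X'X + D)^-1 X'y, where D = diag(lambda of the group of feature j)."""
    D = np.diag(np.asarray(lambdas, float)[np.asarray(group_ids)])
    return np.linalg.solve(X.T @ X + D, X.T @ y)

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 6))
y = X[:, :3] @ np.array([2.0, -1.0, 1.5]) + rng.normal(0, 0.5, 100)

# Features 0-2 form an "informative" class (light penalty), 3-5 a "noise" class
groups = [0, 0, 0, 1, 1, 1]
beta = groupwise_ridge(X, y, groups, lambdas=[0.1, 100.0])

# With equal penalties the estimator reduces to standard Ridge
beta_equal = groupwise_ridge(X, y, groups, lambdas=[1.0, 1.0])
beta_ridge = np.linalg.solve(X.T @ X + 1.0 * np.eye(6), X.T @ y)
```

MCBR goes further by treating the per-class penalties as random and inferring them, but the group-specific shrinkage structure is the same.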

Michel, Vincent; Eger, Evelyn; Keribin, Christine; Thirion, Bertrand

246

International Nuclear Information System (INIS)

A plot of lung-cancer rates versus radon exposures in 965 US counties, or in all US states, has a strong negative slope, b, in sharp contrast to the strong positive slope predicted by linear/no-threshold theory. The discrepancy between these slopes exceeds 20 standard deviations (SD). Including smoking frequency in the analysis substantially improves fits to a linear relationship but has little effect on the discrepancy in b, because correlations between smoking frequency and radon levels are quite weak. Including 17 socioeconomic variables (SEV) in multiple regression analysis reduces the discrepancy to 15 SD. Data were divided into segments by stratifying on each SEV in turn, on geography, and on both simultaneously, giving over 300 data sets to be analyzed individually, but negative slopes predominated. The slope is negative whether one considers only the most urban counties or only the most rural; only the richest or only the poorest; only the richest in the South Atlantic region or only the poorest in that region; and for all the strata in between. Since this is an ecological study, the well-known problems with ecological studies were investigated and found not to be applicable here. The "ecological fallacy" was shown not to apply in testing a linear/no-threshold theory, and the vulnerability to confounding is greatly reduced when confounding factors are only weakly correlated with radon levels, as is generally the case here. All confounding factors known to correlate with radon and with lung cancer were investigated quantitatively and found to have little effect on the discrepancy.

247

Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales

We propose a novel framework combining detrended fluctuation analysis with standard regression methodology. The method is built on detrended variances and covariances, and it is designed to estimate regression parameters at different scales and under potential non-stationarity and power-law correlations. Selected examples from physics, finance and environmental sciences illustrate the usefulness of the framework.
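The estimator combines the usual DFA profile-and-detrend step with a ratio of detrended covariance to detrended variance, giving a scale-specific regression slope. A minimal numpy sketch following that general recipe (details simplified, non-overlapping windows only):

```python
import numpy as np

def detrended_cov(xp, yp, s):
    """Average covariance of linear-detrending residuals over windows of size s."""
    n_win = len(xp) // s
    t = np.arange(s)
    covs = []
    for k in range(n_win):
        xw, yw = xp[k * s:(k + 1) * s], yp[k * s:(k + 1) * s]
        rx = xw - np.polyval(np.polyfit(t, xw, 1), t)   # remove window trend
        ry = yw - np.polyval(np.polyfit(t, yw, 1), t)
        covs.append(np.mean(rx * ry))
    return np.mean(covs)

def dfa_beta(x, y, s):
    """Scale-specific regression slope beta(s) = F2_xy(s) / F2_xx(s)."""
    xp = np.cumsum(x - x.mean())          # integrated profiles, as in standard DFA
    yp = np.cumsum(y - y.mean())
    return detrended_cov(xp, yp, s) / detrended_cov(xp, xp, s)

rng = np.random.default_rng(5)
x = rng.normal(size=10000)
y = 3.0 * x + rng.normal(0, 0.1, 10000)
beta_10 = dfa_beta(x, y, 10)
```

For i.i.d. data beta(s) is roughly constant across scales; scale-dependent dynamics show up as beta(s) varying with s.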

Kristoufek, Ladislav

2014-01-01

248

Energy Technology Data Exchange (ETDEWEB)

In this paper, the most relevant multiple regression models for sales forecasting of gas stations, developed over the past ten years, are reviewed. The most significant variables related to gas station sales, the types of the multiple regression models (linear or non-linear), the most common uses in supporting decision making and its limits are presented. The predictive power of each model and its impact on decision-making, such as sensitivity analysis and confidence intervals for independent variables, are also commented. Four models are presented, based on studies conducted in South Africa, Portugal and Brazil. In conclusion, suggestions for future developments are presented based on past developments. (author)

Wanke, Peter [Universidade Federal do Rio de Janeiro (UFRJ), RJ (Brazil). Instituto de Pesquisa e Pos-Graduacao em Administracao de Empresas (COPPEAD). Centro de Estudos em Logistica

2004-07-01

249

Use of Structure Coefficients in Published Multiple Regression Articles: Beta Is Not Enough.

Reviewed articles published in the "Journal of Applied Psychology" (JAP) to determine how interpretations might have differed if standardized regression coefficients and structure coefficients (or bivariate "r"s of predictors with the criterion) had been interpreted. Summarizes some dramatic misinterpretations or incomplete interpretations.…
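The contrast the article draws is easy to compute: a structure coefficient is the bivariate correlation between a predictor and the regression's predicted scores, r(x_j, ŷ), which also equals r(x_j, y) divided by the multiple correlation R. A small numpy sketch with invented collinear data:

```python
import numpy as np

def betas_and_structure(X, y):
    """Standardized regression weights and structure coefficients r(x_j, yhat)."""
    Z = (X - X.mean(0)) / X.std(0)
    zy = (y - y.mean()) / y.std()
    A = np.column_stack([np.ones(len(y)), Z])
    coef = np.linalg.lstsq(A, zy, rcond=None)[0]
    betas = coef[1:]                      # standardized (beta) weights
    yhat = A @ coef
    structure = np.array([np.corrcoef(Z[:, j], yhat)[0, 1]
                          for j in range(Z.shape[1])])
    return betas, structure, yhat

rng = np.random.default_rng(6)
x1 = rng.normal(size=500)
x2 = 0.7 * x1 + rng.normal(0, 0.5, 500)   # collinear second predictor
y = x1 + rng.normal(0, 1.0, 500)
X = np.column_stack([x1, x2])
betas, structure, yhat = betas_and_structure(X, y)
R = np.corrcoef(yhat, y)[0, 1]            # multiple correlation
```

With collinear predictors the beta weights and structure coefficients can tell quite different stories, which is exactly the interpretive gap the article documents.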

Courville, Troy; Thompson, Bruce

2001-01-01

250

We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…

Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul

2011-01-01

251

Semi-parametric ROC regression analysis with placement values.

Advances in technology provide new diagnostic tests for early detection of disease. Frequently, these tests have continuous outcomes. One popular method to summarize the accuracy of such a test is the Receiver Operating Characteristic (ROC) curve. Methods for estimating ROC curves have long been available. To examine covariate effects, Pepe (1997, 2000) and Alonzo and Pepe (2002) proposed distribution-free approaches based on a parametric regression model for the ROC curve. Cai and Pepe (2002) extended the parametric ROC regression model by allowing an arbitrary non-parametric baseline function. In this paper, while we follow the same semi-parametric setting as in that paper, we highlight a new estimator that offers several improvements over the earlier work: superior efficiency, the ability to estimate the covariate effects without estimating the non-parametric baseline function and easy implementation with standard software. The methodology is applied to a case control dataset where we evaluate the accuracy of the prostate-specific antigen as a biomarker for early detection of prostate cancer. Simulation studies suggest that the new estimator under the semi-parametric model, while always being more robust, has efficiency that is comparable to or better than the Alonzo and Pepe (2002) estimator from the parametric model. PMID:14744827

Cai, Tianxi

2004-01-01

252

Scientific Electronic Library Online (English)

Full Text Available Carrying out regression analysis for gas leakage of a pressure-relief valve (PRV) to get accurate leakage flow and the changing trend of leakage is helpful in assessing the reliability of the PRV. Classic support vector regression (SVR) is an excellent regression model and has been widely used in various fields. However, the standard SVR model performs regression using leakage data alone, without considering elements closely related to the leakage. In this paper a regression model based on support vector regression plus (SVR+) is put forward to perform leakage regression of the PRV, in which particle swarm optimization (PSO) is used to select the optimum parameters of SVR+, termed PSO_SVR+. The experimental results demonstrate that the proposed model, taking the difference between the inlet pressure and outlet pressure of the PRV as hidden information, can achieve more favorable regression precision than SVR. This article also investigates the effects of PSO and the genetic algorithm on the performance of the regression models (SVR+ and SVR).

W., Sun; G. X., Meng; Q., Ye; H. L., Jin; J. Z., Zhang.

2012-04-01

253

Directory of Open Access Journals (Sweden)

Full Text Available Air is an efficient means of dispersing atmospheric pollutants, and its behavior depends on the atmospheric movements that occur in the troposphere. In Porto Alegre, Rio Grande do Sul State, there is heavy daily traffic and a concentration of industries that may be responsible for atmospheric emissions. In the present work we studied the behavior of daily concentrations of particulate matter (PM10) in this city, considering the influence of meteorological variables. Data analysis was performed using descriptive statistics, linear correlation and multiple regression. Data were provided by the State Foundation of Environmental Protection Henrique Luiz Roessler - RS (FEPAM) and the National Institute of Meteorology (INMET). The analysis showed that the concentrations of PM10, measured daily at 4:00 p.m., did not exceed national air quality standards. The meteorological elements that influenced PM10 concentrations were the daily average wind speed and the daily average radiation, with negative relations, and the daily average air temperature and the north and northwest wind directions, with positive relations. The wind directions that contribute significantly to lowering concentrations at the measured sites are east and southeast.

Angela Radünz Lazzari

2011-01-01

254

A primer for biomedical scientists on how to execute model II linear regression analysis.

1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. PMID:22077731
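The ordinary least products (geometric mean) slope recommended above has a closed form: b = sign(r) · s_y/s_x, with intercept ȳ − b·x̄, and it equals the geometric mean of the OLS slope of y on x and the reciprocal of the OLS slope of x on y. A short sketch (the bootstrap CIs the author obtains with smatr are omitted):

```python
import numpy as np

def olp_regression(x, y):
    """Ordinary least products (Model II / geometric mean) regression."""
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    intercept = np.mean(y) - slope * np.mean(x)
    return slope, intercept

# Model II setting: the x-values themselves are measured with error
rng = np.random.default_rng(7)
x_true = rng.normal(0, 2, 200)
x = x_true + rng.normal(0, 0.5, 200)
y = 1.0 + 2.0 * x_true + rng.normal(0, 0.5, 200)
b, a = olp_regression(x, y)

# Geometric-mean property: |b| = sqrt(b_{y|x} / b_{x|y})
b_yx = np.polyfit(x, y, 1)[0]   # OLS slope of y on x
b_xy = np.polyfit(y, x, 1)[0]   # OLS slope of x on y
```

Note that the OLS slope b_yx is attenuated toward zero by the error in x, which is the usual motivation for a Model II method in this setting.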

Ludbrook, John

2012-04-01

255

Multiple time correlation functions are found in the dynamical description of different phenomena. They encode and describe the fluctuations of the dynamical variables of a system. In this paper we formulate a theory of non-Markovian multiple-time correlation functions (MTCF) for a wide class of systems. We derive the dynamical equation of the "reduced propagator", an object that evolves state vectors of the system conditioned on the dynamics of its environment, which is ...

Alonso, Daniel; Vega, Inés

2004-01-01

256

Directory of Open Access Journals (Sweden)

Full Text Available Functional MRI studies have revealed changes in default-mode and salience networks in neurodegenerative dementias, especially in Alzheimer’s disease. The purpose of this study was to analyze the whole brain cortex resting state networks in patients with behavioral variant frontotemporal dementia (bvFTD) using resting state functional MRI. The group-specific resting state networks were identified by high model order independent component analysis, and a dual regression technique was used to detect between-group differences in the resting state networks with a p<0.05 threshold corrected for multiple comparisons. A y-concatenation method was used to correct for multiple comparisons over multiple independent components, grey matter differences, and the voxel level. We found increased connectivity in several networks in patients with bvFTD compared to the control group. The most prominent enhancement was seen in the right frontotemporal area and insula. A significant increase in functional connectivity was also detected in the left dorsal attention network, in the anterior paracingulate – a default mode sub-network – as well as in the anterior parts of the frontal pole. Notably, the increased patterns of connectivity were seen in areas around atrophic regions. The present results demonstrate abnormal increased connectivity in several important brain networks, including the dorsal attention network and default-mode network, in patients with behavioral variant frontotemporal dementia. These changes may be associated with decline in executive functions and attention as well as apathy, which are the major cognitive and neuropsychiatric defects in patients with frontotemporal dementia.

Anne Marja Remes

2013-08-01

257

A multiple linear regression technique was used to evaluate and correct the matrix interferences in the determination of As and Pb concentrations in fly ashes by inductively coupled plasma optical emission spectrometry. The direct determination of As and Pb in SRM 1633b by ICP-OES failed to obtain the certified concentrations, except in a couple of cases. However, it proved possible to use the multiple linear regression (MLR) technique to correct the determined concentrations to a satisfactor...

Ilander, Aki; Väisänen, Ari

2010-01-01

258

Factors Associated with Methadone Treatment Duration: A Cox Regression Analysis

This study examined retention rates and associated predictors of methadone maintenance treatment (MMT) duration among 128 newly admitted patients in Taiwan. A semi-structured questionnaire was used to obtain demographic and drug use history. Daily records of methadone taken and test results for HIV, HCV, and morphine toxicology were taken from a computerized medical registry. Cox regression analyses were performed to examine factors associated with MMT duration. MMT retention rates were 80.5%, 68.8%, 53.9%, and 41.4% for 3, 6, 12, and 18 months, respectively. Excluding 38 patients incarcerated during the study period, retention rates were 81.1%, 73.3%, 61.1%, and 48.9% for 3, 6, 12, and 18 months, respectively. No participant seroconverted to HIV and 1 died during the 18-month follow-up. Results showed that being female, imprisonment, a longer distance from house to clinic, having a lower methadone dose after 30 days, being HCV positive, and being in the New Taipei City program predicted early patient dropout. The findings suggest favorable MMT outcomes of HIV seroincidence and mortality. Results indicate that the need to minimize travel distance and to provide programs that meet women’s requirements justify expansion of MMT clinics in Taiwan. PMID:25875531

Peng, Ching-Yi; Chao, En; Lee, Tony Szu-Hsien

2015-01-01

259

This study encompasses air surface temperature (AST) modeling in the lower atmosphere. A dataset of four atmospheric pollutant gases (CO, O3, CH4, and H2O), retrieved from the National Aeronautics and Space Administration Atmospheric Infrared Sounder (AIRS) from 2003 to 2008, was employed to develop a model to predict AST values over the Malaysian peninsula using the multiple regression method. For the entire period, the pollutants were highly correlated (R=0.821) with predicted AST. Comparisons among five stations in 2009 showed close agreement between the predicted AST and the observed AST from AIRS, especially in the southwest monsoon (SWM) season, within 1.3 K, and for in situ data, within 1 to 2 K. The validation results of AST against AST from AIRS showed a high correlation coefficient (R=0.845 to 0.918), indicating the model's efficiency and accuracy. Statistical analysis in terms of the regression coefficients (β) showed that H2O (0.565 to 1.746) tended to contribute significantly to high AST values during the northeast monsoon season. Generally, these results clearly indicate the advantage of using the satellite AIRS data and a correlation analysis study to investigate the impact of atmospheric greenhouse gases on AST over the Malaysian peninsula. A model was developed that is capable of retrieving the Malaysian peninsular AST in all weather conditions, with total uncertainties ranging between 1 and 2 K.

Rajab, Jasim Mohammed; Jafri, Mohd. Zubir Mat; Lim, Hwee San; Abdullah, Khiruddin

2012-10-01

260

Energy Technology Data Exchange (ETDEWEB)

Evaluation of the economic feasibility of a bio-gasification facility requires understanding of its unit cost under different production capacities. The objective of this study was to evaluate the unit cost of syngas production at capacities from 60 through 1800 Nm3/h using an economic model with three regression analysis techniques (simple regression, reciprocal regression, and log-log regression). The preliminary result of this study showed that the reciprocal regression analysis technique had the best-fit curve between per unit cost and production capacity, with a sum of error squares (SES) lower than 0.001 and a coefficient of determination (R2) of 0.996. The regression analysis techniques determined a minimum unit cost of syngas production for micro-scale bio-gasification facilities of $0.052/Nm3, under the capacity of 2,880 Nm3/h. The results of this study suggest that to reduce cost, facilities should run at a high production capacity. In addition, this technique could serve as a new criterion for evaluating micro-scale bio-gasification facilities from the perspective of economic analysis.
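A reciprocal regression of unit cost on capacity is linear after transforming the regressor, so it can be fitted with ordinary least squares on 1/Q. A compact sketch with invented numbers (not the study's data):

```python
import numpy as np

def fit_reciprocal(capacity, unit_cost):
    """Fit unit_cost = a + b / capacity by OLS on the transformed regressor 1/Q."""
    X = np.column_stack([np.ones_like(capacity), 1.0 / capacity])
    coef, *_ = np.linalg.lstsq(X, unit_cost, rcond=None)
    return coef  # (a, b)

# Hypothetical cost curve: a fixed cost floor plus a term that decays with capacity
capacity = np.array([60.0, 120.0, 240.0, 480.0, 900.0, 1800.0])  # Nm3/h
unit_cost = 0.05 + 30.0 / capacity                               # $/Nm3
a, b = fit_reciprocal(capacity, unit_cost)
```

The fitted asymptote a is the unit cost approached at high capacity, which is why such curves point toward running facilities at large scale.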

Deng, Yangyang; Parajuli, Prem B.

2011-08-10

261

Predicting sediment yield is necessary for good land and water management in any river basin. However, sometimes, the sediment data is either not available or is sparse, which renders estimating sediment yield a daunting task. The present study investigates the factors influencing suspended sediment yield using the principal component analysis (PCA). Additionally, the regression relationships for estimating suspended sediment yield, based on the selected key factors from the PCA, are develope...

Piyawat Wuttichaikitcharoen; Mukand Singh Babel

2014-01-01

262

A regression analysis on the green olives debittering

Directory of Open Access Journals (Sweden)

Full Text Available In this paper, a regression model, which gives the debittering time t as a function of the sodium hydroxide concentration C and the debittering temperature T, at the debittering of medium-size green olive fruit of the Conservolea variety, is fitted. This model has the simple form t = a_o · C^{a_1} · e^{a_2/T}, where a_o, a_1, and a_2 are constants. The values of a_o, a_1, and a_2 are determined by the method of least squares from a set of experimental data. The determined model is very satisfactory for the conditions in which Greek green olives are debittered.

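Taking logarithms makes the model linear in its parameters, ln t = ln a_o + a_1 ln C + a_2/T, so the least-squares fit reduces to ordinary multiple regression. A sketch with synthetic data (the true constants, concentration and temperature ranges here are invented):

```python
import numpy as np

def fit_debittering(C, T, t):
    """Fit t = a0 * C**a1 * exp(a2 / T) by OLS on ln t."""
    X = np.column_stack([np.ones_like(C), np.log(C), 1.0 / T])
    coef, *_ = np.linalg.lstsq(X, np.log(t), rcond=None)
    a0, a1, a2 = np.exp(coef[0]), coef[1], coef[2]
    return a0, a1, a2

# Synthetic NaOH concentrations, temperatures (K) and exact times from known constants
rng = np.random.default_rng(8)
C = rng.uniform(1.0, 3.0, 40)
T = rng.uniform(288.0, 308.0, 40)
t = 0.5 * C ** (-1.2) * np.exp(900.0 / T)
a0, a1, a2 = fit_debittering(C, T, t)
```

A negative a_1 and positive a_2 match the expected physics: stronger lye and higher temperature both shorten the debittering time.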

Kopsidas, Gerassimos C.

1991-12-01

263

THE THEORY AND APPLICATION OF REGRESSION ANALYSIS AND THE LEAST-SQUARES PRINCIPLE

Directory of Open Access Journals (Sweden)

Full Text Available The theory and practice of regression analysis, and the principle of least squares on which it is based, are frequently encountered in mathematics and particularly statistical mathematics, but less well known are some very useful applications in a military environment. It is therefore the aim of this article firstly to give a general description of the theory of regression analysis, and secondly to highlight some military applications of the theory.

P. De Villiers

2012-02-01

264

The primary treatment goal of radiotherapy for paragangliomas of the head and neck region (HNPGLs) is local control of the tumor, i.e. stabilization of tumor volume. Interestingly, regression of tumor volume has also been reported. Up to the present, no meta-analysis has been performed giving an overview of regression rates after radiotherapy in HNPGLs. The main objective was to perform a systematic review and meta-analysis to assess regression of tumor volume in HNPGL-patients after radiotherapy. A second outcome was local tumor control. Design of the study is systematic review and meta-analysis. PubMed, EMBASE, Web of Science, COCHRANE and Academic Search Premier and references of key articles were searched in March 2012 to identify potentially relevant studies. Considering the indolent course of HNPGLs, only studies with ≥ 12 months follow-up were eligible. Main outcomes were the pooled proportions of regression and local control after radiotherapy as initial, combined (i.e. directly post-operatively or post-embolization) or salvage treatment (i.e. after initial treatment has failed) for HNPGLs. A meta-analysis was performed with an exact likelihood approach using a logistic regression with a random effect at the study level. Pooled proportions with 95% confidence intervals (CI) were reported. Fifteen studies were included, concerning a total of 283 jugulotympanic HNPGLs in 276 patients. Pooled regression proportions for initial, combined and salvage treatment were respectively 21%, 33% and 52% in radiosurgery studies and 4%, 0% and 64% in external beam radiotherapy studies. Pooled local control proportions for radiotherapy as initial, combined and salvage treatment ranged from 79% to 100%. Radiotherapy for jugulotympanic paragangliomas results in excellent local tumor control and therefore is a valuable treatment for these types of tumors.
The effects of radiotherapy on regression of tumor volume remain ambiguous, although the data suggest that regression can be achieved at least in some patients. More research is needed to identify predictors for treatment success. PMID:23332889

van Hulsteijn, Leonie T; Corssmit, Eleonora P M; Coremans, Ida E M; Smit, Johannes W A; Jansen, Jeroen C; Dekkers, Olaf M

2013-02-01

265

International Nuclear Information System (INIS)

The primary treatment goal of radiotherapy for paragangliomas of the head and neck region (HNPGLs) is local control of the tumor, i.e. stabilization of tumor volume. Interestingly, regression of tumor volume has also been reported. Up to the present, no meta-analysis has been performed giving an overview of regression rates after radiotherapy in HNPGLs. The main objective was to perform a systematic review and meta-analysis to assess regression of tumor volume in HNPGL-patients after radiotherapy. A second outcome was local tumor control. Design of the study is systematic review and meta-analysis. PubMed, EMBASE, Web of Science, COCHRANE and Academic Search Premier and references of key articles were searched in March 2012 to identify potentially relevant studies. Considering the indolent course of HNPGLs, only studies with ≥12 months follow-up were eligible. Main outcomes were the pooled proportions of regression and local control after radiotherapy as initial, combined (i.e. directly post-operatively or post-embolization) or salvage treatment (i.e. after initial treatment has failed) for HNPGLs. A meta-analysis was performed with an exact likelihood approach using a logistic regression with a random effect at the study level. Pooled proportions with 95% confidence intervals (CI) were reported. Fifteen studies were included, concerning a total of 283 jugulotympanic HNPGLs in 276 patients. Pooled regression proportions for initial, combined and salvage treatment were respectively 21%, 33% and 52% in radiosurgery studies and 4%, 0% and 64% in external beam radiotherapy studies. Pooled local control proportions for radiotherapy as initial, combined and salvage treatment ranged from 79% to 100%. Radiotherapy for jugulotympanic paragangliomas results in excellent local tumor control and therefore is a valuable treatment for these types of tumors.
The effects of radiotherapy on regression of tumor volume remain ambiguous, although the data suggest that regression can be achieved at least in some patients. More research is needed to identify predictors for treatment success

266

The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis

DEFF Research Database (Denmark)

This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression function. However, the a priori specification of a functional form involves the risk of choosing one that is not similar to the “true” but unknown relationship between the regressors and the dependent variable. This problem, known as parametric misspecification, can result in biased parameter estimates and hence, also in biased measures, which are derived from the estimated parameters. This, in turn, can result in incorrect economic conclusions and recommendations for managers, politicians and decision makers in general. This PhD thesis focuses on a nonparametric econometric approach that can be used to avoid this problem. The main objective is to investigate the applicability of the nonparametric kernel regression method in applied production analysis. The focus of the empirical analyses included in this thesis is the agricultural sector in Poland. Data on Polish farms are used to investigate practically and politically relevant problems and to illustrate how nonparametric regression methods can be used in applied microeconomic production analysis both in panel data and cross-section data settings. The thesis consists of four papers. The first paper addresses problems of parametric and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences within a nonparametric panel data regression framework. 
The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric kernel methods are well-suited to econometric production analysis and can outperform traditional parametric methods. Although the empirical focus of this thesis is on the application of nonparametric kernel regression in applied production analysis, the findings are also applicable to econometric estimations in general.
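The simplest nonparametric kernel regression estimator, and a common starting point in this literature, is Nadaraya-Watson: a locally weighted average controlled by a bandwidth rather than a parametric functional form. A minimal sketch with synthetic data (bandwidth fixed by hand; in practice it would be chosen by cross-validation):

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, h):
    """Gaussian-kernel estimate m(x) = sum_i K((x - x_i)/h) y_i / sum_i K((x - x_i)/h)."""
    d = (x_eval[:, None] - x_train[None, :]) / h
    w = np.exp(-0.5 * d ** 2)             # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)  # locally weighted average

rng = np.random.default_rng(9)
x = rng.uniform(0, 2 * np.pi, 400)
y = np.sin(x) + rng.normal(0, 0.2, 400)
grid = np.linspace(0.5, 2 * np.pi - 0.5, 50)   # interior points, away from boundary bias
m_hat = nadaraya_watson(x, y, grid, h=0.3)
```

No functional form for the regression curve was specified, which is exactly the protection against parametric misspecification that the thesis investigates.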

Czekaj, Tomasz Gerard

2013-01-01

267

A Noncentral "t" Regression Model for Meta-Analysis

In this article, three multilevel models for meta-analysis are examined. Hedges and Olkin suggested that effect sizes follow a noncentral "t" distribution and proposed several approximate methods. Raudenbush and Bryk further refined this model; however, this procedure is based on a normal approximation. In the current research literature, this…

Camilli, Gregory; de la Torre, Jimmy; Chiu, Chia-Yi

2010-01-01

268

Outcome misclassification is widespread in epidemiology, but methods to account for it are rarely used. We describe the use of multiple imputation to reduce bias when validation data are available for a subgroup of study participants. This approach is illustrated using data from 308 participants in the multicenter Herpetic Eye Disease Study between 1992 and 1998 (48% female; 85% white; median age, 49 years). The odds ratio comparing the acyclovir group with the placebo group on the gold-stand...

Edwards, Jessie K.; Cole, Stephen R.; Troester, Melissa A.; Richardson, David B.

2013-01-01

269

Robust regression applied to fractal/multifractal analysis.

Fractal and multifractal concepts have grown increasingly popular in soil analysis in recent years, along with the development of fractal models. One of the common steps is to calculate the slope of a linear fit, commonly using the least squares method. This shouldn't be a special problem; however, in many situations using experimental data the researcher has to select the range of scales at which to work, neglecting the rest of the points, to achieve the best linearity that in thi...
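One standard robust alternative to least squares for this kind of log-log slope estimation is the Theil-Sen estimator (the median of all pairwise slopes). The sketch below is illustrative only, assuming synthetic scaling data with a known slope and a single gross outlier; it is not the procedure proposed in the entry above.

```python
import numpy as np

def theil_sen_slope(x, y):
    """Theil-Sen estimator: median of slopes over all point pairs,
    with the intercept taken as the median residual offset."""
    i, j = np.triu_indices(len(x), k=1)
    slopes = (y[j] - y[i]) / (x[j] - x[i])    # assumes distinct x values
    slope = np.median(slopes)
    intercept = np.median(y - slope * x)
    return slope, intercept

# Log-log scaling data with true slope 2.0 and one gross outlier at the largest scale
log_scale = np.linspace(0, 3, 30)
log_measure = 2.0 * log_scale + 0.5
log_measure[-1] += 5.0                         # the outlier

slope, intercept = theil_sen_slope(log_scale, log_measure)
ls_slope = np.polyfit(log_scale, log_measure, 1)[0]   # least squares, for comparison
```

The median of pairwise slopes is unaffected by a small fraction of contaminated points, whereas the least-squares slope is pulled visibly upward by the single outlier.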

Portilla, F.; Valencia Delfa, José Luis; Tarquis Alfonso, Ana María; Saa Requejo, Antonio

2012-01-01

270

DEFF Research Database (Denmark)

This paper proposes a simple method for estimating emissions and predicted environmental concentrations (PECs) in water and air for organic chemicals that are used in household products and industrial processes. The method has been tested on existing data for 63 organic high-production volume chemicals available in the European Chemicals Bureau risk assessment reports (RARs). The method suggests a simple linear relationship between Henry's Law constant, octanol-water coefficient, use and production volumes, and emissions and PECs on a regional scale in the European Union. Emissions and PECs are a result of a complex interaction between chemical properties, production and use patterns and geographical characteristics. A linear relationship cannot capture these complexities; however, it may be applied at a cost-efficient screening level for suggesting critical chemicals that are candidates for an in-depth risk assessment. Uncertainty measures are not available for the RAR data; however, uncertainties for the applied regression models are given in the paper. Evaluation of the methods reveals that between 79% and 93% of all emission and PEC estimates are within one order of magnitude of the reported RAR values. Bearing in mind that the domain of the method comprises organic industrial high-production volume chemicals, four chemicals, prioritized in the Water Framework Directive and the Stockholm Convention on Persistent Organic Pollutants, were used to test the method for estimated emissions and PECs, with corresponding uncertainty intervals, in air and water at regional EU level.

Fauser, Patrik; Thomsen, Marianne

2010-01-01

271

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Accurate prediction of antigenic epitopes is important for immunologic research and medical applications, but it is still an open problem in bioinformatics. The case for discontinuous epitopes is even worse - currently there are only a few discontinuous epitope prediction servers available, though discontinuous peptides constitute the majority of all B-cell antigenic epitopes. The small number of structures for antigen-antibody complexes limits the development of reliable discontinuous epitope prediction methods and an unbiased benchmark to evaluate developed methods. Results In this work, we present two novel server applications for discontinuous epitope prediction: EPSVR and EPMeta, where EPMeta is a meta server. EPSVR, EPMeta, and datasets are available at http://sysbio.unl.edu/services. Conclusion The server application for discontinuous epitope prediction, EPSVR, uses a Support Vector Regression (SVR) method to integrate six scoring terms. Furthermore, we combined EPSVR with five existing epitope prediction servers to construct EPMeta. All methods were benchmarked by our curated independent test set, in which all antigens had no complex structures with the antibody, and their epitopes were identified by various biochemical experiments. The area under the receiver operating characteristic curve (AUC) of EPSVR was 0.597, higher than that of any other existing single server, and EPMeta had a better performance than any single server - with an AUC of 0.638, significantly higher than PEPITO and DiscoTope (p-value

Yao Bo

2010-07-01

272

Prostate cancer most commonly presents as initially castration dependent; however, in a minority of patients the disease will progress to a state of castration resistance. Here, approaches for correlating alterations in the phosphoproteome with androgen-independent cell survival in the LNCaP, PC3, and MDA-PCa-2b cell lines are discussed. The performance of the regression techniques multiple linear, ridge, principal component, and partial least squares regression is compared. The predictive performance of these algorithms over randomized data sets and using the Akaike Information Criterion is explored, and principal component and partial least squares regression are found to outperform other regression approaches. The effect of altering the number of features versus observations on the R(2) value and predictive performance is also examined using the partial least squares regression model. Utilizing these approaches, "drivers" of castration resistant disease can be identified whose modulation alters phenotypic outcomes. These data provide an empirical comparison of the various considerations when statistically analyzing phosphorylation data with the aim of correlating with phenotypic outcomes. PMID:24413303

Lescarbeau, Reynald; Kaplan, David L

2014-03-01

273

The purpose of this research work is to build a multiple linear regression model for the characteristics of a multicylinder diesel engine using multicomponent blends (diesel-pungamia methyl ester-ethanol) as fuel. Nine blends were tested by varying diesel (100 to 10% by vol.) and biodiesel (80 to 10% by vol.) and keeping ethanol constant at 10%. The brake thermal efficiency, smoke, oxides of nitrogen, carbon dioxide, maximum cylinder pressure, angle of maximum ...

Gopal Rajendiran; Kavandappa-Goundar Mayilsamy; Ramasamy Subramanian; Natarajan Nedunchezhian; Ramasamy Venkatachalam

2014-01-01

274

Directory of Open Access Journals (Sweden)

Full Text Available Predicting sediment yield is necessary for good land and water management in any river basin. However, sometimes the sediment data are either not available or sparse, which renders estimating sediment yield a daunting task. The present study investigates the factors influencing suspended sediment yield using principal component analysis (PCA). Additionally, regression relationships for estimating suspended sediment yield, based on the key factors selected from the PCA, are developed. The PCA shows six components of key factors that can explain at least up to 86.7% of the variation of all variables. The regression models show that basin size, channel network characteristics, land use, basin steepness and rainfall distribution are the key factors affecting sediment yield. The validation of the regression relationships for estimating suspended sediment yield shows estimation errors ranging from −55% to +315% and from −59% to +259% for suspended sediment yield and for area-specific suspended sediment yield, respectively. The proposed relationships may be considered useful for predicting suspended sediment yield in ungauged basins of Northern Thailand that have geologic, climatic and hydrologic conditions similar to the study area.

Piyawat Wuttichaikitcharoen

2014-08-01

275

DEFF Research Database (Denmark)

This paper analyses multivariate high frequency financial data using realized covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis, and covariance. It will be based on a fixed interval of time (e.g., a day or week), allowing the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions, and covariances change through time. In particular we provide confidence intervals for each of these quantities.
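The core quantity in this line of work, realized covariation, is computable directly from high-frequency returns. The sketch below is a toy illustration of that computation on simulated 5-minute returns (the data, the true beta of 1.5, and the noise levels are all assumptions), not the asymptotic theory developed in the paper.

```python
import numpy as np

# Simulate one trading day of 5-minute returns (78 intervals) for an
# asset driven by a market factor with true regression coefficient 1.5.
rng = np.random.default_rng(1)
n = 78
market = rng.normal(0, 0.001, n)
asset = 1.5 * market + rng.normal(0, 0.0005, n)

# Realized (co)variation over the fixed interval: sums of products of
# high-frequency returns; the realized regression coefficient is their ratio.
realized_var = np.sum(market**2)
realized_cov = np.sum(market * asset)
realized_beta = realized_cov / realized_var
```

As the sampling frequency increases (the number of intra-day returns grows), this realized beta converges to the true covariation-based regression coefficient for that day, which is what allows the paper to track how such coefficients change through time.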

Barndorff-Nielsen, Ole Eiler; Shephard, N.

2004-01-01

276

This work studies a fundamental problem in blood capillary growth: how the cell proliferation or death induces the stress response and the capillary extension or regression. We develop a one-dimensional viscoelastic model of blood capillary extension/regression under nonlinear friction with surroundings, analyze its solution properties, and simulate various growth patterns in angiogenesis. The mathematical model treats the cell density as the growth pressure eliciting a viscoelastic response from the cells, which again induces extension or regression of the capillary. Nonlinear analysis captures two cases when the biologically meaningful solution exists: (1) the cell density decreases from root to tip, which may occur in vessel regression; (2) the cell density is time-independent and is of small variation along the capillary, which may occur in capillary extension without proliferation. The linear analysis with perturbation in cell density due to proliferation or death predicts the global biological solution exists provided the change in cell density is sufficiently slow in time. Examples with blow-ups are captured by numerical approximations and the global solutions are recovered by slow growth processes, which validate the linear analysis theory. Numerical simulations demonstrate this model can reproduce angiogenesis experiments under several biological conditions including blood vessel extension without proliferation and blood vessel regression. PMID:23149501

Zheng, Xiaoming; Xie, Chunjing

2014-01-01

277

DEFF Research Database (Denmark)

The Sea Level Thematic Assembly Center in the EU FP7 MyOcean project aims to build a sea level service for multiple satellite sea level observations at a European level for GMES marine applications. It aims to improve sea level related products to guarantee the sustainability and quality of the GMES marine core service. One such added value will be a multivariate regression model of sea level variability from multisatellite and in-situ tide gauge observations, with the aim of improved future high spatial and temporal sea level prediction for, e.g., human safety. Tide gauge and satellite altimetry data from the last seventeen years have been compared for an area around the UK, and temporal correlation coefficients between them were calculated. The results are extremely encouraging, as we have shown that the detided signal from the response method correlates at more than 90% with satellite altimetry for nearly all tide gauge stations.

Cheng, Yongcun; Andersen, Ole Baltazar

2010-01-01

278

International Nuclear Information System (INIS)

The interelectronic repulsion and spin-orbit interaction parameters for some Nd³⁺ β-diketone complexes have been computed using the partial and multiple regression method from the observed absorption spectra in the region 1000-23500 cm⁻¹. A brief outline of this method, which is an alternative to a computer programming method, is given. The energy parameters (Slater-Condon and Landé) derived from intra-f^N transitions of the lanthanide ion are important for predicting the covalent tendency of the metal-ligand bond in the complex on the basis of the decrease in the value of these parameters. The complexes have been arranged in increasing order of covalency, as indicated by the value of β or b^{1/2}. (author)

279

Directory of Open Access Journals (Sweden)

Full Text Available Habitat degradation and loss have been widely recognized as the main cause of the decline of wildlife populations. Evaluating the quality of wildlife habitat can provide essential information for wildlife refuge design and management. The purpose of this study was to produce georeferenced ecological information about suitable habitats available for the muntjac, Muntiacus muntjak, in Chandoli tiger reserve, India (17° 04' 00" N to 17° 19' 54" N and 73° 40' 43" E to 73° 53' 09" E). Habitats were evaluated using multiple logistic regression integrated with remote sensing and a geographic information system. Satellite imageries of LISS-III of IRS-P6 of the study area were digitally processed. To generate collateral data, topographic maps were analysed in a GIS framework. Layers of different variables such as land use/land cover, forest density, proximity to disturbances and water resources, and a digital terrain model were created from satellite and topographic sheets. These layers, along with GPS locations of muntjac presence/absence, were integrated with multiple logistic regression (MLR) techniques in a GIS environment to model a habitat suitability index for the muntjac. The results indicate that approximately 222.39 km² (75.4%) of the forest of the tiger reserve was least suitable for muntjac, whereas 29.53 km² (10.02%) was moderately suitable, 22.12 km² (7.5%) suitable and 20.70 km² (7.0%) highly suitable. The accuracy level of this model was 97.6%. The model can be considered potent enough to advocate that the forests of this area are most appropriate for declaration as a reserve for muntjac conservation, ultimately to provide a prey base for the tiger.

Imam EKWAL

2012-12-01

280

Texture Analysis and Classification With Linear Regression Model Based on Wavelet Transform

The wavelet transform, as an important multiresolution analysis tool, has already been commonly applied to texture analysis and classification. Nevertheless, it ignores the structural information while capturing the spectral information of the texture image at different scales. In this paper, we propose a texture analysis and classification approach with the linear regression model based on the wavelet transform. This method is motivated by the observation that there exists a distinctive correl...

Wang, Zhi-zhong; Yong, Jun-hai

2008-01-01

281

Regression Analysis of Top of Descent Location for Idle-thrust Descents

In this paper, multiple regression analysis is used to model the top of descent (TOD) location of user-preferred descent trajectories computed by the flight management system (FMS) on over 1000 commercial flights into Melbourne, Australia. The independent variables cruise altitude, final altitude, cruise Mach, descent speed, wind, and engine type were also recorded or computed post-operations. Both first-order and second-order models are considered, where cross-validation, hypothesis testing, and additional analysis are used to compare models. This identifies the models that should give the smallest errors if used to predict TOD location for new data in the future. A model that is linear in TOD altitude, final altitude, descent speed, and wind gives an estimated standard deviation of 3.9 nmi for TOD location given the trajectory parameters, which means about 80% of predictions would have error less than 5 nmi in absolute value. This accuracy is better than demonstrated by other ground automation predictions using kinetic models. Furthermore, this approach would enable online learning of the model. Additional data or further knowledge of algorithms is necessary to conclude definitively that no second-order terms are appropriate. Possible applications of the linear model are described, including enabling arriving aircraft to fly optimized descents computed by the FMS even in congested airspace. In particular, a model for TOD location that is linear in the independent variables would enable decision support tool human-machine interfaces for which a kinetic approach would be computationally too slow.

Stell, Laurel; Bronsvoort, Jesper; McDonald, Greg

2013-01-01

282

The analysis of physical measurements often copes with highly correlated noises and interruptions caused by outliers, saturation events or transmission losses. We assess the impact of missing data on the performance of linear regression analysis involving the fit of modeled or measured time series. We show that data gaps can significantly alter the precision of the regression parameter estimation in the presence of colored noise, due to the frequency leakage of the noise power. We present a regression method which cancels this effect and estimates the parameters of interest with a precision comparable to the complete data case, even if the noise power spectral density (PSD) is not known a priori. The method is based on an autoregressive (AR) fit of the noise, which allows us to build an approximate generalized least squares estimator approaching the minimal variance bound. The method, which can be applied to any similar data processing, is tested on simulated measurements of the MICROSCOPE space mission, whos...

Baghi, Q; Bergé, J; Christophe, B; Touboul, P; Rodrigues, M

2015-01-01

283

In genome-wide association studies (GWAS), regression analysis has been most commonly used to establish an association between a phenotype and genetic variants, such as single nucleotide polymorphism (SNP). However, most applications of regression analysis have been restricted to the investigation of single marker because of the large computational burden. Thus, there have been limited applications of regression analysis to multiple SNPs, including gene-gene interaction (GGI) in large-scale GWAS data. In order to overcome this limitation, we propose CARAT-GxG, a GPU computing system-oriented toolkit, for performing regression analysis with GGI using CUDA (compute unified device architecture). Compared to other methods, CARAT-GxG achieved almost 700-fold execution speed and delivered highly reliable results through our GPU-specific optimization techniques. In addition, it was possible to achieve almost-linear speed acceleration with the application of a GPU computing system, which is implemented by the TORQUE Resource Manager. We expect that CARAT-GxG will enable large-scale regression analysis with GGI for GWAS data. PMID:25574130

Lee, Sungyoung; Kwon, Min-Seok; Park, Taesung

2014-01-01

285

International Nuclear Information System (INIS)

The empirical model of turbine efficiency is necessary for the control- and/or diagnosis-oriented simulation and useful for the simulation and analysis of dynamic performances of the turbine equipment and systems, such as air cycle refrigeration systems, power plants, turbine engines, and turbochargers. Existing empirical models of turbine efficiency are insufficient because there is no suitable form available for air cycle refrigeration turbines. This work performs a critical review of empirical models (called mean value models in some literature) of turbine efficiency and develops an empirical model in the desired form for air cycle refrigeration, the dominant cooling approach in aircraft environmental control systems. The Taylor series and regression analysis are used to build the model, with the Taylor series being used to expand functions with the polytropic exponent and the regression analysis to finalize the model. The measured data of a turbocharger turbine and two air cycle refrigeration turbines are used for the regression analysis. The proposed model is compact and able to present the turbine efficiency map. Its predictions agree with the measured data very well, with the corrected coefficient of determination Rc² ≥ 0.96 and the mean absolute percentage deviation = 1.19% for the three turbines. -- Highlights: • Performed a critical review of empirical models of turbine efficiency. • Developed an empirical model in the desired form for air cycle refrigeration, using the Taylor expansion and regression analysis. • Verified the method for developing the empirical model. • Verified the model.

286

In the QSRR discipline, an easy-to-use novel parameter (Vc) was designed for evaluating classical topological indices (W, ¹χ, Z, MTI) and two new-generation ones (Xu, ¹χh). The regression between Vc and ¹χh presented a correlation index (r) of 0.9992, a surprisingly high value in comparison with those commonly found in the QSPR/QSAR discipline. Through the Vc parameter, an approach to treating multiple regression with three independent variables is presented. A model set of 35 saturated hydrocarbons was used.

Cornwell, E.

2006-01-01

287

Linear Maximum Likelihood Regression Analysis for Untransformed Log-Normally Distributed Data

Directory of Open Access Journals (Sweden)

Full Text Available Medical research data are often skewed and heteroscedastic. It has therefore become practice to log-transform data in regression analysis, in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML) based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large simulation study. Log-normal observations were generated according to the simulation models, and parameters were estimated using the new ML method, ordinary least-squares regression (LS) and weighted least-squares regression (WLS). All three methods produced unbiased estimates of parameters and expected response, and ML and WLS yielded smaller standard errors than LS. The approximate normality of the Wald statistic, used for tests of the ML estimates, in most situations produced correct type I error risk. Only ML and WLS produced correct confidence intervals for the estimated expected value. ML had the highest power for tests regarding β₁.

Sara M. Gustavsson

2012-10-01

288

We consider the problem of modeling heteroscedasticity in semiparametric regression analysis of cross-sectional data. Existing work in this setting is rather limited and mostly adopts a fully nonparametric variance structure. This approach is hampered by the curse of dimensionality in practical applications. Moreover, the corresponding asymptotic theory is largely restricted to estimators that minimize certain smooth objective functions. The asymptotic derivation thus excludes semiparametric quant...

Keilegom, Ingrid; Wang, Lan

2010-01-01

289

Isolating the Effects of Training Using Simple Regression Analysis: An Example of the Procedure.

This paper provides a case example of simple regression analysis, a forecasting procedure used to isolate the effects of training from an identified extraneous variable. This case example focuses on results of a three-day sales training program to improve bank loan officers' knowledge, skill-level, and attitude regarding solicitation and sale of…

Waugh, C. Keith

290

What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

Thomas, Emily H.; Galambos, Nora

2004-01-01

291

BMDP program for piecewise linear regression.

Piecewise linear regression has potentially broad applications in medical data analysis as well as other types of regression. Various kinds of algorithms have been proposed for finding optimum piecewise linear regressions. This paper presents a BMDP program for obtaining near optimum piecewise linear regression equations. An idea intrinsic to the method is that restricting parameter space to a discrete set makes the difficult problems become standard problems. Any software having the variable selection feature in the multiple linear regression can be used to apply the method. PMID:3638186
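The key idea in the abstract above, restricting the breakpoint to a discrete set so that each candidate fit becomes a standard linear regression, can be sketched as follows. This is an illustrative reimplementation of that general idea in Python on synthetic data, not the BMDP program itself.

```python
import numpy as np

def fit_piecewise(x, y, candidate_knots):
    """Grid search over a discrete set of knots; at each candidate knot
    fit y ~ 1 + x + max(x - knot, 0) by ordinary least squares and keep
    the knot with the smallest sum of squared errors."""
    best = None
    for knot in candidate_knots:
        X = np.column_stack([np.ones_like(x), x, np.maximum(x - knot, 0.0)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        sse = float(resid @ resid)
        if best is None or sse < best[0]:
            best = (sse, knot, coef)
    return best  # (sse, knot, coefficients)

# Synthetic data with a true slope change at x = 4
x = np.linspace(0, 10, 101)
y = np.where(x < 4, 1.0 + 0.5 * x, 3.0 + 2.0 * (x - 4))

sse, knot, coef = fit_piecewise(x, y, candidate_knots=np.arange(1, 10, 0.5))
```

Because every candidate model is an ordinary multiple linear regression, any software with a variable selection feature can apply the same trick, which is exactly the point the abstract makes.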

Nakamura, T

1986-08-01

292

Methods and applications of linear models regression and the analysis of variance

Praise for the Second Edition"An essential desktop reference book . . . it should definitely be on your bookshelf." -Technometrics A thoroughly updated book, Methods and Applications of Linear Models: Regression and the Analysis of Variance, Third Edition features innovative approaches to understanding and working with models and theory of linear regression. The Third Edition provides readers with the necessary theoretical concepts, which are presented using intuitive ideas rather than complicated proofs, to describe the inference that is appropriate for the methods being discussed. The book

Hocking, Ronald R

2013-01-01

293

Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales

We propose a framework combining detrended fluctuation analysis with standard regression methodology. The method is built on detrended variances and covariances and it is designed to estimate regression parameters at different scales and under potential nonstationarity and power-law correlations. The former feature allows for distinguishing between effects for a pair of variables from different temporal perspectives. The latter ones make the method a significant improvement over the standard least squares estimation. Theoretical claims are supported by Monte Carlo simulations. The method is then applied on selected examples from physics, finance, environmental science, and epidemiology. For most of the studied cases, the relationship between variables of interest varies strongly across scales.

Kristoufek, Ladislav

2015-02-01

294

Statistical methods in regression and calibration analysis of chromosome aberration data

International Nuclear Information System (INIS)

The method of iteratively reweighted least squares for the regression analysis of Poisson distributed chromosome aberration data is reviewed in the context of other fit procedures used in the cytogenetic literature. As an application of the resulting regression curves methods for calculating confidence intervals on dose from aberration yield are described and compared, and, for the linear quadratic model a confidence interval is given. Emphasis is placed on the rational interpretation and the limitations of various methods from a statistical point of view. (orig./MG)

295

Electricity Consumption Analysis Using Spline Regression Models: The Case of a Turkish Province

Directory of Open Access Journals (Sweden)

Full Text Available Energy is one of the indispensable elements of human life, and electrical energy is the most frequently used energy type. As this type of energy cannot be stored at the present time, it has to be consumed instantly. In other words, the demand of the consumers has to be met immediately. This paper models the electricity consumption of Erzurum province in 2011 by spline regression and determines whether a statistically significant seasonal variation exists for this consumption. The one-year data set of the investigation was obtained from the Turkish Electricity Transmission Company Provincial Directorate of Erzurum and was analyzed by means of continuous partial polynomial spline regressions. This analysis determined three knots and fits linear, quadratic and cubic spline regression models.
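A regression spline of the kind used above can be fitted with ordinary least squares once the design matrix is built from a truncated power basis. The sketch below uses three knots and a cubic basis on synthetic data; the knot positions and the toy consumption-like series are assumptions for illustration, not the Erzurum data.

```python
import numpy as np

def spline_design(x, knots, degree):
    """Truncated-power-basis design matrix for a regression spline:
    a global polynomial of the given degree plus one truncated power
    term per knot, giving a continuous piecewise polynomial fit."""
    cols = [x**d for d in range(degree + 1)]                  # polynomial part
    cols += [np.maximum(x - k, 0.0)**degree for k in knots]   # one term per knot
    return np.column_stack(cols)

# Toy smooth consumption-like series with mild curvature and noise
rng = np.random.default_rng(2)
x = np.linspace(0, 12, 120)
y = np.sin(x / 2.0) + rng.normal(0, 0.05, 120)

X = spline_design(x, knots=[3.0, 6.0, 9.0], degree=3)        # cubic spline, 3 knots
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
```

Changing `degree` from 3 to 1 or 2 reproduces the linear and quadratic spline variants mentioned in the abstract; the fit remains a single least-squares problem in every case.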

Omer Alkan

2013-05-01

296

It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916
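Quantile regression of the kind applied above minimizes the pinball (check) loss rather than squared error. A minimal sketch follows, using direct numerical minimization on synthetic data; the data-generating model, quantile levels, and optimizer settings are assumptions for illustration, and production work would normally use a dedicated routine such as statsmodels' QuantReg.

```python
import numpy as np
from scipy.optimize import minimize

def pinball_loss(beta, X, y, tau):
    """Average check (pinball) loss for quantile level tau."""
    resid = y - X @ beta
    return np.mean(np.maximum(tau * resid, (tau - 1) * resid))

def quantile_fit(X, y, tau):
    """Linear quantile regression by direct minimisation of the pinball loss,
    started from the ordinary least-squares solution."""
    beta0 = np.linalg.lstsq(X, y, rcond=None)[0]
    res = minimize(pinball_loss, beta0, args=(X, y, tau), method="Nelder-Mead",
                   options={"xatol": 1e-6, "fatol": 1e-8, "maxiter": 5000})
    return res.x

# Toy satisfaction-style data: response depends on one predictor,
# with uniform noise on [0, 2] so conditional quantiles differ by level
rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 400)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 0.8 * x + rng.uniform(0.0, 2.0, 400)

beta_median = quantile_fit(X, y, tau=0.5)   # conditional median line
beta_low = quantile_fit(X, y, tau=0.25)     # lower-quartile line
```

Fitting several quantile levels, as the study above does with Q0.25, shows how the predictor's effect can differ across the distribution of the response rather than only at its mean.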

Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

2014-01-01

297

Objectives To review the efficacy of cognitive interventions on improving general cognition in dementia. Method Online literature databases and trial registers, previous systematic reviews and leading journals were searched for relevant randomised controlled trials. A systematic review, random-effects meta-analyses and meta-regression were conducted. Cognitive interventions were categorised as: cognitive stimulation (CS), involving a range of social and cognitive activities to stimulate multiple cognitive domains; cognitive training (CT), involving repeated practice of standardised tasks targeting a specific cognitive function; cognitive rehabilitation (CR), which takes a person-centred approach to target impaired function; or mixed CT and stimulation (MCTS). Separate analyses were conducted for general cognitive outcome measures and for studies using ‘active’ (designed to control for non-specific therapeutic effects) and non-active (minimal or no intervention) control groups. Results 33 studies were included. Significant positive effect sizes (Hedges’ g) were found for CS with the mini-mental state examination (MMSE) (g=0.51, 95% CI 0.29 to 0.69) and the Alzheimer's Disease Assessment Scale-Cognition (ADAS-Cog) (g=−0.26, 95% CI −0.445 to −0.08; p=0.005). There was no evidence that CT or MCTS produced significant improvements on general cognition outcomes, and there were not enough CR studies for meta-analysis. The lowest accepted minimum clinically important difference was reached in 11/17 CS studies for the MMSE, but only 2/9 studies for the ADAS-Cog. Additionally, 95% prediction intervals suggested that although statistically significant, CS may not lead to benefits on the ADAS-Cog in all clinical settings.
Conclusions CS improves scores on the MMSE and ADAS-Cog in dementia, but benefits on the ADAS-Cog are generally not clinically significant, and difficulties with blinding of patients and use of adequate placebo controls make comparison with the results of dementia drug treatments problematic. PMID:25838501

Huntley, J D; Gould, R L; Liu, K; Smith, M; Howard, R J

2015-01-01

298

Formal Specification Language Based IaaS Cloud Workload Regression Analysis

Cloud Computing is an emerging area for accessing computing resources. In general, Cloud service providers offer services that can be clustered into three categories: SaaS, PaaS and IaaS. This paper discusses the Cloud workload analysis. The efficient Cloud workload resource mapping technique is proposed. This paper aims to provide a means of understanding and investigating IaaS Cloud workloads and the resources. In this paper, regression analysis is used to analyze the Clou...

Singh, Sukhpal; Chana, Inderveer

2014-01-01

299

Analysis of Herd Behavior Using Quantile Regression: Evidence from Karachi Stock Exchange (KSE)

The objectives of this paper are to explore herd behavior in the Karachi Stock Exchange (KSE) by using Ordinary Least Squares (OLS) and quantile regression analysis for normal as well as bullish (up) and bearish (down) market conditions. Greed stimulates people to make increasingly risky investments, and investors therefore tend to follow one another blindly and ignore rational analysis. Herd behavior can be defined as occurring when investors ignore available information and follow other investors dur...
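Quantile regression generalises OLS by minimising the pinball (check) loss instead of squared error. A minimal sketch of that loss, on synthetic data and for a constant predictor only (no regressors), shows that the minimiser sits at an empirical quantile rather than at the mean:

```python
def pinball_loss(y, pred, q):
    """Mean check (pinball) loss at quantile q for a constant prediction."""
    return sum((q * (yi - pred)) if yi >= pred else ((q - 1) * (yi - pred))
               for yi in y) / len(y)

y = [1.0, 2.0, 2.5, 3.0, 4.0, 5.0, 8.0, 13.0]   # synthetic dispersion values
q = 0.75

# brute-force search over candidate constants: the minimiser sits at the
# empirical q-quantile (here a flat region between the 6th and 7th points)
candidates = [i / 100 for i in range(0, 1400)]
best = min(candidates, key=lambda c: pinball_loss(y, c, q))
```

Replacing the constant with a linear function of regressors, and minimising the same loss, gives the quantile regression the abstract applies to up- and down-market conditions.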

Malik, Saif Ullah; Elahi, Muhammad Ather

2014-01-01

300

Directory of Open Access Journals (Sweden)

Full Text Available A nanocrystalline SnO2 thin film was synthesized by a chemical bath method. The parameters affecting the energy band gap and surface morphology of the deposited SnO2 thin film were optimized using a semi-empirical method. Four parameters, including deposition time, pH, bath temperature and tin chloride (SnCl2·2H2O) concentration, were optimized by a factorial method. The factorial used a Taguchi OA (TOA) design method to estimate certain interactions and obtain the actual responses. Statistical evidence in the analysis of variance, including high F-values (4,112.2 and 20.27), very low P-values (<0.012 and 0.0478), non-significant lack of fit, determination coefficients (R2 equal to 0.978 and 0.977) and adequate precision (170.96 and 12.57), validated the suggested model. The optima of the suggested model were verified in the laboratory and the results were quite close to the predicted values, indicating that the model successfully simulated the optimum conditions of SnO2 thin film synthesis.

Saeideh Ebrahimiasl

2014-02-01

301

A nanocrystalline SnO2 thin film was synthesized by a chemical bath method. The parameters affecting the energy band gap and surface morphology of the deposited SnO2 thin film were optimized using a semi-empirical method. Four parameters, including deposition time, pH, bath temperature and tin chloride (SnCl2·2H2O) concentration were optimized by a factorial method. The factorial used a Taguchi OA (TOA) design method to estimate certain interactions and obtain the actual responses. Statistical evidences in analysis of variance including high F-value (4,112.2 and 20.27), very low P-value (<0.012 and 0.0478), non-significant lack of fit, the determination coefficient (R2 equal to 0.978 and 0.977) and the adequate precision (170.96 and 12.57) validated the suggested model. The optima of the suggested model were verified in the laboratory and results were quite close to the predicted values, indicating that the model successfully simulated the optimum conditions of SnO2 thin film synthesis. PMID:24509767

Ebrahimiasl, Saeideh; Zakaria, Azmi

2014-01-01

302

Application of appropriate models to approximate the performance function warrants more precise prediction and helps to make the best decisions in the poultry industry. This study reevaluated the factors affecting hatchability in laying hens from 29 to 56 wk of age. Twenty-eight data lines representing 4 inputs consisting of egg weight, eggshell thickness, egg sphericity, and yolk/albumin ratio and 1 output, hatchability, were obtained from the literature and used to train an artificial neural network (ANN). The prediction ability of ANN was compared with that of fuzzy logic to evaluate the fitness of these 2 methods. The models were compared using R(2), mean absolute deviation (MAD), mean squared error (MSE), mean absolute percentage error (MAPE), and bias. The developed model was used to assess the relative importance of each variable on the hatchability by calculating the variable sensitivity ratio. The statistical evaluations showed that the ANN-based model predicted hatchability more accurately than fuzzy logic. The ANN-based model had a higher determination of coefficient (R(2) = 0.99) and lower residual distribution (MAD = 0.005; MSE = 0.00004; MAPE = 0.732; bias = 0.0012) than fuzzy logic (R(2) = 0.87; MAD = 0.014; MSE = 0.0004; MAPE = 2.095; bias = 0.0046). The sensitivity analysis revealed that the most important variable in the ANN-based model of hatchability was egg weight (variable sensitivity ratio, VSR = 283.11), followed by yolk/albumin ratio (VSR = 113.16), eggshell thickness (VSR = 16.23), and egg sphericity (VSR = 3.63). The results of this research showed that the universal approximation capability of ANN made it a powerful tool to approximate complex functions such as hatchability in the incubation process. PMID:23472039
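The comparison metrics named above (MAD, MSE, MAPE, bias) are straightforward to compute. A small sketch on hypothetical actual/predicted pairs (the hatchability fractions below are invented, not the study's data):

```python
def regression_metrics(actual, predicted):
    """MAD, MSE, MAPE (in %), and bias, as used to compare competing models."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mad = sum(abs(e) for e in errors) / n                 # mean absolute deviation
    mse = sum(e * e for e in errors) / n                  # mean squared error
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    bias = sum(errors) / n                                # mean signed error
    return {"MAD": mad, "MSE": mse, "MAPE": mape, "bias": bias}

actual = [0.82, 0.85, 0.90, 0.78]      # hypothetical hatchability fractions
predicted = [0.80, 0.86, 0.89, 0.80]
m = regression_metrics(actual, predicted)
```

A model with lower residual-distribution metrics and bias nearer zero, as the ANN shows here relative to fuzzy logic, is the better approximator on the evaluation set.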

Mehri, M

2013-04-01

303

DEFF Research Database (Denmark)

Colloids are potential carriers for strongly sorbing chemicals in macroporous soils, but predicting the amount of colloids readily available for facilitated chemical transport is an unsolved challenge. This study addresses potential key parameters and predictive indicators when assessing colloid dispersibility and transport at the field scale. Samples representing three measurement scales (1-2 mm aggregates, intact 100 cm3 rings, and intact 6283 cm3 columns) were retrieved from the topsoil of a 1.69 ha agricultural field in a 15 m × 15 m grid (65 locations) to determine soil dispersibility as well as 24 comparison parameters including textural, chemical, and structural (e.g. air permeability) soil properties. The soil dispersibility was determined (i) using a laser diffraction method on 1-2 mm aggregates equilibrated to an initial matric potential of -100 cm H2O, (ii) using end-over-end shaking of 6.06 cm (diam.) × 3.48 cm (height) intact soil rings equilibrated to an initial matric potential of -5 cm H2O, and (iii) as the accumulated amount of particles leached from 20 cm × 20 cm intact soil columns after 6.5 hr (60 mm accumulated outflow). At all three scales, soil dispersibility was higher in samples collected from the northern part of the field, where the greatest leaching of pesticides was observed in a horizontal well at ~3.5 m depth during a 9-year monitoring program. This suggests that the three dispersibility methods used are all relevant for field-scale mapping of areas with enhanced risk of colloid-facilitated transport. Subsequently, using multiple linear regression (MLR) analyses, soil dispersibility was predicted at all three sample scales from the 24 measured, geo-referenced parameters to produce sets of only a few promising indicator parameters for evaluating soil stability and particle mobilization at the field scale. The MLR analyses at each scale were separated into predictions using all, only northern, and only southern locations in the field. 
We found that different independent variables were included in the regression models when the sample scale increased from aggregate to column level. Generally, the predictive power of the regression models was better at the 1-2 mm aggregate scale than at the intact 100 cm3 and 20 cm × 20 cm scales. Overall, the results suggested that different drivers controlled soil dispersibility at the three scales and in the two sub-areas of the field. Predictions of soil dispersibility and the risk of colloid-facilitated chemical transport will therefore need to be highly scale- and area-specific.
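The MLR step can be sketched with ordinary least squares on synthetic data standing in for the geo-referenced parameters; the three predictors, their coefficients, and the noise level below are invented, not the study's measurements.

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical geo-referenced predictors (e.g. clay content, organic carbon,
# air permeability), one row per sampling location
n = 65
X = rng.normal(size=(n, 3))
beta_true = np.array([2.0, -1.0, 0.5])
y = 1.0 + X @ beta_true + rng.normal(scale=0.3, size=n)   # dispersibility proxy

A = np.column_stack([np.ones(n), X])          # design matrix with intercept
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares fit
resid = y - A @ coef
r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
```

Fitting the same model separately to the "all", "north-only" and "south-only" subsets, as the study does, just means repeating the `lstsq` call on row subsets of `A` and `y`.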

NØrgaard, Trine; MØldrup, Per

2014-01-01

304

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background There is a growing body of literature linking GIS-based measures of traffic density to asthma and other respiratory outcomes. However, no consensus exists on which traffic indicators best capture variability in different pollutants or within different settings. As part of a study on childhood asthma etiology, we examined variability in outdoor concentrations of multiple traffic-related air pollutants within urban communities, using a range of GIS-based predictors and land use regression techniques. Methods We measured fine particulate matter (PM2.5), nitrogen dioxide (NO2), and elemental carbon (EC) outside 44 homes representing a range of traffic densities and neighborhoods across Boston, Massachusetts and nearby communities. Multiple three- to four-day average samples were collected at each home during winters and summers from 2003 to 2005. Traffic indicators were derived using Massachusetts Highway Department data and direct traffic counts. Multivariate regression analyses were performed separately for each pollutant, using traffic indicators, land use, meteorology, site characteristics, and central site concentrations. Results PM2.5 was strongly associated with the central site monitor (R2 = 0.68). Additional variability was explained by total roadway length within 100 m of the home, smoking or grilling near the monitor, and block-group population density (R2 = 0.76). EC showed greater spatial variability, especially during winter months, and was predicted by roadway length within 200 m of the home. The influence of traffic was greater under low wind speed conditions, and concentrations were lower during summer (R2 = 0.52). NO2 showed significant spatial variability, predicted by population density and roadway length within 50 m of the home, modified by site characteristics (obstruction), and with higher concentrations during summer (R2 = 0.56). 
Conclusion Each pollutant examined displayed somewhat different spatial patterns within urban neighborhoods and was related differently to local traffic and meteorology. Our results indicate a need for multi-pollutant exposure modeling to disentangle causal agents in epidemiological studies, and for further investigation of site-specific and meteorological modification of the traffic-concentration relationship in urban neighborhoods.

Baxter Lisa K

2008-05-01

305

Despite the recent flourishing of mediation analysis techniques, many modern approaches are difficult to implement or applicable to only a restricted range of regression models. This report provides practical guidance for implementing a new technique utilizing inverse odds ratio weighting (IORW) to estimate natural direct and indirect effects for mediation analyses. IORW takes advantage of the odds ratio's invariance property and condenses information on the odds ratio for the relationship between the exposure (treatment) and multiple mediators, conditional on covariates, by regressing exposure on mediators and covariates. The inverse of the covariate-adjusted exposure-mediator odds ratio association is used to weight the primary analytical regression of the outcome on treatment. The treatment coefficient in such a weighted regression estimates the natural direct effect of treatment on the outcome, and indirect effects are identified by subtracting direct effects from total effects. Weighting renders treatment and mediators independent, thereby deactivating indirect pathways of the mediators. This new mediation technique accommodates multiple discrete or continuous mediators. IORW is easily implemented and is appropriate for any standard regression model, including quantile regression and survival analysis. An empirical example is given using data from the Moving to Opportunity (1994-2002) experiment, testing whether neighborhood context mediated the effects of a housing voucher program on obesity. Relevant Stata code (StataCorp LP, College Station, Texas) is provided. PMID:25693776
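A simplified, covariate-free sketch of the IORW idea on simulated data (binary exposure, one continuous mediator; all effect sizes and the sample size are invented) might look like the following; the real method conditions on covariates and, as noted, works with any standard outcome regression.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
A = rng.integers(0, 2, n).astype(float)        # binary exposure (e.g. voucher)
M = 1.0 * A + rng.normal(size=n)               # mediator influenced by exposure
Y = 1.0 * A + 1.0 * M + rng.normal(size=n)     # outcome: direct + mediated paths

# Step 1: logistic regression of exposure on mediator (Newton-Raphson)
Xl = np.column_stack([np.ones(n), M])
b = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-Xl @ b))
    W = p * (1 - p)
    b += np.linalg.solve(Xl.T @ (Xl * W[:, None]), Xl.T @ (A - p))

# Step 2: inverse-odds-ratio weights -- exposed units are down-weighted by the
# fitted exposure-mediator odds ratio; unexposed units keep weight 1
w = np.where(A == 1, np.exp(-b[1] * M), 1.0)

def wls_slope(x, y, weights):
    X = np.column_stack([np.ones(len(x)), x])
    Xw = X * weights[:, None]
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)[1]

total = wls_slope(A, Y, np.ones(n))    # total effect of A on Y
direct = wls_slope(A, Y, w)            # natural direct effect (weighted regression)
indirect = total - direct              # indirect effect by subtraction
```

The weighting renders exposure and mediator independent in the weighted sample, which is what deactivates the indirect pathway in the outcome regression.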

Nguyen, Quynh C; Osypuk, Theresa L; Schmidt, Nicole M; Glymour, M Maria; Tchetgen Tchetgen, Eric J

2015-03-01

306

Multilayer perceptron for robust nonlinear interval regression analysis using genetic algorithms.

On the basis of fuzzy regression, computational intelligence models such as neural networks can be applied to nonlinear interval regression analysis for dealing with uncertain and imprecise data. When training data are not contaminated by outliers, computational models perform well, including almost all given training data in the data interval. Nevertheless, since training data are often corrupted by outliers, robust learning algorithms that resist outliers in interval regression analysis have been an interesting area of research. Several computational-intelligence approaches are effective at resisting outliers, but their required parameters depend on whether the collected data contain outliers or not. Since it seems difficult to prespecify the degree of contamination beforehand, this paper uses a multilayer perceptron to construct a robust nonlinear interval regression model using a genetic algorithm. Outliers beyond or beneath the data interval have only a slight effect on the determination of the data interval. Simulation results demonstrate that the proposed method performs well for contaminated datasets. PMID:25110755
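The paper's model is an MLP tuned by a genetic algorithm; as a much simpler stand-in, the notion of a data interval can be sketched with a least-squares centre line widened symmetrically until it covers the training data (everything below is synthetic and illustrative, and deliberately non-robust: a single outlier would inflate this interval, which is exactly the problem the paper's method addresses).

```python
# Minimal linear stand-in for interval regression: fit a centre line by
# least squares, then widen it until every training point is covered.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.2, 1.1, 2.3, 2.9, 4.2, 4.8]      # synthetic, roughly linear data

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx
# half-width = largest absolute residual, so the band covers all points
half_width = max(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))

lower = [intercept + slope * x - half_width for x in xs]
upper = [intercept + slope * x + half_width for x in xs]
covered = all(lo <= y <= up for lo, y, up in zip(lower, ys, upper))
```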

Hu, Yi-Chung

2014-01-01

307

A new cluster-histo-regression analysis for incremental learning from temporal data chunks

Directory of Open Access Journals (Sweden)

Full Text Available In scenarios where data chunks arrive temporally, a good algorithm for exploratory analysis should be able to generate knowledge from the current chunk and, when the next chunk arrives, simply update that knowledge online by accumulating what is derived from the new chunk. Such an incremental learning process in most cases demands a lot of memory, since all earlier data must be carried along while the knowledge is updated successively. In this research work we propose to employ a novel Cluster-Histo-Regression analysis of each chunk to extract the knowledge for that temporal instant and fuse this knowledge, through Histo-Regression-Distance analysis, with the knowledge already accumulated. We have designed a methodology which (i) discards all those data samples from the chunk which have participated in the knowledge generation process, (ii) requires a minimum amount of memory to carry the accumulated knowledge, and (iii) carries forward only those limited data samples (referred to as hard samples) which could not contribute to the knowledge generated at that moment. Knowledge of each cluster is represented in the form of a histogram for each dimension of the clustered data and is transformed to a regression line for compact representation. The regression-line parameters of the clusters obtained by incremental augmentation have shown an accuracy of up to 100% for some of the datasets considered for experimentation.

Nagabhushan P.

2010-03-01

308

DEFF Research Database (Denmark)

Recently there has been an explosion in the availability of bacterial genomic sequences, making possible an analysis of genomic signatures across more than 800 different bacterial chromosomes from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable, with AT content, phyla, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors. Many significant factors were associated with the genomic signature, most notably AT content. Phyla was also an important factor, although considerably less so than AT content. Small improvements to the regression model, although significant, were also obtained from factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.

Ussery, David; Bohlin, Jon

2009-01-01

309

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Recently there has been an explosion in the availability of bacterial genomic sequences, making possible an analysis of genomic signatures across more than 800 different bacterial chromosomes from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable, with AT content, phyla, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors. Results Many significant factors were associated with the genomic signature, most notably AT content. Phyla was also an important factor, although considerably less so than AT content. Small improvements to the regression model, although significant, were also obtained from factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. Conclusion The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.

Skjerve Eystein

2009-10-01

310

Mitigating the effects of salt and selenium on water quality in the Grand Valley and lower Gunnison River Basin in western Colorado is a major concern for land managers. Previous modeling indicated that the models could be improved by including more detailed geospatial data and by using a more rigorous method for developing the models. After evaluating all possible combinations of geospatial variables, four multiple linear regression models were obtained, estimating irrigation-season salt yield, nonirrigation-season salt yield, irrigation-season selenium yield, and nonirrigation-season selenium yield. The adjusted r-squared and the residual standard error (in units of log-transformed yield) of the models were, respectively, 0.87 and 2.03 for the irrigation-season salt model, 0.90 and 1.25 for the nonirrigation-season salt model, 0.85 and 2.94 for the irrigation-season selenium model, and 0.93 and 1.75 for the nonirrigation-season selenium model. The four models were used to estimate yields and loads from contributing areas corresponding to 12-digit hydrologic unit codes in the lower Gunnison River Basin study area. Each of the 175 contributing areas was ranked according to its estimated mean seasonal yield of salt and selenium.
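The adjusted r-squared values quoted above penalise plain r-squared for the number of predictors; a minimal sketch of the formula (the 0.88 input and the 4-predictor count are illustrative, not taken from the study's models):

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted r-squared for n observations and p predictors (plus intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# hypothetical: a seasonal-yield model with 4 predictors fitted over 175 areas
adj = adjusted_r_squared(0.88, 175, 4)
```

Because the penalty grows with p, adding a geospatial variable that barely improves the fit can lower the adjusted value, which is why it is the preferred statistic when comparing models built from "all possible combinations" of predictors.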

Linard, Joshua I.

2013-01-01

311

Directory of Open Access Journals (Sweden)

Full Text Available In the present work, support vector machines (SVMs) and multiple linear regression (MLR) techniques were used for quantitative structure–property relationship (QSPR) studies of retention time (tR) in standardized liquid chromatography–UV–mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins), based on molecular descriptors calculated from the optimized 3D structures. By applying missing-value, zero and multicollinearity tests with a cutoff value of 0.95, and a genetic algorithm method of variable selection, the most relevant descriptors were selected, and MLR and SVM methods were employed to build the QSPR models. The robustness of the QSPR models was characterized by statistical validation and the applicability domain (AD). The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability, measured by r2 and q2, are 0.931 and 0.932, respectively, for SVM and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using William's plot. The effects of different descriptors on the retention times are described.
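The q2 statistic reported above is typically a leave-one-out cross-validated analogue of r2. For a linear model it can be computed without refitting, via the hat-matrix shortcut; the data below are synthetic stand-ins for the descriptor matrix and retention times, not the paper's dataset.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 67, 3                              # 67 compounds, 3 selected descriptors
X = rng.normal(size=(n, p))
y = X @ np.array([1.5, -0.7, 0.4]) + rng.normal(scale=0.4, size=n)  # tR proxy

A = np.column_stack([np.ones(n), X])      # design matrix with intercept

def loo_predictions(A, y):
    """Leave-one-out predictions via the hat-matrix shortcut for linear models."""
    H = A @ np.linalg.solve(A.T @ A, A.T)
    fitted = H @ y
    h = np.diag(H)
    return y - (y - fitted) / (1 - h)     # LOO prediction for each left-out point

press_pred = loo_predictions(A, y)
ss_tot = ((y - y.mean()) ** 2).sum()
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
r2 = 1 - ((y - A @ coef) ** 2).sum() / ss_tot
q2 = 1 - ((y - press_pred) ** 2).sum() / ss_tot   # cross-validated q^2
```

q2 never exceeds r2 for the same linear fit, and a q2 close to r2 (as in the abstract's 0.915 vs 0.923) is the usual sign that the model is not overfitted.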

Fereshteh Shiri

2010-08-01

312

In tissue counter analysis, complex histologic sections are overlaid with regularly distributed measuring masks of equal size and shape, and the digital contents of each mask (or tissue element) are evaluated by gray level, color, and texture parameters. In this study, the feasibility of tissue counter analysis and classification and regression trees for the quantitative evaluation of skin biopsies was assessed. From 100 randomly selected skin biopsies, a learning set of tissue elements was created, differentiating between cellular elements, collagenous elements of the reticular dermis, fatty elements and other tissue components. Classification and regression trees based on the learning set were used to automatically classify tissue elements in samples of normal skin, benign common nevi, malignant melanoma, molluscum contagiosum, seborrheic keratosis, epidermoid cysts, basal cell carcinoma, and scleroderma. The procedure yielded reproducible assessments of the relative amounts of tissue components in various diagnostic groups. Furthermore, a reliable diagnostic separation of molluscum contagiosum versus normal skin and epidermal cysts, benign common nevi versus malignant melanoma, and seborrheic keratosis versus basal cell carcinoma was possible. Tissue counter analysis combined with classification and regression trees may be a suitable approach to the fully automated analysis of histologic sections of skin biopsies. PMID:12775984
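A classification tree of the kind used here is grown by repeatedly choosing the split that minimises class impurity. A minimal CART-style split search on hypothetical grey-level values (the class names echo the paper's tissue categories, but the numbers are invented):

```python
def gini(counts):
    """Gini impurity of a node given class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts) if total else 0.0

def best_split(values, labels):
    """Exhaustive search for the threshold minimising weighted Gini impurity."""
    classes = sorted(set(labels))
    best_t, best_imp = None, float("inf")
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if not left or not right:
            continue
        imp = (len(left) * gini([left.count(c) for c in classes]) +
               len(right) * gini([right.count(c) for c in classes])) / len(values)
        if imp < best_imp:
            best_t, best_imp = t, imp
    return best_t, best_imp

# hypothetical grey-level values of tissue elements with known classes
grey = [30, 35, 40, 120, 130, 140]
cls = ["cellular", "cellular", "cellular",
       "collagenous", "collagenous", "collagenous"]
t, impurity = best_split(grey, cls)
```

Recursing on each side of the chosen threshold, over grey level, color, and texture features, grows the full tree used to classify tissue elements.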

Smolle, Josef; Gerger, Armin

2003-06-01

313

Robust Outlier Detection in Linear Regression

A new methodology for robust outlier detection based on the examination of Robustly Studentized Robust Residuals (RSRR) is established in linear regression analysis. Two new robust location estimators of linear regression parameters are developed for the simple and multiple cases. Based on these robust estimators we obtain RSRR, which we use to derive a new measure of distance for outlier detection. A graphical display using the new measure of distance is constructed for detecting multiple outlie...
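RSRR are a robust variant of residual studentization; the flagging idea itself can be sketched with classical internally studentized residuals, which use the hat matrix and flag observations beyond a cutoff. The simulated data and the 2.5 cutoff below are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(scale=0.2, size=n)
y[5] += 4.0                                  # plant one gross outlier

A = np.column_stack([np.ones(n), x])
H = A @ np.linalg.solve(A.T @ A, A.T)        # hat matrix
resid = y - H @ y
h = np.diag(H)
s2 = resid @ resid / (n - A.shape[1])        # residual variance estimate
# internally studentized residuals (a classical, non-robust cousin of RSRR)
student = resid / np.sqrt(s2 * (1 - h))
flagged = np.where(np.abs(student) > 2.5)[0]
```

The robust variant replaces the least-squares fit and scale estimate with robust counterparts, so that clusters of outliers cannot mask one another by inflating `s2`.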

Jajo, Nethal K.; Xizhi Wu

2004-01-01

314

MICROARRAY DATA ANALYSIS USING MULTIPLE STATISTICAL MODELS

Microarray Data Analysis Using Multiple Statistical Models Wenjun Bao1, Judith E. Schmid1, Amber K. Goetz1, Ming Ouyang2, William J. Welsh2,Andrew I. Brooks3,4, ChiYi Chu3,Mitsunori Ogihara3,4, Yinhe Cheng5, David J. Dix1. 1National Health and Environmental Effects Researc...

315

Scientific Electronic Library Online (English)

Full Text Available SciELO Brazil | Language: Portuguese. (English abstract) This work was conducted with the objective of developing a methodology for classification of the ionic composition of irrigation water using multiple linear regression. A stepwise regression model was tested, using electrical conductivity as the dependent variable and the analyzed ions calcium, sodium, potassium, carbonate, bicarbonate and chloride as the independent variables in all tested models. All water samples were collected by the farmers of the region where this work was conducted. The regression models were adjusted using the water analysis database of the ESAM Analysis Laboratory (Laboratório de Análises de Água e Fertilidade do Solo da Escola Superior de Agricultura de Mossoró - LAAFS/ESAM). The linear model, adjusted using the stepwise regression procedure, shows that the degree of adjustment of the tested models depends on the geological formation of the watershed and on whether the water was collected from a river or from wells. Water in the calcareous region of the Chapada do Apodi is classified as calcic-sodic, calcic or chloride water when the source was a tubular well, a piezometric well (drilled in unconfined water, known in the region as poço amazonas) or surface water from rivers and lagoons, respectively. In the Baixo Açu region, the waters were classified as sodic, magnesian-sodic or sodic when the source was a tubular well (drilled in the Açu sedimentary geological formation), a piezometric well or surface water, respectively.

Celsemy E., Maia; Elís R.C. de, Morais; Maurício de, Oliveira.

2001-04-01

316

Scientific Electronic Library Online (English)

Full Text Available SciELO Mexico | Language: Spanish. (English abstract) It is necessary to have long records of annual hydrological data to get a truer picture of their variability, as well as reliable estimates of their statistical properties. To obtain these records it is common to use additional data sources and transfer techniques. One technique is multiple linear regression, whose numerical application implies the optimum selection of nearby long records (regressors) so that the extension of the short record is a reliable estimate. This selection process involves three analyses: 1) how to define the best estimates, 2) which regression equations should be investigated, and 3) which model has the better predictive ability. For the first analysis, four criteria based on the sums of the squares of the residuals are presented; for the second, all possible regressions are investigated, since in hydrological information transfer problems there will be at most five regressors; for the third, selecting the best predictive model, residual analysis and cross-validation are used. The numerical application described is an extension of the annual runoff volume record at the Platón Sánchez hydrometric station of the Tempoal river system in Hydrological Region No. 26 (Pánuco, México). Here four regressors were used, namely the records of the other gauging stations in that system. We conclude that even in problems with multicollinearity, the selection criteria and analyses presented lead to consistent results and yield the best regression equations. The similarity of the results obtained with the selected regression models generates confidence in the adopted estimates.

Daniel F., Campos-Aranda.

2011-12-01

317

Directory of Open Access Journals (Sweden)

Full Text Available This work was conducted with the objective of developing a methodology for classification of the ionic composition of irrigation water using multiple linear regression. A stepwise regression model was tested, using electrical conductivity as the dependent variable and the analyzed ions calcium, sodium, potassium, carbonate, bicarbonate and chloride as the independent variables in all tested models. All water samples were collected by the farmers of the region where this work was conducted. The regression models were adjusted using the water analysis database of the ESAM Analysis Laboratory (Laboratório de Análises de Água e Fertilidade do Solo da Escola Superior de Agricultura de Mossoró - LAAFS/ESAM). The linear model, adjusted using the stepwise regression procedure, shows that the degree of adjustment of the tested models depends on the geological formation of the watershed and on whether the water was collected from a river or from wells. Water in the calcareous region of the Chapada do Apodi is classified as calcic-sodic, calcic or chloride water when the source was a tubular well, a piezometric well (drilled in unconfined water, known in the region as poço amazonas) or surface water from rivers and lagoons, respectively. In the Baixo Açu region, the waters were classified as sodic, magnesian-sodic or sodic when the source was a tubular well (drilled in the Açu sedimentary geological formation), a piezometric well or surface water, respectively.
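The stepwise selection described here can be sketched as a forward search that, at each step, admits the ion giving the largest increase in the regression sum of squares (equivalently, the largest drop in SSE), stopping when the improvement falls below an entry criterion. Ion names and the synthetic data below are illustrative, not the laboratory's records, and the 5% entry threshold is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 80
# hypothetical ion concentrations (meq/L)
ions = {"Ca": rng.gamma(2.0, 1.0, n), "Na": rng.gamma(2.0, 1.5, n),
        "K": rng.gamma(2.0, 0.2, n), "HCO3": rng.gamma(2.0, 0.8, n),
        "Cl": rng.gamma(2.0, 1.2, n)}
# electrical conductivity driven mainly by Na and Cl in this synthetic data
ec = (0.1 + 0.9 * ions["Na"] + 0.8 * ions["Cl"] + 0.1 * ions["Ca"]
      + rng.normal(scale=0.2, size=n))

def sse(cols):
    """Residual sum of squares of the OLS fit using the given ion columns."""
    A = np.column_stack([np.ones(n)] + [ions[c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, ec, rcond=None)
    r = ec - A @ coef
    return r @ r

selected, remaining = [], set(ions)
while remaining:
    # candidate giving the largest drop in SSE (increase in regression SS)
    cand = min(remaining, key=lambda c: sse(selected + [c]))
    if sse(selected) - sse(selected + [cand]) < 0.05 * sse(selected):
        break                                # entry criterion not met
    selected.append(cand)
    remaining.discard(cand)
```

The order of entry mirrors each ion's weight in the fitted model, which is how the abstracts rank the contribution of each ion to electrical conductivity.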

Celsemy E. Maia

2001-04-01

318

Analysis of the Evolution of the Gross Domestic Product by Means of Cyclic Regressions

In this article, we carry out an analysis of the regularity of the Gross Domestic Product of a country, in our case the United States. The analysis is based on a new method, cyclic regressions, built on the Fourier series of a function. Another point of view is that of considering, instead of the growth rate of GDP, the speed of variation of this rate, computed as a numerical derivative. The obtained results show a cycle for this indicator of 71 years, the mean sq...

Catalin Angelo Ioan; Gina Ioan

2011-01-01

319

Analysis of the Evolution of the Gross Domestic Product by Means of Cyclic Regressions

Directory of Open Access Journals (Sweden)

Full Text Available In this article, we carry out an analysis of the regularity of the Gross Domestic Product of a country, in our case the United States. The analysis is based on a new method, cyclic regressions, built on the Fourier series of a function. Another point of view is that of considering, instead of the growth rate of GDP, the speed of variation of this rate, computed as a numerical derivative. The obtained results show a cycle for this indicator of 71 years, the mean square error being 0.93%. The method described allows a prognosis of short-term trends in GDP.
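The cyclic-regression idea, fitting a truncated Fourier series to a time series by least squares, can be sketched as follows. The 71-year period below echoes the cycle reported in the abstract, but the data are synthetic and the function name is hypothetical.

```python
import numpy as np

def fit_cyclic(t, y, periods):
    """Least-squares fit of y(t) to a truncated Fourier series with the
    given candidate periods: y ~ a0 + sum_i [ai*cos(2*pi*t/Pi) + bi*sin(2*pi*t/Pi)].
    Returns the coefficients and the fitted values."""
    cols = [np.ones_like(t, dtype=float)]
    for P in periods:
        w = 2.0 * np.pi / P
        cols += [np.cos(w * t), np.sin(w * t)]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X @ beta
```

Because the sine/cosine terms enter linearly once the periods are fixed, the "cyclic regression" reduces to ordinary least squares on a trigonometric design matrix.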

Catalin Angelo Ioan

2011-08-01

320

When the independent variables in a multiple linear regression model are highly linearly correlated, an analysis based on the common Ordinary Least Squares (OLS) method can be misleading. In this situation, the ridge regression estimator is recommended. We conduct a simulation study to compare the performance of the ridge regression estimator with that of OLS. We found that the Hoerl and Kennard ridge regression estimation method has better performan...
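A minimal sketch of the ridge estimator compared in this abstract, using the classic closed form beta = (X'X + kI)^{-1} X'y. The nearly collinear data are simulated for illustration and do not reproduce the authors' simulation design.

```python
import numpy as np

def ridge(X, y, k):
    """Ridge estimator: beta = (X'X + k*I)^{-1} X'y.
    k = 0 recovers OLS when X'X is invertible; k > 0 shrinks the
    coefficients, stabilizing them under multicollinearity."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
```

With two almost identical predictors, the OLS coefficients are unstable and large in magnitude, while the ridge solution is shrunk toward a stable compromise.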

Anwar Fitrianto; Lee Ceng Yik

2014-01-01

321

Directory of Open Access Journals (Sweden)

Full Text Available Credit scoring is a vital topic for banks, since limited financial resources need to be used effectively. Several credit scoring methods are used by banks; one of them is to estimate whether a credit-demanding customer's repayments will be regular or not. In this study, artificial neural networks and logistic regression analysis have been used to support banks' credit risk prediction and to estimate whether a credit-demanding customer's repayments will be regular or not. The results of the study showed that the artificial neural network method is more reliable than logistic regression analysis when estimating a credit-demanding customer's repayment behaviour.
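A compact sketch of logistic regression fitted by Newton-Raphson (IRLS), the statistical half of the comparison above. The "credit" data and all names below are simulated placeholders, not the study's data set.

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Logistic regression fitted by Newton-Raphson (IRLS).
    Returns coefficients with the intercept first."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (y - p)             # score vector
        W = p * (1.0 - p)                 # IRLS weights
        hess = Xd.T @ (Xd * W[:, None])   # observed information
        beta = beta + np.linalg.solve(hess + 1e-8 * np.eye(len(beta)), grad)
    return beta

def predict(beta, X):
    """Classify as 'regular repayment' (1) when the fitted probability >= 0.5."""
    Xd = np.column_stack([np.ones(len(X)), X])
    return (1.0 / (1.0 + np.exp(-Xd @ beta)) >= 0.5).astype(int)
```

The fitted coefficients give each applicant a probability of regular repayment; the neural network in the study replaces this linear score with a nonlinear one.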

Hüseyin BUDAK

2012-11-01

322

Production planning and control (PPC) systems have to deal with rising complexity and dynamics. The complexity of planning tasks is due to the many variables and dynamic factors that arise from the uncertainties surrounding PPC. Although the literature on exact scheduling algorithms, simulation approaches, and heuristic methods in production planning is extensive, these methods can be inefficient in the face of daily fluctuations in real factories. Decision support systems can provide productive tools for production planners, offering feasible and prompt decisions for effective and robust production planning. In this paper, we propose a robust decision support tool for detailed production planning based on statistical multivariate methods, including principal component analysis and logistic regression. The proposed approach has been applied to a real case in the Iranian automotive industry. In the presence of multisource uncertainties, the results of applying the proposed method in the selected case show that the accuracy of daily production planning increases in comparison with the existing method.

Mehrjoo, Saeed; Bashiri, Mahdi

2013-05-01

323

Robust estimation for homoscedastic regression in the secondary analysis of case-control data.

Primary analysis of case-control studies focuses on the relationship between disease D and a set of covariates of interest (Y, X). A secondary application of the case-control study, which is often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated owing to the case-control sampling, where the regression of Y on X is different from what it is in the population. Previous work has assumed a parametric distribution for Y given X and derived semiparametric efficient estimation and inference without any distributional assumptions about X. We take up the issue of estimation of a regression function when Y given X follows a homoscedastic regression model, but otherwise the distribution of Y is unspecified. The semiparametric efficient approaches can be used to construct semiparametric efficient estimates, but they suffer from a lack of robustness to the assumed model for Y given X. We take an entirely different approach. We show how to estimate the regression parameters consistently even if the assumed model for Y given X is incorrect, and thus the estimates are model robust. For this we make the assumption that the disease rate is known or well estimated. The assumption can be dropped when the disease is rare, which is typically so for most case-control studies, and the estimation algorithm simplifies. Simulations and empirical examples are used to illustrate the approach. PMID:23637568

Wei, Jiawei; Carroll, Raymond J; Müller, Ursula U; Van Keilegom, Ingrid; Chatterjee, Nilanjan

2013-01-01

324

Regression analysis of MCS intensity and ground motion spectral accelerations (SAs) in Italy

We present the results of the regression analyses between Mercalli-Cancani-Sieberg (MCS) intensity and the spectral acceleration (SA) at 0.3, 1.0 and 2.0 s (SA03, SA10 and SA20). In Italy, the MCS scale is used to describe the level of ground shaking suffered by structures or perceived by people, and it differs to some extent from the Modified Mercalli scale in use in other countries. We have assembled a new SA/MCS-intensity data set from the DBMI04 intensity database and the ITACA accelerometric data bank. The SA peak values are calculated in two ways—using the maximum of the two horizontal components, and using the geometric mean of the two horizontal components. The regression analysis has been performed separately for the two kinds of data sets and for the three target periods. Since both peak ground parameters and intensities suffer from appreciable uncertainties, we have used the orthogonal distance regression technique. Also, tests designed to assess the robustness of the estimated coefficients have shown that single-line parametrizations for the regressions are sufficient to model the data within the model uncertainties.
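Orthogonal distance regression is used here because both variables carry errors. In the simplest equal-variance straight-line case it reduces to a total least squares fit, which has a closed form via the SVD of the centered data; a sketch (not the authors' implementation, which handles the full error structure):

```python
import numpy as np

def orthogonal_fit(x, y):
    """Orthogonal-distance (total least squares) straight-line fit,
    appropriate when both variables carry comparable errors.
    Assumes a non-vertical line; returns (intercept, slope)."""
    M = np.column_stack([x - x.mean(), y - y.mean()])
    # The first right singular vector is the direction of the best-fit
    # line through the centroid (minimizing perpendicular distances).
    _, _, vt = np.linalg.svd(M, full_matrices=False)
    direction = vt[0]
    slope = direction[1] / direction[0]
    intercept = y.mean() - slope * x.mean()
    return intercept, slope
```

Unlike ordinary regression of y on x, this fit is symmetric in the two variables, which is the property that motivates its use when intensity and SA are both uncertain.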

Faenza, Licia; Michelini, Alberto

2011-09-01

325

A logistic normal multinomial regression model for microbiome compositional data analysis.

Changes in the human microbiome are associated with many human diseases. Next generation sequencing technologies make it possible to quantify the microbial composition without the need for laboratory cultivation. One important problem of microbiome data analysis is to identify the environmental/biological covariates that are associated with different bacterial taxa. Taxa count data in microbiome studies are often over-dispersed and include many zeros. To account for such over-dispersion, we propose to use an additive logistic normal multinomial regression model to associate the covariates to bacterial composition. The model can naturally account for sampling variabilities and zero observations and also allows for a flexible covariance structure among the bacterial taxa. In order to select the relevant covariates and to estimate the corresponding regression coefficients, we propose a group ℓ1 penalized likelihood estimation method for variable selection and estimation. We develop a Monte Carlo expectation-maximization algorithm to implement the penalized likelihood estimation. Our simulation results show that the proposed method outperforms the group ℓ1 penalized multinomial logistic regression and the Dirichlet multinomial regression models in variable selection. We demonstrate the methods using a data set that links the human gut microbiome to micro-nutrients in order to identify the nutrients that are associated with the human gut microbiome enterotype. PMID:24128059

Xia, Fan; Chen, Jun; Fung, Wing Kam; Li, Hongzhe

2013-12-01

326

Robust estimation for homoscedastic regression in the secondary analysis of case–control data

Primary analysis of case–control studies focuses on the relationship between disease D and a set of covariates of interest (Y, X). A secondary application of the case–control study, which is often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated owing to the case–control sampling, where the regression of Y on X is different from what it is in the population. Previous work has a...

Wei, Jiawei; Carroll, Raymond; Müller, Ursula; Van Keilegom, Ingrid; Chatterjee, Nilanjan

2013-01-01

327

High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis

We consider the problem of high-dimensional regression under non-constant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows non-constant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a nov...

Daye, Z. John; Chen, Jinbo; Li, Hongzhe

2011-01-01

328

The effects of exchange rate variability on international trade: a Meta-Regression Analysis

Abstract The trade effects of exchange rate variability have been an issue in international economics for the past 30 years. The contribution of this paper is to apply meta-regression analysis (MRA) to the empirical literature. On average, exchange rate variability exerts a negative effect on international trade. Yet MRA confirms the view that this result is highly conditional, by identifying factors that help to explain why estimated trade effects vary from significantly negative ...
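A fixed-effect meta-regression of effect sizes on study-level moderators is, at its core, weighted least squares with inverse-variance weights; a minimal sketch under that assumption (the MRA in the paper is more elaborate, and the data below are invented):

```python
import numpy as np

def meta_regression(effects, variances, moderators):
    """Fixed-effect meta-regression: WLS of study effect sizes on
    study-level moderators, weighted by 1/variance. Returns the
    coefficients (intercept first) and their standard errors."""
    X = np.column_stack([np.ones(len(effects)), moderators])
    w = 1.0 / np.asarray(variances)
    XtW = X.T * w                      # X' W with W = diag(w)
    cov = np.linalg.inv(XtW @ X)       # (X' W X)^{-1}
    beta = cov @ (XtW @ effects)
    se = np.sqrt(np.diag(cov))
    return beta, se
```

The moderator coefficients quantify how estimated trade effects vary with study characteristics, which is how an MRA explains heterogeneity across the literature.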

Ćorić, Bruno; Pugh, Geoffrey Thomas

2008-01-01

329

The risk of antisocial outcomes in individuals with personality disorder (PD) remains uncertain. The authors synthesize the current evidence on the risks of antisocial behavior, violence, and repeat offending in PD, and they explore sources of heterogeneity in risk estimates through a systematic review and meta-regression analysis of observational studies comparing antisocial outcomes in personality-disordered individuals with control groups. Fourteen studies examined risk of antisocial and ...

Yu, R.; Geddes, J. R.; Fazel, S.

2012-01-01

330

A new statistical methodology is developed for the analysis of spontaneous adverse event (AE) reports from post-marketing drug surveillance data. The method involves both empirical Bayes (EB) and fully Bayes estimation of rate multipliers for each drug within a class of drugs, for a particular AE, based on a mixed-effects Poisson regression model. Both parametric and semiparametric models for the random-effect distribution are examined. The method is applied to data from Food and Drug Adminis...

Gibbons, Robert D.; Segawa, Eisuke; Karabatsos, George; Amatya, Anup K.; Bhaumik, Dulal K.; Brown, C. Hendricks; Kapur, Kush; Marcus, Sue M.; Hur, Kwan; Mann, J. John

2008-01-01

331

Stratified Cox Regression Analysis of Survival under CIMAvax®EGF Vaccine

Background: The Center of Molecular Immunology (CIM) is a center in Cuba devoted to the research, development and manufacturing of biotechnological products. This study evaluates survival under the CIMAvax®EGF vaccine, based on data collected in phase II and phase III clinical trials. Methods: The stratified Cox regression model is used to evaluate the effects of the prognostic factors, based on separate analyses for each trial and on the combined data from both trials. Results: Patients with Performance status 0 or 1, wit...

Carmen Viada Gonzalez; Jean-François Dupuy; Martha Fors López; Patricia Lorenzo Luaces; Camilo Rodríguez Rodríguez; Gisela González Marinello; Elia Neninger Vinagera; Beatriz García Verdecia; Bárbara Wilkinson Brito; Liana Martínez Pérez; Mayelin Troche de la Concepción; Tania Crombet-Ramos

2013-01-01

332

LOGISTIC REGRESSION RESPONSE FUNCTIONS WITH MAIN AND INTERACTION EFFECTS IN THE CONJOINT ANALYSIS

In the Conjoint Analysis (COA) model proposed here - an extension of the traditional COA - the polytomous response variable (i.e. evaluation of the overall desirability of alternative product profiles) is described by a sequence of binary variables. To link the categories of overall evaluation to the factor levels, we adopt - at the aggregate level - a multivariate logistic regression model, based on a main and two-factor interaction effects experimental design. The model provides several ove...

Luca, Amedeo; Ciapparelli, Sara

2011-01-01

333

Abstract Background The incidence of liver hydatid cyst (LHC) rupture ranges from 15% to 40% of all cases, and most ruptures involve the bile duct tree. Patients with biliocystic communication (BCC) have specific clinical and therapeutic aspects. The purpose of this study was to determine which patients with LHC may develop BCC, using classification and regression tree (CART) analysis. Methods A retrospective study of 672 patients with liver hydatid cyst treated at the surgery department "A" at Ibn Sina Universi...

Souadka Amine; El Mejdoubi Yasser; El Malki Hadj; Mohsine Raouf; Ifrine Lahcen; Abouqal Redouane; Belkouchi Abdelkader

2010-01-01

334

Scientific Electronic Library Online (English)

Full Text Available SciELO Brazil | Language: English OBJECTIVES: Recent guidelines recommend that all cirrhotic patients undergo endoscopic screening for esophageal varices. Identifying cirrhotic patients with esophageal varices by noninvasive predictors would allow the performance of endoscopy to be restricted to patients with a high risk of having varices. This study aimed to develop a decision model based on classification and regression tree analysis for the prediction of large esophageal varices in cirrhotic patients. METHODS: 309 cirrhotic patients (training sample, 187 patients; test sample, 122 patients) were included. Within the training sample, classification and regression tree analysis was used to identify predictors and a prediction model of large esophageal varices. The prediction model was then further evaluated in the test sample and in different Child-Pugh classes. RESULTS: The prevalence of large esophageal varices in cirrhotic patients was 50.8%. A tree model consisting of spleen width, portal vein diameter and prothrombin time, developed by classification and regression tree analysis, achieved a diagnostic accuracy of 84% for prediction of large esophageal varices. When reconstructed into two groups, the rate of varices was 83.2% for the high-risk group and 15.2% for the low-risk group. The accuracy of the tree model was maintained in the test sample and in different Child-Pugh classes. CONCLUSIONS: A decision tree model that consists of spleen width, portal vein diameter and prothrombin time may be useful for prediction of large esophageal varices in cirrhotic patients.
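The core step of CART tree growing, choosing the split that minimizes the weighted Gini impurity of the two child nodes, can be sketched for a single predictor as follows; the data and names are illustrative, not from the study.

```python
import numpy as np

def _gini(y):
    """Gini impurity of a binary label vector."""
    p = np.mean(y)
    return 2.0 * p * (1.0 - p)

def best_split(x, y):
    """Return the threshold on predictor x that minimizes the weighted
    Gini impurity of the two resulting child nodes (one CART split)."""
    order = np.argsort(x)
    xs, ys = x[order], np.asarray(y)[order]
    best_impurity, best_thr = np.inf, None
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue
        thr = 0.5 * (xs[i] + xs[i - 1])
        left, right = ys[:i], ys[i:]
        g = (len(left) * _gini(left) + len(right) * _gini(right)) / len(ys)
        if g < best_impurity:
            best_impurity, best_thr = g, thr
    return best_thr
```

A full CART model, such as the spleen width / portal vein diameter / prothrombin time tree above, applies this search recursively over all candidate predictors and then prunes the result.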

Wan-dong, Hong; Le-mei, Dong; Zen-cai, Jiang; Qi-huai, Zhu; Shu-Qing, Jin.

335

Directory of Open Access Journals (Sweden)

Full Text Available Estimating forest canopy height from large-footprint satellite LiDAR waveforms is challenging given the complex interaction between LiDAR waveforms, terrain, and vegetation, especially in dense tropical and equatorial forests. In this study, canopy height in French Guiana was estimated using multiple linear regression models and the Random Forest technique (RF). This analysis was based either on LiDAR waveform metrics extracted from the GLAS (Geoscience Laser Altimeter System) spaceborne LiDAR data and terrain information derived from the SRTM (Shuttle Radar Topography Mission) DEM (Digital Elevation Model), or on Principal Component Analysis (PCA) of GLAS waveforms. Results show that the best statistical model for estimating forest height based on waveform metrics and digital elevation data is a linear regression of waveform extent, trailing edge extent, and terrain index (RMSE of 3.7 m). For the PCA based models, better canopy height estimation results were observed using a regression model that incorporated both the first 13 principal components (PCs) and the waveform extent (RMSE = 3.8 m). Random Forest regressions revealed that the best configuration for canopy height estimation used all the following metrics: waveform extent, leading edge, trailing edge, and terrain index (RMSE = 3.4 m). Waveform extent was the variable that best explained canopy height, with an importance factor almost three times higher than those for the other three metrics (leading edge, trailing edge, and terrain index). Furthermore, the Random Forest regression incorporating the first 13 PCs and the waveform extent had a slightly improved canopy height estimation in comparison to the linear model, with an RMSE of 3.6 m. In conclusion, multiple linear regressions and RF regressions provided canopy height estimations with similar precision using either LiDAR metrics or PCs.
However, a regression model (linear regression or RF based on the PCA of waveform samples with waveform extent information is an interesting alternative for canopy height estimation as it does not require several metrics that are difficult to derive from GLAS waveforms in dense forests, such as those in French Guiana.

Ibrahim Fayad

2014-11-01

336

A sensitivity analysis of a distributed hydrologic model with a large number of parameters is essential for understanding the model structure and simplifying model calibration efforts. It is also useful for guiding future field data collection and sampling efforts. Global sensitivity analysis methods are widely recognized today as superior to local or one-at-a-time methods because they are not limited by model linearity requirements and have a more extensive coverage of the parameter space. In this study, two global sensitivity analysis methods, the variance-based Sobol method and a Latin Hypercube Sampling based Multiple Linear Regression (LHS-MLR) approach, are employed to evaluate the effect of model parameter variability on simulated stages in the Everglades National Park (ENP) in Florida, USA. Both methods provide robust estimates of model parameter sensitivity. However, due to the distinctive characteristics of the two methods, they provide unique insights regarding model parameter sensitivities. These observations are compared in detail in this study. The simulated stage results from the distributed-parameter Regional Simulation Model (RSM), developed by the South Florida Water Management District, are used for this comparison. The parameters considered for sensitivity analysis consist of several model parameters that influence overland and groundwater flows as well as evapotranspiration within the ENP. Their relative sensitivities are assessed under dry, wet and average hydrologic conditions existing in the ENP watershed. The use of a variety of hydrologic conditions allows the robust assessment of parameter sensitivities obtained using the two global sensitivity analysis methods.
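The LHS-MLR approach pairs Latin hypercube sampling of the parameter space with a linear regression whose standardized coefficients (SRCs) rank parameter influence; a minimal sketch with a toy response function (the RSM model itself is not reproduced, and all names are hypothetical):

```python
import numpy as np

def lhs(n, bounds, rng):
    """Latin hypercube sample of n points: each dimension is stratified
    into n equal intervals, with one point drawn per interval."""
    sample = np.empty((n, len(bounds)))
    for j, (lo, hi) in enumerate(bounds):
        strata = (rng.permutation(n) + rng.random(n)) / n
        sample[:, j] = lo + strata * (hi - lo)
    return sample

def src(X, y):
    """Standardized regression coefficients of an MLR fit: the LHS-MLR
    sensitivity measure (|SRC| ranks parameter influence)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    Xd = np.column_stack([np.ones(len(ys)), Xs])
    beta, *_ = np.linalg.lstsq(Xd, ys, rcond=None)
    return beta[1:]
```

For a near-linear model the ranking by |SRC| agrees with variance-based indices such as Sobol's; the two diverge when interactions and nonlinearity matter, which is one reason the study compares both methods.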

Dessalegne, T.; Senarath, S. U.; Novoa, R. J.

2010-12-01

337

1. Central questions of behavioural and evolutionary ecology are: what factors influence the reproductive success of dominant breeders and subordinate nonbreeders within animal societies? A complete understanding of any society requires that these questions be answered for all individuals. 2. The clown anemonefish, Amphiprion percula, forms simple societies that live in close association with sea anemones, Heteractis magnifica. Here, we use data from a well-studied population of A. percula to determine the major predictors of reproductive success of dominant pairs in this species. 3. We analyse the effect of multiple predictors on four components of reproductive success, using a relatively new technique from the field of statistical learning: boosted regression trees (BRTs). BRTs have the potential to model complex relationships in ways that give powerful insight. 4. We show that the reproductive success of dominant pairs is unrelated to the presence, number or phenotype of nonbreeders. This is consistent with the observation that nonbreeders do not help or hinder breeders in any way, confirming and extending the results of a previous study. 5. Primarily, reproductive success is negatively related to male growth and positively related to breeding experience. It is likely that these effects are interrelated, because males that grow a lot have little breeding experience. These effects are indicative of a trade-off between male growth and parental investment. 6. Secondarily, reproductive success is positively related to female growth and size. In this population, female size is also positively related to group size and anemone size. These positive correlations among traits are likely caused by variation in site quality and are suggestive of a silver-spoon effect. 7. Notably, whereas reproductive success is positively related to female size, it is unrelated to male size.
This observation provides support for the size advantage hypothesis for sex change: both individuals maximize their reproductive success when the larger individual adopts the female tactic. 8. This study provides the most complete picture to date of the factors that predict the reproductive success of dominant pairs of clown anemonefish and illustrates the utility of BRTs for analysis of complex behavioural and evolutionary ecology data. PMID:21284624
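Boosted regression trees fit a sequence of small trees to residuals; a minimal squared-error sketch with depth-1 trees (stumps) on a single predictor, which is a simplification of the BRT machinery used in the paper:

```python
import numpy as np

def _fit_stump(x, r):
    """Best single-split regression stump for residuals r: the threshold
    and the two leaf means that minimize the squared error."""
    best = (np.inf, None, 0.0, 0.0)
    for thr in np.unique(x)[:-1]:
        m = x <= thr
        left, right = r[m].mean(), r[~m].mean()
        sse = np.sum((r[m] - left) ** 2) + np.sum((r[~m] - right) ** 2)
        if sse < best[0]:
            best = (sse, thr, left, right)
    return best[1], best[2], best[3]

def boost_stumps(x, y, n_rounds=200, lr=0.1):
    """Gradient boosting for squared error: start from the mean, then
    repeatedly fit a stump to the current residuals and add a small
    (learning-rate scaled) step. Returns the fitted values."""
    pred = np.full(len(y), y.mean())
    for _ in range(n_rounds):
        resid = y - pred
        thr, lv, rv = _fit_stump(x, resid)
        pred = pred + lr * np.where(x <= thr, lv, rv)
    return pred
```

Because each round corrects the previous ensemble's errors, the model can capture nonlinear and interaction effects that a single regression would miss, which is the advantage BRTs bring to the ecological data above.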

Buston, Peter M; Elith, Jane

2011-05-01

338

A regression analysis of the effect of energy use in agriculture

International Nuclear Information System (INIS)

This study investigates the impacts of energy use on productivity of Turkey's agriculture. It reports the results of a regression analysis of the relationship between energy use and agricultural productivity. The study is based on the analysis of the yearbook data for the period 1971-2003. Agricultural productivity was specified as a function of its energy consumption (TOE) and gross additions of fixed assets during the year. Least square (LS) was employed to estimate equation parameters. The data of this study comes from the State Institute of Statistics (SIS) and The Ministry of Energy of Turkey

339

A regression analysis of the effect of energy use in agriculture

Energy Technology Data Exchange (ETDEWEB)

This study investigates the impacts of energy use on productivity of Turkey's agriculture. It reports the results of a regression analysis of the relationship between energy use and agricultural productivity. The study is based on the analysis of the yearbook data for the period 1971-2003. Agricultural productivity was specified as a function of its energy consumption (TOE) and gross additions of fixed assets during the year. Least square (LS) was employed to estimate equation parameters. The data of this study comes from the State Institute of Statistics (SIS) and The Ministry of Energy of Turkey. (Author)

Karkacier, Osman [Gaziosmanpasa Univ., Dept. of Business Administration, Tokat (Turkey); Goktolga, Z. Gokalp; Cicek, Adnan [Gaziosmanpasa Univ., Dept. of Agricultural Economics, Tokat (Turkey)

2006-12-15

340

The analysis of physical measurements often has to cope with highly correlated noise and with interruptions caused by outliers, saturation events, or transmission losses. We assess the impact of missing data on the performance of linear regression analysis involving the fit of modeled or measured time series. We show that data gaps can significantly alter the precision of the regression parameter estimation in the presence of colored noise, due to the frequency leakage of the noise power. We present a regression method that cancels this effect and estimates the parameters of interest with a precision comparable to the complete-data case, even if the noise power spectral density (PSD) is not known a priori. The method is based on an autoregressive fit of the noise, which allows us to build an approximate generalized least squares estimator approaching the minimal variance bound. The method, which can be applied to any similar data processing, is tested on simulated measurements of the MICROSCOPE space mission, whose goal is to test the weak equivalence principle (WEP) with a precision of 10^-15. In this particular context the signal of interest is the WEP violation signal expected to be found around a well defined frequency. We test our method with different gap patterns and noise of known PSD and find that the results agree with the mission requirements, decreasing the uncertainty by a factor of 60 with respect to ordinary least squares methods. We show that it also provides a test of significance to assess the uncertainty of the measurement.
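The approximate generalized least squares estimator at the heart of this method reduces, once a noise covariance C is available (here it is built from an autoregressive fit of the noise), to the textbook GLS formula; a sketch under the assumption that C is given:

```python
import numpy as np

def gls(X, y, C):
    """Generalized least squares with noise covariance C:
    beta = (X' C^{-1} X)^{-1} X' C^{-1} y.
    With C = I this is OLS; with C from an AR fit of colored noise it
    approaches the minimum-variance estimator."""
    Ci = np.linalg.inv(C)
    return np.linalg.solve(X.T @ Ci @ X, X.T @ Ci @ y)
```

On gapped data the practical difficulty, which the paper addresses, is estimating C (or a whitening filter) from the data themselves without knowing the PSD in advance.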

Baghi, Quentin; Métris, Gilles; Bergé, Joël; Christophe, Bruno; Touboul, Pierre; Rodrigues, Manuel

2015-03-01

341

Multiple regression models of δ13C and δ15N for fish populations in the eastern Gulf of Mexico

Multiple regression models were created to explain spatial and temporal variation in the δ13C and δ15N values of fish populations on the West Florida Shelf (eastern Gulf of Mexico, USA). Extensive trawl surveys from three time periods were used to acquire muscle samples from seven groundfish species. Isotopic variation (δ13Cvar and δ15Nvar) was calculated as the deviation from the isotopic mean of each fish species. Static spatial data and dynamic water quality parameters were used to create models predicting δ13Cvar and δ15Nvar in three fish species that were caught in the summers of 2009 and 2010. Additional data sets were then used to determine the accuracy of the models for predicting isotopic variation (1) in a different time period (fall 2010) and (2) among four entirely different fish species that were collected during summer 2009. The δ15Nvar model was relatively stable and could be applied to different time periods and species with similar accuracy (mean absolute errors 0.31-0.33‰). The δ13Cvar model had a lower predictive capability and mean absolute errors ranged from 0.42 to 0.48‰. δ15N trends are likely linked to gradients in nitrogen fixation and Mississippi River influence on the West Florida Shelf, while δ13C trends may be linked to changes in algal species, photosynthetic fractionation, and abundance of benthic vs. planktonic basal resources. These models of isotopic variability may be useful for future stable isotope investigations of trophic level, basal resource use, and animal migration on the West Florida Shelf.

Radabaugh, Kara R.; Peebles, Ernst B.

2014-08-01

342

DEFF Research Database (Denmark)

Background: The next fifty years will see a drastic increase in the older population. Among other effects, ageing causes a decrease in strength. It is necessary to provide safe and comfortable environments for the elderly, and digital human modelling has proved to be a useful and valuable ergonomic tool for this purpose. Objective: To investigate age and gender effects on the torque-producing ability of the knee and elbow in older adults; to create strength-scaled equations based on age, gender, upper/lower limb lengths and masses using multiple linear regression; to reduce the number of dependent parameters based on statistical redundancies; and to validate these equations. Methods: 283 subjects (141 males, 142 females) aged 50-59 years (54.9 +/- 2.9), 60-69 years (65.4 +/- 2.9) and 70-79 years (73.7 +/- 2.7) were tested for maximal voluntary isometric torque of the right knee extensors and elbow flexors. Results: Males were significantly stronger than females across all age groups. Elbow peak torque (EPT) was better preserved from the 60s to the 70s, whereas knee peak torque (KPT) reduced significantly (P<0.05) across all age groups. This held true for males and females. Gender, thigh mass and age best predicted KPT (R2=0.60). Gender, forearm mass and age best predicted EPT (R2=0.75). Good cross-validation was established for both elbow and knee models. Conclusion: This cross-sectional study of muscle strength created and validated strength-scaled equations of EPT and KPT using only gender, segment mass and age.

D'Souza, Sonia; Rasmussen, John

2012-01-01

343

Analysis of electrical resistance tomography (ERT) data using least-squares regression modelling in industrial process tomographs has been tested. Potential differences measured between electrodes in rings have been used to carry out the regression modelling to investigate the location and size of a disturbance present in the system. Extensive experiments have been carried out with ERT to test a suitable regression algorithm to extract the disturbance. Current analysis has been performed for a single disturbance known to be present in the system. For the environment considered, the least-squares regression reported in this paper demonstrates an alternative approach for analysis of tomography data in industrial applications. The position (concentric or off-centre) and the size of the disturbance (in concentric cases) can be well defined by the reported regression modelling approach. However, it is still a challenge to define the size of the off-centre disturbance.

Khanal, Manoj; Morrison, Rob

2009-04-01

344

International Nuclear Information System (INIS)

Analysis of electrical resistance tomography (ERT) data using least-squares regression modelling in industrial process tomographs has been tested. Potential differences measured between electrodes in rings have been used to carry out the regression modelling to investigate the location and size of a disturbance present in the system. Extensive experiments have been carried out with ERT to test a suitable regression algorithm to extract the disturbance. Current analysis has been performed for a single disturbance known to be present in the system. For the environment considered, the least-squares regression reported in this paper demonstrates an alternative approach for analysis of tomography data in industrial applications. The position (concentric or off-centre) and the size of the disturbance (in concentric cases) can be well defined by the reported regression modelling approach. However, it is still a challenge to define the size of the off-centre disturbance

345

Investigation of Water Parameters in a River System with a Two-Dimensional Regression Analysis Model

A key step of European Water Framework Directive (WFD) implementation is the ecological status classification and the achievement of good water status for all waters by 2015. In transitional waters, the changing environmental niche induces responses in the macroinvertebrate guilds, and these responses induce uncertainty in the metrics. Here, the sources of uncertainty in ecological classification with benthic macroinvertebrates are addressed by focusing on two major potential sources: spatial heterogeneity and temporal heterogeneity. A coherent study of the series of correlations between the physical and chemical parameters is needed in order to reach a complete picture. In this paper we present a bi-dimensional regression model describing the dependence of a chemical component on two independent environmental variables, temperature and pH. The consistent experimental data set and the regression computation approach lead to a series of interesting outcomes.

Murariu, Gabriel; Caldararu, Aurelia; Georgescu, Lucian; Voiculescu, Mirela; Puscasu, Gheorghe; Basset, Alberto

2011-10-01

346

The aim of this study was to develop mathematical models for estimating earthquake casualties such as deaths, number of injured persons, affected families and total cost of damage. Regression models were built to quantify the direct damage from earthquakes to human beings and property, given the magnitude, intensity, depth of focus, location of epicentre and time duration. The researchers formulated the models through regression analysis using matrices, with α = 0.01. The study considered thirty destructive earthquakes that hit the Philippines from 1968 to 2012 inclusive. Relevant data about these earthquakes were obtained from the Philippine Institute of Volcanology and Seismology, and data on damage and casualties were gathered from the records of the National Disaster Risk Reduction and Management Council. The resulting mathematical models are presented in the study. This study will be of great value in emergency planning and in initiating and updating programs for earthquake hazard reduction in the Philippines, which is an earthquake-prone country.

Urrutia, J. D.; Bautista, L. A.; Baccay, E. B.

2014-04-01

347

DEFF Research Database (Denmark)

This paper introduces a new estimator to measure the ex-post covariation between high-frequency financial time series under market microstructure noise. We provide an asymptotic limit theory (including feasible central limit theorems) for standard methods such as regression, correlation analysis and covariance, for which we obtain the optimal rate of convergence. We demonstrate some positive semidefinite estimators of the covariation and construct a positive semidefinite estimator of the conditional covariance matrix in the central limit theorem. Furthermore, we indicate how the assumptions on the noise process can be relaxed and how our method can be applied to non-synchronous observations. We also present an empirical study of how high-frequency correlations, regressions and covariances change through time.
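The plain, synchronous realized covariance estimator that this line of work builds on can be sketched in a few lines; the data below are simulated, and this is the textbook Gram-matrix version, not the paper's noise-robust estimator:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate n synchronized high-frequency log-price increments for 2 assets
# with true per-period covariance Sigma (hypothetical toy data).
n = 10_000
Sigma = np.array([[1.0, 0.6],
                  [0.6, 2.0]])
L = np.linalg.cholesky(Sigma / n)
returns = rng.standard_normal((n, 2)) @ L.T

# Plain realized covariance: sum of outer products of the return vectors.
# It is positive semidefinite by construction (a Gram matrix).
rcov = returns.T @ returns
print(rcov)
```

Because it is a Gram matrix, this estimator is automatically positive semidefinite, which is the property the abstract's conditional covariance construction is designed to preserve under more realistic sampling.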

Kinnebrock, Silja; Podolskij, Mark

2008-01-01

348

Sub-pixel estimation of tree cover and bare surface densities using regression tree analysis

Directory of Open Access Journals (Sweden)

Full Text Available Sub-pixel analysis is capable of generating continuous fields, which represent the spatial variability of certain thematic classes. The aim of this work was to develop numerical models to represent the variability of tree cover and bare surfaces within the study area. This research was conducted in the riparian buffer within a watershed of the São Francisco River in the North of Minas Gerais, Brazil. IKONOS and Landsat TM imagery were used with the GUIDE algorithm to construct the models. The results were two index images derived with regression trees for the entire study area, one representing tree cover and the other representing bare surface. The use of non-parametric and non-linear regression tree models presented satisfactory results to characterize wetland, deciduous and savanna patterns of forest formation.

Carlos Augusto Zangrando Toneli

2011-09-01

349

Hybrid fuzzy regression with trapezoidal fuzzy data

This research deals with a method for hybrid fuzzy least-squares regression. The extension of symmetric triangular fuzzy coefficients to asymmetric trapezoidal fuzzy coefficients is considered as an effective measure for removing unnecessary fuzziness from the linear fuzzy model. First, a trapezoidal fuzzy variable is applied to derive a bivariate regression model. Next, normal equations are formulated to solve the four parts of the hybrid regression coefficients. The model is also extended to multiple regression analysis. Finally, the method is compared with Y.-H. O. Chang's model.

Razzaghnia, T.; Danesh, S.; Maleki, A.

2012-01-01

350

Analysis of Dynamic Multiplicity Fluctuations at PHOBOS

This paper presents the analysis of the dynamic fluctuations in the inclusive charged particle multiplicity measured by PHOBOS for Au+Au collisions at sqrt(s_NN) = 200 GeV within the pseudo-rapidity range of -3

Chai, Z; Baker, M D; Ballintijn, M; Barton, D S; Betts, R R; Bickley, A A; Bindel, R; Budzanowski, A; Busza, W; Carroll, A; Chai, Z; Decowski, M P; García, E; George, N; Gulbrandsen, K H; Gushue, S; Halliwell, C; Hamblen, J; Heintzelman, G A; Henderson, C; Hofman, D J; Hollis, R S; Holynski, R; Holzman, B; Iordanova, A; Johnson, E; Kane, J L; Katzy, J; Khan, N; Kucewicz, W; Kulinich, P; Kuo, C M; Lin, W T; Manly, S; McLeod, D; Mignerey, A C; Nouicer, R; Olszewski, A; Pak, R; Park, I C; Pernegger, H; Reed, C; Remsberg, L P; Reuter, M; Rolan, C; Roland, G; Rosenberg, L J; Sagerer, J; Sarin, P; Sawicki, P; Skulski, W; Steinberg, P; Stephans, G S F; Sukhanov, A; Tang, J L; Trzupek, A; Vale, C; van Nieuwenhuizen, G J; Verdier, R; Wolfs, F L H; Wosiek, B; Wozniak, K; Wuosmaa, A H; Wyslouch, B; Chai, Zhengwei

2005-01-01

351

Palatal rugae patterns are relatively unique to an individual and are well protected by the lips, buccal pad of fat and teeth. They are considered to be stable throughout life following completion of growth, although there is considerable debate on the matter; they can be used successfully in post-mortem identification provided an antemortem record exists. Thus the aim of this study was to examine palatal rugae shape in two Indian populations and determine the accuracy of logistic regression analysis in distinguishing them. The study comprises two groups from geographically different regions of India, originating from Maharashtra and Karnataka states. The sample includes 100 plaster casts equally distributed between the two populations and genders, with ages ranging between 18 and 40 years. An impression of the maxillary arch was obtained using alginate impression material and a plaster cast was made. The rugae were delineated on the cast using a sharp graphite pencil under adequate light and magnification and recorded according to the classifications given by Kapali et al. and Thomas and Kotze (1983). Chi-square analysis showed significant differences in the wavy, circular and divergent patterns between the two populations. The straight and wavy forms were significant in the logistic regression analysis. A predictive value of 71% was obtained in classifying the original cases correctly when the straight, wavy, curved and circular patterns were assessed, and a predictive value of 70% was achieved when all rugae patterns were assessed. The mean number of rugae was greater in females than in males, with the straight pattern showing a statistically significant difference between males and females. Significant differences were recorded in the straight, wavy, circular and divergent patterns between the two populations. Consequently, this study demonstrates moderate accuracy of the palatal rugae pattern in identifying Indians using logistic regression analysis. PMID:22018168

Kotrashetti, Vijayalakshmi S; Hollikatti, Kiran; Mallapur, M D; Hallikeremath, Seema R; Kale, Alka D

2011-11-01

352

International Nuclear Information System (INIS)

A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structures of 150 drug organic compounds to their n-octanol-water partition coefficients (log Po/w). Molecular descriptors were derived solely from the 3D structures of the drug molecules, and a genetic algorithm was applied as a variable selection tool in the QSPR analysis. The models were constructed using 110 molecules as the training set, and predictive ability was tested using 40 compounds. Modelling of log Po/w of these compounds as a function of the theoretically derived descriptors was established by multiple linear regression (MLR). Four descriptors, molecular volume (MV, geometrical), hydrophilic-lipophilic balance (HLB, constitutional), hydrogen-bond-forming ability (HB, electronic) and polar surface area (PSA, electrostatic), are taken as inputs for the model. The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows the estimation of log Po/w for molecules not yet synthesized. Application of the developed model to a testing set of 40 drug organic compounds demonstrates that the model is reliable, with good predictive accuracy and simple formulation, and the predictions are in good agreement with the experimental values. The root mean square error of prediction (RMSEP) and squared correlation coefficient (R2) for the MLR model were 0.22 and 0.99 for the prediction set of log Po/w.
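A minimal sketch of the training/prediction workflow the abstract describes, with 110 training and 40 prediction compounds and four descriptors; synthetic data stand in for the real descriptor values, and the coefficients and noise level are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the 150-compound data set: four descriptors
# (MV, HLB, HB, PSA in the abstract) and a linear log P target.
X = rng.standard_normal((150, 4))
true_coef = np.array([0.8, -0.5, 0.3, 1.1])
y = X @ true_coef + 0.2 + 0.1 * rng.standard_normal(150)

# 110 training / 40 prediction split, as in the abstract.
Xtr, Xte, ytr, yte = X[:110], X[110:], y[:110], y[110:]

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(Xtr)), Xtr])
coef, *_ = np.linalg.lstsq(A, ytr, rcond=None)

# External validation: RMSEP and R^2 on the held-out compounds.
pred = np.column_stack([np.ones(len(Xte)), Xte]) @ coef
rmsep = np.sqrt(np.mean((yte - pred) ** 2))
r2 = 1 - np.sum((yte - pred) ** 2) / np.sum((yte - yte.mean()) ** 2)
print(f"RMSEP = {rmsep:.3f}, R^2 = {r2:.3f}")
```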

353

The deterioration of water quality, especially organic pollution, in Tai Lake and the Qiantang River has recently received attention in China. The objectives of this study were to evaluate the formation of halonitromethanes (HNMs) using multiple regression models for chlorination and chloramination and to identify the key factors that influence the formation of HNMs in Tai Lake and the Qiantang River. The results showed that the total formation of HNMs (T-HNMs) during chlorination could be described using a power-function model of the form T-HNMs = 10^5.267 (DON)^6.645 (Br-)^0.737 (DOC)^-5.537 (Cl2)^0.333 (t)^0.165 (R² = 0.974), with an analogous model obtained for chloramination. The nitrite and bromide concentrations and the reaction time mainly affected the T-HNM yields during chloramination. Additional analysis indicated that the bromine incorporation factors (BIFs) for trihalogenated HNMs generally decreased as the chlorine/chloramine dose, temperature and reaction time decreased, and increased as the bromide concentration increased. PMID:25112580
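Models of this multiplicative power-product form are typically fit by taking base-10 logarithms, which turns the product of powers into an ordinary linear regression. A sketch on made-up data (the predictor values and noise level are assumptions, not the study's measurements):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical positive predictors standing in for DON, Br-, DOC, Cl2 dose, time.
n = 200
X = np.exp(rng.uniform(-1, 1, size=(n, 5)))
true_exp = np.array([6.645, 0.737, -5.537, 0.333, 0.165])
y = 10 ** 5.267 * np.prod(X ** true_exp, axis=1) * 10 ** (0.01 * rng.standard_normal(n))

# log10 both sides: log10(y) = a0 + sum_k a_k * log10(x_k)  -- plain OLS.
A = np.column_stack([np.ones(n), np.log10(X)])
coef, *_ = np.linalg.lstsq(A, np.log10(y), rcond=None)
print("intercept exponent a0:", coef[0])
print("fitted exponents:", coef[1:])
```

The fitted intercept corresponds to the 10^5.267 prefactor and the remaining coefficients to the exponents of each predictor.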

Hong, Huachang; Qian, Lingya; Xiong, Yujing; Xiao, Zhuoqun; Lin, Hongjun; Yu, Haiying

2015-01-01

354

The potential for bias due to misclassification error in regression analysis is well understood by statisticians and epidemiologists. Assuming little or no available data for estimating misclassification probabilities, investigators sometimes seek to gauge the sensitivity of an estimated effect to variations in the assumed values of those probabilities. We present an intuitive and flexible approach to such a sensitivity analysis, assuming an underlying logistic regression model. For outcome misclassification, we argue that a likelihood-based analysis is the cleanest and the most preferable approach. In the case of covariate misclassification, we combine observed data on the outcome, error-prone binary covariate of interest, and other covariates measured without error, together with investigator-supplied values for sensitivity and specificity parameters, to produce corresponding positive and negative predictive values. These values serve as estimated weights to be used in fitting the model of interest to an appropriately defined expanded data set using standard statistical software. Jackknifing provides a convenient tool for incorporating uncertainty in the estimated weights into valid standard errors to accompany log odds ratio estimates obtained from the sensitivity analysis. Examples illustrate the flexibility of this unified strategy, and simulations suggest that it performs well relative to a maximum likelihood approach carried out via numerical optimization. PMID:20552681
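The conversion from investigator-supplied sensitivity and specificity to the predictive values used as weights is Bayes' rule; a minimal arithmetic sketch with hypothetical parameter values:

```python
# Positive and negative predictive values from assumed sensitivity,
# specificity and exposure prevalence (all values here are hypothetical).
def predictive_values(sensitivity, specificity, prevalence):
    """Bayes' rule: P(true status | observed classification)."""
    tp = sensitivity * prevalence              # true positive mass
    fp = (1 - specificity) * (1 - prevalence)  # false positive mass
    tn = specificity * (1 - prevalence)        # true negative mass
    fn = (1 - sensitivity) * prevalence        # false negative mass
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

ppv, npv = predictive_values(0.9, 0.8, 0.3)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Varying the assumed sensitivity and specificity over a plausible grid and refitting the weighted model at each point is the sensitivity analysis the abstract describes.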

Lyles, Robert H; Lin, Ji

2010-09-30

355

Analysis of designed experiments by stabilised PLS Regression and jack-knifing

DEFF Research Database (Denmark)

Pragmatic, visually oriented methods for assessing and optimising bi-linear regression models are described and applied to PLS Regression (PLSR) analysis of multi-response data from controlled experiments. The paper outlines some ways to stabilise the PLSR method to extend its range of applicability to the analysis of effects in designed experiments. Two ways of passifying unreliable variables are shown. A method for estimating the reliability of the cross-validated prediction error RMSEP is demonstrated. Some recently developed jack-knifing extensions are illustrated, for estimating the reliability of the linear and bi-linear model parameter estimates. The paper illustrates how the obtained PLSR "significance" probabilities are similar to those from conventional factorial ANOVA, but the PLSR is shown to give important additional overview plots of the main relevant structures in the multi-response data. The study is part of an ongoing effort to establish a cognitively simple and versatile approach to multivariate data analysis, with reliability assessment based on the data at hand, and with little need for abstract distribution theory [H. Martens, M. Martens, Multivariate Analysis of Quality. An Introduction, Wiley, Chichester, UK, 2001].

Martens, Harald; Høy, M.

2001-01-01

356

Bayesian analysis of a multivariate null intercept errors-in-variables regression model.

Longitudinal data are of great interest in the analysis of clinical trials. In many practical situations the covariate cannot be measured precisely, and a natural alternative is an errors-in-variables regression model. In this paper we study a null intercept errors-in-variables regression model with a structure of dependency between the response variables within the same group. We apply the model to real data presented in Hadgu and Koch (Hadgu, A., Koch, G. (1999). Application of generalized estimating equations to a dental randomized clinical trial. J. Biopharmaceutical Statistics 9(1):161-178). In that study volunteers with preexisting dental plaque were randomized to two experimental mouth rinses (A and B) or a control mouth rinse with double blinding. The dental plaque index was measured for each subject at the beginning of the study and at two follow-up times, which leads to the presence of an intraclass correlation. We propose a Bayesian approach to fit a multivariate null intercept errors-in-variables regression model to the longitudinal data. The proposed Bayesian approach accommodates the correlated measurements and incorporates the restriction that the slopes must lie in the (0, 1) interval. A Gibbs sampler is used to perform the computations. PMID:14584721

Aoki, Reiko; Bolfarine, Heleno; Achcar, Jorge A; Dorival, Leão P Júnior

2003-11-01

357

We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. PMID:20880012

Ryu, Duchwan; Li, Erning; Mallick, Bani K

2011-06-01

358

The causes of coordinate measuring machine (CMM) dynamic error are complicated, and many factors influence the error, so it is hard to build an accurate model. To obtain a model that avoids analysing the complex error sources and the interactions among them, and that also handles the multicollinearity among the variables, this paper adopts Partial Least-Squares Regression (PLSR). The model takes the 3D coordinates (X, Y, Z) and the moving velocity as the independent variables and the CMM dynamic error value as the dependent variable. The experimental results show that the model can be easily interpreted, and they reveal the magnitude and direction of each independent variable's influence on the dependent variable.
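A compact sketch of univariate PLSR via the NIPALS algorithm, applied to synthetic stand-in data with four predictors (as in the abstract's coordinates plus velocity) including a deliberately collinear column; the data and coefficients are assumptions, not the paper's measurements:

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Univariate PLS regression fit via the NIPALS algorithm."""
    X = np.asarray(X, float); y = np.asarray(y, float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xc.T @ yc
        w /= np.linalg.norm(w)            # weight vector
        t = Xc @ w                        # scores
        tt = t @ t
        p = Xc.T @ t / tt                 # X loadings
        c = yc @ t / tt                   # y loading
        Xc = Xc - np.outer(t, p)          # deflation
        yc = yc - t * c
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)   # regression coefficients
    return x_mean, y_mean, B

rng = np.random.default_rng(3)
# Synthetic stand-in: (X, Y, Z, velocity) -> dynamic error value,
# with the fourth column nearly collinear with the first.
X = rng.standard_normal((200, 4))
X[:, 3] = X[:, 0] + 0.01 * rng.standard_normal(200)
y = 1.5 * X[:, 0] - 0.7 * X[:, 1] + 0.05 * rng.standard_normal(200)

x_mean, y_mean, B = pls1_fit(X, y, n_components=3)
pred = (X - x_mean) @ B + y_mean
rmse = np.sqrt(np.mean((y - pred) ** 2))
print("fit RMSE:", rmse)
```

Unlike plain OLS, the latent-component construction keeps the coefficient estimates stable despite the near-perfect collinearity between the first and fourth predictors.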

Zhang, Mei; Fei, Yetai; Sheng, Li; Ma, Xiushui; Yang, Hong-tao

2008-12-01

359

Pitfalls in predictions of rock properties using multivariate analysis and regression methods

Statistical methods are commonly used for prediction of geoscience and engineering properties. This commonly involves selection of a small number of variables among a large number of available geological, geophysical, petrophysical and engineering variables. The conventional view is to select the variables that have highest correlations with the variable of concern. In this article, we show that this may not always be a wise approach because it ignores a critical aspect of the variable interaction — suppression. We review the suppression phenomenon, and discuss three types of suppression in multiple linear regression of geoscience and reservoir properties. We present examples using wireline logs, seismic attributes, and other engineering parameters. We show that understanding the suppression phenomenon is important for selecting appropriate variables for optimal prediction of geoscience and reservoir properties.
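The classical suppression effect the article warns about can be reproduced in a few lines: a variable almost uncorrelated with the target still raises R² substantially because it removes noise from another predictor. A synthetic illustration (not the article's wireline-log data):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000

signal = rng.standard_normal(n)
noise = rng.standard_normal(n)

y = signal + 0.3 * rng.standard_normal(n)   # target
x1 = signal + noise                          # contaminated predictor
x2 = noise                                   # suppressor: ~zero correlation with y

def r_squared(design, y):
    A = np.column_stack([np.ones(len(y)), design])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

r2_x1 = r_squared(x1[:, None], y)
r2_both = r_squared(np.column_stack([x1, x2]), y)
print(f"corr(x2, y) = {np.corrcoef(x2, y)[0, 1]:+.3f}")
print(f"R^2 with x1 alone: {r2_x1:.3f}, with x1 and the suppressor: {r2_both:.3f}")
```

Selecting variables purely by their marginal correlation with y would discard x2, even though including it nearly doubles the explained variance.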

Ma, Y. Zee

2011-10-01

360

An electro-optical device called an oculometer, which tracks a subject's lookpoint as a function of time, has been used to collect data in a real-time simulation study of instrument landing system (ILS) approaches. The data describing the scanning behavior of a pilot during the instrument approaches have been analyzed by use of a stepwise regression analysis technique. A statistically significant correlation between pilot workload, as indicated by pilot ratings, and scanning behavior has been established. In addition, it was demonstrated that parameters derived from the scanning behavior data can be combined in a mathematical equation to provide a good representation of pilot workload.

Waller, M. C.

1976-01-01

361

International Nuclear Information System (INIS)

Statistical analysis of the properties of powder, compacts and wire of tungsten VA was made to determine optimum conditions for plastic working of tungsten and its alloys. Data were collected on 29 parameters and processed on a ''Minsk-22'' computer. Correlations were found between wire structure and such factors as the hardness and density of the compacts, and the fractional composition and volume weight of the powder, among others. A regression equation was obtained which connected the structure of 0.52 mm wire with a number of parameters of the initial material.

362

Data Management, EDA, and Regression Analysis with 1969-2000 Major League Baseball Attendance

This article, created by James J. Cochran of Louisiana Tech University, describes a dataset containing Major League Baseball data from seasons 1969 through 2000 and illustrates how this data can be used as a course long project covering basic data management, the use of exploratory data analysis to "clean" data, and construction of regression models. The set contains data such as: runs scored, runs allowed, wins, losses, number of games behind the division leader and attendance. This is a great lesson for anyone interested in the statistics of baseball. The data is in .dat format.

Cochran, James J.

363

LINEAR REGRESSION MODEL IN THE ANALYSIS OF THE GROSS DOMESTIC PRODUCT

Directory of Open Access Journals (Sweden)

Full Text Available As we ascertain the evolutionary trend of the global economy, it becomes evident that strict analysis of the evolution of a single micro- or macroeconomic indicator is no longer enough to describe the corresponding phenomenon; the emphasis shifts towards the analysis of the correlations existing between two or more indicators, which can offer much stronger insight into the economic phenomenon. We propose to use the simple linear regression model, a relatively easy and very effective way to establish the correlation between two economic indicators. The measurement of the factors' influence on the indicator will most surely offer additional information on the phenomena they describe.

Constantin ANGHELACHE

2011-12-01

364

This study performs a Differential Item Functioning (DIF) analysis in terms of gender and culture on the items in the PISA 2009 mathematics literacy sub-test. The DIF analyses were done through the Mantel-Haenszel, logistic regression and SIBTEST methods. The data for the gender variable were collected from the responses given by 332 students to the items in the mathematics literacy sub-test during the administration of the 5th booklet in the PISA 2009 application, whereas the data ...

Süleyman Demir; İbrahim Alper Köse

2014-01-01

365

International Nuclear Information System (INIS)

An uncertainty analysis method is proposed here which uses the Fourier Amplitude Sensitivity Test (FAST) and the Stepwise Regression Technique (SRT). This method is a compromise between the approximation methods [response surface method (RSM) or moments method] and the Monte Carlo method (MCM). It is concluded that: 1. FAST gives the partial variance for each input parameter, which can be used as a global sensitivity ranking between input parameters, with moderate sampling effort compared to crude MCM. 2. SRT is a good tool to construct the subsequently used first- or second-order response surface model consisting of the comparatively important parameters. 3. The combined uncertainty analysis method using FAST and SRT can be used for uncertainty/sensitivity analysis of large computer codes at moderate cost, and it will be a useful tool to analyze the feasibility of newly developed, highly uncertain system models.

366

A Bayesian Quantile Regression Analysis of Potential Risk Factors for Violent Crimes in USA

Bayesian quantile regression has drawn more attention in widespread applications recently. Yu and Moyeed (2001) proposed an asymmetric Laplace distribution to provide a likelihood-based mechanism for Bayesian inference of quantile regression models. In this work, the primary objective is to evaluate the performance of Bayesian quantile regression compared with simple regression and quantile regression through simulation and with application to a crime dataset from 50 USA states for assessing th...
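The non-Bayesian quantile fit referred to in the abstract can be sketched by minimizing the check (pinball) loss directly; this sketch uses simulated data and a generic numerical optimizer rather than the asymmetric-Laplace likelihood machinery:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n = 2_000
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.standard_normal(n)   # symmetric noise: median = mean line

def pinball(params, tau):
    """Check (pinball) loss for a straight-line quantile fit."""
    u = y - (params[0] + params[1] * x)
    return np.mean(np.where(u >= 0, tau * u, (tau - 1) * u))

# Start from the OLS fit, then minimise the tau = 0.5 (median) check loss.
A = np.column_stack([np.ones(n), x])
start, *_ = np.linalg.lstsq(A, y, rcond=None)
res = minimize(pinball, x0=start, args=(0.5,), method="Nelder-Mead")
intercept, slope = res.x
print(f"median fit: intercept={intercept:.2f}, slope={slope:.2f}")
```

Changing `tau` to 0.1 or 0.9 fits the corresponding conditional quantile line instead of the median.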

Ming Wang; Lijun Zhang

2012-01-01

367

Regression analysis of growth responses to water depth in three wetland plant species

DEFF Research Database (Denmark)

Background and aims Plant species composition in wetlands and on lakeshores often shows dramatic zonation which is frequently ascribed to differences in flooding tolerance. This study compared the growth responses to water depth of three species (Phormium tenax, Carex secta, Typha orientalis) differing in depth preferences in wetlands, using non-linear and quantile regression analyses to establish how flooding tolerance can explain field zonation. Methodology Plants were established for 8 months in outdoor cultures in waterlogged soil without standing water, and then randomly allocated to water depths from 0 – 0.5 m. Morphological and growth responses to depth were followed for 54 days before harvest, and then analysed by repeated measures analysis of covariance, and non-linear and quantile regression analysis (QRA), to compare flooding tolerances. Principal results Growth responses to depth differed between the three species, and were non-linear. P. tenax growth rapidly decreased in standing water > 0.25 m depth, C. secta growth increased initially with depth but then decreased at depths > 0.30 m, accompanied by increased shoot height and decreased shoot density, and T. orientalis was unaffected by the 0 – 0.50 m depth range. In P. tenax the decrease in growth was associated with a decrease in the number of leaves produced per ramet and in C. secta the effect of water depth was greatest for the tallest shoots. Allocation patterns were unaffected by depth. Conclusions The responses are consistent with the principle that zonation in the field is primarily structured by competition in shallow water and by physiological flooding tolerance in deep water. Regression analyses, especially QRA, proved to be powerful tools in distinguishing genuine phenotypic responses to water depth from non-phenotypic variation due to size and developmental differences.

Sorrell, Brian K; Tanner, Chris C

2012-01-01

368

Directory of Open Access Journals (Sweden)

Full Text Available Air is an efficient medium for the dispersal of atmospheric pollutants, and its behavior depends on the atmospheric movements that occur in the troposphere. In Porto Alegre, Rio Grande do Sul State, there is heavy daily traffic and a concentration of industries that may be responsible for atmospheric emissions. In the present work we studied the behavior of the daily concentrations of particulate matter (PM10) in this city, considering the influence of meteorological variables. Data analysis was performed using descriptive statistics, linear correlation and multiple regression. Data were provided by the State Foundation of Environmental Protection Henrique Luiz Roessler - RS (FEPAM) and the National Institute of Meteorology (INMET). Based on the analysis it was possible to verify that the concentrations of PM10, measured daily at 4:00 p.m., did not exceed the national air quality standards; the meteorological elements that influenced the PM10 concentrations were the daily average wind speed and daily average radiation, with negative relations, and the daily average air temperature and the north and northwest wind directions, with positive relations. The wind directions that contribute significantly to lowering concentrations at the measured sites are east and southeast.

Rosana de Cassia de Souza Schneider

2011-03-01

369

Survival regression analysis: a powerful tool for evaluating fighting and assessment.

Theoretical models of animal contests frequently generate predictions about how asymmetries (e.g. differences in size, residence status) between contestants affect fight duration. Linear regression and nonparametric correlation analyses are commonly used to test the fit of data to such models. We show how survival regression analysis (SRA) is a powerful technique for studying the effect of asymmetries on the duration of contests. SRA, which is under-utilized by students of animal behaviour, offers several advantages over more frequently used procedures. It provides unbiased parameter estimates even when including censored data (i.e. results of contests that have not ended at the time when observations are stopped). The analysis of hazard functions, which is a component of SRA, is an easy way to test for consistency with predictions of the sequential assessment game model. These and other advantages of SRA are illustrated by using SRA and more conventional methods to analyse the effect of asymmetries on contest duration for encounters between female Mediterranean tarantulas, Lycosa tarentula (L.). It is hoped that this example of the advantages of SRA will encourage more widespread use of this powerful technique. Copyright 2000 The Association for the Study of Animal Behaviour. PMID:11007639
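A minimal parametric example of the idea: a survival regression on right-censored contest durations, here an exponential model with a log-linear hazard fit by maximum likelihood. The data-generating values are assumptions, and SRA in practice would more often use Cox or accelerated-failure-time models:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
n = 3_000

# Hypothetical contest data: size asymmetry as covariate, with exponential
# hazard lambda_i = exp(b0 + b1 * x_i), so larger asymmetry -> shorter fights.
x = rng.standard_normal(n)
b0_true, b1_true = -1.0, 0.8
lam_true = np.exp(b0_true + b1_true * x)
t = rng.exponential(1.0 / lam_true)          # true contest durations
censor = rng.exponential(5.0, n)             # observation cut-off times
time = np.minimum(t, censor)                 # observed duration
event = (t <= censor).astype(float)          # 1 = fight ended, 0 = censored

def neg_loglik(beta):
    lam = np.exp(beta[0] + beta[1] * x)
    # Censored exponential likelihood: events contribute log(hazard);
    # every observation contributes the survival term -lam * time.
    return -np.sum(event * np.log(lam) - lam * time)

res = minimize(neg_loglik, x0=np.zeros(2), method="BFGS")
print("estimated (b0, b1):", res.x)
```

The censored observations enter the likelihood only through the survival term, which is exactly how SRA avoids the bias that discarding unfinished contests would introduce.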

Moya-Laraño; Wise

2000-09-01

370

International Nuclear Information System (INIS)

Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem, which can be treated as a regression problem with multiple inputs and multiple outputs and solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important task for pre-processing the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. A genetic algorithm and the simplex optimization method are also invoked to determine the hyper-parameters of the SVR. Based on the realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) performs better than that without feature extraction (single SVR) in terms of the reconstruction of the TMPs on the epi- and endocardial surfaces. Moreover, compared with PCA-SVR, KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.

371

Energy Technology Data Exchange (ETDEWEB)

Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem, which can be treated as a regression problem with multiple inputs and multiple outputs and solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important task for pre-processing the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. A genetic algorithm and the simplex optimization method are also invoked to determine the hyper-parameters of the SVR. Based on the realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) performs better than that without feature extraction (single SVR) in terms of the reconstruction of the TMPs on the epi- and endocardial surfaces. Moreover, compared with PCA-SVR, KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.

Jiang Mingfeng; Wang Yaming [College of Electronics and Informatics, Zhejiang Sci-Tech University, Hangzhou 310018 (China); Zhu Lingyan [Dongfang College, Zhejiang University of Finance and Economics, Hangzhou, 310018 (China); Xia Ling; Shou Guofa; Liu Feng [Department of Biomedical Engineering, Zhejiang University, Hangzhou 310027 (China); Crozier, Stuart, E-mail: peterjiang0517@163.com, E-mail: jiang.mingfeng@hotmail.com [School of Information Technology and Electrical Engineering, University of Queensland, St Lucia, Brisbane, Queensland 4072 (Australia)

2011-03-21

372

Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem, which can be treated as a regression problem with multiple inputs and multiple outputs and solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important task for pre-processing the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. A genetic algorithm and the simplex optimization method are also invoked to determine the hyper-parameters of the SVR. Based on the realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) performs better than that without feature extraction (single SVR) in terms of the reconstruction of the TMPs on the epi- and endocardial surfaces. Moreover, compared with PCA-SVR, KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.
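The PCA feature-extraction stage can be sketched as below; for brevity, ordinary least squares stands in for the SVR stage, and the BSP/TMP dimensions and latent structure are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic stand-in for the BSP -> TMP mapping: 64 "surface potentials"
# driven by 5 latent sources, predicting 16 "transmembrane potentials".
n, n_in, n_out, n_latent = 300, 64, 16, 5
Z = rng.standard_normal((n, n_latent))
X = Z @ rng.standard_normal((n_latent, n_in)) + 0.05 * rng.standard_normal((n, n_in))
Y = Z @ rng.standard_normal((n_latent, n_out))

# PCA feature extraction via SVD of the centered inputs.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 5
features = Xc @ Vt[:k].T                     # scores on the first k PCs

# Linear least squares stands in here for the paper's SVR stage.
A = np.column_stack([np.ones(n), features])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
pred = A @ coef
rmse = np.sqrt(np.mean((Y - pred) ** 2))
print(f"reconstruction RMSE with {k} PCs: {rmse:.4f}")
```

Reducing 64 inputs to a handful of components is what makes the downstream regression tractable and less noise-sensitive, which is the role PCA and KPCA play before the SVR in the paper.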

Jiang, Mingfeng; Zhu, Lingyan; Wang, Yaming; Xia, Ling; Shou, Guofa; Liu, Feng; Crozier, Stuart

2011-03-01

373

Local linear regression for function learning: an analysis based on sample discrepancy.

Local linear regression models, a kind of nonparametric structure that locally performs a linear estimation of the target function, are analyzed in the context of empirical risk minimization (ERM) for function learning. The analysis is carried out with emphasis on geometric properties of the available data. In particular, the discrepancy of the observation points used both to build the local regression models and to compute the empirical risk is considered. This allows the case in which the samples come from a random external source and the one in which the input space can be freely explored to be treated in the same way. Both the consistency of the ERM procedure and the approximating capabilities of the estimator are analyzed, proving conditions that ensure convergence. Since the theoretical analysis shows that the estimation improves as the discrepancy of the observation points becomes smaller, low-discrepancy sequences, a family of sampling methods commonly employed for efficient numerical integration, are also analyzed. Simulation results involving two different examples of function learning are provided. PMID:25330431

Cervellera, Cristiano; Macciò, Danilo

2014-11-01
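A local linear regression estimate of the kind analyzed above can be sketched in a few lines of numpy: a weighted least-squares line is fitted around each query point, and the intercept is the local estimate. The kernel, bandwidth, and target function below are illustrative choices, not the authors' setup:

```python
import numpy as np

def local_linear(x_train, y_train, x0, bandwidth=0.05):
    """Locally weighted linear fit (Gaussian kernel) evaluated at x0."""
    w = np.exp(-0.5 * ((x_train - x0) / bandwidth) ** 2)
    X = np.column_stack([np.ones_like(x_train), x_train - x0])
    # Weighted least squares: solve (X'WX) beta = X'Wy
    WX = X * w[:, None]
    beta = np.linalg.solve(X.T @ WX, WX.T @ y_train)
    return beta[0]  # intercept = local estimate at x0

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 400)      # samples from a random external source
y = np.sin(2 * np.pi * x) + 0.05 * rng.normal(size=400)
est = local_linear(x, y, 0.25)      # true value sin(pi/2) = 1
print(est)
```

Replacing the random sample by a low-discrepancy sequence (e.g. a Sobol or Halton design) is the "freely explored input space" case the paper discusses.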

374

Directory of Open Access Journals (Sweden)

Full Text Available In recent years, alloy steels have been widely usedin aerospace and automotive industries. Machining of these materials requires better understanding of cutting processes regarding accuracy and efficiency. This study addresses the modelling of the machinability of EN353 and 20mncr5 materials. In this study, multiple regression analysis (MRA is used to investigate the influence of some parameters on the thrust force and torque in the drilling processes of alloy steel materials. The model were identified by using cutting speed, feed rate, and depth as input data and the thrust force and torque as the output data. The statistical analysis accompanied with results showed that cutting feed (f were the most significant parameters on the drilling process, while spindle speed seemed insignificant. Since the spindle speed was insignificant, it directed us to set it either at the highest spindle speed to obtain high material removal rate or at the lowest spindle speed to prolong the tool life depending on the need for the application. The mathematical model is based on a power regression modelling, dependent on the three above mentioned parameters.

Keerthiprasad.K

2014-08-01

375

A least trimmed square regression method for second level FMRI effective connectivity analysis.

We present a least trimmed square (LTS) robust regression method to combine different runs/subjects for second/high level effective connectivity analysis. The basic idea of this method is to treat the extreme nonlinear model variability as outliers if they exceed a certain threshold. A bootstrap method for the LTS estimation is employed to detect model outliers. We compared the LTS robust method with a non-robust method using simulated and real datasets. The difference between LTS and the non-robust method for second level effective connectivity analysis is significant, suggesting the conventional non-robust method is easily affected by the model variability from the first level analysis. In addition, after these outliers are detected and excluded for the high level analysis, the model coefficients of the second level are combined within the framework of a mixed model. The variance of the mixed model is estimated using the Newton-Raphson (NR) type Levenberg-Marquardt algorithm. Three sets of real data are adopted to compare conventional methods which do not include random effects in the analysis with a mixed model for second level effective connectivity analysis. The results show that the conventional method is significantly different from the mixed model when greater model variability exists, suggesting there is a strong random effect, and the mixed model should be employed for the second level effective connectivity analysis. PMID:23093379

Li, Xingfeng; Coyle, Damien; Maguire, Liam; McGinnity, Thomas Martin

2013-01-01
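A least trimmed squares fit of the kind used above minimizes the sum of the h smallest squared residuals, which makes it resistant to the "model outliers" the paper screens out. The following is a minimal random-restart concentration-step sketch (not the authors' bootstrap procedure) on simulated data:

```python
import numpy as np

def lts_fit(X, y, h=None, n_starts=20, n_steps=10, seed=0):
    """Least trimmed squares via random elemental starts plus
    concentration steps: refit OLS on the h best-fitting points."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = h or (n + p + 1) // 2
    Xc = np.column_stack([np.ones(n), X])
    best, best_obj = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(n, p + 1, replace=False)   # minimal exact-fit subset
        beta, *_ = np.linalg.lstsq(Xc[idx], y[idx], rcond=None)
        for _ in range(n_steps):                    # concentration steps
            r2 = (y - Xc @ beta) ** 2
            keep = np.argsort(r2)[:h]
            beta, *_ = np.linalg.lstsq(Xc[keep], y[keep], rcond=None)
        obj = np.sort((y - Xc @ beta) ** 2)[:h].sum()
        if obj < best_obj:
            best, best_obj = beta, obj
    return best

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 1))
y = 2.0 + 3.0 * X[:, 0] + 0.1 * rng.normal(size=100)
y[:20] += 15.0                      # 20% gross outliers
beta = lts_fit(X, y)
print(beta)  # intercept near 2, slope near 3 despite the outliers
```

A plain OLS fit on the same data would be dragged toward the contaminated points; the trimmed objective simply ignores them.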

376

Statistical learning method in regression analysis of simulated positron spectral data

International Nuclear Information System (INIS)

Positron lifetime spectroscopy is a non-destructive tool for detection of radiation induced defects in nuclear reactor materials. This work concerns the applicability of the support vector machines method for the input data compression in the neural network analysis of positron lifetime spectra. It has been demonstrated that the SVM technique can be successfully applied to regression analysis of positron spectra. A substantial data compression of about 50 % and 8 % of the whole training set with two and three spectral components respectively has been achieved including a high accuracy of the spectra approximation. However, some parameters in the SVM approach such as the insensitivity zone e and the penalty parameter C have to be chosen carefully to obtain a good performance. (author)

377

Within-session analysis of the extinction of Pavlovian fear-conditioning using robust regression

Directory of Open Access Journals (Sweden)

Full Text Available Traditionally, the analysis of extinction data in fear-conditioning experiments has involved the use of standard linear models, mostly ANOVA of between-group differences of subjects that have undergone different extinction protocols, pharmacological manipulations or some other treatment. Although some studies report individual differences in quantities such as suppression rates or freezing percentages, these differences are not included in the statistical modeling. Within-subject response patterns are then averaged using coarse-grain time windows, which can overlook these individual performance dynamics. Here we illustrate an alternative analytical procedure consisting of two steps: the estimation of a trend for within-session data and the analysis of group differences in trend as the main outcome. This procedure is tested on real fear-conditioning extinction data, comparing trend estimates via ordinary least squares (OLS) and robust least median of squares (LMS) regression, as well as comparing between-group differences when analyzing mean freezing percentage versus LMS slopes as outcomes.

Vargas-Irwin, Cristina

2010-06-01
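The OLS-versus-robust-slope comparison above is easy to reproduce on synthetic within-session data. SciPy does not ship an LMS estimator, so the Theil-Sen slope is used below purely as a stand-in robust estimator; the trial counts and freezing values are invented:

```python
import numpy as np
from scipy.stats import linregress, theilslopes

# Synthetic within-session trend: freezing percentage falling by 3 points
# per trial, with two aberrant high-freezing trials mid-session.
trial = np.arange(20.0)
freeze = 80.0 - 3.0 * trial + np.random.default_rng(3).normal(0.0, 2.0, 20)
freeze[[5, 6]] = 95.0               # aberrant trials

ols = linregress(trial, freeze)
rslope, rintercept, _, _ = theilslopes(freeze, trial)
print(ols.slope, rslope)  # the robust slope stays closer to -3
```

The per-subject robust slopes would then be carried forward as the outcome for the group-level comparison, as the article proposes.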

378

Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a bet...

Schmid, Matthias; Wickler, Florian; Maloney, Kelly O.; Mitchell, Richard; Fenske, Nora; Mayr, Andreas

2013-01-01
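Beta regression as described above models a (0,1) response as Beta-distributed with a logit-linked mean and a precision parameter. A minimal from-scratch maximum-likelihood sketch on simulated data (not the boosting approach the truncated abstract appears to lead into) looks like this:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

def beta_reg_nll(params, X, y):
    """Negative log-likelihood of beta regression with a logit mean link:
    y ~ Beta(mu*phi, (1-mu)*phi), mu = expit(X @ beta), phi = exp(last param)."""
    beta, phi = params[:-1], np.exp(params[-1])   # keep precision positive
    mu = expit(X @ beta)
    a, b = mu * phi, (1.0 - mu) * phi
    return -np.sum(gammaln(phi) - gammaln(a) - gammaln(b)
                   + (a - 1.0) * np.log(y) + (b - 1.0) * np.log(1.0 - y))

rng = np.random.default_rng(4)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
mu = expit(X @ np.array([0.5, 1.0]))              # true coefficients
y = rng.beta(mu * 30.0, (1.0 - mu) * 30.0)        # true precision phi = 30
fit = minimize(beta_reg_nll, x0=np.zeros(3), args=(X, y),
               method="Nelder-Mead", options={"maxiter": 2000})
print(fit.x[:2])  # estimates of (0.5, 1.0)
```

The logit link keeps fitted percentages inside (0,1), which is exactly what makes beta regression convenient for bounded outcomes.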

379

A Skew-t space-varying regression model for the spectral analysis of resting state brain activity.

It is known that in many neurological disorders such as Down syndrome, main brain rhythms shift their frequencies slightly, and characterizing the spatial distribution of these shifts is of interest. This article reports on the development of a Skew-t mixed model for the spatial analysis of resting state brain activity in healthy controls and individuals with Down syndrome. Time series of oscillatory brain activity are recorded using magnetoencephalography, and spectral summaries are examined at multiple sensor locations across the scalp. We focus on the mean frequency of the power spectral density, and use space-varying regression to examine associations with age, gender and Down syndrome across several scalp regions. Spatial smoothing priors are incorporated based on a multivariate Markov random field, and the markedly non-Gaussian nature of the spectral response variable is accommodated by the use of a Skew-t distribution. A range of models representing different assumptions on the association structure and response distribution are examined, and we conduct model selection using the deviance information criterion. Our analysis suggests region-specific differences between healthy controls and individuals with Down syndrome, particularly in the left and right temporal regions, and produces smoothed maps indicating the scalp topography of the estimated differences. PMID:22614763

Ismail, Salimah; Sun, Wenqi; Nathoo, Farouk S; Babul, Arif; Moiseev, Alexander; Beg, Mirza Faisal; Virji-Babul, Naznin

2013-08-01

380

Modelling and analysis of turbulent datasets using Auto Regressive Moving Average processes

International Nuclear Information System (INIS)

We introduce a novel way to extract information from turbulent datasets by applying an Auto Regressive Moving Average (ARMA) statistical analysis. Such analysis goes well beyond the analysis of the mean flow and of the fluctuations and links the behavior of the recorded time series to a discrete version of a stochastic differential equation which is able to describe the correlation structure in the dataset. We introduce a new index Υ that measures the difference between the resulting analysis and the Obukhov model of turbulence, the simplest stochastic model reproducing both Richardson law and the Kolmogorov spectrum. We test the method on datasets measured in a von Kármán swirling flow experiment. We found that the ARMA analysis is well correlated with spatial structures of the flow, and can discriminate between two different flows with comparable mean velocities, obtained by changing the forcing. Moreover, we show that Υ is highest in regions where shear layer vortices are present, thereby establishing a link between deviations from the Kolmogorov model and coherent structures. These deviations are consistent with the ones observed by computing the Hurst exponents for the same time series. We show that some salient features of the analysis are preserved when considering global instead of local observables. Finally, we analyze flow configurations with multistability features where the ARMA technique is efficient in discriminating different stability branches of the system.


382

Modelling and analysis of turbulent datasets using Auto Regressive Moving Average processes

Energy Technology Data Exchange (ETDEWEB)

We introduce a novel way to extract information from turbulent datasets by applying an Auto Regressive Moving Average (ARMA) statistical analysis. Such analysis goes well beyond the analysis of the mean flow and of the fluctuations and links the behavior of the recorded time series to a discrete version of a stochastic differential equation which is able to describe the correlation structure in the dataset. We introduce a new index Υ that measures the difference between the resulting analysis and the Obukhov model of turbulence, the simplest stochastic model reproducing both Richardson law and the Kolmogorov spectrum. We test the method on datasets measured in a von Kármán swirling flow experiment. We found that the ARMA analysis is well correlated with spatial structures of the flow, and can discriminate between two different flows with comparable mean velocities, obtained by changing the forcing. Moreover, we show that Υ is highest in regions where shear layer vortices are present, thereby establishing a link between deviations from the Kolmogorov model and coherent structures. These deviations are consistent with the ones observed by computing the Hurst exponents for the same time series. We show that some salient features of the analysis are preserved when considering global instead of local observables. Finally, we analyze flow configurations with multistability features where the ARMA technique is efficient in discriminating different stability branches of the system.

Faranda, Davide, E-mail: davide.faranda@cea.fr; Dubrulle, Bérengère; Daviaud, François [Laboratoire SPHYNX, Service de Physique de l' Etat Condensé, DSM, CEA Saclay, CNRS URA 2464, 91191 Gif-sur-Yvette (France); Pons, Flavio Maria Emanuele [Dipartimento di Scienze Statistiche, Universitá di Bologna, Via delle Belle Arti 41, 40126 Bologna (Italy); Saint-Michel, Brice [Institut de Recherche sur les Phénomènes Hors Equilibre, Technopole de Chateau Gombert, 49 rue Frédéric Joliot Curie, B.P. 146, 13 384 Marseille (France); Herbert, Éric [Université Paris Diderot - LIED - UMR 8236, Laboratoire Interdisciplinaire des Énergies de Demain, Paris (France); Cortet, Pierre-Philippe [Laboratoire FAST, CNRS, Université Paris-Sud (France)

2014-10-15
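The ARMA modelling above combines an autoregressive part and a moving-average part; full ARMA estimation requires iterative likelihood methods (e.g. from a time-series package), but the autoregressive half can be illustrated with plain conditional least squares on lagged values. The simulated AR(1) series below is only a stand-in for the measured velocity time series:

```python
import numpy as np

def fit_ar(x, p):
    """Conditional least-squares fit of an AR(p) model:
    x_t = c + a_1 x_{t-1} + ... + a_p x_{t-p} + e_t."""
    n = len(x)
    cols = [np.ones(n - p)] + [x[p - k:n - k] for k in range(1, p + 1)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return coef  # [c, a_1, ..., a_p]

rng = np.random.default_rng(5)
e = rng.normal(size=5000)
x = np.zeros(5000)
for t in range(1, 5000):            # simulate AR(1) with a_1 = 0.7
    x[t] = 0.7 * x[t - 1] + e[t]
print(fit_ar(x, 1))  # intercept near 0, a_1 near 0.7
```

In the paper, the fitted (p, q) orders and residual structure, summarized by the index Υ, are what discriminate the measured flow from the Obukhov model.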

383

Energy Technology Data Exchange (ETDEWEB)

The ChemCam instrument on the Mars Science Laboratory (MSL) will include a laser-induced breakdown spectrometer (LIBS) to quantify major and minor elemental compositions. The traditional analytical chemistry approach to calibration curves for these data regresses a single diagnostic peak area against concentration for each element. This approach contrasts with a new multivariate method in which elemental concentrations are predicted by step-wise multiple regression analysis based on areas of a specific set of diagnostic peaks for each element. The method is tested on LIBS data from igneous and metamorphosed rocks. Between 4 and 13 partial regression coefficients are needed to describe each elemental abundance accurately (i.e., with a regression line of R² > 0.9995 for the relationship between predicted and measured elemental concentration) for all major and minor elements studied. Validation plots suggest that the method is limited at present by the small data set, and will work best for prediction of concentration when a wide variety of compositions and rock types has been analyzed.

Clegg, Samuel M [Los Alamos National Laboratory; Barefield, James E [Los Alamos National Laboratory; Wiens, Roger C [Los Alamos National Laboratory; Dyar, Melinda D [MT HOLYOKE COLLEGE; Schafer, Martha W [LSU; Tucker, Jonathan M [MT HOLYOKE COLLEGE

2008-01-01
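Step-wise multiple regression of the kind described above can be sketched as greedy forward selection: repeatedly add the predictor (here, a candidate "peak area") that most reduces the residual sum of squares. The data are synthetic; the selection criterion is a simplification of a full step-wise procedure with entry/exit tests:

```python
import numpy as np

def forward_select(X, y, n_keep):
    """Greedy forward selection: at each step add the column whose
    inclusion in an OLS fit most reduces the residual sum of squares."""
    chosen = []
    for _ in range(n_keep):
        best_j, best_rss = None, np.inf
        for j in range(X.shape[1]):
            if j in chosen:
                continue
            cols = np.column_stack([np.ones(len(y))] +
                                   [X[:, k] for k in chosen + [j]])
            beta, *_ = np.linalg.lstsq(cols, y, rcond=None)
            rss = np.sum((y - cols @ beta) ** 2)
            if rss < best_rss:
                best_j, best_rss = j, rss
        chosen.append(best_j)
    return chosen

rng = np.random.default_rng(6)
X = rng.normal(size=(120, 20))      # 20 candidate diagnostic peak areas
y = 4.0 * X[:, 3] + 2.0 * X[:, 11] + 0.1 * rng.normal(size=120)
print(sorted(forward_select(X, y, 2)))  # → [3, 11]
```

With 4 to 13 retained coefficients per element, as in the paper, the same loop simply runs for more steps.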

384

Random regression models allow for analysis of longitudinal data, which together with the use of genomic information are expected to increase accuracy of selection, when compared with analyzing average or total production with pedigree information. The objective of this study was to estimate variance components for egg production over time in a commercial brown egg layer population using genomic relationship information. A random regression reduced animal model with a marker-based relationship matrix was used to estimate genomic breeding values of 3,908 genotyped animals from 6 generations. The first 5 generations were used for training, and predictions were validated in generation 6. Daily egg production up to 46 wk in lay was accumulated into 85,462 biweekly (every 2 wk) records for training, of which 17,570 were recorded on genotyped hens and the remaining on their nongenotyped progeny. The effect of adding additional egg production data of 2,167 nongenotyped sibs of selection candidates (16,037 biweekly records) to the training data was also investigated. The model included a 5th order Legendre polynomial nested within hatch-week as fixed effects and random terms for coefficients of quadratic polynomials for genetic and permanent environmental components. Residual variance was assumed heterogeneous among 2-wk periods. Models using pedigree and genomic relationships were compared. Estimates of residual variance were very similar under both models, but the model with genomic relationships resulted in a larger estimate of genetic variance. Heritability estimates increased with age up to mid-production and decreased afterward, resulting in an average heritability of 0.20 and 0.33 for the pedigree and genomic models, respectively. Prediction of total egg number was more accurate with the genomic than with the pedigree-based random regression model (correlation in validation 0.26 vs. 0.16). The genomic model outperformed the pedigree model in most of the 2-wk periods.
Thus, results of this study show that random regression reduced animal models can be used in breeding programs using genomic information and can result in substantial improvements in the accuracy of selection for trajectory traits. PMID:23687143

Wolc, A; Arango, J; Settar, P; Fulton, J E; O'Sullivan, N P; Preisinger, R; Fernando, R; Garrick, D J; Dekkers, J C M

2013-06-01
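The Legendre polynomial basis used in random regression models like the one above can be built directly with numpy. This sketch only constructs the design matrix (the fixed/random effect fitting and relationship matrices are beyond a short example); the biweekly time grid mirrors the 2-46 wk lay period in the abstract:

```python
import numpy as np
from numpy.polynomial import legendre

# Map biweekly periods (2..46 wk in lay) onto [-1, 1], the standard
# domain for the Legendre basis used in random regression models.
weeks = np.arange(2, 48, 2, dtype=float)
t = 2.0 * (weeks - weeks.min()) / (weeks.max() - weeks.min()) - 1.0

order = 5  # matches the 5th-order fixed lactation curve in the abstract
Phi = np.column_stack([legendre.legval(t, np.eye(order + 1)[k])
                       for k in range(order + 1)])
print(Phi.shape)  # (23, 6): one row per 2-wk period, one column per polynomial
```

Each animal's trajectory is then modelled as `Phi @ b`, with the coefficient vector `b` split into fixed, genetic, and permanent environmental parts in the full mixed model.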

385

Directory of Open Access Journals (Sweden)

Full Text Available Accurate prediction of the remaining useful life (RUL) of lithium-ion batteries is important for battery management systems. Traditional empirical data-driven approaches for RUL prediction usually require multidimensional physical characteristics including the current, voltage, usage duration, battery temperature, and ambient temperature. From a capacity fading analysis of lithium-ion batteries, it is found that the energy efficiency and battery working temperature are closely related to the capacity degradation, which account for all performance metrics of lithium-ion batteries with regard to the RUL and the relationships between some performance metrics. Thus, we devise a non-iterative prediction model based on flexible support vector regression (F-SVR) and an iterative multi-step prediction model based on support vector regression (SVR), using the energy efficiency and battery working temperature as input physical characteristics. The experimental results show that the proposed prognostic models have high prediction accuracy while using fewer input dimensions than the traditional empirical models.

Shuai Wang

2014-10-01

386

We present here an implementation of a least squares iterative regression method applied to the sine functions embedded in the principal components extracted from geophysical time series. This method seems to represent a useful improvement for the non-stationary time series periodicity quantitative analysis. The principal components determination followed by the least squares iterative regression method was implemented in an algorithm written in the Scilab (2006) language. The main result of the method is to obtain the set of sine functions embedded in the series analyzed in decreasing order of significance, from the most important ones, likely to represent the physical processes involved in the generation of the series, to the less important ones that represent noise components. Taking into account the need of a deeper knowledge of the Sun's past history and its implication to global climate change, the method was applied to the Sunspot Number series (1750-2004). With the threshold and parameter values used here, the application of the method leads to a total of 441 explicit sine functions, among which 65 were considered as being significant and were used for a reconstruction that gave a normalized mean squared error of 0.146.

Nordemann, D. J. R.; Rigozo, N. R.; de Souza Echer, M. P.; Echer, E.

2008-11-01
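The sine-fitting step in the iterative regression above is linear once the frequency is known, because A·sin(ωt + φ) = a·sin(ωt) + b·cos(ωt). The sketch below fits one such component by least squares on a simulated sunspot-like series (an 11-year cycle with invented amplitude and noise; the actual method iterates over many frequencies extracted from principal components):

```python
import numpy as np

def fit_sine(t, y, freq):
    """Least-squares amplitude/phase of a sine at a known frequency,
    using the linear parametrization a*sin(wt) + b*cos(wt) + c."""
    w = 2.0 * np.pi * freq
    X = np.column_stack([np.sin(w * t), np.cos(w * t), np.ones_like(t)])
    (a, b, c), *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.hypot(a, b), np.arctan2(b, a), c   # amplitude, phase, offset

t = np.arange(0.0, 255.0)           # yearly samples, like 1750-2004
rng = np.random.default_rng(7)
y = 60.0 + 45.0 * np.sin(2.0 * np.pi * t / 11.0 + 0.3) + 5.0 * rng.normal(size=255)
amp, phase, offset = fit_sine(t, y, 1.0 / 11.0)
print(amp, offset)  # amplitude near 45, offset near 60
```

Iterating this fit, subtracting each recovered component, and re-fitting the residual is the essence of the least-squares iterative regression the abstract describes.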

387

Semiparametric regression analysis for time-to-event marked endpoints in cancer studies.

In cancer studies the disease natural history process is often observed only at a fixed, random point of diagnosis (a survival time), leading to a current status observation (Sun (2006). The statistical analysis of interval-censored failure time data. Berlin: Springer.) representing a surrogate (a mark) (Jacobsen (2006). Point process theory and applications: marked point and piecewise deterministic processes. Basel: Birkhauser.) attached to the observed survival time. Examples include time to recurrence and stage (local vs. metastatic). We study a simple model that provides insights into the relationship between the observed marked endpoint and the latent disease natural history leading to it. A semiparametric regression model is developed to assess the covariate effects on the observed marked endpoint explained by a latent disease process. The proposed semiparametric regression model can be represented as a transformation model in terms of mark-specific hazards, induced by a process-based mixed effect. Large-sample properties of the proposed estimators are established. The methodology is illustrated by Monte Carlo simulation studies, and an application to a randomized clinical trial of adjuvant therapy for breast cancer. PMID:24379192

Hu, Chen; Tsodikov, Alex

2014-07-01

388

A cautionary note on the use of EESC-based regression analysis for ozone trend studies

The equivalent effective stratospheric chlorine (EESC) construct in ozone regression models attributes ozone changes to EESC changes using a single value of the sensitivity of ozone to EESC over the whole period. Using space-based total column ozone (TCO) measurements, and a synthetic TCO time series constructed such that EESC does not fall below its late 1990s maximum, we demonstrate that the EESC-based estimates of ozone changes in the polar regions (70-90°) after 2000 may, falsely, suggest an EESC-driven increase in ozone over this period. An EESC-based regression of our synthetic "failed Montreal Protocol with constant EESC" time series suggests a positive TCO trend that is statistically significantly different from zero over 2001-2012 when, in fact, no recovery has taken place. Our analysis demonstrates that caution needs to be exercised when using explanatory variables, with a single fit coefficient, fitted to the entire data record, to interpret changes in only part of the record.

Kuttippurath, J.; Bodeker, G. E.; Roscoe, H. K.; Nair, P. J.

2015-01-01

389

Directory of Open Access Journals (Sweden)

Full Text Available Prediction of water consumption structure on the basis of the relationship between water consumption structure and industrial structure is essential to the exploitation and utilization of water resources. Based on the symmetrical log-ratio transformation and partial least-squares (PLS) regression, a linear regression model for water consumption structure and industrial structure in Fujian Province is developed in this study. Analysis of the model showed that the compositional data of water consumption structure and industrial structure in Fujian Province had an obvious linear relationship. The model fit the data very well, with high accuracy, and can be used to predict water consumption structure. Agricultural water was highly correlated with the primary industry, and so was industrial water with the secondary industry. Agricultural water showed a significantly negative correlation with the secondary and tertiary industries. The variation of domestic water had an insignificant correlation with industrial structure. The capacity of the industrial structure factors to explain water consumption structure was in the order primary industry > secondary industry > tertiary industry.

???

2012-06-01

390

This paper deals with the analysis of correlation and regression between the parameters of particle ionizing radiation and the stability characteristics of the irradiated monocrystalline silicon film. Based on the presented theoretical model of correlation and linear regression between two random variables, numeric and real experiments were performed. In the numeric experiment, a simulation of the effect of alpha radiation on a thin layer of monocrystalline...

Jakšić Uroš G.; Arsić Nebojša B.; Fetahović Irfan S.; Stanković Koviljka ?.

2014-01-01

391

Genetics Analysis Workshop 17 provided common and rare genetic variants from exome sequencing data and simulated binary and quantitative traits in 200 replicates. We provide a brief review of the machine learning and regression-based methods used in the analyses of these data. Several regression and machine learning methods were used to address different problems inherent in the analyses of these data, which are high-dimension, low-sample-size data typical of many genetic association studies....

Dasgupta, Abhijit; Sun, Yan V.; König, Inke R.; Bailey-Wilson, Joan E.; Malley, James D.

2011-01-01

392

Energy Technology Data Exchange (ETDEWEB)

A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structures of 150 drug organic compounds to their n-octanol-water partition coefficients (log P(o/w)). Molecular descriptors were derived solely from the 3D structures of the drug molecules. A genetic algorithm was also applied as a variable selection tool in the QSPR analysis. The models were constructed using 110 molecules as the training set, and predictive ability was tested using 40 compounds. Modeling of log P(o/w) of these compounds as a function of the theoretically derived descriptors was established by multiple linear regression (MLR). Four descriptors for these compounds, molecular volume (MV) (geometrical), hydrophilic-lipophilic balance (HLB) (constitutional), hydrogen bond forming ability (HB) (electronic) and polar surface area (PSA) (electrostatic), are taken as inputs for the model. The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows for the estimation of log P(o/w) for molecules not yet synthesized. Application of the developed model to a testing set of 40 drug organic compounds demonstrates that the model is reliable, with good predictive accuracy and simple formulation. The prediction results are in good agreement with the experimental values. The root mean square error of prediction (RMSEP) and squared correlation coefficient (R²) for the MLR model were 0.22 and 0.99 for the prediction set of log P(o/w).

Ghasemi, Jahanbakhsh [Chemistry Department, Faculty of Sciences, Razi University, Kermanshah (Iran, Islamic Republic of)], E-mail: Jahan.ghasemi@gmail.com; Saaidpour, Saadi [Chemistry Department, Faculty of Sciences, Razi University, Kermanshah (Iran, Islamic Republic of)

2007-12-05

393

We consider the scenario where one observes an outcome variable and sets of features from multiple assays, all measured on the same set of samples. One approach that has been proposed for dealing with this type of data is "sparse multiple canonical correlation analysis" (sparse mCCA). All of the current sparse mCCA techniques are biconvex and thus have no guarantees about reaching a global optimum. We propose a method for performing sparse supervised canonical correlation ...

Gross, Samuel M.; Tibshirani, Robert

2014-01-01

394

Ordinal Logistic Regression for the Estimate of the Response Functions in the Conjoint Analysis

Directory of Open Access Journals (Sweden)

Full Text Available In the Conjoint Analysis (COA) model proposed here, a new approach to estimate more than one response function and an extension of the traditional COA, the polytomous response variable (i.e. the evaluation of the overall desirability of alternative product profiles) is described by a sequence of binary variables. To link the categories of overall evaluation to the factor levels, we adopt, at the aggregate level, an ordinal logistic regression based on a main-effects experimental design. The model provides several overall desirability functions (aggregated part-worth sets), as many as there are overall ordered categories, unlike the traditional metric and non-metric COA, which gives only one response function. We provide an application of the model and an interpretation of the main effects.

Amedeo De Luca

2011-12-01
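The decomposition of an ordered polytomous response into a sequence of binary variables, as above, can be sketched by fitting one binary logistic model per cumulative threshold, P(y > k | x). This is a simplified stand-in for a proper proportional-odds fit, on simulated data with invented coefficients:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
n = 400
X = rng.normal(size=(n, 2))                 # two factor-level covariates
latent = X @ np.array([1.0, -1.0]) + rng.logistic(size=n)
y = np.digitize(latent, [-1.0, 1.0])        # ordered categories 0, 1, 2

# One cumulative binary logit per threshold: P(y > 0 | x), P(y > 1 | x).
models = [LogisticRegression().fit(X, (y > k).astype(int)) for k in (0, 1)]
for m in models:
    print(m.coef_)  # slopes similar across thresholds (proportional odds)
```

Under the proportional-odds assumption the slope vectors of the two threshold models coincide (here, near (1, -1)); only the intercepts differ, which is what a single ordinal logistic fit exploits.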

395

Research of NiMH Battery Modeling and Simulation Based on Linear Regression Analysis Method

Directory of Open Access Journals (Sweden)

Full Text Available The battery State-Of-Charge estimation is one of the core issues in the development of an electric vehicle battery management system, and a more accurate model is needed to estimate the State-Of-Charge correctly. Therefore, accurate battery modeling and simulation were researched here. The Thevenin equivalent circuit model of a NiMH battery was established because of the poor accuracy of the traditional model. Based on data from hybrid pulse cycling tests of a 6V 6Ah NiMH battery, the Thevenin model parameters were identified by means of the linear regression analysis method. Then, the battery equivalent circuit simulation model was built in the MATLAB/Simulink environment. The simulation and experimental results showed that the model has better accuracy and can be used to guide battery State-Of-Charge estimation.

Yong-sheng Zhang

2013-11-01
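The parameter-identification step can be sketched by writing the first-order Thevenin model in ARX regression form and recovering its coefficients with ordinary least squares. All parameter values and the pulse profile below are made up; the paper's exact formulation and test data are not reproduced:

```python
import numpy as np

rng = np.random.default_rng(10)
n = 2000
I = rng.choice([0.0, 6.0], size=n)          # pulsed 6 A discharge current
# Simulated first-order Thevenin response (invented parameters):
ocv, r0, r1, a = 6.4, 0.02, 0.015, 0.98     # a = exp(-dt / (R1*C1))
v1 = np.zeros(n)                            # RC-branch polarization voltage
for k in range(1, n):
    v1[k] = a * v1[k - 1] + r1 * (1.0 - a) * I[k - 1]
vt = ocv - r0 * I - v1 + 1e-4 * rng.normal(size=n)   # terminal voltage

# ARX regression form: vt[k] = a*vt[k-1] - r0*I[k]
#                              + (a*r0 - r1*(1-a))*I[k-1] + (1-a)*ocv
X = np.column_stack([vt[:-1], I[1:], I[:-1], np.ones(n - 1)])
theta, *_ = np.linalg.lstsq(X, vt[1:], rcond=None)
print(theta)  # theta[0] recovers the RC pole a = 0.98, theta[1] = -R0
```

The physical parameters (R0, R1, C1, OCV) then follow by inverting the coefficient mapping shown in the comment.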

396

Two quantitative structure-activity relationship (QSAR) models for predicting 95 compounds inhibiting acyl-coenzyme A:cholesterol acyltransferase 2 (ACAT2) were developed. The whole data set was randomly split into a training set of 72 compounds and a test set of 23 compounds. The molecules were represented by 11 descriptors calculated with the software ADRIANA.Code. The inhibitory activity of the ACAT2 inhibitors was then predicted using multilinear regression (MLR) analysis and the support vector machine (SVM) method, respectively. The correlation coefficients of the models for the test sets were 0.90 for the MLR model and 0.91 for the SVM model. Y-randomization was employed to ensure the robustness of the SVM model. The atom charge and electronegativity related descriptors were important for the interaction between the inhibitors and ACAT2. PMID:23711921

Zhong, Min; Xuan, Shouyi; Wang, Ling; Hou, Xiaoli; Wang, Maolin; Yan, Aixia; Dai, Bin

2013-07-01

397

Directory of Open Access Journals (Sweden)

Full Text Available This paper uses data envelopment analysis to investigate the extent to which universities in the United States have undergone productivity and efficiency changes, partly due to managerial performance, during the 2005-09 academic years. Using panel data for 133 research and doctoral universities, the focus is on the primary drivers of U.S. publicly controlled higher education. DEA efficiency and returns-to-scale estimates are provided. In addition, university total factor productivity changes via the Malmquist index are decomposed into component parts. Results suggest that U.S. universities experienced average productivity regress. On an annual basis, such regress was present prior to the global financial crisis; however, productivity gains appeared in concert with the crisis. Managerial efficiency tended to hamper productivity gains but, on the positive side, showed slight improvements over time. Decreasing returns to scale prevailed, but from a policy perspective a return to economy-wide growth may automatically correct some overproduction.

G. Thomas Sav

2012-08-01

398

Energy Technology Data Exchange (ETDEWEB)

The monitoring of the detailed three-dimensional (3D) reactor core power distribution is a prerequisite in the operation of nuclear power reactors, to ensure that the safety limits imposed on the local power density (LPD) and the departure from nucleate boiling ratio (DNBR) are not violated during operation. The LPD and DNBR must be calculated in order to perform the two major functions of the core protection calculator system (CPCS) and the core operation limit supervisory system (COLSS). The LPD at the hottest part of a hot fuel rod, which is related to the power peaking factor (PPF, F_q), is more important than the LPD at any other position in the reactor core, and it needs to be estimated accurately to prevent nuclear fuel rods from melting. In this study, support vector regression (SVR) and uncertainty analysis were applied to the estimation of the reactor core power peaking factor.
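The regression step above is nonlinear and kernel-based. A full SVR needs a quadratic-programming solver, so the sketch below substitutes kernel ridge regression, a closely related RBF-kernel method, fitted on invented inputs standing in for core-state measurements; nothing here reproduces the paper's data or its actual SVR model.

```python
import numpy as np

rng = np.random.default_rng(6)

# Invented stand-in data: 5 core-state inputs vs. a peaking-factor-like
# target with a nonlinear dependence.
n, d = 200, 5
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.05, size=n)

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Kernel ridge fit: alpha = (K + lambda*I)^{-1} y; predict with K(x*, X) @ alpha
lam = 1e-2
K = rbf_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(n), y)

y_fit = K @ alpha
rmse = np.sqrt(np.mean((y_fit - y) ** 2))
print(f"training RMSE = {rmse:.3f}")
```

SVR differs from kernel ridge mainly in its epsilon-insensitive loss and sparse support vectors, but the kernel machinery shown here is shared.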

Bae, In Ho; Naa, Man Gyun [Chosun Univ., Gwangju (Korea, Republic of); Lee, Yoon Joon [Cheju National Univ., Jeju-do (Korea, Republic of); Park, Goon Cherl [Seoul National Univ., Seoul (Korea, Republic of)

2009-05-15

399

A Logistic Regression Analysis of the Contractor's Awareness Regarding Waste Management

Directory of Open Access Journals (Sweden)

Full Text Available This study highlights a number of factors affecting contractors' awareness of construction waste management in the construction industry. The data are based on contractors registered with the Construction Industry Development Board of Malaysia. Binary logistic regression analysis is employed to explore the factors affecting this awareness. Contractors' awareness of waste management tends to be significantly more adequate with increasing values of the following factors: having a waste management plan, awareness of source reduction as a waste minimisation measure, awareness of reusing and recycling waste materials, sorting waste materials, perception of the harmfulness of construction waste to human health, and willingness to pay more for improved waste collection and disposal services. The findings could help environmental and waste management planners in their decision making for managing construction waste and reducing environmental pollution.
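Binary logistic regression, as used above, models the log-odds of a yes/no outcome (adequate vs. inadequate awareness) as a linear function of the factors. The sketch below fits such a model by plain gradient descent on synthetic data; the six factors, their effect sizes, and the sample are all invented for the demo and do not come from the study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented survey stand-in: 6 factor scores per contractor (e.g. "has a
# waste management plan", "sorts waste materials"), all with positive
# true effects on the odds of adequate awareness.
n = 500
X = rng.normal(size=(n, 6))
true_w = np.array([1.2, 0.8, 0.6, 0.5, 0.9, 0.7])
p = 1.0 / (1.0 + np.exp(-(X @ true_w - 0.2)))
y = rng.binomial(1, p)                        # 1 = adequate awareness

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Binary logistic regression via gradient descent on the log-loss."""
    Xb = np.column_stack([np.ones(len(X)), X])   # add intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        preds = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * Xb.T @ (preds - y) / len(y)    # average log-loss gradient
    return w

w = fit_logistic(X, y)
# Positive slopes mean the factor raises the odds of adequate awareness.
print("intercept:", round(w[0], 2), "slopes:", np.round(w[1:], 2))
```

Each fitted slope, exponentiated, is the odds ratio for a one-unit increase in that factor, which is how results of such analyses are usually reported.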

Rawshan Ara Begum

2006-01-01

400

International Nuclear Information System (INIS)

Fast-neutron-induced gamma spectrometry is based on inelastic scattering and capture of fast neutrons in the nuclei of various elements and the consequent detection of the emitted characteristic gamma rays. It is a useful technique for online, nondestructive elemental analysis of the composition of various compounds. In this technique, fast neutrons, typically of 14 MeV, are made incident on the sample and the inelastic-scattering and capture gamma rays are collected. The elements present in the sample can be identified through the peaks at their characteristic energies in the collected spectrum, and the peak heights carry information about the abundance of those elements. Analyzing this gamma spectrum gives the quantitative composition of the sample. A two-step method consisting of spectrum evaluation and calibration is currently used for quantitative abundance analysis. In field applications such as explosive detection and cancer diagnostics, where real-time composition analysis is required, this method is inconvenient and impractical. In this work a new single-step method based on partial least squares (PLS) regression is proposed. The gamma energy spectra of various compounds are collected and used to calibrate the correlation between peak height and element quantity. Based on this analysis, the unknown composition of any compound containing similar elements can be predicted with comparatively higher accuracy. Monte Carlo simulations have been carried out to verify the proposed method and used to predict the quantities of various elements present in some unknown compounds. (author)
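PLS regression is well suited to this setting because the spectrum has many correlated channels but only a few underlying element abundances. The sketch below implements single-response PLS (PLS1) with the standard NIPALS algorithm and calibrates it on synthetic "spectra"; the channel count, element count, and response matrix are invented for the demo, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

def pls1_fit(X, y, n_components):
    """PLS1 regression coefficients via the NIPALS algorithm (single y)."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    Xk, yk = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)           # weight vector
        t = Xk @ w                       # scores
        tt = t @ t
        p = Xk.T @ t / tt                # X loadings
        qk = yk @ t / tt                 # y loading
        Xk = Xk - np.outer(t, p)         # deflate X and y
        yk = yk - qk * t
        W.append(w); P.append(p); q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # Regression coefficients in the original (centered) X space
    return W @ np.linalg.solve(P.T @ W, q)

# Synthetic calibration set: 40 "spectra" of 50 channels whose peak
# heights depend linearly on 3 element abundances, plus small noise.
n_samples, n_channels = 40, 50
abundances = rng.uniform(size=(n_samples, 3))
response = rng.normal(size=(3, n_channels))
spectra = abundances @ response + rng.normal(scale=0.01, size=(n_samples, n_channels))
target = abundances[:, 0]                # abundance of one element

B = pls1_fit(spectra, target, n_components=3)
pred = (spectra - spectra.mean(axis=0)) @ B + target.mean()
rmse = np.sqrt(np.mean((pred - target) ** 2))
print(f"calibration RMSE = {rmse:.4f}")
```

In practice the number of latent components would be chosen by cross-validation rather than fixed at the true rank as done here.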

401

A regression-analysis-based method for turbidity and ocean current velocity estimation with remote sensing satellite data is proposed. Through regression analysis with MODIS data and measured data of turbidity and ocean current velocity, a regression equation which allows estimation of turbidity and ocean current velocity is obtained. With this regression equation, together with long-term MODIS data, turbidity and ocean current velocity trends in the Ariake Sea area are clarified. It is also confirmed tha...

Yuichi Sarusawa; Kohei Arai

2013-01-01

402

Surface roughness is a very significant indicator of surface quality. It represents an essential exploitation requirement and influences technological time and costs, i.e. productivity. For that reason, the main objective of this paper is to analyse the influence of face-milling cutting parameters (number of revolutions, feed rate and depth of cut) on the surface roughness of an aluminium alloy. Hence, a statistical (regression) model has been developed to predict the surface roughness using the methodology of experimental design. A central composite design is chosen for fitting the response surface. Also, numerical optimization considering two goals simultaneously (minimum propagation of error and minimum roughness) was performed throughout the experimental region. In this way, the settings of cutting parameters causing the minimum variability in the response were determined for the estimated variations of the significant regression factors.
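The response-surface model behind such a study is a second-order polynomial in the coded cutting parameters: intercept, linear, square, and interaction terms fitted by least squares. The sketch below fits one on invented data; the "true" roughness surface, its coefficients, and the design points are assumptions for the demo (a three-level full factorial rather than a genuine central composite design).

```python
import numpy as np

rng = np.random.default_rng(3)

# Invented "true" roughness surface over coded parameters: number of
# revolutions n, feed rate f, depth of cut a (all in coded units [-1, 1]).
def true_roughness(n, f, a):
    return 1.5 - 0.4 * n + 0.9 * f + 0.2 * a + 0.3 * f**2 + 0.1 * n * f

# Three-level factorial design points and noisy "measured" roughness Ra
levels = np.array([-1.0, 0.0, 1.0])
pts = np.array([(n, f, a) for n in levels for f in levels for a in levels])
Ra = true_roughness(*pts.T) + rng.normal(scale=0.02, size=len(pts))

# Second-order response-surface model: intercept, linear, square and
# interaction terms, fitted by ordinary least squares.
n_, f_, a_ = pts.T
D = np.column_stack([np.ones(len(pts)), n_, f_, a_,
                     n_**2, f_**2, a_**2,
                     n_*f_, n_*a_, f_*a_])
coef, *_ = np.linalg.lstsq(D, Ra, rcond=None)
print("fitted coefficients:", np.round(coef, 2))
```

With the fitted polynomial in hand, the optimization step in the paper would minimize predicted roughness (and its prediction variance) over the experimental region.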

Simunovic, K.; Simunovic, G.; Saric, T.

2013-10-01

403

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Introduction Examples of the spontaneous regression of primary intracranial germinomas can be found in the literature. We present the case of a patient with disseminated lesions of primary intracranial germinoma which synchronously shrank following diagnostic irradiation. We will discuss whether this regression was spontaneous or radiation-induced. Case presentation A 43-year-old Japanese woman presented to our hospital complaining of memory problems over a period of one year and blurred vision over a period of three months. Following magnetic resonance imaging, she was found to have a massive lesion in the third ventricle and small lesions in the pineal region, fourth ventricle, and in the anterior horn of the left lateral ventricle. Prior to an open biopsy to confirm the pathology of the lesions, she underwent a single cranial computed tomography scan and a single cranial digital subtraction angiography for a transcranial biopsy. Fourteen days after the first magnetic resonance image - 12 and eight days after the computed tomography scan and digital subtraction angiography, respectively - a pre-operative magnetic resonance image was taken, which showed a notable synchronous shrinkage of the third ventricle tumor, as well as shrinkage of the lesions in the pineal region and in the fourth ventricle. She did not undergo steroid administration until after a biopsy that confirmed the pathological diagnosis of pure germinoma. She then underwent whole craniospinal irradiation and went into a complete remission. Conclusions In our case report, we state that diagnostic radiation can induce the regression of germinomas; this is the most reasonable explanation for the synchronous multiple regression observed in this case of germinoma.
Clinicians should keep this non-spontaneous regression in mind and monitor germinoma lesions with minimal exposure to diagnostic radiation before diagnostic confirmation, and also before radiation treatment with or without chemotherapy begins.

Natsumeda Manabu

2011-01-01

404

The Analysis of Internet Addiction Scale Using Multivariate Adaptive Regression Splines

Directory of Open Access Journals (Sweden)

Full Text Available Background: Determining the real effects on Internet dependency requires an unbiased and robust statistical method. MARS (multivariate adaptive regression splines) is a relatively new non-parametric method used in the literature for parameter estimation in cause-and-effect research. MARS can both produce legible model curves and make unbiased parametric predictions. Methods: To examine the performance of MARS, its findings are compared to those of Classification and Regression Trees (C&RT), which are considered in the literature to be efficient in revealing correlations between variables. The data set for the study is taken from "The Internet Addiction Scale" (IAS), which attempts to reveal the addiction levels of individuals. The population of the study consists of 754 secondary school students (301 female, 443 male) with 10 missing data. The MARS 2.0 trial version was used for the MARS analysis, and the C&RT analysis was done in SPSS. Results: MARS obtained six basis functions for the model, and the regression equation of the model was derived as their combination. MARS showed that average daily Internet-use time, the purpose of Internet use, students' grade, and mothers' occupations had a significant effect on the predicted variable (P < 0.05). In this comparative study, MARS obtained findings different from those of C&RT in predicting dependency level. Conclusion: This study showed the extent to which MARS reveals how a variable considered significant changes the character of the model.
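The "basis functions" that MARS combines are hinge functions of the form max(0, x - t) and max(0, t - x), whose weighted sum yields a piecewise-linear model. The sketch below fits such a basis expansion by least squares at one fixed knot on invented data; real MARS additionally searches over knot locations and prunes terms, which is omitted here, and the predictor is only loosely inspired by "daily Internet-use time".

```python
import numpy as np

rng = np.random.default_rng(4)

def hinge(x, knot, direction):
    """MARS basis function: max(0, x - knot) if direction=+1, else max(0, knot - x)."""
    return np.maximum(0.0, direction * (x - knot))

# Invented data: a response that is flat up to a knot at x = 3 (hours of
# daily Internet use, say) and rises linearly after it.
x = rng.uniform(0, 8, size=300)
y = 2.0 + 1.5 * hinge(x, 3.0, +1) + rng.normal(scale=0.1, size=300)

# Least-squares fit over the hinge-pair basis at the (here known) knot.
B = np.column_stack([np.ones_like(x), hinge(x, 3.0, +1), hinge(x, 3.0, -1)])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
print("intercept and hinge coefficients:", np.round(coef, 2))
```

The fit should assign nearly all the slope to the right-hand hinge, which is exactly the kind of legible, local effect that makes MARS models readable.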

M Kayri

2010-12-01

405

Highly accurate interval forecasting of a stock price index is fundamental to making profitable investment decisions, as it provides a range of values rather than a point estimate. In this study, we investigate the possibility of forecasting an interval-valued stock price index series over short and long horizons using multi-output support vector regression (MSVR). Furthermore, this study proposes a firefly algorithm (FA)-based approach, built on the est...

Xiong, Tao; Bao, Yukun; Hu, Zhongyi

2014-01-01

406

Directory of Open Access Journals (Sweden)

Full Text Available Aim: A study was undertaken to develop a forecasting model for predicting bluetongue outbreaks in the North-west agroclimatic zone of Tamil Nadu, India. Materials and Methods: Eleven bluetongue outbreaks were characterised by active and passive surveillance over a period of twelve years and used in this study. Meteorological data comprising maximum and minimum temperatures, relative humidity, rainfall and wind speed were collected and used as the predictor variables in a multiple linear regression model. Results: A multiple linear regression model was developed for the North-west zone of Tamil Nadu. Values of the dependent variable less than or greater than one indicated remote or greater chances of bluetongue outbreaks, respectively. Monthly mean maximum and minimum temperatures of 29.1-31.0°C and 20.1-22.0°C, relative humidity at 8.30 h and 17.00 h IST of 80.1-85.0% and 65.1-70.0%, wind speed of 3.1-5.0 km/h, and monthly total rainfall of < 200 mm were identified as the ideal climatic conditions for increased numbers of bluetongue outbreaks in this zone. Conclusion: Based on the values obtained from the prediction model, stakeholders can be warned in a timely manner through the media to institute suitable prophylactic measures against bluetongue and avoid economic losses due to the disease. [Vet World 2013; 6(6): 321-324]
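The decision rule above (fit a multiple linear regression on meteorological predictors, then flag months whose predicted value exceeds one) can be sketched as follows. The monthly records, the regression coefficients, and the scale of the risk score are all invented for the demo; only the threshold-at-one rule comes from the paper.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical monthly records: [max temp (°C), min temp (°C), RH 8:30 (%),
# RH 17:00 (%), wind speed (km/h), rainfall (mm)] -- the predictor set from
# the paper, with invented ranges and effect sizes.
months = rng.uniform([25, 18, 70, 55, 1, 0],
                     [35, 25, 95, 80, 8, 400], size=(120, 6))
w_true = np.array([0.05, 0.04, 0.01, 0.01, 0.05, -0.002])
risk = months @ w_true - 2.0 + rng.normal(scale=0.05, size=120)

# Fit the multiple linear regression model by ordinary least squares
A = np.column_stack([np.ones(len(months)), months])
coef, *_ = np.linalg.lstsq(A, risk, rcond=None)

# Decision rule from the paper: predicted value > 1 suggests a greater
# chance of an outbreak, < 1 a remote chance.
pred = A @ coef
flagged = pred > 1.0
print(f"{flagged.sum()} of {len(months)} months flagged as outbreak-prone")
```

In the actual application the flagged months would trigger timely media warnings so that prophylactic measures can be instituted before outbreaks occur.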

G. Selvaraju

2013-12-01

407

Applying support vector regression analysis on grip force level-related corticomuscular coherence

DEFF Research Database (Denmark)

Voluntary motor performance is the result of cortical commands driving muscle actions. Corticomuscular coherence can be used to examine the functional coupling, or communication, between the human brain and muscles. To investigate the effects of grip force level on corticomuscular coherence in an accessory muscle, this study proposed an expanded support vector regression (ESVR) algorithm to quantify the coherence between the electroencephalogram (EEG) from the sensorimotor cortex and the surface electromyogram (EMG) from the brachioradialis in the upper limb. A measure called coherence proportion was introduced to compare the corticomuscular coherence in the alpha (7-15 Hz), beta (15-30 Hz) and gamma (30-45 Hz) bands at 25% and 75% of maximum grip force (MGF). Results show that ESVR could reduce the influence of deflected signals and summarize the overall behavior of multiple coherence curves. Coherence proportion is more sensitive to grip force level than coherence area. Significantly higher corticomuscular coherence occurred in the alpha (p<0.01) and beta (p<0.01) bands during 75% MGF, but in the gamma band (p<0.01) during 25% MGF. The results suggest that the sensorimotor cortex might control the activity of an accessory muscle for hand grip with increased grip intensity by changing functional corticomuscular coupling at certain frequency bands (alpha, beta and gamma).

Rong, Yao; Han, Xixuan

2014-01-01

408

Directory of Open Access Journals (Sweden)

Full Text Available This study performs a Differential Item Functioning (DIF) analysis, in terms of gender and culture, on the items in the PISA 2009 mathematics literacy sub-test. The DIF analyses were done through the Mantel-Haenszel, logistic regression and SIBTEST methods. The data for the gender variable were collected from the responses given by 332 students to the items in the mathematics literacy sub-test during the administration of the 5th booklet in the PISA 2009 application, whereas the data for the culture variable were collected through the application of the 5th booklet in Turkey, Germany, Finland and the United States. As a result of the DIF analysis by gender, 4 items functioned in favor of males; only one item can be said to