WorldWideScience

Sample records for multiple regression analysis

  1. Multiple Correlation versus Multiple Regression.

    Science.gov (United States)

    Huberty, Carl J.

    2003-01-01

    Describes differences between multiple correlation analysis (MCA) and multiple regression analysis (MRA), showing how these approaches involve different research questions and study designs, different inferential approaches, different analysis strategies, and different reported information. (SLD)

  2. MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM

    Directory of Open Access Journals (Sweden)

    Erika KULCSÁR

    2009-12-01

    This paper analyses the relationship between GDP in the hotels and restaurants sector (the dependent variable) and the following independent variables: overnight stays in tourist accommodation establishments, arrivals in tourist accommodation establishments, and investments in the hotels and restaurants sector, over the period 1995-2007. Using multiple regression analysis, I found that investments and tourist arrivals are significant predictors of GDP. Based on these results, I identified those components of the marketing mix which, in my opinion, require investment and could contribute to the positive development of tourist arrivals in tourist accommodation establishments.
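
    As a rough illustration of the kind of fit described in this record, the sketch below estimates a multiple regression with two predictors by solving the normal equations. It is not the author's procedure: the function names `solve_linear_system` and `fit_ols` are hypothetical, and the data are synthetic, generated from a known equation so that the result can be checked.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(predictors, response):
    """Return [intercept, b1, ..., bk] minimising the sum of squared errors."""
    rows = [[1.0] + list(p) for p in predictors]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free data generated from y = 2 + 3*x1 - 1*x2 (an assumption
# made for illustration; these are not the tourism figures).
X = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 4)]
y = [2 + 3 * x1 - 1 * x2 for x1, x2 in X]
coefficients = fit_ols(X, y)
```

    Because the synthetic response contains no noise, the fitted coefficients should recover the generating values up to rounding error; with real data, significance tests on each coefficient would still be needed.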

  3. Coding for Direct Interpretation in Multiple Regression Analysis.

    Science.gov (United States)

    Serlin, Ronald C.; Levin, Joel R.

    A general procedure is presented for generating code values for a qualitative variable in multiple linear regression analyses that result in directly interpretable estimates of interest. The basic approach, in viewing ANOVA as a multiple regression problem, is to derive quantitative code values for the various levels of the qualitative ANOVA…

  4. Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis

    Science.gov (United States)

    Williams, Ryan T.

    2012-01-01

    Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis; however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…

  5. Analysis of γ spectra in airborne radioactivity measurements using multiple linear regressions

    International Nuclear Information System (INIS)

    This paper describes the calculation of the net peak counts of the nuclide 137Cs at 662 keV in γ spectra from airborne radioactivity measurements using multiple linear regression. A mathematical model is built by analyzing every factor that contributes to the Cs peak counts in the spectra, and a multiple linear regression function is established. The calculation uses stepwise regression, and insignificant factors are eliminated by an F test. The regression results and their uncertainties are calculated by least-squares estimation, from which the Cs net peak counts and their uncertainty are obtained. The analysis results for an experimental spectrum are presented, and the influence of energy shift and energy resolution on the results is discussed. In comparison with the spectrum-stripping method, the multiple linear regression method does not need stripping ratios, the calculated result depends only on the counts in the Cs peak, and the calculation uncertainty is reduced. (authors)

  6. Quantitative electron microscope autoradiography: application of multiple linear regression analysis

    International Nuclear Information System (INIS)

    A new method for the analysis of high resolution EM autoradiographs is described. It identifies labelled cell organelle profiles in sections on a strictly statistical basis and provides accurate estimates for their radioactivity without the need to make any assumptions about their size, shape and spatial arrangement. (author)

  7. Multiple regression analysis of Jominy hardenability data for boron treated steels

    Energy Technology Data Exchange (ETDEWEB)

    Komenda, J. [Swedish Inst. for Metals Research, Stockholm (Sweden); Sandstroem, R. [Royal Inst. of Tech., Stockholm (Sweden); Tukiainen, M. [Fundia Wire Oy Ab, Lappohja (Finland)

    1997-03-01

    The relations between the chemical composition and the hardenability of boron treated steels have been investigated using a multiple regression analysis method. A linear model of regression was chosen. The free boron content that is effective for the hardenability was calculated using a model proposed by Jansson. The regression analysis for 1261 steel heats provided equations that were statistically significant at the 95% level. All heats met the specification according to the Nordic countries producers' classification. The variation in chemical composition typically explained 80 to 90% of the variation in the hardenability. In the regression analysis, elements which did not significantly contribute to the calculated hardness according to the F test were eliminated. Carbon, silicon, manganese, phosphorus and chromium were of importance at all Jominy distances; nickel, vanadium, boron and nitrogen at distances above 6 mm. After the regression analysis it was demonstrated that very few outliers were present in the data set, i.e. data points outside four times the standard deviation. The model has successfully been used in industrial practice, replacing some of the necessary Jominy tests. (orig.)
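
    The outlier criterion mentioned at the end of this abstract can be sketched as follows. This is only an illustration under assumptions: the abstract does not say whether the rule was applied to residuals or to raw values, so the sketch assumes regression residuals, and the helper name `flag_outliers` and the residual values are hypothetical.

```python
from statistics import mean, stdev


def flag_outliers(residuals, k=4.0):
    """Return indices of residuals more than k sample standard deviations from the mean."""
    centre = mean(residuals)
    spread = stdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r - centre) > k * spread]


# Synthetic residuals (not the steel data): 49 small alternating values
# followed by one large value at index 49.
residuals = [(-1) ** i * 0.1 for i in range(49)] + [5.0]
outliers = flag_outliers(residuals)
```

    A sample needs to be reasonably large for this rule to flag anything at all, since a single point cannot lie more than (n - 1)/sqrt(n) sample standard deviations from the mean of n points.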

  8. Multiple regression analysis of Jominy hardenability data for boron treated steels

    International Nuclear Information System (INIS)

    The relations between the chemical composition and the hardenability of boron treated steels have been investigated using a multiple regression analysis method. A linear model of regression was chosen. The free boron content that is effective for the hardenability was calculated using a model proposed by Jansson. The regression analysis for 1261 steel heats provided equations that were statistically significant at the 95% level. All heats met the specification according to the Nordic countries producers' classification. The variation in chemical composition typically explained 80 to 90% of the variation in the hardenability. In the regression analysis, elements which did not significantly contribute to the calculated hardness according to the F test were eliminated. Carbon, silicon, manganese, phosphorus and chromium were of importance at all Jominy distances; nickel, vanadium, boron and nitrogen at distances above 6 mm. After the regression analysis it was demonstrated that very few outliers were present in the data set, i.e. data points outside four times the standard deviation. The model has successfully been used in industrial practice, replacing some of the necessary Jominy tests. (orig.)

  9. Predictive model of Amorphophallus muelleri growth in some agroforestry in East Java by multiple regression analysis

    Directory of Open Access Journals (Sweden)

    BUDIMAN

    2012-01-01

    Budiman, Arisoesilaningsih E. 2012. Predictive model of Amorphophallus muelleri growth in some agroforestry in East Java by multiple regression analysis. Biodiversitas 13: 18-22. The aim of this research was to determine multiple regression models of vegetative and corm growth of Amorphophallus muelleri Blume across several age variations and habitat conditions of agroforestry in East Java. Descriptive exploratory research was conducted by systematic random sampling at five agroforestries on four plantations in East Java: Saradan, Bojonegoro, Nganjuk and Blitar. In each agroforestry, we observed A. muelleri vegetative and corm growth at four growing ages (1, 2, 3 and 4 years old, respectively) as well as environmental variables such as altitude, vegetation, climate and soil conditions. Data were analyzed using descriptive statistics to compare A. muelleri habitat in the five agroforestries. Meanwhile, the influence and contribution of each environmental variable to A. muelleri vegetative and corm growth were determined using multiple regression analysis in SPSS 17.0. The multiple regression models of A. muelleri vegetative and corm growth, generated from characteristics of the agroforestries and age, showed high validity with R2 = 88-99%. The regression models showed that age, monthly temperatures, percentage of radiation and soil calcium (Ca) content, either simultaneously or partially, determined A. muelleri vegetative and corm growth. Based on these models, the A. muelleri corm reached optimal growth after four years of cultivation, at which point it is ready to be harvested. Additionally, the soil Ca content should reach 25.3 me.hg-1, as at the Sugihwaras agroforestry, with a maximal radiation of 60%.

  10. Multivariate modeling of aging in bottled lager beer by principal component analysis and multiple regression methods.

    Science.gov (United States)

    Liu, Jing; Li, Qi; Dong, Jianjun; Chen, Jian; Gu, Guoxian

    2008-08-27

    Data collected from the sensory test score evaluation of bottled lager beer, together with the chemical components related to aging, including carbonyl compounds, higher alcohols, unsaturated fatty acid, organic acids, alpha-amino acids, dissolved oxygen, and staling evaluation indices, including lag time of electron spin resonance (ESR) curve, 1,1'-diphenyl-2-picrylhydrazyl (DPPH) scavenged amounts, and thiobarbituric acid (TBA) values, were used to predict the extent of aging in bottled lager beer, using both multiple linear regression and principal component analysis methods. Carbonyl compounds, higher alcohols, and TBA value were significantly and positively correlated with sensory evaluation of staling flavor, while lag time and DPPH scavenging amount were negatively correlated with taste test score. Multiple regression analysis was used to fit the sensory test data using the above aging-related chemical parameters and evaluation indices as predictors. A variable selection method based on high loadings of varimax rotated principal components was used to obtain subsets of the predominant predictor variables to be included in the regression model of beer aging, so as to eliminate the multicollinearity of the original nine variables. It was found that staling extent was influenced significantly by higher alcohols, TBA value, and DPPH scavenging amount, and the multicollinearity of the regression model was found to be weak by examining the variance inflation factors of the new predictor variables. A mathematical model of the organoleptic test score for beer aging using these three predictors was obtained by multiple linear regression, showing that the major contributors to the sensory taste of beer aging were higher alcohols, TBA index, and DPPH scavenging amount, with the adjusted R² of the model being 0.62. PMID:18624409
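
    The variance inflation factor check mentioned in this abstract can be sketched as below. The sketch relies on the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix (equivalently 1 / (1 - R_j²)). The function names and the two synthetic predictor columns are assumptions for illustration, not the beer data or the authors' code.

```python
from statistics import mean


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long columns."""
    centred = []
    for col in columns:
        m = mean(col)
        centred.append([v - m for v in col])
    norms = [sum(v * v for v in col) ** 0.5 for col in centred]
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(centred[i], centred[j])) / (norms[i] * norms[j])
         for j in range(k)]
        for i in range(k)
    ]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Two synthetic, moderately correlated predictors (hypothetical values).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
vifs = variance_inflation_factors([x1, x2])
```

    With only two predictors both VIFs reduce to 1 / (1 - r²), where r is their correlation; values close to 1 indicate weak multicollinearity, which is the situation the abstract reports for the reduced predictor set.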

  11. REVAAM Model to determine a company's value by multiple valuation and linear regression analysis

    Directory of Open Access Journals (Sweden)

    Luis G. Acosta-Calzado

    2010-07-01

    This paper shows an alternative model to the widely used method of multiple valuation (or relative valuation) for calculating the value of a company using the Price Earnings (PE) multiple and/or the Enterprise Value to Earnings Before Interest, Taxes, Depreciation and Amortization (EV/EBITDA) multiple. When calculating multiples, analysts tend to consider average multiples within an industry and apply them directly to the target company; however, we believe that this practice does not consider differences among the companies being compared, even though they belong to the same sector or industry. The REVAAM Model uses linear regression to calculate adjusted PE and EV/EBITDA multiples by taking into consideration profitability factors for each multiple in order to differentiate companies in the samples. Calculations are based on public data for US companies, but could be further expanded to other markets. Not only does the REVAAM Model provide a better estimate for relative valuation analysis than simply using average multiples, but it could also be used to compare under/overvalued companies or sectors, and to analyze multiple value changes over time as the intrinsic fundamentals change.

  12. A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

    Directory of Open Access Journals (Sweden)

    R.K.Chauhan

    2011-04-01

    The growing volume of data creates a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, viz. factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two data mining techniques is compared on the following parameters: mean square error (MSE), R-square, adjusted R-square, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters have been computed using various data mining tools such as SPSS, XLstat, Stata, and MS-Excel. It is seen that for all the given datasets, factor analysis outperforms multiple linear regression. However, the absolute value of prediction accuracy varied between the three datasets, indicating that the data distribution and data characteristics play a major role in choosing the correct prediction technique.
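
    Several of the comparison parameters listed in this abstract have simple closed forms. The sketch below computes MSE, RMSE, R-square and adjusted R-square from observed and predicted values. It assumes MSE is the residual sum of squares divided by the number of observations (some packages divide by the residual degrees of freedom instead); the function name and the example values are hypothetical.

```python
def regression_metrics(observed, predicted, n_predictors):
    """Return MSE, RMSE, R-square and adjusted R-square for a fitted model."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    mse = ss_res / n  # assumption: divide by n, not by n - k - 1
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R-square penalises the number of predictors k.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {"mse": mse, "rmse": mse ** 0.5, "r2": r2, "adj_r2": adj_r2}


# Hypothetical observed and predicted values from a one-predictor model.
metrics = regression_metrics(
    observed=[1.0, 2.0, 3.0, 4.0, 5.0],
    predicted=[1.1, 1.9, 3.2, 3.8, 5.0],
    n_predictors=1,
)
```

    Adjusted R-square is the more useful figure when models with different numbers of variables are compared, as in the study above, because plain R-square never decreases when a predictor is added.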

  13. Estimate of Compressive Strength for Concrete using Ultrasonics by Multiple Regression Analysis Method

    International Nuclear Information System (INIS)

    Various types of ultrasonic techniques have been used to estimate the compressive strength of concrete structures. However, the conventional ultrasonic velocity method, which uses only the longitudinal wave, cannot determine the compressive strength of concrete structures accurately. In this paper, multiple parameters are introduced (e.g. velocity of shear wave, velocity of longitudinal wave, attenuation coefficient of shear wave, attenuation coefficient of longitudinal wave, combination condition, age and preservation method), and the multiple regression analysis method is applied to determine the compressive strength of concrete structures. The experimental results show that the velocity of the shear wave estimates the compressive strength of concrete more accurately than the velocity of the longitudinal wave, and that the estimation error for the compressive strength of concrete structures can be brought within approximately ±10%.

  14. Estimating leaf photosynthetic pigments information by stepwise multiple linear regression analysis and a leaf optical model

    Science.gov (United States)

    Liu, Pudong; Shi, Runhe; Wang, Hong; Bai, Kaixu; Gao, Wei

    2014-10-01

    Leaf pigments are key elements for plant photosynthesis and growth. Traditional manual sampling of these pigments is labor-intensive and costly, and it has difficulty capturing their temporal and spatial characteristics. The aim of this work is to estimate photosynthetic pigments at large scale by remote sensing. For this purpose, inverse models were proposed with the aid of stepwise multiple linear regression (SMLR) analysis. Furthermore, a leaf radiative transfer model (i.e. the PROSPECT model) was employed to simulate the leaf reflectance at wavelengths from 400 to 780 nm at 1 nm intervals, and these values were then treated as data from remote sensing observations. Meanwhile, simulated chlorophyll concentration (Cab), carotenoid concentration (Car) and their ratio (Cab/Car) were each taken as the target for building a regression model. In this study, a total of 4000 samples were simulated via PROSPECT with different Cab, Car and leaf mesophyll structures; 70% of these samples were used for training and the remaining 30% for model validation. Reflectance (r) and its mathematical transformations (1/r and log (1/r)) were each employed to build regression models. Results showed fair agreement between pigments and simulated reflectance, with all adjusted coefficients of determination (R2) larger than 0.8 when 6 wavebands were selected to build the SMLR model. The largest values of R2 for Cab, Car and Cab/Car are 0.8845, 0.876 and 0.8765, respectively. Meanwhile, mathematical transformations of reflectance showed little influence on regression accuracy. We concluded that it was feasible to estimate chlorophyll, carotenoids and their ratio with a statistical model based on leaf reflectance data.

  15. Variables Associated with Communicative Participation in People with Multiple Sclerosis: A Regression Analysis

    Science.gov (United States)

    Baylor, Carolyn; Yorkston, Kathryn; Bamer, Alyssa; Britton, Deanna; Amtmann, Dagmar

    2010-01-01

    Purpose: To explore variables associated with self-reported communicative participation in a sample (n = 498) of community-dwelling adults with multiple sclerosis (MS). Method: A battery of questionnaires was administered online or on paper per participant preference. Data were analyzed using multiple linear backward stepwise regression. The…

  16. Thermodynamic analysis of simple gas turbine cycle with multiple regression modelling and optimization

    International Nuclear Information System (INIS)

    In this study, thermodynamic and statistical analyses were performed on a gas turbine system, to assess the impact of some important operating parameters like CIT (Compressor Inlet Temperature), PR (Pressure Ratio) and TIT (Turbine Inlet Temperature) on its performance characteristics such as net power output, energy efficiency, exergy efficiency and fuel consumption. Each performance characteristic was enunciated as a function of operating parameters, followed by a parametric study and optimization. The results showed that the performance characteristics increase with an increase in the TIT and a decrease in the CIT, except fuel consumption which behaves oppositely. The net power output and efficiencies increase with the PR up to certain initial values and then start to decrease, whereas the fuel consumption always decreases with an increase in the PR. The results of exergy analysis showed the combustion chamber as a major contributor to the exergy destruction, followed by stack gas. Subsequently, multiple regression models were developed to correlate each of the response variables (performance characteristic) with the predictor variables (operating parameters). The regression model equations showed a significant statistical relationship between the predictor and response variables. (author)

  17. Analysis of longitudinal clinical trials with missing data using multiple imputation in conjunction with robust regression.

    Science.gov (United States)

    Mehrotra, Devan V; Li, Xiaoming; Liu, Jiajun; Lu, Kaifeng

    2012-12-01

    In a typical randomized clinical trial, a continuous variable of interest (e.g., bone density) is measured at baseline and fixed postbaseline time points. The resulting longitudinal data, often incomplete due to dropouts and other reasons, are commonly analyzed using parametric likelihood-based methods that assume multivariate normality of the response vector. If the normality assumption is deemed untenable, then semiparametric methods such as (weighted) generalized estimating equations are considered. We propose an alternate approach in which the missing data problem is tackled using multiple imputation, and each imputed dataset is analyzed using robust regression (M-estimation; Huber, 1973, Annals of Statistics 1, 799-821.) to protect against potential non-normality/outliers in the original or imputed dataset. The robust analysis results from each imputed dataset are combined for overall estimation and inference using either the simple Rubin (1987, Multiple Imputation for Nonresponse in Surveys, New York: Wiley) method, or the more complex but potentially more accurate Robins and Wang (2000, Biometrika 87, 113-124.) method. We use simulations to show that our proposed approach performs at least as well as the standard methods under normality, but is notably better under both elliptically symmetric and asymmetric non-normal distributions. A clinical trial example is used for illustration. PMID:22994905
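
    The simpler of the two combining methods named in this abstract, Rubin's rules, can be sketched in a few lines. Each imputed dataset yields a point estimate and its variance; the pooled estimate is their mean, and the total variance adds the within-imputation variance to an inflated between-imputation variance. The function name and the three estimate/variance pairs are hypothetical; the robust regression step that would produce them is not shown.

```python
def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and variances with Rubin's rules."""
    m = len(estimates)
    pooled = sum(estimates) / m                      # pooled point estimate
    within = sum(variances) / m                      # mean within-imputation variance
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between       # total variance
    return pooled, total


# Hypothetical treatment-effect estimates from m = 3 imputed datasets.
pooled_estimate, total_variance = pool_rubin(
    estimates=[1.0, 1.2, 0.8],
    variances=[0.04, 0.05, 0.03],
)
```

    The square root of the total variance gives the pooled standard error; the reference distribution and degrees of freedom used for inference are a separate step not covered by this sketch.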

  18. Anomalous particle pinch and scaling of vin/D based on transport analysis and multiple regression

    Science.gov (United States)

    Becker, G.; Kardaun, O.

    2007-01-01

    Predictions of density profiles in current tokamaks and ITER require a validated scaling relation for vin/D, where vin is the anomalous inward drift velocity and D is the anomalous diffusion coefficient. Transport analysis is necessary for determining the anomalous particle pinch from measured density profiles and for separating the impact of particle sources. A set of discharges in ASDEX Upgrade, DIII-D, JET and ASDEX is analysed using a special version of the 1.5-D BALDUR transport code. Profiles of ρs vin/D, with ρs the effective separatrix radius, five other dimensionless parameters and many further quantities in the confinement zone are compiled, resulting in the dataset VIND1.dat, which covers a wide parameter range. Weighted multiple regression is applied to the ASDEX Upgrade subset, which leads to a two-term scaling $\rho_s v_{in}(x')/D(x') = 0.0432\,[(L_{T_e}(\bar{x}')/\rho_s)^{-2.58} + 7.13\,U_L^{1.55}$ …

  19. Investigations upon the indefinite rolls quality assurance in multiple regression analysis

    International Nuclear Information System (INIS)

    The quality of rolling-mill rolls has been enhanced mainly through improvements in the chemical composition of roll materials. Achieving an optimal chemical composition can be a technically efficient way to assure the in-service properties, since the material from which rolling-mill rolls are manufactured is of major importance in this respect. This paper continues to present the scientific results of our experimental research in the area of rolling-mill rolls. The basic research contains concrete elements of immediate practical use in metallurgical enterprises for improving the quality of rolls, with the ultimate aim of increasing durability and safety in service. This paper presents an analysis of the chemical composition and its influence upon the mechanical properties of indefinite cast iron rolls. We present some mathematical correlations and graphical interpretations between the hardness (on the working surface and on the necks) and the chemical composition. The double and triple correlations are really helpful in foundry practice, as they allow us to determine variation boundaries for the chemical composition with a view to obtaining optimal hardness values. We suggest a mathematical interpretation of the influence of the chemical composition on the hardness of these indefinite rolling-mill rolls. To this end we use multiple regression analysis, which can be an important statistical tool for the investigation of relationships between variables. Some of the mathematical modeling results can be described through a number of multi-component equations determined for spaces with 3 and 4 dimensions. Also, the regression surfaces, level curves and volumes of variation can be represented and interpreted by technologists, who can treat them as correlation diagrams between the analyzed variables. These research results can be used by the engineering teams of foundries and rolling mills for quality assurance of rolls, from the production phase through to service, which inevitably leads to quality assurance of the rolled products. (Author) 16 refs.

  20. Production of Relativistic Electrons Following Storms and Substorms: Results from Stepwise Multiple Regression and Path Analysis

    Science.gov (United States)

    Simms, L. E.; Engebretson, M. J.; Pilipenko, V.; Reeves, G. D.

    2012-12-01

    A number of parameters of the solar wind and magnetosphere are correlated with the production of relativistic electrons. These include the level of relativistic electrons at storm onset, seed electron flux, solar wind velocity and number density, IMF Bz, AE and Kp indices, and ULF and VLF wave power. However, as all these variables may be intercorrelated as well, simple correlations between each predictor variable and electron flux may not tell the whole story. We identified 166 storms and substorms (1992-2002) with at least 72 storm-free hours after the minimum Dst. We obtained hourly averaged electron fluxes for relativistic electrons (> 1.5 MeV) and seed electrons (100 keV) from several spacecraft (Los Alamos National Laboratory geosynchronous energetic particle instruments). For each storm or substorm event, we found the log10 maximum relativistic electron flux for each satellite following the end of the main phase of each storm. No spacecraft was in operation for this entire period, so we averaged over all available satellites in each hour. As each satellite was calibrated differently, we first converted each observation to a standardized score with mean 0 and standard deviation of 1. We performed a stepwise multiple regression using solar wind velocity and flow angles (both latitude and longitude), number density, standard deviation of velocity and number density, a ULF index (Kozyreva et al., 2007, Planet. Space Sci., 55, 755-769), VLF (0.5-1.0 kHz), AE, Kp, Dst, IMF Bz, Bz RMS standard deviation, and log10 of seed electron and onset relativistic electron fluxes as predictor variables. We also performed regressions entering physical variables first (e.g., solar wind velocity) and adding indices second (e.g., Dst) to determine if physical variables were more predictive. We subsequently performed a path analysis, showing the relationships between the predictor variables, as well as their influence on electron flux following each event. The rise in relativistic electron flux following storms and substorms is best explained by a set of variables rather than by one or two factors. Vsw, ULF, main phase seed electron flux, and either IMF Bz or Dst are the most significant explanatory variables. AE (relating to substorm activity) and Kp show somewhat less influence.
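
    The cross-calibration step described in this abstract (standardizing each satellite's observations, then averaging over whichever satellites are available in each hour) might look roughly like the sketch below. It assumes the population standard deviation is used and that missing hours are marked with `None`; the function names and flux values are hypothetical, not the LANL data.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert one satellite's series to scores with mean 0 and standard deviation 1."""
    present = [v for v in values if v is not None]
    centre, spread = mean(present), pstdev(present)
    return [None if v is None else (v - centre) / spread for v in values]


def hourly_average(series_by_satellite):
    """Average the standardized scores over the satellites available in each hour."""
    averaged = []
    for hour_values in zip(*series_by_satellite):
        available = [v for v in hour_values if v is not None]
        averaged.append(sum(available) / len(available) if available else None)
    return averaged


# Hypothetical hourly log10 fluxes from two differently calibrated satellites;
# None marks hours in which a satellite was not operating.
sat_a = [2.0, 4.0, 6.0, None]
sat_b = [None, 10.0, 30.0, 20.0]
combined = hourly_average([standardize(sat_a), standardize(sat_b)])
```

    Standardizing before averaging keeps a satellite with systematically higher readings from dominating the combined series.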

  1. Regression analysis by example

    CERN Document Server

    Chatterjee, Samprit

    2012-01-01

    Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

  2. Investigations upon the indefinite rolls quality assurance in multiple regression analysis

    Directory of Open Access Journals (Sweden)

    Kiss, I.

    2012-04-01

    The quality of rolling-mill rolls has been enhanced mainly through improvements in the chemical composition of roll materials. Achieving an optimal chemical composition can be a technically efficient way to assure the in-service properties, since the material from which rolling-mill rolls are manufactured is of major importance in this respect. This paper continues to present the scientific results of our experimental research in the area of rolling-mill rolls. The basic research contains concrete elements of immediate practical use in metallurgical enterprises for improving the quality of rolls, with the ultimate aim of increasing durability and safety in service. This paper presents an analysis of the chemical composition and its influence upon the mechanical properties of indefinite cast iron rolls. We present some mathematical correlations and graphical interpretations between the hardness (on the working surface and on the necks) and the chemical composition. The double and triple correlations are really helpful in foundry practice, as they allow us to determine variation boundaries for the chemical composition with a view to obtaining optimal hardness values. We suggest a mathematical interpretation of the influence of the chemical composition on the hardness of these indefinite rolling-mill rolls. To this end we use multiple regression analysis, which can be an important statistical tool for the investigation of relationships between variables. Some of the mathematical modeling results can be described through a number of multi-component equations determined for spaces with 3 and 4 dimensions. Also, the regression surfaces, level curves and volumes of variation can be represented and interpreted by technologists, who can treat them as correlation diagrams between the analyzed variables. These research results can be used by the engineering teams of foundries and rolling mills for quality assurance of rolls, from the production phase through to service, which inevitably leads to quality assurance of the rolled products.

    This work has made it possible to assure the quality of rolling-mill rolls, mainly by giving the roll materials a specific chemical composition. This improved chemical composition can effectively develop the in-service properties with which these rolls can be manufactured, giving better results. The work is presented in a scientific manner, reporting the results of experimental research in the area of rolling-mill rolls. This research contains sufficient elements of immediate practical use for metallurgical companies to improve the quality of rolling-mill rolls. The main objective is to increase durability and safety in service. An analysis is presented of the chemical composition and of its influence on the mechanical properties of indefinite rolling-mill rolls. We present some mathematical correlations together with a graphical interpretation between the hardness (on the working surface and the neck) and the chemical composition. Determining the double and triple correlations, which are really useful in foundry practice, allows us to determine the variation limits of the chemical composition with a view to obtaining optimal hardness values. A mathematical interpretation of the influence of the chemical composition on the hardness of these rolling-mill rolls is given. To this end, we carry out multiple regression analysis, which can provide an important statistical tool for investigating the relationships between variables. The mathematically modeled results can be described by means of a …

  3. Crude Oil Price Forecasting Based on Hybridizing Wavelet Multiple Linear Regression Model, Particle Swarm Optimization Techniques, and Principal Component Analysis

    OpenAIRE

    Ani Shabri; Ruhaidah Samsudin

    2014-01-01

    Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regression (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first used to decompose an original time series into several subseries with different scales. Then, principal component analysis (PCA) is used in processing...

  4. Understanding logistic regression analysis

    OpenAIRE

    Sperandei, Sandro

    2014-01-01

    Logistic regression is used to obtain odds ratios in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using ex...
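
    The odds-ratio reading described in this abstract can be illustrated with a minimal sketch. It is not the article's own example: the counts and the helper `fit_logistic` are hypothetical, and the model has a single binary predictor so that the result can be checked against the 2x2 table by hand.

```python
import math

# Hypothetical grouped data: (exposure x, number of events, group size).
GROUPS = [(0, 10, 100), (1, 30, 100)]

def fit_logistic(groups, iters=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson on grouped binomial data."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, events, total in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = events - total * p          # score contribution
            w = total * p * (1.0 - p)           # information weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^-1 g for the 2 x 2 information matrix H.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_logistic(GROUPS)
odds_ratio = math.exp(b1)                 # exponentiated coefficient
table_or = (30 * 90) / (70 * 10)          # cross-product ratio of the 2x2 table
print(round(odds_ratio, 3), round(table_or, 3))
```

    With one binary predictor the exponentiated coefficient reproduces the cross-product odds ratio of the table; with several predictors each exponentiated coefficient is an odds ratio adjusted for the others.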

  5. Multiple Regressions in Analysing House Price Variations

    Directory of Open Access Journals (Sweden)

    Aminah Md Yusof

    2012-03-01

    Full Text Available The application of rigorous statistical analysis in aiding investment decision making is gaining momentum in the United States of America as well as the United Kingdom. In Malaysia, however, the response from local academics has been rather slow, and it is even slower as far as practitioners are concerned. This paper illustrates how Multiple Regression Analysis (MRA) and its extension, Hedonic Regression Analysis, have been used in explaining price variation for selected houses in Malaysia. Each attribute that is theoretically identified as a price determinant is priced, and the perceived contribution of each is explicitly shown. The paper demonstrates how the statistical analysis is capable of analyzing property investment by considering multiple determinants. The more rigorous consideration of various characteristics enables better investment decision making.

  6. Prediction of relativistic electron flux at geostationary orbit following storms: Multiple regression analysis

    Science.gov (United States)

    Simms, Laura E.; Pilipenko, Viacheslav; Engebretson, Mark J.; Reeves, Geoffrey D.; Smith, A. J.; Clilverd, Mark

    2014-09-01

    Many solar wind and magnetosphere parameters correlate with relativistic electron flux following storms. These include relativistic electron flux before the storm; seed electron flux; solar wind velocity and number density (and their variation); interplanetary magnetic field Bz, AE and Kp indices; and ultra low frequency (ULF) and very low frequency (VLF) wave power. However, as all these variables are intercorrelated, we use multiple regression analyses to determine which are the most predictive of flux when other variables are controlled. Using 219 storms (1992-2002), we obtained hourly averaged electron fluxes for outer radiation belt relativistic electrons (>1.5 MeV) and seed electrons (100 keV) from Los Alamos National Laboratory spacecraft (geosynchronous orbit). For each storm, we found the log10 maximum relativistic electron flux 48-120 h after the end of the main phase of each storm. Each predictor variable was averaged over the 12 h before the storm, the main phase, and the 48 h following minimum Dst. High levels of flux following storms are best modeled by a set of variables. In decreasing influence, ULF, seed electron flux, Vsw and its variation, and after-storm Bz were the most significant explanatory variables. Kp can be added to the model, but it adds no further explanatory power. Although we included ground-based VLF power from Halley, Antarctica, it shows little predictive ability. We produced predictive models using the coefficients from the regression models and assessed their effectiveness in predicting novel observations. The correlation between observed values and those predicted by these empirical models ranged from 0.645 to 0.795.

  7. Application of multiple regression analysis to forecasting South Africa's electricity demand

    Scientific Electronic Library Online (English)

    Koen, Renee; Holloway, Jennifer

    2014-11-01

    Full Text Available In a developing country such as South Africa, understanding the expected future demand for electricity is very important in various planning contexts. It is specifically important to understand how expected scenarios regarding population or economic growth can be translated into corresponding future electricity usage patterns. This paper discusses a methodology for forecasting long-term electricity demand that was specifically developed for applying to such scenarios. The methodology uses a series of multiple regression models to quantify historical patterns of electricity usage per sector in relation to patterns observed in certain economic and demographic variables, and uses these relationships to derive expected future electricity usage patterns. The methodology has been used successfully to derive forecasts used for strategic planning within a private company as well as to provide forecasts to aid planning in the public sector. This paper discusses the development of the modelling methodology, provides details regarding the extensive data collection and validation processes followed during the model development, and reports on the relevant model fit statistics. The paper also shows that the forecasting methodology has to some extent been able to match the actual patterns, and therefore concludes that the methodology can be used to support planning by translating changes relating to economic and demographic growth, for a range of scenarios, into a corresponding electricity demand. The methodology therefore fills a particular gap within the South African long-term electricity forecasting domain.

  8. Use of Factor Analysis Scores in Multiple Regression Model for Estimation of Body Weight from Some Body Measurement in Muscovy Duck

    OpenAIRE

    Ogah, D. M.; Alaga, A. A.; Momoh, M. O.

    2009-01-01

    Factor and multiple regression analysis were carried out on morphological traits (body length, body width, bill length, bill width, bill height, shank length, body height, head length, head width, neck length, wing length, chest circumference and body weight) of male and female muscovy ducks. Obvious sexual dimorphism was exhibited between sexes, relationship between body measurement and body weight were examined through factor and multiple linear regression analysis. Three factors had positi...

  9. Practical Session: Multiple Linear Regression

    Science.gov (United States)

    Clausel, M.; Grégoire, G.

    2014-01-01

    Three exercises are proposed to illustrate simple linear regression. The first investigates the influence of several factors on atmospheric pollution. It was proposed by D. Chessel and A.B. Dufour at Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data from 20 U.S. cities. Exercise 2 is an introduction to model selection, whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 were proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).

  10. Reliability and Regression Analysis

    Science.gov (United States)

    Lane, David M.

    This applet, by David M. Lane of Rice University, demonstrates how the reliability of X and Y affect various aspects of the regression of Y on X. Java 1.1 is required and a full set of instructions is given in order to get the full value from the applet. Exercises and definitions to key terms are also given to help students understand reliability and regression analysis.

  11. Application of cluster analysis and multiple regression to calculate the effect of vegetation and topography on snow accumulation and snowmelt

    Science.gov (United States)

    Pevná, Hana; Jeníček, Michal

    2014-05-01

    Snow is an important component of the hydrological cycle in central Europe. A large quantity of water accumulates as snow during the winter and runs off into rivers in a relatively short time during spring. An increased risk of floods in central Europe exists namely in alpine and pre-alpine catchments with a pluvio-nival flow regime. Research on snow accumulation and snowmelt processes is important for runoff forecasting and reservoir management. The research is carried out in small mountain catchments in the Czech Republic. The experimental catchments differ in elevation range, aspect, slope and type of vegetation cover. Automatic and field measurements of snow depth and snow water equivalent (SWE) have been carried out at specific localities since 2008. Each locality is characterized by elevation, aspect, slope and vegetation type (open area, clearing, young forest, sparse mature forest and dense mature forest). Measurements of snow depth and SWE are carried out at 19 localities during both the snow accumulation and snowmelt periods. Snow depth and SWE data were assessed using simple statistical analysis, multiple regression and cluster analysis in order to describe the spatial distribution of snow accumulation and snowmelt. The correlation of SWE with vegetation type, elevation, aspect and slope was tested. The main findings show that vegetation type has the most significant influence on snowpack distribution and on snow accumulation and snowmelt dynamics. Significant correlations were also found for aspect (especially for southern slopes). The study complements similar results obtained in different study areas and climatic conditions, and moreover shows how the importance of the governing factors changes between the snow accumulation and snowmelt periods. The results demonstrate the good applicability of cluster analysis and multiple regression for describing snowpack distribution.

  12. A multiple linear regression analysis of hot corrosion attack on a series of nickel base turbine alloys

    Science.gov (United States)

    Barrett, C. A.

    1985-01-01

    Multiple linear regression analysis was used to determine an equation for estimating hot corrosion attack for a series of Ni base cast turbine alloys. The U transform (i.e., 1/sin (% A/100) to the 1/2) was shown to give the best estimate of the dependent variable, y. A complete second degree equation is described for the "centered" weight chemistries for the elements Cr, Al, Ti, Mo, W, Cb, Ta, and Co. In addition, linear terms for the minor elements C, B, and Zr were added for a basic 47 term equation. The best reduced equation, with essentially 13 terms, was determined by the stepwise selection method. The Cr term was found to be the most important, accounting for 60 percent of the explained variability in hot corrosion attack.

  13. Comparison of a neural network with multiple linear regression for quantitative analysis in ICP-atomic emission spectroscopy

    International Nuclear Information System (INIS)

    A two layer perceptron with backpropagation of error is used for quantitative analysis in ICP-AES. The network was trained by emission spectra of two interfering lines of Cd and As and the concentrations of both elements were subsequently estimated from mixture spectra. The spectra of the Cd and As lines were also used to perform multiple linear regression (MLR) via the calculation of the pseudoinverse S+ of the sensitivity matrix S. In the present paper it is shown that there exist close relations between the operation of the perceptron and the MLR procedure. These are most clearly apparent in the correlation between the weights of the backpropagation network and the elements of the pseudoinverse. Using MLR, the confidence intervals over the predictions are exploited to correct for the optical device of the wavelength shift. (orig.)
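
    The pseudoinverse step mentioned in this abstract can be sketched as follows. This is a minimal illustration and not the paper's procedure or data: the sensitivity values, the four-wavelength grid and the helper names are hypothetical, and the matrix has only two columns so that the inverse of SᵀS can be written out explicitly.

```python
# Hypothetical sensitivities (signal per unit concentration) at four wavelengths
# for two overlapping lines; column 0 stands in for Cd, column 1 for As.
S = [[1.0, 0.2],
     [0.8, 0.5],
     [0.3, 0.9],
     [0.1, 1.0]]

def pseudoinverse_two_columns(S):
    """Return S+ = (S^T S)^-1 S^T for an n x 2 matrix with independent columns."""
    a = sum(r[0] * r[0] for r in S)
    b = sum(r[0] * r[1] for r in S)
    d = sum(r[1] * r[1] for r in S)
    det = a * d - b * b
    # One row of S+ per component, one column per wavelength.
    row0 = [(d * r[0] - b * r[1]) / det for r in S]
    row1 = [(a * r[1] - b * r[0]) / det for r in S]
    return [row0, row1]

def estimate_concentrations(S, y):
    """Least-squares concentrations c = S+ y for a measured mixture spectrum y."""
    S_plus = pseudoinverse_two_columns(S)
    return [sum(w * v for w, v in zip(row, y)) for row in S_plus]

# Noise-free mixture built from known (hypothetical) concentrations.
true_c = [2.0, 3.0]
mixture = [r[0] * true_c[0] + r[1] * true_c[1] for r in S]
estimated = estimate_concentrations(S, mixture)
print([round(c, 6) for c in estimated])
```

    Each row of S+ acts as a fixed set of weights applied to the spectrum, which is the structural similarity to a linear network layer that the abstract points to.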

  14. FORECASTING RETURNS FOR THE STOCK EXCHANGE OF THAILAND INDEX USING MULTIPLE REGRESSION BASED ON PRINCIPAL COMPONENT ANALYSIS

    Directory of Open Access Journals (Sweden)

    Nop Sopipan

    2013-01-01

    Full Text Available The aim of this study was to forecast the returns for the Stock Exchange of Thailand (SET) Index by adding some explanatory variables and a stationary Autoregressive Moving-Average of order p and q (ARMA (p, q)) in the mean equation of returns. In addition, we used Principal Component Analysis (PCA) to remove possible complications caused by multicollinearity. Afterwards, we forecast the volatility of the returns for the SET Index. Results showed that the ARMA (1,1), which includes multiple regression based on PCA, has the best performance. In forecasting the volatility of returns, the GARCH model performs best for one day ahead, and the EGARCH model performs best for five days, ten days and twenty-two days ahead.

  15. Linear Regression Analysis

    CERN Document Server

    Seber, George A F

    2012-01-01

    Concise, mathematically clear, and comprehensive treatment of the subject. Expanded coverage of diagnostics and methods of model fitting. Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models. More than 200 problems throughout the book plus outline solutions for the exercises. This revision has been extensively class-tested.

  16. Statistical analysis using multiple regression of stereological parameters for skeleton castings microstructure

    Directory of Open Access Journals (Sweden)

    M. Cholewa

    2011-07-01

    Full Text Available In this article the authors show the influence of technological parameters and modification treatment on the structural properties of closed skeleton castings. The approach yielded maximal refinement of the structure and minimal structural diversification. Skeleton castings were manufactured in accordance with the elaborated production technology. Experimental castings were manufactured under variable technological conditions: pouring temperature 953 ÷ 1013 K, mould temperature 293 ÷ 373 K and height of the gating system above casting level 105 ÷ 175 mm. Metallographic specimens were analysed, and quantitative analysis of the silicon crystals and of the secondary dendrite-arm spacing of the α solution was performed. Average values of the stereological parameters were determined for all castings, as were the (B/L) and (P/A) factors. On the basis of the microstructural analysis the authors compared the samples. The aim of the analysis was to select the samples with the least diversification of the degree of structure refinement and the smallest silicon crystals. On the basis of the microstructural analysis the authors state that sample 5 (AlSi11, Tpour 1013 K, Tmould 333 K, h = 265 mm) has the best structural properties (least diversification of the degree of structure refinement and the smallest silicon crystals). A statistical analysis of the structural results was then carried out. On the basis of the statistical analysis the authors state that the best structural properties are obtained for the technological parameters Tpour = 1013 K, Tmould = 373 K and h = 230 mm [4]. The results of the statistical analysis are the prerequisite for optimization studies.

  17. Correlation Weights in Multiple Regression

    Science.gov (United States)

    Waller, Niels G.; Jones, Jeff A.

    2010-01-01

    A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps in developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting…

  18. Crude Oil Price Forecasting Based on Hybridizing Wavelet Multiple Linear Regression Model, Particle Swarm Optimization Techniques, and Principal Component Analysis

    Science.gov (United States)

    Shabri, Ani; Samsudin, Ruhaidah

    2014-01-01

    Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regression (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first used to decompose an original time series into several subseries with different scales. Then, principal component analysis (PCA) is used in processing the subseries data in MLR for crude oil price forecasting. Particle swarm optimization (PSO) is used to select the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily West Texas Intermediate (WTI) crude oil market is used as the case study. The time series prediction performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting the crude oil price series. PMID:24895666

  19. Spectroscopic determination of leaf biochemistry using band-depth analysis of absorption features and stepwise multiple linear regression

    Science.gov (United States)

    Kokaly, R.F.; Clark, R.N.

    1999-01-01

    We develop a new method for estimating the biochemistry of plant material using spectroscopy. Normalized band depths calculated from the continuum-removed reflectance spectra of dried and ground leaves were used to estimate their concentrations of nitrogen, lignin, and cellulose. Stepwise multiple linear regression was used to select wavelengths in the broad absorption features centered at 1.73 μm, 2.10 μm, and 2.30 μm that were highly correlated with the chemistry of samples from eastern U.S. forests. Band depths of absorption features at these wavelengths were found to also be highly correlated with the chemistry of four other sites. A subset of data from the eastern U.S. forest sites was used to derive linear equations that were applied to the remaining data to successfully estimate their nitrogen, lignin, and cellulose concentrations. Correlations were highest for nitrogen (R² from 0.75 to 0.94). The consistent results indicate the possibility of establishing a single equation capable of estimating the chemical concentrations in a wide variety of species from the reflectance spectra of dried leaves. The extension of this method to remote sensing was investigated. The effects of leaf water content, sensor signal-to-noise and bandpass, atmospheric effects, and background soil exposure were examined. Leaf water was found to be the greatest challenge to extending this empirical method to the analysis of fresh whole leaves and complete vegetation canopies. The influence of leaf water on reflectance spectra must be removed to within 10%. Other effects were reduced by continuum removal and normalization of band depths. If the effects of leaf water can be compensated for, it might be possible to extend this method to remote sensing data acquired by imaging spectrometers to give estimates of nitrogen, lignin, and cellulose concentrations over large areas for use in ecosystem studies.

  20. Regression Analysis A Constructive Critique

    CERN Document Server

    Berk, Richard A

    2003-01-01

    Regression Analysis: A Constructive Critique identifies a wide variety of problems with regression analysis as it is commonly used and then provides a number of ways in which practice could be improved. Regression is most useful for data reduction, leading to relatively simple but rich and precise descriptions of patterns in a data set. The emphasis on description provides readers with an insightful rethinking from the ground up of what regression analysis can do, so that readers can better match regression analysis with useful empirical questions and improved policy-related research. "An

  1. Estimation of nutrients and organic matter in Korean swine slurry using multiple regression analysis of physical and chemical properties.

    Science.gov (United States)

    Suresh, Arumuganainar; Choi, Hong Lim

    2011-10-01

    Swine waste land application has increased due to organic fertilization, but excess application in an arable system can cause environmental risk. Therefore, in situ characterizations of such resources are important prior to application. To explore this, 41 swine slurry samples were collected from Korea, and wide differences were observed in the physico-biochemical properties. However, significant (P [...] hydrometer, EC meter, drying oven and pH meter were found useful to estimate Mn, Fe, Ca, K, Al, Na, N and 5-day biochemical oxygen demand (BOD₅) at improved R² values of 0.83, 0.82, 0.77, 0.75, 0.67, 0.47, 0.88 and 0.70, respectively. The results from this study suggest that multiple property regressions can facilitate the prediction of micronutrients and organic matter much better than a single property regression for livestock waste. PMID:21767950

  2. A Dirty Model for Multiple Sparse Regression

    CERN Document Server

    Jalali, Ali; Sanghavi, Sujay

    2011-01-01

    Sparse linear regression -- finding an unknown vector from linear measurements -- is now known to be possible with fewer samples than variables, via methods like the LASSO. We consider the multiple sparse linear regression problem, where several related vectors -- with partially shared support sets -- have to be recovered. A natural question in this setting is whether one can use the sharing to further decrease the overall number of samples required. A line of recent research has studied the use of \ell_1/\ell_q norm block-regularizations with q>1 for such problems; however these could actually perform worse in sample complexity -- vis-à-vis solving each problem separately ignoring sharing -- depending on the level of sharing. We present a new method for multiple sparse linear regression that can leverage support and parameter overlap when it exists, but not pay a penalty when it does not. A very simple idea: we decompose the parameters into two components and regularize these differently. We show both theore...

  3. Prediction of coal grindability based on petrography, proximate and ultimate analysis using multiple regression and artificial neural network models

    Energy Technology Data Exchange (ETDEWEB)

    Chelgani, S. Chehreh; Jorjani, E.; Mesroghli, Sh.; Bagherieh, A.H. [Department of Mining Engineering, Research and Science Campus, Islamic Azad University, Poonak, Hesarak Tehran (Iran); Hower, James C. [Center for Applied Energy Research, University of Kentucky, 2540 Research Park Drive, Lexington, KY 40511 (United States)

    2008-01-15

    The effects of proximate and ultimate analysis, maceral content, and coal rank (R_max) for a wide range of Kentucky coal samples from calorific value of 4320 to 14960 (BTU/lb) (10.05 to 34.80 MJ/kg) on Hardgrove Grindability Index (HGI) have been investigated by multivariable regression and artificial neural network (ANN) methods. The stepwise least square mathematical method shows that the relationship between (a) moisture, ash, volatile matter, and total sulfur; (b) ln (total sulfur), hydrogen, ash, ln ((oxygen + nitrogen)/carbon) and moisture; (c) ln (exinite), semifusinite, micrinite, macrinite, resinite, and R_max input sets with HGI in linear condition can achieve the correlation coefficients (R²) of 0.77, 0.75, and 0.81, respectively. The ANN, which adequately recognized the characteristics of the coal samples, can predict HGI with correlation coefficients of 0.89, 0.89 and 0.95 respectively in the testing process. It was determined that ln (exinite), semifusinite, micrinite, macrinite, resinite, and R_max can be used as the best predictors for the estimation of HGI by multivariable regression (R² = 0.81) and also by artificial neural network methods (R² = 0.95). The ANN based prediction method, as used in this paper, can be further employed as a reliable and accurate method for Hardgrove Grindability Index prediction. (author)

  4. MODELLING THE EFFECT OF THE TREATMENT MEDIUM PH ON THE HEAT INACTIVATION OF ENTEROCOCCUS FAECIUM USING MULTIPLE REGRESSION ANALYSIS

    Directory of Open Access Journals (Sweden)

    S. CONDON

    2014-06-01

    Full Text Available The thermal inactivation of Enterococcus faecium under isothermal conditions in tryptic soy broth of different pH (4.0, 5.5 and 7.4) was studied. The bacterial cells were more sensitive at higher temperature and in media of low pH. Decimal reduction times at 71 °C were 2.56, 0.39 and 0.03 min at pH 7.4, 5.5 and 4.0 respectively. At all temperatures and pH values assayed, the survival curves obtained were linear. A mathematical model based on first order kinetics accurately described these survival curves. The relationship between DT values and temperature was also linear. A mean z-value of 5 °C was established. A multiple linear regression model using four predictor variables (pH, T, pH² and T²) related the log of the DT value to pH and treatment temperature. The developed tertiary model satisfactorily predicted the heat inactivation of Enterococcus faecium under the treatment conditions investigated.
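
    The first-order kinetics and the z-value reported above can be combined in a short worked sketch. It uses only the figures quoted in the abstract (D-values at 71 °C and the mean z-value of 5 °C); it is not the paper's four-term regression model, whose coefficients the abstract does not give. Applying the mean z-value at every pH is an assumption made here, and the function names are hypothetical.

```python
# D-values at 71 °C (minutes) by pH, as quoted in the abstract.
D_REF = {7.4: 2.56, 5.5: 0.39, 4.0: 0.03}
T_REF = 71.0    # reference temperature, °C
Z_VALUE = 5.0   # mean z-value from the abstract, °C (assumed valid at every pH)

def d_value(ph, temp_c):
    """Decimal reduction time (min), assuming log10(D) falls linearly with T."""
    return D_REF[ph] * 10 ** (-(temp_c - T_REF) / Z_VALUE)

def log_reductions(ph, temp_c, minutes):
    """log10(N0/N) under first-order inactivation: holding time divided by D."""
    return minutes / d_value(ph, temp_c)

# Raising the temperature by one z-value divides D by ten.
print(d_value(7.4, 76.0))
# Holding for two D-values gives two decimal reductions.
print(log_reductions(7.4, 71.0, 5.12))
```

    The log-linear form is the standard definition of the z-value; the regression in the paper adds pH and squared terms on top of it.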

  5. Estimating the input function non-invasively for FDG-PET quantification with multiple linear regression analysis: simulation and verification with in vivo data

    International Nuclear Information System (INIS)

    A novel statistical method, namely Regression-Estimated Input Function (REIF), is proposed in this study for the purpose of non-invasive estimation of the input function for fluorine-18 2-fluoro-2-deoxy-d-glucose positron emission tomography (FDG-PET) quantitative analysis. We included 44 patients who had undergone a blood sampling procedure during their FDG-PET scans. First, we generated tissue time-activity curves of the grey matter and the whole brain with a segmentation technique for every subject. Summations of different intervals of these two curves were used as a feature vector, which also included the net injection dose. Multiple linear regression analysis was then applied to find the correlation between the input function and the feature vector. After a simulation study with in vivo data, the data of 29 patients were applied to calculate the regression coefficients, which were then used to estimate the input functions of the other 15 subjects. Comparing the estimated input functions with the corresponding real input functions, the averaged error percentages of the area under the curve and the cerebral metabolic rate of glucose (CMRGlc) were 12.13±8.85 and 16.60±9.61, respectively. Regression analysis of the CMRGlc values derived from the real and estimated input functions revealed a high correlation (r=0.91). No significant difference was found between the real CMRGlc and that derived from our regression-estimated input function (Student's t test, P>0.05). The proposed REIF method demonstrated good abilities for input function and CMRGlc estimation, and represents a reliable replacement for the blood sampling procedures in FDG-PET quantification. (orig.)

  6. A comparison between Joint Regression Analysis and the Additive Main and Multiplicative Interaction model: the robustness with increasing amounts of missing data

    Scientific Electronic Library Online (English)

    Rodrigues, Paulo Canas; Pereira, Dulce Gamito Santinhos; Mexia, João Tiago

    2011-12-01

    Full Text Available This paper joins the main properties of joint regression analysis (JRA), a model based on the Finlay-Wilkinson regression to analyse multi-environment trials, and of the additive main effects and multiplicative interaction (AMMI) model. The study compares JRA and AMMI with particular focus on robustness with increasing amounts of randomly selected missing data. The application is made using a data set from a breeding program of durum wheat (Triticum turgidum L., Durum Group) conducted in Portugal. The two models give similar dominant cultivars (JRA) and winners of mega-environments (AMMI) for the same environments. However, JRA had more stable results as the incidence of missing values increased.

  7. A comparison on parameter-estimation methods in multiple regression analysis with existence of multicollinearity among independent variables

    Directory of Open Access Journals (Sweden)

    Hukharnsusatrue, A.

    2005-11-01

    Full Text Available The objective of this research is to compare methods for estimating multiple regression coefficients in the presence of multicollinearity among independent variables. The estimation methods are the Ordinary Least Squares method (OLS), Restricted Least Squares method (RLS), Restricted Ridge Regression method (RRR) and Restricted Liu method (RL), when the restrictions are true and when they are not true. The study used the Monte Carlo simulation method. The experiment was repeated 1,000 times under each situation. The results are as follows. CASE 1: The restrictions are true. In all cases, the RRR and RL methods have a smaller Average Mean Square Error (AMSE) than the OLS and RLS methods, respectively. The RRR method provides the smallest AMSE when the level of correlation is high, and also for all levels of correlation and all sample sizes when the standard deviation is equal to 5. However, the RL method provides the smallest AMSE when the level of correlation is low or medium, except that with a standard deviation equal to 3 and small sample sizes the RRR method provides the smallest AMSE. The AMSE varies directly with, from most to least influential, the level of correlation, the standard deviation and the number of independent variables, but inversely with sample size. CASE 2: The restrictions are not true. In all cases, the RRR method provides the smallest AMSE, except when the standard deviation is equal to 1 and the error of restrictions is equal to 5%: there, the OLS method provides the smallest AMSE when the level of correlation is low or medium and the sample size is large, whereas for small sample sizes the RL method provides the smallest AMSE. In addition, when the error of restrictions is increased, the OLS method provides the smallest AMSE for all levels of correlation and all sample sizes, except when the level of correlation is high and sample sizes are small.
Moreover, in the cases where the OLS method provides the smallest AMSE, the RLS method mostly has a smaller AMSE than the RRR and RL methods when the level of correlation is low or medium and sample sizes are large. The AMSE varies directly with, from most to least influential, the error of restrictions, the level of correlation, the standard deviation and the number of independent variables, but inversely with sample size, except that the error of restrictions does not affect the AMSE of the OLS method.
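
    The kind of Monte Carlo comparison described in this record can be illustrated with a much smaller experiment. The sketch below is not the authors' design: it compares only OLS with plain (unrestricted) ridge regression, uses two correlated predictors and no intercept, and measures the average squared error of the coefficient estimates. The function names, the true coefficients (1, 1), the ridge constant k and the sample size are all assumptions chosen for illustration.

```python
import random


def solve_2x2(a, b, d, c1, c2):
    """Solve the symmetric system [[a, b], [b, d]] @ beta = [c1, c2]."""
    det = a * d - b * b
    return (d * c1 - b * c2) / det, (a * c2 - b * c1) / det


def amse(rho, sigma, n, k, beta=(1.0, 1.0), reps=1000, seed=0):
    """Average squared coefficient error over `reps` simulated data sets.

    rho   : correlation between the two predictors
    sigma : standard deviation of the error term
    k     : ridge constant (k = 0 gives ordinary least squares)
    """
    rng = random.Random(seed)  # same seed -> same data for every k
    total = 0.0
    for _ in range(reps):
        a = b = d = c1 = c2 = 0.0  # running sums for X'X and X'y
        for _ in range(n):
            x1 = rng.gauss(0, 1)
            x2 = rho * x1 + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
            y = beta[0] * x1 + beta[1] * x2 + rng.gauss(0, sigma)
            a += x1 * x1
            b += x1 * x2
            d += x2 * x2
            c1 += x1 * y
            c2 += x2 * y
        # Ridge adds k to the diagonal of X'X before solving.
        b1, b2 = solve_2x2(a + k, b, d + k, c1, c2)
        total += (b1 - beta[0]) ** 2 + (b2 - beta[1]) ** 2
    return total / reps


if __name__ == "__main__":
    for rho in (0.5, 0.9, 0.99):
        print(rho, amse(rho, 5.0, 20, 0.0), amse(rho, 5.0, 20, 1.0))
```

    Because both estimators are evaluated on identical simulated data (same seed), any difference in the printed averages comes from the estimator alone. The restricted variants studied in the record would additionally impose linear constraints on the coefficients, which this sketch does not attempt.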

  8. Survival analysis and regression models.

    Science.gov (United States)

    George, Brandon; Seals, Samantha; Aban, Inmaculada

    2014-08-01

    Time-to-event outcomes are common in medical research as they offer more information than simply whether or not an event occurred. To handle these outcomes, as well as censored observations where the event was not observed during follow-up, survival analysis methods should be used. Kaplan-Meier estimation can be used to create graphs of the observed survival curves, while the log-rank test can be used to compare curves from different groups. If it is desired to test continuous predictors or to test multiple covariates at once, survival regression models such as the Cox model or the accelerated failure time model (AFT) should be used. The choice of model should depend on whether or not the assumption of the model (proportional hazards for the Cox model, a parametric distribution of the event times for the AFT model) is met. The goal of this paper is to review basic concepts of survival analysis. Discussions relating the Cox model and the AFT model will be provided. The use and interpretation of the survival methods model are illustrated using an artificially simulated dataset. PMID:24810431
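
    As a concrete illustration of the Kaplan-Meier estimation mentioned in this record, the following minimal sketch computes the product-limit survival estimate from event times with right censoring. It is not taken from the paper; the function name and the toy data are hypothetical, and ties are handled by the usual convention that subjects censored at time t are still counted as at risk at t.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


if __name__ == "__main__":
    # Toy data: five subjects, two of them censored (event flag 0).
    for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(t, round(s, 3))
```

    Censored subjects never cause a drop in the curve; they only shrink the risk set for later event times, which is how the estimator uses the partial information they carry.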

  9. Investigating the possible effects of trauma experiences and 5-HTT on the dissociative experiences of patients with OCD using path analysis and multiple regression.

    Science.gov (United States)

    Lochner, Christine; Seedat, Soraya; Hemmings, Sian M J; Moolman-Smook, Johanna C; Kidd, Martin; Stein, Dan J

    2007-01-01

    Dissociation is defined as the disruption of the usually integrated functions of consciousness, such as memory, identity, and perceptions of the environment. Causes include various psychological, neurological and neurobiological mechanisms, none of which have been consistently supported. To our knowledge, the role of gene-environment interactions in dissociative experiences in obsessive-compulsive disorder (OCD) has not previously been investigated. Eighty-three Caucasian patients (29 male, 54 female) with a principal diagnosis of OCD were included. The Dissociative Experiences Scale was used to assess dissociation. The role of childhood trauma (assessed with the Childhood Trauma Questionnaire), and a functional 44-bp insertion/deletion polymorphism in the promoter region of the serotonin transporter, or 5-HTT, in mediating dissociation, was investigated using multiple regression analysis and path analysis using the partial least squares model. Both analyses indicated that an interaction between physical neglect and the S/S genotype of the 5-HTT gene significantly predicted dissociation in patients with OCD. Dissociation may be a predictor of poorer treatment outcome in patients with OCD; therefore, a better understanding of the mechanisms that underlie this phenomenon may be useful. Here, two different but related statistical techniques (multiple regression and partial least squares), confirmed that physical neglect and the 5-HTT genotype jointly play a role in predicting dissociation in OCD. PMID:17943026

  10. Polynomial regression analysis and significance test of the regression function

    International Nuclear Information System (INIS)

    In order to analyze the decay heating power per kilogram of a certain radioactive isotope with the polynomial regression method, the paper first demonstrates the broad usage of polynomial functions and derives the parameters with the ordinary least squares estimate. Then a significance test for the polynomial regression function is derived, considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and the significance test of the polynomial function are applied to the decay heating power of the isotope per kilogram, in accord with the authors' real work. (authors)

  11. Estimation of toxicity of ionic liquids in Leukemia Rat Cell Line and Acetylcholinesterase enzyme by principal component analysis, neural networks and multiple lineal regressions.

    Science.gov (United States)

    Torrecilla, José S; García, Julián; Rojo, Ester; Rodríguez, Francisco

    2009-05-15

    Multiple linear regression (MLR), radial basis network (RB), and multilayer perceptron (MLP) neural network (NN) models have been explored for the estimation of toxicity of ammonium, imidazolium, morpholinium, phosphonium, piperidinium, pyridinium, pyrrolidinium and quinolinium ionic liquid salts in the Leukemia Rat Cell Line (IPC-81) and Acetylcholinesterase (AChE) using only their empirical formulas (elemental composition) and molecular weights. The toxicity values were estimated by means of decadic logarithms of the half maximal effective concentration (EC50) in µM (log10 EC50). The models' performances were analyzed by statistical parameters, analysis of residuals, and central tendency and statistical dispersion tests. The MLP model estimates the log10 EC50 in IPC-81 and AChE with a mean prediction error less than 2.2 and 3.8%, respectively. PMID:18805639

  12. Study relationship between inorganic and organic coal analysis with gross calorific value by multiple regression and ANFIS

    Science.gov (United States)

    Chelgani, S.C.; Hart, B.; Grady, W.C.; Hower, J.C.

    2011-01-01

    The relationship between maceral content plus mineral matter and gross calorific value (GCV) for a wide range of West Virginia coal samples (from 6518 to 15330 BTU/lb; 15.16 to 35.66 MJ/kg) has been investigated by multivariable regression and adaptive neuro-fuzzy inference system (ANFIS). The stepwise least square mathematical method comparison between liptinite, vitrinite, plus mineral matter as input data sets with measured GCV reported a nonlinear correlation coefficient (R2) of 0.83. Using the same data set the correlation between the predicted GCV from the ANFIS model and the actual GCV reported an R2 value of 0.96. It was determined that the GCV-based prediction methods, as used in this article, can provide a reasonable estimation of GCV. Copyright © Taylor & Francis Group, LLC.

  13. Polylinear regression analysis in radiochemistry

    International Nuclear Information System (INIS)

    A number of radiochemical problems have been formulated in the framework of polylinear regression analysis, which permits the use of conventional mathematical methods for their solution. The authors have considered features of the use of polylinear regression analysis for estimating the contributions of various sources to the atmospheric pollution, for studying irradiated nuclear fuel, for estimating concentrations from spectral data, for measuring neutron fields of a nuclear reactor, for estimating crystal lattice parameters from X-ray diffraction patterns, for interpreting data of X-ray fluorescence analysis, for estimating complex formation constants, and for analyzing results of radiometric measurements. The problem of estimating the target parameters can be incorrect at certain properties of the system under study. The authors showed the possibility of regularization by adding a fictitious set of data "obtained" from the orthogonal design. To estimate only a part of the parameters under consideration, the authors used incomplete rank models. In this case, it is necessary to take into account the possibility of confounding estimates. An algorithm for evaluating the degree of confounding is presented which is realized using standard software or regression analysis

  14. Retail sales forecasting with application the multiple regression

    Directory of Open Access Journals (Sweden)

    Kuzhda, Tetyana

    2012-05-01

    Full Text Available The article begins with a formulation for predictive learning called the multiple regression model. A theoretical approach to the construction of regression models is described. The key information of the article is the mathematical formulation of the linear forecast equation that estimates the multiple regression model. The calculation of the forecast value of the dependent variable under the influence of the independent variables is explained. This paper presents retail sales forecasting with multiple regression model estimation. Some of the most important decisions a retailer makes can be informed by the multiple regression. Recently, a changing retail environment has been driven by expected consumer income and advertising costs. Checks of the model's goodness of fit and statistical significance are explored in the article. Finally, the quantitative value of the retail sales forecast based on the multiple regression model is calculated.

  15. A combined multiple regression-time series approach to process capability analysis when data are auto correlated

    International Nuclear Information System (INIS)

    The problem of performing process capability analysis when autocorrelations are present is discussed. It is shown that when the systematic nonrandom phenomenon induced by autocorrelation is ignored, the variance estimate obtained from the original data is no longer an appropriate estimate for use in the process capability analyses. A remedial measure based on an autoregressive integrated moving average model is proposed. It is also shown that the process variance estimated from the residual analysis yields appropriate results for the process capability indices.

  16. Estimation of transport airplane aerodynamics using multiple stepwise regression

    Science.gov (United States)

    Keskar, D. A.; Klein, V.; Batterson, J. G.

    1985-01-01

    This paper presents an application of multiple stepwise regression to the flight test data of a typical transport airplane. The flight test data was carefully preprocessed to eliminate aliasing, time skews and high frequency noise. The data consisted both of basic certification maneuvers, such as wind-up-turns and maneuvers suitable for parameter estimation, such as responses to elevator pulses and doublets. It is shown that the results of multiple stepwise regression techniques compare favorably with the results obtained from maximum likelihood estimation. Finally, it is concluded that multiple stepwise regression could be a fast economical way to estimate transport airplane aerodynamics.

  17. Gaussian process regression analysis for functional data

    CERN Document Server

    Shi, Jian Qing

    2011-01-01

    Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically the methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables. Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high dime

  18. Regression Commonality Analysis: A Technique for Quantitative Theory Building

    Science.gov (United States)

    Nimon, Kim; Reio, Thomas G., Jr.

    2011-01-01

    When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…

  19. Comparative Study of Electrode Wear Estimation in Wire EDM using Multiple Regression Analysis and Group Method Data Handling Technique for EN-8 and EN-19

    Directory of Open Access Journals (Sweden)

    G. Ugrasen

    2014-05-01

    Full Text Available Wire Electrical Discharge Machining (WEDM) is a specialized thermal machining process capable of accurately machining parts with varying hardness or complex shapes, which have sharp edges that are very difficult to machine by mainstream machining processes. In WEDM a specific wire run-off speed is applied to compensate for wear and avoid wire breakage. Since the workpiece generally stays stationary and short discharge durations are applied, the relative displacement between wire and workpiece during one single discharge is very small. This study outlines the development of a model and its application to optimize WEDM machining parameters using Taguchi's technique, which is based on robust design. The present study outlines electrode wear estimation in wire EDM. EN-8 and EN-19 were machined using different process parameters based on an L16 orthogonal array. Among the process parameters, voltage and flush rate were kept constant. Parameters such as bed speed, current, pulse-on and pulse-off were varied. Molybdenum wire having a diameter of 0.18 mm was used as the electrode. Electrode wear was measured using a universal measuring machine. Estimation and comparison of electrode wear were done using multiple regression analysis (MRA) and the group method of data handling (GMDH) technique. From the results it was observed that measured and estimated electrode wear correlate better with MRA than with GMDH.

  20. Clearness index in cloudy days estimated with meteorological information by multiple regression analysis; Kisho joho wo riyoshita kaiki bunseki ni yoru dontenbi no seiten shisu no suitei

    Energy Technology Data Exchange (ETDEWEB)

    Nakagawa, S. [Maizuru National College of Technology, Kyoto (Japan); Kenmoku, Y.; Sakakibara, T. [Toyohashi University of Technology, Aichi (Japan); Kawamoto, T. [Shizuoka University, Shizuoka (Japan). Faculty of Engineering

    1996-10-27

    Study is under way for more accurate prediction of solar radiation quantity to enhance solar energy utilization efficiency. Utilizing a technique for roughly estimating the day's clearness index from forecast weather, the forecast weather (constituted of weather conditions such as 'clear,' 'cloudy,' etc., and adverbs or adjectives such as 'afterward,' 'temporary,' and 'intermittent') has been quantified relative to the clearness index. This index is named the 'weather index' for the purpose of this article. The error rate of the weather index is high on cloudy days, that is, when the weather index falls in the range 0.2-0.5. It has also been found that there is a high correlation between the clearness index and the north-south wind direction component. A multiple regression analysis has therefore been carried out for the estimation of the clearness index from the maximum temperature and the north-south wind direction component. As compared with estimation of the clearness index on the basis only of the weather index, estimation using the weather index and maximum temperature achieves a 3% improvement throughout the year. It has also been learned that estimation by use of the weather index and north-south wind direction component enables a 2% improvement for summer and a 5% or higher improvement for winter. 2 refs., 6 figs., 4 tabs.

  1. Regression and regression analysis time series prediction modeling on climate data of quetta, pakistan

    International Nuclear Information System (INIS)

    Various statistical techniques were used on five-year data from 1998-2002 of average humidity, rainfall, maximum and minimum temperatures, respectively. The relationships to regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit, to our polynomial regression analysis time series (PRATS). The correlation to multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, respectively, which yielded homoscedasticity. We also employed Bartlett's test for homogeneity of variances on five-year data of rainfall and humidity, respectively, which showed that the variances in the rainfall data were not homogeneous while those in the humidity data were homogeneous. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)

  2. Application of Partial Least-Squares Regression Model on Temperature Analysis and Prediction of RCCD

    OpenAIRE

    Yuqing Zhao; Zhenxian Xing

    2013-01-01

    This study, based on the temperature monitoring data of the Jiangya RCCD, uses the principle and method of partial least-squares regression to analyze and predict temperature variation of the RCCD. By establishing a partial least-squares regression model, multiple correlation among the independent variables is overcome, and an organic combination of multiple linear regressions, multiple linear regression and canonical correlation analysis is achieved. Compared with general least-squares regression model result, it is more ...

  3. Applied behavior analytic intervention for autism in early childhood: meta-analysis, meta-regression and dose-response meta-analysis of multiple outcomes.

    Science.gov (United States)

    Virués-Ortega, Javier

    2010-06-01

    A number of clinical trials and single-subject studies have been published measuring the effectiveness of long-term, comprehensive applied behavior analytic (ABA) intervention for young children with autism. However, the overall appreciation of this literature through standardized measures has been hampered by the varying methods, designs, treatment features and quality standards of published studies. In an attempt to fill this gap in the literature, state-of-the-art meta-analytical methods were implemented, including quality assessment, sensitivity analysis, meta-regression, dose-response meta-analysis and meta-analysis of studies of different metrics. Results suggested that long-term, comprehensive ABA intervention leads to (positive) medium to large effects in terms of intellectual functioning, language development, acquisition of daily living skills and social functioning in children with autism. Although favorable effects were apparent across all outcomes, language-related outcomes (IQ, receptive and expressive language, communication) were superior to non-verbal IQ, social functioning and daily living skills, with effect sizes approaching 1.5 for receptive and expressive language and communication skills. Dose-dependent effect sizes were apparent by levels of total treatment hours for language and adaptation composite scores. Methodological issues relating to ABA clinical trials for autism are discussed. PMID:20223569

  4. A field operational test on valve-regulated lead-acid absorbent-glass-mat batteries in micro-hybrid electric vehicles. Part II. Results based on multiple regression analysis and tear-down analysis

    Science.gov (United States)

    Schaeck, S.; Karspeck, T.; Ott, C.; Weirather-Koestner, D.; Stoermer, A. O.

    2011-03-01

    In the first part of this work [1] a field operational test (FOT) on micro-HEVs (hybrid electric vehicles) and conventional vehicles was introduced. Valve-regulated lead-acid (VRLA) batteries in absorbent glass mat (AGM) technology and flooded batteries were applied. The FOT data were analyzed by kernel density estimation. In this publication multiple regression analysis is applied to the same data. Square regression models without interdependencies are used. Hereby, capacity loss serves as dependent parameter and several battery-related and vehicle-related parameters as independent variables. Battery temperature is found to be the most critical parameter. It is proven that flooded batteries operated in the conventional power system (CPS) degrade faster than VRLA-AGM batteries in the micro-hybrid power system (MHPS). A smaller number of FOT batteries were applied in a vehicle-assigned test design where the test battery is repeatedly mounted in a unique test vehicle. Thus, vehicle category and specific driving profiles can be taken into account in multiple regression. Both parameters have only secondary influence on battery degradation, instead, extended vehicle rest time linked to low mileage performance is more serious. A tear-down analysis was accomplished for selected VRLA-AGM batteries operated in the MHPS. Clear indications are found that pSoC-operation with periodically fully charging the battery (refresh charging) does not result in sulphation of the negative electrode. Instead, the batteries show corrosion of the positive grids and weak adhesion of the positive active mass.

  5. Vehicle Travel Time Predication based on Multiple Kernel Regression

    Directory of Open Access Journals (Sweden)

    Wenjing Xu

    2014-07-01

    Full Text Available With the rapid development of transportation and the logistics economy, vehicle travel time prediction and planning have become an important topic in logistics. Travel time prediction, which is indispensable for traffic guidance, has become a key issue for researchers in this field. At present, the prediction of travel time is mainly short-term prediction, and the prediction methods include artificial neural network, Kalman filter and support vector regression (SVR) methods, etc. However, these algorithms still have some shortcomings, such as high computational complexity, slow convergence rate, etc. This paper exploits the learning ability of multiple kernel learning regression (MKLR) for nonlinear prediction, and applies MKLR-based logistics planning to vehicle travel time prediction. The method for vehicle travel time prediction includes the following steps: (1) preprocessing historical data; (2) selecting an appropriate kernel function, training on the historical data and performing analysis; (3) predicting the vehicle travel time based on the trained model. The experimental results show that, compared with other prediction methods, the vehicle travel time prediction method proposed in this paper achieves higher accuracy. It also illustrates the feasibility and effectiveness of the proposed prediction method.

  6. Teasing out the effect of tutorials via multiple regression

    Science.gov (United States)

    Chasteen, Stephanie V.

    2012-02-01

    We transformed an upper-division physics course using a variety of elements, including homework help sessions, tutorials, clicker questions with peer instruction, and explicit learning goals. Overall, the course transformations improved student learning, as measured by our conceptual assessment. Since these transformations were multi-faceted, we would like to understand the impact of individual course elements. Attendance at tutorials and homework help sessions was optional, and occurred outside the class environment. In order to identify the impact of these optional out-of-class sessions, given self-selection effects in student attendance, we performed a multiple regression analysis. Even when background variables are taken into account, tutorial attendance is positively correlated with student conceptual understanding of the material - though not with performance on course exams. Other elements that increase student time-on-task, such as homework help sessions and lectures, do not achieve the same impacts.

  7. Analysing Conjoint Analysis Data by a Random Coefficient Regression Model

    OpenAIRE

    Furlan, Roberto; Corradetti, Roberto

    2005-01-01

    Since the late 1960s conjoint analysis has been applied in estimating consumer preferences in marketing research. This article discusses how to model the data coming from a full or a fractional factorial design within a unique regression model, as an alternative to the estimation done by n independent multiple linear regression models, one for each subject. The advantage of the method presented here resides in the possibility of computing correct standard errors for the conjoint analysis utility ...

  8. A multiple covariance approach to PLS regression with several predictor groups: Structural Equation Exploratory Regression

    CERN Document Server

    Bry, Xavier; Cazes, Pierre

    2008-01-01

    A variable group Y is assumed to depend upon R thematic variable groups X_1, ..., X_R. We assume that components in Y depend linearly upon components in the X_r's. In this work, we propose a multiple covariance criterion which extends that of PLS regression to this multiple predictor groups situation. On this criterion, we build a PLS-type exploratory method - Structural Equation Exploratory Regression (SEER) - that allows one to simultaneously perform dimension reduction in groups and investigate the linear model of the components. SEER uses the multidimensional structure of each group. An application example is given.

  9. Forecasting Financial Time Series Using Multiple Regression, Multi Layer Perception, Radial Basis Function and Adaptive Neuro Fuzzy Inference System Models: A Comparative Analysis

    Directory of Open Access Journals (Sweden)

    Arindam Chaudhuri

    2012-09-01

    Full Text Available In the last few decades, techniques such as Artificial Neural Networks and Fuzzy Inference Systems were used for developing predictive models to estimate the required parameters. More recently, Soft Computing techniques have been used as an alternative statistical tool. Determination of the nature of financial time series data is difficult, expensive, time consuming and involves complex tests. In this paper, we use Multi Layer Perception and Radial Basis Functions of Artificial Neural Networks and the Adaptive Neuro Fuzzy Inference System for prediction of S% (Financial Stress percent) of financial time series data and compare them with the traditional statistical tool of Multiple Regression. The accuracies of the Artificial Neural Network and Adaptive Neuro Fuzzy Inference System techniques are evaluated as relatively similar. It is found that the Radial Basis Functions constructed exhibit higher performance than Multi Layer Perception, Adaptive Neuro Fuzzy Inference System and Multiple Regression for predicting S%. The performance comparison shows that the Soft Computing paradigm is a promising tool for minimizing uncertainties in financial time series data. Further, Soft Computing also minimizes the potential inconsistency of correlations.

  10. Interpreting Multiple Linear Regression: A Guidebook of Variable Importance

    Science.gov (United States)

    Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim

    2012-01-01

    Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to typically reflect overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…

  11. A comparative analysis of the effects of instructional design factors on student success in e-learning: multiple-regression versus neural networks

    Directory of Open Access Journals (Sweden)

    Halil Ibrahim Cebeci

    2009-12-01

    Full Text Available This study explores the relationship between student performance and instructional design. The research was conducted at the E-Learning School at a university in Turkey. A list of design factors that had potential influence on student success was created through a review of the literature and interviews with relevant experts. From this, the five most important design factors were chosen. The experts scored 25 university courses on the extent to which they demonstrated the chosen design factors. Multiple-regression and supervised artificial neural network (ANN) models were used to examine the relationship between student grade point averages and the scores on the five design factors. The results indicated that there is no statistical difference between the two models. Both models identified the use of examples and applications as the most influential factor. The ANN model provided more information and was used to predict the course-specific factor values required for a desired level of success.

  12. Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis

    Science.gov (United States)

    Kim, Rae Seon

    2011-01-01

    When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…

  13. Predicting share price by using Multiple Linear Regression.

    OpenAIRE

    Forslund, Gustaf; Åkesson, David

    2013-01-01

    The aim of the project was to design a multiple linear regression model and use it to predict the share’s closing price for 44 companies listed on the OMX Stockholm stock exchange’s Large Cap list. The model is intended to be used as a day trading guideline i.e. today’s information is used to predict tomorrow’s closing price. The regression was done in Microsoft Excel 2010[18] by using its built-in function LINEST. The LINEST-function uses the dependent variable y and all the covariat...

  14. Moderation analysis using a two-level regression model.

    Science.gov (United States)

    Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

    2014-10-01

    Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model. PMID:24337935
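
    The moderated multiple regression (MMR) model that this record takes as its baseline can be sketched in a few lines: the design matrix contains the predictor, the moderator, and their product, and the coefficients are estimated by least squares. The sketch below illustrates only that baseline, not the two-level model or the NML algorithm the authors propose; the function names and the noiseless toy data are hypothetical.

```python
def lstsq(rows, y):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting for numerical stability.
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [u - f * v for u, v in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def fit_mmr(x, z, y):
    """Fit y = b0 + b1*x + b2*z + b3*x*z; returns [b0, b1, b2, b3]."""
    rows = [[1.0, xi, zi, xi * zi] for xi, zi in zip(x, z)]
    return lstsq(rows, y)


def simple_slope(coefs, z_value):
    """Slope of y on x at a given moderator value: b1 + b3 * z."""
    return coefs[1] + coefs[3] * z_value


if __name__ == "__main__":
    x = [float(i) for i in range(4) for _ in range(3)]
    z = [float(j) for _ in range(4) for j in range(3)]
    y = [1 + 2 * a + 3 * b + 0.5 * a * b for a, b in zip(x, z)]
    coefs = fit_mmr(x, z, y)
    print([round(c, 6) for c in coefs], simple_slope(coefs, 2.0))
```

    A nonzero product-term coefficient b3 means the slope of y on x changes with z, which is what "moderation" refers to in this setting.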

  15. Testing Mediation Using Multiple Regression and Structural Equation Modeling Analyses in Secondary Data

    Science.gov (United States)

    Li, Spencer D.

    2011-01-01

    Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…

  16. The role of early adverse experience and adulthood stress in the prediction of neuroendocrine stress reactivity in women: a multiple regression analysis.

    Science.gov (United States)

    Heim, Christine; Newport, D Jeffrey; Wagner, Dieter; Wilcox, Molly M; Miller, Andrew H; Nemeroff, Charles B

    2002-01-01

    Sensitization of stress-responsive neurobiological systems as a possible consequence of early adverse experience has been implicated in the pathophysiology of mood and anxiety disorders. In addition to early adversities, adulthood stressors are also known to precipitate the manifestation of these disorders. The present study sought to evaluate the relative role of early adverse experience vs. stress experiences in adulthood in the prediction of neuroendocrine stress reactivity in women. A total of 49 women (normal volunteers, depressed patients, and women with a history of early abuse) underwent a battery of interviews and completed dimensional rating scales on stress experiences and psychopathology, and were subsequently exposed to a standardized psychosocial laboratory stressor. Outcome measures were plasma adrenocorticotropin (ACTH) and cortisol responses to the stress test. Multiple linear regression analyses were performed to identify the impact of demographic variables, childhood abuse, adulthood trauma, major life events in the past year, and daily hassles in the past month, as well as psychopathology on hormonal stress responsiveness. Peak ACTH responses to psychosocial stress were predicted by a history of childhood abuse, the number of separate abuse events, the number of adulthood traumas, and the severity of depression. Similar predictors were identified for peak cortisol responses. Although abused women reported more severe negative life events in adulthood than controls, life events did not affect neuroendocrine reactivity. The regression model explained 35% of the variance of ACTH responses. The interaction of childhood abuse and adulthood trauma was the most powerful predictor of ACTH responsiveness. Our findings suggest that a history of childhood abuse per se is related to increased neuroendocrine stress reactivity, which is further enhanced when additional trauma is experienced in adulthood. PMID:12001180

  17. Outlier Detection for Multivariate Multiple Regression in Y-direction

    OpenAIRE

    Paweena Tangjuang; Pachitjanut Siripanich

    2014-01-01

    This study focuses on outlier detection for Multivariate Multiple Regression in the Y-direction. We propose an alternative method based on the squared distances of the residuals. The proposed method refers to the robust estimates of location and covariance matrices derived from the squared distances of the residuals. The proposed method is compared to Mahalanobis Distance method, Minimum Covariance Determinant method and Minimum Volume Ellipsoid met...

  18. Logistic regression analysis of multiple noninvasive tests for the prediction of the presence and extent of coronary artery disease in men

    International Nuclear Information System (INIS)

    The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy.

  19. Introducing Evolutionary Computing in Regression Analysis

    Science.gov (United States)

    Olcay Akman

    A typical upper-level undergraduate or first-year graduate-level regression course syllabus treats model selection with various stepwise regression methods. Here we implement evolutionary computing for subset model selection and accomplish two goals: i) introduce students to the powerful optimization method of genetic algorithms, and ii) transform a regression analysis course into a regression and modeling course without requiring any additional time or software commitment. Furthermore, we employed the Akaike Information Criterion (AIC) as a measure of model fitness instead of another commonly used measure, R-square. The model selection tool uses Excel, which makes the procedure accessible to a very wide spectrum of interdisciplinary students with no specialized software requirement. An Excel macro, to be used as an instructional tool, is freely available through the author's website.
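
    Scoring candidate subsets by AIC can be sketched as below. Note the substitution: the abstract searches subsets with a genetic algorithm in Excel, whereas this sketch simply enumerates every subset, which is only practical for a handful of predictors. The data, the seed, and the helper names are hypothetical, and the AIC is the Gaussian form up to an additive constant (counting the error variance as a parameter would add the same 2 to every model).

```python
import itertools
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def aic(rows, y):
    """Gaussian-likelihood AIC up to a constant: n * ln(RSS / n) + 2k."""
    beta = ols(rows, y)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    n, k = len(y), len(beta)
    return n * math.log(rss / n) + 2 * k


# Hypothetical data: columns 0 and 1 drive y, column 2 is irrelevant.
rng = random.Random(0)
n = 40
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
y = [2 + 3 * r[0] - 2 * r[1] + rng.gauss(0, 0.5) for r in X]

# Exhaustive search over all subsets (stands in for the genetic algorithm).
scores = {}
for size in range(4):
    for subset in itertools.combinations(range(3), size):
        rows = [[1.0] + [r[j] for j in subset] for r in X]
        scores[subset] = aic(rows, y)

best = min(scores, key=scores.get)
print(best, round(scores[best], 2))
```

    A genetic algorithm would replace the two nested loops with a population of candidate subsets that are mutated and recombined, using the same `aic` function as the fitness measure.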

  20. Survival analysis and regression models

    OpenAIRE

    George, Brandon; Seals, Samantha; Aban, Inmaculada

    2014-01-01

    Time-to-event outcomes are common in medical research as they offer more information than simply whether or not an event occurred. To handle these outcomes, as well as censored observations where the event was not observed during follow-up, survival analysis methods should be used. Kaplan-Meier estimation can be used to create graphs of the observed survival curves, while the log-rank test can be used to compare curves from different groups. If it is desired to test continuous predictors or t...

  1. Multiple predictor smoothing methods for sensitivity analysis.

    Energy Technology Data Exchange (ETDEWEB)

    Helton, Jon Craig; Storlie, Curtis B.

    2006-08-01

    The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (1) locally weighted regression (LOESS), (2) additive models, (3) projection pursuit regression, and (4) recursive partitioning regression. The indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present.

  2. Multiple predictor smoothing methods for sensitivity analysis

    International Nuclear Information System (INIS)

    The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (1) locally weighted regression (LOESS), (2) additive models, (3) projection pursuit regression, and (4) recursive partitioning regression. The indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

  3. Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity

    OpenAIRE

    Amanda Kraha; Linda Zientek

    2012-01-01

    While multicollinearity may increase the difficulty of interpreting multiple regression results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret multiple regression effects include...

  4. Hot Resistance Estimation for Dry Type Transformer Using Multiple Variable Regression, Multiple Polynomial Regression and Soft Computing Techniques

    Directory of Open Access Journals (Sweden)

    M. Srinivasan

    2012-01-01

    Full Text Available Problem statement: This study presents a novel method for the determination of average winding temperature rise of transformers under its predetermined field operating conditions. Rise in the winding temperature was determined from the estimated values of winding resistance during the heat run test conducted as per IEC standard. Approach: The estimation of hot resistance was modeled using Multiple Variable Regression (MVR), Multiple Polynomial Regression (MPR) and soft computing techniques such as Artificial Neural Network (ANN) and Adaptive Neuro Fuzzy Inference System (ANFIS). The modeled hot resistance will help to find the load losses at any load situation without using complicated measurement set up in transformers. Results: These techniques were applied for the hot resistance estimation for dry type transformer by using the input variables cold resistance, ambient temperature and temperature rise. The results are compared and they show a good agreement between measured and computed values. Conclusion: According to our experiments, the proposed methods are verified using experimental results, which have been obtained from temperature rise test performed on a 55 kVA dry-type transformer.

  5. Bayesian latent variable models for median regression on multiple outcomes.

    Science.gov (United States)

    Dunson, David B; Watson, M; Taylor, Jack A

    2003-06-01

    Often a response of interest cannot be measured directly and it is necessary to rely on multiple surrogates, which can be assumed to be conditionally independent given the latent response and observed covariates. Latent response models typically assume that residual densities are Gaussian. This article proposes a Bayesian median regression modeling approach, which avoids parametric assumptions about residual densities by relying on an approximation based on quantiles. To accommodate within-subject dependency, the quantile response categories of the surrogate outcomes are related to underlying normal variables, which depend on a latent normal response. This underlying Gaussian covariance structure simplifies interpretation and model fitting, without restricting the marginal densities of the surrogate outcomes. A Markov chain Monte Carlo algorithm is proposed for posterior computation, and the methods are applied to single-cell electrophoresis (comet assay) data from a genetic toxicology study. PMID:12926714

  6. LOGISTIC REGRESSION ANALYSIS WITH STANDARDIZED MARKERS

    OpenAIRE

    Huang, Ying; Pepe, Margaret S.; Feng, Ziding

    2013-01-01

    Two different approaches to analysis of data from diagnostic biomarker studies are commonly employed. Logistic regression is used to fit models for probability of disease given marker values, while ROC curves and risk distributions are used to evaluate classification performance. In this paper we present a method that simultaneously accomplishes both tasks. The key step is to standardize markers relative to the nondiseased population before including them in the logistic reg...

  7. Multiple Regression Model Based Sequential Probability Ratio Test for Structural Change Detection of Time Series

    Science.gov (United States)

    Takeda, Katsunori; Hattori, Tetsuo; Kawano, Hiromichi

    In real-time analysis and forecasting of time series data, it is important to detect structural change as quickly, correctly, and simply as possible, so that the next prediction model can be rebuilt soon after the change point. For this kind of time series analysis, multiple linear regression models are generally used. In this paper, we present two methods, the Sequential Probability Ratio Test (SPRT) and the Chow test, which is well known in economics, and describe experimental evaluations of their effectiveness in change detection using multiple regression models. Moreover, we extend the definition of the detected change point in the SPRT method and show the improvement in change detection accuracy.
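
    The Chow test mentioned in this abstract can be sketched as follows. This is a simplified illustration rather than the authors' procedure: it uses a single time predictor instead of a multiple regression model, assumes the candidate break point is known, and runs on hypothetical series with a deterministic wiggle standing in for noise. The SPRT method is not shown.

```python
def fit_rss(t, y):
    """Residual sum of squares of the least-squares line y = a + b * t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    sxx = sum((ti - mt) ** 2 for ti in t)
    b = sum((ti - mt) * (yi - my) for ti, yi in zip(t, y)) / sxx
    a = my - b * mt
    return sum((yi - a - b * ti) ** 2 for ti, yi in zip(t, y))


def chow_f(t, y, split, k=2):
    """Chow F statistic for a break at index `split`; k = parameters per segment."""
    rss_pooled = fit_rss(t, y)
    rss_split = fit_rss(t[:split], y[:split]) + fit_rss(t[split:], y[split:])
    dof = len(t) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / dof)


# Hypothetical series: one stable, one whose slope changes at t = 20.
t = [float(i) for i in range(40)]
noise = [0.2 if i % 2 == 0 else -0.2 for i in range(40)]  # deterministic wiggle
stable = [1 + 0.5 * ti + e for ti, e in zip(t, noise)]
broken = [(1 + 0.5 * ti if ti < 20 else 11 + 2.0 * (ti - 20)) + e
          for ti, e in zip(t, noise)]

f_stable = chow_f(t, stable, 20)
f_broken = chow_f(t, broken, 20)
print(round(f_stable, 3), round(f_broken, 1))
```

    Under the usual assumptions the statistic is compared with an F distribution on k and n - 2k degrees of freedom; a large value suggests the two segments do not share one set of coefficients.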

  8. Multiple Regression Model for Compressive Strength Prediction of High Performance Concrete

    OpenAIRE

    Zain, M. F. M.; Abd, S. M.

    2009-01-01

    A mathematical model for the prediction of compressive strength of high performance concrete was performed using statistical analysis for the concrete data obtained from experimental work done in this study. The multiple non-linear regression model yielded excellent correlation coefficient for the prediction of compressive strength at different ages (3, 7, 14, 28 and 91 days). The coefficient of correlation was 99.99% for each strength (at each age). Also, the model gives high correlat...

  9. Neighborhood social capital and crime victimization: comparison of spatial regression analysis and hierarchical regression analysis.

    Science.gov (United States)

    Takagi, Daisuke; Ikeda, Ken'ichi; Kawachi, Ichiro

    2012-11-01

    Crime is an important determinant of public health outcomes, including quality of life, mental well-being, and health behavior. A body of research has documented the association between community social capital and crime victimization. The association between social capital and crime victimization has been examined at multiple levels of spatial aggregation, ranging from entire countries, to states, metropolitan areas, counties, and neighborhoods. In multilevel analysis, the spatial boundaries at level 2 are most often drawn from administrative boundaries (e.g., Census tracts in the U.S.). One problem with adopting administrative definitions of neighborhoods is that it ignores spatial spillover. We conducted a study of social capital and crime victimization in one ward of Tokyo city, using a spatial Durbin model with an inverse-distance weighting matrix that assigned each respondent a unique level of "exposure" to social capital based on all other residents' perceptions. The study is based on a postal questionnaire sent to residents of Arakawa Ward, Tokyo, aged 20-69 years. The response rate was 43.7%. We examined the contextual influence of generalized trust, perceptions of reciprocity, two types of social network variables, as well as two principal components of social capital (constructed from the above four variables). Our outcome measure was self-reported crime victimization in the last five years. In the spatial Durbin model, we found that neighborhood generalized trust, reciprocity, supportive networks and two principal components of social capital were each inversely associated with crime victimization. By contrast, a multilevel regression performed with the same data (using administrative neighborhood boundaries) found generally null associations between neighborhood social capital and crime. Spatial regression methods may be more appropriate for investigating the contextual influence of social capital in homogeneous cultural settings such as Japan. PMID:22901675

  10. Functional linear regression analysis for longitudinal data

    CERN Document Server

    Yao, Fang; Müller, Hans-Georg; Wang, Jane-Ling

    2005-01-01

    We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may be modeled as a discrete random number and, accordingly, only a finite and asymptotically nonincreasing number of measurements are available for each subject or experimental unit. We propose a functional regression approach for this situation, using functional principal component analysis, where we estimate the functional principal component scores through conditional expectations. This allows the prediction of an unobserved response trajectory from sparse measurements of a predictor trajectory. The resulting technique is flexible and allow...

  11. Assessment method of study program: Results from regression analysis

    Science.gov (United States)

    Hamid, Mohd Rashid Bin Ab; Mohamed, Mohd Rusllim Bin; Mustafa, Zainol

    2015-02-01

    Assessment is an important part of any university program. Various approaches to assessment have been used in determining students' grades for their subjects. This article therefore discusses an empirical study aimed at finding the best solution for determining student grades. Several predictors of the students' grades (i.e. total marks) were identified, such as coursework marks, mid-semester marks and final exam marks. Raw data from the database for a particular semester at one university on the east coast of Malaysia were used for this purpose. Correlational analysis was used to determine the strength of the association between the three predictors and the criterion variable. Multiple regression analysis was also used to find the best regression model for the purpose of the study. Implications of the study are also discussed.

  12. Regression Analysis for the Social Sciences

    CERN Document Server

    Gordon, Rachel A.

    2012-01-01

    The book provides graduate students in the social sciences with the basic skills that they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly-available social science data or drawn from the literature; thorough integration of teaching statistical theory with teaching data processing and analysis; teaching of both SAS and Stata "side-by-side" and use of chapter exercises in which students practice programming

  13. An Effect Size for Regression Predictors in Meta-Analysis

    Science.gov (United States)

    Aloe, Ariel M.; Becker, Betsy Jane

    2012-01-01

    A new effect size representing the predictive power of an independent variable from a multiple regression model is presented. The index, denoted as r[subscript sp], is the semipartial correlation of the predictor with the outcome of interest. This effect size can be computed when multiple predictor variables are included in the regression model…
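
    For the two-predictor case, the semipartial correlation described here can be computed from pairwise correlations alone. The sketch below is an illustration on hypothetical toy data, not the authors' meta-analytic procedure; the function names are assumptions. It also checks the standard identity that the squared semipartial correlation equals the gain in R-squared from adding the predictor.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def semipartial(y, x1, x2):
    """r_sp of x1 with y, with x2 removed from x1 only (two-predictor closed form)."""
    ry1, ry2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (ry1 - ry2 * r12) / math.sqrt(1 - r12 ** 2)


def r_squared(y, x1, x2):
    """R^2 of y regressed on x1 and x2, from the pairwise correlations."""
    ry1, ry2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (ry1 ** 2 + ry2 ** 2 - 2 * ry1 * ry2 * r12) / (1 - r12 ** 2)


# Hypothetical toy data, not taken from the article.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 12.2, 14.8, 16.1]

r_sp = semipartial(y, x1, x2)
gain = r_squared(y, x1, x2) - pearson(y, x2) ** 2  # R^2 increase from adding x1
print(round(r_sp, 4), round(gain, 4))
```

    With more than two predictors the same quantity is usually obtained as the square root of the R-squared gain, signed by the predictor's coefficient.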

  14. Sliced Inverse Regression for big data analysis

    OpenAIRE

    Kevin, Li

    2014-01-01

    Modern advances in computing power have greatly widened scientists' scope in gathering and investigating information from many variables. We describe sliced inverse regression (SIR), for reducing the dimension of the input variable x without going through any parametric or nonparametric model-fitting process. This method explores the simplicity of the inverse view of regression. Instead of regressing the univariate output variable y against the multivariate x, we regress x against y. Forward r...

  15. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    Science.gov (United States)

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

  16. A critical assessment of shrinkage-based regression approaches for estimating the adverse health effects of multiple air pollutants

    Science.gov (United States)

    Roberts, Steven; Martin, Michael

    Most investigations of the adverse health effects of multiple air pollutants analyse the time series involved by simultaneously entering the multiple pollutants into a Poisson log-linear model. Concerns have been raised about this type of analysis, and it has been stated that new methodology or models should be developed for investigating the adverse health effects of multiple air pollutants. In this paper, we introduce the use of the lasso for this purpose and compare its statistical properties to those of ridge regression and the Poisson log-linear model. Ridge regression has been used in time series analyses on the adverse health effects of multiple air pollutants but its properties for this purpose have not been investigated. A series of simulation studies was used to compare the performance of the lasso, ridge regression, and the Poisson log-linear model. In these simulations, realistic mortality time series were generated with known air pollution mortality effects permitting the performance of the three models to be compared. Both the lasso and ridge regression produced more accurate estimates of the adverse health effects of the multiple air pollutants than those produced using the Poisson log-linear model. This increase in accuracy came at the expense of increased bias. Ridge regression produced more accurate estimates than the lasso, but the lasso produced more interpretable models. The lasso and ridge regression offer a flexible way of obtaining more accurate estimation of pollutant effects than that provided by the standard Poisson log-linear model.
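
    The shrinkage that ridge regression applies to correlated predictors can be sketched as below. This is a deliberately simplified stand-in: it uses a Gaussian linear model with two centered predictors rather than the Poisson log-linear time series model of the abstract, and the data, seed, and penalty value are hypothetical. The lasso is not shown because it has no closed form in this setting.

```python
import math
import random


def ridge_2d(x1, x2, y, lam):
    """Ridge coefficients for two centered predictors: solve (X'X + lam*I) b = X'y."""
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    # Cramer's rule for the 2x2 system.
    return ((t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det)


def center(v):
    """Subtract the mean so that no intercept is needed."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def norm(b):
    """Euclidean length of a coefficient pair."""
    return math.hypot(b[0], b[1])


rng = random.Random(1)
base = [rng.gauss(0, 1) for _ in range(50)]
# Two highly correlated hypothetical "pollutant" series and a Gaussian outcome.
x1 = center([b + rng.gauss(0, 0.1) for b in base])
x2 = center([b + rng.gauss(0, 0.1) for b in base])
y = center([1.0 * a + 1.0 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)])

b_ols = ridge_2d(x1, x2, y, 0.0)    # lam = 0 reduces to ordinary least squares
b_ridge = ridge_2d(x1, x2, y, 5.0)  # lam > 0 shrinks the coefficient vector
print(b_ols, b_ridge)
print(norm(b_ols), norm(b_ridge))
```

    The penalty trades some bias for lower variance, which is the trade-off the abstract reports when it notes that the gain in accuracy came at the expense of increased bias.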

  17. Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity

    Directory of Open Access Journals (Sweden)

    Amanda Kraha

    2012-03-01

    Full Text Available While multicollinearity may increase the difficulty of interpreting multiple regression results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret multiple regression effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret multiple regression effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses.

  18. Throughput Prediction of Fishing Goods Based on the Grey Multiple Linear Regression Method

    OpenAIRE

    Changping Chen; Changlu Zhou; Xueda Zhao; Yanna Zheng; Xianying Shi

    2014-01-01

    Based on the grey prediction method and multiple linear regression method, the grey multiple linear regression method was presented. This method was applied to the throughput prediction of fishing goods according to five fishing ports’ actual throughput data. The result of comparing the calculating conclusion to the time series one-dimensional linear regression method and grey prediction method proved that the method of calculation and analyzing was more effective and the forecasting precis...

  19. A Software Tool for Regression Analysis and its Assumptions

    OpenAIRE

    Sona Mardikyan; Darcan, Osman N.

    2006-01-01

    Nowadays, among the forecasting methods, the most important one is the regression analysis. In this method, the aim is to estimate the population regression model as accurately as possible, taking the sample regression function as a basis. Its results are valid under certain assumptions, and violations of these assumptions invalidate some properties of the estimators. In this study, a new object-oriented program concentrated only on the regression analysis and its assumptions has be...

  20. MULTIPLE LOGISTIC REGRESSION MODEL TO PREDICT RISK FACTORS OF ORAL HEALTH DISEASES

    Directory of Open Access Journals (Sweden)

    Parameshwar V. Pandit

    2012-06-01

    Full Text Available Purpose: To analyze the dependence of oral health diseases, i.e. dental caries and periodontal disease, on a number of risk factors through the application of a logistic regression model. Method: The cross-sectional study involves a systematic random sample of 1760 permanent dentition aged between 18-40 years in Dharwad, Karnataka, India. Dharwad is situated in North Karnataka. The mean age was 34.26±7.28. The risk factors of dental caries and periodontal disease were established by a multiple logistic regression model using SPSS statistical software. Results: Factors such as frequency of brushing, timings of cleaning teeth and type of toothpastes are significant persistent predictors of dental caries and periodontal disease. The log likelihood value of the full model is –1013.1364 and Akaike’s Information Criterion (AIC) is 1.1752, as compared to -1019.8106 and 1.1748 respectively for the reduced regression model, for dental caries. The log likelihood value of the full model is –1085.7876 and AIC is 1.2577, followed by -1019.8106 and 1.1748 respectively for the reduced regression model, for periodontal disease. The area under the Receiver Operating Characteristic (ROC) curve for dental caries is 0.7509 (full model) and 0.7447 (reduced model); the ROC for periodontal disease is 0.6128 (full model) and 0.5821 (reduced model). Conclusions: The frequency of brushing, timings of cleaning teeth and type of toothpastes are the main significant risk factors of dental caries and periodontal disease. The reduced logistic regression model fits slightly better than the full logistic regression model in identifying these risk factors for both dichotomous dental caries and periodontal disease.

  1. Spatial regression analysis on 32 years total column ozone data

    Directory of Open Access Journals (Sweden)

    J. S. Knibbe

    2014-02-01

    Full Text Available Multiple-regressions analysis have been performed on 32 years of total ozone column data that was spatially gridded with a 1° × 1.5° resolution. The total ozone data consists of the MSR (Multi Sensor Reanalysis; 1979–2008) and two years of assimilated SCIAMACHY ozone data (2009–2010). The two-dimensionality in this data-set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on non-seasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Nino (ENSO) and stratospheric alternative halogens (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to that of similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at high and mid-latitudes, the solar cycle affects ozone positively mostly at the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high Northern latitudes, the effect of QBO is positive and negative at the tropics and mid to high-latitudes respectively and ENSO affects ozone negatively between 30° N and 30° S, particularly at the Pacific.
The contribution of explanatory variables describing seasonal ozone variation is generally large at mid to high latitudes. We observe ozone contributing effects for potential vorticity and day length, negative effect on ozone for geopotential height and variable ozone effects due to the polar vortex at regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. In particular the recovery rates over Antarctica might not be statistically significant. Furthermore, the results show that there is no spatial homogeneous pattern which regression model and explanatory variables provide the best fit to the data and the most accurate estimates of the recovery rates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.

  2. Spatial regression analysis on 32 years total column ozone data

    Science.gov (United States)

    Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.

    2014-02-01

    Multiple regression analyses have been performed on 32 years of total ozone column data that were spatially gridded at a 1° × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and two years of assimilated SCIAMACHY ozone data (2009-2010). The two-dimensionality of this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on non-seasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño (ENSO) and stratospheric alternative halogens (EESC). For several explanatory variables, seasonally adjusted versions are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to those of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at high and mid-latitudes; the solar cycle affects ozone positively, mostly in the Southern Hemisphere; stratospheric aerosols affect ozone negatively at high northern latitudes; the effect of the QBO is positive in the tropics and negative at mid to high latitudes; and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific. The contribution of explanatory variables describing seasonal ozone variation is generally large at mid to high latitudes. We observe ozone-contributing effects for potential vorticity and day length, a negative effect on ozone for geopotential height, and variable ozone effects due to the polar vortex in regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. In particular, the recovery rates over Antarctica might not be statistically significant. Furthermore, the results show that there is no spatially homogeneous pattern in which regression model and explanatory variables provide the best fit to the data and the most accurate estimates of the recovery rates. Overall, these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.

  3. A Comparison between the Linear Neural Network Method and the Multiple Linear Regression Method in the Modeling of Continuous Data

    Directory of Open Access Journals (Sweden)

    Guoli Wang

    2011-10-01

    Full Text Available Both linear neural network and multiple linear regression models can be used for multi-factor analysis and forecasting, but the data of the multiple linear regression model are required to meet such conditions as independence and normality, while the data of the linear neural network are only required to have a linear relationship. This article uses the same set of data to establish respectively a linear neural network model and a multiple linear regression model, compares the abilities of fitting and forecasting of the two kinds of models, and consequently, comes to the conclusion that the linear neural network method has a stronger fitting ability and a more stable ability of prediction so that it can be further applied and promoted in the analyzing and forecasting of continuous data factors.

  4. Representation of exposures in regression analysis and interpretation of regression coefficients: basic concepts and pitfalls.

    Science.gov (United States)

    Leffondré, Karen; Jager, Kitty J; Boucquemont, Julie; Stel, Vianda S; Heinze, Georg

    2014-10-01

    Regression models are being used to quantify the effect of an exposure on an outcome, while adjusting for potential confounders. While the type of regression model to be used is determined by the nature of the outcome variable, e.g. linear regression has to be applied for continuous outcome variables, all regression models can handle any kind of exposure variables. However, some fundamentals of representation of the exposure in a regression model and also some potential pitfalls have to be kept in mind in order to obtain meaningful interpretation of results. The objective of this educational paper was to illustrate these fundamentals and pitfalls, using various multiple regression models applied to data from a hypothetical cohort of 3000 patients with chronic kidney disease. In particular, we illustrate how to represent different types of exposure variables (binary, categorical with two or more categories and continuous), and how to interpret the regression coefficients in linear, logistic and Cox models. We also discuss the linearity assumption in these models, and show how wrongly assuming linearity may produce biased results and how flexible modelling using spline functions may provide better estimates. PMID:24366898
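
    As an illustration of the exposure coding this abstract describes, the following is a minimal Python sketch of reference-cell (dummy) coding for a categorical exposure. The function name `dummy_code` and the stage labels are hypothetical and are not taken from the paper; the sketch only builds the indicator columns and does not fit any model.

```python
def dummy_code(values, reference):
    """Reference-cell coding: one 0/1 indicator per non-reference level."""
    if reference not in values:
        raise ValueError("reference level does not occur in the data")
    # Every level except the reference gets its own indicator column.
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


# Hypothetical categorical exposure (labels are illustrative only).
exposure = ["stage3", "stage4", "stage5", "stage3", "stage5"]
levels, design = dummy_code(exposure, reference="stage3")
for label, row in zip(exposure, design):
    print(label, dict(zip(levels, row)))
```

    With this coding, the coefficient of each indicator contrasts that category with the reference category: a difference in means in a linear model, a log odds ratio in a logistic model, and a log hazard ratio in a Cox model.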

  5. Tucker Tensor Regression and Neuroimaging Analysis

    OpenAIRE

    Li, Xiaoshan; Zhou, Hua; Li, Lexin

    2013-01-01

    Large-scale neuroimaging studies have been collecting brain images of study individuals, which take the form of two-dimensional, three-dimensional, or higher dimensional arrays, also known as tensors. Addressing scientific questions arising from such data demands new regression models that take multidimensional arrays as covariates. Simply turning an image array into a long vector causes extremely high dimensionality that compromises classical regression methods, and, more s...

  6. Simultaneous Multiple Response Regression and Inverse Covariance Matrix Estimation via Penalized Gaussian Maximum Likelihood

    OpenAIRE

    Lee, Wonyul; Liu, Yufeng

    2012-01-01

    Multivariate regression is a common statistical tool for practical problems. Many multivariate regression techniques are designed for univariate response cases. For problems with multiple response variables available, one common approach is to apply the univariate response regression technique separately on each response variable. Although it is simple and popular, the univariate response approach ignores the joint information among response variables. In this paper, we propose three new meth...

  7. Multiple Logistic Regression Analysis of Risk Factors Associated with Denture Plaque and Staining in Chinese Removable Denture Wearers over 40 Years Old in Xi’an – a Cross-Sectional Study

    Science.gov (United States)

    Chai, Zhiguo; Chen, Jihua; Zhang, Shaofeng

    2014-01-01

    Background: Removable dentures are subject to plaque and/or staining problems. Denture hygiene habits and risk factors differ among countries and regions. The aims of this study were to assess hygiene habits and denture plaque and staining risk factors in Chinese removable denture wearers aged >40 years in Xi’an through multiple logistic regression analysis (MLRA). Methods: Questionnaires were administered to 222 patients whose removable dentures were examined clinically to assess wear status and levels of plaque and staining. Univariate analyses were performed to identify potential risk factors for denture plaque/staining. MLRA was performed to identify significant risk factors. Results: Brushing (77.93%) was the most prevalent cleaning method in the present study. Only 16.4% of patients regularly used commercial cleansers. Most (81.08%) patients removed their dentures overnight. MLRA indicated that potential risk factors for denture plaque were the duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 4.155, P = 0.001; >5 years: OR = 7.238, P[…] cleaning method (reference, chemical cleanser; running water: OR = 7.081, P = 0.010; brushing: OR = 3.567, P = 0.005). Potential risk factors for denture staining were female gender (OR = 0.377, P = 0.013), smoking (OR = 5.471, P = 0.031), tea consumption (OR = 3.957, P = 0.002), denture scratching (OR = 4.557, P = 0.036), duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 7.899, P = 0.001; >5 years: OR = 27.226, P[…] cleaning method (reference, chemical cleanser; running water: OR = 29.184, P[…] Denture hygiene habits need further improvement. An understanding of the risk factors for denture plaque and staining may provide the basis for preventive efforts. PMID:24498369

  8. Modeling Lateral and Longitudinal Control of Human Drivers with Multiple Linear Regression Models

    OpenAIRE

    Lenk, Jan; M, Claus

    2011-01-01

    In this paper, we describe results to model lateral and longitudinal control behavior of drivers with simple linear multiple regression models. This approach fits into the Bayesian Programming (BP) approach (Bessi

  9. Comparison of Fuzzy Inference System and Multiple Regression to Predict Synthetic Envelopes Clogging

    OpenAIRE

    Bakhtiar Karimi; Farhad Mirzaei; Mohammad Javad Nahvinia; Behnam Ababaei

    2010-01-01

    Geo-synthetic materials are being used with acceptable performance in soil and water projects worldwide. Geotextiles are one of the categories of geo-synthetics used in drainage systems. The first generation of geotextiles was used in the late 1950s as an alternative to gravel envelopes. In this research, two methods (multiple regression and a fuzzy inference system) are evaluated for predicting synthetic envelope clogging. In the multiple regression method the correlation coefficients for PP450, PP700 ...

  10. Landslide Susceptibility Mapping Using Multiple Regression and GIS Tools in Tajan Basin, North of Iran

    Directory of Open Access Journals (Sweden)

    Somayeh Mashari

    2012-07-01

    Full Text Available Landslides are a natural hazard that causes considerable damage to the environment. Depending on the landform, several factors can cause a landslide. This research addresses the methodology for landslide susceptibility mapping using multiple regression analysis and GIS tools. Based on the initial hypothesis, ten factors were recognized as influencing landslides, namely geology, slope, aspect, distance from roads, faults and drainage network, soil capability, land use and rainfall. Crossing the investigated parameters with the observed landslides indicated that three factors, namely distance from channel network, distance from fault and rainfall, have no major effect on observed landslides in the Tajan area. In order to quantify the parameters in the form of weighting factors, the coverage of landslides in different observations was determined. The stepwise method was then used for statistical analysis. It was found that slope, aspect, distance from roads and soil capability are, in that order, the most effective factors in landslides.

  11. Neutron multiplicity analysis tool

    International Nuclear Information System (INIS)

    I describe the capabilities of the EXCOM (EXcel based COincidence and Multiplicity) calculation tool, which is used to analyze experimental or simulated neutron multiplicity data. The input to the program is the count-rate data (including the multiplicity distribution) for a measurement, the isotopic composition of the sample and relevant dates. The program carries out deadtime correction and background subtraction and then performs a number of analyses. These are: passive calibration curve, known alpha and multiplicity analysis. The latter is done with both the point model and the weighted point model. In the current application EXCOM carries out the rapid analysis of Monte Carlo calculated quantities and allows the user to determine the magnitude of sample perturbations that lead to systematic errors. Neutron multiplicity counting is an assay method used in the analysis of plutonium for safeguards applications. It is widely used in nuclear material accountancy by international (IAEA) and national inspectors. The method uses the measurement of the correlations in a pulse train to extract information on the spontaneous fission rate in the presence of neutrons from (α,n) reactions and induced fission. The measurement is relatively simple to perform and gives results very quickly (≤ 1 hour). By contrast, destructive analysis techniques are extremely costly and time consuming (several days). By improving the achievable accuracy of neutron multiplicity counting, a nondestructive analysis technique, it could be possible to reduce the use of destructive analysis measurements required in safeguards applications. The accuracy of a neutron multiplicity measurement can be affected by a number of variables such as density, isotopic composition, chemical composition and moisture in the material. In order to determine the magnitude of these effects on the measured plutonium mass, a calculational tool, EXCOM, has been produced using VBA within Excel. This program was developed to help speed the analysis of Monte Carlo neutron transport simulation (MCNP) data, and only requires the count-rate data to calculate the mass of material using INCC's analysis methods instead of the full neutron multiplicity distribution required to run analysis in INCC. This paper describes what is implemented within EXCOM, including the methods used, how the program corrects for deadtime, and how uncertainty is calculated. This paper also describes how to use EXCOM within Excel.

  12. Linear regression analysis for comparing two measurers or methods of measurement: but which regression?

    Science.gov (United States)

    Ludbrook, John

    2010-07-01

    1. There are two reasons for wanting to compare measurers or methods of measurement. One is to calibrate one method or measurer against another; the other is to detect bias. Fixed bias is present when one method gives higher (or lower) values across the whole range of measurement. Proportional bias is present when one method gives values that diverge progressively from those of the other. 2. Linear regression analysis is a popular method for comparing methods of measurement, but the familiar ordinary least squares (OLS) method is rarely acceptable. The OLS method requires that the x values are fixed by the design of the study, whereas it is usual that both y and x values are free to vary and are subject to error. In this case, special regression techniques must be used. 3. Clinical chemists favour techniques such as major axis regression ('Deming's method'), the Passing-Bablok method or the bivariate least median squares method. Other disciplines, such as allometry, astronomy, biology, econometrics, fisheries research, genetics, geology, physics and sports science, have their own preferences. 4. Many Monte Carlo simulations have been performed to try to decide which technique is best, but the results are almost uninterpretable. 5. I suggest that pharmacologists and physiologists should use ordinary least products regression analysis (geometric mean regression, reduced major axis regression): it is versatile, can be used for calibration or to detect bias and can be executed by hand-held calculator or by using the loss function in popular, general-purpose, statistical software. PMID:20337658
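
    The ordinary least products fit recommended in point 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's procedure: it uses the standard geometric mean (reduced major axis) estimates, where the slope is the ratio of the sample standard deviations with the sign of the covariance and the line passes through the point of means. The function name `olp_regression` and the paired readings are hypothetical, and no confidence intervals are computed.

```python
import math
from statistics import mean, stdev


def olp_regression(x, y):
    """Ordinary least products (geometric mean / reduced major axis) fit.

    Returns (intercept, slope) for the line y = intercept + slope * x.
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need paired samples of equal length (n >= 3)")
    mx, my = mean(x), mean(y)
    # The slope takes its sign from the sample covariance of x and y.
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; slope sign is undefined")
    slope = math.copysign(stdev(y) / stdev(x), sxy)
    # The fitted line passes through the point of means.
    intercept = my - slope * mx
    return intercept, slope


# Hypothetical paired readings from two methods (illustrative values only).
method_a = [1.0, 2.0, 3.0, 4.0, 5.0]
method_b = [2.1, 3.9, 6.2, 7.8, 10.0]
intercept, slope = olp_regression(method_a, method_b)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

    In the terms used in the abstract, an intercept that differs from 0 points to fixed bias and a slope that differs from 1 points to proportional bias; judging either requires interval estimates, which this sketch leaves out.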

  13. Multinomial Inverse Regression for Text Analysis

    OpenAIRE

    Taddy, Matt

    2010-01-01

    Text data, including speeches, stories, and other document forms, are often connected to sentiment variables that are of interest for research in marketing, economics, and elsewhere. It is also very high dimensional and difficult to incorporate into statistical analyses. This article introduces a straightforward framework of sentiment-preserving dimension reduction for text data. Multinomial inverse regression is introduced as a general tool for simplifying predictor sets th...

  14. Multiple predictor smoothing methods for sensitivity analysis: Example results

    International Nuclear Information System (INIS)

    The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described in the first part of this presentation: (i) locally weighted regression (LOESS), (ii) additive models, (iii) projection pursuit regression, and (iv) recursive partitioning regression. In this, the second and concluding part of the presentation, the indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

  15. Multiple predictor smoothing methods for sensitivity analysis: Description of techniques

    International Nuclear Information System (INIS)

    The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (i) locally weighted regression (LOESS), (ii) additive models, (iii) projection pursuit regression, and (iv) recursive partitioning regression. Then, in the second and concluding part of this presentation, the indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

  16. The Precision Efficacy Analysis for Regression Sample Size Method.

    Science.gov (United States)

    Brooks, Gordon P.; Barcikowski, Robert S.

    The general purpose of this study was to examine the efficiency of the Precision Efficacy Analysis for Regression (PEAR) method for choosing appropriate sample sizes in regression studies used for precision. The PEAR method, which is based on the algebraic manipulation of an accepted cross-validity formula, essentially uses an effect size to…

  17. A multilevel regression-analysis-based nonlocal means denoising algorithm

    Science.gov (United States)

    Xu, Jin; Zheng, Pengcheng; Lv, Rui

    2011-06-01

    This paper focuses on image denoising within the powerful framework of non-local means (NL-means). First, the introduction and development of NL-means are discussed. Second, a powerful scheme based on linear regression analysis for the classification of meaningful image parts is proposed. Third, an improved version of NL-means is developed, which uses a novel patch similarity rule based on quadratic regression analysis. This multilevel regression-analysis-based algorithm can better describe and smooth the noisy image, and experimental results validate the algorithm in both effectiveness and efficiency.

  18. Forest Loss Triggers in Cameroon: A Quantitative Assessment Using Multiple Linear Regression Approach

    Directory of Open Access Journals (Sweden)

    Epule Terence Epule

    2011-08-01

    Full Text Available The triggers of forest area loss in Cameroon have not been properly understood. The measures used to curb forest area loss have been simplistic and generalized, with no clear-cut knowledge of the specific role of different potential factors. This study investigates the hypothesis that population growth is the main cause of loss in forest area and identifies which factors are of more significance in the causal equation. The open-source R programming software has been used to produce multiple linear regression models. The correlation between the dependent variable and the independent variables was established by a correlation matrix, and the strength of the models was tested by power analysis. The results support the hypothesis that population growth is the most dominant cause of deforestation in Cameroon, while arable production and permanent crop land and the arable production per capita index are second and third, respectively.

  19. Transformation of nitrogen dioxide into ozone and prediction of ozone concentrations using multiple linear regression techniques.

    Science.gov (United States)

    Ghazali, Nurul Adyani; Ramli, Nor Azam; Yahaya, Ahmad Shukri; Yusof, Noor Faizah Fitri M D; Sansuddin, Nurulilyana; Al Madhoun, Wesam Ahmed

    2010-06-01

    Analysis and forecasting of air quality parameters are important topics of atmospheric and environmental research today due to the health impact caused by air pollution. This study examines the transformation of nitrogen dioxide (NO2) into ozone (O3) in an urban environment using a time series plot. Data on the concentration of environmental pollutants and meteorological variables were employed to predict the concentration of O3 in the atmosphere. The possibility of employing multiple linear regression models as a tool for prediction of O3 concentration was tested. Results indicated that the presence of NO2 and sunshine influences the concentration of O3 in Malaysia. The influence of the previous hour's ozone on the next hour's concentration was also demonstrated. PMID:19440846

  20. The Determination of Polyethylene Glycol and Water in Archaeological Wood using Infrared Spectroscopy and Stepwise Multiple Linear Regression

    Directory of Open Access Journals (Sweden)

    Rohan PATEL

    2012-03-01

    Full Text Available Polyethylene glycol (PEG) is the most common preservative in use for bulking and maintaining structural integrity in waterlogged wood. Conservators therefore have a need to be able to determine PEG concentrations in wood in a non-destructive manner. We present a study highlighting the application of infrared spectroscopy coupled with multivariate analysis techniques to predict the concentration of polyethylene glycol 400 (PEG-400) and water simultaneously. This technique uses attenuated total reflectance (ATR) spectroscopy and unconstrained stepwise multiple linear regression (SMLR) analysis for prediction of multiple components in archaeological wood. Using this model we have calculated the concentration of PEG-400 and water in treated archaeological waterlogged wood samples.

  1. DETERMINATION OF POLYCHLORINATED BIPHENYLS USING MULTIPLE REGRESSION WITH OUTLIER DETECTION AND ELIMINATION

    Science.gov (United States)

    A method for the analysis of capillary column polychlorinated biphenyl (PCB) data using regression analysis with outlier checking and elimination, COMSTAR, is presented and evaluated. This algorithm determines the combination of the commercial PCB mixtures which best fits the...

  2. Confidence intervals after multiple imputation: combining profile likelihood information from logistic regressions.

    Science.gov (United States)

    Heinze, Georg; Ploner, Meinhard; Beyea, Jan

    2013-12-20

    In the logistic regression analysis of a small-sized, case-control study on Alzheimer's disease, some of the risk factors exhibited missing values, motivating the use of multiple imputation. Usually, Rubin's rules (RR) for combining point estimates and variances would then be used to estimate (symmetric) confidence intervals (CIs), on the assumption that the regression coefficients were distributed normally. Yet, rarely is this assumption tested, with or without transformation. In analyses of small, sparse, or nearly separated data sets, such symmetric CIs may not be reliable. Thus, RR alternatives have been considered, for example, Bayesian sampling methods, but not yet those that combine profile likelihoods, particularly penalized profile likelihoods, which can remove first order biases and guarantee convergence of parameter estimation. To fill the gap, we consider the combination of penalized likelihood profiles (CLIP) by expressing them as posterior cumulative distribution functions (CDFs) obtained via a chi-squared approximation to the penalized likelihood ratio statistic. CDFs from multiple imputations can then easily be averaged into a combined CDF_c, allowing confidence limits for a parameter β at level 1 − α to be identified as those β* and β** that satisfy CDF_c(β*) = α/2 and CDF_c(β**) = 1 − α/2. We demonstrate that the CLIP method outperforms RR in analyzing both simulated data and data from our motivating example. CLIP can also be useful as a confirmatory tool, should it show that the simpler RR are adequate for extended analysis. We also compare the performance of CLIP to Bayesian sampling methods using Markov chain Monte Carlo. CLIP is available in the R package logistf. PMID:23873477
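
    The combination step described in this abstract (average the per-imputation CDFs, then read off the two values where the combined CDF equals α/2 and 1 − α/2) can be sketched in Python as follows. This is only an illustration of the averaging and root-finding idea, not the logistf implementation: the per-imputation CDFs here are normal stand-ins from `statistics.NormalDist` with made-up means and standard deviations, whereas the paper derives them from penalized likelihood profiles. All function names are hypothetical.

```python
from statistics import NormalDist


def combined_cdf(cdfs, value):
    """Average the per-imputation CDFs at one parameter value."""
    return sum(cdf(value) for cdf in cdfs) / len(cdfs)


def solve_quantile(cdfs, prob, lo, hi, tol=1e-10):
    """Bisection for the value where the combined CDF equals prob.

    Assumes the combined CDF is non-decreasing and reaches prob in [lo, hi].
    """
    if not combined_cdf(cdfs, lo) <= prob <= combined_cdf(cdfs, hi):
        raise ValueError("prob is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if combined_cdf(cdfs, mid) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def combined_interval(cdfs, alpha=0.05, lo=-50.0, hi=50.0):
    """Limits where the combined CDF equals alpha/2 and 1 - alpha/2."""
    lower = solve_quantile(cdfs, alpha / 2, lo, hi)
    upper = solve_quantile(cdfs, 1 - alpha / 2, lo, hi)
    return lower, upper


# Assumption: normal CDFs stand in for the profile-based CDFs of three
# imputed data sets; the (mean, sd) pairs are illustrative only.
imputation_cdfs = [
    NormalDist(mu, sigma).cdf
    for mu, sigma in [(0.8, 0.40), (1.0, 0.50), (1.1, 0.45)]
]
lower, upper = combined_interval(imputation_cdfs)
print(f"combined 95% limits: ({lower:.3f}, {upper:.3f})")
```

    Because the limits come from the averaged CDF rather than from a pooled mean and variance, the resulting interval need not be symmetric around the point estimate.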

  3. The Synthesis of Regression Slopes in Meta-Analysis

    OpenAIRE

    Becker, Betsy Jane; Wu, Meng-jia

    2008-01-01

    Research on methods of meta-analysis (the synthesis of related study results) has dealt with many simple study indices, but less attention has been paid to the issue of summarizing regression slopes. In part this is because of the many complications that arise when real sets of regression models are accumulated. We outline the complexities involved in synthesizing slopes, describe existing methods of analysis and present a multivariate generalized least squares approach to t...

  4. Principal Regression Analysis and the index leverage effect

    OpenAIRE

    Reigneron, Pierre-alain; Allez, Romain; Bouchaud, Jean-philippe

    2010-01-01

    We revisit the index leverage effect, that can be decomposed into a volatility effect and a correlation effect. We investigate the latter using a matrix regression analysis, that we call `Principal Regression Analysis' (PRA) and for which we provide some analytical (using Random Matrix Theory) and numerical benchmarks. We find that downward index trends increase the average correlation between stocks (as measured by the most negative eigenvalue of the conditional correlation...

  5. On two flexible methods of 2-dimensional regression analysis.

    Czech Academy of Sciences Publication Activity Database

    Volf, Petr

    2012-01-01

    Vol. 18, No. 4 (2012), pp. 154-164. ISSN 1803-9782. Other grants: GA ČR (CZ) GAP209/10/2045. Institutional support: RVO:67985556. Keywords: regression analysis * Gordon surface * prediction error * projection pursuit. Subject RIV: BB - Applied Statistics, Operational Research. http://library.utia.cas.cz/separaty/2013/SI/volf-on two flexible methods of 2-dimensional regression analysis.pdf

  6. Overdispersed logistic regression for SAGE: Modelling multiple groups and covariates

    OpenAIRE

    Morris Jeffrey S; Deng Li; Baggerly Keith A; Marcelo, Aldaz C.

    2004-01-01

    Background: Two major identifiable sources of variation in data derived from the Serial Analysis of Gene Expression (SAGE) are within-library sampling variability and between-library heterogeneity within a group. Most published methods for identifying differential expression focus on just the sampling variability. In recent work, the problem of assessing differential expression between two groups of SAGE libraries has been addressed by introducing a beta-binomial hierarchical model th...

  7. Benchmark Dose Analysis via Nonparametric Regression Modeling.

    Science.gov (United States)

    Piegorsch, Walter W; Xiong, Hui; Bhattacharya, Rabi N; Lin, Lizhen

    2013-05-17

    Estimation of benchmark doses (BMDs) in quantitative risk assessment traditionally is based upon parametric dose-response modeling. It is a well-known concern, however, that if the chosen parametric model is uncertain and/or misspecified, inaccurate and possibly unsafe low-dose inferences can result. We describe a nonparametric approach for estimating BMDs with quantal-response data based on an isotonic regression method, and also study use of corresponding, nonparametric, bootstrap-based confidence limits for the BMD. We explore the confidence limits' small-sample properties via a simulation study, and illustrate the calculations with an example from cancer risk assessment. It is seen that this nonparametric approach can provide a useful alternative for BMD estimation when faced with the problem of parametric model uncertainty. PMID:23683057
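
    The isotonic regression step mentioned in this abstract can be illustrated with the pool-adjacent-violators algorithm, which produces the weighted least-squares fit that is non-decreasing in dose. The Python sketch below is a generic implementation under that assumption, not the authors' code; the dose groups and counts are made up, and the later steps (inverting the fitted curve at a benchmark response and bootstrapping confidence limits) are not shown.

```python
def isotonic_fit(values, weights=None):
    """Pool-adjacent-violators: weighted least-squares non-decreasing fit."""
    if weights is None:
        weights = [1.0] * len(values)
    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand each pooled block back to one fitted value per input point.
    fitted = []
    for m, _, n in blocks:
        fitted.extend([m] * n)
    return fitted


# Hypothetical quantal-response data, ordered by increasing dose.
doses = [0.0, 1.0, 2.0, 4.0, 8.0]
subjects = [50, 50, 50, 50, 50]
responders = [2, 5, 4, 12, 20]
observed = [r / n for r, n in zip(responders, subjects)]
risk = isotonic_fit(observed, weights=subjects)
for d, p_obs, p_fit in zip(doses, observed, risk):
    print(f"dose={d:>4}: observed={p_obs:.2f}, isotonic={p_fit:.2f}")
```

    The second and third dose groups in this made-up data violate monotonicity, so the algorithm pools them into a single weighted mean; all other groups are left unchanged.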

  8. Comparison of Fuzzy Inference System and Multiple Regression to Predict Synthetic Envelopes Clogging

    Directory of Open Access Journals (Sweden)

    Bakhtiar Karimi

    2010-07-01

    Full Text Available Geo-synthetic materials are being used with acceptable performance in soil and water projects worldwide. Geotextiles are one of the categories of geo-synthetics used in drainage systems. The first generation of geotextiles was used in the late 1950s as an alternative to gravel envelopes. In this research, two methods (multiple regression and a fuzzy inference system) are evaluated for predicting synthetic envelope clogging. In the multiple regression method, the correlation coefficients for PP450, PP700 and PP900 are 62.66%, 79.37% and 90.62%, respectively. Results of the fuzzy inference system with a decision tree showed that this method has high potential in comparison with multiple regression: the total classification accuracy values for PP450, PP700 and PP900 are 98.6%, 97.3% and 98%, respectively. The final results of this research showed that fuzzy inference systems using a decision tree have high potential to predict clogging in envelopes.

  9. Analysis of genome-wide association data by large-scale Bayesian logistic regression.

    Science.gov (United States)

    Wang, Yuanjia; Sha, Nanshi; Fang, Yixin

    2009-01-01

    Single-locus analysis is often used to analyze genome-wide association (GWA) data, but such analysis is subject to severe multiple comparisons adjustment. Multivariate logistic regression is proposed to fit a multi-locus model for case-control data. However, when the sample size is much smaller than the number of single-nucleotide polymorphisms (SNPs) or when correlation among SNPs is high, traditional multivariate logistic regression breaks down. To accommodate the scale of data from a GWA while controlling for collinearity and overfitting in a high dimensional predictor space, we propose a variable selection procedure using Bayesian logistic regression. We explored a connection between Bayesian regression with certain priors and L1 and L2 penalized logistic regression. After analyzing a large number of SNPs simultaneously in a Bayesian regression, we selected important SNPs for further consideration. With far fewer SNPs of interest, problems of multiple comparisons and collinearity are less severe. We conducted simulation studies to examine the probability of correctly selecting disease-contributing SNPs and applied the developed methods to analyze Genetic Analysis Workshop 16 North American Rheumatoid Arthritis Consortium data. PMID:20018005

  10. Egg hatchability prediction by multiple linear regression and artificial neural networks

    Scientific Electronic Library Online (English)

    AC, Bolzan; RAF, Machado; JCZ, Piaia.

    2008-06-01

    Full Text Available An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. Network trainings were made by the backpropagation algorithm based on data obtained [...] from industrial incubations. The ANN model was chosen as it produced data that fit better the experimental data as compared to the multiple linear regression model, which used coefficients determined by minimum square method. The proposed simulation results of these approaches indicate that this ANN can be used for incubation performance prediction.

  11. A multiple regression model for predicting airwaves in shallow water sea bed logging data

    Science.gov (United States)

    Abdulkarim, Muhammad; Shafi, Afza; Razali, Radzuan; Ansari, Adeel

    2014-10-01

    This paper focuses on formulating a multiple regression model using matrix notation that can be used to predict the magnitude of airwaves in shallow water Sea Bed Logging (SBL) data. The term airwaves refers to the EM signals propagated from the source antenna via the atmosphere that are induced along the air/sea surface and interfere with the subsurface signal. In shallow water, the airwaves can mask other subsurface responses that may contain valuable information about subsurface resistive structures such as hydrocarbon reservoirs. A fair representation of SBL environments was simulated to generate the airwaves data. The magnitude of airwaves at a selected offset is used as the dependent variable, whereas the predictor (independent) variables for the proposed multiple regression model are the frequency, seawater depth, seawater conductivity, sediment conductivity and offset. Akaike's Information Criterion (AIC) is used for selecting the multiple regression models. The formulated regression model is benchmarked against the well-known theoretical space-domain expression for airwaves estimation. The model shows goodness of fit with an R2 of 0.9561 and overall statistical significance of the estimated parameters with an F-value of 19.35. The result indicates that the magnitudes of airwaves predicted by the regression model are approximately consistent with the theoretical model.
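
    A multiple regression in matrix notation, together with the R2 and AIC quantities this abstract reports, can be sketched in plain Python as below. This is a generic least-squares illustration, not the authors' model: it solves the normal equations (X'X)b = X'y directly, uses the Gaussian-error form of AIC up to an additive constant, and runs on two made-up predictors rather than the five listed in the abstract. The function names and data are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least squares via the normal equations (X'X) beta = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(X[0])
    xtx = [[sum(xi[j] * xi[k] for xi in X) for k in range(p)] for j in range(p)]
    xty = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in range(p)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, xi)) for xi in X]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ybar = sum(y) / len(y)
    sst = sum((yi - ybar) ** 2 for yi in y)
    r2 = 1.0 - sse / sst
    n = len(y)
    # Gaussian-error AIC up to an additive constant; p counts the intercept.
    aic = n * math.log(sse / n) + 2 * p if sse > 0 else float("-inf")
    return beta, r2, aic


# Hypothetical data with two predictors (illustrative values only).
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0), (6.0, 5.0)]
y = [8.1, 6.9, 16.2, 15.8, 24.1, 23.0]
beta, r2, aic = fit_ols(rows, y)
print("coefficients:", [round(b, 3) for b in beta])
print(f"R2={r2:.4f}, AIC={aic:.2f}")
```

    Solving the normal equations keeps the sketch short and mirrors the matrix formulation; for ill-conditioned predictors a QR-based solver from a numerical library would be the more stable choice. When comparing candidate models by AIC, the one with the lower value is preferred.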

  12. A simple regression model for network meta-analysis

    OpenAIRE

    Kessels, A. G. H.; Ter Riet, G.; Puhan, Milo A.; Kleijnen, J.; Bachmann, L. M.; Minder, C.

    2013-01-01

    Introduction: The aim of this paper is to propose a transparent, alternative approach for network meta-analysis based on a regression model that allows inclusion of studies with three or more treatment arms. Methodology: Based on the contingency tables describing the frequency distribution of the outcome in the different intervention arms, a data set is constructed. A logistic regression is used to determine the parameters describing the difference in effect between a specific interventio...

  13. Regression Analysis of Censored Data with Applications in Perimetry

    OpenAIRE

    Lindgren, Anna

    1999-01-01

    This thesis treats regression analysis when either the dependent or the independent variable is censored. We deal with quantile regression when the dependent variable is censored. Using the independence between the true values and the censoring limits the quantile function for the true values can be rewritten as another quantile function of the observed, censored values, where the quantile value itself is a function of the censoring distribution. The quantile value is estimated non-parametric...

  14. Robust In-Car Speech Recognition Based on Nonlinear Multiple Regressions

    Directory of Open Access Journals (Sweden)

    Itakura Fumitada

    2007-01-01

    Full Text Available We address issues for improving hands-free speech recognition performance in different car environments using a single distant microphone. In this paper, we propose a nonlinear multiple-regression-based enhancement method for in-car speech recognition. In order to develop a data-driven in-car recognition system, we develop an effective algorithm for adapting the regression parameters to different driving conditions. We also devise the model compensation scheme by synthesizing the training data using the optimal regression parameters and by selecting the optimal HMM for the test speech. Based on isolated word recognition experiments conducted in 15 real car environments, the proposed adaptive regression approach shows an advantage in average relative word error rate (WER) reductions of 52.5 and 14.8, compared to original noisy speech and ETSI advanced front end, respectively.

  15. Tumor regression of multiple bone metastases from breast cancer after administration of strontium-89 chloride (Metastron)

    OpenAIRE

    Heianna, Joichi; Miyauchi, Takaharu; Endo, Wataru; Miura, Naoki; Terui, Kazuyuki; Kamata, Syuichi; Hashimoto, Manabu

    2014-01-01

    We report a case of tumor regression of multiple bone metastases from breast carcinoma after administration of strontium-89 chloride. This case suggests that strontium-89 chloride can not only relieve bone metastases pain not responsive to analgesics, but may also have a tumoricidal effect on bone metastases.

  16. Predicting Dropouts of University Freshmen: A Logit Regression Analysis.

    Science.gov (United States)

    Lam, Y. L. Jack

    1984-01-01

    Stepwise discriminant analysis coupled with logit regression analysis of freshmen data from Brandon University (Manitoba) indicated that six tested variables drawn from research on university dropouts were useful in predicting attrition: student status, residence, financial sources, distance from home town, goal fulfillment, and satisfaction with…

  17. Applying Multiple Linear Regression and Neural Network to Predict Bank Performance

    Directory of Open Access Journals (Sweden)

    Nor Mazlina Abu Bakar

    2009-09-01

    Full Text Available Globalization and technological advancement have created a highly competitive market in the banking and finance industry. Performance of the industry depends heavily on the accuracy of the decisions made at managerial level. This study uses the multiple linear regression technique and a feedforward artificial neural network to predict bank performance, and then evaluates the performance of the two techniques with the goal of finding a powerful tool for predicting bank performance. Data of thirteen banks for the period 2001-2006 were used in the study. ROA was used as a measure of bank performance, and hence is the dependent variable for the multiple linear regressions. Seven variables including liquidity, credit risk, cost to income ratio, size, concentration ratio, inflation and GDP were used as independent variables. Under supervised learning, the dependent variable, ROA, was used as the target output for the artificial neural network. Seven inputs corresponding to the seven predictor variables were used for pattern recognition at the training phase. Experimental results from the multiple linear regression show that two variables, credit risk and cost to income ratio, are significant in determining bank performance. These two variables were found to explain about 60.9 percent of the total variation in the data with a mean square error (MSE) of 0.330. The artificial neural network was found to give optimal results by using thirteen hidden neurons. Testing results show that the seven inputs explain about 66.9 percent of the total variation in the data with a very low MSE of 0.00687. Performance of both methods is measured by mean square prediction error (MSPR) at the validation stage. The MSPR value for the neural network is lower than the MSPR value for multiple linear regression (0.0061 against 0.6190). The study concludes that the artificial neural network is the more powerful tool in predicting bank performance.

  18. Evaluating Productivity Index in a Gas Well Using Regression Analysis

    OpenAIRE

    Tobuyei Christopher; Osokogwu Uche

    2014-01-01

    In this study, a new approach is introduced to augment existing correlations for the analysis of Productivity Index of a gas well. The Modified Isochronal test method is used in this analysis. The Productivity Index trend of the gas well is evaluated from the test data. Regression Analysis is used to develop a correlation, which is then used to evaluate and forecast future Productivity Index trend. The back pressure equation of the Simplified Analysis method is also used to exa...

  19. Ratio Versus Regression Analysis: Some Empirical Evidence in Brazil

    Directory of Open Access Journals (Sweden)

    Newton Carneiro Affonso da Costa Jr.

    2004-06-01

    Full Text Available This work compares the traditional methodology for ratio analysis, applied to a sample of Brazilian firms, with the alternative one of regression analysis, applied both to cross-industry and intra-industry samples. The structural validity of the traditional methodology was tested through a model that represents its analogous regression format. The data are from 156 Brazilian public companies in nine industrial sectors for the year 1997. The results provide weak empirical support for the traditional ratio methodology, as the validity of this methodology was found to differ between ratios.

  20. Simultaneous Multiple Response Regression and Inverse Covariance Matrix Estimation via Penalized Gaussian Maximum Likelihood

    Science.gov (United States)

    Lee, Wonyul; Liu, Yufeng

    2012-01-01

    Multivariate regression is a common statistical tool for practical problems. Many multivariate regression techniques are designed for univariate response cases. For problems with multiple response variables available, one common approach is to apply the univariate response regression technique separately on each response variable. Although it is simple and popular, the univariate response approach ignores the joint information among response variables. In this paper, we propose three new methods for utilizing joint information among response variables. All methods are in a penalized likelihood framework with weighted L1 regularization. The proposed methods provide sparse estimators of conditional inverse co-variance matrix of response vector given explanatory variables as well as sparse estimators of regression parameters. Our first approach is to estimate the regression coefficients with plug-in estimated inverse covariance matrices, and our second approach is to estimate the inverse covariance matrix with plug-in estimated regression parameters. Our third approach is to estimate both simultaneously. Asymptotic properties of these methods are explored. Our numerical examples demonstrate that the proposed methods perform competitively in terms of prediction, variable selection, as well as inverse covariance matrix estimation. PMID:22791925

  1. Simultaneous Multiple Response Regression and Inverse Covariance Matrix Estimation via Penalized Gaussian Maximum Likelihood.

    Science.gov (United States)

    Lee, Wonyul; Liu, Yufeng

    2012-10-01

    Multivariate regression is a common statistical tool for practical problems. Many multivariate regression techniques are designed for univariate response cases. For problems with multiple response variables available, one common approach is to apply the univariate response regression technique separately on each response variable. Although it is simple and popular, the univariate response approach ignores the joint information among response variables. In this paper, we propose three new methods for utilizing joint information among response variables. All methods are in a penalized likelihood framework with weighted L(1) regularization. The proposed methods provide sparse estimators of conditional inverse co-variance matrix of response vector given explanatory variables as well as sparse estimators of regression parameters. Our first approach is to estimate the regression coefficients with plug-in estimated inverse covariance matrices, and our second approach is to estimate the inverse covariance matrix with plug-in estimated regression parameters. Our third approach is to estimate both simultaneously. Asymptotic properties of these methods are explored. Our numerical examples demonstrate that the proposed methods perform competitively in terms of prediction, variable selection, as well as inverse covariance matrix estimation. PMID:22791925

  2. Analysis of Sting Balance Calibration Data Using Optimized Regression Models

    Science.gov (United States)

    Ulbrich, N.; Bader, Jon B.

    2010-01-01

    Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm's two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm's term selection process.

  3. Analysis of PEM fuel cell experimental data using principal component analysis and multi linear regression

    Energy Technology Data Exchange (ETDEWEB)

    Placca, Latevi [FC LAB., Fuel Cell System Laboratory, Rue Thierry Mieg, 90000 Belfort (France); M3M research laboratory, University of Technology of Belfort-Montbeliard, 90010 Belfort (France); CEA, LITEN, 17, Rue des Martyrs - 38000 Grenoble (France); Kouta, Raed; Charon, Willy [FC LAB., Fuel Cell System Laboratory, Rue Thierry Mieg, 90000 Belfort (France); M3M research laboratory, University of Technology of Belfort-Montbeliard, 90010 Belfort (France); Candusso, Denis [FC LAB., Fuel Cell System Laboratory, Rue Thierry Mieg, 90000 Belfort (France); INRETS, The French National Institute for Transport and Safety Research, Laboratory of New Technologies (LTN), 25 Allee des Marronniers, 78000 Versailles - Satory (France); Blachot, Jean-Francois [FC LAB., Fuel Cell System Laboratory, Rue Thierry Mieg, 90000 Belfort (France); CEA, LITEN, 17, Rue des Martyrs - 38000 Grenoble (France)

    2010-05-15

    Polarisation curves performed at the Fuel Cell System Laboratory (FC LAB) at Belfort on a PEM fuel cell stack using a homemade fully instrumented test bench led to more than 100 variables depending on time. Visualising and analysing all the different test variables is complex. In this work, we show how the Principal Component Analysis (PCA) method helps to explore correlations between variables and similarities between measurements at a specific sampling time (individuals). To complete this method, an empirical model of the PEM fuel cell is proposed by linking the different input parameters to the cell voltage using Multiple Linear Regression. (author)

  4. Multiple regression method to determine aerosol optical depth in atmospheric column in Penang, Malaysia

    Science.gov (United States)

    Tan, F.; Lim, H. S.; Abdullah, K.; Yoon, T. L.; Zubir Matjafri, M.; Holben, B.

    2014-02-01

    Aerosol optical depth (AOD) from AERONET data has a very fine resolution, but the air pollution index (API), visibility and relative humidity from ground truth measurements are coarse. To obtain the local AOD in the atmosphere, the relationship between AOD and these three parameters was determined using multiple regression analysis. Data from the southwest monsoon period (August to September, 2012) taken in Penang, Malaysia, were used to establish a quantitative relationship in which the AOD is modeled as a function of API, relative humidity, and visibility. The highest correlated model was used to predict AOD values during the southwest monsoon period. When aerosol is not uniformly distributed in the atmosphere, the predicted AOD can deviate strongly from the measured values. These deviating data can therefore be removed by comparing the predicted AOD values with the actual AERONET data, which helps to investigate whether the non-uniform source of the aerosol is at the ground surface or at a higher altitude. This model can accurately predict AOD only if the aerosol is uniformly distributed in the atmosphere. However, further study is needed to determine whether this model is suitable for predicting AOD not only in Penang, but also in other states of Malaysia or even globally.

  5. Application of stepwise multiple regression techniques to inversion of Nimbus 'IRIS' observations.

    Science.gov (United States)

    Ohring, G.

    1972-01-01

    Exploratory studies with Nimbus-3 infrared interferometer-spectrometer (IRIS) data indicate that, in addition to temperature, such meteorological parameters as geopotential heights of pressure surfaces, tropopause pressure, and tropopause temperature can be inferred from the observed spectra with the use of simple regression equations. The technique of screening the IRIS spectral data by means of stepwise regression to obtain the best radiation predictors of meteorological parameters is validated. The simplicity of application of the technique and the simplicity of the derived linear regression equations - which contain only a few terms - suggest usefulness for this approach. Based upon the results obtained, suggestions are made for further development and exploitation of the stepwise regression analysis technique.

  6. A screening-testing approach for detecting gene-environment interactions using sequential penalized and unpenalized multiple logistic regression.

    Science.gov (United States)

    Frost, H Robert; Andrew, Angeline S; Karagas, Margaret R; Moore, Jason H

    2015-01-01

    Gene-environment (G × E) interactions are biologically important for a wide range of environmental exposures and clinical outcomes. Because of the large number of potential interactions in genomewide association data, the standard approach fits one model per G × E interaction with multiple hypothesis correction (MHC) used to control the type I error rate. Although sometimes effective, using one model per candidate G × E interaction test has two important limitations: low power due to MHC and omitted variable bias. To avoid the coefficient estimation bias associated with independent models, researchers have used penalized regression methods to jointly test all main effects and interactions in a single regression model. Although penalized regression supports joint analysis of all interactions, can be used with hierarchical constraints, and offers excellent predictive performance, it cannot assess the statistical significance of G × E interactions or compute meaningful estimates of effect size. To address the challenge of low power, researchers have separately explored screening-testing, or two-stage, methods in which the set of potential G × E interactions is first filtered and then tested for interactions with MHC only applied to the tests actually performed in the second stage. Although two-stage methods are statistically valid and effective at improving power, they still test multiple separate models and so are impacted by MHC and biased coefficient estimation. To remedy the challenges of both poor power and omitted variable bias encountered with traditional G × E interaction detection methods, we propose a novel approach that combines elements of screening-testing and hierarchical penalized regression. 
    Specifically, our proposed method uses, in the first stage, an elastic net-penalized multiple logistic regression model to jointly estimate either the marginal association filter statistic or the gene-environment correlation filter statistic for all candidate genetic markers. In the second stage, a single multiple logistic regression model is used to jointly assess marginal terms and G × E interactions for all genetic markers that pass the first stage filter. A single likelihood-ratio test is used to determine whether any of the interactions are statistically significant. We demonstrate the efficacy of our method relative to alternative G × E detection methods on a bladder cancer data set. PMID:25592580

  7. Multiple regression models for the prediction of the maximum obtainable thermal efficiency of organic Rankine cycles

    International Nuclear Information System (INIS)

    Much attention is focused on increasing the energy efficiency to decrease fuel costs and CO2 emissions throughout industrial sectors. The ORC (organic Rankine cycle) is a relatively simple but efficient process that can be used for this purpose by converting low and medium temperature waste heat to power. In this study we propose four linear regression models to predict the maximum obtainable thermal efficiency for simple and recuperated ORCs. A previously derived methodology is able to determine the maximum thermal efficiency among many combinations of fluids and processes, given the boundary conditions of the process. Hundreds of optimised cases with varied design parameters are used as observations in four multiple regression analyses. We analyse the model assumptions, prediction abilities and extrapolations, and compare the results with recent studies in the literature. The models are in agreement with the literature, and they present an opportunity for accurate prediction of the potential of an ORC to convert heat sources with temperatures from 80 to 360 °C, without detailed knowledge or need for simulation of the process. - Highlights: • The maximum thermal efficiency of ORCs in hundreds of cases was analysed. • Multiple regression models were derived to predict the maximum obtainable efficiency of ORCs. • Using only key design parameters, the maximum obtainable efficiency can be evaluated. • The regression models decrease the resources needed to evaluate the maximum potential. • The models are statistically strong and in good agreement with the literature

  8. Mass estimation of loose parts in nuclear power plant based on multiple regression

    International Nuclear Information System (INIS)

    According to the application of the Hilbert–Huang transform to the non-stationary signal and the relation between the mass of loose parts in nuclear power plant and corresponding frequency content, a new method for loose part mass estimation based on the marginal Hilbert–Huang spectrum (MHS) and multiple regression is proposed in this paper. The frequency spectrum of a loose part in a nuclear power plant can be expressed by the MHS. The multiple regression model that is constructed by the MHS feature of the impact signals for mass estimation is used to predict the unknown masses of a loose part. A simulated experiment verified that the method is feasible and the errors of the results are acceptable. (paper)

  9. Estimation of Parameters in Heteroscedastic Multiple Regression Model using Leverage Based Near-Neighbors

    Directory of Open Access Journals (Sweden)

    H. Midi

    2009-01-01

    Full Text Available In this study, we propose a Leverage Based Near-Neighbor (LBNN) method where prior information on the structure of the heteroscedastic error is not required. In the proposed LBNN method, weights are determined not from the near-neighbor values of the explanatory variables, but from their corresponding leverage values so that it can be readily applied to a multiple regression model. Both the empirical and Monte Carlo simulation results show that the LBNN method offers substantial improvement over the existing methods. The LBNN has significantly reduced the standard errors of the estimates and also the standard errors of residuals for both simple and multiple linear regression models. Hence, the LBNN can be established as one reliable alternative approach to other existing methods that deal with heteroscedastic errors when the form of heteroscedasticity is unknown.
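
    The abstract does not say how LBNN turns leverage values into weights, so the sketch below covers only the quantity it starts from. For a simple linear regression with an intercept, the leverage of case i is the diagonal element of the hat matrix, h_ii = 1/n + (x_i - mean(x))^2 / sum_j (x_j - mean(x))^2. The function name and the data are hypothetical.

```python
def leverages_simple_regression(x):
    """Leverage h_ii of each case in a simple regression with intercept."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]


# Hypothetical predictor values; the last case lies far from the others.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
h = leverages_simple_regression(x)

for xi, hi in zip(x, h):
    print(f"x = {xi:5.1f}  leverage = {hi:.2f}")
```

    The leverages sum to the number of fitted parameters (two here), and the isolated case carries most of that total, which is why leverage is a natural way to group cases with similar influence.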

  10. Variable selection in multiple linear regression: The influence of individual cases

    OpenAIRE

    Sj, Steel; Dw, Uys

    2007-01-01

    The influence of individual cases in a data set is studied when variable selection is applied in multiple linear regression. Two different influence measures, based on the C_p criterion and Akaike's information criterion, are introduced. The relative change in the selection criterion when an individual case is omitted is proposed as the selection influence of the specific omitted case. Four standard examples from the literature are considered and the selection influence of the cases is calcul...

  11. Logistic Regression with Multiple Random Effects: A Simulation Study of Estimation Methods and Statistical Packages

    OpenAIRE

    Kim, Yoonsang; Choi, Young-ku; Emery, Sherry

    2013-01-01

    Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods’ performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple...

  12. Estimation of Saturation Percentage of Soil Using Multiple Regression, ANN, and ANFIS Techniques

    OpenAIRE

    Khaled Ahmad Aali; Masoud Parsinejad; Bizhan Rahmani

    2009-01-01

    The saturation percentage (SP) of soils is an important index in hydrological studies. In this paper, artificial neural networks (ANNs), multiple regression (MR), and adaptive neural-based fuzzy inference system (ANFIS) were used for estimation of saturation percentage of soils collected from Boukan region in the northwestern part of Iran. Percent clay, silt, sand and organic carbon (OC) were used to develop the applied methods. In addition, contributions of each input variable were asse...

  13. Multiple polynomial regression method for determination of biomedical optical properties from integrating sphere measurements

    OpenAIRE

    Dam, J. S.; Dalgaard, T.; Fabricius, P. E.; Andersson-engels, Stefan

    2000-01-01

    We present a new, to our knowledge, method for extracting optical properties from integrating sphere measurements on thin biological samples. The method is based on multivariate calibration techniques involving Monte Carlo simulations, multiple polynomial regression, and a Newton-Raphson algorithm for solving nonlinear equation systems. Prediction tests with simulated data showed that the mean relative prediction error of the absorption and the reduced scattering coefficients within typical b...

  14. General regression neural network in energy cost analysis

    International Nuclear Information System (INIS)

    Previous research on energy cost evaluation in industrial processes was carried out by the authors using variance analysis techniques (MANOVA). The results were satisfactory, and the codes developed using these techniques on process computers were capable of taking various factors into account. Nevertheless, either many hypotheses had to be made about the analytical form of the regression surfaces, or a pure MANOVA model had to be used, losing information on possible interpolation. Moreover, the regression approach was hardly extensible to on-line acquisition of new data. In order to achieve this goal and to simplify the processing of data, we adopted neural network techniques. We tested various types of networks and found empirical evidence that the General Regression Neural Network (GRNN) structure could behave consistently better than back-propagation algorithms.
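
    For readers unfamiliar with the GRNN, a minimal sketch of its prediction rule follows: the output is a Gaussian-kernel-weighted average of the stored training targets, so no assumption about the analytical form of the regression surface is needed, and a new observation is absorbed by simply appending it to the training set. The data, the smoothing parameter sigma and the function name are assumptions for illustration, not the authors' implementation.

```python
import math


def grnn_predict(train_x, train_y, query, sigma):
    """Predict the target at `query` as a kernel-weighted mean of train_y."""
    weights = []
    for xi in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(xi, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    # Note: total can underflow to 0 for queries very far from all
    # training points when sigma is small; this sketch does not guard it.
    return sum(w * yi for w, yi in zip(weights, train_y)) / total


# Hypothetical one-input data set.
train_x = [[0.0], [1.0], [2.0], [3.0]]
train_y = [0.0, 1.0, 4.0, 9.0]

sharp = grnn_predict(train_x, train_y, [1.0], sigma=0.1)      # follows the data
smooth = grnn_predict(train_x, train_y, [1.0], sigma=1000.0)  # near global mean
print("small sigma:", sharp)
print("large sigma:", smooth)

# On-line update: a new measurement is just one more stored pattern.
train_x.append([4.0])
train_y.append(16.0)
print("after update:", grnn_predict(train_x, train_y, [4.0], sigma=0.1))
```

    The single parameter sigma controls the trade-off: small values reproduce the nearest stored targets, large values smooth towards the overall mean.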

  15. Regression Analysis between Properties of Subgrade Lateritic Soil

    OpenAIRE

    Bello, Afeez Adefemi

    2012-01-01

    This paper presents the results of a study that used regression analysis to examine possible correlations between index properties and the California Bearing Ratio (CBR) of some lateritic soils within Osogbo town in South Western Nigeria. To allow a meaningful conclusion to be drawn, lateritic soil samples were collected from eight (8) different borrow pits within the town and various laboratory tests including Atterberg Limits, Gradation analysis, California Bearing Ratio, Compaction...

  16. MULTINOMIAL LOGISTIC REGRESSION: USAGE AND APPLICATION IN RISK ANALYSIS

    OpenAIRE

    Bayaga, Anass

    2010-01-01

    The objective of the article was to explore the usage of multinomial logistic regression (MLR) in risk analysis. In this regard, performing MLR on risk analysis data corrected for the non-linear nature of binary response and addressed the violation of equal variance and normality assumptions. Additionally, use of maximum likelihood (-2log) estimation provided a means of working with binary response data. The relationship of independent and dependent variables was also addressed. The data use...

  17. Multiple trait model combining random regressions for daily feed intake with single measured performance traits of growing pigs

    Directory of Open Access Journals (Sweden)

    Künzi Niklaus

    2002-01-01

    Full Text Available A random regression model for daily feed intake and a conventional multiple trait animal model for the four traits average daily gain on test (ADG), feed conversion ratio (FCR), carcass lean content and meat quality index were combined to analyse data from 1 449 castrated male Large White pigs performance tested in two French central testing stations in 1997. Group housed pigs fed ad libitum with electronic feed dispensers were tested from 35 to 100 kg live body weight. A quadratic polynomial in days on test was used as a regression function for weekly means of daily feed intake and to describe its residual variance. The same fixed (batch) and random (additive genetic, pen and individual permanent environmental) effects were used for regression coefficients of feed intake and single measured traits. Variance components were estimated by means of a Bayesian analysis using Gibbs sampling. Four Gibbs chains were run for 550 000 rounds each, from which 50 000 rounds were discarded as the burn-in period. Estimates of posterior means of covariance matrices were calculated from the remaining two million samples. Low heritabilities of linear and quadratic regression coefficients and their unfavourable genetic correlations with other performance traits reveal that altering the shape of the feed intake curve by direct or indirect selection is difficult.

  18. Least Squares Adjustment: Linear and Nonlinear Weighted Regression Analysis

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg

    2007-01-01

    This note primarily describes the mathematics of least squares regression analysis as it is often used in geodesy including land surveying and satellite positioning applications. In these fields regression is often termed adjustment. The note also contains a couple of typical land surveying and satellite positioning application examples. In these application areas we are typically interested in the parameters in the model, typically 2- or 3-D positions, and not in predictive modelling, which is often the main concern in other regression analysis applications. Adjustment is often used to obtain estimates of relevant parameters in an over-determined system of equations which may arise from deliberately carrying out more measurements than actually needed to determine the set of desired parameters. An example may be the determination of a geographical position based on information from a number of Global Navigation Satellite System (GNSS) satellites also known as space vehicles (SV). It takes at least four SVs to determine the position (and the clock error) of a GNSS receiver. Often more than four SVs are used and we use adjustment to obtain a better estimate of the geographical position (and the clock error) and to obtain estimates of the uncertainty with which the position is determined. Regression analysis is used in many other fields of application both in the natural, the technical and the social sciences. Examples may be curve fitting, calibration, establishing relationships between different variables in an experiment or in a survey, etc. Regression analysis is probably one of the most used statistical techniques around. Dr. Anna B. O. Jensen provided insight and data for the Global Positioning System (GPS) example. Matlab code and sections that are considered as either traditional land surveying material or as advanced material are typeset with smaller fonts. Comments in general or on for example unavoidable typos, shortcomings and errors are most welcome.
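
    The idea of adjusting an over-determined system can be sketched with a deliberately small case: five weighted observations of a straight line, two unknown parameters. The estimate solves the weighted normal equations (A'PA)x = A'Pl, the variance factor is the weighted sum of squared residuals divided by the redundancy n - p, and the parameter covariance is that factor times the inverse normal matrix. The data, weights and function name are assumptions for this example and are not taken from the note (whose own code is in Matlab).

```python
def weighted_line_adjustment(t, obs, weights):
    """Weighted least squares fit of obs = a + b*t with uncertainty estimates."""
    # Normal matrix N = A'PA for design rows [1, t_i] and diagonal weights P.
    n11 = sum(weights)
    n12 = sum(w * ti for w, ti in zip(weights, t))
    n22 = sum(w * ti * ti for w, ti in zip(weights, t))
    # Right-hand side A'Pl.
    c1 = sum(w * li for w, li in zip(weights, obs))
    c2 = sum(w * ti * li for w, ti, li in zip(weights, t, obs))
    det = n11 * n22 - n12 * n12
    inv = [[n22 / det, -n12 / det], [-n12 / det, n11 / det]]
    a = inv[0][0] * c1 + inv[0][1] * c2
    b = inv[1][0] * c1 + inv[1][1] * c2
    residuals = [li - (a + b * ti) for ti, li in zip(t, obs)]
    redundancy = len(obs) - 2  # observations minus unknowns
    sigma0_sq = sum(w * e * e for w, e in zip(weights, residuals)) / redundancy
    covariance = [[sigma0_sq * v for v in row] for row in inv]
    return (a, b), sigma0_sq, covariance


# Hypothetical observations; the middle one is trusted four times as much.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
obs = [1.1, 2.9, 5.2, 6.8, 9.1]
weights = [1.0, 1.0, 4.0, 1.0, 1.0]

(a_hat, b_hat), sigma0_sq, cov = weighted_line_adjustment(t, obs, weights)
print("estimates:", a_hat, b_hat)
print("variance factor:", sigma0_sq)
print("standard deviations:", cov[0][0] ** 0.5, cov[1][1] ** 0.5)
```

    With only two observations the redundancy would be zero and no uncertainty estimate would be possible, which is the practical reason for carrying out more measurements than strictly needed.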

  19. Multiple regression and Artificial Neural Network for long-term rainfall forecasting using large scale climate modes

    Science.gov (United States)

    Mekanik, F.; Imteaz, M. A.; Gato-Trinidad, S.; Elmahdi, A.

    2013-10-01

    In this study, the application of Artificial Neural Networks (ANN) and Multiple regression analysis (MR) to forecast long-term seasonal spring rainfall in Victoria, Australia was investigated using lagged El Nino Southern Oscillation (ENSO) and Indian Ocean Dipole (IOD) as potential predictors. The use of dual (combined lagged ENSO-IOD) input sets for calibrating and validating ANN and MR Models is proposed to investigate the simultaneous effect of past values of these two major climate modes on long-term spring rainfall prediction. The MR models that did not violate the limits of statistical significance and multicollinearity were selected for future spring rainfall forecast. The ANN was developed in the form of a multilayer perceptron using the Levenberg-Marquardt algorithm. Both MR and ANN modelling were assessed statistically using mean square error (MSE), mean absolute error (MAE), Pearson correlation (r) and Willmott index of agreement (d). The developed MR and ANN models were tested on out-of-sample test sets; the MR models showed very poor generalisation ability for east Victoria with correlation coefficients of -0.99 to -0.90 compared to ANN with correlation coefficients of 0.42-0.93; ANN models also showed better generalisation ability for central and west Victoria with correlation coefficients of 0.68-0.85 and 0.58-0.97 respectively. The ability of multiple regression models to forecast out-of-sample sets is compatible with ANN for Daylesford in central Victoria and Kaniva in west Victoria (r = 0.92 and 0.67 respectively). The errors of the testing sets for ANN models are generally lower compared to multiple regression models. The statistical analysis suggests the potential of ANN over MR models for rainfall forecasting using large scale climate modes.
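
    The four assessment statistics named in this abstract can be computed as in the sketch below. The Willmott index is taken in its original form, d = 1 - sum (P_i - O_i)^2 / sum (|P_i - mean(O)| + |O_i - mean(O)|)^2; whether the authors used this or a later refinement is not stated, so treat that choice, the function name and the rainfall figures as assumptions.

```python
import math


def forecast_scores(observed, predicted):
    """Return MSE, MAE, Pearson r and Willmott's index of agreement d."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    sq_error_sum = sum(e * e for e in errors)
    mse = sq_error_sum / n
    mae = sum(abs(e) for e in errors) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)
    potential_error = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2
                          for o, p in zip(observed, predicted))
    d = 1.0 - sq_error_sum / potential_error
    return {"mse": mse, "mae": mae, "r": r, "d": d}


# Hypothetical out-of-sample spring rainfall totals (mm).
observed = [120.0, 95.0, 150.0, 80.0]
predicted = [110.0, 100.0, 140.0, 90.0]

scores = forecast_scores(observed, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

    A strongly negative r on a test set, as reported for the MR models in east Victoria, means the forecasts move opposite to the observations even if the error magnitudes look moderate, which is why r and d are reported alongside MSE and MAE.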

  20. Simultaneous multiple non-crossing quantile regression estimation using kernel constraints.

    Science.gov (United States)

    Liu, Yufeng; Wu, Yichao

    2011-06-01

    Quantile regression (QR) is a very useful statistical tool for learning the relationship between the response variable and covariates. For many applications, one often needs to estimate multiple conditional quantile functions of the response variable given covariates. Although one can estimate multiple quantiles separately, it is of great interest to estimate them simultaneously. One advantage of simultaneous estimation is that multiple quantiles can share strength among them to gain better estimation accuracy than individually estimated quantile functions. Another important advantage of joint estimation is the feasibility of incorporating simultaneous non-crossing constraints of QR functions. In this paper, we propose a new kernel-based multiple QR estimation technique, namely simultaneous non-crossing quantile regression (SNQR). We use kernel representations for QR functions and apply constraints on the kernel coefficients to avoid crossing. Both unregularised and regularised SNQR techniques are considered. Asymptotic properties such as asymptotic normality of linear SNQR and oracle properties of the sparse linear SNQR are developed. Our numerical results demonstrate the competitive performance of our SNQR over the original individual QR estimation. PMID:22190842
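
    Two ingredients of this abstract are easy to illustrate: the check (pinball) loss that defines a regression quantile, and the crossing problem that the non-crossing constraints are meant to prevent. The sketch below is only an illustration under simple assumptions; it does not implement the kernel-based SNQR estimator, and the helper names are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]) for one residual."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def empirical_risk(y, fitted, tau):
    """Average check loss of fitted tau-quantile values against responses y."""
    return sum(pinball_loss(yi - fi, tau) for yi, fi in zip(y, fitted)) / len(y)


def crossings(quantile_fits):
    """Find points where a lower quantile curve exceeds a higher one.

    quantile_fits maps each tau to a list of fitted values evaluated at
    the same covariate points. Returns (index, lower_tau, upper_tau) tuples.
    """
    taus = sorted(quantile_fits)
    found = []
    for lo, hi in zip(taus, taus[1:]):
        for i, (a, b) in enumerate(zip(quantile_fits[lo], quantile_fits[hi])):
            if a > b:
                found.append((i, lo, hi))
    return found
```

    Quantile curves that are estimated separately can produce a non-empty result from `crossings`; a jointly constrained fit is designed to keep it empty.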

  1. Robust regression applied to fractal/multifractal analysis.

    Science.gov (United States)

    Portilla, F.; Valencia, J. L.; Tarquis, A. M.; Saa-Requejo, A.

    2012-04-01

    Fractals and multifractals are concepts that have grown increasingly popular in soil analysis in recent years, along with the development of fractal models. A common step is to calculate the slope of a linear fit, usually by the least squares method. This should not be a particular problem; however, in many situations with experimental data the researcher has to select the range of scales to work with, neglecting the remaining points, in order to achieve the linearity that this type of analysis requires. Robust regression is a form of regression analysis designed to circumvent some limitations of traditional parametric and non-parametric methods. With this method we do not have to assume that an outlier is simply an extreme observation drawn from the tail of a normal distribution that does not compromise the validity of the regression results. In this work we have evaluated the capacity of robust regression to select the points in the experimental data, trying to avoid subjective choices. Based on this analysis we have developed a new working methodology that involves two basic steps: • Evaluation of the improvement of the linear fit when consecutive points are eliminated, based on the R p-value. In this way we consider the implications of reducing the number of points. • Evaluation of the significance of the slope difference between the fit with the two extreme points and the fit with the available points. We compare the results of applying this methodology with those of the commonly used least squares method. The data selected for these comparisons come from experimental soil roughness transects and from simulations based on the midpoint displacement method with added trends and noise. The results are discussed, indicating the advantages and disadvantages of each methodology. Acknowledgements: Funding provided by CEIGRAM (Research Centre for the Management of Agricultural and Environmental Risks) and by the Spanish Ministerio de Ciencia e Innovación (MICINN) through project no. AGL2010-21501/AGR is greatly appreciated.
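
    The abstract does not say which robust estimator was used. As an illustration of why a robust slope can differ sharply from the least squares slope when a few points depart from linearity, the sketch below uses the Theil-Sen estimator (median of pairwise slopes), which is a swapped-in technique chosen here for simplicity and is not claimed to be the authors' method. Function names are hypothetical.

```python
from statistics import median


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def theil_sen(x, y):
    """Theil-Sen line: median of all pairwise slopes, then median intercept."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i in range(len(x))
        for j in range(i + 1, len(x))
        if x[j] != x[i]
    ]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept
```

    On a log-log plot where one scale deviates strongly from the linear range, the Theil-Sen slope stays near the slope of the majority of points, while the least squares slope is pulled toward the deviating point.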

  2. What fiscal policy is most effective? A Meta Regression Analysis

    OpenAIRE

    Gechert, Sebastian

    2013-01-01

    We apply meta regression analysis to a unique data set of 104 studies on multiplier effects with 1069 reported multipliers in order to derive stylized facts and to quantify the differing effectiveness of the composition of fiscal impulses, adjusted for the interference of study-design characteristics and sample specifics. As a major result, we find that public spending multipliers are close to one and about 0.3 to 0.4 units larger than tax and transfer multipliers. Public investment multiplie...

  3. Risk factors for mortality after bereavement: a logistic regression analysis

    OpenAIRE

    Bowling, Ann; Charlton, John

    1987-01-01

    A national sample of elderly widowed people was followed up for six years. Excess mortality was found for men aged 75 years and over in the first six months of bereavement compared with men of the same age in the general population. Logistic regression analysis, controlling for age and sex together, demonstrated that the best independent predictors of mortality among the elderly widowed were: interviewer assessment of low happiness level; interviewer assessed and self-reported problems with n...

  4. Multivariate quantiles and multiple-output regression quantiles: From $L_1$ optimization to halfspace depth

    CERN Document Server

    Hallin, Marc; Šiman, Miroslav; 10.1214/09-AOS723

    2010-01-01

    A new multivariate concept of quantile, based on a directional version of Koenker and Bassett's traditional regression quantiles, is introduced for multivariate location and multiple-output regression problems. In their empirical version, those quantiles can be computed efficiently via linear programming techniques. Consistency, Bahadur representation and asymptotic normality results are established. Most importantly, the contours generated by those quantiles are shown to coincide with the classical halfspace depth contours associated with the name of Tukey. This relation does not only allow for efficient depth contour computations by means of parametric linear programming, but also for transferring from the quantile to the depth universe such asymptotic results as Bahadur representations. Finally, linear programming duality opens the way to promising developments in depth-related multivariate rank-based inference.

  5. Poisson Regression Analysis of Illness and Injury Surveillance Data

    Energy Technology Data Exchange (ETDEWEB)

    Frome E.L., Watkins J.P., Ellis E.D.

    2012-12-12

    The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra-Poisson variation. The R open source software environment for statistical computing and graphics is used for analysis. Additional details about R and the data that were used in this report are provided in an Appendix. Information on how to obtain R and utility functions that can be used to duplicate results in this report are provided.
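
    A rough way to see what over-dispersion means for stratified counts with person-time at risk is the Pearson dispersion statistic, sketched below in Python rather than R. This is a simplified illustration: it assumes a single common event rate across strata (an intercept-only log-linear model), it is not the score test or the quasi-likelihood procedure used in the report, and the function name is hypothetical.

```python
def pearson_dispersion(counts, person_time):
    """Estimate a common event rate and the Pearson dispersion statistic.

    counts: observed event counts per stratum.
    person_time: person-time at risk per stratum (same order).
    Returns (rate, dispersion), where dispersion = X^2 / (n - 1).
    Values well above 1 hint at over-dispersion relative to Poisson.
    """
    n = len(counts)
    if n != len(person_time) or n < 2:
        raise ValueError("need at least two strata with matching person-time")

    rate = sum(counts) / sum(person_time)          # events per unit person-time
    expected = [rate * t for t in person_time]      # Poisson means per stratum
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(counts, expected))
    return rate, chi_square / (n - 1)
```

    With covariates such as age, gender and occupational group, the expected counts would instead come from the fitted main effects model, and the divisor would be the residual degrees of freedom.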

  6. Improving Antarctic Total Ozone Projections by a Process-Oriented Multiple Diagnostic Ensemble Regression

    Science.gov (United States)

    Karpechko, Aleyey; Maraun, Douglas; Eyring, Veronika

    2014-05-01

    Accurate projections of stratospheric ozone are required, because ozone changes impact on exposures to ultraviolet radiation and on tropospheric climate. Unweighted multi-model ensemble mean (uMMM) projections from chemistry-climate models (CCMs) are commonly used to project ozone in the 21st century, when ozone-depleting substances are expected to decline and greenhouse gases expected to rise. Here, we address the question whether Antarctic total column ozone projections in October given by the uMMM of CCM simulations can be improved by using a process-oriented multiple diagnostic ensemble regression (MDER) method. This method is based on the correlation between simulated future ozone and selected key processes relevant for stratospheric ozone under present-day conditions. The regression model is built using an algorithm that selects those process-oriented diagnostics which explain a significant fraction of the spread in the projected ozone among the CCMs. The regression model with observed diagnostics is then used to predict future ozone and associated uncertainty. The precision of our method is tested in a pseudo-reality, i.e. the prediction is validated against an independent CCM projection used to replace unavailable future observations. The test shows that MDER has a higher precision than uMMM, suggesting an improvement in the estimate of future Antarctic ozone. Our method projects that Antarctic total ozone will return to 1980 values around 2060 with the 95% confidence interval ranging from 2040 to 2080. This reduces the range of return dates across the ensemble of CCMs by more than a decade and suggests that the earliest simulated return dates are unlikely. Karpechko, Maraun and Eyring (2013) Improving Antarctic Total Ozone Projections by a Process-Oriented Multiple Diagnostic Ensemble Regression, J. Atmos. Sci. 70: 3959-3976

  7. Using the Coefficient of Determination $R^2$ to Test the Significance of Multiple Linear Regression

    Science.gov (United States)

    Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.

    2013-01-01

    This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
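
    The abstract gives no details of the procedure. One standard result that fits its description is that, under the null hypothesis of no linear relationship and with normal errors, the coefficient of determination from a regression with k predictors and n observations follows a Beta(k/2, (n - k - 1)/2) distribution. The sketch below assumes that result and estimates a p-value by sampling from the beta distribution; it is not claimed to reproduce the article's method, and the function names are hypothetical.

```python
import random


def r2_to_f(r2, n, k):
    """Overall F statistic implied by R^2 with n observations, k predictors."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def r2_p_value_by_beta_sampling(r2, n, k, draws=100_000, seed=0):
    """Monte Carlo p-value: share of null R^2 draws at least as large as r2.

    Assumes normal errors, so that under the null hypothesis
    R^2 ~ Beta(k / 2, (n - k - 1) / 2).
    """
    rng = random.Random(seed)
    a = k / 2.0
    b = (n - k - 1) / 2.0
    exceed = sum(1 for _ in range(draws) if rng.betavariate(a, b) >= r2)
    return exceed / draws
```

    The same decision can be reached through the F statistic returned by `r2_to_f`; the beta form has the advantage of working directly on the scale of the coefficient of determination.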

  8. Use of standard regression analysis for relating bonus bids and oil/gas production: Federal offshore leases

    Energy Technology Data Exchange (ETDEWEB)

    Berger, P.D.; Lohrenz, J.

    1980-06-01

    Multiple linear regression analysis has been used to study bidding and production data for Federal offshore oil and gas leases. Policy value conclusions have been stated therefrom. We, firstly, address the applicability of the inherent assumptions of normality and homoscedasticity finding the assumptions unsupported and questioning the statistical inferences which could otherwise be drawn. Secondly, even given the legitimacy of the assumptions and the usual statistical inferences from multiple linear regression results, we show the conclusions are volatilely sensitive. We are led to a strong assertion that quantitative assessment of the assumptions of normality and homoscedasticity be a mandatory requirement for the proper understanding and use, if indeed any is possible, of multiple linear regression analysis results for drawing policy value conclusions from data with the statistical behavior of Federal offshore oil and gas lease data.

  9. Application of Multiple Linear Regression and Manova to Evaluate Health Impacts Due to Changing River Water Quality

    OpenAIRE

    Sudevi Basu; Lokesh, K. S.

    2014-01-01

    Rivers are important systems which provide water to fulfill human needs. However, excessive human use over the years has led to deterioration in the quality of river water, causing health problems from contaminated water. This study focuses on the application of statistical techniques, Multiple Linear Regression model and MANOVA to assess health impacts due to pollution in Cauvery river stretch in Srirangapatna. In this study, using Multiple Linear Regression, it is fou...

  10. Waste generated in high-rise buildings construction: A quantification model based on statistical multiple regression.

    Science.gov (United States)

    Parisi Kern, Andrea; Ferreira Dias, Michele; Piva Kulakowski, Marlova; Paulo Gomes, Luciana

    2015-05-01

    Reducing construction waste is becoming a key environmental issue in the construction industry. The quantification of waste generation rates in the construction sector is an invaluable management tool in supporting mitigation actions. However, the quantification of waste can be a difficult process because of the specific characteristics and the wide range of materials used in different construction projects. Large variations are observed in the methods used to predict the amount of waste generated because of the range of variables involved in construction processes and the different contexts in which these methods are employed. This paper proposes a statistical model to determine the amount of waste generated in the construction of high-rise buildings by assessing the influence of design process and production system, often mentioned as the major culprits behind the generation of waste in construction. Multiple regression was used to conduct a case study based on multiple sources of data of eighteen residential buildings. The resulting statistical model produced dependent (i.e. amount of waste generated) and independent variables associated with the design and the production system used. The best regression model obtained from the sample data resulted in an adjusted $R^2$ value of 0.694, which means that it predicts approximately 69% of the factors involved in the generation of waste in similar constructions. Most independent variables showed a low determination coefficient when assessed in isolation, which emphasizes the importance of assessing their joint influence on the response (dependent) variable. PMID:25704604

  11. Prediction of the cetane number of biodiesel using artificial neural networks and multiple linear regression

    International Nuclear Information System (INIS)

    Highlights: • We obtained models for estimation of cetane number of biodiesel. • Twenty-four neural networks using two topologies were evaluated. • The best neural network to predict the cetane number was selected. • The best accuracy was obtained for the selected neural network. - Abstract: Models for estimation of cetane number of biodiesel from their fatty acid methyl ester composition using multiple linear regression and artificial neural networks were obtained in this work. To obtain the models to predict the cetane number, experimental data from literature reports covering 48 and 15 biodiesels in the modeling-training step and validation step, respectively, were taken. Twenty-four neural networks using two topologies and different algorithms for the second training step were evaluated. The model obtained using multiple regression was compared with two other models from literature and it was able to predict cetane number with 89% accuracy, observing one outlier. A model to predict cetane number using artificial neural network was obtained with accuracy better than 92%, except for one outlier. The best neural network to predict the cetane number was a backpropagation network (11:5:1) using the Levenberg–Marquardt algorithm for the second step of the networks training and showing R = 0.9544 for the validation data.

  12. Residential behavioural energy savings : a meta-regression analysis

    Energy Technology Data Exchange (ETDEWEB)

    Tiedemann, K.H. [BC Hydro, Burnaby, BC (Canada)]

    2009-07-01

    Increasing attention is being given to opportunities for residential energy behavioural savings, as developed countries attempt to reduce energy use and greenhouse gas emissions. Several utility companies have undertaken pilot programs geared at understanding which interventions are most effective in reducing residential energy consumption through behavioural change. This paper presented the first meta-regression analysis of residential energy behavioural savings. This study focused on interventions which affected household energy-related behaviours and as a result, affected household energy use. The paper described rational choice theory, the theory of planned behaviour, and the integration of rational choice theory and the adjusted expectancy values theory in a simple framework. The paper also discussed the review of various social, psychological and economics journals and databases. The results of the studies were presented. A basic concept in meta-regression analysis is the effect size, which is defined as the program effect divided by the standard error of the program effect. A lengthy review of the literature found twenty-eight treatments from ten experiments for which an effect size could be calculated. The experiments involved classifying treatments according to whether the interventions were information, goal setting, feedback, rewards or combinations of these interventions. The impact of these alternative interventions on the effect size was then modelled using White's robust regression. Five regression models were compared on the basis of the Akaike information criterion. It was found that model 5, which used all of the regressors, was the preferred model. It was concluded that the theory of planned behaviour is more appropriate in the context of analysis of behavioural change and energy use. 21 refs., 4 tabs.
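
    The two quantities this abstract relies on, the effect size as defined there and the Akaike information criterion used to rank the five models, can be sketched as follows. The function names and the example numbers in the usage note are hypothetical; no values from the paper are reproduced.

```python
def effect_size(program_effect, standard_error):
    """Effect size as defined in the abstract: effect / its standard error."""
    return program_effect / standard_error


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(models):
    """Pick the model name with the smallest AIC.

    models maps a name to a (log_likelihood, n_params) tuple.
    """
    return min(models, key=lambda name: aic(*models[name]))
```

    For instance, a made-up model with log-likelihood -8 and 5 parameters (AIC 26) would be preferred over one with log-likelihood -12 and 2 parameters (AIC 28), which shows how a larger model can still win when its fit improves enough.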

  13. Multiple linear stepwise regression of liver lipid levels: proton MR spectroscopy study in vivo at 3.0 T

    International Nuclear Information System (INIS)

    Objective: To analyze the correlations between liver lipid level determined by liver 3.0 T 1H-MRS in vivo and influencing factors using multiple linear stepwise regression. Methods: The prospective study of liver 1H-MRS was performed with 3.0 T system and eight-channel torso phased-array coils using PRESS sequence. Forty-four volunteers were enrolled in this study. Liver spectra were collected with a TR of 1500 ms, TE of 30 ms, volume of interest of 2 cm×2 cm×2 cm, NSA of 64 times. The acquired raw proton MRS data were processed by using a software program SAGE. For each MRS measurement, using water as the internal reference, the amplitude of the lipid signal was normalized to the sum of the signal from lipid and water to obtain percentage lipid within the liver. The statistical description of height, weight, age and BMI, line width and water suppression were recorded, and Pearson analysis was applied to test their relationships. Multiple linear stepwise regression was used to set the statistical model for the prediction of liver lipid content. Results: Age (39.1±12.6) years, body weight (64.4±10.4) kg, BMI (23.3±3.1) kg/m², line width (18.9±4.4) and the water suppression (90.7±6.5)% had significant correlation with liver lipid content (0.00 to 0.96%, median 0.02%), r were 0.11, 0.44, 0.40, 0.52, -0.73 respectively (P<0.05). But only age, BMI, line width, and the water suppression entered into the multiple linear regression equation. Liver lipid content prediction equation was as follows: Y = 1.395 - (0.021×water suppression) + (0.022×BMI) + (0.014×line width) - (0.004×age), and the coefficient of determination was 0.613, corrected coefficient of determination was 0.59. Conclusion: The regression model fitted well, since the variables of age, BMI, line width, and water suppression can explain about 60% of liver lipid content changes. (authors)
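
    The prediction equation reported in this abstract can be evaluated directly. The sketch below simply transcribes its coefficients; the function and parameter names are hypothetical, and the units are assumed to follow the abstract (water suppression in percent, BMI in kg/m², age in years, line width in the unstated unit used by the authors).

```python
def predicted_liver_lipid(water_suppression, bmi, line_width, age):
    """Evaluate the regression equation quoted in the abstract.

    Y = 1.395 - 0.021*water_suppression + 0.022*BMI
        + 0.014*line_width - 0.004*age
    """
    return (1.395
            - 0.021 * water_suppression
            + 0.022 * bmi
            + 0.014 * line_width
            - 0.004 * age)


# Inputs set to the group means reported in the abstract.
value_at_means = predicted_liver_lipid(
    water_suppression=90.7, bmi=23.3, line_width=18.9, age=39.1
)
```

    At those group means the equation evaluates to roughly 0.11. The negative coefficient on water suppression matches the sign of the reported correlation (r = -0.73).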

  14. Multiple linear regression to estimate time-frequency electrophysiological responses in single trials.

    Science.gov (United States)

    Hu, L; Zhang, Z G; Mouraux, A; Iannetti, G D

    2015-05-01

    Transient sensory, motor or cognitive events elicit not only phase-locked event-related potentials (ERPs) in the ongoing electroencephalogram (EEG), but also induce non-phase-locked modulations of ongoing EEG oscillations. These modulations can be detected when single-trial waveforms are analysed in the time-frequency domain, and consist in stimulus-induced decreases (event-related desynchronization, ERD) or increases (event-related synchronization, ERS) of synchrony in the activity of the underlying neuronal populations. ERD and ERS reflect changes in the parameters that control oscillations in neuronal networks and, depending on the frequency at which they occur, represent neuronal mechanisms involved in cortical activation, inhibition and binding. ERD and ERS are commonly estimated by averaging the time-frequency decomposition of single trials. However, their trial-to-trial variability, which can reflect physiologically important information, is lost by across-trial averaging. Here, we aim to (1) develop novel approaches to explore single-trial parameters (including latency, frequency and magnitude) of ERP/ERD/ERS; (2) disclose the relationship between estimated single-trial parameters and other experimental factors (e.g., perceived intensity). We found that (1) stimulus-elicited ERP/ERD/ERS can be correctly separated using principal component analysis (PCA) decomposition with Varimax rotation on the single-trial time-frequency distributions; (2) time-frequency multiple linear regression with dispersion term (TF-MLRd) enhances the signal-to-noise ratio of ERP/ERD/ERS in single trials, and provides an unbiased estimation of their latency, frequency, and magnitude at single-trial level; (3) these estimates can be meaningfully correlated with each other and with other experimental factors at single-trial level (e.g., perceived stimulus intensity and ERP magnitude). The methods described in this article allow exploring fully non-phase-locked stimulus-induced cortical oscillations, obtaining single-trial estimates of response latency, frequency, and magnitude. This permits within-subject statistical comparisons, correlation with pre-stimulus features, and integration of simultaneously-recorded EEG and fMRI. PMID:25665966

  15. Spontaneous regression of multiple pulmonary metastatic nodules of hepatocarcinoma: a case report

    International Nuclear Information System (INIS)

    Although rare, spontaneous regression of either primary or metastatic malignant tumors in the absence of, or with inadequate, therapy has been well documented. Since the earliest days of this century various malignant tumors have been reported to disappear spontaneously or to be arrested in their growth, but cases of hepatocarcinoma have been very rare. From the literature, we were able to find 5 previously reported cases of hepatocarcinoma which showed spontaneous regression at the primary site. Recently we have seen a case of multiple pulmonary metastatic nodules of hepatocarcinoma which completely regressed spontaneously and this forms the basis of the present case report. The patient was a 55-year-old male admitted to St. Mary's Hospital, Catholic Medical College because of a hard palpable mass in the epigastrium on April 26, 1978. The admission PA chest roentgenogram revealed multiple small nodular densities scattered throughout both lung fields, especially in the lower zones and toward the peripheral portion. A hepatoscintigram revealed a large cold area involving the left lobe and intermediate zone of the liver. Alpha-fetoprotein and hepatitis B serum antigen tests were positive whereas many other standard liver function tests turned out to be negative. A needle biopsy of the tumor revealed well-differentiated hepatocellular carcinoma. The patient was put under chemotherapy which consisted of 5 FU 500 mg intravenously for 6 days from April 28 to May 3, 1978. The patient was discharged after this single course of 5 FU treatment and was on a herbal medicine, the nature and quantity of which are obscure. No other specific treatment was given. The second admission took place on Dec. 3, 1980 because of irregularity in bowel habits and dyspepsia. A follow-up PA chest roentgenogram obtained on the second admission revealed complete disappearance of the previously noted multiple pulmonary nodular lesions (Fig. 3). Follow-up liver scan revealed persistence of the cold area in the left lobe with a slight decrease in size. The patient was discharged again without any specific prescription after confirming negative results of various clinical studies including upper GI series and colon study. At the time of finishing this paper the patient is doing well without apparent medical problems.

  16. Spontaneous regression of multiple pulmonary metastatic nodules of hepatocarcinoma: a case report

    Energy Technology Data Exchange (ETDEWEB)

    Bahk, Yong Whee; Park, Seog Hee; Kim, Sun Moo [St. Mary' s Hospital, Catholic Medical College, Seoul (Korea, Republic of)

    1981-09-15

    Although rare, spontaneous regression of either primary or metastatic malignant tumors in the absence of, or with inadequate, therapy has been well documented. Since the earliest days of this century various malignant tumors have been reported to disappear spontaneously or to be arrested in their growth, but cases of hepatocarcinoma have been very rare. From the literature, we were able to find 5 previously reported cases of hepatocarcinoma which showed spontaneous regression at the primary site. Recently we have seen a case of multiple pulmonary metastatic nodules of hepatocarcinoma which completely regressed spontaneously and this forms the basis of the present case report. The patient was a 55-year-old male admitted to St. Mary's Hospital, Catholic Medical College because of a hard palpable mass in the epigastrium on April 26, 1978. The admission PA chest roentgenogram revealed multiple small nodular densities scattered throughout both lung fields, especially in the lower zones and toward the peripheral portion. A hepatoscintigram revealed a large cold area involving the left lobe and intermediate zone of the liver. Alpha-fetoprotein and hepatitis B serum antigen tests were positive whereas many other standard liver function tests turned out to be negative. A needle biopsy of the tumor revealed well-differentiated hepatocellular carcinoma. The patient was put under chemotherapy which consisted of 5 FU 500 mg intravenously for 6 days from April 28 to May 3, 1978. The patient was discharged after this single course of 5 FU treatment and was on a herbal medicine, the nature and quantity of which are obscure. No other specific treatment was given. The second admission took place on Dec. 3, 1980 because of irregularity in bowel habits and dyspepsia. A follow-up PA chest roentgenogram obtained on the second admission revealed complete disappearance of the previously noted multiple pulmonary nodular lesions (Fig. 3). Follow-up liver scan revealed persistence of the cold area in the left lobe with a slight decrease in size. The patient was discharged again without any specific prescription after confirming negative results of various clinical studies including upper GI series and colon study. At the time of finishing this paper the patient is doing well without apparent medical problems.

  17. A non linear multiple regression approach for inferring the probability distribution of hydrological model errors

    Science.gov (United States)

    Montanari, A.

    2006-12-01

    This contribution introduces a statistically based approach for uncertainty assessment in hydrological modeling, in an optimality context. Indeed, in several real world applications, there is the need for the user to select a model that is deemed to be the best possible choice according to a given goodness-of-fit criterion. In this case, it is extremely important to assess the model uncertainty, intended as the range around the model output within which the measured hydrological variable is expected to fall with a given probability. This indication allows the user to quantify the risk associated with a decision that is based on the model response. The technique proposed here is carried out by inferring the probability distribution of the hydrological model error through a non linear multiple regression approach, depending on an arbitrary number of selected conditioning variables. These may include the current and previous model output as well as internal state variables of the model. The purpose is to indirectly relate the model error to the sources of uncertainty, through the conditioning variables. The method can be applied to any model of arbitrary complexity, including distributed approaches. The probability distribution of the model error is derived in the Gaussian space, through a meta-Gaussian approach. The normal quantile transform is applied in order to make the marginal probability distribution of the model error and the conditioning variables Gaussian. Then the above marginal probability distributions are related through the multivariate Gaussian distribution, whose parameters are estimated via multiple regression. Application of the inverse of the normal quantile transform allows the user to derive the confidence limits of the model output for an assigned significance level. The proposed technique is valid under statistical assumptions that are essentially those conditioning the validity of the multiple regression in the Gaussian space. Statistical tests are proposed and discussed in order to test the reliability of the estimated confidence limits. Applications in validation mode are shown for a rainfall-runoff model applied to an Italian river basin. It is significant to note that the optimality context does not refuse the concept of equifinality. The proposed approach is meant to be a tool for quantifying the uncertainty when the use of a fixed model is made necessary by the application requirements.
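
    The normal quantile transform mentioned in this abstract maps a sample to standard normal scores through its ranks. The sketch below is one common way to do this, under assumptions not stated in the abstract: the Weibull plotting position rank/(n + 1) is used, ties are not given special treatment, and the function name is hypothetical.

```python
from statistics import NormalDist


def normal_quantile_transform(values):
    """Map each value to the standard normal quantile of its rank.

    Uses the plotting position rank / (n + 1), which keeps every
    probability strictly inside (0, 1). Tied values receive
    consecutive ranks in their original order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    transformed = [0.0] * n
    for rank, index in enumerate(order, start=1):
        transformed[index] = std_normal.inv_cdf(rank / (n + 1))
    return transformed
```

    After the model error and each conditioning variable have been transformed this way, a linear regression among the transformed variables is the kind of fit the abstract refers to as being carried out in the Gaussian space; back-transforming requires storing the sorted original values.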

  18. Regression of uveal malignant melanomas following cobalt-60 plaque. Correlates between acoustic spectrum analysis and tumor regression

    International Nuclear Information System (INIS)

    Parameters derived from computer analysis of digital radio-frequency (rf) ultrasound scan data of untreated uveal malignant melanomas were examined for correlations with tumor regression following cobalt-60 plaque. Parameters included tumor height, normalized power spectrum and acoustic tissue type (ATT). Acoustic tissue type was based upon discriminant analysis of tumor power spectra, with spectra of tumors of known pathology serving as a model. Results showed ATT to be correlated with tumor regression during the first 18 months following treatment. Tumors with ATT associated with spindle cell malignant melanoma showed over twice the percentage reduction in height as those with ATT associated with mixed/epithelioid melanomas. Pre-treatment height was only weakly correlated with regression. Additionally, significant spectral changes were observed following treatment. Ultrasonic spectrum analysis thus provides a noninvasive tool for classification, prediction and monitoring of tumor response to cobalt-60 plaque

  19. Determination of ventilatory threshold through quadratic regression analysis.

    Science.gov (United States)

    Gregg, Joey S; Wyatt, Frank B; Kilgore, J Lon

    2010-09-01

    Ventilatory threshold (VT) has been used to measure physiological occurrences in athletes through models via gas analysis with limited accuracy. The purpose of this study is to establish a mathematical model to more accurately detect the ventilatory threshold using the ventilatory equivalent of carbon dioxide (VE/VCO2) and the ventilatory equivalent of oxygen (VE/VO2). The methodology is primarily a mathematical analysis of data. The raw data used were archived from the cardiorespiratory laboratory in the Department of Kinesiology at Midwestern State University. Procedures for archived data collection included breath-by-breath gas analysis averaged every 20 seconds (ParVoMedics, TrueMax 2400). A ramp protocol on a Velotron bicycle ergometer was used with work increased at 25 W·min⁻¹, beginning with 150 W, until volitional fatigue. The subjects consisted of 27 healthy, trained cyclists with ages ranging from 18 to 50 years. All subjects signed a university-approved informed consent before testing. Graphic scatterplots and statistical regression analyses were performed to establish the crossover and subsequent dissociation of VE/VO2 to VE/VCO2. A polynomial trend line along the scatterplots for VE/VO2 and VE/VCO2 was used because of the high correlation coefficient, the coefficient of determination, and trend line. The equations derived from the scatterplots and trend lines were quadratic in nature because they have a polynomial degree of 2. A graphing calculator in conjunction with a spreadsheet was used to find the exact point of intersection of the 2 trend lines. After the quadratic regression analysis, the exact point of VE/VO2 and VE/VCO2 crossover was established as the VT. This application will allow investigators to more accurately determine the VT in subsequent research. PMID:20802290
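
    The crossover of two quadratic trend lines described above can also be found analytically. The sketch below is an illustration, not the authors' procedure (they report using a graphing calculator and a spreadsheet). The coefficients are hypothetical and the function name is an assumption.

```python
import math


def quadratic_intersections(p, q):
    """Return the x-values where two quadratics a*x^2 + b*x + c intersect.

    p and q are (a, b, c) coefficient tuples of the two trend lines.
    """
    # The curves cross where their difference, itself a quadratic, is zero.
    a, b, c = (pi - qi for pi, qi in zip(p, q))
    if a == 0:
        # Difference is linear (or constant): at most one crossing.
        return [] if b == 0 else [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # the trend lines never cross
    root = math.sqrt(disc)
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})


# Hypothetical trend-line coefficients (a, b, c) as functions of work rate (W).
ve_vo2 = (0.0004, -0.15, 38.0)
ve_vco2 = (0.0002, -0.08, 33.0)
crossings = quadratic_intersections(ve_vo2, ve_vco2)
```

    With two crossings, only the one inside the tested work-rate range would be a candidate for the VT.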

  20. Experimental and regression analysis for multi cylinder diesel engine operated with hybrid fuel blends

    Directory of Open Access Journals (Sweden)

    Gopal Rajendiran

    2014-01-01

    The purpose of this research work is to build a multiple linear regression model for the characteristics of a multicylinder diesel engine using multicomponent blends (diesel-pungamia methyl ester-ethanol) as fuel. Nine blends were tested by varying diesel (100 to 10% by vol.) and biodiesel (80 to 10% by vol.) and keeping ethanol constant at 10%. The brake thermal efficiency, smoke, oxides of nitrogen, carbon dioxide, maximum cylinder pressure, angle of maximum pressure, and angles of 5% and 90% mass burning were predicted based on load, speed, and diesel and biodiesel percentage. To validate this regression model, another multicomponent fuel comprising diesel-palm methyl ester-ethanol was used in the same engine. Statistical analysis was carried out between predicted and experimental data for both fuels. The performance, emission and combustion characteristics of a multicylinder diesel engine using similar fuel blends can be predicted without the expense of experimentation.

  1. Multiple regression models for the prediction of the maximum obtainable thermal efficiency of organic Rankine cycles

    DEFF Research Database (Denmark)

    Larsen, Ulrik; Pierobon, Leonardo

    2014-01-01

    Much attention is focused on increasing the energy efficiency to decrease fuel costs and CO2 emissions throughout industrial sectors. The ORC (organic Rankine cycle) is a relatively simple but efficient process that can be used for this purpose by converting low and medium temperature waste heat to power. In this study we propose four linear regression models to predict the maximum obtainable thermal efficiency for simple and recuperated ORCs. A previously derived methodology is able to determine the maximum thermal efficiency among many combinations of fluids and processes, given the boundary conditions of the process. Hundreds of optimised cases with varied design parameters are used as observations in four multiple regression analyses. We analyse the model assumptions, prediction abilities and extrapolations, and compare the results with recent studies in the literature. The models are in agreement with the literature, and they present an opportunity for accurate prediction of the potential of an ORC to convert heat sources with temperatures from 80 to 360 C, without detailed knowledge or need for simulation of the process. © 2013 Elsevier Ltd. All rights reserved

  2. The use of weighted multiple linear regression to estimate QTL-by-QTL epistatic effects

    Scientific Electronic Library Online (English)

    Jan, Bocianowski.

    Knowledge of the nature and magnitude of gene effects, as well as their contribution to the control of metric traits, is important in formulating efficient breeding programs for the improvement of plant genetics. Information concerning a genetic parameter such as the additive-by-additive epistatic effect can be useful in traditional breeding. This report describes the results obtained by applying weighted multiple linear regression to estimate the parameter connected with an additive-by-additive epistatic interaction. Three weight variants were used: (1) standard weights based on estimated variances, (2) different weights for minimal, maximal and other lines, and (3) different weights for extreme and other lines. The approach described here combines two methods of estimation, one based on phenotypic observations and the other using molecular marker data. The comparison was done using Monte Carlo simulations. The results show that the application of weighted regression to the marker data yielded estimates similar to those obtained by phenotypic methods.

  3. Random regressions models to describe the genetic variation of milk yield over multiple parities in Buffaloes

    Directory of Open Access Journals (Sweden)

    H. Tonhati

    2010-02-01

    The objectives of this study were to estimate (co)variance functions for additive genetic and permanent environmental effects, as well as the genetic parameters for milk yield over multiple parities, using random regression models (RRM). Records of 4,757 complete lactations of Murrah breed buffaloes from 12 herds were analyzed. Ages at calving were between 2 and 11 years. The model included the additive genetic and permanent environmental random effects and the fixed effects of contemporary groups (herd, year and calving season) and milking frequency (1 or 2). A cubic regression on Legendre orthogonal polynomials of ages was used to model the mean trend. The additive genetic and permanent environmental effects were modeled by Legendre orthogonal polynomials. Residual variances were considered homogeneous or heterogeneous, modeled through variance functions or step functions with 5, 7 or 10 classes. Results from Akaike’s information criterion and Schwarz’s Bayesian information criterion indicated that an RRM considering a third-order polynomial for the additive genetic and permanent environmental effects and a step function with 5 classes for residual variances fitted best. Heritability estimates obtained by this model varied from 0.10 to 0.28. Genetic correlations were high between consecutive ages, but decreased when intervals between ages increased.

  4. Evaluation of soil quality using multiple lineal regression based on physical, chemical and biochemical properties.

    Science.gov (United States)

    Zornoza, Raúl; Mataix-Solera, Jorge; Guerrero, César; Arcenegui, Victoria; García-Orenes, Fuensanta; Mataix-Beneyto, Jorge; Morugán, Alicia

    2007-05-25

    The aim of this work is to obtain an expression using multiple lineal regressions (MLR) to evaluate environmental soil quality. We used four forest soils from Alicante province (SE Spain), comprising three Mollisols and one Entisol, developed under natural vegetation with minimum human disturbance, considered as reference soils of high quality. We carried out MLR integrating different soil physical, chemical and biochemical properties, and we searched those regressions with Kjeldahl nitrogen (N(k)), soil organic carbon (SOC) or microbial biomass carbon (MBC) as predicted parameter. We observed that Mollisols and Entisols presented different relationships among their properties. Thus, we searched different equations for both groups of soils. The selected equation for Mollisols was N=0.448 (P) + 0.017 (water holding capacity) + 0.410(phosphatase) - 0.567 (urease) + 0.001 (MBC) + 0.410 (beta - glucosidase) - 0.980, and for the Entisol SOC = 4.247 (P) + 8.183 (beta-glucosidase) -7.949 (urease) + 17.333. Equations were applied to samples from two forest soils in advanced degree of degradation, one for Mollisols and the other one for the Entisol. We observed a clear deviation in the predicted parameters values related to the real properties. The obtained results show that MLR is a good tool for soil quality evaluation, because it seems to be capable of reflecting the balance among its properties, as well as deviations from it. PMID:17321568
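
    The two regression equations quoted in this abstract can be evaluated directly, as in the sketch below. The coefficients are copied from the abstract. The function and argument names, and the relative-deviation helper, are assumptions added for illustration. The abstract does not give the measurement units of the predictors, so inputs must be in whatever units the authors used.

```python
def predicted_kjeldahl_n(p, whc, phosphatase, urease, mbc, beta_glucosidase):
    """Mollisol equation from the abstract (predicts Kjeldahl nitrogen)."""
    return (0.448 * p + 0.017 * whc + 0.410 * phosphatase
            - 0.567 * urease + 0.001 * mbc + 0.410 * beta_glucosidase
            - 0.980)


def predicted_soc(p, beta_glucosidase, urease):
    """Entisol equation from the abstract (predicts soil organic carbon)."""
    return 4.247 * p + 8.183 * beta_glucosidase - 7.949 * urease + 17.333


def relative_deviation(measured, predicted):
    """Hypothetical helper: signed deviation of a measurement from the model."""
    return (measured - predicted) / predicted
```

    A large relative deviation for a sample would correspond to the "clear deviation" the authors report for degraded soils. Any threshold for what counts as large is not given in the abstract.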

  5. Multiple regression equations modelling of groundwater of Ajmer-Pushkar railway line region, Rajasthan (India).

    Science.gov (United States)

    Mathur, Praveen; Sharma, Sarita; Soni, Bhupendra

    2010-01-01

    In the present work, an attempt is made to formulate multiple regression equations using the all-possible-regressions method for groundwater quality assessment of the Ajmer-Pushkar railway line region in pre- and post-monsoon seasons. Correlation studies revealed the existence of linear relationships (r > 0.7) for electrical conductivity (EC), total hardness (TH) and total dissolved solids (TDS) with other water quality parameters. The highest correlation was found between EC and TDS (r = 0.973). EC showed highly significant positive correlation with Na, K, Cl, TDS and total solids (TS). TH showed highest correlation with Ca and Mg. TDS showed significant correlation with Na, K, SO4, PO4 and Cl. The study indicated that most of the contamination present was water soluble or ionic in nature. Mg was present as MgCl2; K mainly as KCl and K2SO4, and Na was present as the salts of Cl, SO4 and PO4. On the other hand, F and NO3 showed no significant correlations. The r2 values and F values (at 95% confidence limit, alpha = 0.05) for the modelled equations indicated a high degree of linearity among independent and dependent variables. Also, the percentage error between calculated and experimental values was contained within a ±15% limit. PMID:21114099

  6. Estimation of Saturation Percentage of Soil Using Multiple Regression, ANN, and ANFIS Techniques

    Directory of Open Access Journals (Sweden)

    Khaled Ahmad Aali

    2009-07-01

    The saturation percentage (SP) of soils is an important index in hydrological studies. In this paper, artificial neural networks (ANNs), multiple regression (MR), and an adaptive neural-based fuzzy inference system (ANFIS) were used to estimate the saturation percentage of soils collected from the Boukan region in the northwestern part of Iran. Percent clay, silt, sand and organic carbon (OC) were used to develop the applied methods. In addition, the contribution of each input variable to the estimation of the SP index was assessed. Two performance functions, namely root mean square error (RMSE) and determination coefficient (R2), were used to evaluate the adequacy of the models. The ANFIS method was found to be superior to the other methods. It is therefore proposed that the ANFIS model can be used for reasonable estimation of SP values of soils.
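
    The two performance functions named in this abstract can be sketched as follows. The R2 version assumes the common definition 1 - SS_res / SS_tot. The abstract does not say whether the authors used this form or the squared correlation coefficient.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Determination coefficient, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

    A lower RMSE and a higher R2 on held-out samples would both favour one model over another, which is the kind of comparison the abstract reports.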

  7. Regression analysis exploring teacher impact on student FCI post scores

    Science.gov (United States)

    Mahadeo, Jonathan V.; Manthey, Seth R.; Brewe, Eric

    2013-01-01

    High School Modeling Workshops are designed to improve high school physics teachers' understanding of physics and how to teach using the Modeling method. The basic assumption is that the teacher plays a critical role in their students' physics education. This study investigated teacher impacts on students' Force Concept Inventory scores, (FCI), with the hopes of identifying quantitative differences between teachers. This study examined student FCI scores from 18 teachers with at least a year of teaching high school physics. This data was then evaluated using a General Linear Model (GLM), which allowed for a regression equation to be fitted to the data. This regression equation was used to predict student post FCI scores, based on: teacher ID, student pre FCI score, gender, and representation. The results show 12 out of 18 teachers significantly impact their student post FCI scores. The GLM further revealed that of the 12 teachers only five have a positive impact on student post FCI scores. Given these differences among teachers it is our intention to extend our analysis to investigate pedagogical differences between them.

  8. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Steed, Chad A [ORNL; SwanII, J. Edward [Mississippi State University (MSU); Fitzpatrick, Patrick J. [Mississippi State University (MSU); Jankun-Kelly, T.J. [Mississippi State University (MSU)

    2012-02-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  9. Node-Mapping EIT Method Based on Regression Analysis

    Directory of Open Access Journals (Sweden)

    Jianjun Zhang

    2012-12-01

    Medical imaging intuitively shows the morphology and function of the body's internal organs. Electrical Impedance Tomography (EIT) is an emerging medical imaging technology. It has the advantages of simple structure, low cost, no radiological hazard and non-invasiveness. EIT can not only reconstruct anatomical images from the impedance differences between tissues, but can also achieve functional imaging from the impedance changes of tissues and organs in different physiological and pathological states, and it is suitable for long-term monitoring. The solution is approximate because the inverse problem is ill-posed. Because image accuracy conflicts with computation speed, EIT is still unable to meet the requirements of practical application. By using a regression analysis algorithm, the Node-Mapping Method calculates only the node potentials. The speed of operation and the quality of the reconstructed image are greatly improved.

  10. A Quantile Regression Analysis of Micro-lending's Poverty Impact

    Directory of Open Access Journals (Sweden)

    Stephen W. Polk

    2012-07-01

    This paper aims to evaluate the impact of a microlending program on ameliorating measured poverty within its client population, with the aim of improving that impact. We analyze over 18,000 women micro-finance clients of the Negros Women for Tomorrow Foundation (NWTF), a database using the Progress out of Poverty (PPI) Scorecard as a measure of poverty. Analysis using both OLS and quantile multivariate regression models shows how observable borrower attributes affect the ability of clients to reduce their measured poverty. Loan size, duration, and the economic activity supported all have strongly identifiable effects. Moreover, estimates suggest which among the poor are receiving the greatest effective help from the program. Results offer specific advice to the NWTF and other micro-lenders: impact is greatest with fewer, larger loans in particular economic sectors (sari-sari, service and trade) but requires patience, as each additional year increases the client’s average change in poverty score.
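
    Quantile regression differs from OLS in the loss it minimises. The sketch below shows that loss (often called the pinball or check loss) for an intercept-only model. The data are hypothetical, and the brute-force search over sample values stands in for a real quantile regression solver.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average quantile (pinball) loss at quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-predictions are weighted by tau, over-predictions by 1 - tau.
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(y_true)


def best_constant(y, tau):
    """Sample value that minimises pinball loss as a constant prediction."""
    return min(y, key=lambda c: pinball_loss(y, [c] * len(y), tau))


# Hypothetical changes in poverty score for nine clients.
changes = [-3, 0, 1, 2, 4, 6, 9, 15, 30]
median_fit = best_constant(changes, 0.5)
```

    At tau = 0.5 the minimiser is the sample median, which is why quantile models can describe clients at different points of the outcome distribution rather than only the mean.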

  11. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Steed, Chad A [ORNL; SwanII, J. Edward [Mississippi State University (MSU); Fitzpatrick, Patrick J. [Mississippi State University (MSU); Jankun-Kelly, T.J. [Mississippi State University (MSU)

    2013-01-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. This chapter provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  12. Logistic regression analysis on the risk factors of radiation pneumonitis

    International Nuclear Information System (INIS)

    Objective: To identify the risk factors of radiation pneumonitis (RP). Methods: A retrospective study was conducted on 101 patients with radiation pneumonitis using SPSS 8.0 software. Factors evaluated included: gender, age, pathology, clinical stage, irradiation dose, irradiation field size, history of smoking, cardiovascular disease, bronchitis, surgery, chemotherapy, lung infection, atelectasis, obstructive infection and pleural effusion. Univariate analysis was performed using the Chi-Square test and multivariate analysis was performed using a logistic regression model. Results: Univariate analysis showed that 10 factors were significantly related to radiation pneumonitis: pulmonary infection, atelectasis, obstructive infection, cardiovascular disease, bronchitis, chemotherapy, irradiation dose, number of days of radiation and irradiation field size. Multivariate analysis showed that 9 factors were independent factors: pulmonary infection, obstructive infection, atelectasis, pleural effusion, bronchitis, cardiovascular disease, chemotherapy, irradiation dose, and irradiation field size. Conclusion: Comprehensive consideration of the accompanying disease, chemotherapy, dose, field size, etc. during the planning of radiotherapy can minimize the possibility of developing radiation pneumonitis.

  13. Gaze tracking based on active appearance model and multiple support vector regression on mobile devices

    Science.gov (United States)

    Lee, Eui Chul; Ko, You Jin; Park, Kang Ryoung

    2009-07-01

    Gaze tracking technology is a convenient interfacing method for mobile devices. Most previous studies used a large-sized desktop or head-mounted display. In this study, we propose a novel gaze tracking method using an active appearance model (AAM) and multiple support vector regression (SVR) on a mobile device. Our research has four main contributions. First, in calculating the gaze position, the amount of facial rotation and translation based on four feature values is computed using facial feature points detected by AAM. Second, the amount of eye rotation based on two feature values is computed for measuring eye gaze position. Third, to compensate for the fitting error of an AAM in facial rotation, we use the adaptive discrete Kalman filter (DKF), which applies a different velocity of state transition matrix to the facial feature points. Fourth, we obtain gaze position on a mobile device based on multiple SVR by separating the rotation and translation of face and eye rotation. Experimental results show that the root mean square (rms) gaze error is 36.94 pixels on the 4.5-in. screen of a mobile device with a screen resolution of 800×600 pixels.

  14. Accounting for data errors discovered from an audit in multiple linear regression.

    Science.gov (United States)

    Shepherd, Bryan E; Yu, Chang

    2011-09-01

    A data coordinating team performed onsite audits and discovered discrepancies between the data sent to the coordinating center and that recorded at sites. We present statistical methods for incorporating audit results into analyses. This can be thought of as a measurement error problem, where the distribution of errors is a mixture with a point mass at 0. If the error rate is nonzero, then even if the mean of the discrepancy between the reported and correct values of a predictor is 0, naive estimates of the association between two continuous variables will be biased. We consider scenarios where there are (1) errors in the predictor, (2) errors in the outcome, and (3) possibly correlated errors in the predictor and outcome. We show how to incorporate the error rate and magnitude, estimated from a random subset (the audited records), to compute unbiased estimates of association and proper confidence intervals. We then extend these results to multiple linear regression where multiple covariates may be incorrect in the database and the rate and magnitude of the errors may depend on study site. We study the finite sample properties of our estimators using simulations, discuss some practical considerations, and illustrate our methods with data from 2815 HIV-infected patients in Latin America, of whom 234 had their data audited using a sequential auditing plan. PMID:21281274

  15. The working capacity of the alcohol abuser. Prognostic multiple regression analyses.

    Science.gov (United States)

    Hörnquist, J O; Hansson, B; Akerlind, I

    1988-01-01

    Thirty-four alcohol abusers treated at various rehabilitational locations in Sweden were the subjects of an extensive and interdisciplinary study. Thereafter, the working capacity of each subject was followed over a two-year period. Twelve individuals regained capacity for work, either partially or completely. Thirteen subjects were sick-listed or remained unemployed. The remaining 9 abusers were quickly and unexpectedly pensioned. In order to predict the rehabilitational outcome from the interdisciplinary findings at the onset of the time period, stepwise multiple regression analyses were performed. Those who felt less lonely and had no drinking buddies appeared most likely to be rehabilitated vocationally. This core combination of characteristics accounted for about one third of the variability in the outcome criterion, either trichotomized or dichotomized. Rehabilitational success could be even more strongly predicted by the appearance of such features as less prolonged abuse, social introversion (not cohabiting, reserved attitudes and having only a few friends), orientation to the future and a history of psychiatric care. An elevated level of plasma albumin and a decreased plasma-IgA value raised additionally the multiple correlation coefficient. PMID:3347824

  16. Low-Cost Housing in Sabah, Malaysia: A Regression Analysis

    Directory of Open Access Journals (Sweden)

    Dullah Mulok

    2009-02-01

    Low-cost housing plays a vital role in the development process, especially in providing accommodation to those who are less fortunate and to the lower income group. This effort is also a step in overcoming the squatter problem, which could cripple the competitive drive of the local community, especially in the state of Sabah, Malaysia. This article looks into the factors influencing low-cost housing in Sabah, namely the government’s budget (allocation for low-cost housing projects) and Sabah’s total population. At the same time, this study attempts to show the implications of the development and economic crises which occurred during the period 1971 to 2000 for the provision of low-cost houses in Sabah. Empirical analyses were conducted using the multiple linear regression method, stepwise regression and the dummy variable approach to demonstrate the link. The empirical result shows that the government’s budget for low-cost housing is the main contributor to the provision of low-cost housing in Sabah. The empirical result also suggests that economic growth, namely Gross Domestic Product (GDP), did not have a significant effect on low-cost housing in Sabah. However, almost all major crises that have beset Malaysia’s economy had a significant and consistent effect on low-cost housing in Sabah, especially the financial crisis which occurred in mid-1997.

  17. Multiple regression equations to estimate the content of breast muscles, meat, and fat in Muscovy ducks.

    Science.gov (United States)

    Kleczek, K; Wawro, K; Wilkiewicz-Wawro, E; Makowski, W

    2006-07-01

    The aim of the present study was to derive multiple regression equations for in vivo estimation of the carcass lean and fat content in Muscovy ducks. The experimental materials consisted of 240 White Muscovy ducklings (120 male and 120 female). One hundred sixteen females aged 10 wk and 112 males aged 12 wk were slaughtered. Before slaughter the ducks were weighed, and the following body measurements were taken: humerus length, drumstick length, chest girth, breast-bone crest length, width between the humeral bones, chest depth, and breast muscle thickness. The coefficients of simple correlation between carcass tissue components and body measurements were calculated. It was found that live body weight was highly correlated with the weights of all tissue components (r = 0.701 to 0.857). In males a significant interrelation was found between breast muscle weight and all body measurements, whereas in females breast muscle weight was correlated with breast-bone crest length, chest girth, width between the humeral bones, chest depth, and breast muscle thickness only. In both males and females the carcass lean content was closely correlated with drumstick length, breast-bone crest length, chest girth, and width between the humeral bones. In drakes the carcass fat content was closely correlated with all body measurements, whereas in hens significant correlations were observed between the carcass fat content and chest girth, width between the humeral bones, and chest depth only. The coefficients of simple correlation between the percentages of carcass tissue components and body measurements were generally low and statistically nonsignificant. Twelve multiple regression equations formulated based on the body measurements of live ducks were verified with respect to the accuracy of estimation of the content of breast muscles, meat, and fat with skin in the carcass. 
These equations give small SE of the estimate (Sy = 23.3 to 83.8 g), high values of coefficients of multiple correlation between the dependent variable and the set of independent variables, and high values of determination coefficients. PMID:16830875

  18. ROC curve regression analysis: the use of ordinal regression models for diagnostic test assessment.

    OpenAIRE

    Tosteson, A. N.; Weinstein, M. C.; Wittenberg, J.; Begg, C. B.

    1994-01-01

    Diagnostic tests commonly are characterized by their true positive (sensitivity) and true negative (specificity) classification rates, which rely on a single decision threshold to classify a test result as positive. A more complete description of test accuracy is given by the receiver operating characteristic (ROC) curve, a graph of the false positive and true positive rates obtained as the decision threshold is varied. A generalized regression methodology, which uses a class of ordinal regre...

  19. Development of Multiple Regression and Neural Network Models for Assessment of Blasting Dust at a Large Surface Coal Mine

    Directory of Open Access Journals (Sweden)

    T.A. Renaldy

    2011-01-01

    oped for prediction of particulate matter. The performance of the multiple regression models was assessed. For the development of neural network models, a feed-forward network with a back-propagation learning algorithm was used. The performance of the neural network was determined in terms of correlation coefficient (R) and Mean Square Error (MSE). The optimum number of hidden neurons was determined by seeking the lowest value of MSE and the highest value of R. The results indicated that the network can predict particulate concentrations better than multiple regression models.

  20. Application of a Bayesian method for optimal subset regression to linkage analysis of Q1 and Q2.

    Science.gov (United States)

    Suh, Y J; Finch, S J; Mendell, N R

    2001-01-01

    We explore an approach that allows us to consider a trait for which we wish to determine the optimal subset of markers out of a set of p ≥ 3 candidate markers being considered in a linkage analysis. The most effective analysis would find the model that only includes the q markers closest to the q major genes which determine the trait. Finding this optimal model using classical "frequentist" multiple regression techniques would require consideration of all 2^p possible subsets. We apply the work of George and McCulloch [J Am Stat Assoc 88:881-9, 1993], who have developed a Bayesian approach to optimal subset selection regression, to a modification of the Haseman-Elston linkage statistic [Elston et al., Genet Epidemiol 19:1-17, 2000] in the analysis of the two quantitative traits simulated in Problem 2. The results obtained using this Bayesian method are compared to those obtained using (1) multiple regression and (2) the modified Haseman-Elston method (single variable regression analysis). We note upon doing this that for both Q1 and Q2, (1) we have extremely low power with all methods using the samples as given and have to resort to combining several simulated samples in order to have power of 50%, (2) the multivariate analysis does not have greater power than the univariate analysis for these traits, and (3) the Bayesian approach identifies the correct model more frequently than the frequentist approaches but shows no clear advantage over the multivariate approach. PMID:11793765
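
    The size of the classical search space mentioned in this abstract is easy to see by enumeration. The sketch below lists every subset of a hypothetical set of four markers, which gives 2^4 = 16 candidate models. The marker labels and function name are assumptions.

```python
from itertools import combinations


def all_subsets(markers):
    """Yield every subset of the candidate markers, including the empty one."""
    for size in range(len(markers) + 1):
        for subset in combinations(markers, size):
            yield subset


markers = ["m1", "m2", "m3", "m4"]  # hypothetical marker labels
subsets = list(all_subsets(markers))
```

    Because the count doubles with each added marker, exhaustive fitting quickly becomes impractical. The abstract cites this as the reason to turn to a Bayesian subset selection approach.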

  1. Prediction of radiation levels in residences: A methodological comparison of CART [Classification and Regression Tree Analysis] and conventional regression

    International Nuclear Information System (INIS)

    In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and ?200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs

  2. The analysis of kernel ridge regression learning algorithm.

    OpenAIRE

    Pozdnoukhov, Alexei

    2002-01-01

    The paper presents Kernel Ridge Regression, a nonlinear extension of the well known statistical model of ridge regression. New insights on the method are also presented. In particular, the connection between ridge regression and local translation-invariant squared loss minimization algorithm is shown. An iterative training algorithm is proposed, that allows training the KRR for large datasets. The training time is empirically found to scale quadratically with the number of samples. The applic...

  3. Semiparametric regression for periodic longitudinal hormone data from multiple menstrual cycles.

    Science.gov (United States)

    Zhang, D; Lin, X; Sowers, M

    2000-03-01

    We consider semiparametric regression for periodic longitudinal data. Parametric fixed effects are used to model the covariate effects and a periodic nonparametric smooth function is used to model the time effect. The within-subject correlation is modeled using subject-specific random effects and a random stochastic process with a periodic variance function. We use maximum penalized likelihood to estimate the regression coefficients and the periodic nonparametric time function, whose estimator is shown to be a periodic cubic smoothing spline. We use restricted maximum likelihood to simultaneously estimate the smoothing parameter and the variance components. We show that all model parameters can be easily obtained by fitting a linear mixed model. A common problem in the analysis of longitudinal data is to compare the time profiles of two groups, e.g., between treatment and placebo. We develop a scaled chi-squared test for the equality of two nonparametric time functions. The proposed model and the test are illustrated by analyzing hormone data collected during two consecutive menstrual cycles and their performance is evaluated through simulations. PMID:10783774

  4. Regression Analysis between Properties of Subgrade Lateritic Soil

    Directory of Open Access Journals (Sweden)

    Afeez Adefemi BELLO

    2012-12-01

    Full Text Available This study examines the use of regression analysis to relate index properties to the California Bearing Ratio (CBR) of some lateritic soils within Osogbo town in South Western Nigeria. To support a sound conclusion, lateritic soil samples were collected from eight (8) different borrow pits within the town, and various laboratory tests, including Atterberg Limits, Gradation analysis, California Bearing Ratio, Compaction and Specific Gravity, were performed on the soil samples. Various linear relationships between index properties and CBR of the samples were investigated, and predictive equations estimating CBR from the experimental index values were developed. The findings indicate that good correlation exists between the two groups (i.e., index properties and CBR values). However, the CBR values computed from the models are to be used only for preliminary purposes, in view of their simplicity and economy, and are not acceptable alternatives to laboratory testing because of the anisotropic nature of lateritic soil and its heterogeneity.

  5. Use of generalized regression models for the analysis of stress-rupture data

    International Nuclear Information System (INIS)

    The design of components for operation in an elevated-temperature environment often requires a detailed consideration of the creep and creep-rupture properties of the construction materials involved. Techniques for the analysis and extrapolation of creep data have been widely discussed. The paper presents a generalized regression approach to the analysis of such data. This approach has been applied to multiple heat data sets for types 304 and 316 austenitic stainless steel, ferritic 2 1/4 Cr-1 Mo steel, and the high-nickel austenitic alloy 800H. Analyses of data for single heats of several materials are also presented. All results appear good. The techniques presented represent a simple yet flexible and powerful means for the analysis and extrapolation of creep and creep-rupture data.

  6. The role of multiple regression and exploratory data analysis in the development of leukemia incidence risk models for comparison of radionuclide air stack emissions from nuclear and coal power industries

    International Nuclear Information System (INIS)

    Risk associated with power generation must be identified to make intelligent choices between alternate power technologies. Radionuclide air stack emissions for a single coal plant and a single nuclear plant are used to compute the single plant leukemia incidence risk and total industry leukemia incidence risk. Leukemia incidence is the response variable as a function of radionuclide bone dose for the six proposed dose response curves considered. During normal operation a coal plant has higher radionuclide emissions than a nuclear plant and the coal industry has a higher leukaemia incidence risk than the nuclear industry, unless a nuclear accident occurs. Variation of nuclear accident size allows quantification of the impact of accidents on the total industry leukemia incidence risk comparison. The leukemia incidence risk is quantified as the number of accidents of a given size for the nuclear industry leukemia incidence risk to equal the coal industry leukemia incidence risk. The general linear model is used to develop equations that relate the accident frequency required for equal industry risks to the magnitude of the nuclear emission. Exploratory data analysis revealed that the relationship between the natural log of accident number versus the natural log of accident size is linear. (Author)
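
    The log-log relationship reported above can be illustrated with an ordinary least-squares fit on log-transformed values. This is a minimal sketch: the accident sizes and accident numbers are synthetic values generated from an assumed power law (coefficient 500, exponent -0.8), not data from the study, and `fit_line` is a hypothetical helper name.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data from an assumed power law: number = 500 * size ** -0.8.
sizes = [1.0, 10.0, 100.0, 1000.0]
numbers = [500.0 * s ** -0.8 for s in sizes]

# Fit ln(number) = a + b * ln(size); b is the exponent, exp(a) the coefficient.
a, b = fit_line([math.log(s) for s in sizes], [math.log(n) for n in numbers])
print(b, math.exp(a))
```

    A straight line in log-log space corresponds to a power law in the original units, so the fitted slope is read as the exponent and the exponentiated intercept as the multiplicative constant.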

  7. Multiple Regression and Mediator Variables can be used to Avoid Double Counting when Economic Values are Derived using Stochastic Herd Simulation

    DEFF Research Database (Denmark)

    Østergaard, Søren; Ettema, Jehan Frans

    Multiple regression and model building with mediator variables was addressed to avoid double counting when economic values are estimated from data simulated with herd simulation modeling (using the SimHerd model). The simulated incidence of metritis was analyzed statistically as the independent variable, while using the traits representing the direct effects of metritis on yield, fertility and occurrence of other diseases as mediator variables. The economic value of metritis was estimated to be €78 per 100 cow-years for each 1% increase of metritis in the period of 1-100 days in milk in multiparous cows. The merit of using this approach was demonstrated since the economic value of metritis was estimated to be 81% higher when no mediator variables were included in the multiple regression analysis

  8. Mapping of multiple quantitative trait loci by simple regression in half-sib designs

    OpenAIRE

    Koning, D. J.; Schulman, N.; Elo, K.; Moisio, S.; Kinos, R.; Vilkki, J.; Maki-tanila, A.

    2001-01-01

    Detection of QTL in outbred half-sib family structures has mainly been based on interval mapping of single QTL on individual chromosomes. Methods to account for linked and unlinked QTL have been developed, but most of them are only applicable in designs with inbred species or pose great demands on computing facilities. This study describes a strategy that allows for rapid analysis, involving multiple QTL, of complete genomes. The methods combine information from individual analyses after whic...

  9. Adaptive regression analysis: theory and applications in econometrics

    Directory of Open Access Journals (Sweden)

    J. García Pérez

    2003-01-01

    Full Text Available In this work we (a) discuss some theoretical and computational difficulties of regression analysis of dependences that describe the behaviour of heterogeneous systems, (b) offer a set of new techniques adapted to regression analysis of heterogeneous dependences, and (c) demonstrate the advantages of applying these new techniques in econometrics.

  10. Use of a neural network and a multiple regression model to predict histologic grade of astrocytoma from MRI appearances

    International Nuclear Information System (INIS)

    Several MRI features of supratentorial astrocytomas are associated with high histologic grade by statistically significant p values. We sought to apply this information prospectively to a group of astrocytomas in the prediction of tumor grade. We used 10 MRI features of fibrillary astrocytomas from 52 patient studies to develop neural network and multiple linear regression models for practical use in predicting tumor grade. The models were tested prospectively on MR images from 29 patient studies. The performance of the models was compared against that of a radiologist. Neural network accuracy was 61 % in distinguishing between low and high grade tumors. Multiple linear regression achieved an accuracy of 59 %. Assessment of the images by a radiologist yielded 57 % accuracy. We conclude that while certain MRI parameters may be statistically related to astrocytoma histologic grade, neural network and linear regression models cannot reliably use them to predict tumor grade. (orig.)

  11. Oil and gas pipeline construction cost analysis and developing regression models for cost estimation

    Science.gov (United States)

    Thaduri, Ravi Kiran

    In this study, cost data for 180 pipelines and 136 compressor stations have been analyzed. On the basis of the distribution analysis, regression models have been developed. Material, Labor, ROW and miscellaneous costs make up the total cost of a pipeline construction. The pipelines are analyzed based on different pipeline lengths, diameter, location, pipeline volume and year of completion. In a pipeline construction, labor costs dominate the total costs with a share of about 40%. Multiple non-linear regression models are developed to estimate the component costs of pipelines for various cross-sectional areas, lengths and locations. The Compressor stations are analyzed based on the capacity, year of completion and location. Unlike the pipeline costs, material costs dominate the total costs in the construction of compressor station, with an average share of about 50.6%. Land costs have very little influence on the total costs. Similar regression models are developed to estimate the component costs of compressor station for various capacities and locations.

  12. Current misuses of multiple regression for investigating bivariate hypotheses: an example from the organizational domain.

    Science.gov (United States)

    O'Neill, Thomas A; McLarnon, Matthew J W; Schneider, Travis J; Gardner, Robert C

    2014-09-01

    By definition, multiple regression (MR) considers more than one predictor variable, and each variable's beta will depend on both its correlation with the criterion and its correlation with the other predictor(s). Despite ad nauseam coverage of this characteristic in organizational psychology and statistical texts, researchers' applications of MR in bivariate hypothesis testing has been the subject of recent and renewed interest. Accordingly, we conducted a targeted survey of the literature by coding articles, covering a five-year span from two top-tier organizational journals, that employed MR for testing bivariate relations. The results suggest that MR coefficients, rather than correlation coefficients, were most common for testing hypotheses of bivariate relations, yet supporting theoretical rationales were rarely offered. Regarding the potential impact on scientific advancement, in almost half of the articles reviewed (44 %), at least one conclusion of each study (i.e., that the hypothesis was or was not supported) would have been different, depending on the author's use of correlation or beta to test the bivariate hypothesis. It follows that inappropriate decisions to interpret the correlation versus the beta will affect the accumulation of consistent and replicable scientific evidence. We conclude with recommendations for improving bivariate hypothesis testing. PMID:24142838
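
    For the two-predictor case, the dependence of each beta on both sets of correlations can be written out directly. The sketch below uses the standard closed form for standardized coefficients, beta1 = (r_y1 - r_y2 * r_12) / (1 - r_12^2); the correlation values are invented for illustration and do not come from the surveyed articles.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized regression weights for two predictors from pairwise correlations.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    if denom == 0.0:
        raise ValueError("predictors are perfectly collinear")
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2


# Invented correlations: predictor 1 correlates 0.3 with the criterion.
beta1, beta2 = standardized_betas(r_y1=0.3, r_y2=0.5, r_12=0.6)
print(round(beta1, 3), round(beta2, 3))
```

    With these values predictor 1 has a bivariate correlation of 0.3 with the criterion, yet its beta is zero once predictor 2 is in the model, so a hypothesis about the bivariate relation would be judged differently depending on which statistic is read.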

  13. Dental malocclusion and body posture in young subjects: A multiple regression study

    Scientific Electronic Library Online (English)

    Giuseppe, Perinetti; Luca, Contardo; Armando, Silvestrini-Biavati; Lucia, Perdoni; Attilio, Castaldo.

    Full Text Available SciELO Brazil | Language: English Abstract in English OBJECTIVES: Controversial results have been reported on potential correlations between the stomatognathic system and body posture. We investigated whether malocclusal traits correlate with body posture alterations in young subjects to determine possible clinical applications. METHODS: A total of 122 [...] subjects, including 86 males and 36 females (age range of 10.8-16.3 years), were enrolled. All subjects tested negative for temporomandibular disorders or other conditions affecting the stomatognathic systems, except malocclusion. A dental occlusion assessment included phase of dentition, molar class, overjet, overbite, anterior and posterior crossbite, scissorbite, mandibular crowding and dental midline deviation. In addition, body posture was recorded through static posturography using a vertical force platform. Recordings were performed under two conditions, namely, i) mandibular rest position (RP) and ii) dental intercuspidal position (ICP). Posturographic parameters included the projected sway area and velocity and the antero-posterior and right-left load differences. Multiple regression models were run for both recording conditions to evaluate associations between each malocclusal trait and posturographic parameters. RESULTS: All of the posturographic parameters had large variability and were very similar between the two recording conditions. Moreover, a limited number of weakly significant correlations were observed, mainly for overbite and dentition phase, when using multivariate models. CONCLUSION: Our current findings, particularly with regard to the use of posturography as a diagnostic aid for subjects affected by dental malocclusion, do not support the existence of clinically relevant correlations between malocclusal traits and body posture.

  14. Multiple regression, ANN (RBF, MLP) and ANFIS models for prediction of swell potential of clayey soils

    Science.gov (United States)

    Yilmaz, Isik; Kaynar, Oguz

    2010-05-01

    In recent years, new techniques such as artificial neural networks and fuzzy inference systems have been employed to develop predictive models for estimating needed parameters. Soft computing techniques are now being used as alternative statistical tools. Determination of the swell potential of soil is difficult, expensive, time consuming and involves destructive tests. In this paper, the use of MLP and RBF functions of ANN (artificial neural networks) and of ANFIS (adaptive neuro-fuzzy inference system) for prediction of S% (swell percent) of soil is described and compared with the traditional statistical model of MR (multiple regression). The accuracies of the ANN and ANFIS models may be regarded as relatively similar. It was found that the constructed RBF exhibited higher performance than MLP, ANFIS and MR for predicting S%. The performance comparison showed that soft computing is a good tool for minimizing the uncertainties in soil engineering projects. The use of soft computing may also provide new approaches and methodologies, and minimize the potential inconsistency of correlations.

  15. Supply and Demand of Jeneberang River Aggregate Using Multiple Regression Model

    Directory of Open Access Journals (Sweden)

    Aryanti Virtanti Anas

    2013-07-01

    Full Text Available Aggregate plays an important role in developing infrastructure because it is the major raw material used in construction such as roads, hospitals, schools, factories, homes and other buildings. Sand and gravel are essential sources of aggregate and are often exploited from the active channels of river systems. The Jeneberang River is one of the main rivers in South Sulawesi Province; it is located in Gowa Regency and is mined to fulfill the aggregate demand of Gowa Regency and Makassar City. Supply and demand are economic phenomena affected by several factors, so this research aims to (1) determine the factors influencing aggregate supply and demand and (2) develop a supply and demand model. Data were obtained from the Central Bureau of Statistics of Gowa Regency and Makassar City, and the Department of Mines and Energy, Gowa Regency, for eleven years (2001 – 2011). In this research, aggregate supply and demand were modeled using the multiple regression method. First, relationships between supply and its influencing factors were established, followed by demand and its factors. Second, the supply and demand model was established using SPSS. The results showed that the model can be used to estimate supply and demand of aggregate accurately using the established relationships among the influencing factors. Supply of aggregate was affected by several factors including price, number of trucks, number of mining companies and mining permit area, while price, GDP, income per capita, length of road, number of buildings and economic growth had a strong influence on demand.

  16. The Overall Odds Ratio as an Intuitive Effect Size Index for Multiple Logistic Regression: Examination of Further Refinements

    Science.gov (United States)

    Le, Huy; Marcus, Justin

    2012-01-01

    This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

  17. A Novel Multiobjective Evolutionary Algorithm Based on Regression Analysis

    Science.gov (United States)

    Song, Zhiming; Wang, Maocai; Dai, Guangming; Vasile, Massimiliano

    2015-01-01

    As is known, the Pareto set of a continuous multiobjective optimization problem with m objective functions is a piecewise continuous (m − 1)-dimensional manifold in the decision space under some mild conditions. However, how to utilize the regularity to design multiobjective optimization algorithms has become the research focus. In this paper, based on this regularity, a model-based multiobjective evolutionary algorithm with regression analysis (MMEA-RA) is put forward to solve continuous multiobjective optimization problems with variable linkages. In the algorithm, the optimization problem is modelled as a promising area in the decision space by a probability distribution, and the centroid of the probability distribution is an (m − 1)-dimensional piecewise continuous manifold. The least squares method is used to construct such a model. A selection strategy based on the nondominated sorting is used to choose the individuals to the next generation. The new algorithm is tested and compared with NSGA-II and RM-MEDA. The result shows that MMEA-RA outperforms RM-MEDA and NSGA-II on the test instances with variable linkages. At the same time, MMEA-RA has higher efficiency than the other two algorithms. A few shortcomings of MMEA-RA have also been identified and discussed in this paper.

  18. A simplified procedure of linear regression in a preliminary analysis

    Directory of Open Access Journals (Sweden)

    Silvia Facchinetti

    2013-05-01

    Full Text Available The analysis of a large statistical data-set can be guided by the study of a particularly interesting variable Y (the regressed variable) and an explanatory variable X, chosen among the remaining variables, jointly observed. The study gives a simplified procedure to obtain the functional link of the variables y=y(x) by a partition of the data-set into m subsets, in which the observations are synthesized by location indices (mean or median) of X and Y. Polynomial models for y(x) of order r are considered to verify the characteristics of the given procedure; in particular we assume r = 1 and 2. The distributions of the parameter estimators are obtained by simulation, when the fitting is done for m = r + 1. Comparisons of the results, in terms of distribution and efficiency, are made with the results obtained by the ordinary least squares method. The study also gives some considerations on the consistency of the estimated parameters obtained by the given procedure.
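
    For r = 1 and m = r + 1 = 2, the procedure reduces to drawing a line through two summary points. The sketch below assumes the partition is made by sorting on X and splitting into two equal halves, and that the location index is the mean; the abstract does not specify either choice, and the data are synthetic.

```python
from statistics import mean


def two_group_line(x, y):
    """Fit y = a + b*x through the mean points of the lower and upper halves of x."""
    pairs = sorted(zip(x, y))          # order observations by x
    half = len(pairs) // 2
    low, high = pairs[:half], pairs[half:]
    # Summarize each subset by the means of X and Y.
    x1, y1 = mean(p[0] for p in low), mean(p[1] for p in low)
    x2, y2 = mean(p[0] for p in high), mean(p[1] for p in high)
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return intercept, slope


# Synthetic, exactly linear data: y = 2 + 3x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 + 3.0 * v for v in xs]
a, b = two_group_line(xs, ys)
print(a, b)
```

    On noise-free linear data the two-point fit recovers the generating line; with noisy data its estimates would generally differ from, and be less efficient than, ordinary least squares, which is the comparison the study addresses by simulation.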

  19. Determination of galactosamine impurities in heparin samples by multivariate regression analysis of their ¹H NMR spectra.

    Science.gov (United States)

    Zang, Qingda; Keire, David A; Wood, Richard D; Buhse, Lucinda F; Moore, Christine M V; Nasr, Moheb; Al-Hakim, Ali; Trehy, Michael L; Welsh, William J

    2011-01-01

    Heparin, a widely used anticoagulant primarily extracted from animal sources, contains varying amounts of galactosamine impurities. Currently, the United States Pharmacopeia (USP) monograph for heparin purity specifies that the weight percent of galactosamine (%Gal) may not exceed 1%. In the present study, multivariate regression (MVR) analysis of ¹H NMR spectral data obtained from heparin samples was employed to build quantitative models for the prediction of %Gal. MVR analysis was conducted using four separate methods: multiple linear regression, ridge regression, partial least squares regression, and support vector regression (SVR). Genetic algorithms and stepwise selection methods were applied for variable selection. In each case, two separate prediction models were constructed: a global model based on dataset A which contained the full range (0-10%) of galactosamine in the samples and a local model based on the subset dataset B for which the galactosamine level (0-2%) spanned the 1% USP limit. All four regression methods performed equally well for dataset A with low prediction errors under optimal conditions, whereas SVR was clearly superior among the four methods for dataset B. The results from this study show that ¹H NMR spectroscopy, already a USP requirement for the screening of contaminants in heparin, may offer utility as a rapid method for quantitative determination of %Gal in heparin samples when used in conjunction with MVR approaches. PMID:20953772

  20. Maternal multiple micronutrient supplementation and pregnancy outcomes in developing countries: meta-analysis and meta-regression / Supplémentation maternelle en micronutriments multiples et issues de la grossesse dans les pays en voie de développement: méta-analyse et méta-régression / Administración de múltiples micronutrientes durante el embarazo y resultados en los países en vías de desarrollo: metanálisis y metarregresión

    Scientific Electronic Library Online (English)

    Kosuke, Kawai; Donna, Spiegelman; Anuraj H, Shankar; Wafaie W, Fawzi.

    2011-06-01

    Full Text Available SciELO Public Health | Language: English Abstract in English OBJECTIVE: To systematically review randomized controlled trials comparing the effect of supplementation with multiple micronutrients versus iron and folic acid on pregnancy outcomes in developing countries. METHODS: MEDLINE and EMBASE were searched. Outcomes of interest were birth weight, low birth [...] weight, small size for gestational age, perinatal mortality and neonatal mortality. Pooled relative risks (RRs) were estimated by random effects models. Sources of heterogeneity were explored through subgroup meta-analyses and meta-regression. FINDINGS: Multiple micronutrient supplementation was more effective than iron and folic acid supplementation at reducing the risk of low birth weight (RR:0.86, 95% confidence interval, CI:0.79-0.93) and of small size for gestational age (RR:0.85; 95% CI: 0.78-0.93). Micronutrient supplementation had no overall effect on perinatal mortality (RR:1.05; 95% CI:0.90-1.22), although substantial heterogeneity was evident (I²=58%; P for heterogeneity=0.008). Subgroup and meta-regression analyses suggested that micronutrient supplementation was associated with a lower risk of perinatal mortality in trials in which >50% of mothers had formal education (RR:0.93; 95% CI:0.82-1.06) or in which supplementation was initiated after a mean of 20 weeks of gestation (RR:0.88; 95% CI:0.80-0.97). CONCLUSION: Maternal education or gestational age at initiation of supplementation may have contributed to the observed heterogeneous effects on perinatal mortality. The safety, efficacy and effective delivery of maternal micronutrient supplementation require further research.
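
    The pooling step can be sketched with the DerSimonian-Laird estimator, one common random-effects method; the abstract does not say which estimator the authors used, so this choice is an assumption. The trial-level relative risks and confidence intervals below are hypothetical, and `pool_random_effects` is an illustrative name.

```python
import math


def pool_random_effects(rrs, ci_lows, ci_highs):
    """Pool relative risks on the log scale with DerSimonian-Laird weights."""
    y = [math.log(rr) for rr in rrs]
    # Standard error recovered from the width of a 95% CI on the log scale.
    se = [(math.log(hi) - math.log(lo)) / (2 * 1.96)
          for lo, hi in zip(ci_lows, ci_highs)]
    v = [s ** 2 for s in se]
    w = [1.0 / vi for vi in v]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                   # between-trial variance
    w_star = [1.0 / (vi + tau2) for vi in v]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled),
               math.exp(pooled + 1.96 * se_pooled)),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical trial results, not taken from the review.
result = pool_random_effects(
    rrs=[0.80, 0.95, 1.10],
    ci_lows=[0.70, 0.80, 0.90],
    ci_highs=[0.91, 1.13, 1.34],
)
print(result)
```

    When the trials disagree by more than sampling error would explain, the between-trial variance becomes positive and the weights flatten, which widens the pooled confidence interval relative to a fixed-effect analysis.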

  1. Microcomputer-assisted multivariate survival data analysis using Cox's proportional hazards regression model.

    Science.gov (United States)

    Campos-Filho, N; Franco, E L

    1990-02-01

    We describe a microcomputer program (COXSURV) for proportional hazards multiple regression analysis of survival and other failure-time data generated in clinical trials and in retrospective clinical epidemiology studies. COXSURV is menu-driven and has powerful variable factoring and data exploratory capabilities for multivariate modeling. A batch mode allows automatic uni- or multivariate analyses for confounder summarization. Model selection for predictive purposes is possible through a step-up algorithm. The partial likelihood method used in the program allows the use of either discrete or continuous time scales by treating tied uncensored observations by either the exact method or by a robust approximation method. The program calculates most standard model fitting statistics for either overall or stratified analyses and uses data layout files compatible with those of other related epidemiologic analysis software. PMID:2335079

  2. Improved performance of a two-element TLD badge for determining gamma and beta doses using multiple linear regression

    International Nuclear Information System (INIS)

    The gamma/beta TLD badge used by OPPD consists of two TLD-700 chips (Harshaw G7 card), one of which (chip #2) is shielded by a 0.102 cm-thick aluminum filter, and the other (chip #1) is unshielded, as shown in Fig. 1. Standard procedure had been to determine the beta dose to the badge by subtracting the response of chip #2 from that of chip #1 and then dividing by a calibrated beta-sensitivity factor; the gamma dose was taken to be the response of chip #2 divided by the chip's gamma-sensitivity factor followed by the subtraction of the background dose. A problem with this procedure is penetration of energetic beta particles through the aluminum filter on chip #2 which causes an over-response. Due to the technique used to obtain the beta dose, this also results in an under-estimate of the beta dose. This problem has been corrected through application of multiple linear regression analysis on a large data base of pure gamma (137Cs), pure beta (90Sr), and mixed exposures. The outcome of the analysis is an algorithm that automatically corrects for penetration effects. Performance tests using the ANSI N13.11 standard are presented to show the improvement.
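
    The bias of the standard subtraction procedure can be shown with a few lines of arithmetic. In this sketch the sensitivity factors, background, true doses, and the 10% beta penetration through the aluminum filter are all hypothetical values chosen for illustration; the regression-based correction algorithm itself is not reproduced because the abstract does not give its coefficients.

```python
def standard_doses(chip1, chip2, beta_sens, gamma_sens, background):
    """Standard procedure: beta from the chip difference, gamma from the shielded chip."""
    beta_dose = (chip1 - chip2) / beta_sens
    gamma_dose = chip2 / gamma_sens - background
    return beta_dose, gamma_dose


# Hypothetical true doses and filter penetration, for illustration only.
true_beta, true_gamma = 100.0, 50.0
penetration = 0.10  # assumed fraction of beta response reaching the shielded chip

chip1 = true_gamma + true_beta                # unshielded chip responds to both
chip2 = true_gamma + penetration * true_beta  # shielded chip: gamma plus leaked beta

beta_est, gamma_est = standard_doses(
    chip1, chip2, beta_sens=1.0, gamma_sens=1.0, background=0.0
)
print(beta_est, gamma_est)
```

    Under these assumptions the shielded chip over-responds, so the gamma estimate comes out too high and the beta estimate too low, which matches the direction of the errors described for the standard procedure.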

  3. Regional flood frequency analysis in eastern Australia: Bayesian GLS regression-based methods within fixed region and ROI framework - Quantile Regression vs. Parameter Regression Technique

    Science.gov (United States)

    Haddad, Khaled; Rahman, Ataur

    2012-04-01

    In this article, an approach using Bayesian Generalised Least Squares (BGLS) regression in a region-of-influence (ROI) framework is proposed for regional flood frequency analysis (RFFA) for ungauged catchments. Using the data from 399 catchments in eastern Australia, the BGLS-ROI is constructed to regionalise the flood quantiles (Quantile Regression Technique (QRT)) and the first three moments of the log-Pearson type 3 (LP3) distribution (Parameter Regression Technique (PRT)). This scheme first develops a fixed region model to select the best set of predictor variables for use in the subsequent regression analyses using an approach that minimises the model error variance while also satisfying a number of statistical selection criteria. The identified optimal regression equation is then used in the ROI experiment where the ROI is chosen for a site in question as the region that minimises the predictive uncertainty. To evaluate the overall performances of the quantiles estimated by the QRT and PRT, a one-at-a-time cross-validation procedure is applied. Results of the proposed method indicate that both the QRT and PRT in a BGLS-ROI framework lead to more accurate and reliable estimates of flood quantiles and moments of the LP3 distribution when compared to a fixed region approach. Also the BGLS-ROI can deal reasonably well with the heterogeneity in Australian catchments as evidenced by the regression diagnostics. Based on the evaluation statistics it was found that both BGLS-QRT and PRT-ROI perform similarly well, which suggests that the PRT is a viable alternative to QRT in RFFA. The RFFA methods developed in this paper are based on the database available in eastern Australia. It is expected that availability of a more comprehensive database (in terms of both quality and quantity) will further improve the predictive performance of both the fixed and ROI based RFFA methods presented in this study, although this needs to be investigated in the future when such a database is available.

  4. M-quantile regression analysis of temporal gene expression data.

    Science.gov (United States)

    Vinciotti, Veronica; Yu, Keming

    2009-01-01

    In this paper, we explore the use of M-quantile regression and M-quantile coefficients to detect statistical differences between temporal curves that belong to different experimental conditions. In particular, we consider the application of temporal gene expression data. Here, the aim is to detect genes whose temporal expression is significantly different across a number of biological conditions. We present a new method to approach this problem. Firstly, the temporal profiles of the genes are modelled by a parametric M-quantile regression model. This model is particularly appealing to small-sample gene expression data, as it is very robust against outliers and it does not make any assumption on the error distribution. Secondly, we further increase the robustness of the method by summarising the M-quantile regression models for a large range of quantile values into an M-quantile coefficient. Finally, we fit a polynomial M-quantile regression model to the M-quantile coefficients over time and employ a Hotelling T²-test to detect significant differences of the temporal M-quantile coefficients profiles across conditions. Extensive simulations show the increased power and robustness of M-quantile regression methods over standard regression methods and over some of the previously published methods. We conclude by applying the method to detect differentially expressed genes from time-course microarray data on muscular dystrophy. PMID:19799560

  5. Linear Maximum Likelihood Regression Analysis for Untransformed Log-Normally Distributed Data

    OpenAIRE

    Gustavsson, Sara M.; Sandra Johannesson; Gerd Sallsten; Andersson, Eva M.

    2012-01-01

    Medical research data are often skewed and heteroscedastic. It has therefore become practice to log-transform data in regression analysis, in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large si...

  6. Use of principal-component, correlation, and stepwise multiple-regression analyses to investigate selected physical and hydraulic properties of carbonate-rock aquifers

    Science.gov (United States)

    Brown, C.E.

    1993-01-01

    Correlation analysis in conjunction with principal-component and multiple-regression analyses was applied to laboratory chemical and petrographic data to assess the usefulness of these techniques in evaluating selected physical and hydraulic properties of carbonate-rock aquifers in central Pennsylvania. Correlation and principal-component analyses were used to establish relations and associations among variables, to determine dimensions of property variation of samples, and to filter the variables containing similar information. Principal-component and correlation analyses showed that porosity is related to other measured variables and that permeability is most related to porosity and grain size. Four principal components are found to be significant in explaining the variance of data. Stepwise multiple-regression analysis was used to see how well the measured variables could predict porosity and (or) permeability for this suite of rocks. The variation in permeability and porosity is not totally predicted by the other variables, but the regression is significant at the 5% significance level.

  7. An overview on standard statistical methods for assessing exposure-outcome link in survival analysis (Part II): the Kaplan-Meier analysis and the Cox regression method.

    Science.gov (United States)

    Abd ElHafeez, Samar; Torino, Claudia; D'Arrigo, Graziella; Bolignano, Davide; Provenzano, Fabio; Mattace-Raso, Francesco; Zoccali, Carmine; Tripepi, Giovanni

    2012-06-01

    The Kaplan-Meier and Cox regression methods are the most widely used statistical techniques for performing "time to event analysis" in epidemiological and clinical research. Kaplan-Meier analysis allows one to build one or more survival curves describing the occurrence of the outcome of interest over time according to the presence or absence of one or more exposures. The Cox regression method models the relationship between a specific exposure (either a continuous one, like age or systolic blood pressure, or a categorical one, like diabetes or degree of obesity) and the occurrence of a given outcome, taking into account multiple confounders and/or predictors. PMID:23114547
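
    As an illustration of how a Kaplan-Meier survival curve is built, the following is a minimal sketch in plain Python, not code from the article. The function name `kaplan_meier` and the follow-up data are hypothetical; the estimator itself is the standard product-limit formula, in which the survival estimate is multiplied by (1 - d/n) at each event time, with d events among n subjects still at risk.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate).

    times  -- follow-up time for each subject
    events -- 1 if the outcome occurred at that time, 0 if the subject was censored
    """
    # Number of observed events at each distinct time
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    curve = []
    surv = 1.0
    for t in sorted(deaths):
        # Subjects still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical follow-up data: six subjects, two of them censored
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

    Censored subjects never trigger a step in the curve, but they still count in the risk set until their censoring time, which is what distinguishes this estimate from a simple proportion of survivors.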

  8. Spatial regression analysis on 32 years of total column ozone data

    Science.gov (United States)

    Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.

    2014-08-01

    Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1 × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and 2 years of assimilated SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) ozone data (2009-2010). The two-dimensionality in this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on nonseasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño-Southern Oscillation (ENSO) and stratospheric alternative halogens which are parameterized by the effective equivalent stratospheric chlorine (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to that of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. 
The EESC has a significant depleting effect on ozone at mid- and high latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high northern latitudes, the effect of QBO is positive and negative in the tropics and mid- to high latitudes, respectively, and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific. The contribution of explanatory variables describing seasonal ozone variation is generally large at mid- to high latitudes. We observe ozone increases with potential vorticity and day length and ozone decreases with geopotential height and variable ozone effects due to the polar vortex in regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. The application of several trend models, each with their own pros and cons, yields a large range of recovery rate estimates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.

  9. IT-141, a Polymer Micelle Encapsulating SN-38, Induces Tumor Regression in Multiple Colorectal Cancer Models

    Science.gov (United States)

    Carie, Adam; Rios-Doria, Jonathan; Costich, Tara; Burke, Brian; Slama, Richard; Skaff, Habib; Sill, Kevin

    2011-01-01

    Polymer micelles are promising drug delivery vehicles for the delivery of anticancer agents to tumors. Often, anticancer drugs display potent cytotoxic effects towards cancer cells but are too hydrophobic to be administered in the clinic as a free drug. To address this problem, a polymer micelle was designed using a triblock copolymer (ITP-101) that enables hydrophobic drugs to be encapsulated. An SN-38 encapsulated micelle, IT-141, was prepared that exhibited potent in vitro cytotoxicity against a wide array of cancer cell lines. In a mouse model, pharmacokinetic analysis revealed that IT-141 had a much longer circulation time, plasma exposure, and tumor exposure compared to irinotecan. IT-141 was also superior to irinotecan in terms of antitumor activity, exhibiting greater tumor inhibition in HT-29 and HCT116 colorectal cancer xenograft models at half the dose of irinotecan. The antitumor effect of IT-141 was dose-dependent and caused complete growth inhibition and tumor regression at well-tolerated doses. Varying the specific concentration of SN-38 within the IT-141 micelle had no detectable effect on this antitumor activity, indicating no differences in activity between different IT-141 formulations. In summary, IT-141 is a potent micelle-based chemotherapy that holds promise for the treatment of colorectal cancer. PMID:22187652

  10. Analysis of U.S. freight-train derailment severity using zero-truncated negative binomial regression and quantile regression.

    Science.gov (United States)

    Liu, Xiang; Saat, M Rapik; Qin, Xiao; Barkan, Christopher P L

    2013-10-01

    Derailments are the most common type of freight-train accidents in the United States. Derailments cause damage to infrastructure and rolling stock, disrupt services, and may cause casualties and harm the environment. Accordingly, derailment analysis and prevention has long been a high priority in the rail industry and government. Despite the low probability of a train derailment, the potential for severe consequences justifies the need to better understand the factors influencing train derailment severity. In this paper, a zero-truncated negative binomial (ZTNB) regression model is developed to estimate the conditional mean of train derailment severity. Recognizing that the mean is not the only statistic describing data distribution, a quantile regression (QR) model is also developed to estimate derailment severity at different quantiles. The two regression models together provide a better understanding of train derailment severity distribution. Results of this work can be used to estimate train derailment severity under various operational conditions and by different accident causes. This research is intended to provide insights regarding development of cost-efficient train safety policies. PMID:23770389
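
    To illustrate why a quantile model adds information beyond the conditional mean, the sketch below shows the check ("pinball") loss that quantile regression minimises, reduced to the simplest case of a single constant prediction. It is not the authors' ZTNB or QR model; the function names and the severity values are hypothetical.

```python
def pinball_loss(y, q, tau):
    """Average check (pinball) loss of predicting the constant q at quantile level tau."""
    total = 0.0
    for v in y:
        r = v - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau)
        total += tau * r if r >= 0 else (tau - 1.0) * r
    return total / len(y)


def best_constant(y, tau):
    """Observed value that minimises the pinball loss, i.e. an empirical tau-quantile."""
    return min(sorted(set(y)), key=lambda q: pinball_loss(y, q, tau))


# Hypothetical right-skewed severity data (e.g. number of cars derailed per accident)
severity = [1, 2, 2, 3, 4, 6, 9, 15, 30]

mean = sum(severity) / len(severity)
median = best_constant(severity, 0.5)
upper_quartile = best_constant(severity, 0.75)
print(mean, median, upper_quartile)
```

    For skewed data such as these, the mean, the median and the upper quartile differ markedly, which is the motivation the abstract gives for reporting several quantiles alongside the conditional mean.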

  11. Grades, Gender, and Encouragement: A Regression Discontinuity Analysis

    Science.gov (United States)

    Owen, Ann L.

    2010-01-01

    The author employs a regression discontinuity design to provide direct evidence on the effects of grades earned in economics principles classes on the decision to major in economics and finds a differential effect for male and female students. Specifically, for female students, receiving an A for a final grade in the first economics class is…

  12. Stepwise Regression as an Exploratory Data Analysis Procedure.

    Science.gov (United States)

    Thayer, Jerome D.

    This paper identifies specific problems with stepwise regression, notes criticisms of stepwise methods by statisticians, suggests appropriate ways in which stepwise procedures can be used, and gives examples of how this can be done. Although the stepwise method has been routinely criticized by statisticians, it is still frequently used in the…

  13. Analysis on Train Stopping Accuracy based on Regression Algorithms

    Directory of Open Access Journals (Sweden)

    Lin Ma

    2014-05-01

    Full Text Available Stopping accuracy is one of the most important indexes of efficiency of automatic train operation (ATO) systems. Traditional stopping control algorithms in ATO systems have some drawbacks, as many factors have not been taken into account. The large amount of field-collected data about stopping accuracy contains many factors (e.g. system delays, stopping time, net pressure) that affect stopping accuracy. In this paper, three popular data mining methods are proposed to analyze the train stopping accuracy. First, we identify fifteen factors that have an impact on the stopping accuracy. Then, ridge regression, lasso regression and elastic net regression are employed to build models reflecting the relationship between the fifteen factors and the stopping accuracy. The three models are then compared using the Akaike information criterion (AIC), a model selection criterion that considers the trade-off between accuracy and complexity. The computational results show that the elastic net regression model has the best AIC value. Finally, we obtain the parameters that make the train stop more accurately, which can provide a reference for improving the stopping accuracy of ATO systems.

  14. A Bayesian Quantile Regression Analysis of Potential Risk Factors for Violent Crimes in USA

    Directory of Open Access Journals (Sweden)

    Ming Wang

    2012-12-01

    Full Text Available Bayesian quantile regression has drawn increasing attention in a wide range of applications recently. Yu and Moyeed (2001) proposed an asymmetric Laplace distribution to provide a likelihood-based mechanism for Bayesian inference of quantile regression models. In this work, the primary objective is to evaluate the performance of Bayesian quantile regression compared with simple regression and quantile regression through simulation and with application to a crime dataset from 50 USA states for assessing the effect of potential risk factors on the violent crime rate. This paper also explores improper priors, and conducts sensitivity analysis on the parameter estimates. The data analysis reveals that the percent of population that are single parents always has a significant positive influence on violent crime occurrence, and Bayesian quantile regression provides a more comprehensive statistical description of this association.

  15. Correlation and Multiple Regression Analyses of Pituitary Growth Hormone and Hepatic Activities in Hepatitis C Infection and Interferon Response

    OpenAIRE

    Eskander, Emad F.; Abd-rabou, Ahmed A.; Yahya, Shaymaa M. M.; El Sherbini, Ashraf; Mohamed, Mervat S.; Shaker, Olfat G.

    2013-01-01

    The prevalence of hepatitis C virus (HCV) infection varies across the world, with the highest percentage of infections reported in the Middle East, increasingly in Egypt. The current study aimed at examining the bio-statistical correlation and multiple regression analyses of pituitary growth hormone (GH) and liver activities among HCV genotype-4 patients treated with PEG-IFN-α plus RBV therapy. Herein, the current study was conducted on 100 HCV genotype-4 infected patients and 50 healthy controls. ...

  16. Application of a multiple least-squares regression program to dual energy NaI-CsI(Tl) measurements

    International Nuclear Information System (INIS)

    In conjunction with the development of an optimum background subtraction routine, a multiple least-squares regression program for simultaneous utilization of both the NaI(Tl) and CsI(Tl) energy ranges of a dual anti-coincidence detection system was applied. To experimentally evaluate the program for whole body counting purposes, an Am-241 contaminated subject was measured in the whole body counter using the standard three phoswich detector array surrounding the head.

  17. Prediction of the Rock Mass Diggability Index by Using Fuzzy Clustering-Based, ANN and Multiple Regression Methods

    Science.gov (United States)

    Saeidi, Omid; Torabi, Seyed Rahman; Ataei, Mohammad

    2014-03-01

    Rock mass classification systems are one of the most common ways of determining rock mass excavatability and related equipment assessment. However, the strength and weak points of such rating-based classifications have always been questionable. Such classification systems assign quantifiable values to predefined classified geotechnical parameters of rock mass. This causes particular ambiguities, leading to the misuse of such classifications in practical applications. Recently, intelligence system approaches such as artificial neural networks (ANNs) and neuro-fuzzy methods, along with multiple regression models, have been used successfully to overcome such uncertainties. The purpose of the present study is the construction of several models by using an adaptive neuro-fuzzy inference system (ANFIS) method with two data clustering approaches, including fuzzy c-means (FCM) clustering and subtractive clustering, an ANN and non-linear multiple regression to estimate the basic rock mass diggability index. A set of data from several case studies was used to obtain the real rock mass diggability index and compared to the predicted values by the constructed models. In conclusion, it was observed that ANFIS based on the FCM model shows higher accuracy and correlation with actual data compared to that of the ANN and multiple regression. As a result, one can use the assimilation of ANNs with fuzzy clustering-based models to construct such rigorous predictor tools.

  18. Research of quality indices for cold-smoked salmon using a stepwise multiple regression of microbiological counts and physico-chemical parameters

    OpenAIRE

    Leroi, Francoise; Joffraud, Jean-jacques; Chevalier, Frederique; Cardinal, Mireille

    2001-01-01

    Aims: The aim of the study was to assess the relationships between the remaining shelf-life (RSL) of cold-smoked salmon and various microbiological and physico-chemical parameters, using a multivariate data analysis in the form of stepwise forward multiple regression. Methods and Results: Thirteen batches of French cold-smoked salmon were analysed weekly during vacuum-packed storage at 5 °C for their lipid, water, salt, phenol, pH, total volatile basic nitrogen (TVBN) and trimethyla...

  19. REGRESSION ANALYSIS OF PRODUCTIVITY USING MIXED EFFECT MODEL

    Directory of Open Access Journals (Sweden)

    Siana Halim

    2007-01-01

    Full Text Available Production plants of a company are located in several areas spread across Middle and East Java. As the production process employs mostly manpower, we suspected that each location has different characteristics affecting the productivity. Thus, the production data may have a spatial and hierarchical structure. To fit a linear regression using ordinary techniques, we are required to make some assumptions about the nature of the residuals, i.e. that they are independent and identically and normally distributed. However, these assumptions are rarely fulfilled, especially for data that have a spatial and hierarchical structure. We worked out the problem using a mixed effect model. This paper discusses the model construction of productivity and several characteristics in the production line by taking location as a random effect. The simple model with high utility that satisfies the necessary regression assumptions was built using the free statistical software R version 2.6.1.

  20. Regression analysis of censored data using pseudo-observations

    DEFF Research Database (Denmark)

    Parner, Erik T.; Andersen, Per Kragh

    2010-01-01

    We draw upon a series of articles in which a method based on pseudovalues is proposed for direct regression modeling of the survival function, the restricted mean, and the cumulative incidence function in competing risks with right-censored data. The models, once the pseudovalues have been computed, can be fit using standard generalized estimating equation software. Here we present Stata procedures for computing these pseudo-observations. An example from a bone marrow transplantation study is used to illustrate the method.
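
    The article presents Stata procedures; the sketch below is instead a plain-Python illustration of the underlying idea for the survival function at a single time point, using the usual jackknife construction in which subject i receives the pseudo-value n*S(t) - (n-1)*S_(-i)(t), where S_(-i) is the Kaplan-Meier estimate with subject i left out. All function names and data are hypothetical.

```python
def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t) from right-censored data."""
    surv = 1.0
    event_times = sorted({x for x, e in zip(times, events) if e == 1 and x <= t})
    for u in event_times:
        at_risk = sum(1 for x in times if x >= u)
        died = sum(1 for x, e in zip(times, events) if x == u and e == 1)
        surv *= 1.0 - died / at_risk
    return surv


def pseudo_observations(times, events, t):
    """Jackknife pseudo-values for S(t): one number per subject."""
    n = len(times)
    full = km_survival(times, events, t)
    values = []
    for i in range(n):
        # Re-estimate S(t) with subject i removed
        rest_t = times[:i] + times[i + 1:]
        rest_e = events[:i] + events[i + 1:]
        values.append(n * full - (n - 1) * km_survival(rest_t, rest_e, t))
    return values


# Hypothetical right-censored sample (event = 1, censored = 0)
times = [1.0, 2.0, 2.5, 4.0, 6.0]
events = [1, 0, 1, 1, 0]
print(pseudo_observations(times, events, t=3.0))
```

    Without censoring, each pseudo-value reduces to the indicator that the subject survived beyond t; with censoring, the pseudo-values stand in for that unobserved indicator and can then be used as the response in a generalized estimating equation fit.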

  1. Bayesian analysis of logistic regression with an unknown change point

    OpenAIRE

    Gössl, Christoff; Küchenhoff, Helmut

    1999-01-01

    We discuss Bayesian estimation of a logistic regression model with an unknown threshold limiting value (TLV). In these models it is assumed that there is no effect of a covariate on the response under a certain unknown TLV. The estimation of these models with a focus on the TLV in a Bayesian context by Markov chain Monte Carlo (MCMC) methods is considered. We extend the model by accounting for measurement error in the covariate. The Bayesian solution is compared with the likelihood solution...

  2. Model performance analysis and model validation in logistic regression

    Directory of Open Access Journals (Sweden)

    Rosa Arboretti Giancristofaro

    2007-10-01

    Full Text Available In this paper a new model validation procedure for a logistic regression model is presented. At first, we illustrate a brief review of different techniques of model validation. Next, we define a number of properties required for a model to be considered "good", and a number of quantitative performance measures. Lastly, we describe a methodology for the assessment of the performance of a given model by using an example taken from a management study.

  3. Analysis of apoptosis during hair follicle regression (catagen)

    OpenAIRE

    Lindner, G.; Botchkarev, V. A.; Botchkareva, N. V.; Ling, G.; Veen, C.; Paus, R.

    1997-01-01

    Keratinocyte apoptosis is a central element in the regulation of hair follicle regression (catagen), yet the exact location and the control of follicular keratinocyte apoptosis remain obscure. To generate an "apoptomap" of the hair follicle, we have studied selected apoptosis-associated parameters in the C57BL/6 mouse model for hair research during normal and pharmacologically manipulated, pathological catagen development. As assessed by terminal deoxynucleotide transferase dUTP fluorescein n...

  4. BRGLM, Interactive Linear Regression Analysis by Least Square Fit

    International Nuclear Information System (INIS)

    1 - Description of program or function: BRGLM is an interactive program written to fit general linear regression models by least squares and to provide a variety of statistical diagnostic information about the fit. Stepwise and all-subsets regression can also be carried out. There are facilities for interactive data management (e.g. setting missing value flags, data transformations) and tools for constructing design matrices for the more commonly used models such as factorials, cubic splines, and auto-regressions. 2 - Method of solution: The least squares computations are based on the orthogonal (QR) decomposition of the design matrix obtained using the modified Gram-Schmidt algorithm. 3 - Restrictions on the complexity of the problem: The current release of BRGLM allows maxima of 1000 observations, 99 variables, and 3000 words of main memory workspace. For a problem with N observations and P variables, the number of words of main memory storage required is MAX(N*(P+6), N*P+P*P+3*N, 3*P*P+6*N). Any linear model may be fit, although the in-memory workspace will have to be increased for larger problems.
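
    The method of solution described above can be sketched in a few lines of plain Python. This is an illustration only, not BRGLM's code: the names `mgs_qr`, `least_squares` and `workspace_words` are hypothetical, the design matrix is assumed to have full column rank, and the storage formula is simply transcribed from the description.

```python
def mgs_qr(A):
    """Thin QR factorisation of an n x p matrix (list of rows) by modified Gram-Schmidt.

    Returns Q as a list of p orthonormal columns and R as a p x p upper-triangular matrix.
    Assumes A has full column rank, so that no diagonal entry of R is zero.
    """
    n, p = len(A), len(A[0])
    V = [[A[i][j] for i in range(n)] for j in range(p)]  # working copy, stored by column
    Q = []
    R = [[0.0] * p for _ in range(p)]
    for j in range(p):
        R[j][j] = sum(x * x for x in V[j]) ** 0.5
        q = [x / R[j][j] for x in V[j]]
        Q.append(q)
        # Modified variant: immediately remove the q-component from every later column
        for k in range(j + 1, p):
            R[j][k] = sum(a * b for a, b in zip(q, V[k]))
            V[k] = [b - R[j][k] * a for a, b in zip(q, V[k])]
    return Q, R


def least_squares(A, y):
    """Solve min ||A b - y|| via QR: back-substitute R b = Q^T y."""
    Q, R = mgs_qr(A)
    p = len(R)
    c = [sum(a * b for a, b in zip(q, y)) for q in Q]
    beta = [0.0] * p
    for j in reversed(range(p)):
        tail = sum(R[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = (c[j] - tail) / R[j][j]
    return beta


def workspace_words(n, p):
    """Main-memory words required, as stated in the program description."""
    return max(n * (p + 6), n * p + p * p + 3 * n, 3 * p * p + 6 * n)


# Hypothetical design matrix (intercept column plus one regressor) and response
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
beta = least_squares(X, y)
print(beta, workspace_words(100, 5))
```

    Working through the QR factors avoids forming the normal-equations matrix, which is the usual numerical argument for this approach.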

  5. Multi-Class Sparse Bayesian Regression for Neuroimaging Data Analysis

    Science.gov (United States)

    Michel, Vincent; Eger, Evelyn; Keribin, Christine; Thirion, Bertrand

    The use of machine learning tools is gaining popularity in neuroimaging, as it provides a sensitive assessment of the information conveyed by brain images. In particular, finding regions of the brain whose functional signal reliably predicts some behavioral information makes it possible to better understand how this information is encoded or processed in the brain. However, such a prediction is performed through regression or classification algorithms that suffer from the curse of dimensionality, because a huge number of features (i.e. voxels) are available to fit some target, with very few samples (i.e. scans) to learn the informative regions. A commonly used solution is to regularize the weights of the parametric prediction function. However, model specification needs a careful design to balance adaptiveness and sparsity. In this paper, we introduce a novel method, Multi-Class Sparse Bayesian Regression (MCBR), that generalizes classical approaches such as Ridge regression and Automatic Relevance Determination. Our approach is based on a grouping of the features into several classes, where each class is regularized with specific parameters. We apply our algorithm to the prediction of a behavioral variable from brain activation images. The method presented here achieves prediction accuracies similar to those of reference methods, and yields more interpretable feature loadings.

  6. Multi-stratified multiple regression tests of the linear/no-threshold theory of radon-induced lung cancer

    International Nuclear Information System (INIS)

    A plot of lung-cancer rates versus radon exposures in 965 US counties, or in all US states, has a strong negative slope, b, in sharp contrast to the strong positive slope predicted by linear/no-threshold theory. The discrepancy between these slopes exceeds 20 standard deviations (SD). Including smoking frequency in the analysis substantially improves fits to a linear relationship but has little effect on the discrepancy in b, because correlations between smoking frequency and radon levels are quite weak. Including 17 socioeconomic variables (SEV) in multiple regression analysis reduces the discrepancy to 15 SD. Data were divided into segments by stratifying on each SEV in turn, and on geography, and on both simultaneously, giving over 300 data sets to be analyzed individually, but negative slopes predominated. The slope is negative whether one considers only the most urban counties or only the most rural; only the richest or only the poorest; only the richest in the South Atlantic region or only the poorest in that region, etc.; and for all the strata in between. Since this is an ecological study, the well-known problems with ecological studies were investigated and found not to be applicable here. The "ecological fallacy" was shown not to apply in testing a linear/no-threshold theory, and the vulnerability to confounding is greatly reduced when confounding factors are only weakly correlated with radon levels, as is generally the case here. All confounding factors known to correlate with radon and with lung cancer were investigated quantitatively and found to have little effect on the discrepancy.

  7. Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales

    CERN Document Server

    Kristoufek, Ladislav

    2014-01-01

    We propose a novel framework combining detrended fluctuation analysis with standard regression methodology. The method is built on detrended variances and covariances and it is designed to estimate regression parameters at different scales and under potential non-stationarity and power-law correlations. Selected examples from physics, finance and environmental sciences illustrate usefulness of the framework.

  8. A review of the most relevant multiple regression models for sales forecasting in gas stations; Uma revisao dos principais modelos de regressao multipla para previsao de vendas de postos de combustiveis

    Energy Technology Data Exchange (ETDEWEB)

    Wanke, Peter [Universidade Federal do Rio de Janeiro (UFRJ), RJ (Brazil). Instituto de Pesquisa e Pos-Graduacao em Administracao de Empresas (COPPEAD). Centro de Estudos em Logistica

    2004-07-01

    In this paper, the most relevant multiple regression models for sales forecasting of gas stations, developed over the past ten years, are reviewed. The most significant variables related to gas station sales, the types of multiple regression models (linear or non-linear), their most common uses in supporting decision making and their limits are presented. The predictive power of each model and its impact on decision-making, such as sensitivity analysis and confidence intervals for independent variables, are also discussed. Four models are presented, based on studies conducted in South Africa, Portugal and Brazil. In conclusion, suggestions for future developments are presented based on past developments. (author)

  9. Use of Structure Coefficients in Published Multiple Regression Articles: Beta Is Not Enough.

    Science.gov (United States)

    Courville, Troy; Thompson, Bruce

    2001-01-01

    Reviewed articles published in the "Journal of Applied Psychology" (JAP) to determine how interpretations might have differed if standardized regression coefficients and structure coefficients (or bivariate "r"s of predictors with the criterion) had been interpreted. Summarizes some dramatic misinterpretations or incomplete interpretations.…

  10. A Generalized Logistic Regression Procedure to Detect Differential Item Functioning among Multiple Groups

    Science.gov (United States)

    Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul

    2011-01-01

    We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…

  11. Semi-parametric ROC regression analysis with placement values.

    Science.gov (United States)

    Cai, Tianxi

    2004-01-01

    Advances in technology provide new diagnostic tests for early detection of disease. Frequently, these tests have continuous outcomes. One popular method to summarize the accuracy of such a test is the Receiver Operating Characteristic (ROC) curve. Methods for estimating ROC curves have long been available. To examine covariate effects, Pepe (1997, 2000) and Alonzo and Pepe (2002) proposed distribution-free approaches based on a parametric regression model for the ROC curve. Cai and Pepe (2002) extended the parametric ROC regression model by allowing an arbitrary non-parametric baseline function. In this paper, while we follow the same semi-parametric setting as in that paper, we highlight a new estimator that offers several improvements over the earlier work: superior efficiency, the ability to estimate the covariate effects without estimating the non-parametric baseline function and easy implementation with standard software. The methodology is applied to a case control dataset where we evaluate the accuracy of the prostate-specific antigen as a biomarker for early detection of prostate cancer. Simulation studies suggest that the new estimator under the semi-parametric model, while always being more robust, has efficiency that is comparable to or better than the Alonzo and Pepe (2002) estimator from the parametric model. PMID:14744827

  12. Application of support vector machines plus to regression analysis for pressure-relief valves leaking

    Scientific Electronic Library Online (English)

    W., Sun; G. X., Meng; Q., Ye; H. L., Jin; J. Z., Zhang.

    2012-04-01

    Full Text Available Carrying out regression analysis of the gas leakage of a pressure-relief valve (PRV) to obtain the accurate leakage flow and the changing trend of leakage is helpful in assessing the reliability of the PRV. Classic support vector regression (SVR) is an excellent regression model and has been widely used in various fields. However, the standard SVR model performs regression using only leakage data, without considering factors closely related to the leakage. In this paper a regression model based on support vector regression plus (SVR+) is put forward to perform leakage regression of the PRV, in which particle swarm optimization (PSO) is used to select optimum parameters of SVR+, termed PSO_SVR+. The experimental results demonstrate that the proposed model, which takes the difference between the inlet pressure and outlet pressure of the PRV as hidden information, can achieve better regression precision than SVR. This article also investigates the effects of PSO and the Genetic Algorithm on the performance of the regression model (SVR+ or SVR).

  13. Análise de regressão múltipla das concentrações de PM10 em função de elementos meteorológicos para Porto Alegre, Estado do Rio Grande do Sul, em 2005 e 2006 = Multiple regression analysis of PM10 concentration concerning to meteorological elements for Porto Alegre, Rio Grande do Sul State, in 2005 and 2006

    Directory of Open Access Journals (Sweden)

    Angela Radünz Lazzari

    2011-01-01

    Full Text Available Air is an efficient medium for dispersing atmospheric pollutants, and its behavior depends on the atmospheric movements that occur in the troposphere. In Porto Alegre, Rio Grande do Sul State, there is heavy daily traffic and a concentration of industries that may be responsible for atmospheric emissions. In the present work we studied the behavior of daily concentrations of particulate matter (PM10) in this city, considering the influence of meteorological variables. Data analysis was based on descriptive statistics, linear correlation and multiple regression. Data were provided by the State Foundation of Environmental Protection Henrique Luiz Roessler - RS (FEPAM) and the National Institute of Meteorology (INMET). The analysis showed that the concentrations of PM10, measured every day at 4:00 p.m., did not exceed national air quality standards; that the daily average wind speed and the daily average radiation were negatively related to PM10 concentrations; and that the daily average air temperature and the north and northwest wind directions were positively related to them. The wind directions that contribute significantly to lowering concentrations at the measured sites are east and southeast.

  14. A primer for biomedical scientists on how to execute model II linear regression analysis.

    Science.gov (United States)

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. PMID:22077731
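
    The OLP calculation described in points 4 and 5 can be sketched in a few lines. This is a minimal illustration, assuming the standard OLP (geometric mean) formulas: slope = sign(r) * SD(y) / SD(x), with the line passing through the point of means. The function names, the percentile bootstrap, and the sample data are illustrative assumptions; this is not the smatr implementation.

```python
import numpy as np


def olp_fit(x, y):
    """Ordinary least products (geometric mean) fit; returns (slope, intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    # OLP slope: ratio of standard deviations, signed by the correlation.
    slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    # The OLP line passes through the point of means.
    intercept = y.mean() - slope * x.mean()
    return slope, intercept


def bootstrap_slope_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the OLP slope (resampling x-y pairs)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(x)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample pairs with replacement
        slopes[b] = olp_fit(x[idx], y[idx])[0]
    lower, upper = np.quantile(slopes, [alpha / 2, 1 - alpha / 2])
    return lower, upper


# Hypothetical measurements (not from the paper); both variables carry error.
x_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
y_demo = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 13.0, 14.8, 17.2, 19.1, 20.9])

slope, intercept = olp_fit(x_demo, y_demo)
ci_low, ci_high = bootstrap_slope_ci(x_demo, y_demo)
print(f"OLP slope={slope:.3f}, intercept={intercept:.3f}")
print(f"95% bootstrap CI for slope: ({ci_low:.3f}, {ci_high:.3f})")
```

    Because the OLP slope equals the ordinary least squares slope divided by |r|, it is never smaller in magnitude than the OLS slope.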

  15. Multiple-time correlation functions for non-Markovian interaction: Beyond the Quantum Regression Theorem

    OpenAIRE

    Alonso, Daniel; Vega, Inés

    2004-01-01

    Multiple time correlation functions are found in the dynamical description of different phenomena. They encode and describe the fluctuations of the dynamical variables of a system. In this paper we formulate a theory of non-Markovian multiple-time correlation functions (MTCF) for a wide class of systems. We derive the dynamical equation of the reduced propagator, an object that evolves state vectors of the system conditioned to the dynamics of its environment, which is ...

  16. GroupICA dual regression analysis of resting state networks in behavioural variant of frontotemporal dementia

    Directory of Open Access Journals (Sweden)

    Anne Marja Remes

    2013-08-01

    Full Text Available Functional MRI studies have revealed changes in default-mode and salience networks in neurodegenerative dementias, especially in Alzheimer’s disease. The purpose of this study was to analyze the whole brain cortex resting state networks in patients with behavioral variant frontotemporal dementia by using resting state functional MRI. The group specific resting state networks were identified by high model order independent component analysis and a dual regression technique was used to detect between-group differences in the resting state networks with p<0.05 threshold corrected for multiple comparisons. A y-concatenation method was used to correct for multiple comparisons for multiple independent components, grey matter differences as well as the voxel level. We found increased connectivity in several networks within patients with bvFTD compared to the control group. The most prominent enhancement was seen in the right frontotemporal area and insula. A significant increase in functional connectivity was also detected in the left dorsal attention network, in anterior paracingulate – a default mode sub-network as well as in the anterior parts of the frontal pole. Notably the increased patterns of connectivity were seen in areas around atrophic regions. The present results demonstrate abnormal increased connectivity in several important brain networks including the dorsal attention network and default-mode network in patients with behavioral variant frontotemporal dementia. These changes may be associated with decline in executive functions and attention as well as apathy, which are the major cognitive and neuropsychiatric defects in patients with frontotemporal dementia.

  17. Control of matrix interferences by multiple linear regression models in the determination of arsenic and lead concentrations in fly ashes by inductively coupled plasma optical emission spectrometry

    OpenAIRE

    Ilander, Aki; Väisänen, Ari

    2010-01-01

    A multiple linear regression technique was used to evaluate and correct the matrix interferences in the determination of As and Pb concentrations in fly ashes by inductively coupled plasma optical emission spectrometry. The direct determination of As and Pb in SRM 1633b by ICP-OES failed to obtain the certified concentrations, except in a couple of cases. However, it proved possible to use the multiple linear regression (MLR) technique to correct the determined concentrations to a satisfactor...

  18. Factors Associated with Methadone Treatment Duration: A Cox Regression Analysis

    Science.gov (United States)

    Peng, Ching-Yi; Chao, En; Lee, Tony Szu-Hsien

    2015-01-01

    This study examined retention rates and associated predictors of methadone maintenance treatment (MMT) duration among 128 newly admitted patients in Taiwan. A semi-structured questionnaire was used to obtain demographic and drug use history. Daily records of methadone taken and test results for HIV, HCV, and morphine toxicology were taken from a computerized medical registry. Cox regression analyses were performed to examine factors associated with MMT duration. MMT retention rates were 80.5%, 68.8%, 53.9%, and 41.4% for 3, 6, 12, and 18 months, respectively. Excluding 38 patients incarcerated during the study period, retention rates were 81.1%, 73.3%, 61.1%, and 48.9% for 3 months, 6 months, 12 months, and 18 months, respectively. No participant seroconverted to HIV and 1 died during the 18-months follow-up. Results showed that being female, imprisonment, a longer distance from house to clinic, having a lower methadone dose after 30 days, being HCV positive, and in the New Taipei city program predicted early patient dropout. The findings suggest favorable MMT outcomes of HIV seroincidence and mortality. Results indicate that the need to minimize travel distance and to provide programs that meet women’s requirements justify expansion of MMT clinics in Taiwan. PMID:25875531

  19. Regression analysis in modeling of air surface temperature and factors affecting its value in Peninsular Malaysia

    Science.gov (United States)

    Rajab, Jasim Mohammed; Jafri, Mohd. Zubir Mat; Lim, Hwee San; Abdullah, Khiruddin

    2012-10-01

    This study encompasses air surface temperature (AST) modeling in the lower atmosphere. Data on four atmospheric pollutant gases (CO, O3, CH4, and H2O), retrieved from the National Aeronautics and Space Administration Atmospheric Infrared Sounder (AIRS) for 2003 to 2008, were used to develop a model to predict AST in Peninsular Malaysia using the multiple regression method. For the entire period, the pollutants were highly correlated (R=0.821) with predicted AST. Comparisons among five stations in 2009 showed close agreement between the predicted AST and the observed AST from AIRS, especially in the southwest monsoon (SWM) season, within 1.3 K, and for in situ data, within 1 to 2 K. Validation of the predicted AST against AST from AIRS showed a high correlation coefficient (R=0.845 to 0.918), indicating the model's efficiency and accuracy. Statistical analysis in terms of β showed that H2O (0.565 to 1.746) tended to contribute significantly to high AST values during the northeast monsoon season. Generally, these results clearly indicate the advantage of using the satellite AIRS data and a correlation analysis study to investigate the impact of atmospheric greenhouse gases on AST over Peninsular Malaysia. A model was developed that is capable of retrieving the Peninsular Malaysia AST in all weather conditions, with total uncertainties ranging between 1 and 2 K.

  20. Evaluation of syngas production unit cost of bio-gasification facility using regression analysis techniques

    Energy Technology Data Exchange (ETDEWEB)

    Deng, Yangyang; Parajuli, Prem B.

    2011-08-10

    Evaluating the economic feasibility of a bio-gasification facility requires understanding its unit cost under different production capacities. The objective of this study was to evaluate the unit cost of syngas production at capacities from 60 through 1800 Nm³/h using an economic model with three regression analysis techniques (simple regression, reciprocal regression, and log-log regression). The preliminary result of this study showed that the reciprocal regression analysis technique gave the best-fit curve between per-unit cost and production capacity, with a sum of error squares (SES) lower than 0.001 and a coefficient of determination (R²) of 0.996. The regression analysis techniques determined the minimum unit cost of syngas production for micro-scale bio-gasification facilities to be $0.052/Nm³, at a capacity of 2,880 Nm³/h. The results of this study suggest that to reduce cost, facilities should run at a high production capacity. In addition, this technique could provide a new categorical criterion for evaluating micro-scale bio-gasification facilities from the perspective of economic analysis.
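
    The comparison of the three functional forms can be sketched as follows. This is an illustration only: the capacity and cost values are synthetic placeholders generated from a reciprocal curve, not data from the study, and the function names are assumptions. Each form is fitted by ordinary least squares after the appropriate transformation, and the sum of squared errors and R² are computed on the original cost scale.

```python
import numpy as np


def fit_line(u, v):
    """Least-squares line v = a + b*u; returns (a, b)."""
    b, a = np.polyfit(u, v, 1)
    return a, b


def compare_forms(capacity, unit_cost):
    """Fit simple, reciprocal and log-log forms; return {name: (sse, r2)}."""
    x = np.asarray(capacity, dtype=float)
    y = np.asarray(unit_cost, dtype=float)
    predictions = {}

    a, b = fit_line(x, y)                   # simple:     y = a + b*x
    predictions["simple"] = a + b * x

    a, b = fit_line(1.0 / x, y)             # reciprocal: y = a + b/x
    predictions["reciprocal"] = a + b / x

    a, b = fit_line(np.log(x), np.log(y))   # log-log:    ln y = a + b*ln x
    predictions["log-log"] = np.exp(a) * x ** b

    sst = np.sum((y - y.mean()) ** 2)
    results = {}
    for name, y_hat in predictions.items():
        sse = np.sum((y - y_hat) ** 2)
        results[name] = (sse, 1.0 - sse / sst)
    return results


# Synthetic placeholder data (assumed), built from a reciprocal cost curve.
capacity_demo = np.array([60.0, 120.0, 300.0, 600.0, 1200.0, 1800.0])
cost_demo = 0.05 + 3.0 / capacity_demo

results = compare_forms(capacity_demo, cost_demo)
for name, (sse, r2) in results.items():
    print(f"{name:10s} SSE={sse:.3e} R2={r2:.4f}")
```

    A reciprocal form implies that unit cost falls toward an asymptote as capacity grows, which is consistent with the abstract's conclusion that facilities should run at high capacity.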

  1. Principal Component and Multiple Regression Analyses for the Estimation of Suspended Sediment Yield in Ungauged Basins of Northern Thailand

    OpenAIRE

    Piyawat Wuttichaikitcharoen; Mukand Singh Babel

    2014-01-01

    Predicting sediment yield is necessary for good land and water management in any river basin. However, sometimes, the sediment data is either not available or is sparse, which renders estimating sediment yield a daunting task. The present study investigates the factors influencing suspended sediment yield using the principal component analysis (PCA). Additionally, the regression relationships for estimating suspended sediment yield, based on the selected key factors from the PCA, are develope...

  2. A regression analysis on the green olives debittering

    Directory of Open Access Journals (Sweden)

    Kopsidas, Gerassimos C.

    1991-12-01

    Full Text Available In this paper, a regression model is fitted that gives the debittering time t as a function of the sodium hydroxide concentration C and the debittering temperature T, for the debittering of medium-size green olive fruit of the Conservolea variety. The model has the simple form $t = a_0 C^{a_1} e^{a_2/T}$, where $a_0$, $a_1$, and $a_2$ are constants. The values of $a_0$, $a_1$, and $a_2$ are determined by the method of least squares from a set of experimental data. The fitted model is very satisfactory for the conditions in which Greek green olives are debittered.
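
    One way to estimate the three constants of such a model is to take logarithms, which gives the linear form ln t = ln a0 + a1 ln C + a2 / T, and then apply ordinary least squares. The sketch below assumes this log-linearized fit; the abstract does not say whether the authors fitted in log space or directly on t, and the two give different estimates when the data are noisy. The function name, the use of kelvin for T, and the data are hypothetical.

```python
import numpy as np


def fit_debittering_model(C, T, t):
    """Fit t = a0 * C**a1 * exp(a2 / T) by least squares on ln t.

    Returns (a0, a1, a2). T is assumed to be an absolute temperature.
    """
    C, T, t = (np.asarray(v, dtype=float) for v in (C, T, t))
    # Design matrix for ln t = ln a0 + a1 * ln C + a2 * (1 / T)
    X = np.column_stack([np.ones_like(C), np.log(C), 1.0 / T])
    coef, *_ = np.linalg.lstsq(X, np.log(t), rcond=None)
    return np.exp(coef[0]), coef[1], coef[2]


# Hypothetical noise-free data generated from assumed constants.
TRUE_A0, TRUE_A1, TRUE_A2 = 0.5, -1.2, 1500.0
C_grid, T_grid = np.meshgrid([1.0, 2.0, 3.0], [288.0, 298.0, 308.0])
C_demo, T_demo = C_grid.ravel(), T_grid.ravel()
t_demo = TRUE_A0 * C_demo ** TRUE_A1 * np.exp(TRUE_A2 / T_demo)

a0, a1, a2 = fit_debittering_model(C_demo, T_demo, t_demo)
print(f"a0={a0:.4f}, a1={a1:.4f}, a2={a2:.1f}")
```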

  3. THE THEORY AND APPLICATION OF REGRESSION ANALYSIS AND THE LEAST-SQUARES PRINCIPLE

    Directory of Open Access Journals (Sweden)

    P. De Viliers

    2012-02-01

    Full Text Available The theory and practice of regression analysis, and the principle of least squares on which it is based, are frequently encountered in mathematics and particularly in statistical mathematics, but some very useful applications in a military environment are less well known. The aim of this article is therefore, first, to give a general description of the theory of regression analysis and, second, to highlight some military applications of the theory.

  4. Regression and local control rates after radiotherapy for jugulotympanic paragangliomas: systematic review and meta-analysis.

    Science.gov (United States)

    van Hulsteijn, Leonie T; Corssmit, Eleonora P M; Coremans, Ida E M; Smit, Johannes W A; Jansen, Jeroen C; Dekkers, Olaf M

    2013-02-01

    The primary treatment goal of radiotherapy for paragangliomas of the head and neck region (HNPGLs) is local control of the tumor, i.e. stabilization of tumor volume. Interestingly, regression of tumor volume has also been reported. Up to the present, no meta-analysis has been performed giving an overview of regression rates after radiotherapy in HNPGLs. The main objective was to perform a systematic review and meta-analysis to assess regression of tumor volume in HNPGL-patients after radiotherapy. A second outcome was local tumor control. Design of the study is systematic review and meta-analysis. PubMed, EMBASE, Web of Science, COCHRANE and Academic Search Premier and references of key articles were searched in March 2012 to identify potentially relevant studies. Considering the indolent course of HNPGLs, only studies with ≥ 12 months follow-up were eligible. Main outcomes were the pooled proportions of regression and local control after radiotherapy as initial, combined (i.e. directly post-operatively or post-embolization) or salvage treatment (i.e. after initial treatment has failed) for HNPGLs. A meta-analysis was performed with an exact likelihood approach using a logistic regression with a random effect at the study level. Pooled proportions with 95% confidence intervals (CI) were reported. Fifteen studies were included, concerning a total of 283 jugulotympanic HNPGLs in 276 patients. Pooled regression proportions for initial, combined and salvage treatment were respectively 21%, 33% and 52% in radiosurgery studies and 4%, 0% and 64% in external beam radiotherapy studies. Pooled local control proportions for radiotherapy as initial, combined and salvage treatment ranged from 79% to 100%. Radiotherapy for jugulotympanic paragangliomas results in excellent local tumor control and therefore is a valuable treatment for these types of tumors.
The effects of radiotherapy on regression of tumor volume remain ambiguous, although the data suggest that regression can be achieved at least in some patients. More research is needed to identify predictors for treatment success. PMID:23332889

  5. Regression and local control rates after radiotherapy for jugulotympanic paragangliomas: Systematic review and meta-analysis

    International Nuclear Information System (INIS)

    The primary treatment goal of radiotherapy for paragangliomas of the head and neck region (HNPGLs) is local control of the tumor, i.e. stabilization of tumor volume. Interestingly, regression of tumor volume has also been reported. Up to the present, no meta-analysis has been performed giving an overview of regression rates after radiotherapy in HNPGLs. The main objective was to perform a systematic review and meta-analysis to assess regression of tumor volume in HNPGL-patients after radiotherapy. A second outcome was local tumor control. Design of the study is systematic review and meta-analysis. PubMed, EMBASE, Web of Science, COCHRANE and Academic Search Premier and references of key articles were searched in March 2012 to identify potentially relevant studies. Considering the indolent course of HNPGLs, only studies with ≥ 12 months follow-up were eligible. Main outcomes were the pooled proportions of regression and local control after radiotherapy as initial, combined (i.e. directly post-operatively or post-embolization) or salvage treatment (i.e. after initial treatment has failed) for HNPGLs. A meta-analysis was performed with an exact likelihood approach using a logistic regression with a random effect at the study level. Pooled proportions with 95% confidence intervals (CI) were reported. Fifteen studies were included, concerning a total of 283 jugulotympanic HNPGLs in 276 patients. Pooled regression proportions for initial, combined and salvage treatment were respectively 21%, 33% and 52% in radiosurgery studies and 4%, 0% and 64% in external beam radiotherapy studies. Pooled local control proportions for radiotherapy as initial, combined and salvage treatment ranged from 79% to 100%. Radiotherapy for jugulotympanic paragangliomas results in excellent local tumor control and therefore is a valuable treatment for these types of tumors.
The effects of radiotherapy on regression of tumor volume remain ambiguous, although the data suggest that regression can be achieved at least in some patients. More research is needed to identify predictors for treatment success

  6. The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard

    2013-01-01

    This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression function. However, the a priori specification of a functional form involves the risk of choosing one that is not similar to the “true” but unknown relationship between the regressors and the dependent variable. This problem, known as parametric misspecification, can result in biased parameter estimates and hence, also in biased measures, which are derived from the estimated parameters. This, in turn, can result in incorrect economic conclusions and recommendations for managers, politicians and decision makers in general. This PhD thesis focuses on a nonparametric econometric approach that can be used to avoid this problem. The main objective is to investigate the applicability of the nonparametric kernel regression method in applied production analysis. The focus of the empirical analyses included in this thesis is the agricultural sector in Poland. Data on Polish farms are used to investigate practically and politically relevant problems and to illustrate how nonparametric regression methods can be used in applied microeconomic production analysis both in panel data and cross-section data settings. The thesis consists of four papers. The first paper addresses problems of parametric and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences within a nonparametric panel data regression framework. 
The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric kernel methods are well-suited to econometric production analysis and can outperform traditional parametric methods. Although the empirical focus of this thesis is on the application of nonparametric kernel regression in applied production analysis, the findings are also applicable to econometric estimations in general.
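
    As a point of reference for the method discussed above, the simplest kernel regression estimator is the local-constant (Nadaraya-Watson) estimator, which predicts at each point a kernel-weighted average of the observed responses and so requires no functional form. The sketch below assumes a Gaussian kernel and a fixed, hand-picked bandwidth; the abstract does not state which estimators or bandwidth selectors the thesis uses, and the function name and data are hypothetical.

```python
import numpy as np


def nadaraya_watson(x_train, y_train, x_eval, bandwidth):
    """Local-constant kernel regression with a Gaussian kernel (one regressor)."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    # Scaled distances between every evaluation point and every observation.
    u = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * u ** 2)
    # Weighted average of the responses; no parametric form is imposed.
    return (weights @ y_train) / weights.sum(axis=1)


# Hypothetical data: a nonlinear relationship a straight line would misfit.
x_demo = np.linspace(0.0, 1.0, 50)
y_demo = np.sin(2.0 * np.pi * x_demo)

pred = nadaraya_watson(x_demo, y_demo, [0.25], bandwidth=0.05)
print(f"estimate at x=0.25: {pred[0]:.3f}")
```

    The bandwidth controls the trade-off between bias and variance: a larger value smooths more and flattens peaks, while a smaller value follows the data more closely.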

  7. A Noncentral "t" Regression Model for Meta-Analysis

    Science.gov (United States)

    Camilli, Gregory; de la Torre, Jimmy; Chiu, Chia-Yi

    2010-01-01

    In this article, three multilevel models for meta-analysis are examined. Hedges and Olkin suggested that effect sizes follow a noncentral "t" distribution and proposed several approximate methods. Raudenbush and Bryk further refined this model; however, this procedure is based on a normal approximation. In the current research literature, this…

  8. Accounting for Misclassified Outcomes in Binary Regression Models Using Multiple Imputation With Internal Validation Data

    OpenAIRE

    Edwards, Jessie K.; Cole, Stephen R.; Troester, Melissa A.; Richardson, David B.

    2013-01-01

    Outcome misclassification is widespread in epidemiology, but methods to account for it are rarely used. We describe the use of multiple imputation to reduce bias when validation data are available for a subgroup of study participants. This approach is illustrated using data from 308 participants in the multicenter Herpetic Eye Disease Study between 1992 and 1998 (48% female; 85% white; median age, 49 years). The odds ratio comparing the acyclovir group with the placebo group on the gold-stand...

  9. Robust regression applied to fractal/multifractal analysis.

    OpenAIRE

    Portilla, F.; Valencia Delfa, José Luis; Tarquis Alfonso, Ana Maria; Saa Requejo, Antonio

    2012-01-01

    Fractal and multifractal are concepts that have grown increasingly popular in recent years in soil analysis, along with the development of fractal models. One of the common steps is to calculate the slope of a linear fit, commonly using the least squares method. This shouldn't be a special problem; however, in many situations using experimental data the researcher has to select the range of scales at which to work, neglecting the rest of the points, to achieve the best linearity that in thi...

  10. Using Multiple Regression in Estimating (semi) VOC Emissions and Concentrations at the European Scale

    DEFF Research Database (Denmark)

    Fauser, Patrik; Thomsen, Marianne

    2010-01-01

    This paper proposes a simple method for estimating emissions and predicted environmental concentrations (PECs) in water and air for organic chemicals that are used in household products and industrial processes. The method has been tested on existing data for 63 organic high-production volume chemicals available in the European Chemicals Bureau risk assessment reports (RARs). The method suggests a simple linear relationship between Henry's Law constant, octanol-water coefficient, use and production volumes, and emissions and PECs on a regional scale in the European Union. Emissions and PECs are a result of a complex interaction between chemical properties, production and use patterns and geographical characteristics. A linear relationship cannot capture these complexities; however, it may be applied at a cost-efficient screening level for suggesting critical chemicals that are candidates for an in-depth risk assessment. Uncertainty measures are not available for the RAR data; however, uncertainties for the applied regression models are given in the paper. Evaluation of the methods reveals that between 79% and 93% of all emission and PEC estimates are within one order of magnitude of the reported RAR values. Bearing in mind that the domain of the method comprises organic industrial high-production volume chemicals, four chemicals, prioritized in the Water Framework Directive and the Stockholm Convention on Persistent Organic Pollutants, were used to test the method for estimated emissions and PECs, with corresponding uncertainty intervals, in air and water at regional EU level.

  11. EPSVR and EPMeta: prediction of antigenic epitopes using support vector regression and multiple server results

    Directory of Open Access Journals (Sweden)

    Yao Bo

    2010-07-01

    Full Text Available Abstract Background: Accurate prediction of antigenic epitopes is important for immunologic research and medical applications, but it is still an open problem in bioinformatics. The case for discontinuous epitopes is even worse - currently there are only a few discontinuous epitope prediction servers available, though discontinuous peptides constitute the majority of all B-cell antigenic epitopes. The small number of structures for antigen-antibody complexes limits the development of reliable discontinuous epitope prediction methods and an unbiased benchmark to evaluate developed methods. Results: In this work, we present two novel server applications for discontinuous epitope prediction: EPSVR and EPMeta, where EPMeta is a meta server. EPSVR, EPMeta, and datasets are available at http://sysbio.unl.edu/services. Conclusion: The server application for discontinuous epitope prediction, EPSVR, uses a Support Vector Regression (SVR) method to integrate six scoring terms. Furthermore, we combined EPSVR with five existing epitope prediction servers to construct EPMeta. All methods were benchmarked by our curated independent test set, in which all antigens had no complex structures with the antibody, and their epitopes were identified by various biochemical experiments. The area under the receiver operating characteristic curve (AUC) of EPSVR was 0.597, higher than that of any other existing single server, and EPMeta had a better performance than any single server - with an AUC of 0.638, significantly higher than PEPITO and Disctope (p-value

  12. Correlating phosphoproteomic signaling with castration resistant prostate cancer survival through regression analysis.

    Science.gov (United States)

    Lescarbeau, Reynald; Kaplan, David L

    2014-03-01

    Prostate cancer most commonly presents as initially castration dependent; however, in a minority of patients the disease will progress to a state of castration resistance. Here, approaches for correlating alterations in the phosphoproteome with androgen independent cell survival in the LNCaP, PC3, and MDa-PCa-2b cell lines are discussed. The performance of four regression techniques (multiple linear, ridge, principal component, and partial least squares regression) is compared. The predictive performance of these algorithms over randomized data sets and using the Akaike Information Criterion is explored, and principal component and partial least squares regression are found to outperform other regression approaches. The effect of altering the number of features versus observations on the R² value and predictive performance is also examined using the partial least squares regression model. Utilizing these approaches, "drivers" of castration resistant disease can be identified whose modulation alters phenotypic outcomes. These data provide an empirical comparison of the various considerations when statistically analyzing phosphorylation data with the aim of correlating with phenotypic outcomes. PMID:24413303

  13. Experimental and regression analysis for multi cylinder diesel engine operated with hybrid fuel blends

    OpenAIRE

    Gopal Rajendiran; Kavandappa-Goundar Mayilsamy; Ramasamy Subramanian; Natarajan Nedunchezhian; Ramasamy Venkatachalam

    2014-01-01

    The purpose of this research work is to build a multiple linear regression model for the characteristics of a multicylinder diesel engine using multicomponent blends (diesel-pungamia methyl ester-ethanol) as fuel. Nine blends were tested by varying diesel (100 to 10% by vol.), biodiesel (80 to 10% by vol.) and keeping ethanol constant at 10%. The brake thermal efficiency, smoke, oxides of nitrogen, carbon dioxide, maximum cylinder pressure, angle of maximum ...

  14. Principal Component and Multiple Regression Analyses for the Estimation of Suspended Sediment Yield in Ungauged Basins of Northern Thailand

    Directory of Open Access Journals (Sweden)

    Piyawat Wuttichaikitcharoen

    2014-08-01

    Full Text Available Predicting sediment yield is necessary for good land and water management in any river basin. However, sometimes, the sediment data is either not available or is sparse, which renders estimating sediment yield a daunting task. The present study investigates the factors influencing suspended sediment yield using the principal component analysis (PCA). Additionally, the regression relationships for estimating suspended sediment yield, based on the selected key factors from the PCA, are developed. The PCA shows six components of key factors that can explain at least up to 86.7% of the variation of all variables. The regression models show that basin size, channel network characteristics, land use, basin steepness and rainfall distribution are the key factors affecting sediment yield. The validation of regression relationships for estimating suspended sediment yield shows the error of estimation ranging from −55% to +315% and −59% to +259% for suspended sediment yield and for area-specific suspended sediment yield, respectively. The proposed relationships may be considered useful for predicting suspended sediment yield in ungauged basins of Northern Thailand that have geologic, climatic and hydrologic conditions similar to the study area.
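
    The statement that a set of components explains a given share of the variation refers to the cumulative explained-variance ratio of the PCA. The sketch below shows how that quantity can be computed from standardized variables with a singular value decomposition. The data are random placeholders rather than basin characteristics, the function names are assumptions, and the 86.7% figure is reused only as an example threshold.

```python
import numpy as np


def pca_explained_variance(X):
    """PCA on standardized columns; returns (variance ratios, component loadings)."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Singular values of Z give the component variances, largest first.
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    variances = s ** 2 / (Z.shape[0] - 1)
    return variances / variances.sum(), vt


def n_components_for(ratio, target):
    """Smallest number of leading components whose cumulative ratio reaches target."""
    k = int(np.searchsorted(np.cumsum(ratio), target)) + 1
    return min(k, len(ratio))


# Placeholder data (assumed): 40 basins, 6 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 2))
loadings = rng.normal(size=(2, 6))
X_demo = latent @ loadings + 0.3 * rng.normal(size=(40, 6))

ratio, components = pca_explained_variance(X_demo)
k = n_components_for(ratio, 0.867)
print("cumulative explained variance:", np.round(np.cumsum(ratio), 3))
print("components needed for 86.7%:", k)
```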

  15. Econometric analysis of realized covariation: high frequency based covariance, regression, and correlation in financial economics

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Shephard, N.

    2004-01-01

    This paper analyses multivariate high frequency financial data using realized covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis, and covariance. It will be based on a fixed interval of time (e.g., a day or week), allowing the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions, and covariances change through time. In particular we provide confidence intervals for each of these quantities.
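
    The quantities named above can be sketched as follows, assuming the usual definitions: the realized covariation over a fixed interval is the sum of outer products of the high frequency return vectors, the realized regression coefficient is a realized covariance divided by a realized variance, and the realized correlation is the covariance scaled by the two realized standard deviations. The function names and the simulated returns are illustrative assumptions, and the paper's asymptotic confidence intervals are not reproduced here.

```python
import numpy as np


def realized_covariation(returns):
    """Sum of outer products of intraday return vectors; input shape (M, d)."""
    r = np.asarray(returns, dtype=float)
    return r.T @ r


def realized_beta(rc, y=0, x=1):
    """Realized regression coefficient of asset y on asset x."""
    return rc[y, x] / rc[x, x]


def realized_corr(rc, i=0, j=1):
    """Realized correlation between assets i and j."""
    return rc[i, j] / np.sqrt(rc[i, i] * rc[j, j])


# Simulated placeholder returns: 288 five-minute returns in one day.
rng = np.random.default_rng(1)
x_ret = 0.001 * rng.normal(size=288)
y_ret = 0.8 * x_ret + 0.0005 * rng.normal(size=288)

rc_demo = realized_covariation(np.column_stack([y_ret, x_ret]))
print(f"realized beta: {realized_beta(rc_demo):.3f}")
print(f"realized correlation: {realized_corr(rc_demo):.3f}")
```

    Recomputing these statistics for each day or week gives a time series of betas and correlations, which is what allows their changes through time to be studied.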

  16. A viscoelastic model of blood capillary extension and regression: derivation, analysis, and simulation.

    Science.gov (United States)

    Zheng, Xiaoming; Xie, Chunjing

    2014-01-01

    This work studies a fundamental problem in blood capillary growth: how the cell proliferation or death induces the stress response and the capillary extension or regression. We develop a one-dimensional viscoelastic model of blood capillary extension/regression under nonlinear friction with surroundings, analyze its solution properties, and simulate various growth patterns in angiogenesis. The mathematical model treats the cell density as the growth pressure eliciting a viscoelastic response from the cells, which again induces extension or regression of the capillary. Nonlinear analysis captures two cases when the biologically meaningful solution exists: (1) the cell density decreases from root to tip, which may occur in vessel regression; (2) the cell density is time-independent and is of small variation along the capillary, which may occur in capillary extension without proliferation. The linear analysis with perturbation in cell density due to proliferation or death predicts the global biological solution exists provided the change in cell density is sufficiently slow in time. Examples with blow-ups are captured by numerical approximations and the global solutions are recovered by slow growth processes, which validate the linear analysis theory. Numerical simulations demonstrate this model can reproduce angiogenesis experiments under several biological conditions including blood vessel extension without proliferation and blood vessel regression. PMID:23149501

  17. Multivariate Regression Approach To Integrate Multiple Satellite And Tide Gauge Data For Real Time Sea Level Prediction

    DEFF Research Database (Denmark)

    Cheng, Yongcun; Andersen, Ole Baltazar

    2010-01-01

    The Sea Level Thematic Assembly Center in the EU FP7 MyOcean project aims to build a sea level service for multiple satellite sea level observations at a European level for GMES marine applications. It aims to improve the sea level related products to guarantee the sustainability and the quality of the GMES marine core service. One such added value will be a multivariate regression model of sea level variability from multisatellite and in-situ tide gauge observations, with the aim of improved future high spatial and temporal sea level prediction for, e.g., human safety. Tide gauge and satellite altimetry data from the last seventeen years have been compared for an area around the UK, and temporal correlation coefficients between them were calculated. The results are extremely encouraging, as we have shown that the detided signal from the response method correlates to more than 90% for nearly all tide gauge stations with satellite altimetry.

  18. Calculation of Slater-Condon and Lande parameters in some Nd³⁺ complexes using partial and multiple regression method

    International Nuclear Information System (INIS)

    The interelectronic repulsion and spin-orbit interaction parameters for some Nd³⁺ β-diketone complexes have been computed using the partial and multiple regression method from the observed absorption spectra in the region 1000-23500 cm⁻¹. A brief outline of this method, which is an alternative to a computer programming method, is given. The energy parameters (Slater-Condon and Landé) derived from intra-f^N transitions of the lanthanide ion are important for predicting the covalent tendency of the metal-ligand bond in the complex on the basis of the decrease in the value of these parameters. The complexes have been arranged in increasing order of covalency, as indicated by the value of β or b^(1/2). (author)

  19. Modelling of Habitat Suitability Index for Muntjac Muntiacus muntjak Using Remote Sensing, GIS and Multiple Logistic Regression

    Directory of Open Access Journals (Sweden)

    Imam EKWAL

    2012-12-01

    Full Text Available Habitat degradation and loss have been widely recognized as the main cause of the decline of wildlife populations. Evaluating the quality of wildlife habitat can provide essential information for wildlife refuge design and management. The purpose of this study was to produce georeferenced ecological information about suitable habitats available for muntjac, Muntiacus muntjak, in Chandoli tiger reserve, India (17° 04' 00" N to 17° 19' 54" N and 73° 40' 43" E to 73° 53' 09" E). Habitats were evaluated using multiple logistic regression integrated with remote sensing and a geographic information system. LISS-III satellite imagery from IRS-P6 of the study area was digitally processed. To generate collateral data, topographic maps were analysed in a GIS framework. Layers of different variables such as land use/land cover, forest density, proximity to disturbances and water resources, and a digital terrain model were created from satellite imagery and topographic sheets. These layers, along with GPS locations of muntjac presence/absence and multiple logistic regression (MLR) techniques, were integrated in a GIS environment to model the habitat suitability index of muntjac. The results indicate that approximately 222.39 km2 (75.4%) of the forest of the tiger reserve was least suitable for muntjac, whereas 29.53 km2 (10.02%) was moderately suitable, 22.12 km2 (7.5%) suitable and 20.70 km2 (7.0%) highly suitable. The accuracy level of this model was 97.6%. The model can be considered potent enough to advocate that the forests of this area are most appropriate for declaration as a reserve for muntjac conservation, ultimately to provide a prey base for tigers.

  20. Texture Analysis and Classification With Linear Regression Model Based on Wavelet Transform

    OpenAIRE

    Wang, Zhi-zhong; Yong, Jun-hai

    2008-01-01

    The wavelet transform, as an important multiresolution analysis tool, has already been commonly applied to texture analysis and classification. Nevertheless, it ignores the structural information while capturing the spectral information of the texture image at different scales. In this paper, we propose a texture analysis and classification approach with the linear regression model based on the wavelet transform. This method is motivated by the observation that there exists a distinctive correl...

  1. Regression Analysis of Top of Descent Location for Idle-thrust Descents

    Science.gov (United States)

    Stell, Laurel; Bronsvoort, Jesper; McDonald, Greg

    2013-01-01

    In this paper, multiple regression analysis is used to model the top of descent (TOD) location of user-preferred descent trajectories computed by the flight management system (FMS) on over 1000 commercial flights into Melbourne, Australia. The independent variables cruise altitude, final altitude, cruise Mach, descent speed, wind, and engine type were also recorded or computed post-operations. Both first-order and second-order models are considered, where cross-validation, hypothesis testing, and additional analysis are used to compare models. This identifies the models that should give the smallest errors if used to predict TOD location for new data in the future. A model that is linear in TOD altitude, final altitude, descent speed, and wind gives an estimated standard deviation of 3.9 nmi for TOD location given the trajectory parameters, which means about 80% of predictions would have error less than 5 nmi in absolute value. This accuracy is better than demonstrated by other ground automation predictions using kinetic models. Furthermore, this approach would enable online learning of the model. Additional data or further knowledge of algorithms is necessary to conclude definitively that no second-order terms are appropriate. Possible applications of the linear model are described, including enabling arriving aircraft to fly optimized descents computed by the FMS even in congested airspace. In particular, a model for TOD location that is linear in the independent variables would enable decision support tool human-machine interfaces for which a kinetic approach would be computationally too slow.
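
    The step from a 3.9 nmi standard deviation to "about 80% of predictions within 5 nmi" can be checked with a short sketch. It assumes the prediction errors are zero-mean and normally distributed, which the abstract does not state explicitly; the function name is hypothetical.

```python
import math


def prob_abs_error_below(threshold_nmi: float, sigma_nmi: float) -> float:
    """P(|error| < threshold) for a zero-mean normal error with std dev sigma.

    Assumption: errors are normal with mean zero (not stated in the abstract).
    """
    return math.erf(threshold_nmi / (sigma_nmi * math.sqrt(2.0)))


# Values quoted in the abstract: sigma = 3.9 nmi, threshold = 5 nmi.
share_within_5_nmi = prob_abs_error_below(5.0, 3.9)
print(f"Share of predictions within 5 nmi: {share_within_5_nmi:.2f}")
```

    Under that normality assumption the computed share is close to the 80% figure quoted in the abstract.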

  2. Regression analysis with missing data and unknown colored noise: application to the MICROSCOPE space mission

    CERN Document Server

    Baghi, Q; Bergé, J; Christophe, B; Touboul, P; Rodrigues, M

    2015-01-01

    The analysis of physical measurements often copes with highly correlated noises and interruptions caused by outliers, saturation events or transmission losses. We assess the impact of missing data on the performance of linear regression analysis involving the fit of modeled or measured time series. We show that data gaps can significantly alter the precision of the regression parameter estimation in the presence of colored noise, due to the frequency leakage of the noise power. We present a regression method which cancels this effect and estimates the parameters of interest with a precision comparable to the complete data case, even if the noise power spectral density (PSD) is not known a priori. The method is based on an autoregressive (AR) fit of the noise, which allows us to build an approximate generalized least squares estimator approaching the minimal variance bound. The method, which can be applied to any similar data processing, is tested on simulated measurements of the MICROSCOPE space mission, whos...

  3. CARAT-GxG: CUDA-Accelerated Regression Analysis Toolkit for Large-Scale Gene-Gene Interaction with GPU Computing System.

    Science.gov (United States)

    Lee, Sungyoung; Kwon, Min-Seok; Park, Taesung

    2014-01-01

    In genome-wide association studies (GWAS), regression analysis has been most commonly used to establish an association between a phenotype and genetic variants, such as single nucleotide polymorphisms (SNPs). However, most applications of regression analysis have been restricted to the investigation of a single marker because of the large computational burden. Thus, there have been limited applications of regression analysis to multiple SNPs, including gene-gene interaction (GGI) in large-scale GWAS data. In order to overcome this limitation, we propose CARAT-GxG, a GPU computing system-oriented toolkit, for performing regression analysis with GGI using CUDA (compute unified device architecture). Compared to other methods, CARAT-GxG achieved almost 700-fold execution speed and delivered highly reliable results through our GPU-specific optimization techniques. In addition, it was possible to achieve almost-linear speed acceleration with the application of a GPU computing system, which is implemented by the TORQUE Resource Manager. We expect that CARAT-GxG will enable large-scale regression analysis with GGI for GWAS data. PMID:25574130

  5. Development of an empirical model of turbine efficiency using the Taylor expansion and regression analysis

    International Nuclear Information System (INIS)

    The empirical model of turbine efficiency is necessary for the control- and/or diagnosis-oriented simulation and useful for the simulation and analysis of dynamic performances of the turbine equipment and systems, such as air cycle refrigeration systems, power plants, turbine engines, and turbochargers. Existing empirical models of turbine efficiency are insufficient because there is no suitable form available for air cycle refrigeration turbines. This work performs a critical review of empirical models (called mean value models in some literature) of turbine efficiency and develops an empirical model in the desired form for air cycle refrigeration, the dominant cooling approach in aircraft environmental control systems. The Taylor series and regression analysis are used to build the model, with the Taylor series being used to expand functions with the polytropic exponent and the regression analysis to finalize the model. The measured data of a turbocharger turbine and two air cycle refrigeration turbines are used for the regression analysis. The proposed model is compact and able to present the turbine efficiency map. Its predictions agree with the measured data very well, with the corrected coefficient of determination Rc2 ? 0.96 and the mean absolute percentage deviation = 1.19% for the three turbines. Highlights: Performed a critical review of empirical models of turbine efficiency. Developed an empirical model in the desired form for air cycle refrigeration, using the Taylor expansion and regression analysis. Verified the method for developing the empirical model. Verified the model.

  6. NEW IDEA FOR THE TOPOLOGICAL INDEX EVALUATION AND TREATISE MULTIPLE REGRESSION WITH THREE INDEPENDENT VARIABLES: SATURATED HYDROCARBONS USED LIKE A MODEL

    OpenAIRE

    Cornwell, E.

    2006-01-01

    In the QSRR discipline, a novel, easy-to-use parameter (Vc) was designed to evaluate classical topological indices (W, ¹chi, Z, MTI) and two new-generation ones (Xu, ¹chih). Regression between Vc and ¹chih gave a correlation coefficient (r) of 0.9992, a surprisingly high value compared with those commonly found in the QSPR/QSAR discipline. Using the Vc parameter, an approach to multiple regression with three independent variables is presented. A set of 35 saturated hydrocarbons was used as the model.

  7. Linear Maximum Likelihood Regression Analysis for Untransformed Log-Normally Distributed Data

    Directory of Open Access Journals (Sweden)

    Sara M. Gustavsson

    2012-10-01

    Full Text Available Medical research data are often skewed and heteroscedastic. It has therefore become practice to log-transform data in regression analysis, in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large simulation study. Log-normal observations were generated according to the simulation models and parameters were estimated using the new ML method, ordinary least-squares regression (LS) and weighted least-squares regression (WLS). All three methods produced unbiased estimates of parameters and expected response, and ML and WLS yielded smaller standard errors than LS. The approximate normality of the Wald statistic, used for tests of the ML estimates, in most situations produced correct type I error risk. Only ML and WLS produced correct confidence intervals for the estimated expected value. ML had the highest power for tests regarding β1.

  8. Semiparametric modeling and estimation of heteroscedasticity in regression analysis of cross-sectional data

    OpenAIRE

    Keilegom, Ingrid; Wang, Lan

    2010-01-01

    We consider the problem of modeling heteroscedasticity in semiparametric regression analysis of cross-sectional data. Existing work in this setting is rather limited and mostly adopts a fully nonparametric variance structure. This approach is hampered by the curse of dimensionality in practical applications. Moreover, the corresponding asymptotic theory is largely restricted to estimators that minimize certain smooth objective functions. The asymptotic derivation thus excludes semiparametric quant...

  9. Isolating the Effects of Training Using Simple Regression Analysis: An Example of the Procedure.

    Science.gov (United States)

    Waugh, C. Keith

    This paper provides a case example of simple regression analysis, a forecasting procedure used to isolate the effects of training from an identified extraneous variable. This case example focuses on results of a three-day sales training program to improve bank loan officers' knowledge, skill-level, and attitude regarding solicitation and sale of…

  10. What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

    Science.gov (United States)

    Thomas, Emily H.; Galambos, Nora

    2004-01-01

    To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

  11. BMDP program for piecewise linear regression.

    Science.gov (United States)

    Nakamura, T

    1986-08-01

    Piecewise linear regression has potentially broad applications in medical data analysis as well as other types of regression. Various kinds of algorithms have been proposed for finding optimum piecewise linear regressions. This paper presents a BMDP program for obtaining near optimum piecewise linear regression equations. An idea intrinsic to the method is that restricting parameter space to a discrete set makes the difficult problems become standard problems. Any software having the variable selection feature in the multiple linear regression can be used to apply the method. PMID:3638186
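
    The idea that restricting the parameter space to a discrete set turns the problem into a standard one can be illustrated with a minimal sketch. It is not the BMDP program described above: it assumes a single breakpoint, a continuous hinge term max(x - knot, 0), and a hypothetical grid of candidate knots, and it fits each candidate by ordinary least squares.

```python
def solve_linear_system(a, b):
    """Solve a small dense system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_for_knot(xs, ys, knot):
    """OLS fit of y = b0 + b1*x + b2*max(x - knot, 0); returns (sse, coefficients)."""
    rows = [[1.0, x, max(x - knot, 0.0)] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    coef = solve_linear_system(xtx, xty)
    sse = sum((y - sum(c * v for c, v in zip(coef, r))) ** 2 for r, y in zip(rows, ys))
    return sse, coef


def fit_piecewise_linear(xs, ys, candidate_knots):
    """Pick the knot from a discrete candidate set that minimizes the SSE.

    Candidate knots should lie strictly inside the range of xs; otherwise the
    hinge column is constant or collinear and the normal equations are singular.
    """
    best = None
    for knot in candidate_knots:
        sse, coef = fit_for_knot(xs, ys, knot)
        if best is None or sse < best[0]:
            best = (sse, knot, coef)
    return best


# Hypothetical noise-free data with a true breakpoint at x = 4.
xs = [i * 0.5 for i in range(21)]
ys = [1.0 + 2.0 * x - 1.5 * max(x - 4.0, 0.0) for x in xs]
sse, knot, coef = fit_piecewise_linear(xs, ys, [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
print(knot, [round(c, 3) for c in coef])
```

    Because each candidate knot yields an ordinary linear regression, any routine that can fit or select among linear models can take the place of the hand-written solver used here.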

  12. Methods and applications of linear models regression and the analysis of variance

    CERN Document Server

    Hocking, Ronald R

    2013-01-01

    Praise for the Second Edition: "An essential desktop reference book . . . it should definitely be on your bookshelf." -Technometrics. A thoroughly updated book, Methods and Applications of Linear Models: Regression and the Analysis of Variance, Third Edition features innovative approaches to understanding and working with models and theory of linear regression. The Third Edition provides readers with the necessary theoretical concepts, which are presented using intuitive ideas rather than complicated proofs, to describe the inference that is appropriate for the methods being discussed. The book

  13. Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales

    Science.gov (United States)

    Kristoufek, Ladislav

    2015-02-01

    We propose a framework combining detrended fluctuation analysis with standard regression methodology. The method is built on detrended variances and covariances and it is designed to estimate regression parameters at different scales and under potential nonstationarity and power-law correlations. The former feature allows for distinguishing between effects for a pair of variables from different temporal perspectives. The latter ones make the method a significant improvement over the standard least squares estimation. Theoretical claims are supported by Monte Carlo simulations. The method is then applied on selected examples from physics, finance, environmental science, and epidemiology. For most of the studied cases, the relationship between variables of interest varies strongly across scales.

  14. Statistical methods in regression and calibration analysis of chromosome aberration data

    International Nuclear Information System (INIS)

    The method of iteratively reweighted least squares for the regression analysis of Poisson distributed chromosome aberration data is reviewed in the context of other fit procedures used in the cytogenetic literature. As an application of the resulting regression curves, methods for calculating confidence intervals on dose from aberration yield are described and compared, and, for the linear quadratic model, a confidence interval is given. Emphasis is placed on the rational interpretation and the limitations of various methods from a statistical point of view. (orig./MG)
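
    Iteratively reweighted least squares for Poisson counts can be sketched as follows. This is an illustration under simplifying assumptions, not the procedure of the report: it uses a purely linear yield model (count = c + a * dose) instead of the linear quadratic model, one count per dose point, and hypothetical data. With an identity link, the Poisson working weights are 1/mu, so each iteration is a weighted straight-line fit.

```python
def weighted_line_fit(doses, counts, weights):
    """Weighted least squares for count = c + a * dose; returns (c, a)."""
    sw = sum(weights)
    swx = sum(w * d for w, d in zip(weights, doses))
    swy = sum(w * y for w, y in zip(weights, counts))
    swxx = sum(w * d * d for w, d in zip(weights, doses))
    swxy = sum(w * d * y for w, d, y in zip(weights, doses, counts))
    det = sw * swxx - swx * swx
    a = (sw * swxy - swx * swy) / det
    c = (swy - a * swx) / sw
    return c, a


def irls_poisson_identity(doses, counts, iterations=25, floor=1e-8):
    """IRLS for a Poisson model with identity link, mean = c + a * dose.

    The Poisson variance equals the mean, so the weights are 1 / fitted mean.
    The floor keeps fitted means positive (an ad hoc safeguard in this sketch).
    """
    mu = [max(y, floor) for y in counts]  # start from the observed counts
    c = a = 0.0
    for _ in range(iterations):
        weights = [1.0 / m for m in mu]
        c, a = weighted_line_fit(doses, counts, weights)
        mu = [max(c + a * d, floor) for d in doses]
    return c, a


# Hypothetical aberration counts at five dose levels.
doses = [0.0, 1.0, 2.0, 3.0, 4.0]
counts = [2, 3, 6, 6, 10]
intercept, slope = irls_poisson_identity(doses, counts)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```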

  15. Electricity Consumption Analysis Using Spline Regression Models: The Case of a Turkish Province

    Directory of Open Access Journals (Sweden)

    Omer Alkan

    2013-05-01

    Full Text Available Energy is one of the indispensable elements of human life, and electrical energy is the most frequently used energy type. As this type of energy cannot be stored at present, it has to be consumed instantly. In other words, consumer demand has to be met immediately. This paper models the electricity consumption of Erzurum province in 2011 by spline regression and tests whether a statistically significant seasonal variation exists in this consumption. The one-year data set was obtained from the Turkish Electricity Transmission Company Provincial Directorate of Erzurum and was analyzed by means of continuous piecewise polynomial spline regressions. The analysis determined three knots and fitted linear, quadratic and cubic spline regression models.

  16. Analysis of the influence of quantile regression model on mainland tourists' service satisfaction performance.

    Science.gov (United States)

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916
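
    What distinguishes a quantile regression model such as Q0.25 from a mean regression is the loss it minimizes. The sketch below shows the asymmetric "pinball" loss on hypothetical scores and, for simplicity, fits only a constant (no predictors); it is an illustration, not the model used in the study.

```python
def pinball_loss(residual, tau):
    """Asymmetric loss minimized by quantile regression at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant(values, tau):
    """Constant (chosen among the observed values) with the smallest total loss."""
    return min(values, key=lambda q: sum(pinball_loss(v - q, tau) for v in values))


# Hypothetical satisfaction scores.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
median_fit = best_constant(scores, 0.5)
lower_quartile_fit = best_constant(scores, 0.25)
print(median_fit, lower_quartile_fit)
```

    With predictors, the same loss is summed over the residuals of a linear predictor, and a separate coefficient vector is obtained for each quantile level.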

  17. Do cognitive interventions improve general cognition in dementia? A meta-analysis and meta-regression

    Science.gov (United States)

    Huntley, J D; Gould, R L; Liu, K; Smith, M; Howard, R J

    2015-01-01

    Objectives To review the efficacy of cognitive interventions on improving general cognition in dementia. Method Online literature databases and trial registers, previous systematic reviews and leading journals were searched for relevant randomised controlled trials. A systematic review, random-effects meta-analyses and meta-regression were conducted. Cognitive interventions were categorised as: cognitive stimulation (CS), involving a range of social and cognitive activities to stimulate multiple cognitive domains; cognitive training (CT), involving repeated practice of standardised tasks targeting a specific cognitive function; cognitive rehabilitation (CR), which takes a person-centred approach to target impaired function; or mixed CT and stimulation (MCTS). Separate analyses were conducted for general cognitive outcome measures and for studies using ‘active’ (designed to control for non-specific therapeutic effects) and non-active (minimal or no intervention) control groups. Results 33 studies were included. Significant positive effect sizes (Hedges’ g) were found for CS with the mini-mental state examination (MMSE) (g=0.51, 95% CI 0.29 to 0.69; p…) … Alzheimer's disease Assessment Scale-Cognition (ADAS-Cog) (g=−0.26, 95% CI −0.445 to −0.08; p=0.005). There was no evidence that CT or MCTS produced significant improvements on general cognition outcomes and not enough CR studies for meta-analysis. The lowest accepted minimum clinically important difference was reached in 11/17 CS studies for the MMSE, but only 2/9 studies for the ADAS-Cog. Additionally, 95% prediction intervals suggested that although statistically significant, CS may not lead to benefits on the ADAS-Cog in all clinical settings. Conclusions CS improves scores on MMSE and ADAS-Cog in dementia, but benefits on the ADAS-Cog are generally not clinically significant and difficulties with blinding of patients and use of adequate placebo controls make comparison with the results of dementia drug treatments problematic. PMID:25838501

  18. Formal Specification Language Based IaaS Cloud Workload Regression Analysis

    OpenAIRE

    Singh, Sukhpal; Chana, Inderveer

    2014-01-01

    Cloud Computing is an emerging area for accessing computing resources. In general, Cloud service providers offer services that can be clustered into three categories: SaaS, PaaS and IaaS. This paper discusses Cloud workload analysis. An efficient Cloud workload resource mapping technique is proposed. This paper aims to provide a means of understanding and investigating IaaS Cloud workloads and the resources. In this paper, regression analysis is used to analyze the Clou...

  19. Analysis of Herd Behavior Using Quantile Regression: Evidence from Karachi Stock Exchange (KSE)

    OpenAIRE

    Malik, Saif Ullah; Elahi, Muhammad Ather

    2014-01-01

    The objectives of this paper are to explore herd behavior in the Karachi Stock Exchange (KSE) by using Ordinary Least Squares (OLS) and Quantile Regression analysis for normal as well as bullish (up) and bearish (down) market conditions. Greed stimulates people to make increasingly risky investments, and therefore investors tend to follow one another blindly and ignore rational analysis. Herd behavior can be defined as when investors ignore available information and follow other investors dur...

  20. Simultaneous Optimization of Nanocrystalline SnO2 Thin Film Deposition Using Multiple Linear Regressions

    Directory of Open Access Journals (Sweden)

    Saeideh Ebrahimiasl

    2014-02-01

    Full Text Available A nanocrystalline SnO2 thin film was synthesized by a chemical bath method. The parameters affecting the energy band gap and surface morphology of the deposited SnO2 thin film were optimized using a semi-empirical method. Four parameters, including deposition time, pH, bath temperature and tin chloride (SnCl2·2H2O) concentration were optimized by a factorial method. The factorial used a Taguchi OA (TOA) design method to estimate certain interactions and obtain the actual responses. Statistical evidences in analysis of variance including high F-value (4,112.2 and 20.27), very low P-value (<0.012 and 0.0478), non-significant lack of fit, the determination coefficient (R2 equal to 0.978 and 0.977) and the adequate precision (170.96 and 12.57) validated the suggested model. The optima of the suggested model were verified in the laboratory and results were quite close to the predicted values, indicating that the model successfully simulated the optimum conditions of SnO2 thin film synthesis.

  1. Simultaneous optimization of nanocrystalline SnO2 thin film deposition using multiple linear regressions.

    Science.gov (United States)

    Ebrahimiasl, Saeideh; Zakaria, Azmi

    2014-01-01

    A nanocrystalline SnO2 thin film was synthesized by a chemical bath method. The parameters affecting the energy band gap and surface morphology of the deposited SnO2 thin film were optimized using a semi-empirical method. Four parameters, including deposition time, pH, bath temperature and tin chloride (SnCl2·2H2O) concentration were optimized by a factorial method. The factorial used a Taguchi OA (TOA) design method to estimate certain interactions and obtain the actual responses. Statistical evidences in analysis of variance including high F-value (4,112.2 and 20.27), very low P-value (<0.012 and 0.0478), non-significant lack of fit, the determination coefficient (R2 equal to 0.978 and 0.977) and the adequate precision (170.96 and 12.57) validated the suggested model. The optima of the suggested model were verified in the laboratory and results were quite close to the predicted values, indicating that the model successfully simulated the optimum conditions of SnO2 thin film synthesis. PMID:24509767

  2. A comparison of neural network models, fuzzy logic, and multiple linear regression for prediction of hatchability.

    Science.gov (United States)

    Mehri, M

    2013-04-01

    Application of appropriate models to approximate the performance function warrants more precise prediction and helps to make the best decisions in the poultry industry. This study reevaluated the factors affecting hatchability in laying hens from 29 to 56 wk of age. Twenty-eight data lines representing 4 inputs consisting of egg weight, eggshell thickness, egg sphericity, and yolk/albumin ratio and 1 output, hatchability, were obtained from the literature and used to train an artificial neural network (ANN). The prediction ability of ANN was compared with that of fuzzy logic to evaluate the fitness of these 2 methods. The models were compared using R2, mean absolute deviation (MAD), mean squared error (MSE), mean absolute percentage error (MAPE), and bias. The developed model was used to assess the relative importance of each variable on the hatchability by calculating the variable sensitivity ratio. The statistical evaluations showed that the ANN-based model predicted hatchability more accurately than fuzzy logic. The ANN-based model had a higher coefficient of determination (R2 = 0.99) and lower residual distribution (MAD = 0.005; MSE = 0.00004; MAPE = 0.732; bias = 0.0012) than fuzzy logic (R2 = 0.87; MAD = 0.014; MSE = 0.0004; MAPE = 2.095; bias = 0.0046). The sensitivity analysis revealed that the most important variable in the ANN-based model of hatchability was egg weight (variable sensitivity ratio, VSR = 283.11), followed by yolk/albumin ratio (VSR = 113.16), eggshell thickness (VSR = 16.23), and egg sphericity (VSR = 3.63). The results of this research showed that the universal approximation capability of ANN made it a powerful tool to approximate complex functions such as hatchability in the incubation process. PMID:23472039
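
    The comparison criteria named above (R2, MAD, MSE, MAPE, bias) can be computed as in the following sketch. The formulas are common textbook definitions and are an assumption here; the study may define them differently, for instance the sign convention of the bias. The data values are hypothetical.

```python
def fit_metrics(observed, predicted):
    """Goodness-of-fit summary for paired observed and predicted values.

    Assumed definitions: errors are predicted minus observed, MAPE is in
    percent, and R2 is one minus the residual-to-total sum-of-squares ratio.
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAD": sum(abs(e) for e in errors) / n,
        "MSE": ss_res / n,
        "MAPE": 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n,
        "bias": sum(errors) / n,
    }


# Hypothetical hatchability fractions and model predictions.
observed = [0.80, 0.85, 0.90, 0.95]
predicted = [0.81, 0.84, 0.91, 0.94]
for name, value in fit_metrics(observed, predicted).items():
    print(f"{name}: {value:.4f}")
```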

  3. Field-scale variation in colloid dispersibility and transport : multiple linear regressions to soil physico-chemical and structural properties

    DEFF Research Database (Denmark)

    NØrgaard, Trine; MØldrup, Per

    2014-01-01

    Colloids are potential carriers for strongly sorbing chemicals in macroporous soils, but predicting the amount of colloids readily available for facilitated chemical transport is an unsolved challenge. This study addresses potential key parameters and predictive indicators when assessing colloid dispersibility and transport at the field scale. Samples representing three measurement scales (1-2 mm aggregates, intact 100 cm3 rings, and intact 6283 cm3 columns) were retrieved from the topsoil of a 1.69 ha agricultural field in a 15 m × 15 m grid (65 locations) to determine soil dispersibility as well as 24 comparison parameters including textural, chemical, and structural (e.g. air permeability) soil properties. The soil dispersibility was determined (i) using a laser diffraction method on 1-2 mm aggregates equilibrated to an initial matric potential of -100 cm H2O, (ii) using end-over-end shaking on 6.06 cm (diam.) × 3.48 cm (height) intact soil rings equilibrated to an initial matric potential of -5 cm H2O, and (iii) as the accumulated amount of particles leached from 20 cm × 20 cm intact soil columns after 6.5 hr (60 mm accumulated outflow). At all three scales, soil dispersibility was higher in samples collected from the northern part of the field where the greatest leaching of pesticides was observed in a horizontal well at ~ 3.5 m depth during a 9-year monitoring program. This suggests that the three dispersibility methods used are all relevant for field-scale mapping of areas with enhanced risk of colloid-facilitated transport. Subsequently, using multiple linear regression (MLR) analyses, soil dispersibility was predicted at all three sample scales from the 24 measured, geo-referenced parameters to produce sets of only a few promising indicator parameters for evaluating soil stability and particle mobilization at the field scale. The MLR analyses at each scale were separated into predictions using all, only north, and only south locations in the field. We found that different independent variables were included in the regression models when the sample scale increased from aggregate to column level. Generally, the predictive power of the regression models was better on the 1-2 mm aggregate scale than on the intact 100 cm3 and 20 cm × 20 cm scales. Overall, results suggested that different drivers controlled soil dispersibility at the three scales and the two sub-areas of the field. Predictions of soil dispersibility and the risk of colloid-facilitated chemical transport will therefore need to be highly scale- and area-specific.

  4. Land use regression modeling of intra-urban residential variability in multiple traffic-related air pollutants

    Directory of Open Access Journals (Sweden)

    Baxter Lisa K

    2008-05-01

    Full Text Available Abstract. Background: There is a growing body of literature linking GIS-based measures of traffic density to asthma and other respiratory outcomes. However, no consensus exists on which traffic indicators best capture variability in different pollutants or within different settings. As part of a study on childhood asthma etiology, we examined variability in outdoor concentrations of multiple traffic-related air pollutants within urban communities, using a range of GIS-based predictors and land use regression techniques. Methods: We measured fine particulate matter (PM2.5), nitrogen dioxide (NO2), and elemental carbon (EC) outside 44 homes representing a range of traffic densities and neighborhoods across Boston, Massachusetts and nearby communities. Multiple three- to four-day average samples were collected at each home during winters and summers from 2003 to 2005. Traffic indicators were derived using Massachusetts Highway Department data and direct traffic counts. Multivariate regression analyses were performed separately for each pollutant, using traffic indicators, land use, meteorology, site characteristics, and central site concentrations. Results: PM2.5 was strongly associated with the central site monitor (R2 = 0.68). Additional variability was explained by total roadway length within 100 m of the home, smoking or grilling near the monitor, and block-group population density (R2 = 0.76). EC showed greater spatial variability, especially during winter months, and was predicted by roadway length within 200 m of the home. The influence of traffic was greater under low wind speed conditions, and concentrations were lower during summer (R2 = 0.52). NO2 showed significant spatial variability, predicted by population density and roadway length within 50 m of the home, modified by site characteristics (obstruction), and with higher concentrations during summer (R2 = 0.56). Conclusion: Each pollutant examined displayed somewhat different spatial patterns within urban neighborhoods, and was differently related to local traffic and meteorology. Our results indicate a need for multi-pollutant exposure modeling to disentangle causal agents in epidemiological studies, and further investigation of site-specific and meteorological modification of the traffic-concentration relationship in urban neighborhoods.

  5. Practical guidance for conducting mediation analysis with multiple mediators using inverse odds ratio weighting.

    Science.gov (United States)

    Nguyen, Quynh C; Osypuk, Theresa L; Schmidt, Nicole M; Glymour, M Maria; Tchetgen Tchetgen, Eric J

    2015-03-01

    Despite the recent flourishing of mediation analysis techniques, many modern approaches are difficult to implement or applicable to only a restricted range of regression models. This report provides practical guidance for implementing a new technique utilizing inverse odds ratio weighting (IORW) to estimate natural direct and indirect effects for mediation analyses. IORW takes advantage of the odds ratio's invariance property and condenses information on the odds ratio for the relationship between the exposure (treatment) and multiple mediators, conditional on covariates, by regressing exposure on mediators and covariates. The inverse of the covariate-adjusted exposure-mediator odds ratio association is used to weight the primary analytical regression of the outcome on treatment. The treatment coefficient in such a weighted regression estimates the natural direct effect of treatment on the outcome, and indirect effects are identified by subtracting direct effects from total effects. Weighting renders treatment and mediators independent, thereby deactivating indirect pathways of the mediators. This new mediation technique accommodates multiple discrete or continuous mediators. IORW is easily implemented and is appropriate for any standard regression model, including quantile regression and survival analysis. An empirical example is given using data from the Moving to Opportunity (1994-2002) experiment, testing whether neighborhood context mediated the effects of a housing voucher program on obesity. Relevant Stata code (StataCorp LP, College Station, Texas) is provided. PMID:25693776
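
    The weighting steps described in this abstract can be written compactly as follows. This is a sketch of one common formulation, not the paper's own notation: the symbols A (exposure), M (mediators), C (covariates), w (weight) and the coefficient names are assumptions introduced here, and the Stata code supplied with the paper remains the reference.

```latex
% Step 1: logistic regression of exposure on mediators and covariates
\operatorname{logit} P(A = 1 \mid M, C) = \beta_0 + \beta_M^{\top} M + \beta_C^{\top} C

% Step 2: inverse odds ratio weight (unexposed observations keep weight 1)
w_i =
\begin{cases}
  \exp\!\left(-\hat{\beta}_M^{\top} M_i\right) & \text{if } A_i = 1 \\
  1 & \text{if } A_i = 0
\end{cases}

% Step 3: the treatment coefficient from the w-weighted outcome regression
% estimates the natural direct effect (NDE); the unweighted regression gives
% the total effect (TE); the natural indirect effect (NIE) is their difference
\widehat{\mathrm{NIE}} = \widehat{\mathrm{TE}} - \widehat{\mathrm{NDE}}
```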

  6. Multilayer perceptron for robust nonlinear interval regression analysis using genetic algorithms.

    Science.gov (United States)

    Hu, Yi-Chung

    2014-01-01

    On the basis of fuzzy regression, computational intelligence models such as neural networks can be applied to nonlinear interval regression analysis for dealing with uncertain and imprecise data. When training data are not contaminated by outliers, computational models perform well by including almost all given training data in the data interval. Nevertheless, since training data are often corrupted by outliers, robust learning algorithms that resist outliers in interval regression analysis have been an interesting area of research. Several approaches involving computational intelligence are effective for resisting outliers, but the required parameters for these approaches depend on whether the collected data contain outliers or not. Since it seems difficult to specify the degree of contamination beforehand, this paper uses a multilayer perceptron to construct the robust nonlinear interval regression model using the genetic algorithm. Outliers above or below the data interval then have only a slight effect on the determination of the data interval. Simulation results demonstrate that the proposed method performs well for contaminated datasets. PMID:25110755

  7. A new cluster-histo-regression analysis for incremental learning from temporal data chunks

    Directory of Open Access Journals (Sweden)

    Nagabhushan P.

    2010-03-01

    Full Text Available In scenarios where data chunks arrive temporally, a good algorithm for exploratory analysis should be able to generate the knowledge, and when the next chunk of data arrives, the process should be one of just updating online by accumulating the knowledge derived from the recent chunk. In most cases such an incremental learning process demands a lot of memory, because all earlier data must be carried along while the knowledge is updated successively. In this research work we propose to employ a novel Cluster-Histo-Regression analysis of the chunk to extract the knowledge for the temporal instant and to fuse this knowledge, through Histo-Regression-Distance analysis, with the already accumulated knowledge. We have designed a methodology which (i) discards all those data samples from the chunk which have participated in the knowledge generation process, (ii) demands a minimum amount of memory to carry the accumulated knowledge, and (iii) carries forward only those limited data samples (referred to as hard samples) which could not contribute to the knowledge generated at that moment. Knowledge of each cluster is represented in the form of a histogram for each dimension of the clustered data and is transformed to a regression line for compact representation of the knowledge. The regression line parameters of the clusters obtained by incremental augmentation have shown an accuracy of up to 100% for some of the data sets considered for experimentation.

  8. Analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering

    DEFF Research Database (Denmark)

    Ussery, David; Bohlin, Jon

    2009-01-01

    Recently there has been an explosion in the availability of bacterial genomic sequences, making possible now an analysis of genomic signatures across more than 800 different bacterial chromosomes, from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable with AT content, phyla, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors. Many significant factors were associated with the genomic signature, most notably AT content. Phylum was also an important factor, although considerably less so than AT content. Small improvements to the regression model, although significant, were also obtained by factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.

  9. Analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering

    Directory of Open Access Journals (Sweden)

    Skjerve Eystein

    2009-10-01

    Full Text Available Abstract. Background: Recently there has been an explosion in the availability of bacterial genomic sequences, making possible now an analysis of genomic signatures across more than 800 different bacterial chromosomes, from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable with AT content, phyla, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors. Results: Many significant factors were associated with the genomic signature, most notably AT content. Phylum was also an important factor, although considerably less so than AT content. Small improvements to the regression model, although significant, were also obtained by factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. Conclusion: The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.

  10. Ranking contributing areas of salt and selenium in the Lower Gunnison River Basin, Colorado, using multiple linear regression models

    Science.gov (United States)

    Linard, Joshua I.

    2013-01-01

    Mitigating the effects of salt and selenium on water quality in the Grand Valley and lower Gunnison River Basin in western Colorado is a major concern for land managers. Previous modeling indicated means to improve the models by including more detailed geospatial data and a more rigorous method for developing the models. After evaluating all possible combinations of geospatial variables, four multiple linear regression models resulted that could estimate irrigation-season salt yield, nonirrigation-season salt yield, irrigation-season selenium yield, and nonirrigation-season selenium yield. The adjusted r-squared and the residual standard error (in units of log-transformed yield) of the models were, respectively, 0.87 and 2.03 for the irrigation-season salt model, 0.90 and 1.25 for the nonirrigation-season salt model, 0.85 and 2.94 for the irrigation-season selenium model, and 0.93 and 1.75 for the nonirrigation-season selenium model. The four models were used to estimate yields and loads from contributing areas corresponding to 12-digit hydrologic unit codes in the lower Gunnison River Basin study area. Each of the 175 contributing areas was ranked according to its estimated mean seasonal yield of salt and selenium.
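
    The adjusted r-squared values quoted in this abstract penalize a model for the number of predictors it uses, which matters when many candidate combinations of variables are compared. A minimal sketch of the standard formula follows; the function name and the example numbers are illustrative assumptions and are not taken from the report.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    r2           -- ordinary coefficient of determination of the fitted model
    n_obs        -- number of observations n
    n_predictors -- number of predictors p (intercept not counted)
    """
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than fitted parameters")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Hypothetical example: R^2 = 0.90 from 20 observations and 3 predictors.
print(adjusted_r2(0.90, 20, 3))
```

    Because the penalty grows with the number of predictors, a larger model only scores higher when its extra variables explain enough additional variance.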

  11. 2D Quantitative Structure-Property Relationship Study of Mycotoxins by Multiple Linear Regression and Support Vector Machine

    Directory of Open Access Journals (Sweden)

    Fereshteh Shiri

    2010-08-01

    Full Text Available In the present work, support vector machines (SVMs) and multiple linear regression (MLR) techniques were used for quantitative structure–property relationship (QSPR) studies of retention time (tR) in standardized liquid chromatography–UV–mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins) based on molecular descriptors calculated from the optimized 3D structures. By applying missing value, zero and multicollinearity tests with a cutoff value of 0.95, and a genetic algorithm method of variable selection, the most relevant descriptors were selected to build QSPR models. MLR and SVM methods were employed to build the QSPR models. The robustness of the QSPR models was characterized by statistical validation and the applicability domain (AD). The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability, measured by r2 and q2, are 0.931 and 0.932, respectively, for SVM and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using a Williams plot. The effects of different descriptors on the retention times are described.
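
    For readers unfamiliar with the two statistics, r2 and q2 are conventionally defined as below. The abstract does not state which validation scheme was used, so the leave-one-out form of q2 shown here is an assumption; the symbols are introduced here for illustration.

```latex
% r^2: fit on the training data, with \hat{y}_i the fitted value
r^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}

% q^2: cross-validated analogue, with \hat{y}_{(i)} the prediction for
% compound i from a model fitted without compound i (leave-one-out)
q^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_{(i)})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```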

  12. Tissue counter analysis of tissue components in skin biopsies: evaluation using CART (Classification and Regression Trees).

    Science.gov (United States)

    Smolle, Josef; Gerger, Armin

    2003-06-01

    In tissue counter analysis, complex histologic sections are overlaid with regularly distributed measuring masks of equal size and shape, and the digital contents of each mask (or tissue element) are evaluated by gray level, color, and texture parameters. In this study, the feasibility of tissue counter analysis and classification and regression trees for the quantitative evaluation of skin biopsies was assessed. From 100 randomly selected skin biopsies, a learning set of tissue elements was created, differentiating between cellular elements, collagenous elements of the reticular dermis, fatty elements and other tissue components. Classification and regression trees based on the learning set were used to automatically classify tissue elements in samples of normal skin, benign common nevi, malignant melanoma, molluscum contagiosum, seborrheic keratosis, epidermoid cysts, basal cell carcinoma, and scleroderma. The procedure yielded reproducible assessments of the relative amounts of tissue components in various diagnostic groups. Furthermore, a reliable diagnostic separation of molluscum contagiosum versus normal skin and epidermal cysts, benign common nevi versus malignant melanoma, and seborrheic keratosis versus basal cell carcinoma was possible. Tissue counter analysis combined with classification and regression trees may be a suitable approach to the fully automated analysis of histologic sections of skin biopsies. PMID:12775984

  13. Robust Outlier Detection in Linear Regression

    OpenAIRE

    Jajo, Nethal K.; Xizhi Wu

    2004-01-01

    New methodology of robust outlier detection based on Robustly Studentized Robust Residuals (RSRR) examination is well established in linear regression analysis. Two new robust location estimators of linear regression parameters are developed in simple and multiple cases. Based on these robust estimators we obtain RSRR. We used RSRR to derive a new measure of distance to be used in outlier detection. A graphical display using new measure of distance is constructed for detecting multiple outlie...

  14. MICROARRAY DATA ANALYSIS USING MULTIPLE STATISTICAL MODELS

    Science.gov (United States)

    Microarray Data Analysis Using Multiple Statistical Models Wenjun Bao1, Judith E. Schmid1, Amber K. Goetz1, Ming Ouyang2, William J. Welsh2,Andrew I. Brooks3,4, ChiYi Chu3,Mitsunori Ogihara3,4, Yinhe Cheng5, David J. Dix1. 1National Health and Environmental Effects Researc...

  15. Classificação da composição iônica da água de irrigação usando regressão linear múltipla / Classification of the ionic composition of the irrigation water using multiple linear regression

    Scientific Electronic Library Online (English)

    Maia, Celsemy E.; Morais, Elís R.C. de; Oliveira, Maurício de

    2001-04-01

    Full Text Available SciELO Brazil | Language: Portuguese. Abstract in Portuguese (translated): The aim of this work was to develop a methodology for classifying the ionic composition of irrigation water by multiple linear regression, with electrical conductivity as the dependent variable and the concentrations of cations and anions in the irrigation water as the independent variables, the water being classified according to the weight of each ion in the statistical model. The secondary data source for the research was the database of the Laboratório de Análise de Água e Fertilidade do Solo of the Escola Superior de Agricultura de Mossoró (LAAFS/ESAM). The regressions were fitted using the stepwise selection method, known as the stepwise regression procedure, in which the dependent variable was electrical conductivity and the independent variables were the ions determined by physico-chemical analysis of the water. The results showed that, under this multiple linear regression criterion, the contribution of each variable to the fitted model varied; this contribution was estimated from the increase in the regression sum of squares as each independent variable was added to the model. According to pre-established criteria, water from sources in the Chapada do Apodi region was classified as calcic-sodic, calcic and chloride when it came from tubular wells, poço amazonas wells and rivers, respectively. Water from the Baixo Açu region was classified as sodic, magnesian-sodic and sodic for tubular wells, poço amazonas wells and rivers, respectively. Abstract in English: This work was conducted with the objective of developing a methodology for classifying the ionic composition of irrigation water using multiple linear regression. A stepwise regression analysis model was tested, using electrical conductivity as the dependent variable and the analyzed ions (calcium, sodium, potassium, carbonate, bicarbonate and chloride) as the independent variables in all tested models. All water samples were collected by the farmers of the region where this work was conducted. The regression models were fitted using the water analysis database of ESAM's analysis laboratory (Laboratório de Análises de Água e Fertilidade do Solo da Escola Superior de Agricultura de Mossoró - LAAFS/ESAM). The linear model, fitted using the stepwise regression procedure, shows that the degree of fit of the tested model depends on the geological formation of the watersheds and on whether the water is collected from a river or from tubular wells. Water in the calcareous region of the Chapada do Apodi is classified as calcic-sodic, calcic or chloride if the source is a tubular well, a piezometric well (drilled in unconfined water, known in the region as poço amazonas) or surface water from rivers and lagoons, respectively. In the Baixo Açu region, the waters were classified as sodic, magnesian-sodic or sodic depending on whether the source is a tubular well (drilled in the Açu sedimentary geological formation), a piezometric well or surface water, respectively.

  16. Transferencia de información hidrológica mediante regresión lineal múltiple, con selección óptima de regresores / Transference of hydrologic information through multiple linear regression, with best predictor variables selection

    Scientific Electronic Library Online (English)

    Campos-Aranda, Daniel F.

    2011-12-01

    Full Text Available SciELO Mexico | Language: Spanish. Abstract in English: Long records of annual hydrological data are needed to get a more realistic picture of their variability, as well as reliable estimates of their statistical properties. To obtain such records it is common to look for additional data sources and transfer techniques. One such technique is multiple linear regression, whose numerical application implies the optimal selection of nearby long records (regressors) so that the extension of the short record is a reliable estimate. This selection process involves three analyses: 1) how to define the best estimates, 2) which regression equations to investigate, and 3) which model has the best predictive ability. For the first analysis, four criteria based on the sums of squared residuals are presented; for the second, all possible regressions are investigated, since in hydrological information transfer problems at most five regressors will be available; for the third, selecting the best predictive model, residual analysis and cross-validation are used. The numerical application described is an extension of the annual runoff volume record at the Platón Sánchez hydrometric station of the Tempoal river system, in Hydrological Region No. 26 (Pánuco, México). In this case four regressors are used, namely the records of the remaining gauging stations in that system. It is concluded that, even in problems with multicollinearity, the selection criteria and analyses presented lead to consistent results and make it possible to obtain the best regression equations. The similarity of the results obtained with the selected regression models generates confidence in the estimates adopted.
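
    The selection procedure in this abstract (investigate all possible regressions, which is feasible with at most five regressors, then compare predictive ability by cross-validation) can be sketched as below. This is a minimal illustration, not the author's implementation: the leave-one-out PRESS score is only one possible criterion, and the function names, station labels and data are hypothetical.

```python
from itertools import combinations


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    Xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


def press(rows, y):
    """Leave-one-out prediction error sum of squares for one candidate model."""
    total = 0.0
    for k in range(len(y)):
        beta = fit_ols(rows[:k] + rows[k + 1:], y[:k] + y[k + 1:])
        pred = beta[0] + sum(b * v for b, v in zip(beta[1:], rows[k]))
        total += (y[k] - pred) ** 2
    return total


def best_subset(columns, y):
    """Score every non-empty subset of regressors by PRESS; return (score, subset)."""
    names = list(columns)
    best = None
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            rows = [[columns[name][i] for name in subset] for i in range(len(y))]
            score = press(rows, y)
            if best is None or score < best[0]:
                best = (score, subset)
    return best


# Hypothetical toy data: the short record y depends mainly on station "a".
columns = {
    "a": [1, 2, 3, 4, 5, 6, 7, 8],
    "b": [1, -1, 1, -1, 1, -1, 1, -1],
    "c": [2, 0, 1, 3, 0, 2, 1, 3],
}
y = [5.1, 6.9, 9.05, 10.95, 13.1, 14.9, 16.95, 19.05]
print(best_subset(columns, y))
```

    With k regressors there are 2^k - 1 candidate models (31 for k = 5), so exhaustive search is cheap at this scale. The normal-equation solver is adequate for a toy example, although a QR-based routine from a numerical library would be the more stable choice under strong multicollinearity.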

  17. Classificação da composição iônica da água de irrigação usando regressão linear múltipla / Classification of the ionic composition of the irrigation water using multiple linear regression

    Directory of Open Access Journals (Sweden)

    Celsemy E. Maia

    2001-04-01

    Full Text Available Abstract in Portuguese (translated): The aim of this work was to develop a methodology for classifying the ionic composition of irrigation water by multiple linear regression, with electrical conductivity as the dependent variable and the concentrations of cations and anions in the irrigation water as the independent variables, the water being classified according to the weight of each ion in the statistical model. The secondary data source for the research was the database of the Laboratório de Análise de Água e Fertilidade do Solo of the Escola Superior de Agricultura de Mossoró (LAAFS/ESAM). The regressions were fitted using the stepwise selection method, known as the stepwise regression procedure, in which the dependent variable was electrical conductivity and the independent variables were the ions determined by physico-chemical analysis of the water. The results showed that, under this multiple linear regression criterion, the contribution of each variable to the fitted model varied; this contribution was estimated from the increase in the regression sum of squares as each independent variable was added to the model. According to pre-established criteria, water from sources in the Chapada do Apodi region was classified as calcic-sodic, calcic and chloride when it came from tubular wells, poço amazonas wells and rivers, respectively. Water from the Baixo Açu region was classified as sodic, magnesian-sodic and sodic for tubular wells, poço amazonas wells and rivers, respectively. Abstract in English: This work was conducted with the objective of developing a methodology for classifying the ionic composition of irrigation water using multiple linear regression. A stepwise regression analysis model was tested, using electrical conductivity as the dependent variable and the analyzed ions (calcium, sodium, potassium, carbonate, bicarbonate and chloride) as the independent variables in all tested models. All water samples were collected by the farmers of the region where this work was conducted. The regression models were fitted using the water analysis database of ESAM's analysis laboratory (Laboratório de Análises de Água e Fertilidade do Solo da Escola Superior de Agricultura de Mossoró - LAAFS/ESAM). The linear model, fitted using the stepwise regression procedure, shows that the degree of fit of the tested model depends on the geological formation of the watersheds and on whether the water is collected from a river or from tubular wells. Water in the calcareous region of the Chapada do Apodi is classified as calcic-sodic, calcic or chloride if the source is a tubular well, a piezometric well (drilled in unconfined water, known in the region as poço amazonas) or surface water from rivers and lagoons, respectively. In the Baixo Açu region, the waters were classified as sodic, magnesian-sodic or sodic depending on whether the source is a tubular well (drilled in the Açu sedimentary geological formation), a piezometric well or surface water, respectively.

  18. Analysis of the Evolution of the Gross Domestic Product by Means of Cyclic Regressions

    OpenAIRE

    Catalin Angelo Ioan; Gina Ioan

    2011-01-01

    In this article, we will carry out an analysis on the regularity of the Gross Domestic Product of a country, in our case the United States. The analysis is based on a new method, cyclic regressions based on the Fourier series of a function. Another point of view is that of considering, instead of the growth rate of GDP, the speed of variation of this rate, computed as a numerical derivative. The obtained results show a cycle for this indicator for 71 years, the mean sq...

  19. Analysis of the Evolution of the Gross Domestic Product by Means of Cyclic Regressions

    Directory of Open Access Journals (Sweden)

    Catalin Angelo Ioan

    2011-08-01

    Full Text Available In this article, we will carry out an analysis on the regularity of the Gross Domestic Product of a country, in our case the United States. The analysis is based on a new method, cyclic regressions based on the Fourier series of a function. Another point of view is that of considering, instead of the growth rate of GDP, the speed of variation of this rate, computed as a numerical derivative. The obtained results show a cycle for this indicator for 71 years, the mean square error being 0.93%. The method described allows a prognosis of short-term trends in GDP.
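
    The abstract does not spell out the cyclic regression itself. As a rough illustration of regressing a series on Fourier terms, the sketch below fits a constant plus a single cosine/sine pair of known period by least squares. Everything in it is an assumption made for illustration: the function name, the single-harmonic model, and the requirement that the samples are equally spaced and cover a whole number of periods (which makes the regressors orthogonal, so the least-squares solution has a closed form).

```python
import math


def fit_harmonic(y, period):
    """Least-squares fit of y[t] ~ a0 + a*cos(2*pi*t/period) + b*sin(2*pi*t/period).

    Assumes t = 0, 1, ..., N-1, an integer period >= 3, and N a multiple of
    the period; under these assumptions the three regressors are mutually
    orthogonal and the coefficients reduce to the sums below.
    """
    n = len(y)
    if period < 3 or n % period != 0:
        raise ValueError("need an integer period >= 3 that divides len(y)")
    w = 2.0 * math.pi / period
    a0 = sum(y) / n
    a = 2.0 / n * sum(v * math.cos(w * t) for t, v in enumerate(y))
    b = 2.0 / n * sum(v * math.sin(w * t) for t, v in enumerate(y))
    return a0, a, b


# Synthetic series with a known 12-step cycle (not GDP data).
series = [2.0 + 3.0 * math.cos(2 * math.pi * t / 12) + 1.5 * math.sin(2 * math.pi * t / 12)
          for t in range(48)]
print(fit_harmonic(series, 12))
```

    When the period is unknown or the record does not span whole cycles, the coefficients would instead come from a general least-squares solve, repeated over candidate periods.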

  20. PERFORMANCE OF RIDGE REGRESSION ESTIMATOR METHODS ON SMALL SAMPLE SIZE BY VARYING CORRELATION COEFFICIENTS: A SIMULATION STUDY

    OpenAIRE

    Anwar Fitrianto; Lee Ceng Yik

    2014-01-01

    When independent variables are highly linearly correlated in a multiple linear regression model, the analysis can be misleading. This happens if the multiple linear regression analysis is based on the common Ordinary Least Squares (OLS) method. In this situation, the ridge regression estimator is suggested instead. We conduct a simulation study to compare the performance of the ridge regression estimator and OLS. We found that Hoerl and Kennard ridge regression estimation method has better performan...
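
    To make the contrast with OLS concrete, the sketch below computes the ridge estimator for two centred predictors by solving (X'X + kI) b = X'y in closed form; k = 0 reproduces OLS. The function name and data are hypothetical, and the ridge constant k is simply passed in by the caller: the rule the study uses to choose k is not shown in the truncated abstract and is not reproduced here.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Ridge slope coefficients for two predictors after centring all variables.

    Solves the 2x2 system (X'X + k*I) b = X'y directly; k = 0 gives OLS.
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1) + k   # diagonal entries carry the ridge constant
    s22 = sum(b * b for b in c2) + k
    s12 = sum(a * b for a, b in zip(c1, c2))
    t1 = sum(a * v for a, v in zip(c1, cy))
    t2 = sum(b * v for b, v in zip(c2, cy))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * t1 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2


# Hypothetical data: x2 is almost a copy of x1, and y depends on x1 only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
print(ridge_two_predictors(x1, x2, y, 0.0))  # OLS
print(ridge_two_predictors(x1, x2, y, 1.0))  # ridge with k = 1
```

    Adding k to the diagonal moves X'X away from singularity, which is what stabilizes the estimate when predictors are nearly collinear; the price is shrinkage of the coefficient vector toward zero.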

  1. Comparison of Artificial Neural Networks and Logistic Regression Analysis in the Credit Risk Prediction

    Directory of Open Access Journals (Sweden)

    Hüseyin BUDAK

    2012-11-01

    Full Text Available Credit scoring is a vital topic for banks, since there is a need to use limited financial resources more effectively. There are several credit scoring methods that are used by banks. One of them is to estimate whether a credit-demanding customer's repayment order will be regular or not. In this study, artificial neural networks and logistic regression analysis have been used to support banks' credit risk prediction and to estimate whether a credit-demanding customer's repayment order will be regular or not. The results of the study showed that the artificial neural network method is more reliable than logistic regression analysis in estimating a credit-demanding customer's repayment order.

  2. An application of principal component analysis and logistic regression to facilitate production scheduling decision support system: an automotive industry case

    Science.gov (United States)

    Mehrjoo, Saeed; Bashiri, Mahdi

    2013-05-01

    Production planning and control (PPC) systems have to deal with rising complexity and dynamics. The complexity of planning tasks is due to multiple variables and dynamic factors derived from the uncertainties surrounding the PPC. Although the literature on exact scheduling algorithms, simulation approaches, and heuristic methods in production planning is extensive, these methods seem to be inefficient because of daily fluctuations in real factories. Decision support systems can provide productive tools that help production planners reach feasible and prompt decisions for effective and robust production planning. In this paper, we propose a robust decision support tool for detailed production planning based on multivariate statistical methods, including principal component analysis and logistic regression. The proposed approach has been used in a real case in the Iranian automotive industry. In the presence of existing multisource uncertainties, the results of applying the proposed method in the selected case show that the accuracy of daily production planning increases in comparison with the existing method.

  3. Robust estimation for homoscedastic regression in the secondary analysis of case-control data.

    Science.gov (United States)

    Wei, Jiawei; Carroll, Raymond J; Müller, Ursula U; Van Keilegom, Ingrid; Chatterjee, Nilanjan

    2013-01-01

    Primary analysis of case-control studies focuses on the relationship between disease D and a set of covariates of interest (Y, X). A secondary application of the case-control study, which is often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated owing to the case-control sampling, where the regression of Y on X is different from what it is in the population. Previous work has assumed a parametric distribution for Y given X and derived semiparametric efficient estimation and inference without any distributional assumptions about X. We take up the issue of estimation of a regression function when Y given X follows a homoscedastic regression model, but otherwise the distribution of Y is unspecified. The semiparametric efficient approaches can be used to construct semiparametric efficient estimates, but they suffer from a lack of robustness to the assumed model for Y given X. We take an entirely different approach. We show how to estimate the regression parameters consistently even if the assumed model for Y given X is incorrect, and thus the estimates are model robust. For this we make the assumption that the disease rate is known or well estimated. The assumption can be dropped when the disease is rare, which is typically so for most case-control studies, and the estimation algorithm simplifies. Simulations and empirical examples are used to illustrate the approach. PMID:23637568

  4. Regression analysis of MCS intensity and ground motion spectral accelerations (SAs) in Italy

    Science.gov (United States)

    Faenza, Licia; Michelini, Alberto

    2011-09-01

    We present the results of the regression analyses between Mercalli-Cancani-Sieberg (MCS) intensity and the spectral acceleration (SA) at 0.3, 1.0 and 2.0 s (SA03, SA10 and SA20). In Italy, the MCS scale is used to describe the level of ground shaking suffered by man-made structures or perceived by people, and it differs to some extent from the Modified Mercalli scale in use in other countries. We have assembled a new SA/MCS-intensity data set from the DBMI04 intensity database and the ITACA accelerometric data bank. The SA peak values are calculated in two ways: using the maximum of the two horizontal components, and using the geometric mean of the two horizontal components. The regression analysis has been performed separately for the two kinds of data sets and for the three target periods. Since both peak ground parameters and intensities are subject to appreciable uncertainties, we have used the orthogonal distance regression technique. Also, tests designed to assess the robustness of the estimated coefficients have shown that single-line parametrizations for the regressions are sufficient to model the data within the model uncertainties.
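
    As an illustration of why orthogonal distance regression suits data with errors in both variables, the sketch below fits a straight line by minimising perpendicular distances. It is only the closed-form special case that assumes equal error variances in x and y, not the authors' procedure; the function name and the sample values are hypothetical.

```python
import math


def orthogonal_line_fit(xs, ys):
    """Fit y = a + b*x by minimising perpendicular (orthogonal) distances.

    Assumes equal error variances in x and y (total least squares).
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxy == 0:
        raise ValueError("slope undefined: x and y are uncorrelated")
    # Closed-form slope of the first principal axis of the centred data.
    b = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    a = my - b * mx
    return a, b


# Hypothetical (ground-motion measure, intensity) pairs, for illustration only.
ground_motion = [0.5, 1.0, 1.5, 2.0, 2.5]
intensity = [3.1, 4.0, 5.2, 5.9, 7.1]
a, b = orthogonal_line_fit(ground_motion, intensity)
print(f"intensity = {a:.2f} + {b:.2f} * ground_motion")
```

    Unlike ordinary least squares, the fitted line does not depend on which variable is treated as the predictor, which is the property that matters when neither quantity is error-free.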

  5. A logistic normal multinomial regression model for microbiome compositional data analysis.

    Science.gov (United States)

    Xia, Fan; Chen, Jun; Fung, Wing Kam; Li, Hongzhe

    2013-12-01

    Changes in human microbiome are associated with many human diseases. Next generation sequencing technologies make it possible to quantify the microbial composition without the need for laboratory cultivation. One important problem of microbiome data analysis is to identify the environmental/biological covariates that are associated with different bacterial taxa. Taxa count data in microbiome studies are often over-dispersed and include many zeros. To account for such an over-dispersion, we propose to use an additive logistic normal multinomial regression model to associate the covariates to bacterial composition. The model can naturally account for sampling variabilities and zero observations and also allow for a flexible covariance structure among the bacterial taxa. In order to select the relevant covariates and to estimate the corresponding regression coefficients, we propose a group ℓ1 penalized likelihood estimation method for variable selection and estimation. We develop a Monte Carlo expectation-maximization algorithm to implement the penalized likelihood estimation. Our simulation results show that the proposed method outperforms the group ℓ1 penalized multinomial logistic regression and the Dirichlet multinomial regression models in variable selection. We demonstrate the methods using a data set that links human gut microbiome to micro-nutrients in order to identify the nutrients that are associated with the human gut microbiome enterotype. PMID:24128059
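
    The "additive logistic" part of the model name refers to a standard transform between unconstrained real values and compositions. A minimal sketch of that transform and its inverse follows; the function names are hypothetical, the last taxon is assumed to be the reference category, and none of the penalized estimation described in the abstract is attempted.

```python
import math


def additive_logistic(z):
    """Map K-1 real values to K proportions that sum to 1.

    The last taxon is the reference category (its latent value is fixed at 0).
    """
    m = max(0.0, max(z))  # shift exponents for numerical stability
    expz = [math.exp(v - m) for v in z]
    ref = math.exp(-m)
    denom = ref + sum(expz)
    return [e / denom for e in expz] + [ref / denom]


def additive_log_ratio(p):
    """Inverse map: log-ratio of each proportion to the reference (last) one."""
    return [math.log(v / p[-1]) for v in p[:-1]]


# Hypothetical latent values for a three-taxon composition.
proportions = additive_logistic([1.0, -2.0])
print(proportions, sum(proportions))
```

    In a model of this kind the latent values would be given a multivariate normal distribution whose mean depends on the covariates, and the observed counts would be multinomial given the resulting proportions.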

  6. Robust estimation for homoscedastic regression in the secondary analysis of case–control data

    OpenAIRE

    Wei, Jiawei; Carroll, Raymond; Müller, Ursula; Keilegom, Ingrid; Chatterjee, Nilanjan

    2013-01-01

    Primary analysis of case–control studies focuses on the relationship between disease D and a set of covariates of interest (Y, X). A secondary application of the case–control study, which is often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated owing to the case–control sampling, where the regression of Y on X is different from what it is in the population. Previous work has a...

  7. High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis

    OpenAIRE

    Daye, Z. John; Chen, Jinbo; Li, Hongzhe

    2011-01-01

    We consider the problem of high-dimensional regression under non-constant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows non-constant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a nov...

  8. The effects of exchange rate variability on international trade: a Meta-Regression Analysis

    OpenAIRE

    Ćorić, Bruno; Pugh, Geoffrey Thomas

    2008-01-01

    Abstract The trade effects of exchange rate variability have been an issue in international economics for the past 30 years. The contribution of this paper is to apply meta-regression analysis (MRA) to the empirical literature. On average, exchange rate variability exerts a negative effect on international trade. Yet MRA confirms the view that this result is highly conditional, by identifying factors that help to explain why estimated trade effects vary from significantly negative ...

  9. Personality disorders, violence, and antisocial behavior: a systematic review and meta-regression analysis.

    OpenAIRE

    Yu, R.; Geddes, Jr; Fazel, S.

    2012-01-01

    The risk of antisocial outcomes in individuals with personality disorder (PD) remains uncertain. The authors synthesize the current evidence on the risks of antisocial behavior, violence, and repeat offending in PD, and they explore sources of heterogeneity in risk estimates through a systematic review and meta-regression analysis of observational studies comparing antisocial outcomes in personality disordered individuals with controls groups. Fourteen studies examined risk of antisocial and ...

  10. Mixed-effects Poisson regression analysis of adverse event reports: The relationship between antidepressants and suicide

    OpenAIRE

    Gibbons, Robert D.; Segawa, Eisuke; Karabatsos, George; Amatya, Anup K.; Bhaumik, Dulal K.; Brown, C. Hendricks; Kapur, Kush; Marcus, Sue M.; Hur, Kwan; Mann, J. John

    2008-01-01

    A new statistical methodology is developed for the analysis of spontaneous adverse event (AE) reports from post-marketing drug surveillance data. The method involves both empirical Bayes (EB) and fully Bayes estimation of rate multipliers for each drug within a class of drugs, for a particular AE, based on a mixed-effects Poisson regression model. Both parametric and semiparametric models for the random-effect distribution are examined. The method is applied to data from Food and Drug Adminis...

  11. Stratified Cox Regression Analysis of Survival under CIMAvax®EGF Vaccine

    OpenAIRE

    Carmen Viada Gonzalez; Jean-François Dupuy; Martha Fors López; Patricia Lorenzo Luaces; Camilo Rodríguez Rodríguez; Gisela González Marinello; Elia Neninger Vinagera; Beatriz García Verdecia; Bárbara Wilkinson Brito; Liana Martínez Pérez; Mayelin Troche de la Concepción; Tania Crombet-Ramos

    2013-01-01

    Background: The Center of Molecular Immunology (CIM) is a center in Cuba devoted to the research, development and manufacturing of biotechnological products. CIMAvax®EGF vaccine, based on data collected in a phase II and a phase III clinical trials. Methods: The stratified Cox regression model is used to evaluate the effects of these prognostic factors, based on separate analysis for each trial, and on the combined data from both trials. Results: Patients with Performance status 0 or 1, wit...

  12. LOGISTIC REGRESSION RESPONSE FUNCTIONS WITH MAIN AND INTERACTION EFFECTS IN THE CONJOINT ANALYSIS

    OpenAIRE

    Luca, Amedeo; Ciapparelli, Sara

    2011-01-01

    In the Conjoint Analysis (COA) model proposed here - an extension of the traditional COA - the polytomous response variable (i.e. evaluation of the overall desirability of alternative product profiles) is described by a sequence of binary variables. To link the categories of overall evaluation to the factor levels, we adopt - at the aggregate level - a multivariate logistic regression model, based on a main and two-factor interaction effects experimental design. The model provides several ove...

  13. Predictive model of biliocystic communication in liver hydatid cysts using classification and regression tree analysis

    OpenAIRE

    Souadka Amine; El Mejdoubi Yasser; El Malki Hadj; Mohsine Raouf; Ifrine Lahcen; Abouqal Redouane; Belkouchi Abdelkader

    2010-01-01

    Abstract Background The incidence of liver hydatid cyst (LHC) rupture ranged from 15% to 40% of all cases, and most ruptures involved the bile duct tree. Patients with biliocystic communication (BCC) had specific clinical and therapeutic aspects. The purpose of this study was to determine which patients with LHC may develop BCC using classification and regression tree (CART) analysis. Methods A retrospective study of 672 patients with liver hydatid cyst treated at the surgery department "A" at Ibn Sina Universi...

  14. Prediction of large esophageal varices in cirrhotic patients using classification and regression tree analysis

    Scientific Electronic Library Online (English)

    Wan-dong, Hong; Le-mei, Dong; Zen-cai, Jiang; Qi-huai, Zhu; Shu-Qing, Jin.

    Full Text Available OBJECTIVES: Recent guidelines recommend that all cirrhotic patients should undergo endoscopic screening for esophageal varices. Identifying cirrhotic patients with esophageal varices by noninvasive predictors would allow endoscopy to be restricted to patients with a high risk of having varices. This study aimed to develop a decision model based on classification and regression tree analysis for the prediction of large esophageal varices in cirrhotic patients. METHODS: 309 cirrhotic patients (training sample, 187 patients; test sample, 122 patients) were included. Within the training sample, classification and regression tree analysis was used to identify predictors and a prediction model of large esophageal varices. The prediction model was then further evaluated in the test sample and in different Child-Pugh classes. RESULTS: The prevalence of large esophageal varices in cirrhotic patients was 50.8%. A tree model consisting of spleen width, portal vein diameter and prothrombin time, developed by classification and regression tree analysis, achieved a diagnostic accuracy of 84% for prediction of large esophageal varices. When reconstructed into two groups, the rate of varices was 83.2% for the high-risk group and 15.2% for the low-risk group. Accuracy of the tree model was maintained in the test sample and in different Child-Pugh classes. CONCLUSIONS: A decision tree model that consists of spleen width, portal vein diameter and prothrombin time may be useful for prediction of large esophageal varices in cirrhotic patients.

  15. Canopy Height Estimation in French Guiana with LiDAR ICESat/GLAS Data Using Principal Component Analysis and Random Forest Regressions

    Directory of Open Access Journals (Sweden)

    Ibrahim Fayad

    2014-11-01

    Full Text Available Estimating forest canopy height from large-footprint satellite LiDAR waveforms is challenging given the complex interaction between LiDAR waveforms, terrain, and vegetation, especially in dense tropical and equatorial forests. In this study, canopy height in French Guiana was estimated using multiple linear regression models and the Random Forest (RF) technique. This analysis was based either on LiDAR waveform metrics extracted from the GLAS (Geoscience Laser Altimeter System) spaceborne LiDAR data and terrain information derived from the SRTM (Shuttle Radar Topography Mission) DEM (Digital Elevation Model), or on Principal Component Analysis (PCA) of GLAS waveforms. Results show that the best statistical model for estimating forest height based on waveform metrics and digital elevation data is a linear regression of waveform extent, trailing edge extent, and terrain index (RMSE of 3.7 m). For the PCA-based models, better canopy height estimation results were observed using a regression model that incorporated both the first 13 principal components (PCs) and the waveform extent (RMSE = 3.8 m). Random Forest regressions revealed that the best configuration for canopy height estimation used all the following metrics: waveform extent, leading edge, trailing edge, and terrain index (RMSE = 3.4 m). Waveform extent was the variable that best explained canopy height, with an importance factor almost three times higher than those for the other three metrics (leading edge, trailing edge, and terrain index). Furthermore, the Random Forest regression incorporating the first 13 PCs and the waveform extent had a slightly improved canopy height estimation in comparison to the linear model, with an RMSE of 3.6 m. In conclusion, multiple linear regressions and RF regressions provided canopy height estimations with similar precision using either LiDAR metrics or PCs. However, a regression model (linear regression or RF) based on the PCA of waveform samples with waveform extent information is an interesting alternative for canopy height estimation as it does not require several metrics that are difficult to derive from GLAS waveforms in dense forests, such as those in French Guiana.

  16. Application of Variance-Based and Regression-Based Global Sensitivity Analysis Methods to a Distributed Parameter Hydrologic Model

    Science.gov (United States)

    Dessalegne, T.; Senarath, S. U.; Novoa, R. J.

    2010-12-01

    A sensitivity analysis of a distributed hydrologic model with a large number of parameters is essential for understanding the model structure and simplifying model calibration efforts. It is also useful for guiding future field data collection and sampling efforts. Global sensitivity analysis methods are widely recognized today as superior to local or one-at-a-time methods because they are not limited by model linearity requirements and have a more extensive coverage of the parameter space. In this study, two global sensitivity analysis methods, the variance-based Sobol method and a Latin Hypercube Sampling based Multiple Linear Regression (LHS-MLR) approach, are employed to evaluate the effect of model parameter variability on simulated stages in the Everglades National Park (ENP) in Florida, USA. Both methods provide robust estimates of model parameter sensitivity. However, due to the distinctive characteristics of the two methods, they provide unique insights regarding model parameter sensitivities. These observations are compared in detail in this study. The simulated stage results from the distributed-parameter Regional Simulation Model (RSM), developed by the South Florida Water Management District, are used for this comparison. The parameters considered for sensitivity analysis consist of several model parameters that influence overland and groundwater flows as well as evapotranspiration within the ENP. Their relative sensitivities are assessed under dry, wet and average hydrologic conditions existing in the ENP watershed. The use of a variety of hydrologic conditions allows the robust assessment of parameter sensitivities obtained using the two global sensitivity analysis methods.
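
    The LHS-MLR idea can be sketched in a few lines: draw a Latin hypercube sample of the parameters, run the model on each sample, fit a multiple linear regression, and rank parameters by their standardized regression coefficients. The code below is a minimal sketch under stated assumptions: `toy_model` is a hypothetical linear stand-in for a simulator (it is not the RSM), parameters are sampled on the unit interval, and all function names are illustrative.

```python
import random


def latin_hypercube(n, d, rng):
    """n points in [0, 1)^d with one point per stratum in every dimension."""
    cols = []
    for _ in range(d):
        strata = list(range(n))
        rng.shuffle(strata)
        cols.append([(s + rng.random()) / n for s in strata])
    return [[cols[j][i] for j in range(d)] for i in range(n)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def standardized_regression_coefficients(xs, ys):
    """Fit y on x (normal equations on centred data); scale slopes by sd(x_j)/sd(y)."""
    n, d = len(xs), len(xs[0])
    means = [sum(row[j] for row in xs) / n for j in range(d)]
    my = sum(ys) / n
    xc = [[row[j] - means[j] for j in range(d)] for row in xs]
    yc = [y - my for y in ys]
    xtx = [[sum(r[i] * r[j] for r in xc) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * y for r, y in zip(xc, yc)) for i in range(d)]
    beta = solve(xtx, xty)
    sd_y = (sum(y * y for y in yc) / (n - 1)) ** 0.5
    sd_x = [(xtx[j][j] / (n - 1)) ** 0.5 for j in range(d)]
    return [beta[j] * sd_x[j] / sd_y for j in range(d)]


def toy_model(p):
    # Hypothetical stand-in for a hydrologic simulator; not the RSM.
    return 3.0 * p[0] + 1.0 * p[1] + 0.0 * p[2]


rng = random.Random(0)
samples = latin_hypercube(200, 3, rng)
outputs = [toy_model(p) for p in samples]
src = standardized_regression_coefficients(samples, outputs)
print([round(s, 2) for s in src])
```

    Because the regression is linear, this ranking is only trustworthy when the model response is close to linear in the parameters; variance-based methods such as Sobol indices do not need that assumption, which is one reason the two approaches can give different insights.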

  17. Determinants of reproductive success in dominant pairs of clownfish: a boosted regression tree analysis.

    Science.gov (United States)

    Buston, Peter M; Elith, Jane

    2011-05-01

    1. Central questions of behavioural and evolutionary ecology are what factors influence the reproductive success of dominant breeders and subordinate nonbreeders within animal societies? A complete understanding of any society requires that these questions be answered for all individuals. 2. The clown anemonefish, Amphiprion percula, forms simple societies that live in close association with sea anemones, Heteractis magnifica. Here, we use data from a well-studied population of A. percula to determine the major predictors of reproductive success of dominant pairs in this species. 3. We analyse the effect of multiple predictors on four components of reproductive success, using a relatively new technique from the field of statistical learning: boosted regression trees (BRTs). BRTs have the potential to model complex relationships in ways that give powerful insight. 4. We show that the reproductive success of dominant pairs is unrelated to the presence, number or phenotype of nonbreeders. This is consistent with the observation that nonbreeders do not help or hinder breeders in any way, confirming and extending the results of a previous study. 5. Primarily, reproductive success is negatively related to male growth and positively related to breeding experience. It is likely that these effects are interrelated because males that grow a lot have little breeding experience. These effects are indicative of a trade-off between male growth and parental investment. 6. Secondarily, reproductive success is positively related to female growth and size. In this population, female size is positively related to group size and anemone size, also. These positive correlations among traits likely are caused by variation in site quality and are suggestive of a silver-spoon effect. 7. Noteworthily, whereas reproductive success is positively related to female size, it is unrelated to male size. 
    This observation provides support for the size advantage hypothesis for sex change: both individuals maximize their reproductive success when the larger individual adopts the female tactic. 8. This study provides the most complete picture to date of the factors that predict the reproductive success of dominant pairs of clown anemonefish and illustrates the utility of BRTs for analysis of complex behavioural and evolutionary ecology data. PMID:21284624

  18. A regression analysis of the effect of energy use in agriculture

    International Nuclear Information System (INIS)

    This study investigates the impacts of energy use on productivity of Turkey's agriculture. It reports the results of a regression analysis of the relationship between energy use and agricultural productivity. The study is based on the analysis of the yearbook data for the period 1971-2003. Agricultural productivity was specified as a function of its energy consumption (TOE) and gross additions of fixed assets during the year. Least squares (LS) estimation was employed to estimate equation parameters. The data for this study come from the State Institute of Statistics (SIS) and the Ministry of Energy of Turkey.

  19. A regression analysis of the effect of energy use in agriculture

    Energy Technology Data Exchange (ETDEWEB)

    Karkacier, Osman [Gaziosmanpasa Univ., Dept. of Business Administration, Tokat (Turkey); Goktolga, Z. Gokalp; Cicek, Adnan [Gaziosmanpasa Univ., Dept. of Agricultural Economics, Tokat (Turkey)

    2006-12-15

    This study investigates the impacts of energy use on productivity of Turkey's agriculture. It reports the results of a regression analysis of the relationship between energy use and agricultural productivity. The study is based on the analysis of the yearbook data for the period 1971-2003. Agricultural productivity was specified as a function of its energy consumption (TOE) and gross additions of fixed assets during the year. Least squares (LS) estimation was employed to estimate equation parameters. The data for this study come from the State Institute of Statistics (SIS) and the Ministry of Energy of Turkey. (Author)

  20. Regression analysis with missing data and unknown colored noise: Application to the MICROSCOPE space mission

    Science.gov (United States)

    Baghi, Quentin; Métris, Gilles; Bergé, Joël; Christophe, Bruno; Touboul, Pierre; Rodrigues, Manuel

    2015-03-01

    The analysis of physical measurements often copes with highly correlated noises and interruptions caused by outliers, saturation events, or transmission losses. We assess the impact of missing data on the performance of linear regression analysis involving the fit of modeled or measured time series. We show that data gaps can significantly alter the precision of the regression parameter estimation in the presence of colored noise, due to the frequency leakage of the noise power. We present a regression method that cancels this effect and estimates the parameters of interest with a precision comparable to the complete data case, even if the noise power spectral density (PSD) is not known a priori. The method is based on an autoregressive fit of the noise, which allows us to build an approximate generalized least squares estimator approaching the minimal variance bound. The method, which can be applied to any similar data processing, is tested on simulated measurements of the MICROSCOPE space mission, whose goal is to test the weak equivalence principle (WEP) with a precision of 10^-15. In this particular context the signal of interest is the WEP violation signal expected to be found around a well defined frequency. We test our method with different gap patterns and noise of known PSD and find that the results agree with the mission requirements, decreasing the uncertainty by a factor of 60 with respect to ordinary least squares methods. We show that it also provides a test of significance to assess the uncertainty of the measurement.
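
    The core idea of building an approximate generalized least squares estimator from an autoregressive fit of the noise can be illustrated in a much simpler setting than the one in the abstract. The sketch below assumes complete data (no gaps), a single regressor without intercept, and first-order autoregressive noise; it is a textbook prewhitening scheme, not the authors' method, and the simulated signal and all names are hypothetical.

```python
import math
import random


def ols_slope(x, y):
    """Least-squares slope for the no-intercept model y = beta * x."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def ar1_feasible_gls(x, y):
    """Estimate beta in y_t = beta*x_t + e_t, with e_t = rho*e_{t-1} + w_t."""
    beta_ols = ols_slope(x, y)
    resid = [b - beta_ols * a for a, b in zip(x, y)]
    # Lag-1 autoregressive fit of the OLS residuals.
    num = sum(r1 * r0 for r0, r1 in zip(resid, resid[1:]))
    den = sum(r * r for r in resid[:-1])
    rho = num / den
    # Whiten both sides by quasi-differencing, then apply OLS again.
    xw = [x1 - rho * x0 for x0, x1 in zip(x, x[1:])]
    yw = [y1 - rho * y0 for y0, y1 in zip(y, y[1:])]
    return ols_slope(xw, yw), rho


# Simulated example: sinusoidal regressor plus AR(1) noise (all values assumed).
rng = random.Random(1)
n, true_beta, true_rho = 2000, 2.0, 0.9
x = [math.sin(2 * math.pi * 0.05 * t) for t in range(n)]
noise, e = [], 0.0
for _ in range(n):
    e = true_rho * e + rng.gauss(0.0, 0.1)
    noise.append(e)
y = [true_beta * xi + ei for xi, ei in zip(x, noise)]

beta_hat, rho_hat = ar1_feasible_gls(x, y)
print(beta_hat, rho_hat)
```

    With gaps in the series the quasi-differencing step is no longer valid across missing samples, which is part of what makes the problem treated in the abstract harder than this sketch.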

  1. Multiple regression models of δ13C and δ15N for fish populations in the eastern Gulf of Mexico

    Science.gov (United States)

    Radabaugh, Kara R.; Peebles, Ernst B.

    2014-08-01

    Multiple regression models were created to explain spatial and temporal variation in the δ13C and δ15N values of fish populations on the West Florida Shelf (eastern Gulf of Mexico, USA). Extensive trawl surveys from three time periods were used to acquire muscle samples from seven groundfish species. Isotopic variation (δ13Cvar and δ15Nvar) was calculated as the deviation from the isotopic mean of each fish species. Static spatial data and dynamic water quality parameters were used to create models predicting δ13Cvar and δ15Nvar in three fish species that were caught in the summers of 2009 and 2010. Additional data sets were then used to determine the accuracy of the models for predicting isotopic variation (1) in a different time period (fall 2010) and (2) among four entirely different fish species that were collected during summer 2009. The δ15Nvar model was relatively stable and could be applied to different time periods and species with similar accuracy (mean absolute errors 0.31-0.33‰). The δ13Cvar model had a lower predictive capability and mean absolute errors ranged from 0.42 to 0.48‰. δ15N trends are likely linked to gradients in nitrogen fixation and Mississippi River influence on the West Florida Shelf, while δ13C trends may be linked to changes in algal species, photosynthetic fractionation, and abundance of benthic vs. planktonic basal resources. These models of isotopic variability may be useful for future stable isotope investigations of trophic level, basal resource use, and animal migration on the West Florida Shelf.

  2. Multiple linear regression to develop strength scaled equations for knee and elbow joints based on age, gender and segment mass

    DEFF Research Database (Denmark)

    D'Souza, Sonia; Rasmussen, John

    2012-01-01

    Background: The next fifty years will see a drastic increase in the older population. Among other effects, ageing causes a decrease in strength. It is necessary to provide safe and comfortable environments for the elderly. To achieve this, digital human modelling has proved to be a useful and valuable ergonomic tool. Objective: To investigate age and gender effects on the torque-producing ability in the knee and elbow in older adults. To create strength scaled equations based on age, gender, upper/lower limb lengths and masses using multiple linear regression. To reduce the number of dependent parameters based on statistical redundancies, and then validate these equations. Methods: 283 subjects (141 males, 142 females) aged 50-59 years (54.9 +/- 2.9), 60-69 years (65.4 +/- 2.9) and 70-79 years (73.7 +/- 2.7) were tested for maximal voluntary isometric torque of right knee extensors and elbow flexors. Results: Males were significantly stronger than females across all age groups. Elbow peak torque (EPT) was better preserved from 60s to 70s whereas knee peak torque (KPT) reduced significantly (P<0.05) across all age groups. This held true for males and females. Gender, thigh mass and age best predicted KPT (R2=0.60). Gender, forearm mass and age best predicted EPT (R2=0.75). Good cross-validation was established for both elbow and knee models. Conclusion: This cross-sectional study of muscle strength created and validated strength scaled equations of EPT and KPT using only gender, segment mass and age.

  3. Analysis of electrical resistance tomography (ERT) data using least-squares regression modelling in industrial process tomography

    Science.gov (United States)

    Khanal, Manoj; Morrison, Rob

    2009-04-01

    Analysis of electrical resistance tomography (ERT) data using least-squares regression modelling in industrial process tomographs has been tested. Potential differences measured between electrodes in rings have been used to carry out the regression modelling to investigate the location and size of a disturbance present in the system. Extensive experiments have been carried out with ERT to test a suitable regression algorithm to extract the disturbance. Current analysis has been performed for a single disturbance known to be present in the system. For the environment considered, the least-squares regression reported in this paper demonstrates an alternative approach for analysis of tomography data in industrial applications. The position (concentric or off-centre) and the size of the disturbance (in concentric cases) can be well defined by the reported regression modelling approach. However, it is still a challenge to define the size of the off-centre disturbance.

  4. Analysis of electrical resistance tomography (ERT) data using least-squares regression modelling in industrial process tomography

    International Nuclear Information System (INIS)

    Analysis of electrical resistance tomography (ERT) data using least-squares regression modelling in industrial process tomographs has been tested. Potential differences measured between electrodes in rings have been used to carry out the regression modelling to investigate the location and size of a disturbance present in the system. Extensive experiments have been carried out with ERT to test a suitable regression algorithm to extract the disturbance. Current analysis has been performed for a single disturbance known to be present in the system. For the environment considered, the least-squares regression reported in this paper demonstrates an alternative approach for analysis of tomography data in industrial applications. The position (concentric or off-centre) and the size of the disturbance (in concentric cases) can be well defined by the reported regression modelling approach. However, it is still a challenge to define the size of the off-centre disturbance

  5. Investigation of Water Parameters in a River System with a Two-Dimensional Regression Analysis Model

    Science.gov (United States)

    Murariu, Gabriel; Caldararu, Aurelia; Georgescu, Lucian; Voiculescu, Mirela; Puscasu, Gheorghe; Basset, Alberto

    2011-10-01

    A key step in implementing the European Water Framework Directive (WFD) is ecological status classification and the achievement of good water status for all waters by 2015. In transitional waters, the changing environmental niche induces responses in the macroinvertebrate guilds, and these responses induce uncertainty in the metrics. Here, the uncertainty in ecological classification with benthic macroinvertebrates is addressed by focusing on two major potential sources: spatial heterogeneity and temporal heterogeneity. A coherent study of the correlations between the physical and chemical parameters is needed to reach a complete picture. In this paper we present a two-dimensional regression analysis model of the dependence of a chemical component on two independent environmental variables, temperature and pH. The consistent experimental data set and the regression computation approach lead to a series of interesting outcomes.

  6. Mathematical models for estimating earthquake casualties and damage cost through regression analysis using matrices

    Science.gov (United States)

    Urrutia, J. D.; Bautista, L. A.; Baccay, E. B.

    2014-04-01

    The aim of this study was to develop mathematical models for estimating earthquake casualties such as deaths, number of injured persons, affected families and total cost of damage. Regression models were built to quantify the direct damage from earthquakes to human beings and property given the magnitude, intensity, depth of focus, location of epicentre and time duration. The researchers formulated the models through regression analysis using matrices and used α = 0.01. The study considered thirty destructive earthquakes that hit the Philippines from 1968 to 2012. Relevant data about these earthquakes were obtained from the Philippine Institute of Volcanology and Seismology. Data on damages and casualties were gathered from the records of the National Disaster Risk Reduction and Management Council. The mathematical models made are as follows: This study will be of great value in emergency planning, initiating and updating programs for earthquake hazard reduction in the Philippines, which is an earthquake-prone country.

  7. An Econometric Analysis of Modulated Realised Covariance, Regression and Correlation in Noisy Diffusion Models

    DEFF Research Database (Denmark)

    Kinnebrock, Silja; Podolskij, Mark

    2008-01-01

    This paper introduces a new estimator to measure the ex-post covariation between high-frequency financial time series under market microstructure noise. We provide an asymptotic limit theory (including feasible central limit theorems) for standard methods such as regression, correlation analysis and covariance, for which we obtain the optimal rate of convergence. We demonstrate some positive semidefinite estimators of the covariation and construct a positive semidefinite estimator of the conditional covariance matrix in the central limit theorem. Furthermore, we indicate how the assumptions on the noise process can be relaxed and how our method can be applied to non-synchronous observations. We also present an empirical study of how high-frequency correlations, regressions and covariances change through time.

  8. Sub-pixel estimation of tree cover and bare surface densities using regression tree analysis

    Directory of Open Access Journals (Sweden)

    Carlos Augusto Zangrando Toneli

    2011-09-01

    Sub-pixel analysis is capable of generating continuous fields, which represent the spatial variability of certain thematic classes. The aim of this work was to develop numerical models to represent the variability of tree cover and bare surfaces within the study area. This research was conducted in the riparian buffer within a watershed of the São Francisco River in the north of Minas Gerais, Brazil. IKONOS and Landsat TM imagery were used with the GUIDE algorithm to construct the models. The results were two index images derived with regression trees for the entire study area, one representing tree cover and the other representing bare surface. The non-parametric, non-linear regression tree models gave satisfactory results in characterizing wetland, deciduous and savanna patterns of forest formation.

  9. Hybrid fuzzy regression with trapezoidal fuzzy data

    Science.gov (United States)

    Razzaghnia, T.; Danesh, S.; Maleki, A.

    2012-01-01

    This research deals with a method for hybrid fuzzy least-squares regression. The extension of symmetric triangular fuzzy coefficients to asymmetric trapezoidal fuzzy coefficients is considered as an effective measure for removing unnecessary fuzziness of the linear fuzzy model. First, a trapezoidal fuzzy variable is applied to derive a bivariate regression model. Next, normal equations are formulated to solve for the four parts of the hybrid regression coefficients. The model is also extended to multiple regression analysis. Finally, the method is compared with Y.-H. O. Chang's model.

  10. Analysis of Dynamic Multiplicity Fluctuations at PHOBOS

    CERN Document Server

    Chai, Z; Baker, M D; Ballintijn, M; Barton, D S; Betts, R R; Bickley, A A; Bindel, R; Budzanowski, A; Busza, W; Carroll, A; Chai, Z; Decowski, M P; García, E; George, N; Gulbrandsen, K H; Gushue, S; Halliwell, C; Hamblen, J; Heintzelman, G A; Henderson, C; Hofman, D J; Hollis, R S; Holynski, R; Holzman, B; Iordanova, A; Johnson, E; Kane, J L; Katzy, J; Khan, N; Kucewicz, W; Kulinich, P; Kuo, C M; Lin, W T; Manly, S; McLeod, D; Mignerey, A C; Nouicer, R; Olszewski, A; Pak, R; Park, I C; Pernegger, H; Reed, C; Remsberg, L P; Reuter, M; Rolan, C; Roland, G; Rosenberg, L J; Sagerer, J; Sarin, P; Sawicki, P; Skulski, W; Steinberg, P; Stephans, G S F; Sukhanov, A; Tang, J L; Trzupek, A; Vale, C; van Nieuwenhuizen, G J; Verdier, R; Wolfs, F L H; Wosiek, B; Wozniak, K; Wuosmaa, A H; Wyslouch, B; Chai, Zhengwei

    2005-01-01

    This paper presents the analysis of the dynamic fluctuations in the inclusive charged particle multiplicity measured by PHOBOS for Au+Au collisions at sqrt(s_NN) = 200 GeV within the pseudo-rapidity range of -3 [...] analysis is presented, together with the discussion of their physics meaning. Then the procedure for the extraction of dynamic fluctuations is described. Some preliminary results are included to illustrate the correlation features of the fluctuation observable. New dynamic fluctuations results will be available in a later publication.

  11. Determination of palatal rugae patterns among two ethnic populations of India by logistic regression analysis.

    Science.gov (United States)

    Kotrashetti, Vijayalakshmi S; Hollikatti, Kiran; Mallapur, M D; Hallikeremath, Seema R; Kale, Alka D

    2011-11-01

    Palatal rugae patterns are relatively unique to an individual and are well protected by the lips, buccal pad of fat and teeth. They are considered to be stable throughout life following completion of growth, although there is considerable debate on the matter. They can be used successfully in post mortem identification provided an antemortem record exists. Thus the aim of this study was to examine palatal rugae shape among two Indian populations and determine the accuracy in defining the Indian population using logistic regression analysis. The study comprises two groups from geographically different regions of India, with basic origin from Maharashtra and Karnataka states. The sample includes 100 plaster casts equally distributed between the two populations and genders, with ages ranging between 18 and 40 years. An impression of the maxillary arch was obtained using alginate impression material and a plaster cast was made. The rugae were delineated on the cast using a sharp graphite pencil under adequate light and magnification and recorded according to the classification given by Kapali et al. and Thomas and Kotze (1983). Chi-square analysis showed a significant difference in wavy, circular and divergent patterns between the two populations. The straight and wavy forms were significant in logistic regression analysis. A predictive value of 71% was obtained in determining the original cases correctly when straight, wavy, curved and circular patterns were assessed. A predictive value of 70% was achieved when all rugae patterns were assessed. The mean number of rugae was greater in females than in males, with the straight pattern showing a statistically significant difference between males and females. Significant differences were recorded among straight, wavy, circular and divergent patterns between the two populations. Consequently, this study demonstrates moderate accuracy of palatal rugae pattern using logistic regression analysis in identification of Indians. PMID:22018168

  12. Quantitative structure-property relationship study of n-octanol-water partition coefficients of some of diverse drugs using multiple linear regression

    International Nuclear Information System (INIS)

    A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structures of 150 drug organic compounds to their n-octanol-water partition coefficients (log Po/w). Molecular descriptors were derived solely from the 3D structures of the drug molecules. A genetic algorithm was also applied as a variable selection tool in the QSPR analysis. The models were constructed using 110 molecules as the training set, and their predictive ability was tested using 40 compounds. Modeling of log Po/w of these compounds as a function of the theoretically derived descriptors was established by multiple linear regression (MLR). Four descriptors for these compounds, molecular volume (MV) (geometrical), hydrophilic-lipophilic balance (HLB) (constitutional), hydrogen bond forming ability (HB) (electronic) and polar surface area (PSA) (electrostatic), are taken as inputs for the model. The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows for the estimation of log Po/w for molecules not yet synthesized. Application of the developed model to a testing set of 40 drug organic compounds demonstrates that the model is reliable, with good predictive accuracy and simple formulation. The prediction results are in good agreement with the experimental values. The root mean square error of prediction (RMSEP) and squared correlation coefficient (R2) for the MLR model were 0.22 and 0.99 for the prediction set log Po/w

  13. Use of multiple regression models to evaluate the formation of halonitromethane via chlorination/chloramination of water from Tai Lake and the Qiantang River, China.

    Science.gov (United States)

    Hong, Huachang; Qian, Lingya; Xiong, Yujing; Xiao, Zhuoqun; Lin, Hongjun; Yu, Haiying

    2015-01-01

    The deterioration of water quality, especially organic pollution in Tai Lake and the Qiantang River, has recently received attention in China. The objectives of this study were to evaluate the formation of halonitromethanes (HNMs) using multiple regression models for chlorination and chloramination and to identify the key factors that influence the formation of HNMs in Tai Lake and the Qiantang River. The results showed that the total formation of HNMs (T-HNMs) during chlorination and chloramination could be described using the following models: (1) [Formula: see text] $= 10^{5.267}\,(\mathrm{DON})^{6.645}\,(\mathrm{Br}^{-})^{0.737}\,(\mathrm{DOC})^{-5.537}\,(\mathrm{Cl_2})^{0.333}\,(t)^{0.165}$ ($R^2 = 0.974$, p [...] DOC). The nitrite and bromide concentrations and the reaction time mainly affected the T-HNM yields during chloramination. Additional analysis indicated that the bromine incorporation factors (BIFs) for trihalogenated HNMs generally decreased as the chlorine/chloramine dose, temperature and reaction time decreased and increased as the bromide concentration increased. PMID:25112580
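
    Model (1) above has a multiplicative power-law form, which becomes an ordinary multiple linear regression after taking logarithms of both sides. The sketch below illustrates that idea only: the data are synthetic, and the predictor columns, constant and exponents are hypothetical values chosen for the demonstration, not the study's measurements or coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical positive predictors (stand-ins for concentrations, dose, time).
X = rng.uniform(0.5, 5.0, size=(n, 3))

# Synthetic response following y = 10^k * x1^a * x2^b * x3^c with small noise.
true_log10_k = 1.0
true_exponents = np.array([0.7, -0.4, 0.2])
noise = rng.normal(0.0, 0.01, size=n)
y = 10.0 ** (true_log10_k + np.log10(X) @ true_exponents + noise)

# Taking log10 turns the power law into a linear model:
# log10(y) = k + a*log10(x1) + b*log10(x2) + c*log10(x3)
A = np.column_stack([np.ones(n), np.log10(X)])
coef, *_ = np.linalg.lstsq(A, np.log10(y), rcond=None)
log10_k_hat, exponents_hat = coef[0], coef[1:]

print("estimated log10 constant:", log10_k_hat)
print("estimated exponents:", exponents_hat)
```

    The fitted intercept corresponds to the exponent of 10 in the model, and each slope corresponds to the exponent of one predictor.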

  14. Sensitivity analysis for misclassification in logistic regression via likelihood methods and predictive value weighting.

    Science.gov (United States)

    Lyles, Robert H; Lin, Ji

    2010-09-30

    The potential for bias due to misclassification error in regression analysis is well understood by statisticians and epidemiologists. Assuming little or no available data for estimating misclassification probabilities, investigators sometimes seek to gauge the sensitivity of an estimated effect to variations in the assumed values of those probabilities. We present an intuitive and flexible approach to such a sensitivity analysis, assuming an underlying logistic regression model. For outcome misclassification, we argue that a likelihood-based analysis is the cleanest and the most preferable approach. In the case of covariate misclassification, we combine observed data on the outcome, error-prone binary covariate of interest, and other covariates measured without error, together with investigator-supplied values for sensitivity and specificity parameters, to produce corresponding positive and negative predictive values. These values serve as estimated weights to be used in fitting the model of interest to an appropriately defined expanded data set using standard statistical software. Jackknifing provides a convenient tool for incorporating uncertainty in the estimated weights into valid standard errors to accompany log odds ratio estimates obtained from the sensitivity analysis. Examples illustrate the flexibility of this unified strategy, and simulations suggest that it performs well relative to a maximum likelihood approach carried out via numerical optimization. PMID:20552681

  15. Analysis of designed experiments by stabilised PLS Regression and jack-knifing

    DEFF Research Database (Denmark)

    Martens, Harald; Høy, M.

    2001-01-01

    Pragmatic, visually oriented methods for assessing and optimising bi-linear regression models are described, and applied to PLS Regression (PLSR) analysis of multi-response data from controlled experiments. The paper outlines some ways to stabilise the PLSR method to extend its range of applicability to the analysis of effects in designed experiments. Two ways of passifying unreliable variables are shown. A method for estimating the reliability of the cross-validated prediction error RMSEP is demonstrated. Some recently developed jack-knifing extensions are illustrated, for estimating the reliability of the linear and bi-linear model parameter estimates. The paper illustrates how the obtained PLSR "significance" probabilities are similar to those from conventional factorial ANOVA, but the PLSR is shown to give important additional overview plots of the main relevant structures in the multi-response data. The study is part of an ongoing effort to establish a cognitively simple and versatile approach to multivariate data analysis, with reliability assessment based on the data at hand, and with little need for abstract distribution theory [H. Martens, M. Martens, Multivariate Analysis of Quality. An Introduction, Wiley, Chichester, UK, 2001].

  16. Bayesian analysis of a multivariate null intercept errors-in-variables regression model.

    Science.gov (United States)

    Aoki, Reiko; Bolfarine, Heleno; Achcar, Jorge A; Dorival, Leão P Júnior

    2003-11-01

    Longitudinal data are of great interest in the analysis of clinical trials. In many practical situations the covariate cannot be measured precisely, and a natural alternative is the errors-in-variables regression model. In this paper we study a null intercept errors-in-variables regression model with a structure of dependency between the response variables within the same group. We apply the model to real data presented in Hadgu and Koch (Hadgu, A., Koch, G. (1999). Application of generalized estimating equations to a dental randomized clinical trial. J. Biopharmaceutical Statistics 9(1):161-178). In that study volunteers with preexisting dental plaque were randomized to two experimental mouth rinses (A and B) or a control mouth rinse with double blinding. The dental plaque index was measured for each subject at the beginning of the study and at two follow-up times, which leads to the presence of an interclass correlation. We propose the use of a Bayesian approach to fit a multivariate null intercept errors-in-variables regression model to the longitudinal data. The proposed Bayesian approach accommodates the correlated measurements and incorporates the restriction that the slopes must lie in the (0, 1) interval. A Gibbs sampler is used to perform the computations. PMID:14584721

  17. Bayesian nonparametric regression analysis of data with random effects covariates from longitudinal measurements.

    Science.gov (United States)

    Ryu, Duchwan; Li, Erning; Mallick, Bani K

    2011-06-01

    We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. PMID:20880012

  18. The partial least-squares regression analysis of impact factors of coordinate measuring machine dynamic error

    Science.gov (United States)

    Zhang, Mei; Fei, Yetai; Sheng, Li; Ma, Xiushui; Yang, Hong-tao

    2008-12-01

    The causes of coordinate measuring machine (CMM) dynamic error are complicated, and many factors influence the error, so it is hard to build an accurate model. To obtain a model that avoids analyzing the complex error sources and their interactions and that also handles the multicollinearity among the variables, this paper adopted Partial Least-Squares Regression (PLSR). The model takes the 3D coordinates (X, Y, Z) and the moving velocity as the independent variables and the CMM dynamic error value as the dependent variable. The experimental results show that the model can be easily interpreted. The results also show the magnitude and direction of the influence of each independent variable on the dependent variable.

  19. Pitfalls in predictions of rock properties using multivariate analysis and regression methods

    Science.gov (United States)

    Ma, Y. Zee

    2011-10-01

    Statistical methods are commonly used for prediction of geoscience and engineering properties. This commonly involves selection of a small number of variables among a large number of available geological, geophysical, petrophysical and engineering variables. The conventional view is to select the variables that have highest correlations with the variable of concern. In this article, we show that this may not always be a wise approach because it ignores a critical aspect of the variable interaction — suppression. We review the suppression phenomenon, and discuss three types of suppression in multiple linear regression of geoscience and reservoir properties. We present examples using wireline logs, seismic attributes, and other engineering parameters. We show that understanding the suppression phenomenon is important for selecting appropriate variables for optimal prediction of geoscience and reservoir properties.
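
    The suppression effect described above can be reproduced on synthetic data. The following is a minimal sketch of classical suppression only, under assumed variable names (`signal`, `nuisance`, `x1`, `x2`) that do not come from the article: `x2` is nearly uncorrelated with the target, yet adding it to the regression sharply raises R² because it cancels the nuisance component carried by `x1`.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (intercept added)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(1)
n = 5000
signal = rng.normal(size=n)     # the part of x1 that actually drives y
nuisance = rng.normal(size=n)   # contaminates x1, unrelated to y

y = signal + 0.1 * rng.normal(size=n)
x1 = signal + nuisance          # useful but contaminated predictor
x2 = nuisance                   # suppressor: unrelated to y on its own

corr_x2_y = np.corrcoef(x2, y)[0, 1]
r2_x1 = r_squared(x1, y)
r2_both = r_squared(np.column_stack([x1, x2]), y)

print("corr(x2, y):", corr_x2_y)
print("R^2 with x1 only:", r2_x1)
print("R^2 with x1 and x2:", r2_both)
```

    Selecting variables by their individual correlation with the target would discard `x2` here, which is the pitfall the abstract warns about.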

  20. An investigation of correlation between pilot scanning behavior and workload using stepwise regression analysis

    Science.gov (United States)

    Waller, M. C.

    1976-01-01

    An electro-optical device called an oculometer which tracks a subject's lookpoint as a time function has been used to collect data in a real-time simulation study of instrument landing system (ILS) approaches. The data describing the scanning behavior of a pilot during the instrument approaches have been analyzed by use of a stepwise regression analysis technique. A statistically significant correlation between pilot workload, as indicated by pilot ratings, and scanning behavior has been established. In addition, it was demonstrated that parameters derived from the scanning behavior data can be combined in a mathematical equation to provide a good representation of pilot workload.
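
    The report does not describe its stepwise procedure in this record. As a rough illustration of the general idea, the sketch below uses forward selection, a simplified variant of stepwise regression without the removal step or F-tests. The synthetic data, the `min_gain` stopping threshold and the function names are assumptions made for the example.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of an OLS fit of y on the chosen columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_select(X, y, min_gain=0.05):
    """Add one predictor at a time while the relative drop in RSS is at least min_gain."""
    selected, remaining = [], list(range(X.shape[1]))
    current = rss(X, y, selected)
    while remaining:
        best_rss, best_j = min((rss(X, y, selected + [j]), j) for j in remaining)
        if (current - best_rss) / current < min_gain:
            break
        selected.append(best_j)
        remaining.remove(best_j)
        current = best_rss
    return selected

# Synthetic data: six candidate predictors, only columns 0 and 3 matter.
rng = np.random.default_rng(3)
n = 300
X = rng.normal(size=(n, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.5 * rng.normal(size=n)

chosen = forward_select(X, y)
print("selected predictor columns:", chosen)
```

    The selected columns would then be the terms combined into the final regression equation.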

  1. Correlation-regression analysis of interrelations of VA tungsten properties and initial powder parameters

    International Nuclear Information System (INIS)

    Statistical analysis of the properties of powder, compacts and wire of tungsten VA was made to determine optimum conditions of plastic working of tungsten and its alloys. The data were collected on 29 parameters and processed on a "Minsk-22" computer. Correlations were found between wire structure and such factors as hardness and density of compacts, fractional composition and volume weight of powder, and others. A regression equation was obtained which connected the structure of 0.52 mm wire with a number of parameters of the initial material

  2. Data Management, EDA, and Regression Analysis with 1969-2000 Major League Baseball Attendance

    Science.gov (United States)

    Cochran, James J.

    This article, created by James J. Cochran of Louisiana Tech University, describes a dataset containing Major League Baseball data from seasons 1969 through 2000 and illustrates how this data can be used as a course-long project covering basic data management, the use of exploratory data analysis to "clean" data, and construction of regression models. The set contains data such as runs scored, runs allowed, wins, losses, number of games behind the division leader and attendance. This is a great lesson for anyone interested in the statistics of baseball. The data is in .dat format.

  3. LINEAR REGRESSION MODEL IN THE ANALYSIS OF THE GROSS DOMESTIC PRODUCT

    Directory of Open Access Journals (Sweden)

    Constantin ANGHELACHE

    2011-12-01

    As we ascertain the evolutionary trend of the global economy, it becomes evident that strict analysis of the evolution of a single micro- or macro-economic indicator is no longer enough to describe the corresponding phenomenon. The emphasis shifts towards the analysis of the correlations existing between two or more indicators, which can offer much stronger insight into the economic phenomenon. We propose to use the simple linear regression model, a relatively easy and very effective way to establish the correlation between two economic indicators. Measuring the factor's influence on the indicator will most surely offer additional information on the phenomenon they describe.

  4. An analysis of the differential item function through Mantel-Haenszel, SIBTEST and Logistic Regression Methods

    OpenAIRE

    Süleyman Demir; İbrahim Alper Köse

    2014-01-01

    This study performs a Differential Item Function (DIF) analysis in terms of gender and culture on the items available in the PISA 2009 mathematics literacy sub-test. The DIF analyses were done through the Mantel Haenszel, Logistic Regression and the SIBTEST methods. The data for the gender variable were collected from the responses given by 332 students to the items in the mathematics literacy sub-test during the administration of the 5th booklet in the PISA 2009 application whereas the data ...

  5. LOCA uncertainty analysis using the Fourier Amplitude Sensitivity Test and the Stepwise Regression Technique

    International Nuclear Information System (INIS)

    An uncertainty analysis method is proposed here, which uses Fourier Amplitude Sensitivity Test (FAST) and Stepwise Regression Technique (SRT). This method is a compromise between the approximation method [response surface method (RSM) or moments method] and Monte Carlo method (MCM). It is concluded that: 1. FAST gives the partial variance for each input parameter, which can be used as global sensitivity ranking between input parameters, with moderate sampling point compared to crude MCM. 2. SRT is a good tool to construct the later-used first- or second-order response surface model consisting of comparatively important parameters. 3. The combined uncertainty analysis method using FAST and SRT can be used for uncertainty/sensitivity analysis of the large computer codes with moderate cost and it will be a useful tool to analyze the feasibility of the newly developed, highly uncertain system models

  6. A Bayesian Quantile Regression Analysis of Potential Risk Factors for Violent Crimes in USA

    OpenAIRE

    Ming Wang; Lijun Zhang

    2012-01-01

    Bayesian quantile regression has drawn more attention in widespread applications recently. Yu and Moyeed (2001) proposed an asymmetric Laplace distribution to provide likelihood based mechanism for Bayesian inference of quantile regression models. In this work, the primary objective is to evaluate the performance of Bayesian quantile regression compared with simple regression and quantile regression through simulation and with application to a crime dataset from 50 USA states for assessing th...

  7. Regression analysis of growth responses to water depth in three wetland plant species

    DEFF Research Database (Denmark)

    Sorrell, Brian K; Tanner, Chris C

    2012-01-01

    Background and aims Plant species composition in wetlands and on lakeshores often shows dramatic zonation which is frequently ascribed to differences in flooding tolerance. This study compared the growth responses to water depth of three species (Phormium tenax, Carex secta, Typha orientalis) differing in depth preferences in wetlands, using non-linear and quantile regression analyses to establish how flooding tolerance can explain field zonation. Methodology Plants were established for 8 months in outdoor cultures in waterlogged soil without standing water, and then randomly allocated to water depths from 0 – 0.5 m. Morphological and growth responses to depth were followed for 54 days before harvest, and then analysed by repeated measures analysis of covariance, and non-linear and quantile regression analysis (QRA), to compare flooding tolerances. Principal results Growth responses to depth differed between the three species, and were non-linear. P. tenax growth rapidly decreased in standing water > 0.25 m depth, C. secta growth increased initially with depth but then decreased at depths > 0.30 m, accompanied by increased shoot height and decreased shoot density, and T. orientalis was unaffected by the 0 – 0.50 m depth range. In P. tenax the decrease in growth was associated with a decrease in the number of leaves produced per ramet and in C. secta the effect of water depth was greatest for the tallest shoots. Allocation patterns were unaffected by depth. Conclusions The responses are consistent with the principle that zonation in the field is primarily structured by competition in shallow water and by physiological flooding tolerance in deep water. Regression analyses, especially QRA, proved to be powerful tools in distinguishing genuine phenotypic responses to water depth from non-phenotypic variation due to size and developmental differences.

  8. Análise de regressão múltipla das concentrações de PM10 em função de elementos meteorológicos para Porto Alegre, Estado do Rio Grande do Sul, em 2005 e 2006 (Multiple regression analysis of PM10 concentration concerning to meteorological elements for Porto Alegre, Rio Grande do Sul State, in 2005 and 2006) - doi: 10.4025/actascitechnol.v33i1.9627

    Directory of Open Access Journals (Sweden)

    Rosana de Cassia de Souza Schneider

    2011-03-01

    Air is an efficient means of dispersing atmospheric pollutants, and its behavior depends on the atmospheric movements that occur in the troposphere. In Porto Alegre, Rio Grande do Sul State, there is heavy daily traffic and a concentration of industries that may be responsible for atmospheric emissions. In the present work we studied the behavior of daily concentrations of particulate matter (PM10) in this city, considering the influence of meteorological variables. Data analysis was performed using descriptive statistics, linear correlation and multiple regression. Data were provided by the State Foundation of Environmental Protection Henrique Luiz Roessler - RS (FEPAM) and the National Institute of Meteorology (INMET). The analysis showed that: the concentrations of PM10, measured every day at 4:00 p.m., did not exceed national standards for air quality; the meteorological elements that influenced the concentrations of PM10 were the daily average wind speed and average daily radiation, with negative relations, and the daily average air temperature and the north and northwest wind directions, with positive relations. The wind directions that contribute significantly to lower concentrations at the measured sites are east and southeast.

  9. Survival regression analysis: a powerful tool for evaluating fighting and assessment.

    Science.gov (United States)

    Moya-Laraño; Wise

    2000-09-01

    Theoretical models of animal contests frequently generate predictions about how asymmetries (e.g. differences in size, residence status) between contestants affect fight duration. Linear regression and nonparametric correlation analyses are commonly used to test the fit of data to such models. We show how survival regression analysis (SRA) is a powerful technique for studying the effect of asymmetries on the duration of contests. SRA, which is under-utilized by students of animal behaviour, offers several advantages over more frequently used procedures. It provides unbiased parameter estimates even when including censored data (i.e. results of contests that have not ended at the time when observations are stopped). The analysis of hazard functions, which is a component of SRA, is an easy way to test for consistency with predictions of the sequential assessment game model. These and other advantages of SRA are illustrated by using SRA and more conventional methods to analyse the effect of asymmetries on contest duration for encounters between female Mediterranean tarantulas, Lycosa tarentula (L.). It is hoped that this example of the advantages of SRA will encourage more widespread use of this powerful technique. Copyright 2000 The Association for the Study of Animal Behaviour. PMID:11007639
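
    The point about censored data can be illustrated with the simplest possible survival model. The sketch below is not the authors' analysis: it assumes exponentially distributed contest durations with no covariates, a hypothetical true mean of 10 time units, and observation that stops at 8 time units. It compares the mean of completed contests only with the censoring-aware maximum-likelihood estimate (total observed time divided by the number of contests that ended).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000
true_mean = 10.0   # hypothetical mean contest duration
cutoff = 8.0       # observation stops here (right-censoring)

durations = rng.exponential(true_mean, size=n)
observed = np.minimum(durations, cutoff)   # what the observer records
ended = durations <= cutoff                # True if the contest ended in time

# Naive estimate: average only the contests that were seen to end.
naive_mean = observed[ended].mean()

# Exponential MLE with right-censoring: total time at risk / number of events.
mle_mean = observed.sum() / ended.sum()

print("naive mean of completed contests:", naive_mean)
print("censoring-aware estimate:", mle_mean)
```

    A full survival regression would additionally let the hazard depend on covariates such as size or residence asymmetries, which this sketch leaves out.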

  10. Application of kernel principal component analysis and support vector regression for reconstruction of cardiac transmembrane potentials

    International Nuclear Information System (INIS)

    Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem that can be treated as a regression problem with multiple inputs and multiple outputs, and which can be solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important task for pre-processing the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. Also, the genetic algorithm and simplex optimization method are invoked to determine the hyper-parameters of the SVR. Based on the realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) can perform better than the SVR without feature extraction (single SVR) in terms of the reconstruction of the TMPs on epi- and endocardial surfaces. Moreover, compared with the PCA-SVR, the KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.

  11. Application of kernel principal component analysis and support vector regression for reconstruction of cardiac transmembrane potentials

    Energy Technology Data Exchange (ETDEWEB)

    Jiang Mingfeng; Wang Yaming [College of Electronics and Informatics, Zhejiang Sci-Tech University, Hangzhou 310018 (China); Zhu Lingyan [Dongfang College, Zhejiang University of Finance and Economics, Hangzhou, 310018 (China); Xia Ling; Shou Guofa; Liu Feng [Department of Biomedical Engineering, Zhejiang University, Hangzhou 310027 (China); Crozier, Stuart, E-mail: peterjiang0517@163.com, E-mail: jiang.mingfeng@hotmail.com [School of Information Technology and Electrical Engineering, University of Queensland, St Lucia, Brisbane, Queensland 4072 (Australia)

    2011-03-21

    Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem that can be treated as a regression problem with multiple inputs and multiple outputs, and which can be solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important task for pre-processing the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. Also, the genetic algorithm and simplex optimization method are invoked to determine the hyper-parameters of the SVR. Based on the realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) can perform better than the SVR without feature extraction (single SVR) in terms of the reconstruction of the TMPs on epi- and endocardial surfaces. Moreover, compared with the PCA-SVR, the KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.

  12. Application of kernel principal component analysis and support vector regression for reconstruction of cardiac transmembrane potentials

    Science.gov (United States)

    Jiang, Mingfeng; Zhu, Lingyan; Wang, Yaming; Xia, Ling; Shou, Guofa; Liu, Feng; Crozier, Stuart

    2011-03-01

    Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem that can be treated as a regression problem with multiple inputs and multiple outputs, and which can be solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important task for pre-processing the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. In addition, the genetic algorithm and the simplex optimization method are invoked to determine the hyper-parameters of the SVR. Based on a realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) can perform better than the SVR method without feature extraction (single SVR) in terms of the reconstruction of the TMPs on epi- and endocardial surfaces. Moreover, compared with the PCA-SVR, the KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.

  13. Local linear regression for function learning: an analysis based on sample discrepancy.

    Science.gov (United States)

    Cervellera, Cristiano; Macciò, Danilo

    2014-11-01

    Local linear regression models, a kind of nonparametric structure that locally performs a linear estimation of the target function, are analyzed in the context of empirical risk minimization (ERM) for function learning. The analysis is carried out with emphasis on geometric properties of the available data. In particular, the discrepancy of the observation points used both to build the local regression models and to compute the empirical risk is considered. This makes it possible to treat in the same way the case in which the samples come from a random external source and the case in which the input space can be freely explored. Both the consistency of the ERM procedure and the approximating capabilities of the estimator are analyzed, proving conditions that ensure convergence. Since the theoretical analysis shows that the estimation improves as the discrepancy of the observation points becomes smaller, low-discrepancy sequences, a family of sampling methods commonly employed for efficient numerical integration, are also analyzed. Simulation results involving two different examples of function learning are provided. PMID:25330431
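
    As a rough illustration of the local linear estimator this abstract refers to, the sketch below fits a weighted straight line around a query point and returns its intercept. The Gaussian kernel, the `bandwidth` parameter and the function name are assumptions made for the example; the abstract does not say which weighting the authors use.

```python
import math


def local_linear_estimate(xs, ys, x0, bandwidth):
    """Estimate f(x0) from samples (xs, ys) with a locally weighted line.

    Assumption for this sketch: Gaussian kernel weights with the given
    bandwidth (the source abstract does not specify a kernel).
    """
    # Points close to x0 receive larger weights.
    ws = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]

    # Weighted sums for the 2x2 normal equations in (intercept, slope),
    # using the centred regressor u = x - x0.
    s0 = sum(ws)
    s1 = sum(w * (x - x0) for w, x in zip(ws, xs))
    s2 = sum(w * (x - x0) ** 2 for w, x in zip(ws, xs))
    t0 = sum(w * y for w, y in zip(ws, ys))
    t1 = sum(w * (x - x0) * y for w, x, y in zip(ws, xs, ys))

    det = s0 * s2 - s1 * s1
    if det == 0.0:
        raise ValueError("need at least two distinct x values with nonzero weight")

    # Because the regressor is centred at x0, the fitted intercept
    # is the estimate of f(x0).
    return (s2 * t0 - s1 * t1) / det
```

    When the samples lie exactly on a straight line, the estimate reproduces that line at any query point, whatever the bandwidth; on curved targets a smaller bandwidth follows the curvature more closely at the cost of higher variance.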

  14. Regression Analysis and Analysis Of Variance for EN353 and 20MnCr5 Alloyed Steels for Drilling Cutting Forces

    Directory of Open Access Journals (Sweden)

    Keerthiprasad.K

    2014-08-01

    Full Text Available In recent years, alloy steels have been widely used in the aerospace and automotive industries. Machining of these materials requires a better understanding of cutting processes with regard to accuracy and efficiency. This study addresses the modelling of the machinability of EN353 and 20MnCr5 materials. In this study, multiple regression analysis (MRA) is used to investigate the influence of some parameters on the thrust force and torque in the drilling processes of alloy steel materials. The models were identified by using cutting speed, feed rate, and depth as input data and the thrust force and torque as the output data. The statistical analysis and the accompanying results showed that cutting feed (f) was the most significant parameter in the drilling process, while spindle speed seemed insignificant. Since the spindle speed was insignificant, it can be set either at the highest value to obtain a high material removal rate or at the lowest value to prolong the tool life, depending on the needs of the application. The mathematical model is based on power regression modelling, dependent on the three above-mentioned parameters.
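
    A power regression model of the kind mentioned here can be fitted by ordinary least squares after taking logarithms. The sketch below assumes the form y = C * x1^a1 * x2^a2 * x3^a3 (for instance thrust force as a function of speed, feed and depth); this exact form, the function names and the small linear solver are illustrative assumptions, not the authors' published model.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: inputs are collinear in log space")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_power_law(inputs, outputs):
    """Fit y = C * x1**a1 * x2**a2 * ... by least squares on log(y).

    inputs: sequence of rows of strictly positive predictors.
    outputs: sequence of strictly positive responses.
    Returns (C, [a1, a2, ...]).
    """
    # log(y) = log(C) + a1*log(x1) + a2*log(x2) + ... is linear in the unknowns.
    rows = [[1.0] + [math.log(v) for v in row] for row in inputs]
    logs = [math.log(y) for y in outputs]
    k = len(rows[0])
    # Normal equations (X^T X) beta = X^T log(y).
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, logs)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return math.exp(beta[0]), beta[1:]
```

    An exponent close to zero for one input, as the abstract reports for spindle speed, means the response barely changes with that input over the tested range. Note that least squares in log space minimizes relative rather than absolute error.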

  15. A least trimmed square regression method for second level FMRI effective connectivity analysis.

    Science.gov (United States)

    Li, Xingfeng; Coyle, Damien; Maguire, Liam; McGinnity, Thomas Martin

    2013-01-01

    We present a least trimmed square (LTS) robust regression method to combine different runs/subjects for second/high level effective connectivity analysis. The basic idea of this method is to treat the extreme nonlinear model variability as outliers if they exceed a certain threshold. A bootstrap method for the LTS estimation is employed to detect model outliers. We compared the LTS robust method with a non-robust method using simulated and real datasets. The difference between LTS and the non-robust method for second level effective connectivity analysis is significant, suggesting the conventional non-robust method is easily affected by the model variability from the first level analysis. In addition, after these outliers are detected and excluded for the high level analysis, the model coefficients of the second level are combined within the framework of a mixed model. The variance of the mixed model is estimated using the Newton-Raphson (NR) type Levenberg-Marquardt algorithm. Three sets of real data are adopted to compare conventional methods which do not include random effects in the analysis with a mixed model for second level effective connectivity analysis. The results show that the conventional method is significantly different from the mixed model when greater model variability exists, suggesting there is a strong random effect, and the mixed model should be employed for the second level effective connectivity analysis. PMID:23093379

  16. Statistical learning method in regression analysis of simulated positron spectral data

    International Nuclear Information System (INIS)

    Positron lifetime spectroscopy is a non-destructive tool for detection of radiation induced defects in nuclear reactor materials. This work concerns the applicability of the support vector machines method for the input data compression in the neural network analysis of positron lifetime spectra. It has been demonstrated that the SVM technique can be successfully applied to regression analysis of positron spectra. A substantial data compression of about 50 % and 8 % of the whole training set with two and three spectral components respectively has been achieved including a high accuracy of the spectra approximation. However, some parameters in the SVM approach such as the insensitivity zone ε and the penalty parameter C have to be chosen carefully to obtain a good performance. (author)

  17. Within-session analysis of the extinction of pavlovian fear-conditioning using robust regression

    Directory of Open Access Journals (Sweden)

    Vargas-Irwin, Cristina

    2010-06-01

    Full Text Available Traditionally, the analysis of extinction data in fear conditioning experiments has involved the use of standard linear models, mostly ANOVA of between-group differences of subjects that have undergone different extinction protocols, pharmacological manipulations or some other treatment. Although some studies report individual differences in quantities such as suppression rates or freezing percentages, these differences are not included in the statistical modeling. Within-subject response patterns are then averaged using coarse-grain time windows which can overlook these individual performance dynamics. Here we illustrate an alternative analytical procedure consisting of 2 steps: the estimation of a trend for within-session data and analysis of group differences in trend as main outcome. This procedure is tested on real fear-conditioning extinction data, comparing trend estimates via Ordinary Least Squares (OLS) and robust Least Median of Squares (LMS) regression estimates, as well as comparing between-group differences and analyzing mean freezing percentage versus LMS slopes as outcomes.
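
    The contrast between OLS and LMS trend estimates can be sketched as follows. The LMS fit here is the common pairwise approximation (try the line through every pair of points and keep the one with the smallest median squared residual), which is an assumption about the procedure and not necessarily the algorithm used in the paper; function names are hypothetical.

```python
from itertools import combinations
from statistics import median


def ols_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def lms_line(xs, ys):
    """Approximate Least Median of Squares fit of a straight line.

    Candidate lines pass through pairs of observations; the winner is the
    candidate whose squared residuals have the smallest median.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # vertical line, not a valid candidate
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        criterion = median(
            (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)
        )
        if best is None or criterion < best[0]:
            best = (criterion, slope, intercept)
    if best is None:
        raise ValueError("need at least two observations with distinct x values")
    return best[1], best[2]
```

    With a within-session series that declines steadily except for one aberrant trial, the OLS slope is pulled toward the aberrant value while the LMS slope stays with the majority of trials, which is the reason for using the LMS slope as the outcome measure.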

  18. Boosted Beta regression.

    OpenAIRE

    Schmid, Matthias; Wickler, Florian; Maloney, Kelly O.; Mitchell, Richard; Fenske, Nora; Mayr, Andreas

    2013-01-01

    Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a bet...

  19. A Skew-t space-varying regression model for the spectral analysis of resting state brain activity.

    Science.gov (United States)

    Ismail, Salimah; Sun, Wenqi; Nathoo, Farouk S; Babul, Arif; Moiseev, Alexader; Beg, Mirza Faisal; Virji-Babul, Naznin

    2013-08-01

    It is known that in many neurological disorders such as Down syndrome, main brain rhythms shift their frequencies slightly, and characterizing the spatial distribution of these shifts is of interest. This article reports on the development of a Skew-t mixed model for the spatial analysis of resting state brain activity in healthy controls and individuals with Down syndrome. Time series of oscillatory brain activity are recorded using magnetoencephalography, and spectral summaries are examined at multiple sensor locations across the scalp. We focus on the mean frequency of the power spectral density, and use space-varying regression to examine associations with age, gender and Down syndrome across several scalp regions. Spatial smoothing priors are incorporated based on a multivariate Markov random field, and the markedly non-Gaussian nature of the spectral response variable is accommodated by the use of a Skew-t distribution. A range of models representing different assumptions on the association structure and response distribution are examined, and we conduct model selection using the deviance information criterion. (1) Our analysis suggests region-specific differences between healthy controls and individuals with Down syndrome, particularly in the left and right temporal regions, and produces smoothed maps indicating the scalp topography of the estimated differences. PMID:22614763

  20. Modelling and analysis of turbulent datasets using Auto Regressive Moving Average processes

    International Nuclear Information System (INIS)

    We introduce a novel way to extract information from turbulent datasets by applying an Auto Regressive Moving Average (ARMA) statistical analysis. Such analysis goes well beyond the analysis of the mean flow and of the fluctuations and links the behavior of the recorded time series to a discrete version of a stochastic differential equation which is able to describe the correlation structure in the dataset. We introduce a new index Υ that measures the difference between the resulting analysis and the Obukhov model of turbulence, the simplest stochastic model reproducing both Richardson law and the Kolmogorov spectrum. We test the method on datasets measured in a von Kármán swirling flow experiment. We found that the ARMA analysis is well correlated with spatial structures of the flow, and can discriminate between two different flows with comparable mean velocities, obtained by changing the forcing. Moreover, we show that Υ is highest in regions where shear layer vortices are present, thereby establishing a link between deviations from the Kolmogorov model and coherent structures. These deviations are consistent with the ones observed by computing the Hurst exponents for the same time series. We show that some salient features of the analysis are preserved when considering global instead of local observables. Finally, we analyze flow configurations with multistability features where the ARMA technique is efficient in discriminating different stability branches of the system.

  1. Modelling and analysis of turbulent datasets using Auto Regressive Moving Average processes

    Science.gov (United States)

    Faranda, Davide; Pons, Flavio Maria Emanuele; Dubrulle, Bérengère; Daviaud, François; Saint-Michel, Brice; Herbert, Éric; Cortet, Pierre-Philippe

    2014-10-01

    We introduce a novel way to extract information from turbulent datasets by applying an Auto Regressive Moving Average (ARMA) statistical analysis. Such analysis goes well beyond the analysis of the mean flow and of the fluctuations and links the behavior of the recorded time series to a discrete version of a stochastic differential equation which is able to describe the correlation structure in the dataset. We introduce a new index Υ that measures the difference between the resulting analysis and the Obukhov model of turbulence, the simplest stochastic model reproducing both Richardson law and the Kolmogorov spectrum. We test the method on datasets measured in a von Kármán swirling flow experiment. We found that the ARMA analysis is well correlated with spatial structures of the flow, and can discriminate between two different flows with comparable mean velocities, obtained by changing the forcing. Moreover, we show that Υ is highest in regions where shear layer vortices are present, thereby establishing a link between deviations from the Kolmogorov model and coherent structures. These deviations are consistent with the ones observed by computing the Hurst exponents for the same time series. We show that some salient features of the analysis are preserved when considering global instead of local observables. Finally, we analyze flow configurations with multistability features where the ARMA technique is efficient in discriminating different stability branches of the system.

  2. Modelling and analysis of turbulent datasets using Auto Regressive Moving Average processes

    Energy Technology Data Exchange (ETDEWEB)

    Faranda, Davide, E-mail: davide.faranda@cea.fr; Dubrulle, Bérengère; Daviaud, François [Laboratoire SPHYNX, Service de Physique de l'Etat Condensé, DSM, CEA Saclay, CNRS URA 2464, 91191 Gif-sur-Yvette (France)]; Pons, Flavio Maria Emanuele [Dipartimento di Scienze Statistiche, Universitá di Bologna, Via delle Belle Arti 41, 40126 Bologna (Italy)]; Saint-Michel, Brice [Institut de Recherche sur les Phénomènes Hors Equilibre, Technopole de Chateau Gombert, 49 rue Frédéric Joliot Curie, B.P. 146, 13 384 Marseille (France)]; Herbert, Éric [Université Paris Diderot - LIED - UMR 8236, Laboratoire Interdisciplinaire des Énergies de Demain, Paris (France)]; Cortet, Pierre-Philippe [Laboratoire FAST, CNRS, Université Paris-Sud (France)]

    2014-10-15

    We introduce a novel way to extract information from turbulent datasets by applying an Auto Regressive Moving Average (ARMA) statistical analysis. Such analysis goes well beyond the analysis of the mean flow and of the fluctuations and links the behavior of the recorded time series to a discrete version of a stochastic differential equation which is able to describe the correlation structure in the dataset. We introduce a new index Υ that measures the difference between the resulting analysis and the Obukhov model of turbulence, the simplest stochastic model reproducing both Richardson law and the Kolmogorov spectrum. We test the method on datasets measured in a von Kármán swirling flow experiment. We found that the ARMA analysis is well correlated with spatial structures of the flow, and can discriminate between two different flows with comparable mean velocities, obtained by changing the forcing. Moreover, we show that Υ is highest in regions where shear layer vortices are present, thereby establishing a link between deviations from the Kolmogorov model and coherent structures. These deviations are consistent with the ones observed by computing the Hurst exponents for the same time series. We show that some salient features of the analysis are preserved when considering global instead of local observables. Finally, we analyze flow configurations with multistability features where the ARMA technique is efficient in discriminating different stability branches of the system.

  3. Quantitative laser-induced breakdown spectroscopy data using peak area step-wise regression analysis: an alternative method for interpretation of Mars science laboratory results

    Energy Technology Data Exchange (ETDEWEB)

    Clegg, Samuel M [Los Alamos National Laboratory; Barefield, James E [Los Alamos National Laboratory; Wiens, Roger C [Los Alamos National Laboratory; Dyar, Melinda D [MT HOLYOKE COLLEGE; Schafer, Martha W [LSU; Tucker, Jonathan M [MT HOLYOKE COLLEGE

    2008-01-01

    The ChemCam instrument on the Mars Science Laboratory (MSL) will include a laser-induced breakdown spectrometer (LIBS) to quantify major and minor elemental compositions. The traditional analytical chemistry approach to calibration curves for these data regresses a single diagnostic peak area against concentration for each element. This approach contrasts with a new multivariate method in which elemental concentrations are predicted by step-wise multiple regression analysis based on areas of a specific set of diagnostic peaks for each element. The method is tested on LIBS data from igneous and metamorphosed rocks. Between 4 and 13 partial regression coefficients are needed to describe each elemental abundance accurately (i.e., with a regression line of R² > 0.9995 for the relationship between predicted and measured elemental concentration) for all major and minor elements studied. Validation plots suggest that the method is limited at present by the small data set, and will work best for prediction of concentration when a wide variety of compositions and rock types has been analyzed.

  4. Analysis of egg production in layer chickens using a random regression model with genomic relationships.

    Science.gov (United States)

    Wolc, A; Arango, J; Settar, P; Fulton, J E; O'Sullivan, N P; Preisinger, R; Fernando, R; Garrick, D J; Dekkers, J C M

    2013-06-01

    Random regression models allow for analysis of longitudinal data, which together with the use of genomic information are expected to increase accuracy of selection, when compared with analyzing average or total production with pedigree information. The objective of this study was to estimate variance components for egg production over time in a commercial brown egg layer population using genomic relationship information. A random regression reduced animal model with a marker-based relationship matrix was used to estimate genomic breeding values of 3,908 genotyped animals from 6 generations. The first 5 generations were used for training, and predictions were validated in generation 6. Daily egg production up to 46 wk in lay was accumulated into 85,462 biweekly (every 2 wk) records for training, of which 17,570 were recorded on genotyped hens and the remaining on their nongenotyped progeny. The effect of adding additional egg production data of 2,167 nongenotyped sibs of selection candidates [16,037 biweekly (every 2 wk) records] to the training data was also investigated. The model included a 5th order Legendre polynomial nested within hatch-week as fixed effects and random terms for coefficients of quadratic polynomials for genetic and permanent environmental components. Residual variance was assumed heterogeneous among 2-wk periods. Models using pedigree and genomic relationships were compared. Estimates of residual variance were very similar under both models, but the model with genomic relationships resulted in a larger estimate of genetic variance. Heritability estimates increased with age up to mid production and decreased afterward, resulting in an average heritability of 0.20 and 0.33 for pedigree and genomic models. Prediction of total egg number was more accurate with the genomic than with the pedigree-based random regression model (correlation in validation 0.26 vs. 0.16). The genomic model outperformed the pedigree model in most of the 2-wk periods. Thus, results of this study show that random regression reduced animal models can be used in breeding programs using genomic information and can result in substantial improvements in the accuracy of selection for trajectory traits. PMID:23687143

  5. Prognostics of Lithium-Ion Batteries Based on Battery Performance Analysis and Flexible Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Shuai Wang

    2014-10-01

    Full Text Available Accurate prediction of the remaining useful life (RUL) of lithium-ion batteries is important for battery management systems. Traditional empirical data-driven approaches for RUL prediction usually require multidimensional physical characteristics including the current, voltage, usage duration, battery temperature, and ambient temperature. From a capacity fading analysis of lithium-ion batteries, it is found that the energy efficiency and battery working temperature are closely related to the capacity degradation, which account for all performance metrics of lithium-ion batteries with regard to the RUL and the relationships between some performance metrics. Thus, we devise a non-iterative prediction model based on flexible support vector regression (F-SVR) and an iterative multi-step prediction model based on support vector regression (SVR), using the energy efficiency and battery working temperature as input physical characteristics. The experimental results show that the proposed prognostic models achieve high prediction accuracy while using fewer dimensions for the input data than the traditional empirical models.

  6. Principal components and iterative regression analysis of geophysical series: Application to Sunspot number (1750-2004)

    Science.gov (United States)

    Nordemann, D. J. R.; Rigozo, N. R.; de Souza Echer, M. P.; Echer, E.

    2008-11-01

    We present here an implementation of a least squares iterative regression method applied to the sine functions embedded in the principal components extracted from geophysical time series. This method seems to represent a useful improvement for the non-stationary time series periodicity quantitative analysis. The principal components determination followed by the least squares iterative regression method was implemented in an algorithm written in the Scilab (2006) language. The main result of the method is to obtain the set of sine functions embedded in the series analyzed in decreasing order of significance, from the most important ones, likely to represent the physical processes involved in the generation of the series, to the less important ones that represent noise components. Taking into account the need of a deeper knowledge of the Sun's past history and its implication to global climate change, the method was applied to the Sunspot Number series (1750-2004). With the threshold and parameter values used here, the application of the method leads to a total of 441 explicit sine functions, among which 65 were considered as being significant and were used for a reconstruction that gave a normalized mean squared error of 0.146.

  7. Semiparametric regression analysis for time-to-event marked endpoints in cancer studies.

    Science.gov (United States)

    Hu, Chen; Tsodikov, Alex

    2014-07-01

    In cancer studies the disease natural history process is often observed only at a fixed, random point of diagnosis (a survival time), leading to a current status observation (Sun (2006). The statistical analysis of interval-censored failure time data. Berlin: Springer.) representing a surrogate (a mark) (Jacobsen (2006). Point process theory and applications: marked point and piecewise deterministic processes. Basel: Birkhauser.) attached to the observed survival time. Examples include time to recurrence and stage (local vs. metastatic). We study a simple model that provides insights into the relationship between the observed marked endpoint and the latent disease natural history leading to it. A semiparametric regression model is developed to assess the covariate effects on the observed marked endpoint explained by a latent disease process. The proposed semiparametric regression model can be represented as a transformation model in terms of mark-specific hazards, induced by a process-based mixed effect. Large-sample properties of the proposed estimators are established. The methodology is illustrated by Monte Carlo simulation studies, and an application to a randomized clinical trial of adjuvant therapy for breast cancer. PMID:24379192

  8. A cautionary note on the use of EESC-based regression analysis for ozone trend studies

    Science.gov (United States)

    Kuttippurath, J.; Bodeker, G. E.; Roscoe, H. K.; Nair, P. J.

    2015-01-01

    Equivalent effective stratospheric chlorine (EESC) construct of ozone regression models attributes ozone changes to EESC changes using a single value of the sensitivity of ozone to EESC over the whole period. Using space-based total column ozone (TCO) measurements, and a synthetic TCO time series constructed such that EESC does not fall below its late 1990s maximum, we demonstrate that the EESC-based estimates of ozone changes in the polar regions (70-90°) after 2000 may, falsely, suggest an EESC-driven increase in ozone over this period. An EESC-based regression of our synthetic "failed Montreal Protocol with constant EESC" time series suggests a positive TCO trend that is statistically significantly different from zero over 2001-2012 when, in fact, no recovery has taken place. Our analysis demonstrates that caution needs to be exercised when using explanatory variables, with a single fit coefficient, fitted to the entire data record, to interpret changes in only part of the record.

  9. Regression Analysis on the Relationship between Water Consumption Structure and Industrial Structure in Fujian Province

    Directory of Open Access Journals (Sweden)

    ???

    2012-06-01

    Full Text Available Prediction of water consumption structure on the basis of the relationship between water consumption structure and industrial structure is essential to the exploitation and utilization of water resources. Based on the symmetrical logratio transformation and partial least-squares regression, a linear regression model for water consumption structure and industrial structure in Fujian Province is developed in this study. Analysis of the model showed that the compositional data of water consumption structure and industrial structure in Fujian Province had an obvious linear relationship. The model fit the data very well with high accuracy and can be used to predict water consumption structure. Agricultural water was highly correlated with primary industry, as was industrial water with secondary industry. Agricultural water showed a significantly negative correlation with secondary industry and tertiary industry. The variation of domestic water had an insignificant correlation with industrial structure. The capacity of the industrial structure factors to explain water consumption structure was in the order of primary industry > secondary industry > tertiary industry.
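
    Compositional data such as water-use or industry shares sum to a constant, so they are usually transformed before regression. The sketch below assumes that the "symmetrical logratio transformation" named in the abstract is the centered log-ratio (log of each part divided by the geometric mean of all parts); that reading and the function name are assumptions, and the partial least-squares step itself is not shown.

```python
import math


def centered_logratio(parts):
    """Map a composition with strictly positive parts to log-ratio coordinates.

    Each output value is log(part / geometric_mean(parts)), so the
    transformed values sum to zero and do not depend on the overall scale.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]
```

    Shares expressed as fractions or as percentages give the same transformed values, which removes the unit-sum constraint that would otherwise distort an ordinary linear regression on the raw shares.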

  10. Analysis of correlation and regression between particle ionizing radiation parameters and the stability characteristics of irradiated monocrystalline Si film

    OpenAIRE

    Jakšić Uroš G.; Arsić Nebojša B.; Fetahović Irfan S.; Stanković Koviljka Đ.

    2014-01-01

    This paper deals with the analysis of correlation and regression between the parameters of particle ionizing radiation and the stability characteristics of the irradiated monocrystalline silicon film. Based on the presented theoretical model of correlation and linear regression between two random variables, numeric and real experiments were performed. In the numeric experiment, a simulation of the effect of alpha radiation on a thin layer of monocrystalline...

  11. Brief Review of Regression-Based and Machine Learning Methods in Genetic Epidemiology: The Genetic Analysis Workshop 17 Experience

    OpenAIRE

    Dasgupta, Abhijit; Sun, Yan V.; König, Inke R.; Bailey-Wilson, Joan E.; Malley, James D.

    2011-01-01

    Genetics Analysis Workshop 17 provided common and rare genetic variants from exome sequencing data and simulated binary and quantitative traits in 200 replicates. We provide a brief review of the machine learning and regression-based methods used in the analyses of these data. Several regression and machine learning methods were used to address different problems inherent in the analyses of these data, which are high-dimension, low-sample-size data typical of many genetic association studies....

  12. Quantitative structure-property relationship study of n-octanol-water partition coefficients of some of diverse drugs using multiple linear regression

    Energy Technology Data Exchange (ETDEWEB)

    Ghasemi, Jahanbakhsh [Chemistry Department, Faculty of Sciences, Razi University, Kermanshah (Iran, Islamic Republic of)], E-mail: Jahan.ghasemi@gmail.com; Saaidpour, Saadi [Chemistry Department, Faculty of Sciences, Razi University, Kermanshah (Iran, Islamic Republic of)

    2007-12-05

    A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structures of 150 drug organic compounds to their n-octanol-water partition coefficients (log P_o/w). Molecular descriptors were derived solely from 3D structures of the molecular drugs. A genetic algorithm was also applied as a variable selection tool in QSPR analysis. The models were constructed using 110 molecules as a training set, and predictive ability was tested using 40 compounds. Modeling of log P_o/w of these compounds as a function of the theoretically derived descriptors was established by multiple linear regression (MLR). Four descriptors for these compounds, molecular volume (MV) (geometrical), hydrophilic-lipophilic balance (HLB) (constitutional), hydrogen bond forming ability (HB) (electronic) and polar surface area (PSA) (electrostatic), are taken as inputs for the model. The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows for the estimation of log P_o/w for molecules not yet synthesized. Application of the developed model to a testing set of 40 drug organic compounds demonstrates that the model is reliable with good predictive accuracy and simple formulation. The prediction results are in good agreement with the experimental values. The root mean square error of prediction (RMSEP) and squared correlation coefficient (R²) for the MLR model were 0.22 and 0.99 for the prediction set log P_o/w.
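
    The two validation statistics quoted at the end of this abstract can be computed from observed and predicted values as in the sketch below. The function names are hypothetical, and R² is taken here as the squared Pearson correlation between observed and predicted values, which is one common convention and an assumption about what the authors report.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction over a held-out set."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    if var_o == 0.0 or var_p == 0.0:
        raise ValueError("correlation is undefined for constant inputs")
    return cov * cov / (var_o * var_p)
```

    The two numbers answer different questions: the squared correlation stays at 1 for predictions that are perfectly linear in the observations but systematically shifted or scaled, whereas RMSEP penalizes such bias, so both are worth reporting for a prediction set.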

  13. Collaborative Regression

    OpenAIRE

    Gross, Samuel M.; Tibshirani, Robert

    2014-01-01

    We consider the scenario where one observes an outcome variable and sets of features from multiple assays, all measured on the same set of samples. One approach that has been proposed for dealing with this type of data is "sparse multiple canonical correlation analysis" (sparse mCCA). All of the current sparse mCCA techniques are biconvex and thus have no guarantees about reaching a global optimum. We propose a method for performing sparse supervised canonical correlation ...

  14. Ordinal Logistic Regression for the Estimate of the Response Functions in the Conjoint Analysis

    Directory of Open Access Journals (Sweden)

    Amedeo De Luca

    2011-12-01

    Full Text Available In the Conjoint Analysis (COA) model proposed here, a new approach for estimating more than one response function and an extension of the traditional COA, the polytomous response variable (i.e. evaluation of the overall desirability of alternative product profiles) is described by a sequence of binary variables. To link the categories of overall evaluation to the factor levels, we adopt, at the aggregate level, an ordinal logistic regression based on a main effects experimental design. The model provides several overall desirability functions (aggregated part-worths sets), as many as there are overall ordered categories, unlike the traditional metric and non-metric COA, which gives only one response function. We provide an application of the model and an interpretation of the main effects.

  15. Research of NiMH Battery Modeling and Simulation Based on Linear Regression Analysis Method

    Directory of Open Access Journals (Sweden)

    Yong-sheng Zhang

    2013-11-01

    Full Text Available Battery State-Of-Charge estimation is one of the core issues in the development of electric vehicle battery management systems, and a more accurate model is needed to estimate State-Of-Charge correctly. Therefore, accurate battery modeling and simulation was researched here. A Thevenin equivalent circuit model of the NiMH battery was established to address the poor accuracy of the traditional model. Based on data from hybrid pulse cycling test experiments on a 6V 6Ah NiMH battery, the Thevenin model parameters were identified by means of the linear regression analysis method. Then, the battery equivalent circuit simulation model was built in the MATLAB/Simulink environment. The simulation and experimental results showed that the model has better accuracy and can be used to guide battery State-Of-Charge estimation.

  16. Prediction of bioactivity of ACAT2 inhibitors by multilinear regression analysis and support vector machine.

    Science.gov (United States)

    Zhong, Min; Xuan, Shouyi; Wang, Ling; Hou, Xiaoli; Wang, Maolin; Yan, Aixia; Dai, Bin

    2013-07-01

    Two quantitative structure-activity relationships (QSAR) models for predicting 95 compounds inhibiting Acyl-coenzyme A: cholesterol acyltransferase2 (ACAT2) were developed. The whole data set was randomly split into a training set including 72 compounds and a test set including 23 compounds. The molecules were represented by 11 descriptors calculated by software ADRIANA.Code. Then the inhibitory activity of ACAT2 inhibitors was predicted using multilinear regression (MLR) analysis and support vector machine (SVM) method, respectively. The correlation coefficients of the models for the test sets were 0.90 for MLR model, and 0.91 for SVM model. Y-randomization was employed to ensure the robustness of the SVM model. The atom charge and electronegativity related descriptors were important for the interaction between the inhibitors and ACAT2. PMID:23711921

  17. Productivity, Efficiency, and Managerial Performance Regress and Gains in United States Universities: A Data Envelopment Analysis

    Directory of Open Access Journals (Sweden)

    G. Thomas Sav

    2012-08-01

    Full Text Available This paper uses data envelopment analysis to investigate the extent to which universities in the United States have undergone productivity and efficiency changes, partly due to managerial performance, during the 2005-09 academic years. Using panel data for 133 research and doctoral universities, the focus is on the primary drivers of U.S. publicly controlled higher education. DEA efficiency and returns to scale estimates are provided. In addition, university total factor productivity changes via the Malmquist index are decomposed into component parts. Results suggest that U.S. universities experienced average productivity regress. On an annual basis such was present prior to the global financial crisis. However, productivity gains appeared in concert with the crisis. Managerial efficiency tended to hamper productivity gains but, on the positive side, showed slight improvements over time. Decreasing returns to scale prevailed but from a policy perspective a return to economy wide growth may automatically correct some over production.

  18. Estimation of a Reactor Core Power Peaking Factor Using Support Vector Regression and Uncertainty Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Bae, In Ho; Naa, Man Gyun [Chosun Univ., Gwangju (Korea, Republic of); Lee, Yoon Joon [Cheju National Univ., Jeju-do (Korea, Republic of); Park, Goon Cherl [Seoul National Univ., Seoul (Korea, Republic of)

    2009-05-15

    The monitoring of detailed 3-dimensional (3D) reactor core power distribution is a prerequisite in the operation of nuclear power reactors to ensure that various safety limits imposed on the LPD and DNBR, are not violated during nuclear power reactor operation. The LPD and DNBR should be calculated in order to perform the two major functions of the core protection calculator system (CPCS) and the core operation limit supervisory system (COLSS). The LPD at the hottest part of a hot fuel rod, which is related to the power peaking factor (PPF, F{sub q} ), is more important than the LPD at any other position in a reactor core. The LPD needs to be estimated accurately to prevent nuclear fuel rods from melting. In this study, support vector regression (SVR) and uncertainty analysis have been applied to estimation of reactor core power peaking factor.

  19. A Logistic Regression Analysis of the Contractor's Awareness Regarding Waste Management

    Directory of Open Access Journals (Sweden)

    Rawshan Ara Begum

    2006-01-01

    Full Text Available This study highlights a number of factors affecting contractors' awareness regarding construction waste management in the construction industry. The data in the present study are based on contractors registered with the Construction Industry Development Board of Malaysia. Binary logistic regression analysis is employed to explore the factors affecting this awareness. Contractors' awareness regarding waste management tends to be significantly adequate with increasing values of the following factors: having a waste management plan, awareness of source reduction as a waste minimisation measure, awareness of reusing and recycling waste materials, sorting waste materials, perception of the harmfulness of construction waste to human health, and willingness to pay more for improved waste collection and disposal services. The findings generated from the study could help environmental and waste management planners in their decision making for managing construction waste and reducing environmental pollution.

  20. Partial least square regression method for quantitative elemental analysis in fast neutron induced gamma spectroscopy

    International Nuclear Information System (INIS)

    Fast neutron induced gamma spectrometry is based on inelastic scattering and capture of fast neutrons in the nuclei of various elements and the consequent detection of the emitted characteristic gamma rays. It is a useful technique for online, nondestructive elemental analysis of the composition of various compounds. In this technique fast neutrons, typically of 14 MeV, are made incident on the sample and the inelastic and capture gamma rays are collected. The elements present in the sample can be identified through the peaks at their characteristic energies in the collected spectrum, and the peak heights contain the information about the abundance of the elements in the sample. Analyzing this gamma spectrum gives the quantitative composition of the sample. A two step method consisting of spectrum evaluation and calibration is currently used for quantitative abundance analysis. In field applications such as explosive detection and cancer diagnostics, where real-time composition analysis is required, this method is inconvenient and not practical. In this work a new single step method based on partial least squares regression (PLS) is proposed. The gamma energy spectra of various compounds are collected and used to calibrate the correlation between peak height and element quantity. Based on this analysis the unknown composition of any compound having similar elements can be predicted with comparatively higher accuracy. Monte-Carlo simulations have been carried out to verify the proposed method and used to predict the quantity of various elements present in some unknown compounds. (author)

  1. Spatial-Temporal Variations of Turbidity and Ocean Current Velocity of the Ariake Sea Area, Kyushu, Japan Through Regression Analysis with Remote Sensing Satellite Data

    OpenAIRE

    Yuichi Sarusawa; Kohei Arai

    2013-01-01

    A regression analysis based method for estimating turbidity and ocean current velocity from remote sensing satellite data is proposed. Through regression analysis of MODIS data and measured turbidity and ocean current velocity data, a regression equation that allows estimation of turbidity and ocean current velocity is obtained. With the regression equation and long term MODIS data, turbidity and ocean current velocity trends in the Ariake Sea area are clarified. It is also confirmed tha...

  2. Predicting the Surface Quality of Face Milled Aluminium Alloy Using a Multiple Regression Model and Numerical Optimization

    Science.gov (United States)

    Simunovic, K.; Simunovic, G.; Saric, T.

    2013-10-01

    The surface roughness is a very significant indicator of surface quality. It represents an essential exploitation requirement and influences technological time and costs, i.e. productivity. For that reason, the main objective of this paper is to analyse the influence of face milling cutting parameters (number of revolution, feed rate and depth of cut) on the surface roughness of aluminium alloy. Hence, a statistical (regression) model has been developed to predict the surface roughness by using the methodology of experimental design. Central composite design is chosen for fitting response surface. Also, numerical optimization considering two goals simultaneously (minimum propagation of error and minimum roughness) was performed throughout the experimental region. In this way, the settings of cutting parameters causing the minimum variability in response were determined for the estimated variations of the significant regression factors.

  3. Synchronized multiple regression of diagnostic radiation-induced rather than spontaneous: disseminated primary intracranial germinoma in a woman: a case report

    Directory of Open Access Journals (Sweden)

    Natsumeda Manabu

    2011-01-01

    Full Text Available Abstract Introduction: Examples of the spontaneous regression of primary intracranial germinomas can be found in the literature. We present the case of a patient with disseminated lesions of primary intracranial germinoma which synchronously shrank following diagnostic irradiation. We will discuss whether this regression was spontaneous or radiation-induced. Case presentation: A 43-year-old Japanese woman presented to our hospital complaining of memory problems over a period of one year and blurred vision over a period of three months. Following magnetic resonance imaging, she was found to have a massive lesion in the third ventricle and small lesions in the pineal region, fourth ventricle, and in the anterior horn of the left lateral ventricle. Prior to an open biopsy to confirm the pathology of the lesions, she underwent a single cranial computed tomography scan and a single cranial digital subtraction angiography for a transcranial biopsy. Fourteen days after the first magnetic resonance image (12 and eight days after the computed tomography scan and digital subtraction angiography, respectively), a pre-operative magnetic resonance image was taken, which showed a notable synchronous shrinkage of the third ventricle tumor, as well as shrinkage of the lesions in the pineal region and in the fourth ventricle. She did not undergo steroid administration until after a biopsy that confirmed the pathological diagnosis of pure germinoma. She then underwent whole craniospinal irradiation and went into a complete remission. Conclusions: In our case report, we state that diagnostic radiation can induce the regression of germinomas; this is the most reasonable explanation for the synchronous multiple regression observed in this case of germinoma. Clinicians should keep this non-spontaneous regression in mind and monitor germinoma lesions with minimal exposure to diagnostic radiation before diagnostic confirmation, and also before radiation treatment with or without chemotherapy begins.

  4. The Analysis of Internet Addiction Scale Using Multivariate Adaptive Regression Splines

    Directory of Open Access Journals (Sweden)

    M Kayri

    2010-12-01

    Full Text Available Background: Determining the real effects on internet dependency with an unbiased and robust statistical method is crucial. MARS is a new non-parametric method used in the literature for parameter estimation in cause-and-effect research. MARS can both obtain legible model curves and make unbiased parametric predictions. Methods: In order to examine the performance of MARS, MARS findings are compared to Classification and Regression Tree (C&RT) findings, which are considered in the literature to be efficient in revealing correlations between variables. The data set for the study is taken from "The Internet Addiction Scale" (IAS), which attempts to reveal addiction levels of individuals. The population of the study consists of 754 secondary school students (301 female and 443 male students, with 10 missing data). The MARS 2.0 trial version was used for the MARS analysis, and the C&RT analysis was done in SPSS. Results: MARS obtained six basis functions of the model. As a common result of these six functions, the regression equation of the model was found. For the predicted variable, MARS showed that the predictors average daily Internet-use time, purpose of Internet use, grade of students and occupations of mothers had a significant effect (P< 0.05). In this comparative study, MARS obtained different findings from C&RT in dependency level prediction. Conclusion: This study observed that MARS revealed the extent to which a variable considered significant changes the character of the model.

  5. Multiple-output support vector regression with a firefly algorithm for interval-valued stock price index forecasting

    OpenAIRE

    Xiong, Tao; Bao, Yukun; Hu, Zhongyi

    2014-01-01

    Highly accurate interval forecasting of a stock price index is fundamental to successfully making a profit when making investment decisions, by providing a range of values rather than a point estimate. In this study, we investigate the possibility of forecasting an interval-valued stock price index series over short and long horizons using multi-output support vector regression (MSVR). Furthermore, this study proposes a firefly algorithm (FA)-based approach, built on the est...

  6. Multiple linear regression model for forecasting Bluetongue disease outbreak in sheep of North-west agroclimatic zone of Tamil Nadu, India

    Directory of Open Access Journals (Sweden)

    G. Selvaraju

    2013-12-01

    Full Text Available Aim: A study was undertaken to develop a forecasting model for predicting bluetongue outbreaks in the North-west agroclimatic zone of Tamil Nadu, India. Materials and Methods: Eleven bluetongue outbreaks were characterised by active and passive surveillance over a period of twelve years and used in this study. Meteorological data comprising maximum and minimum temperatures, relative humidity, rainfall and wind speed were collected and used as the predictor variables in the multiple linear regression model. Results: A multiple linear regression model was developed for the North-west zone of Tamil Nadu. Values of the dependent variable less than or greater than one indicated remote or greater chances of bluetongue outbreaks, respectively. The monthly mean maximum and minimum temperatures, relative humidity at 8.30 h and at 17.00 h IST, wind speed, and monthly total rainfall of 29.1 - 31.0°C, 20.1 - 22.0°C, 80.1 - 85.0%, 65.1 - 70.0%, 3.1 - 5.0 km/h and < 200 mm respectively, were identified as the ideal climatic conditions for increased numbers of bluetongue outbreaks in this zone. Conclusion: Based on the values obtained from the prediction model, stakeholders can be warned in a timely manner through the media to institute suitable prophylactic measures against bluetongue, to avoid economic losses due to the disease. [Vet World 2013; 6(6): 321-324]

  7. Applying support vector regression analysis on grip force level-related corticomuscular coherence

    DEFF Research Database (Denmark)

    Rong, Yao; Han, Xixuan

    2014-01-01

    Voluntary motor performance is the result of cortical commands driving muscle actions. Corticomuscular coherence can be used to examine the functional coupling or communication between human brain and muscles. To investigate the effects of grip force level on corticomuscular coherence in an accessory muscle, this study proposed an expanded support vector regression (ESVR) algorithm to quantify the coherence between electroencephalogram (EEG) from sensorimotor cortex and surface electromyogram (EMG) from brachioradialis in upper limb. A measure called coherence proportion was introduced to compare the corticomuscular coherence in the alpha (7–15Hz), beta (15–30Hz) and gamma (30–45Hz) band at 25 % maximum grip force (MGF) and 75 % MGF. Results show that ESVR could reduce the influence of deflected signals and summarize the overall behavior of multiple coherence curves. Coherence proportion is more sensitive to grip force level than coherence area. The significantly higher corticomuscular coherence occurred in the alpha (p<0.01) and beta band (p<0.01) during 75 % MGF, but in the gamma band (p<0.01) during 25 % MGF. The results suggest that sensorimotor cortex might control the activity of an accessory muscle for hand grip with increased grip intensity by changing functional corticomuscular coupling at certain frequency bands (alpha, beta and gamma bands).

  8. An analysis of the differential item function through Mantel-Haenszel, SIBTEST and Logistic Regression Methods

    Directory of Open Access Journals (Sweden)

    Süleyman Demir

    2014-04-01

    Full Text Available This study performs a Differential Item Function (DIF) analysis in terms of gender and culture on the items in the PISA 2009 mathematics literacy sub-test. The DIF analyses were done through the Mantel-Haenszel, Logistic Regression and SIBTEST methods. The data for the gender variable were collected from the responses given by 332 students to the items in the mathematics literacy sub-test during the administration of the 5th booklet in the PISA 2009 application, whereas the data for the culture variable were collected through the application of the 5th booklet in Turkey, Germany, Finland and the United States in the PISA 2009 application. In the DIF analysis by gender, 4 items functioned in favor of male students and only one item in favor of female students. In the DIF analysis by culture, DIF was detected in 16 items for Turkish and German students, 14 items for Turkish and Finnish students, and 18 items for Turkish and United States students.
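
    As a rough illustration of the first of the three DIF methods named in this abstract, the Mantel-Haenszel common odds ratio for one item can be computed from a 2x2 table per ability stratum. The sketch below is not taken from the study: the counts are invented, matching examinees on total-score strata is the usual convention but is an assumption here, and the "reference" and "focal" labels are illustrative names for the two compared groups.

```python
import math

def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across strata.

    Each stratum is a tuple (ref_right, ref_wrong, focal_right, focal_wrong)
    of examinee counts for one item within one matched ability level.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        total = ref_right + ref_wrong + focal_right + focal_wrong
        if total == 0:
            continue  # an empty stratum carries no information
        numerator += ref_right * focal_wrong / total
        denominator += ref_wrong * focal_right / total
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator

def mh_delta(alpha):
    """Odds ratio on the delta scale (-2.35 * ln alpha); 0 means no DIF."""
    return -2.35 * math.log(alpha)

# Hypothetical counts for one item in two score strata.
strata = [(20, 10, 10, 20), (30, 10, 15, 20)]
alpha = mantel_haenszel_odds_ratio(strata)
delta = mh_delta(alpha)
```

    An odds ratio above 1 (a negative delta) indicates that, among examinees of matched ability, the item is easier for the reference group.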

  9. Thermodynamic dissociation constants of silychristin, silybin, silydianin and mycophenolate by the regression analysis of spectrophotometric data

    Energy Technology Data Exchange (ETDEWEB)

    Meloun, Milan; Burkonova, Dominika; Syrovy, Tomas; Vrana, Ales

    2003-06-11

    Mixed dissociation constants of four drug acids, i.e. silychristin, silybinin, silydianin and mycophenolate, at various ionic strengths I in the range 0.01 to 0.30 and at temperatures of 25 and 37 deg. C were determined using the SQUAD(84) regression analysis program applied to pH-spectrophotometric titration data. The proposed strategy of an efficient experimentation in a protonation constants determination, followed by a computational strategy for the chemical model with a protonation constants determination, is presented on the protonation equilibria of silychristin. The thermodynamic dissociation constant pK_{a}^{T} was estimated by non-linear regression of {pK_{a}, I} data at 25 and 37 deg. C: for silychristin pK_{a,1}^{T}=6.52(16) and 6.62(1), pK_{a,2}^{T}=7.22(13) and 7.41(5), pK_{a,3}^{T}=8.96(9) and 8.94(9), pK_{a,4}^{T}=10.17(7) and 10.03(8), pK_{a,5}^{T}=11.89(4) and 11.63(7); for silybin pK_{a,1}^{T}=7.00(4) and 6.86(5), pK_{a,2}^{T}=8.77(11) and 8.77(3), pK_{a,3}^{T}=9.57(8) and 9.62(1), pK_{a,4}^{T}=11.66(3) and 11.38(1); for silydianin pK_{a,1}^{T}=6.64(7) and 7.10(6), pK_{a,2}^{T}=7.78(5) and 8.93(1), pK_{a,3}^{T}=9.66(9) and 10.06(11), pK_{a,4}^{T}=10.71(7) and 10.77(7), pK_{a,5}^{T}=12.26(5) and 12.14(5); for mycophenolate pK_{a}^{T}=8.32(1) and 8.14(1). Goodness-of-fit tests for various regression diagnostics enabled the reliability of parameter estimates to be found.

  10. Thermodynamic dissociation constants of silychristin, silybin, silydianin and mycophenolate by the regression analysis of spectrophotometric data

    International Nuclear Information System (INIS)

    Mixed dissociation constants of four drug acids, i.e. silychristin, silybinin, silydianin and mycophenolate, at various ionic strengths I in the range 0.01 to 0.30 and at temperatures of 25 and 37 deg. C were determined using the SQUAD(84) regression analysis program applied to pH-spectrophotometric titration data. The proposed strategy of an efficient experimentation in a protonation constants determination, followed by a computational strategy for the chemical model with a protonation constants determination, is presented on the protonation equilibria of silychristin. The thermodynamic dissociation constant pK_{a}^{T} was estimated by non-linear regression of {pK_{a}, I} data at 25 and 37 deg. C: for silychristin pK_{a,1}^{T}=6.52(16) and 6.62(1), pK_{a,2}^{T}=7.22(13) and 7.41(5), pK_{a,3}^{T}=8.96(9) and 8.94(9), pK_{a,4}^{T}=10.17(7) and 10.03(8), pK_{a,5}^{T}=11.89(4) and 11.63(7); for silybin pK_{a,1}^{T}=7.00(4) and 6.86(5), pK_{a,2}^{T}=8.77(11) and 8.77(3), pK_{a,3}^{T}=9.57(8) and 9.62(1), pK_{a,4}^{T}=11.66(3) and 11.38(1); for silydianin pK_{a,1}^{T}=6.64(7) and 7.10(6), pK_{a,2}^{T}=7.78(5) and 8.93(1), pK_{a,3}^{T}=9.66(9) and 10.06(11), pK_{a,4}^{T}=10.71(7) and 10.77(7), pK_{a,5}^{T}=12.26(5) and 12.14(5); for mycophenolate pK_{a}^{T}=8.32(1) and 8.14(1). Goodness-of-fit tests for various regression diagnostics enabled the reliability of parameter estimates to be found

  11. Causal mediation analysis with multiple mediators.

    Science.gov (United States)

    Daniel, R M; De Stavola, B L; Cousens, S N; Vansteelandt, S

    2015-03-01

    In diverse fields of empirical research, including many in the biological sciences, attempts are made to decompose the effect of an exposure on an outcome into its effects via a number of different pathways. For example, we may wish to separate the effect of heavy alcohol consumption on systolic blood pressure (SBP) into effects via body mass index (BMI), via gamma-glutamyl transpeptidase (GGT), and via other pathways. Much progress has been made, mainly due to contributions from the field of causal inference, in understanding the precise nature of statistical estimands that capture such intuitive effects, the assumptions under which they can be identified, and statistical methods for doing so. These contributions have focused almost entirely on settings with a single mediator, or a set of mediators considered en bloc; in many applications, however, researchers attempt a much more ambitious decomposition into numerous path-specific effects through many mediators. In this article, we give counterfactual definitions of such path-specific estimands in settings with multiple mediators, when earlier mediators may affect later ones, showing that there are many ways in which decomposition can be done. We discuss the strong assumptions under which the effects are identified, suggesting a sensitivity analysis approach when a particular subset of the assumptions cannot be justified. These ideas are illustrated using data on alcohol consumption, SBP, BMI, and GGT from the Izhevsk Family Study. We aim to bridge the gap from "single mediator theory" to "multiple mediator practice," highlighting the ambitious nature of this endeavor and giving practical suggestions on how to proceed. PMID:25351114

  12. The Jackknife Interval Estimation of Parameters in Partial Least Squares Regression Model for Poverty Data Analysis

    OpenAIRE

    Pudji Ismartini; Sony Sunaryo; Setiawan Setiawan

    2010-01-01

    One of the major problems facing data modelling in the social sciences is multicollinearity. Multicollinearity can have a significant impact on the quality and stability of the fitted regression model. The common classical regression technique using the Least Squares estimate is highly sensitive to the multicollinearity problem. In such a problem area, Partial Least Squares Regression (PLSR) is a useful and flexible tool for statistical model building; however, PLSR can only yield point estimations. This pa...

  13. Comparison of Some Estimation Methods in Linear Regression

    OpenAIRE

    İlkay Altındağ; Tekşen, Ümran M.; Aşır Genç

    2010-01-01

    In this study, we give information about some methods that serve as alternatives to the classical least squares methods used for simple linear and multiple linear regression analysis. In short, the linear regression model is written in matrix form as Y=Xβ+ε, where Y is the vector of the dependent variable, X is the design matrix of independent variables, β is the parameter vector and ε is the vector of error terms, so the least squares estimator of the linear regression is given by β̂=(X^{?...
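
    For the matrix model in this abstract (a response vector equal to the design matrix times a parameter vector plus an error vector), the classical least squares estimate solves the normal equations (X^T X) b = X^T Y whenever X^T X is invertible. A minimal sketch with a small invented data set follows; the numbers are illustrative only and lie exactly on a line so the result is easy to check by hand.

```python
import numpy as np

def ols_estimator(X, Y):
    """Solve the normal equations (X^T X) b = X^T Y for the coefficient vector b."""
    return np.linalg.solve(X.T @ X, X.T @ Y)

# Design matrix with an intercept column and one predictor (illustrative values).
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
Y = np.array([1.0, 3.0, 5.0, 7.0])  # lies exactly on y = 1 + 2x

beta_hat = ols_estimator(X, Y)
residuals = Y - X @ beta_hat
```

    Solving the linear system directly avoids forming an explicit matrix inverse, which is generally the more stable choice numerically.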

  14. Evidence suggesting multiple promoting roles of luteal group IVA phospholipase A(2) in prostaglandin F(2alpha)-induced regression in pseudopregnant rats.

    Science.gov (United States)

    Kurusu, Shiro; Sonoda, Norifumi; Nakahara, Masato; Yonezawa, Tomohiro; Kawaminami, Mitsumori

    2010-09-01

    We evaluated the effects of local administration of selective inhibitors of group IVA phospholipase A(2) (GIVA PLA(2)) and cyclooxygenase (COX) on exogenous prostaglandin (PG) F(2alpha)-induced luteal regression in pseudopregnant rats. Intra-bursal treatment with a GIVA PLA(2) inhibitor, AACOCF(3), just prior to PGF(2alpha) (30 µg, subcutaneously) on day 6 of pseudopregnancy (PSP6) prevented a decline in circulating progesterone and inhibited TUNEL-positive reactions of steroidogenic cells. The same treatment on PSP9 failed to inhibit functional regression, but significantly reduced apoptosis of steroidogenic cells and vascular endothelial cells, and suppressed the infiltration of macrophages. A COX-2-selective inhibitor, NS398, inhibited the decline of progesterone and apoptosis of steroidogenic cells on PSP6 but not on PSP9. A COX-1 inhibitor, SC560, exerted insignificant anti-luteolytic effects. Overall, the data suggest that luteal GIVA PLA(2) plays multiple promoting roles in PGF(2alpha)-induced luteal regression, at least partly by a COX-2 activity-related mechanism, in pseudopregnant rats. PMID:20601072

  15. The Jackknife Interval Estimation of Parameters in Partial Least Squares Regression Model for Poverty Data Analysis

    Directory of Open Access Journals (Sweden)

    Pudji Ismartini

    2010-08-01

    Full Text Available One of the major problems facing data modelling in the social sciences is multicollinearity. Multicollinearity can have a significant impact on the quality and stability of the fitted regression model. The common classical regression technique using the Least Squares estimate is highly sensitive to the multicollinearity problem. In such a problem area, Partial Least Squares Regression (PLSR) is a useful and flexible tool for statistical model building; however, PLSR can only yield point estimations. This paper constructs interval estimations for PLSR regression parameters by applying the Jackknife technique to poverty data. A SAS macro programme is developed to obtain the Jackknife interval estimator for PLSR.
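
    The delete-one jackknife idea behind such interval estimates can be sketched in a few lines. This is not the paper's SAS macro and it does not implement PLSR: an ordinary least squares fit is swapped in as a stand-in estimator so the example stays self-contained, the data are synthetic, and the normal quantile 1.96 for an approximate 95% interval is an assumption (a t quantile is another common choice).

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients; a stand-in estimator, not PLSR."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def jackknife_interval(X, y, estimator=ols, z=1.96):
    """Delete-one jackknife interval for each coefficient.

    Returns (lower, estimate, upper), where estimate uses all rows and the
    standard error comes from the spread of the leave-one-out estimates.
    """
    n = len(y)
    full = estimator(X, y)
    loo = np.array([
        estimator(np.delete(X, i, axis=0), np.delete(y, i))
        for i in range(n)
    ])
    se = np.sqrt((n - 1) / n * np.sum((loo - loo.mean(axis=0)) ** 2, axis=0))
    return full - z * se, full, full + z * se

# Synthetic data: intercept plus two predictors (hypothetical values).
rng = np.random.default_rng(1)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_coef = np.array([1.0, 2.0, -1.0])
y = X @ true_coef + rng.normal(scale=0.5, size=n)

lower, estimate, upper = jackknife_interval(X, y)
```

    Because the estimator is passed in as a function, a PLSR fit that returns a coefficient vector could be substituted without changing the resampling logic.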

  16. Alternative Methods of Regression

    CERN Document Server

    Birkes, David

    2011-01-01

    Of related interest: Nonlinear Regression Analysis and its Applications, by Douglas M. Bates and Donald G. Watts. "...an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models...highly recommend[ed]...for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics. This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s

  17. Alternative parameterizations of the multiple-trait random regression model for milk yield and somatic cell score via recursive links between phenotypes.

    Science.gov (United States)

    Jamrozik, J; Schaeffer, L R

    2011-08-01

    Multiple-trait random regression models with recursive phenotypic link from somatic cell score (SCS) to milk yield on the same test day and with different restrictions on co-variances between these traits were fitted to the first-lactation Canadian Holstein data. Bayesian methods with Gibbs sampling were used to derive inferences about parameters for all models. Bayes factor indicated that the recursive model with uncorrelated environmental effects between traits was the most plausible specification in describing the data. Goodness of fit in terms of a within-trait weighted mean square error and correlation between observed and predicted data was the same for all parameterizations. All recursive models estimated similar negative causal effects from SCS to milk yield (up to -0.4 in 46-115 days in milk in lactation). Estimates of heritabilities, genetic and environmental correlations for the first two regression coefficients (overall level of a trait and lactation persistency) within both traits were similar among models. Genetic correlations between milk and SCS were dependent on the restrictions on genetic co-variances for these traits. Recursive model with uncorrelated system genetic effects between milk and SCS gave estimates of genetic correlations of the opposite sign compared with a regular multiple-trait model. Phenotypic recursion between milk and SCS seemed, however, to be the only source of environmental correlations between these two traits. Rankings of sires for total milk yield in lactation, average daily SCS and persistency for both traits were similar among models. Multiple-trait model with recursive links between milk and SCS and uncorrelated random environmental effects could be an attractive alternative for a regular multiple-trait model in terms of model parsimony and accuracy. PMID:21749472

  18. High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis.

    Science.gov (United States)

    Daye, Z John; Chen, Jinbo; Li, Hongzhe

    2012-03-01

    We consider the problem of high-dimensional regression under non-constant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows non-constant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a novel doubly regularized approach. Extensive Monte Carlo simulations indicate that our proposed procedure can result in better estimation and variable selection than existing methods when heteroscedasticity arises from the presence of predictors explaining error variances and outliers. Further, we demonstrate the presence of heteroscedasticity in and apply our method to an expression quantitative trait loci (eQTLs) study of 112 yeast segregants. The new procedure can automatically account for heteroscedasticity in identifying the eQTLs that are associated with gene expression variations and lead to smaller prediction errors. These results demonstrate the importance of considering heteroscedasticity in eQTL data analysis. PMID:22547833

  19. An Assessment of Reaction Time Development in Females Using Polynomial Regression Analysis

    Directory of Open Access Journals (Sweden)

    Jaworski Janusz

    2014-08-01

    Full Text Available Purpose. The objective of this study was to determine the level and rate of change of reaction time during the developmental period from early childhood to early adulthood. Polynomial regression analysis was applied to determine the age at which the best reaction time results are achieved. Methods. The study involved 550 females between the ages of 7 and 20 years. Participants completed a computer test measuring simple reaction times to visual and auditory stimuli and choice reaction time during the ontogenetic developmental period. Results. Analysis of the results by age group distinguished two sub-periods of reaction time dynamics: a progressive increase throughout the developmental period followed by a plateau phase. This was evident for all reaction time measures (simple and choice), particularly in the case of the data collected empirically. Conclusions. The best reaction times to visual and auditory stimuli were achieved at approximately 17 years of age. In turn, the quickest choice reaction time was achieved approximately one year earlier. The most dynamic increase in the results of both simple reaction times was between the ages of 7 and 8 years, whereas for choice reaction time this was between 10 and 11 years of age.
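
    The use of polynomial regression to locate the age of best performance can be illustrated with a minimal sketch. It assumes a second-degree polynomial (the abstract does not state the degree used), and the ages and reaction times below are hypothetical values generated from an exact parabola, not data from the study. The helper names `solve_linear` and `age_of_best_reaction_time` are illustrative only.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def age_of_best_reaction_time(ages, times):
    """Fit time = c0 + c1*u + c2*u**2 with u = age - mean(age); return the vertex age."""
    mean_age = sum(ages) / len(ages)
    u = [a - mean_age for a in ages]  # centering keeps the normal equations well conditioned
    power_sums = [sum(v ** k for v in u) for k in range(5)]
    normal = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(t * v ** i for v, t in zip(u, times)) for i in range(3)]
    c0, c1, c2 = solve_linear(normal, rhs)
    if c2 <= 0:
        raise ValueError("fitted curve has no minimum")
    return mean_age - c1 / (2 * c2)  # vertex of the parabola, shifted back to age units


# Hypothetical data (assumption): times follow an exact parabola with its minimum at age 17.
ages = list(range(7, 21))
times = [0.25 + 0.002 * (age - 17) ** 2 for age in ages]
best_age = age_of_best_reaction_time(ages, times)
```

    With real, noisy measurements a higher-degree polynomial may be needed, in which case the minimum would have to be located numerically rather than from the vertex formula.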

  20. Multivariate Regression Analysis of Winter Ozone Events in the Uinta Basin of Eastern Utah, USA

    Science.gov (United States)

    Mansfield, M. L.

    2012-12-01

    I report on a regression analysis of a number of variables that are involved in the formation of winter ozone in the Uinta Basin of Eastern Utah. One goal of the analysis is to develop a mathematical model capable of predicting the daily maximum ozone concentration from values of a number of independent variables. The dependent variable is the daily maximum ozone concentration at a particular site in the basin. Independent variables are (1) daily lapse rate, (2) daily "basin temperature" (defined below), (3) snow cover, (4) midday solar zenith angle, (5) monthly oil production, (6) monthly gas production, and (7) the number of days since the beginning of a multi-day inversion event. Daily maximum temperature and daily snow cover data are available at ten or fifteen different sites throughout the basin. The daily lapse rate is defined operationally as the slope of the linear least-squares fit to the temperature-altitude plot, and the "basin temperature" is defined as the value assumed by the same least-squares line at an altitude of 1400 m. A multi-day inversion event is defined as a set of consecutive days for which the lapse rate remains positive. The standard deviation in the accuracy of the model is about 10 ppb. The model has been combined with historical climate and oil & gas production data to estimate historical ozone levels.
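
    The operational definitions in this abstract (lapse rate as the least-squares slope of temperature against altitude, "basin temperature" as the fitted value at 1400 m, and an inversion event as consecutive days with positive lapse rate) can be sketched as follows. The station altitudes and temperatures are hypothetical, and counting the first inversion day as day 1 is an assumption, since the abstract does not say where the count starts.

```python
def fit_line(altitudes_m, temps_c):
    """Ordinary least-squares line temp = intercept + slope * altitude."""
    n = len(altitudes_m)
    mean_z = sum(altitudes_m) / n
    mean_t = sum(temps_c) / n
    sxx = sum((z - mean_z) ** 2 for z in altitudes_m)
    sxy = sum((z - mean_z) * (t - mean_t) for z, t in zip(altitudes_m, temps_c))
    slope = sxy / sxx
    return slope, mean_t - slope * mean_z


def lapse_rate_and_basin_temperature(altitudes_m, temps_c, reference_altitude_m=1400.0):
    """Return (lapse rate, basin temperature) as defined operationally in the abstract."""
    slope, intercept = fit_line(altitudes_m, temps_c)
    return slope, intercept + slope * reference_altitude_m


def inversion_day_counts(daily_lapse_rates):
    """Day index within an inversion event; 0 on days with non-positive lapse rate.

    Assumption: the first day with a positive lapse rate counts as day 1.
    """
    counts, run = [], 0
    for rate in daily_lapse_rates:
        run = run + 1 if rate > 0 else 0
        counts.append(run)
    return counts


# Hypothetical daily maximum temperatures (deg C) at four sites (altitudes in m).
altitudes = [1400.0, 1500.0, 1600.0, 1700.0]
temps = [-10.0, -9.5, -9.0, -8.5]
lapse_rate, basin_temp = lapse_rate_and_basin_temperature(altitudes, temps)
```

    In this sketch temperature rises with altitude, so the fitted slope is positive and the day would count as part of an inversion event.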

  1. QAlign: quality-based multiple alignments with dynamic phylogenetic analysis

    OpenAIRE

    Sammeth, Michael; Rothgänger, Jörg; Esser, Wolfram; Albert, Jürgen; Stoye, Jens; Harmsen, Dag

    2003-01-01

    Integrating different alignment strategies, a layout editor and tools deriving phylogenetic trees in a 'multiple alignment environment' helps to investigate and enhance results of multiple sequence alignment by hand. QAlign combines algorithms for fast progressive and accurate simultaneous multiple alignment with a versatile editor and a dynamic phylogenetic analysis in a convenient graphical user interface.

  2. Using Spline Regression in Semi-Parametric Stochastic Frontier Analysis: An Application to Polish Dairy Farms

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    The estimation of the technical efficiency comprises a vast literature in the field of applied production economics. There are two predominant approaches: the non-parametric and non-stochastic Data Envelopment Analysis (DEA) and the parametric Stochastic Frontier Analysis (SFA). The DEA is criticised because it cannot account for statistical noise such as random production shocks and measurement errors, which are inherent in more or less all production data sets. In contrast, the SFA is criticised because it requires the specification of a functional form, which involves the risk of specifying an unsuitable functional form and thus model misspecification and biased parameter estimates. Given these problems of the DEA and the SFA, Fan, Li and Weersink (1996) proposed a semi-parametric stochastic frontier model that estimates the production function (frontier) by non-parametric regression based on kernel estimators. This approach combines the virtues of the DEA and the SFA while avoiding their drawbacks: it avoids the specification of a functional form and at the same time accounts for statistical noise. More recently, this approach was used by Henderson and Simar (2005), Kumbhakar et al. (2007), and Henningsen and Kumbhakar (2009). The aim of this paper and its main contribution to the existing literature is the estimation of semi-parametric stochastic frontier models using a different non-parametric estimation technique: spline regression (Ma et al. 2011). We apply this approach to the Polish dairy sector and use a panel data set of Polish dairy farms from the years 2004-2010. The Polish dairy sector has changed considerably since the integration of Poland in the European Union: the number of dairy producers decreased by one third and the average herd size increased from 3.8 to 5.7 cows per farm within the period 2004-2010. It is expected that farms with small herds (less than 30 dairy cows) will quit and that the number of large farms (with more than 100 dairy cows) will increase. Therefore, a thorough empirical study of the technical efficiency and scale efficiency of Polish dairy farms contributes to the insight into this dynamic process. Furthermore, we compare and evaluate the results of this spline-based semi-parametric stochastic frontier model with results of other semi-parametric stochastic frontier models and of traditional parametric stochastic frontier models. References: Fan, Y., Li, Q., Weersink, A. (1996), Semiparametric Estimation of Stochastic Production Frontier Models, Journal of Business and Economic Statistics. Henderson, D. J., Simar, L. (2005), A Fully Nonparametric Stochastic Frontier Model for Panel Data, University of New York. Henningsen, A., Kumbhakar, S. C. (2009), Semiparametric Stochastic Frontier Analysis: An Application to Polish Farms During Transition, Paper presented at the (EWEPA) in Pisa, Italy. Kumbhakar, S. C., Park, B. U., Simar, L., Tsionas, E. G. (2007), Nonparametric Stochastic Frontiers: A Local Maximum Likelihood Approach, Journal of Econometrics. Ma, S., Racine, J. S., Yang, L. (2011), Spline regression in the presence of categorical predictors, Working Paper.

  3. Gait analysis with multiple depth cameras.

    Science.gov (United States)

    Auvinet, Edouard; Multon, Franck; Meunier, Jean

    2011-01-01

    The gait movement seems simple at first glance, but in reality it is a very complex neural and biomechanical process. In particular, if a person is affected by a disease or an injury, the gait may be modified. To help detect such changes, we propose a new method based on multiple depth cameras. The aim of this paper is to show that the 3D body volume can be reconstructed in real time during gait in order to detect a pathological problem related to this movement and eventually improve diagnosis. Preliminary results showed that the system is sensitive to the gait change produced by a heel prosthesis (heel cup) inserted in one shoe of subjects walking on a treadmill. The system detected a difference between maximal forward and backward positions of the lower limbs for this pathological walk, a difference that was negligible for normal walk. These promising results were obtained with only 3 low-cost depth cameras; we therefore believe that such methodology opens a new and affordable way for 3D volumetric gait analysis. PMID:22255770

  4. Multiple Criteria Analysis for Energy Storage Selection

    Directory of Open Access Journals (Sweden)

    Alexandre Barin

    2011-09-01

    Full Text Available In view of the current and predictable energy shortage and environmental concerns, the exploitation of renewable energy sources offers great potential to meet increasing energy demands and to decrease dependence on fossil fuels. However, introducing these sources will be more attractive provided they operate in conjunction with energy storage systems (ESS). Furthermore, effective energy storage management is essential to achieve a balance between power quality, efficiency, costs and environmental constraints. This paper presents a method based on the analytic hierarchy process and fuzzy multi-rules and multi-sets. By exploiting a multiple criteria analysis, the proposed methods evaluate the operation of storage energy systems such as: pumped hydro and compressed air energy storage, H2, flywheel, super-capacitors and lithium-ion storage as well as NaS advanced batteries and VRB flow battery. The main objective of the study is to find the most appropriate ESS consistent with a power quality priority. Several parameters are used for the investigation: efficiency, load management, technical maturity, costs, environmental impact and power quality.

  5. Detecting outliers in fuzzy regression analysis with asymmetric trapezoidal fuzzy data

    Science.gov (United States)

    Maleki, A.; Pasha, E.; Yari, Gh.; Razzaghnia, T.

    2012-01-01

    The existence of outliers in a set of experimental data can cause incorrect interpretation of the fuzzy linear regression results. This paper introduces some limitations on the constraints of fuzzy linear regression models for determining fuzzy parameters in the presence of outliers, using trapezoidal fuzzy data.

  6. Using PROC IML for iteratively re-weighted multiple linear regression with a block-diagonal covariance matrix

    Energy Technology Data Exchange (ETDEWEB)

    MacQueen, D.H.

    1990-08-01

    We have been using PROC IML to model the magnitude of a seismic event using signals from one to four seismic stations. Geological properties of the earth through which the shock waves travel introduce a correlation between the recorded signals for each event. Thus up to four correlated dependent variables are recorded for each value of the independent variable. We estimate the dependence of the measured seismic signals on the source using weighted linear regression with a block-diagonal covariance matrix in which the blocks are not all the same dimension and the covariances need to be estimated. After several iterations the weights and covariances converge. We can then estimate the magnitude of the source using a weighted average of the seismic signals, in which the weights and the variance of the estimate depend on which subset of the stations recorded the signal. PROC IML is a useful and succinct programming tool for this problem.

  7. Comparison of Some Estimation Methods in Linear Regression

    Directory of Open Access Journals (Sweden)

    İlkay Altındağ

    2010-10-01

    Full Text Available In this study, we present some alternatives to the classical least squares method used for simple and multiple linear regression analysis. In matrix form, the linear regression model is $Y = X\beta + \varepsilon$, where $Y$ is the vector of the dependent variable, $X$ is the design matrix of independent variables, $\beta$ is the parameter vector and $\varepsilon$ is the vector of error terms, so the least squares estimator is $\hat{\beta} = (X'X)^{-1}X'Y$. Alternative methods have emerged for cases where the observations contain outliers, the data do not satisfy the regression assumptions, or prior information about the parameters is to be used. Besides the least squares method, the study covers least absolute deviations regression, artificial neural networks, M-regression, nonparametric regression and Bayesian regression. To compare the results of the methods, numerical results are derived using temperature variation data from the Antalya and Fethiye regions for simple regression analysis, and variables affecting the fuel percentage in crude oil for multiple regression analysis.
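
    As a rough illustration of the matrix form of least squares described in this abstract, the sketch below forms the normal equations (the cross-product matrix of the design matrix and its cross-product with the response) and solves them for the coefficient vector. The design matrix and response values are hypothetical and noise-free, chosen so that the true coefficients are known; the helper names are illustrative and not taken from the paper.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares(design_rows, response):
    """Return the coefficient vector that solves (X'X) beta = X'Y."""
    columns = list(zip(*design_rows))  # columns of X, i.e. rows of X'
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(a * y for a, y in zip(ci, response)) for ci in columns]
    return solve_linear(xtx, xty)


# Hypothetical design matrix: intercept column plus two predictors.
X = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [1.0, 2.0, 2.0], [1.0, 3.0, 1.0], [1.0, 4.0, 3.0]]
# Responses generated without noise from y = 2 + 3*x1 - 1*x2.
Y = [1.0, 5.0, 6.0, 10.0, 11.0]
beta_hat = least_squares(X, Y)
```

    Solving the normal equations directly is adequate for a small, well-conditioned example; it is not one of the robust alternatives the study compares.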

  8. Regression analysis on ionic liquid pretreatment of sugarcane bagasse and assessment of structural changes

    International Nuclear Information System (INIS)

    This study aims to perform a regression analysis that leads to the optimization of the operating conditions of ionic liquid (IL), 1-ethyl-3-methylimidazolium acetate ([EMIM]oAc), pretreatment of sugarcane bagasse (SCB). The structural changes in SCB during pretreatment were also examined. The effects of temperature, time and solid loading on the reducing sugar (RS) yield obtained from enzymatic hydrolysis of pretreated SCB were investigated by applying the Central Composite Design (CCD) of Response Surface Methodology (RSM). Results from the CCD were modeled by a second-order polynomial equation, and the model shows a good correlation between predicted and experimental values. The optimized conditions for [EMIM]oAc pretreatment were 145 °C, 15 min and 14 wt% solid loading, with an optimum RS yield of 69.7%. Characterization of SCB was carried out, and there was no significant difference between the chemical composition of untreated and [EMIM]oAc-pretreated SCB. Pretreated SCB was found to be porous, less crystalline and favorable to enzymatic hydrolysis, as shown by Scanning Electron Microscopy (SEM), X-ray Powder Diffraction (XRD) analysis and Fourier Transform Infrared (FTIR) analysis. In short, [EMIM]oAc pretreatment shows good performance in improving the RS yield after enzymatic hydrolysis, besides giving desirable structural modification of pretreated SCB. These are of great benefit to the subsequent downstream processes. Highlights: reliable model prediction of reducing sugar yield; temperature has the most significant effect on [EMIM]oAc pretreatment; high solid loading in [EMIM]oAc pretreatment is feasible; amorphous and porous structure in pretreated bagasse was confirmed; no significant variation in chemical composition of untreated and pretreated bagasse.

  9. Brief review of regression-based and machine learning methods in genetic epidemiology: the Genetic Analysis Workshop 17 experience.

    Science.gov (United States)

    Dasgupta, Abhijit; Sun, Yan V; König, Inke R; Bailey-Wilson, Joan E; Malley, James D

    2011-01-01

    Genetic Analysis Workshop 17 provided common and rare genetic variants from exome sequencing data and simulated binary and quantitative traits in 200 replicates. We provide a brief review of the machine learning and regression-based methods used in the analyses of these data. Several regression and machine learning methods were used to address different problems inherent in the analyses of these data, which are high-dimension, low-sample-size data typical of many genetic association studies. Unsupervised methods, such as cluster analysis, were used for data segmentation and subset selection. Supervised learning methods, which include regression-based methods (e.g., generalized linear models, logic regression, and regularized regression) and tree-based methods (e.g., decision trees and random forests), were used for variable selection (selecting genetic and clinical features most associated or predictive of outcome) and prediction (developing models using common and rare genetic variants to accurately predict outcome), with the outcome being case-control status or quantitative trait value. We include a discussion of cross-validation for model selection and assessment, and a description of available software resources for these methods. PMID:22128059

  10. Development of synthetic velocity - depth damage curves using a Weighted Monte Carlo method and Logistic Regression analysis

    Science.gov (United States)

    Vozinaki, Anthi Eirini K.; Karatzas, George P.; Sibetheros, Ioannis A.; Varouchakis, Emmanouil A.

    2014-05-01

    Damage curves are the most significant component of the flood loss estimation models. Their development is quite complex. Two types of damage curves exist, historical and synthetic curves. Historical curves are developed from historical loss data from actual flood events. However, due to the scarcity of historical data, synthetic damage curves can be alternatively developed. Synthetic curves rely on the analysis of expected damage under certain hypothetical flooding conditions. A synthetic approach was developed and presented in this work for the development of damage curves, which are subsequently used as the basic input to a flood loss estimation model. A questionnaire-based survey took place among practicing and research agronomists, in order to generate rural loss data based on the responders' loss estimates, for several flood condition scenarios. In addition, a similar questionnaire-based survey took place among building experts, i.e. civil engineers and architects, in order to generate loss data for the urban sector. By answering the questionnaire, the experts were in essence expressing their opinion on how damage to various crop types or building types is related to a range of values of flood inundation parameters, such as floodwater depth and velocity. However, the loss data compiled from the completed questionnaires were not sufficient for the construction of workable damage curves; to overcome this problem, a Weighted Monte Carlo method was implemented, in order to generate extra synthetic datasets with statistical properties identical to those of the questionnaire-based data. The data generated by the Weighted Monte Carlo method were processed via Logistic Regression techniques in order to develop accurate logistic damage curves for the rural and the urban sectors. A Python-based code was developed, which combines the Weighted Monte Carlo method and the Logistic Regression analysis into a single code (WMCLR Python code). 
Each WMCLR code execution provided a flow velocity-depth damage curve for a specific land use. More specifically, each WMCLR code execution for the agricultural sector generated a damage curve for a specific crop and for every month of the year, thus relating the damage to any crop with floodwater depth, flow velocity and the growth phase of the crop at the time of flooding. Respectively, each WMCLR code execution for the urban sector developed a damage curve for a specific building type, relating structural damage with floodwater depth and velocity. Furthermore, two techno-economic models were developed in Python programming language, in order to estimate monetary values of flood damages to the rural and the urban sector, respectively. A new Monte Carlo simulation was performed, consisting of multiple executions of the techno-economic code, which generated multiple damage cost estimates. Each execution used the proper WMCLR simulated damage curve. The uncertainty analysis of the damage estimates established the accuracy and reliability of the proposed methodology for the synthetic damage curves' development.

  11. Locating the Extrema of Fungible Regression Weights

    Science.gov (United States)

    Waller, Niels G.; Jones, Jeff A.

    2009-01-01

    In a multiple regression analysis with three or more predictors, every set of alternate weights belongs to an infinite class of "fungible weights" (Waller, Psychometrika, "in press") that yields identical "SSE" (sum of squared errors) and R[superscript 2] values. When the R[superscript 2] using the alternate weights is a fixed value, fungible…

  12. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    Science.gov (United States)

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  13. A STATISTICAL ANALYSIS OF GDP AND FINAL CONSUMPTION USING SIMPLE LINEAR REGRESSION. THE CASE OF ROMANIA 1990–2010

    Directory of Open Access Journals (Sweden)

    Bălăcescu Aniela

    2011-12-01

    Full Text Available This paper aims to examine the causal relationship between GDP and final consumption. The authors used a linear regression model in which GDP is treated as the outcome (dependent) variable and final consumption as the factor (independent) variable. In preparing the article we used the Excel software application, a modern tool for computation and statistical data analysis.

  14. Scientific Productivity on Research in Ethical Issues over the Past Half Century: A JoinPoint Regression Analysis

    OpenAIRE

    Long, Nguyen Phuoc; Huy, Nguyen Tien; Trang, Nguyen Thi Huyen; Luan, Nguyen Thien; Anh, Nguyen Hoang; Nghi, Tran Diem; Hieu, Mai; Hirayama, Kenji; Karbwang, Juntra

    2014-01-01

    BACKGROUND: Ethics is one of the main pillars in the development of science. We performed a JoinPoint regression analysis to analyze the trends of ethical issue research over the past half century. The question is whether ethical issues are neglected despite their importance in modern research.

  15. What Satisfies Students? Mining Student-Opinion Data with Regression and Decision-Tree Analysis. AIR 2002 Forum Paper.

    Science.gov (United States)

    Thomas, Emily H.; Galambos, Nora

    To investigate how students' characteristics and experiences affect satisfaction, this study used regression and decision-tree analysis with the CHAID algorithm to analyze student opinion data from a sample of 1,783 college students. A data-mining approach identifies the specific aspects of students' university experience that most influence three…

  16. Using Regression Analysis to Establish the Relationship between Home Environment and Reading Achievement: A Case of Zimbabwe

    Science.gov (United States)

    Kanyongo, Gibbs Y.; Certo, Janine; Launcelot, Brown I.

    2006-01-01

    In this study, we report results of a study examining the relationship between home environment factors and reading achievement in Zimbabwe. The study utilised data collected by the Southern and Eastern Africa Consortium for Monitoring Educational Quality (SACMEQ). The data were submitted to linear regression analysis through structural equation…

  17. The basis function regression in pharmaceutical analysis. Theory and example of application.

    Science.gov (United States)

    Komsta, Lukasz; Skibiński, Robert; Paryło, Marta; Dudek, Karolina

    2008-08-01

    The BFR (Basis Function Regression) is an interesting alternative to common techniques (such as PCR or PLS) in chemometrics. It is based on projecting the spectral information onto some number of equally spaced spline bases, then obtaining integrals of the resulting curves. Existing references show that in certain cases it can reduce the RMSEP values by almost a factor of two. As this technique is neither popular in chemometrics nor applied in pharmaceutical analysis, it is desirable to present its theoretical considerations and implementation (with example MATLAB/Octave code). As an illustrative example we present the chemometric model for content recognition of a tablet (12 possible compounds in binary or ternary combinations) from the UV spectrum of its methanolic extract. The BFR technique gave the lowest prediction error, and the estimators obtained have a more substantive meaning than in the case of PCR, PLS and other techniques used for comparison. In our opinion this technique should be considered in any chemometric approach as it often shows better performance. PMID:18450403

  18. Ultrasound material backscattered noise analysis by a duo wavelet-regression analysis

    OpenAIRE

    Bettayeb, Fairouz

    2012-01-01

    Detection of internal material defects by ultrasonic non-destructive testing is widely used in industry. Ultrasonic data are obtained from waves traveling inside the material and captured by piezoelectric sensors. The naturally inhomogeneous and anisotropic character of steel causes high acoustic attenuation and scattering. This adds complexity to data analysis. In this research we address the non-linear features of backscattered ultrasonic waves from steel plates and welds. Inde...

  19. Application of Bootstrap Sample-Resample Method in Logistic Regression in Analysis of Breast Cancer Data

    Directory of Open Access Journals (Sweden)

    H Zeraati

    2006-05-01

    Full Text Available Background and Aim: The purpose of this study was to assess the accuracy of the bootstrap method in logistic regression and to explore the method's use in logistic regression models in cases where the sample size is insufficient. Materials and Methods: We used data from 150 patients who had undergone surgery at the Cancer Institute, Emam Khomeini hospital from 1999 to 2001. Then we drew repeated samples of size 50 from these 150 patients. Results: Applying ordinary logistic regression, an appropriate model was fitted to the initial data. Then confidence intervals and standard errors were computed for all regression coefficients. There are many situations where the sample size is insufficient and conditions for using ordinary logistic regression are not met. In these cases the use of the bootstrap method not only produces more accurate estimations of regression coefficients, but with repeated sampling, produces estimates very close to the true values. This holds for the estimation of regression coefficients, confidence intervals and standard errors of coefficients. Conclusion: In this study we show the optimal number of replications and the optimal sample size when using the bootstrap method in studies involving relatively small sample sizes.
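
    The resampling scheme described here can be sketched roughly as follows: fit a logistic regression to the full data, then refit it on repeated samples of size 50 and summarize the spread of the coefficient. This is a minimal sketch under stated assumptions, not the authors' procedure: the data are simulated rather than patient records, the model has a single hypothetical predictor, samples are drawn with replacement, and the interval is a simple percentile interval.

```python
import math
import random


def fit_logistic(xs, ys, max_iter=50):
    """Newton-Raphson fit of P(y=1) = 1 / (1 + exp(-(b0 + b1*x))); returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            z = max(-30.0, min(30.0, b0 + b1 * x))  # clamp to avoid overflow in exp
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w               # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det < 1e-12:  # nearly singular information matrix, e.g. a separated sample
            break
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < 1e-10:
            break
    return b0, b1


def bootstrap_slope(xs, ys, sample_size, replicates, seed=0):
    """Return (standard error, lower, upper) of the slope over resampled fits."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(replicates):
        idx = [rng.randrange(n) for _ in range(sample_size)]  # draw with replacement
        _, b1 = fit_logistic([xs[i] for i in idx], [ys[i] for i in idx])
        slopes.append(b1)
    slopes.sort()
    mean = sum(slopes) / replicates
    se = math.sqrt(sum((s - mean) ** 2 for s in slopes) / (replicates - 1))
    lower = slopes[int(0.025 * replicates)]       # approximate 2.5th percentile
    upper = slopes[int(0.975 * replicates) - 1]   # approximate 97.5th percentile
    return se, lower, upper


def simulate_patients(n, seed=1):
    """Simulated stand-in data (assumption): one predictor, true slope 1, intercept -0.5."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1 if rng.random() < 1.0 / (1.0 + math.exp(0.5 - x)) else 0 for x in xs]
    return xs, ys


xs, ys = simulate_patients(150)
full_intercept, full_slope = fit_logistic(xs, ys)
slope_se, slope_lower, slope_upper = bootstrap_slope(xs, ys, sample_size=50, replicates=200)
```

    Varying `replicates` and `sample_size` in such a sketch is one way to explore how the number of replications and the sample size affect the stability of the estimates.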

  20. Residual load method with multiple response spectra piping analysis

    International Nuclear Information System (INIS)

    The residual load method is incorporated into the multiple response spectra analysis. Numerical results show that this method produces close results to full multiple response spectra analysis results to obtain high frequency related responses in a cost effective way. Also, SRSS combination of support groups is shown to be appropriate with this method to reduce over-conservatism
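
    For readers unfamiliar with the abbreviation, SRSS denotes the square root of the sum of the squares. A minimal sketch of combining per-support-group responses this way is given below; the response values are hypothetical, and the sketch does not implement the residual load method itself.

```python
import math


def srss(responses):
    """Combine individual responses by the square root of the sum of the squares."""
    return math.sqrt(sum(r * r for r in responses))


# Hypothetical peak responses from two support groups (same units).
combined = srss([3.0, 4.0])
```

    Compared with an absolute sum of the same responses (7.0 here), the SRSS combination gives a smaller value, which is the sense in which it reduces over-conservatism.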

  1. Simultaneous Two-Way Clustering of Multiple Correspondence Analysis

    Science.gov (United States)

    Hwang, Heungsun; Dillon, William R.

    2010-01-01

    A 2-way clustering approach to multiple correspondence analysis is proposed to account for cluster-level heterogeneity of both respondents and variable categories in multivariate categorical data. Specifically, in the proposed method, multiple correspondence analysis is combined with k-means in a unified framework in which "k"-means is applied…

  2. Modeling and regression analysis of semiochemical dose-response curves of insect antennal reception and behavior.

    Science.gov (United States)

    Byers, John A

    2013-08-01

    Dose-response curves of the effects of semiochemicals on neurophysiology and behavior are reported in many articles in insect chemical ecology. Most curves are shown in figures representing points connected by straight lines, in which the x-axis has order of magnitude increases in dosage vs. responses on the y-axis. The lack of regression curves indicates that the nature of the dose-response relationship is not well understood. Thus, a computer model was developed to simulate a flux of various numbers of pheromone molecules (10^3 to 5 × 10^6) passing by 10^4 receptors distributed among 10^6 positions along an insect antenna. Each receptor was depolarized by at least one strike by a molecule, and subsequent strikes had no additional effect. The simulations showed that with an increase in pheromone release rate, the antennal response would increase in a convex fashion and not in a logarithmic relation as suggested previously. Non-linear regression showed that a family of kinetic formation functions fit the simulated data nearly perfectly (R^2 > 0.999). This is reasonable because olfactory receptors have proteins that bind to the pheromone molecule and are expected to exhibit enzyme kinetics. Over 90 dose-response relationships reported in the literature of electroantennographic and behavioral bioassays in the laboratory and field were analyzed by the logarithmic and kinetic formation functions. This analysis showed that in 95% of the cases, the kinetic functions explained the relationships better than the logarithmic (mean of about 20% better). The kinetic curves become sigmoid when graphed on a log scale on the x-axis. Dose-catch relationships in the field are similar to dose-EAR (effective attraction radius, in which a spherical radius indicates the trapping effect of a lure) and the circular EARc in two dimensions used in mass trapping models. The use of kinetic formation functions for dose-response curves of attractants, and kinetic decay curves for inhibitors, will allow more accurate predictions of insect catch in monitoring and control programs. PMID:23897111
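
    The simulation described in this abstract can be sketched in a few lines. The sketch assumes that each molecule strikes one of the 10^6 positions uniformly at random, which the abstract does not state, and it is not the author's program. Under that assumption the expected number of depolarized receptors has a closed form, which is included for comparison.

```python
import random


def simulate_response(n_molecules, n_receptors=10_000, n_positions=1_000_000, seed=0):
    """Count receptors struck at least once; repeat strikes have no additional effect."""
    rng = random.Random(seed)
    receptor_positions = set(rng.sample(range(n_positions), n_receptors))
    struck = set()
    for _ in range(n_molecules):
        position = rng.randrange(n_positions)  # assumption: uniform random strike
        if position in receptor_positions:
            struck.add(position)
    return len(struck)


def expected_response(n_molecules, n_receptors=10_000, n_positions=1_000_000):
    """Expected number of struck receptors under the uniform-strike assumption."""
    miss_probability = (1.0 - 1.0 / n_positions) ** n_molecules
    return n_receptors * (1.0 - miss_probability)


low_dose = simulate_response(1_000)
high_dose = simulate_response(100_000)
```

    Because a receptor can be depolarized only once, the response grows more slowly than the dose and eventually saturates at the total number of receptors, which is the qualitative behavior the abstract reports.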

  3. Regression Analysis of Effective Factor on People Participation in Protecting, Revitalizing, Developing and Using Renewable Natural Resources in Ilam Province from the View of Users

    Directory of Open Access Journals (Sweden)

    Bagher Arayesh

    2010-01-01

    Full Text Available Problem statement: The purpose of this study was a regression analysis of the factors affecting people's participation in protecting, revitalizing, developing and using renewable natural resources in Ilam province. Approach: This was a causal-comparative, applied study. The sample was taken from natural resources users. Results: The sample size was 317 users. For sample selection, stratified, cluster and multiple sampling were used. The main tool for gathering data was a questionnaire. The reliability and validity of the questionnaire were established by experts and a pilot study, and its alpha level was 88%. Descriptive and inferential statistics were used and data were analyzed by sp. 15. To test the hypotheses, correlation and multiple regression were employed. Conclusion: The results indicated that level of education, rate of media use, users' trust in natural resources executives, consultation with users before implementing the plans, number of cattle, kind of occupation, users' membership in public institutions and organizations, social status of users, technical knowledge of users, present status of natural resources extensive plans, political and low full support of users, amount of loan received by users and organizing nature assistant have a significant role in people's participation in protecting, revitalizing, developing and using renewable natural resources.

  4. The non-condition logistic regression analysis of the reason of hypothyroidism after hyperthyroidism with 131I treatment

    International Nuclear Information System (INIS)

    There are many opinions on the reason for hypothyroidism after 131I treatment of hyperthyroidism, but few scientific analyses and reports on the subject. Non-condition logistic regression solved this problem successfully. It has a higher scientific value and confidence in risk factor analysis. The data of 748 follow-up patients were analysed by non-condition logistic regression. The results showed that the half-life and the 131I dose were the main causes of the incidence of hypothyroidism. The degree of confidence is 92.4%.

  5. Shock Index Correlates with Extravasation on Angiographs of Gastrointestinal Hemorrhage: A Logistics Regression Analysis

    International Nuclear Information System (INIS)

    We applied multivariate analysis to the clinical findings in patients with acute gastrointestinal (GI) hemorrhage and compared the relationship between these findings and angiographic evidence of extravasation. Our study population consisted of 46 patients with acute GI bleeding. They were divided into two groups. In group 1 we retrospectively analyzed 41 angiograms obtained in 29 patients (age range, 25-91 years; average, 71 years). Their clinical findings, including the shock index (SI), diastolic blood pressure, hemoglobin, platelet counts, and age, were quantitatively analyzed. In group 2, consisting of 17 patients (age range, 21-78 years; average, 60 years), we prospectively applied statistical analysis by a logistic regression model to their clinical findings and then assessed 21 angiograms obtained in these patients to determine whether our model was useful for predicting the presence of angiographic evidence of extravasation. On 18 of 41 (43.9%) angiograms in group 1 there was evidence of extravasation; in 3 patients it was demonstrated only by selective angiography. Factors significantly associated with angiographic visualization of extravasation were the SI and patient age. For differentiation between cases with and cases without angiographic evidence of extravasation, the maximum cutoff point was between 0.51 and 0.53. Of the 21 angiograms obtained in group 2, 13 (61.9%) showed evidence of extravasation; in 1 patient it was demonstrated only on selective angiograms. We found that in 90% of the cases, the prospective application of our model correctly predicted the angiographically confirmed presence or absence of extravasation. We conclude that in patients with GI hemorrhage, angiographic visualization of extravasation is associated with the pre-embolization SI. Patients with a high SI value should undergo study to facilitate optimal treatment planning.

  6. Changes of platelet GMP-140 in diabetic nephropathy and its multi-factor regression analysis

    International Nuclear Information System (INIS)

    The relation of platelet GMP-140 and its related factors with diabetic nephropathy was studied. 144 patients with diabetes mellitus without nephropathy (group without DN, mean disease duration of 25.5 ± 18.6 months), 80 with diabetic nephropathy (group DN, mean disease duration of 58.7 ± 31.6 months) and 50 normal controls were chosen for the research. Platelet GMP-140, plasma ?1-MG, ?2-MG, and 24 hour urine albumin (ALB), IgG, ?1-MG, ?2-MG were detected by RIA, while HBA1C was measured via chromatographic separation and FBG, PBG, Ch, TG, HDL, FG via biochemical methods. All the data were processed by computer software with t-test and linear regression, and multi-factor analysis was also done. The levels of platelet GMP-140, FG, DBP, TG, HBA1C and PBG in group DN were significantly higher than those of group without DN and normal control (P 0.05), while they were higher than those of normal controls. Multi-factor analysis of platelet GMP-140 with TG, DBP and HBA1C were performed in 80 patients with DN (P 1C are the independent factors enhancing the activation of platelets. The disturbance of lipid metabolism in type II diabetes mellitus may also enhance the activation of platelets. Elevation of blood pressure may accelerate the initiation and deterioration of DN in which change of platelet GMP-140 is an independent factor. Elevation of HBA1C and blood glucose are related closely to the diabetic nephropathy.

  7. Neuroanatomical abnormalities in schizophrenia: a multimodal voxelwise meta-analysis and meta-regression analysis.

    Science.gov (United States)

    Bora, Emre; Fornito, Alex; Radua, Joaquim; Walterfang, Mark; Seal, Marc; Wood, Stephen J; Yücel, Murat; Velakoulis, Dennis; Pantelis, Christos

    2011-04-01

    Despite an increasing number of published voxel based morphometry studies of schizophrenia, there has been no adequate attempt to examine gray (GM) and white matter (WM) abnormalities and the heterogeneity of published findings. In the current article, we used a coordinate based meta-analysis technique to simultaneously examine GM and WM abnormalities in schizophrenia and to assess the effects of gender, chronicity, negative symptoms and other clinical variables. 79 studies meeting our inclusion criteria were included in the meta-analysis. Schizophrenia was associated with GM reductions in the bilateral insula/inferior frontal cortex, superior temporal gyrus, anterior cingulate gyrus/medial frontal cortex, thalamus and left amygdala. In WM analyses of volumetric and diffusion-weighted images, schizophrenia was associated with decreased FA and/or WM in interhemispheric fibers, anterior thalamic radiation, inferior longitudinal fasciculi, inferior frontal occipital fasciculi, cingulum and fornix. Male gender, chronic illness and negative symptoms were associated with more severe GM abnormalities and illness chronicity was associated with more severe WM deficits. The meta-analyses revealed overlapping GM and WM structural findings in schizophrenia, characterized by bilateral anterior cortical, limbic and subcortical GM abnormalities, and WM changes in regions including tracts that connect these structures within and between hemispheres. However, the available findings are biased towards characteristics of schizophrenia samples with poor prognosis. PMID:21300524

  8. Analysis of correlation and regression between particle ionizing radiation parameters and the stability characteristics of irradiated monocrystalline Si film

    Directory of Open Access Journals (Sweden)

    Jakšić Uroš G.

    2014-01-01

    Full Text Available This paper deals with the analysis of correlation and regression between the parameters of particle ionizing radiation and the stability characteristics of the irradiated monocrystalline silicon film. Based on the presented theoretical model of correlation and linear regression between two random variables, numeric and real experiments were performed. In the numeric experiment, a simulation of the effect of alpha radiation on a thin layer of monocrystalline silicon was performed by observing the number of vacancies along the film depth resulting from a single incident alpha particle. In the real experiment, the irradiation of a thin silicon film by alpha particles from a radioactive Am-241 alpha emitter was performed. The observed values of radiation effect on the Si film were specific resistance and the concentration of free charge carriers. The results showed a fine concordance between numeric and real experiments. Correlation verification of the observed values was presented by linear regression functions. [Project of the Ministry of Science of the Republic of Serbia, no. 171007
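
    The correlation and linear regression between two random variables mentioned in this abstract can be sketched as follows. This is a minimal illustration only: the function name and the sample values are hypothetical and are not taken from the paper.

```python
import math

def pearson_and_line(x, y):
    """Return (r, slope, intercept) for paired samples x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)       # Pearson correlation coefficient
    slope = sxy / sxx                    # least-squares slope
    intercept = mean_y - slope * mean_x  # least-squares intercept
    return r, slope, intercept

# Hypothetical values: a radiation parameter vs. a measured film property.
radiation = [1.0, 2.0, 3.0, 4.0, 5.0]
film_property = [2.1, 3.9, 6.2, 7.8, 10.1]
r, slope, intercept = pearson_and_line(radiation, film_property)
print(f"r={r:.3f}, slope={slope:.3f}, intercept={intercept:.3f}")
```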

  9. Future daily PM10 concentrations prediction by combining regression models and feedforward backpropagation models with principle component analysis (PCA)

    Science.gov (United States)

    Ul-Saufie, Ahmad Zia; Yahaya, Ahmad Shukri; Ramli, Nor Azam; Rosaida, Norrimi; Hamid, Hazrul Abdul

    2013-10-01

    Future PM10 concentration prediction is very important because it can help local authorities to enact preventative measures to reduce the impact of air pollution. The aims of this study are to improve prediction of Multiple Linear Regression (MLR) and Feedforward backpropagation (FFBP) by combining them with principal component analysis for predicting future (next day, next two-day and next three-day) PM10 concentration in Negeri Sembilan, Malaysia. Annual hourly observations for PM10 in Negeri Sembilan, Malaysia from January 2003 to December 2010 were selected for predicting PM10 concentration level. Eighty percent of the monitoring records were used for training and twenty percent were used for validation of the models. Three accuracy measures - Prediction Accuracy (PA), Coefficient of Determination (R2) and Index of Agreement (IA), as well as two error measures - Normalized Absolute Error (NAE) and Root Mean Square Error (RMSE) were used to evaluate the performance of the models. Results show that PCA models combined with MLR and PCA with FFBP improved MLR and FFBP models for all three days in advance of predicting PM10 concentration, with errors reduced by as much as 18.1% (PCA-MLR) and 17.68% (PCA-FFBP) for next day, 19.2% (PCA-MLR) and 22.1% (PCA-FFBP) for next two-day and 18.7% (PCA-MLR) and 22.79% (PCA-FFBP) for next three-day predictions. Including PCA improved the accuracy of the models by as much as 12.9% (PCA-MLR) and 13.3% (PCA-FFBP) for next day, 32.3% (PCA-MLR) and 14.7% (PCA-FFBP) for next two-day and 46.1% (PCA-MLR) and 19.3% (PCA-FFBP) for next three-day predictions.
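
    Three of the performance measures named in this abstract (RMSE, NAE and IA) can be sketched as below. The abstract does not give the formulas, so the definitions here are assumptions based on commonly used forms (RMSE dividing by n, NAE as summed absolute error over summed observations, IA in Willmott's form); the sample data are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error (assumed form: divides by n, not n - 1)."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

def nae(observed, predicted):
    """Normalized absolute error: total absolute error over total observed."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / sum(observed)

def index_of_agreement(observed, predicted):
    """Index of agreement (assumed Willmott form); 1.0 means perfect agreement."""
    mean_obs = sum(observed) / len(observed)
    num = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(observed, predicted))
    return 1.0 - num / den

# Hypothetical daily PM10 values (observed vs. model output).
obs = [10.0, 20.0, 30.0]
pred = [12.0, 18.0, 33.0]
print(rmse(obs, pred), nae(obs, pred), index_of_agreement(obs, pred))
```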

  10. Further Insight and Additional Inference Methods for Polynomial Regression Applied to the Analysis of Congruence

    OpenAIRE

    Cohen, Ayala; Nahum-shani, Inbal; Doveh, Etti

    2010-01-01

    In their seminal paper, Edwards and Parry (1993) presented the polynomial regression as a better alternative to applying difference score in the study of congruence. While this method is increasingly applied in congruence research, its complexity relative to other methods for assessing congruence (e.g., difference score methods) was one of the main claims against its use. The objective of this work is to gain additional insight into the use of polynomial regression in the area of social and b...

  11. Factor models and variable selection in high-dimensional regression analysis

    OpenAIRE

    Kneip, Alois; Sarda, Pascal

    2012-01-01

    The paper considers linear regression problems where the number of predictor variables is possibly larger than the sample size. The basic motivation of the study is to combine the points of view of model selection and functional regression by using a factor approach: it is assumed that the predictor vector can be decomposed into a sum of two uncorrelated random components reflecting common factors and specific variabilities of the explanatory variables. It is shown that the ...

  12. Sample- and segment-size specific Model Selection in Mixture Regression Analysis

    OpenAIRE

    Sarstedt, Marko

    2006-01-01

    As mixture regression models increasingly receive attention from both theory and practice, the question of selecting the correct number of segments gains urgency. A misspecification can lead to an under- or oversegmentation, thus resulting in flawed management decisions on customer targeting or product positioning. This paper presents the results of an extensive simulation study that examines the performance of commonly used information criteria in a mixture regression context with normal ...

  13. Analysis of North American Rheumatoid Arthritis Consortium data using a penalized logistic regression approach

    OpenAIRE

    Croiseau Pascal; Cordell Heather J

    2009-01-01

    Abstract We applied a penalized regression approach to single-nucleotide polymorphisms in regions on chromosomes 1, 6, and 9 of the North American Rheumatoid Arthritis Consortium data. Results were compared with a standard single-locus association test. Overall, the penalized regression approach did not appear to offer any advantage with respect to either detection or localization of disease-associated polymorphisms, compared with the single-locus approach.

  14. Collaborative regression.

    Science.gov (United States)

    Gross, Samuel M; Tibshirani, Robert

    2015-04-01

    We consider the scenario where one observes an outcome variable and sets of features from multiple assays, all measured on the same set of samples. One approach that has been proposed for dealing with this type of data is "sparse multiple canonical correlation analysis" (sparse mCCA). All of the current sparse mCCA techniques are biconvex and thus have no guarantees about reaching a global optimum. We propose a method for performing sparse supervised canonical correlation analysis (sparse sCCA), a specific case of sparse mCCA when one of the datasets is a vector. Our proposal for sparse sCCA is convex and thus does not face the same difficulties as the other methods. We derive efficient algorithms for this problem that can be implemented with off the shelf solvers, and illustrate their use on simulated and real data. PMID:25406332

  15. Is the Sexual Behaviour of HIV Patients on Antiretroviral therapy safe or risky in Sub-Saharan Africa? Meta-Analysis and Meta-Regression

    Directory of Open Access Journals (Sweden)

    Berhan Asres

    2012-05-01

    Full Text Available Abstract Background Reports on the sexual behavior of people on antiretroviral therapy (ART) are inconsistent. We selected 14 articles that compared the sexual behavior of people with and without ART for this analysis. Methods We included both cross-sectional studies that compared different ART-naïve and ART-experienced participants and longitudinal studies examining the behavior of the same individuals pre- and post-ART start. Meta-analyses were performed both stratified by type of study and combined. Outcome variables assessed for association with ART experience were any sexual activity, unprotected sex and having multiple sexual partners. Random-effect models were applied to determine the overall odds ratios. Sub-group analyses and meta-regression analyses were performed to examine sources of heterogeneity among the studies. Sensitivity analysis was also conducted to evaluate the stability of the overall odds ratio in the presence of outliers. Results The meta-analysis failed to show a statistically significant association of any sexual activity with ART experience. It did, however, show an overall statistically significant reduction of any unprotected sex, having multiple sexual partners and unprotected sex with HIV negative or unknown HIV status with ART experience. Meta-regression showed no interaction between duration of ART use or recall period of sexual behavior with the sexual activity variables. However, there was an association between the percentage of married or cohabiting participants included in a study and reductions in the practice of unprotected sex with ART. Conclusion In general, this meta-analysis demonstrated a significant reduction in risky sexual behavior among people on ART in sub-Saharan Africa. Future studies should investigate the reproducibility and continuity of the observed positive behavioural changes as the duration of ART lasts a decade or more.
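
    A random-effects pooling of odds ratios, as mentioned in this abstract, can be sketched as follows. The abstract does not say which estimator was used; the DerSimonian-Laird estimator below is an assumption chosen because it is a common default, and the 2x2 counts are hypothetical.

```python
import math

def random_effects_pooled_or(tables):
    """Pool odds ratios from 2x2 tables (a, b, c, d) with DerSimonian-Laird.

    a, b = events / non-events in group 1; c, d = events / non-events in group 2.
    Requires at least two studies and no zero cells.
    Returns (pooled_or, ci_low, ci_high, tau2).
    """
    y = [math.log((a * d) / (b * c)) for a, b, c, d in tables]  # log odds ratios
    v = [1 / a + 1 / b + 1 / c + 1 / d for a, b, c, d in tables]  # their variances
    w = [1 / vi for vi in v]                                     # fixed-effect weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))    # Cochran's Q
    c_term = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(tables) - 1)) / c_term)            # between-study variance
    w_star = [1 / (vi + tau2) for vi in v]                       # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return (math.exp(pooled), math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se), tau2)

# Hypothetical studies: (events on ART, non-events on ART, events off ART, non-events off ART).
studies = [(30, 70, 45, 55), (12, 88, 25, 75), (50, 150, 70, 130)]
print(random_effects_pooled_or(studies))
```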

  16. Statistical sex determination from craniometrics: Comparison of linear discriminant analysis, logistic regression, and support vector machines.

    Science.gov (United States)

    Santos, Frédéric; Guyomarc'h, Pierre; Bruzek, Jaroslav

    2014-10-13

    Accuracy of identification tools in forensic anthropology primarily relies upon the variations inherent in the data upon which they are built. Sex determination methods based on craniometrics are widely used and known to be specific to several factors (e.g. sample distribution, population, age, secular trends, measurement technique, etc.). The goal of this study is to discuss the potential variations linked to the statistical treatment of the data. Traditional craniometrics of four samples extracted from documented osteological collections (from Portugal, France, the U.S.A., and Thailand) were used to test three different classification methods: linear discriminant analysis (LDA), logistic regression (LR), and support vector machines (SVM). The Portuguese sample was set as a training model on which the other samples were applied in order to assess the validity and reliability of the different models. The tests were performed using different parameters: some included the selection of the best predictors; some included a strict decision threshold (sex assessed only if the related posterior probability was high, including the notion of indeterminate result); and some used an unbalanced sex-ratio. Results indicated that LR tends to perform slightly better than the other techniques and offers a better selection of predictors. Also, the use of a decision threshold (i.e. p>0.95) is essential to ensure an acceptable reliability of sex determination methods based on craniometrics. Although the Portuguese, French, and American samples share a similar sexual dimorphism, application of Western models on the Thai sample (that displayed a lower degree of dimorphism) was unsuccessful. PMID:25459272
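
    The strict decision threshold described in this abstract (a sex is assigned only when the posterior probability exceeds 0.95, otherwise the result is indeterminate) can be sketched with a logistic model as below. The coefficients and measurements are hypothetical placeholders, not values from the study.

```python
import math

def posterior_male(measurements, coefficients, intercept):
    """Logistic-regression posterior probability of 'male' for one skull."""
    z = intercept + sum(b * x for b, x in zip(coefficients, measurements))
    return 1.0 / (1.0 + math.exp(-z))

def assess_sex(p_male, threshold=0.95):
    """Assign a sex only if its posterior probability exceeds the threshold."""
    if p_male > threshold:
        return "male"
    if 1.0 - p_male > threshold:
        return "female"
    return "indeterminate"

# Hypothetical model: two craniometric variables (mm) with made-up coefficients.
p = posterior_male([182.0, 131.0], coefficients=[0.21, 0.18], intercept=-60.0)
print(round(p, 3), assess_sex(p))
```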

  17. Physicochemical factors associated with binding and retention of compounds in ocular melanin of rats: correlations using data from whole-body autoradiography and molecular modeling for multiple linear regression analyses.

    Science.gov (United States)

    Zane, P A; Brindle, S D; Gause, D O; O'Buck, A J; Raghavan, P R; Tripp, S L

    1990-09-01

    The relationship between the physicochemical characteristics of 27 new drug candidates and their distribution into the melanin-containing structure of the rat eye, the uveal tract, was examined. Tissue distribution data were obtained from whole-body autoradiograms of pigmented Long-Evans rats sacrificed at 5 min and 96 hr after dosing. The physicochemical parameters considered include molecular weight, pKa, degree of ionization, octanol/water partition coefficient (log Po/w), drug-melanin binding energy, and acid/base status of the functional groups within the molecule. Multiple linear regression analysis was used to describe the best model correlating physicochemical and/or biological characteristics of these compounds to their initial distribution at 5 min and to the retention of residual radioactivity in ocular melanin at 96 hr post-injection. The early distribution was a function primarily of acid/base status, pKa, binding energy, and log P(o/w), whereas uveal tract retention in rats was a function of volume of distribution (V1), log P(o/w), pKa, and binding energy. Further, there was a relationship between the initial distribution of a compound into the uveal tract and its retention 96 hr later. More specifically, the structures most likely to be distributed and ultimately retained at high concentrations were those containing strongly basic functionalities, such as piperidine or piperazine moieties and other amines. Further, the more lipophilic and, hence, widely distributed the basic compound, the greater the likelihood that it interacts with ocular melanin. In summary, the use of multiple linear regression analysis was useful in distinguishing which physicochemical characteristics of a compound or group of compounds contributed to melanin binding in pigmented rats in vivo. PMID:2235893

  18. Use of PLS regression in XRF spectroscopy: development of a quantitative analysis software and applications to XRF spectrometry

    International Nuclear Information System (INIS)

    Most of the applications and new generation of instruments in X ray Fluorescence spectroscopy require that the results of measurements be known as soon as possible. The present work aims to introduce an alternative solution allowing the direct conversion of spectra obtained by the XRF spectrometer to element concentrations. This solution is the use of PLS regression, which consists of establishing a regression relation [Y]=[X][?]+[E] between concentrations and spectra, where [E] is the residual of the regression. The principle is to determine the coefficient [?] by means of calibration with standard samples. This coefficient is then used to predict element concentrations in unknown samples by a simple matrix multiplication. In order to use this method, we have developed the X-PLS v1.0 software, which implements the PLS regression. This software performs all the tasks required for the calibration and also the prediction of concentrations in unknown samples. Besides, it gives the possibility to save the calibration results in a file so that they can be used later in adequate conditions. The results obtained during the application of this method in total reflection X ray spectrometry confirm that the PLS regression can be used as a quantification method in XRF spectroscopy. The graphical representation of the regression coefficient illustrates the existence of a regression relation between spectra and concentrations. We could also prove that satisfactory accuracy can be obtained for the prediction in good conditions. One of these conditions is that the concentration of the element of interest must change as much as possible in as many samples as possible. This condition was satisfied during the first set of measurements, in which only the elements Cu and Zn had the major proportions within 10 samples. Relative errors obtained for prediction were 5.24% for Cu and 7.16% for Zn. For the second set, we used 5 elements (Cu, Zn, As, Se, Ni) in 10 samples. The predictions had the following relative errors: Cu: 10.85%, Zn: 6.93%, As: 9.42%, Se: 8.25% and Ni: 14.90%.
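
    The calibrate-then-multiply workflow described in this abstract can be sketched with a single-response PLS (PLS1, NIPALS-style) fit. This is not the X-PLS v1.0 implementation, which the abstract does not describe; the function names, the name B for the coefficient vector, and the synthetic "spectra" are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1; return (B, b0) so that predictions are X @ B + b0."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean            # centered working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)              # weight vector
        t = Xk @ w                             # scores
        tt = t @ t
        p = Xk.T @ t / tt                      # X loadings
        c = yk @ t / tt                        # y loading
        Xk = Xk - np.outer(t, p)               # deflate X
        yk = yk - c * t                        # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)        # regression coefficients
    b0 = y_mean - x_mean @ B
    return B, b0

def pls1_predict(X, B, b0):
    """Prediction is a simple matrix multiplication plus an offset."""
    return X @ B + b0

# Synthetic calibration set: 12 standards, 3 spectral channels (hypothetical).
rng = np.random.default_rng(0)
spectra = rng.normal(size=(12, 3))
concentration = spectra @ np.array([1.0, 2.0, 3.0]) + 0.5
B, b0 = pls1_fit(spectra, concentration, n_components=3)
print(pls1_predict(spectra[:2], B, b0), concentration[:2])
```

    With as many components as predictors, PLS1 reproduces the ordinary least-squares fit; in practice fewer components are retained, typically chosen by cross-validation.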

  19. Trends in Multiple Criteria Decision Analysis

    CERN Document Server

    Ehrgott, Matthias; Greco, Salvatore

    2010-01-01

    Multiple Criteria Decision Making (MCDM) is the study of methods and procedures by which concerns about multiple conflicting criteria can be formally incorporated into the management planning process. A key area of research in OR/MS, MCDM is now being applied in many new areas, including GIS systems, AI, and group decision making. This volume is in effect the third in a series of Springer books by these editors (all in the ISOR series), and it brings all the latest developments in MCDM into focus. Looking at developments in the applications, methodologies and foundations of MCDM, it presents r

  20. Transferencia regional de información hidrológica mediante regresión lineal múltiple de tipo ridge / Regional transference of hydrologic information through multiple linear regression of ridge type

    Scientific Electronic Library Online (English)

    Daniel F., Campos-Aranda.

    2013-08-01

    Full Text Available SciELO Mexico | Language: Spanish Abstract in English When long annual records of runoff, rainfall or floods from a region with similar hydrological response are used to extend a short series through the statistical technique of multiple linear regression (MLR), it is likely that those records, because of their intrinsic similarity, will lead to a problem of multicollinearity. This problem should be detected and quantified to know whether it is acceptable, moderate, strong or serious, and to look for alternative solutions to the least-squares fitting method of the MLR. In this study multicollinearity was diagnosed through variance inflation factors and condition indices based on the eigenvalues. In addition, the biased fitting of the MLR, known as Ridge regression, is presented as an alternative method. A numerical application in the system of the Tempoal river, of Hydrological Region No. 26 (Pánuco, México), was described to complete the short record of annual runoff volumes of the Platón Sánchez hydrometric station, based on the other four gauging stations that have long records. It is concluded that the principal advantages of Ridge regression are the ease of handling transference with six or more regressors and the simplicity of its implementation and development by means of the Ridge trace.
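
    The diagnostics and the biased fit named in this abstract can be sketched as follows. This is only an illustration under stated assumptions: variance inflation factors are taken as the diagonal of the inverse correlation matrix, condition indices are computed from the eigenvalues of the correlation matrix (other scalings exist), the ridge estimator is written in correlation form with a centered response, and the station data are made up. NumPy is assumed to be available.

```python
import numpy as np

def multicollinearity_diagnostics(X):
    """Return (VIFs, condition indices) from the predictor correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    vif = np.diag(np.linalg.inv(R))          # variance inflation factors
    eig = np.linalg.eigvalsh(R)
    cond_idx = np.sqrt(eig.max() / eig)      # condition indices
    return vif, cond_idx

def ridge_coefficients(X, y, k):
    """Ridge estimate (R + k I)^-1 r_xy on standardized predictors."""
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    R = Xs.T @ Xs / n
    r_xy = Xs.T @ yc / n
    return np.linalg.solve(R + k * np.eye(p), r_xy)

def ridge_trace(X, y, ks):
    """Coefficients for each k; plotting them against k gives the ridge trace."""
    return np.array([ridge_coefficients(X, y, k) for k in ks])

# Hypothetical annual volumes at two nearly collinear auxiliary stations.
X = np.array([[10.0, 11.0], [12.0, 12.5], [15.0, 15.8], [9.0, 9.4], [20.0, 21.1]])
y = np.array([8.0, 9.5, 12.0, 7.1, 16.3])
print(multicollinearity_diagnostics(X))
print(ridge_trace(X, y, ks=[0.0, 0.05, 0.1, 0.5]))
```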

  1. Sensitivity Analysis in Bayesian Networks: From Single to Multiple Parameters

    OpenAIRE

    Chan, Hei; Darwiche, Adnan

    2012-01-01

    Previous work on sensitivity analysis in Bayesian networks has focused on single parameters, where the goal is to understand the sensitivity of queries to single parameter changes, and to identify single parameter changes that would enforce a certain query constraint. In this paper, we expand the work to multiple parameters which may be in the CPT of a single variable, or the CPTs of multiple variables. Not only do we identify the solution space of multiple parameter changes...

  2. On assessing operator response time in human reliability analysis (HRA) using a possibilistic fuzzy regression model

    International Nuclear Information System (INIS)

    Operator response times for off-normal events in nuclear power plants have been used to assess human failure probabilities. Usually, a log-normal distribution is applied to the relation of response times and failure probabilities. In the literature there have been several studies on estimating the response times. Since the response times are affected by the performance shaping factors, a regression model can be applied to estimate the response times. The conventional regression model regards the deviations between the observed and estimated values as measurement errors. However, the deviations of the dependent variable, the response times, can be dependent upon the fuzziness of the parameters for the independent variables, the performance shaping factors. In this research, possibilistic linear regression models have been used to assess fuzzy parameters and furthermore response times associated with the performance shaping factors.

  3. Solar radiation analysis and regression coefficients for the Vhembe Region, Limpopo Province, South Africa

    Scientific Electronic Library Online (English)

    Sophie T, Mulaudzi; Vaithianathaswami, Sankaran; Meena D, Lysko.

    Full Text Available SciELO South Africa | Language: English Abstract in english Given the limited observed and reliable data for solar irradiance in rural parts in South Africa, a correlation equation of the Angström-Prescott linear type has been used to estimate the regression coefficients in the Vhembe District, Limpopo Province, South Africa. Five stations were selected for the study, with the greatest distance between stations less than 180 km. Monthly regression coefficients were derived for each station based on an observation dataset of sunshine duration hours and global horizontal irradiance. The correlation coefficients appear to be above 0.9. The representative Angström-Prescott model for the Vhembe Region was found by collating the data for each station and then averaging the respective correlation coefficients. This paper presents the generated regression coefficients for each station and for the Vhembe Region.
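
    The Angström-Prescott linear relation, H/H0 = a + b (n/N), links the ratio of global to extraterrestrial irradiation to the fraction of possible sunshine hours. A minimal sketch of estimating a and b by least squares is given below; the function names and the monthly values are hypothetical and are not the coefficients reported for the Vhembe stations.

```python
def fit_angstrom_prescott(sunshine_fraction, clearness_index):
    """Least-squares (a, b) in H/H0 = a + b * (n/N)."""
    n = len(sunshine_fraction)
    mean_s = sum(sunshine_fraction) / n
    mean_k = sum(clearness_index) / n
    sxy = sum((s - mean_s) * (k - mean_k)
              for s, k in zip(sunshine_fraction, clearness_index))
    sxx = sum((s - mean_s) ** 2 for s in sunshine_fraction)
    b = sxy / sxx
    a = mean_k - b * mean_s
    return a, b

def estimate_irradiation(a, b, sunshine_hours, day_length, h0):
    """Estimate global irradiation H from sunshine duration and H0."""
    return h0 * (a + b * sunshine_hours / day_length)

# Hypothetical monthly means: n/N and H/H0 for one station.
s = [0.55, 0.62, 0.70, 0.74, 0.80]
k = [0.52, 0.56, 0.60, 0.61, 0.65]
a, b = fit_angstrom_prescott(s, k)
print(a, b, estimate_irradiation(a, b, sunshine_hours=8.0, day_length=12.0, h0=35.0))
```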

  4. Application of Robust Regression and Bootstrap in Productivity Analysis of GERD Variable in EU27

    Directory of Open Access Journals (Sweden)

    Dagmar Blatná

    2014-06-01

    Full Text Available The GERD is one of the Europe 2020 headline indicators being tracked within the Europe 2020 strategy. The headline indicator is the 3% target for the GERD to be reached within the EU by 2020. Eurostat defines “GERD” as total gross domestic expenditure on research and experimental development as a percentage of GDP. GERD depends on numerous factors of a general economic background, namely of employment, innovation and research, science and technology. The values of these indicators vary among the European countries, and consequently the occurrence of outliers can be anticipated in corresponding analyses. In such a case, a classical statistical approach – the least squares method – can be highly unreliable, and robust regression methods represent an acceptable and useful tool. The aim of the present paper is to demonstrate the advantages of robust regression and applicability of the bootstrap approach in regression based on both classical and robust methods.
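
    The contrast between least squares and a robust fit in the presence of an outlier, together with a bootstrap interval, can be sketched as below. The abstract does not name the robust estimators used; the Theil-Sen slope here is a stand-in chosen for brevity, the percentile bootstrap is one of several bootstrap variants, and the data are invented.

```python
import random
import statistics

def ols_slope(x, y):
    """Classical least-squares slope."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def theil_sen_slope(x, y):
    """Robust slope: median of all pairwise slopes (stand-in robust method)."""
    n = len(x)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(n) for j in range(i + 1, n) if x[j] != x[i]]
    return statistics.median(slopes)

def bootstrap_ci(x, y, estimator, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs, ys = [x[i] for i in idx], [y[i] for i in idx]
        if len(set(xs)) < 2:        # skip degenerate resamples (slope undefined)
            continue
        stats.append(estimator(xs, ys))
    stats.sort()
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[int((1 - alpha / 2) * n_boot) - 1]
    return low, high

# Invented data: y = 2x + 1 with one gross outlier at the last point.
x = list(range(10))
y = [2 * v + 1 for v in x]
y[9] = 100
print(ols_slope(x, y), theil_sen_slope(x, y))
print(bootstrap_ci(x, y, theil_sen_slope, n_boot=500))
```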

  5. Factor models and variable selection in high-dimensional regression analysis

    CERN Document Server

    Kneip, Alois; 10.1214/11-AOS905

    2012-01-01

    The paper considers linear regression problems where the number of predictor variables is possibly larger than the sample size. The basic motivation of the study is to combine the points of view of model selection and functional regression by using a factor approach: it is assumed that the predictor vector can be decomposed into a sum of two uncorrelated random components reflecting common factors and specific variabilities of the explanatory variables. It is shown that the traditional assumption of a sparse vector of parameters is restrictive in this context. Common factors may possess a significant influence on the response variable which cannot be captured by the specific effects of a small number of individual variables. We therefore propose to include principal components as additional explanatory variables in an augmented regression model. We give finite sample inequalities for estimates of these components. It is then shown that model selection procedures can be used to estimate the parameters of the a...

  6. Analysis of variance, coefficient of determination and $F$-test for local polynomial regression

    CERN Document Server

    Huang, Li-Shan

    2008-01-01

    This paper provides ANOVA inference for nonparametric local polynomial regression (LPR) in analogy with ANOVA tools for the classical linear regression model. A surprisingly simple and exact local ANOVA decomposition is established, and a local R-squared quantity is defined to measure the proportion of local variation explained by fitting LPR. A global ANOVA decomposition is obtained by integrating local counterparts, and a global R-squared and a symmetric projection matrix are defined. We show that the proposed projection matrix is asymptotically idempotent and asymptotically orthogonal to its complement, naturally leading to an $F$-test for testing for no effect. A by-product result is that the asymptotic bias of the ``projected'' response based on local linear regression is of quartic order of the bandwidth. Numerical results illustrate the behaviors of the proposed R-squared and $F$-test. The ANOVA methodology is also extended to varying coefficient models.

  7. JT-60 configuration parameters for feedback control determined by regression analysis

    International Nuclear Information System (INIS)

    The stepwise regression procedure was applied to obtain measurement formulas for equilibrium parameters used in the feedback control of JT-60. This procedure automatically selects variables necessary for the measurements, and selects a set of variables which are not likely to be picked up by physical considerations. Regression equations with stable and small multicollinearity were obtained and it was experimentally confirmed that the measurement formulas obtained through this procedure were accurate enough to be applicable to the feedback control of plasma configurations in JT-60. (author)

  8. Predicting tropical cyclone intensity using satellite measured equivalent blackbody temperatures of cloud tops. [regression analysis]

    Science.gov (United States)

    Gentry, R. C.; Rodgers, E.; Steranka, J.; Shenk, W. E.

    1978-01-01

    A regression technique was developed to forecast 24-hour changes of the maximum winds for weak (maximum winds less than or equal to 65 Kt) and strong (maximum winds greater than 65 Kt) tropical cyclones by utilizing satellite measured equivalent blackbody temperatures around the storm alone and together with the changes in maximum winds during the preceding 24 hours and the current maximum winds. Independent testing of these regression equations shows that the mean errors made by the equations are lower than the errors in forecasts made by the persistence techniques.

  9. Spontaneous regression of osteochondromas

    Energy Technology Data Exchange (ETDEWEB)

    Hoshi, Manabu; Takami, Masatsugu; Hashimoto, Ryouji; Okamoto, Takashi; Yanagida, Ikuhisa; Matsumura, Akira; Noguchi, Kazuko [Yodogawa Christian Hospital, Department of Orthopaedic Surgery, Osaka (Japan)

    2007-06-15

    Spontaneous regression of an osteochondroma is an infrequent event. In this report, two cases with spontaneous regression of osteochondromas are presented. The first case was a solitary osteochondroma of the pedunculated type involving the right proximal humerus in a 7-year-old boy. This lesion resolved over 15 months of observation. The second case was a 3-year-old girl with multiple osteochondromatosis, in whom sessile osteochondromas of the right tibia and left fibula regressed over 33 months. The mechanism of this phenomenon is discussed with a review of previous reports. Regarding treatment, careful observation may be acceptable for typical osteochondromas, especially in young children. (orig.)

  10. Multivariate analysis of micro-Raman spectra of thermoplastic polyurethane blends using principal component analysis and principal component regression.

    Science.gov (United States)

    Weakley, Andrew Todd; Warwick, P C Temple; Bitterwolf, Thomas E; Aston, D Eric

    2012-11-01

    Probing the specific hydrogen-bonding behavior of thermoplastic polyurethane (TPU) blends using vibrational spectroscopies remains the sine qua non for understanding the link between hydrogen-bonding and phase-segregation behavior. However, current literature holds to more traditional univariate approaches when studying the morphologically interesting normal molecular vibrations of TPUs. In the present study, multivariate analysis, including principal component analysis (PCA) and principal component regression (PCR), is used to scrutinize the relevant Raman bands acquired from a binary mixture of analogous TPU copolymer blends. Considering the near identical behavior of selected spectral regions, PCA was capable of isolating linear and nonlinear composition-dependent trends on PC-scores plots. From here, the PC scores, extracted from wavelengths comprising the carbonyl stretching region (1681-1764 cm⁻¹), CH₂ deformations (1380-1500 cm⁻¹), aromatic stretch from the hard segment (1617 cm⁻¹), and amide II mixed band (1540 cm⁻¹), were used to explicitly predict the mole fraction of hard segment present in each blend using PCR. Spectral preprocessing, wavelength selection, and variable scaling were major factors in PCR accurately predicting the weight fraction of each copolymer in spite of the clearly evident, blend-specific spectroscopic behavior. PMID:23146182

  11. Statistical analysis using multiple regression of stereological parameters for skeleton castings microstructure

    OpenAIRE

    Cholewa, M.; Dziuba-Kałuża, M.

    2011-01-01

    In this article the authors show the influence of technological parameters and modification treatment on the structural properties of closed skeleton castings. The approach achieved maximal refinement of structure and minimal structure diversification. Skeleton castings were manufactured in accordance with the elaborated production technology. Experimental castings were manufactured under variable technological conditions: pouring temperature range 953-1013 K, mould temperature 293-373 K and height...

  12. Factors Influencing New York Doctoral Graduate Student Satisfaction: A Quantitative Multiple Regression Analysis

    Science.gov (United States)

    Nwenyi, Sabina E.

    2013-01-01

    Higher education administrators face challenges in providing a welcoming environment for doctoral students in higher education institutions. Administrators need to identify factors influencing satisfaction of this group of students to provide a supportive environment, reduce attrition rates, and promote persistence. The purpose of this…

  13. A Multiple Regression Analysis on Influencing Factors of Urban Services Growth in China

    OpenAIRE

    Abdul, Razak Bin Chik; Yuan Gao

    2013-01-01

    An indicator of a city's success is the success of its urban services. Although much research on services has been done, there is a major gap with regard to regional services, especially urban services within a country. As for urban services, there is little research on the factors influencing them and their effect on regional growth. In reaction to this, the government intends to accelerate the development of urban services and the regional economy in the present Twelfth Five-Year Plan 201...

  14. Subchannel analysis of multiple CHF events

    International Nuclear Information System (INIS)

    The phenomenon of multiple CHF events in rod bundle heat transfer tests, referring to the occurrence of CHF on more than one rod or at more than one location on one rod is examined. The adequacy of some of the subchannel CHF correlations presently used in the nuclear industry in predicting higher order CHF events is ascertained based on local coolant conditions obtained with the COBRA IIIC subchannel code. The rod bundle CHF data obtained at the Heat Transfer Research Facility of Columbia University are examined for multiple CHF events using a combination of statistical analyses and parametric studies. The above analyses are applied to the study of three data sets of tests simulating both PWR and BWR reactor cores with uniform and non-uniform axial heat flux distributions. The CHF correlations employed in this study include: (1) CE-1 correlation, (2) B and W-2 correlation, (3) W-3 correlation, and (4) Columbia correlation

  15. Approaches to Data Analysis of Multiple-Choice Questions

    Science.gov (United States)

    Ding, Lin; Beichner, Robert

    2009-01-01

    This paper introduces five commonly used approaches to analyzing multiple-choice test data. They are classical test theory, factor analysis, cluster analysis, item response theory, and model analysis. Brief descriptions of the goals and algorithms of these approaches are provided, together with examples illustrating their applications in physics…

  16. Approaches to data analysis of multiple-choice questions

    OpenAIRE

    Lin Ding; Robert Beichner

    2009-01-01

    This paper introduces five commonly used approaches to analyzing multiple-choice test data. They are classical test theory, factor analysis, cluster analysis, item response theory, and model analysis. Brief descriptions of the goals and algorithms of these approaches are provided, together with examples illustrating their applications in physics education research. We minimize mathematics, instead placing emphasis on data interpretation using these approaches.

  17. Multiple Quantitative Trait Analysis Using Bayesian Networks

    OpenAIRE

    Scutari, Marco; Howell, Phil; Balding, David J.; Mackay, Ian

    2014-01-01

    Models for genome-wide prediction and association studies usually target a single phenotypic trait. However, in animal and plant genetics it is common to record information on multiple phenotypes for each individual that will be genotyped. Modeling traits individually disregards the fact that they are most likely associated due to pleiotropy and shared biological basis, thus providing only a partial, confounded view of genetic effects and phenotypic interactions. In this pap...

  18. Regression analysis of growth responses to water depth in three wetland plant species

    OpenAIRE

    Sorrell, Brian K.; Tanner, Chris C.; Brix, Hans

    2012-01-01

    Variability in plant flooding tolerance is often associated with differential growth responses to increasing water depth. This study highlights how morphological responses conferring flooding tolerance differ, using non-linear and quantile regression to quantitatively compare flooding-related growth responses of three species.

  19. Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

    Science.gov (United States)

    Ulbrich, Norbert Manfred

    2013-01-01

    A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.

  20. The Analysis of Nonstationary Time Series Using Regression, Correlation and Cointegration

    OpenAIRE

    Søren Johansen

    2012-01-01

    There are simple well-known conditions for the validity of regression and correlation as statistical tools. We analyse by examples the effect of nonstationarity on inference using these methods and compare them to model based inference using the cointegrated vector autoregressive model. Finally we analyse some monthly data from the US on interest rates as an illustration of the methods.

  1. Diagnosis of cranial hemangioma: Comparison between logistic regression analysis and neuronal network

    International Nuclear Information System (INIS)

    To study the utility of logistic regression and the neuronal network in the diagnosis of cranial hemangiomas. Fifteen patients presenting hemangiomas were selected from a total of 167 patients with cranial lesions. All were evaluated by plain radiography and computed tomography (CT). Nineteen variables in their medical records were reviewed. Logistic regression and neuronal network models were constructed and validated by the jackknife (leave-one-out) approach. The yields of the two models were compared by means of ROC curves, using the area under the curve as parameter. Seven men and 8 women presented hemangiomas. The mean age of these patients was 38.4 ± 15.4 years (mean ± standard deviation). Logistic regression identified as significant variables the shape, soft tissue mass and periosteal reaction. The neuronal network lent more importance to the existence of ossified matrix, ruptured cortical vein and the mixed calcified-blastic (trabeculated) pattern. The neuronal network showed a greater yield than logistic regression (Az, 0.9409 ± 0.004 versus 0.7211 ± 0.075; p<0.001). The neuronal network discloses hidden interactions among the variables, providing a higher yield in the characterization of cranial hemangiomas and constituting a medical diagnostic aid. (Author) 29 refs
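
    Comparing two classifiers by the area under the ROC curve can be sketched with the rank-based form of the AUC. The scores and labels below are invented for illustration and are unrelated to the study's data; `roc_auc` is a hypothetical helper name.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical out-of-sample scores, e.g. collected from leave-one-out runs.
labels = [1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
model_b = [0.9, 0.3, 0.4, 0.5, 0.6, 0.2, 0.1]
auc_a = roc_auc(model_a, labels)
auc_b = roc_auc(model_b, labels)
```

    In a leave-one-out setting, each score would come from a model fitted without the case being scored, so the AUC reflects out-of-sample discrimination.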

  2. Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis

    Science.gov (United States)

    Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John

    2012-01-01

    Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…

  3. On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis

    Science.gov (United States)

    Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas

    2011-01-01

    The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…

  4. Analysis of interactive fixed effects dynamic linear panel regression with measurement error

    OpenAIRE

    Lee, Nayoung; Moon, Hyungsik Roger; Weidner, Martin

    2011-01-01

    This paper studies a simple dynamic panel linear regression model with interactive fixed effects in which the variable of interest is measured with error. To estimate the dynamic coefficient, we consider the least-squares minimum distance (LS-MD) estimation method.

  5. Further Insight and Additional Inference Methods for Polynomial Regression Applied to the Analysis of Congruence

    Science.gov (United States)

    Cohen, Ayala; Nahum-Shani, Inbal; Doveh, Etti

    2010-01-01

    In their seminal paper, Edwards and Parry (1993) presented the polynomial regression as a better alternative to applying difference score in the study of congruence. Although this method is increasingly applied in congruence research, its complexity relative to other methods for assessing congruence (e.g., difference score methods) was one of the…

  6. Modelling and optimization of the surface roughness in the dry turning of the cold rolled alloyed steel using regression analysis

    OpenAIRE

    Dejan Tanikić; Velibor Marinković

    2012-01-01

    Surface quality of the machined parts is one of the most important product quality indicators and one of the most frequent customer requirements. The average surface roughness (Ra) represents a measure of the surface quality, and it is mostly influenced by the following cutting parameters: the cutting speed, the feed rate, and the depth of cut. Quantifying the relationship between surface roughness and cutting parameters is a very important task. In this study regression analysis was used for...

  7. Incorporating the effects of topographic amplification in the analysis of earthquake-induced landslide hazards using logistic regression

    OpenAIRE

    Lee, S. T.; Yu, T. T.; Peng, W. F.; Wang, C. L.

    2010-01-01

    Seismic-induced landslide hazards are studied using seismic shaking intensity based on the topographic amplification effect. The estimation of the topographic effect includes the theoretical topographic amplification factors and the corresponding amplified ground motion. Digital elevation models (DEM) with a 5-m grid space are used. The logistic regression model and the geographic information system (GIS) are used to perform the seismic landslide hazard analysis. The 99 Peaks area, located 3 ...

  8. Rotation and Noise Invariant Near-Infrared Face Recognition by means of Zernike Moments and Spectral Regression Discriminant Analysis.

    Czech Academy of Sciences Publication Activity Database

    Farokhi, S.; Shamsuddin, S. M.; Flusser, Jan; Sheikh, U. U.; Khansari, M.; Jafari-Khouzani, K.

    2013-01-01

    Vol. 22, No. 1 (2013), pp. 1-11. ISSN 1017-9909 R&D Projects: GA ČR GAP103/11/1552 Keywords: face recognition * infrared imaging * image moments Subject RIV: JD - Computer Applications, Robotics Impact factor: 0.850, year: 2013 http://library.utia.cas.cz/separaty/2013/ZOI/flusser-rotation and noise invariant near-infrared face recognition by means of zernike moments and spectral regression discriminant analysis.pdf

  9. Clinical parameters predicting failure of empirical antibacterial therapy in early onset neonatal sepsis, identified by classification and regression tree analysis

    OpenAIRE

    Merila Mirjam; Maipuu Lea; Parm Ülle; Ilmoja Mari-Liis; Pisarev Heti; Metsvaht Tuuli; Müürsepp Piia; Lutsar Irja

    2009-01-01

    Background: About 10-20% of neonates with suspected or proven early onset sepsis (EOS) fail on the empiric antibiotic regimen of ampicillin or penicillin and gentamicin. We aimed to identify clinical and laboratory markers associated with empiric antibiotic treatment failure in neonates with suspected EOS. Methods: Maternal and early neonatal characteristics predicting failure of empiric antibiotic treatment were identified by univariate logistic regression analysis from a prospective ...

  10. Griseofulvin/Carrier Blends: Application of Partial Least Squares (PLS) Regression Analysis for Estimating the Factors Affecting the Dissolution Efficiency

    OpenAIRE

    Cutrignelli, Annalisa; Trapani, Adriana; Lopedota, Angela; Franco, Massimo; Mandracchia, Delia; Denora, Nunzio; Laquintana, Valentino; Trapani, Giuseppe

    2011-01-01

    The main aim of the present study was to estimate the carrier characteristics affecting the dissolution efficiency of Griseofulvin (Gris) containing blends (BLs) using partial least squares (PLS) regression analysis. These systems were prepared at three different drug/carrier weight ratios (1/5, 1/10, and 1/20) by the solvent evaporation method, a well-established method for preparing solid dispersions (SDs). The carriers used were structurally different including polymers, a polyol, acids, b...

  11. Brief psychological therapies for anxiety and depression in primary care: meta-analysis and meta-regression

    OpenAIRE

    Cape John; Whittington Craig; Buszewicz Marta; Wallace Paul; Underwood Lisa

    2010-01-01

    Background: Psychological therapies provided in primary care are usually briefer than in secondary care. There has been no recent comprehensive review comparing their effectiveness for common mental health problems. We aimed to compare the effectiveness of different types of brief psychological therapy administered within primary care across and between anxiety, depressive and mixed disorders. Methods: Meta-analysis and meta-regression of randomized controlled trials of brief psycholog...

  12. DISCUSS: Regression and Correlation

    Science.gov (United States)

    Hunt, Neville

    This module introduces correlation and regression through topics like scatterplots, lines, slopes, intercepts, applications of regression analysis, the line of best fit, goodness of fit, assumptions and how to check them, prediction, interpolation, extrapolation, and reliability. Excel spreadsheets are used to provide examples and exercises.

  13. A local equation for differential diagnosis of β-thalassemia trait and iron deficiency anemia by logistic regression analysis in Southeast Iran.

    Science.gov (United States)

    Sargolzaie, Narjes; Miri-Moghaddam, Ebrahim

    2014-01-01

    The most common differential diagnosis of β-thalassemia (β-thal) trait is iron deficiency anemia. Several red blood cell equations have been introduced in different studies for the differential diagnosis of β-thal trait and iron deficiency anemia. Due to genetic variations in different regions, these equations cannot be useful in all populations. The aim of this study was to determine a native equation with high accuracy for differential diagnosis of β-thal trait and iron deficiency anemia for the Sistan and Baluchestan population by logistic regression analysis. We selected 77 iron deficiency anemia and 100 β-thal trait cases. We used binary logistic regression analysis and determined the best equation for predicting the probability of β-thal trait versus iron deficiency anemia in our population. We compared diagnostic values and receiver operating characteristic (ROC) curves for this equation and 10 other published equations in discriminating β-thal trait from iron deficiency anemia. The binary logistic regression analysis yielded an equation with an area under the curve (AUC) of 0.998. Based on ROC curves and AUC, the Green & King, England & Frazer, and Sirdah indices, in that order, were the most accurate after our equation. We suggest that to get the best equation and cut-off in each region, one needs to evaluate specific information of each region, specifically in areas where populations are homogeneous, to provide a specific formula for differentiating between β-thal trait and iron deficiency anemia. PMID:25155260
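
    A binary logistic regression that turns a single index into a predicted probability can be sketched as below. The predictor values, labels, learning rate and step count are hypothetical and do not reproduce the study's equation; plain gradient ascent is used here only to keep the example self-contained.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent
    on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += learning_rate * sum(errors) / n
        b1 += learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1

# Hypothetical standardized index values; label 1 marks one diagnosis, 0 the other.
xs = [-2.0, -1.5, -1.0, -0.5, 0.2, -0.3, 0.5, 1.0, 1.5, 2.0]
ys = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
b0, b1 = fit_logistic(xs, ys)
p_low = sigmoid(b0 + b1 * -2.0)   # predicted probability at a low index value
p_high = sigmoid(b0 + b1 * 2.0)   # predicted probability at a high index value
```

    A cut-off on the predicted probability then gives a classification rule, whose discrimination can be summarized by an ROC curve.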

  14. Analysis of radiocardiographic first pass activity versus time data using models of the central circulation and nonlinear regression analysis

    International Nuclear Information System (INIS)

    In this study mathematical models of the central circulation, containing as undetermined parameters both chamber volumes and crosstalk coefficients, relating region-of-interest count rates to activity not only in the corresponding chamber but also overlapping and contiguous anatomical chambers, were used to identify contaminating crosstalk contributions to the various time-activity curves of interest. The identification of these crosstalks was essential for the creation of decontaminated region-of-interest time-activity curves which could be used for further model analysis. The decontaminated curves represent what the region-of-interest time-activity curves would look like in the absence of crosstalks. An optimal sampling routine was added to the nonlinear regression least squares fit program so that the region-of-interest time-activity curves could be analyzed to determine which data points contributed most toward decreasing the standard error of each parameter. A biplane model was investigated for use in analyzing radionuclide angiocardiographic first pass data

  15. Regression analysis of dynamics of insecticide resistance in field populations of Chilo suppressalis (Lepidoptera: Crambidae) during 2002-2011 in China.

    Science.gov (United States)

    He, Yueping; Zhang, Juefeng; Gao, Congfen; Su, Jianya; Chen, Jianming; Shen, Jinliang

    2013-08-01

    To understand the evolution of insecticide resistance in the Asiatic rice borer, Chilo suppressalis (Walker) (Lepidoptera: Crambidae), in the field, regression analysis based on linear or nonlinear models was used to analyze the dynamics of resistance to six insecticides in two field populations, Lianyungang (LYG) and Ruian (RA), during 2002-2011. For the low-level resistance population (LYG), sustained susceptibility to abamectin and fipronil was seen for 10 yr; a polynomial curve regression model showed an increase in resistance to chlorpyrifos; exponential growth models fit the resistance dynamics for triazophos and deltamethrin, and a sigmoidal growth curve fit those for monosultap. For the high-level multiple resistance population (RA), a slight increase from susceptibility to minor resistance to abamectin could be modeled by a cubic polynomial equation; an exponential growth model fit the increase in resistance to fipronil from 8.7-fold to 33.6-fold; a sine waveform model fit the oscillating resistance to chlorpyrifos; the dynamics of resistance to triazophos could be modeled by two combined curves, a polynomial growth model and a sine waveform model; the high level of resistance to monosultap could be modeled with a sine waveform model; and a significant linear increase in resistance to deltamethrin over the years was found. Finally, the relationship between the development of insecticide resistance in field populations of C. suppressalis and the history of pesticide application for controlling rice borers is discussed. PMID:24020300
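
    Fitting an exponential growth model to a resistance time series can be sketched as below. This is an illustration under stated assumptions: the series is synthetic (only the starting value of 8.7 is borrowed from the abstract, the growth rate is invented), and the fit uses ordinary least squares on the logarithm of the response rather than the nonlinear regression a study of this kind would typically use.

```python
import math

def fit_exponential(ts, ys):
    """Fit y = a * exp(b * t) by least squares on log(y)."""
    logs = [math.log(y) for y in ys]
    n = len(ts)
    t_bar = sum(ts) / n
    l_bar = sum(logs) / n
    b = (sum((t - t_bar) * (l - l_bar) for t, l in zip(ts, logs))
         / sum((t - t_bar) ** 2 for t in ts))
    a = math.exp(l_bar - b * t_bar)
    return a, b

# Synthetic resistance ratios over ten survey years (hypothetical rate 0.15 per year).
years = list(range(10))
ratios = [8.7 * math.exp(0.15 * t) for t in years]
a, b = fit_exponential(years, ratios)
```

    The log transform changes the error weighting, so for noisy data the estimates can differ from those of a direct nonlinear least squares fit.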

  16. Ordered probit regression analysis of the effect of brand name on beer acceptance by consumers

    Scientific Electronic Library Online (English)

    Suzana Maria, Della Lucia; Valéria Paula Rodrigues, Minim; Carlos Henrique Osório, Silva; Luis Antonio, Minim; Paula De Aguiar, Cipriano.

    2013-09-01

    Full Text Available Ordered probit regression was used to analyze data of sensory acceptance tests designed to study the effect of brand name on the acceptability of beer samples. Eight different brands of Pilsen beer were evaluated by 101 consumers in two sessions of acceptance tests: blind evaluation and brand information test. Ordered probit regression, although a relatively sophisticated technique compared to others used to analyze sensory data, was chosen to enable the observation of consumers' behavior using graphical interpretations of estimated probabilities plotted against hedonic scales. It can be concluded that brands B, C, and D had a positive effect on the sensory acceptance of the product, whereas brands A, F, G, and H had a negative influence on consumers' evaluation of the samples. On the other hand, brand E had little influence on consumers' assessment.
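
    How an ordered probit model yields a probability for each point of a rating scale can be sketched as follows. The cutpoints, the five-category scale and the effect size of 0.8 are hypothetical values chosen for illustration; they are not estimates from the study.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(linear_predictor, cutpoints):
    """P(y = k) = Phi(c_k - eta) - Phi(c_(k-1) - eta),
    with c_0 = -infinity and c_K = +infinity."""
    cdf = [0.0] + [normal_cdf(c - linear_predictor) for c in cutpoints] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]

cutpoints = [-1.5, -0.5, 0.5, 1.5]                # hypothetical thresholds, 5 categories
baseline = ordered_probit_probs(0.0, cutpoints)   # reference condition
favoured = ordered_probit_probs(0.8, cutpoints)   # hypothetical positive brand effect
```

    A positive shift in the linear predictor moves probability mass toward the upper categories, which is the pattern a plot of estimated probabilities against the scale would show.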

  17. Does Financial Crisis Give Impacts on Bahrain Islamic Banking Performance? A Panel Regression Analysis

    OpenAIRE

    Sutan Emir Hidayat; Muhamad Abduh

    2012-01-01

    The 2007/2008 global financial crisis had a significant impact on the performance of the banking industry worldwide. The objective of this study is to examine the impact of the global financial crisis on the financial performance of the Islamic banking industry in Bahrain. Moreover, it also uses bank specific factors as predictors of Islamic bank performance in Bahrain. Panel regression is used to analyze the data. The result shows that LTA, LEQ, and LOHE are significant bank specific factors...

  18. The Analysis of Internet Addiction Scale Using Multivariate Adaptive Regression Splines

    OpenAIRE

    Kayri, M.

    2010-01-01

    Background: Determining the real effects on internet dependency with an unbiased and robust statistical method is crucial. MARS is a new non-parametric method used in the literature for parameter estimation in cause-and-effect research. MARS can both obtain legible model curves and make unbiased parametric predictions. Methods: In order to examine the performance of MARS, MARS findings will be compared to Classification and Regression Tree (C&RT) findings, ...

  19. Comparative analysis of neural network and regression based condition monitoring approaches for wind turbine fault detection

    OpenAIRE

    Schlechtingen, Meik; Santos, Ilmar

    2011-01-01

    This paper presents the research results of a comparison of three different model based approaches for wind turbine fault detection in online SCADA data, by applying developed models to five real measured faults and anomalies. The regression based model as the simplest approach to build a normal behavior model is compared to two artificial neural network based approaches, which are a full signal reconstruction and an autoregressive normal behavior model. Based on a real time series containing...
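
    A regression-based normal behavior model can be sketched as a line fitted on fault-free data, with new observations flagged when their residual exceeds a threshold. The signal names, the numbers and the three-times-RMS threshold are hypothetical choices for illustration, not the paper's setup.

```python
def fit_line(xs, ys):
    """Ordinary least squares line; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

def flag_anomalies(intercept, slope, xs, ys, threshold):
    """True where the observed value deviates from the model by more than threshold."""
    return [abs(y - (intercept + slope * x)) > threshold for x, y in zip(xs, ys)]

# Hypothetical healthy-period data: power output (kW) vs. component temperature (deg C).
train_power = [200, 400, 600, 800, 1000, 1200]
train_temp = [41.0, 44.2, 46.9, 50.1, 53.0, 55.8]
intercept, slope = fit_line(train_power, train_temp)

# Threshold: three times the root-mean-square training residual (an assumption).
train_resid = [y - (intercept + slope * x) for x, y in zip(train_power, train_temp)]
threshold = 3.0 * (sum(r * r for r in train_resid) / len(train_resid)) ** 0.5

# New observations; the last one runs hotter than the model expects.
flags = flag_anomalies(intercept, slope, [500, 900, 1100], [45.5, 51.5, 58.0], threshold)
```

    The same residual-monitoring logic applies when the line is replaced by a richer model such as a neural network.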

  20. The use of risk assessment to predict recurrent maltreatment: A classification and regression tree analysis (CART)

    OpenAIRE

    Sledjeski, Eve M.; Dierker, Lisa C.; Brigham, Rebecca; Breslin, Eileen

    2008-01-01

    Research has suggested that recurrent maltreatment may be best predicted by a combination of factors that vary across families. The present study set out to determine whether a pattern-centered analytic approach would better predict families at high risk for recurrence when compared to logistic regression methods. Archival data from substantiated investigations during 2003 were collected from a Connecticut Department of Children and Families county branch. Families (n=244) with a substantiate...