1

Multiple Linear Regressions Analysis

This online calculator allows users to enter sixteen observations with up to four dependent variables and calculates the regression equation, the fitted values, R-Squared, the F-Statistic, mean, variance, first order serial-correlation, second order serial-correlation, the Durbin-Watson statistic, and the mean absolute errors. It also tests normality and gives the i-th residuals.
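The quantities listed above can be illustrated with a minimal sketch. This is not the calculator's actual code, which the abstract does not show; the function names (`solve_linear_system`, `fit_ols`, `r_squared`, `durbin_watson`) are hypothetical, and the fit assumes ordinary least squares with an intercept, solved through the normal equations.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular or nearly so")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares fit of y on the predictor rows, with an intercept."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return beta, fitted, residuals


def r_squared(y, residuals):
    """Share of the variance of y explained by the fit."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum(e * e for e in residuals)
    return 1.0 - ss_res / ss_tot


def durbin_watson(residuals):
    """Durbin-Watson statistic: values near 2 suggest no first-order serial correlation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den
```

The normal equations keep the sketch dependency-free; for ill-conditioned data a QR-based solver from a numerical library would usually be the safer choice.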

Arsham, Hossein

2009-01-22

2

Multiple linear regression analysis

Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

Edwards, T. R.

1980-01-01

3

MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM

Directory of Open Access Journals (Sweden)

This paper analyzes the relationship between the dependent variable, GDP in the hotels and restaurants sector, and the following independent variables: overnight stays in tourist reception establishments, arrivals in tourist reception establishments, and investments in the hotels and restaurants sector, over the period 1995-2007. Using multiple regression analysis, I found that investments and tourist arrivals are significant predictors of the dependent variable, GDP. Based on these results, I identified the components of the marketing mix that, in my opinion, require investment and could contribute to the positive development of tourist arrivals in tourist reception establishments.

Erika KULCSÁR

2009-12-01

4

Multiple Regression Analysis Using ANCOVA in University Model

Directory of Open Access Journals (Sweden)

The government of the UAE is promoting Dubai as an academic hub. Dubai International Academic City (DIAC) is a free zone area with many national and international universities promoting higher education in almost all disciplines. The aspiration of every graduating university student is to get a good placement. In Dubai, diverse job opportunities are available in national and multinational organizations. The objective of the paper is to review the placement opportunities in Dubai for universities offering programs in engineering. This paper studies the effect of three independent variables, namely cumulative grade point average (CGPA), engineering discipline, and the type of job that graduating students are offered, on the dependent variable, salary. The engineering disciplines under study are Mechanical, Electronics and Communication, Computer Science, and Electrical and Electronics Engineering. The types of jobs taken into consideration are marketing, technical marketing, design, and logistics. The concepts of analysis of covariance (ANCOVA) and multiple regression are used to review placement opportunities vis-à-vis the salary structure.

Maneesha

2013-09-01

5

Online multiple instance regression

The multiple instance regression problem has recently become a hot research topic. There are several approaches to the problem, such as Salience, Citation KNN, and MI-ClusterRegress. All of these solutions work in batch mode during the training step. In practice, however, examples usually arrive in sequence, so the training step cannot be completed in a single pass. In this paper, an online multiple instance regression method, "OnlineMIR", is proposed. OnlineMIR can not only predict the label of a new bag, but also update the current regression model with the most recently arrived bag. The experimental results show that OnlineMIR achieves good performance on both synthetic and real data sets.

Wang, Zhi-Gang; Zhao, Zeng-Shun; Zhang, Chang-Shui

2013-09-01

6

An improved multiple linear regression and data analysis computer program package

NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

Sidik, S. M.

1972-01-01

7

Analysis of γ spectra in airborne radioactivity measurements using multiple linear regressions

International Nuclear Information System (INIS)

This paper describes the calculation of net peak counts for the nuclide 137Cs at 662 keV in γ spectra from airborne radioactivity measurements using multiple linear regression. A mathematical model is built by analyzing every factor that contributes to the Cs peak counts in the spectra, and a multiple linear regression function is established. The calculation uses stepwise regression, and insignificant factors are eliminated by an F test. The regression results and their uncertainties are calculated by least-squares estimation, from which the Cs net peak counts and their uncertainty are obtained. Analysis results for an experimental spectrum are presented, and the influence of energy shift and energy resolution on the results is discussed. In comparison with the spectrum-stripping method, the multiple linear regression method does not need stripping ratios, its result depends only on the counts in the Cs peak, and the calculation uncertainty is reduced. (authors)

8

In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580 and 600nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation for diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments with human skin of the human hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.

Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa

2011-08-01

9

Quantitative electron microscope autoradiography: application of multiple linear regression analysis

International Nuclear Information System (INIS)

A new method for the analysis of high resolution EM autoradiographs is described. It identifies labelled cell organelle profiles in sections on a strictly statistical basis and provides accurate estimates for their radioactivity without the need to make any assumptions about their size, shape and spatial arrangement. (author)

10

Multiple Regression and Canonical Variate Analysis Using Various Modes of Statistical Control.

Identification and explication of construct relationships, under conditions of extraneous variable control in multiple regression and its multivariate analog, canonical analysis, were studied. Several data models were generated as a function of the interaction of partial correlation and orthogonal linear transformations on nursing examination…

Brown, Ric; Carbonari, Joseph P.

11

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

Digital Repository Infrastructure Vision for European Research (DRIVER)

The growing volume of data creates a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, viz. factor analysis and multiple linear regression, for different sample si...

Chauhan, R. K.; Abhishek Taneja

2011-01-01

12

Multiple regression analysis of Jominy hardenability data for boron treated steels

International Nuclear Information System (INIS)

The relation between the chemical composition and the hardenability of boron-treated steels has been investigated using multiple regression analysis. A linear regression model was chosen. The free boron content that is effective for hardenability was calculated using a model proposed by Jansson. The regression analysis for 1261 steel heats provided equations that were statistically significant at the 95% level. All heats met the specification according to the Nordic countries producers' classification. The variation in chemical composition typically explained 80 to 90% of the variation in hardenability. Elements that did not contribute significantly to the calculated hardness according to the F test were eliminated from the regression analysis. Carbon, silicon, manganese, phosphorus and chromium were of importance at all Jominy distances; nickel, vanadium, boron and nitrogen were of importance at distances above 6 mm. After the regression analysis it was demonstrated that very few outliers, i.e. data points outside four times the standard deviation, were present in the data set. The model has been used successfully in industrial practice, replacing some of the necessary Jominy tests. (orig.)

13

A GIS-based landslide hazard assessment by multiple regression analysis

The occurrence of landslides generally depends on complex interactions among a large number of partially interrelated factors. It is appropriate to use multiple regression analysis for predicting landslides from a given set of independent variables. The procedure of landslide hazard assessment by regression analysis, however, requires evaluation of the spatially varying terrain conditions as well as spatial representation of the landslides. In this paper, the multiple regression analysis was applied to predict landslides in Himi district from independent factors, such as geology, slope-aspect, slope angle, land use and soil with Geographic Information System (GIS). Based on GIS, every factor was classified into several clusters and then the statistical weight of every cluster was assigned for every factor respectively. By the weights of five factors, the linear regression's coefficients of these input factors in landslide area were extracted and assigned to the whole region, and then the susceptibility for the potential landslide was obtained to make the landslide hazard assessment map. Geology and slope-aspect factors are the most important ones. Soil factor is not so notable in this research region, though it may be significant in other regions. At last, the average susceptibilities map for existing landslides was made for the engineers to do control work.

Pan, Xiaoduo; Nakamura, Hiroyuki; Tamotsu, Nozaki; Nan, Zhuotong

2007-06-01

14

One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary. PMID:24229385

Laurens, L M L; Wolfrum, E J

2013-12-18

15

REVAAM Model to determine a company's value by multiple valuation and linear regression analysis

Directory of Open Access Journals (Sweden)

This paper presents an alternative model to the widely used method of multiple valuation (or relative valuation) for calculating the value of a company using the Price Earnings (PE) multiple and/or the Enterprise Value to Earnings Before Interest, Taxes, Depreciation and Amortization (EV/EBITDA) multiple. When calculating multiples, analysts tend to take average multiples within an industry and apply them directly to the target company; however, we believe that this practice does not account for differences among the companies being compared, even though they belong to the same sector or industry. The REVAAM Model uses linear regression to calculate adjusted PE and EV/EBITDA multiples, taking into consideration profitability factors for each multiple in order to differentiate the companies in the samples. Calculations are based on public data for US companies, but could be extended to other markets. Not only does the REVAAM Model provide a better estimate for relative valuation analysis than simply using average multiples, but it could also be used to compare under- and overvalued companies or sectors, and to analyze changes in multiple values over time as the intrinsic fundamentals change.

Luis G. Acosta-Calzado

2010-07-01

16

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

Directory of Open Access Journals (Sweden)

The growing volume of data creates a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, viz. factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two techniques is compared on the following parameters: mean square error (MSE), R-square, adjusted R-square, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters have been computed using various tools such as SPSS, XLstat, Stata, and MS-Excel. For all the given datasets, factor analysis outperforms multiple linear regression. However, the absolute value of prediction accuracy varied between the three datasets, indicating that the data distribution and data characteristics play a major role in choosing the correct prediction technique.
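Several of the comparison parameters named in this abstract follow directly from the residuals of a fitted model. The sketch below is illustrative only and is not taken from the study: the function name `regression_metrics` is hypothetical, and it assumes MSE is the residual sum of squares divided by the number of observations (some packages divide by the residual degrees of freedom instead).

```python
import math


def regression_metrics(y_true, y_pred, n_predictors):
    """Return MSE, RMSE, R-square and adjusted R-square for one model."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in residuals)
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    # Assumption: MSE divides by n, not by the residual degrees of freedom.
    mse = ss_res / n
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R-square penalizes each additional predictor.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": r2, "adj_r2": adj_r2}
```

Because adjusted R-square accounts for the number of predictors, it is the fairer of the two R-square figures when the models being compared retain different numbers of variables.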

R.K.Chauhan

2011-04-01

17

International Nuclear Information System (INIS)

Various types of ultrasonic techniques have been used to estimate the compressive strength of concrete structures. However, the conventional ultrasonic velocity method, which uses only the longitudinal wave, cannot determine the compressive strength of concrete structures accurately. In this paper, multiple regression analysis is applied to the determination of the compressive strength of concrete structures by introducing multiple parameters, e.g. velocity of the shear wave, velocity of the longitudinal wave, attenuation coefficient of the shear wave, attenuation coefficient of the longitudinal wave, combination condition, age, and preservation method. The experimental results show that the velocity of the shear wave estimates the compressive strength of concrete more accurately than the velocity of the longitudinal wave, and that the estimation error for the compressive strength of concrete structures can be kept within approximately ± 10%.

18

Multiple Regression Assumptions. ERIC Digest.

This Digest presents a discussion of the assumptions of multiple regression that is tailored to the practicing researcher. The focus is on the assumptions of multiple regression that are not robust to violation, and that researchers can deal with if violated. Assumptions of normality, linearity, reliability of measurement, and homoscedasticity are…

Osborne, Jason W.; Waters, Elaine

19

ANALYSIS OF THE FINANCIAL PERFORMANCES OF THE FIRM, BY USING THE MULTIPLE REGRESSION MODEL

Digital Repository Infrastructure Vision for European Research (DRIVER)

The information obtained through simple linear regression is not always enough to characterize the evolution of an economic phenomenon or, furthermore, to identify its possible future evolution. To remedy these drawbacks, the specialized literature includes multiple regression models, in which the evolution of the dependent variable is defined in terms of two or more factorial variables.

Constantin Anghelache; Ioan Partachi

2011-01-01

20

Leaf pigments are key elements for plant photosynthesis and growth. Traditional manual sampling of these pigments is labor-intensive and costly, and it has difficulty capturing their temporal and spatial characteristics. The aim of this work is to estimate photosynthetic pigments at large scale by remote sensing. For this purpose, inverse models were proposed with the aid of stepwise multiple linear regression (SMLR) analysis. A leaf radiative transfer model (the PROSPECT model) was employed to simulate leaf reflectance at wavelengths from 400 to 780 nm at 1 nm intervals, and these values were treated as data from remote sensing observations. Simulated chlorophyll concentration (Cab), carotenoid concentration (Car) and their ratio (Cab/Car) were each taken as the target of a regression model. In this study, a total of 4000 samples were simulated via PROSPECT with different Cab, Car and leaf mesophyll structures; 70% of these samples were used for training and the remaining 30% for model validation. Reflectance (r) and its mathematical transformations (1/r and log(1/r)) were each used to build regression models. Results showed fair agreement between pigments and simulated reflectance, with all adjusted coefficients of determination (R2) larger than 0.8 when 6 wavebands were selected to build the SMLR model. The largest values of R2 for Cab, Car and Cab/Car were 0.8845, 0.876 and 0.8765, respectively. The mathematical transformations of reflectance had little influence on regression accuracy. We concluded that it is feasible to estimate chlorophyll, carotenoids and their ratio from leaf reflectance data using a statistical model.

Liu, Pudong; Shi, Runhe; Wang, Hong; Bai, Kaixu; Gao, Wei

2014-10-01

21

A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

The growing volume of data creates a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, viz. factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two techniques is compared on the following parameters: mean square error (MSE), R-square, adjusted R-square, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters have been computed using various tools such as SPSS, XLstat, Stata, and MS-Excel. For all the given datasets, factor analysis outperforms multiple linear re...

Taneja, Abhishek

2011-01-01

22

Directory of Open Access Journals (Sweden)

Risk is not always avoidable, but it is controllable. The aim of this study is to identify whether those techniques are effective in reducing software failure. This motivates the authors to continue the effort to enrich software project risk management by considering mining and quantitative approaches with a large data set. In this study, two new techniques are introduced to manage software risks, namely stepwise multiple regression analysis and fuzzy multiple regression. Two evaluation measures, MMRE and Pred(25), are used to compare the accuracy of the techniques. The model's accuracy is slightly better with stepwise multiple regression than with fuzzy multiple regression. This study will guide software managers in applying software risk management practices in real-world software development organizations and in verifying the effectiveness of the new techniques and approaches on a software project. The study was conducted on a group of software projects using a survey questionnaire. It is hoped that this will enable software managers to improve their decisions and increase the probability of software project success.
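The two evaluation measures named in this abstract have widely used definitions, sketched below. The function names are hypothetical and the code is not from the study; it assumes the usual conventions that the magnitude of relative error is |actual - predicted| / |actual|, that MMRE is its mean, and that Pred(25) is the fraction of cases with relative error of at most 25%.

```python
def relative_errors(actual, predicted):
    """Magnitude of relative error (MRE) for each observation."""
    # Assumes no actual value is zero, since MRE is undefined there.
    return [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]


def mmre(actual, predicted):
    """Mean magnitude of relative error; lower is better."""
    errors = relative_errors(actual, predicted)
    return sum(errors) / len(errors)


def pred(actual, predicted, level=0.25):
    """Fraction of predictions whose MRE is at most `level`; higher is better."""
    errors = relative_errors(actual, predicted)
    return sum(1 for e in errors if e <= level) / len(errors)
```

The two measures pull in opposite directions when read together: a more accurate model tends to have a lower MMRE and a higher Pred(25).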

Abdelrafe Elzamly

2014-01-01

23

A Multiple Regression Analysis on Influencing Factors of Urban Services Growth in China

Directory of Open Access Journals (Sweden)

The indicator of urban success is the success of its urban services. Although much research on services has been carried out, there is a major gap with regard to regional services, especially urban services within a country. There is little research on the factors influencing urban services and their effect on regional growth. In response, the government intends to accelerate the development of urban services and the regional economy in the present Twelfth Five-Year Plan (2011-2015). The main purpose of this paper is therefore to investigate the factors that influence urban services growth from the demand, supply, institutional environment and spatial agglomeration sides. Using cross-section multiple regression analysis, the study examines the factors influencing urban services growth in China. The model indicates that, except for urbanization and division of labor, the independent variables have contributed positively to urban services growth in China.

ABDUL Razak bin Chik

2013-01-01

24

Directory of Open Access Journals (Sweden)

Based on four controllable variables used in broiler nutrition, E (energy), P (protein), L (lysine) and M (methionine + cystine), multiple curvilinear regressions describing the evolution of body mass over the entire growth period have been derived mathematically. In this paper, we use these regressions to determine the average weekly gain in body mass. Using dispersional analysis, we test whether there are significant differences between the N.R.C. 1994 values and the values given by the regressions. Using the correlation ratio, we decide which of these regressions is optimal.

I. POPESCU

2013-12-01

25

Here I present an Excel based program for the analysis of intracellular Ca transients recorded using fluorescent indicators. The program can perform all the necessary steps which convert recorded raw voltage changes into meaningful physiological information. The program performs two fundamental processes. (1) It can prepare the raw signal by several methods. (2) It can then be used to analyze the prepared data to provide information such as absolute intracellular Ca levels. Also, the rates of change of Ca can be measured using multiple, simultaneous regression analysis. I demonstrate that this program performs equally well as commercially available software, but has numerous advantages, namely creating a simplified, self-contained analysis workflow. PMID:24125908

Greensmith, David J.

2014-01-01

26

Multiple Regression Analysis of Factors that May Influence Middle School Science Scores

The purpose of this quantitative multiple regression study was to determine whether a relationship existed between Maryland State Assessment (MSA) reading scores, MSA math scores, gender, ethnicity, age, and MSA science scores. Also examined was if MSA reading scores, MSA math scores, gender, ethnicity, and age can be used in combination or alone…

Glover, Judith

2012-01-01

27

A factor analysis-multiple regression model for source apportionment of suspended particulate matter

A factor analysis-multiple regression (FA-MR) model has been used for a source apportionment study in the Tokyo metropolitan area. By a varimax rotated factor analysis, five source types could be identified: refuse incineration, soil and automobile, secondary particles, sea salt and steel mill. Quantitative estimations using the FA-MR model corresponded to the calculated contributing concentrations determined by using a weighted least-squares CMB model. However, the source type of refuse incineration identified by the FA-MR model was similar to that of biomass burning, rather than that produced by an incineration plant. The estimated contributions of sea salt and steel mill by the FA-MR model contained those of other sources, which have the same temporal variation of contributing concentrations. This symptom was caused by a multicollinearity problem. Although this result shows the limitation of the multivariate receptor model, it gives useful information concerning source types and their distribution by comparing with the results of the CMB model. In the Tokyo metropolitan area, the contributions from soil (including road dust), automobile, secondary particles and refuse incineration (biomass burning) were larger than industrial contributions: fuel oil combustion and steel mill. However, since vanadium is highly correlated with SO₄²⁻ and other secondary particle related elements, a major portion of secondary particles is considered to be related to fuel oil combustion.
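The multicollinearity problem mentioned in this abstract arises when two source profiles vary together in time, so that a regression cannot separate their contributions. One common diagnostic, which the abstract does not say the authors used, is the variance inflation factor (VIF). The sketch below covers only the two-predictor case, where VIF reduces to 1 / (1 - r²) with r the Pearson correlation between the predictors; the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    # Grows without bound as |r| approaches 1 (perfectly collinear predictors).
    return 1.0 / (1.0 - r * r)
```

A VIF above roughly 10 is often treated as a warning sign, although that threshold is a rule of thumb rather than a formal test.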

Okamoto, Shin'ichi; Hayashi, Masayuki; Nakajima, Masaomi; Kainuma, Yasutaka; Shiozawa, Kiyoshige

28

International Nuclear Information System (INIS)

Polycyclic aromatic hydrocarbons (PAHs) are contaminants that reside mainly in surface soils. Dietary intake of plant-based foods can make a major contribution to total PAH exposure. Little information is available on the relationship between root morphology and plant uptake of PAHs. An understanding of plant root morphologic and compositional factors that affect root uptake of contaminants is important and can inform both agricultural (chemical contamination of crops) and engineering (phytoremediation) applications. Five crop plant species are grown hydroponically in solutions containing the PAH phenanthrene. Measurements are taken for 1) phenanthrene uptake; 2) root morphology: specific surface area, volume, surface area, tip number and total root length; and 3) root tissue composition: water, lipid, protein and carbohydrate content. These factors are compared through Pearson's correlation and multiple linear regression analysis. The major factors which promote phenanthrene uptake are specific surface area and lipid content.

Highlights:
- There is no correlation between phenanthrene uptake and total root length or water content.
- Specific surface area and lipid content are the most crucial factors for phenanthrene uptake.
- The contribution of specific surface area is greater than that of lipid content.

Of the two most important root morphological and compositional factors affecting phenanthrene uptake, specific surface area contributes more than lipid content.

29

Multiple regression analysis in modeling of columnar ozone in Peninsular Malaysia.

This study aimed to predict monthly columnar ozone (O3) in Peninsular Malaysia by using data on the concentration of environmental pollutants. Data (2003-2008) on five atmospheric pollutant gases (CO2, O3, CH4, NO2, and H2O vapor) retrieved from the satellite Scanning Imaging Absorption Spectrometer for Atmospheric Chartography (SCIAMACHY) were employed to develop a model that predicts columnar ozone through multiple linear regression. In the entire period, the pollutants were highly correlated (R = 0.811 for the southwest monsoon, R = 0.803 for the northeast monsoon) with predicted columnar ozone. The results of the validation of columnar ozone with column ozone from SCIAMACHY showed a high correlation coefficient (R = 0.752-0.802), indicating the model's accuracy and efficiency. Statistical analysis was utilized to determine the effects of each atmospheric pollutant on columnar ozone. A model that can retrieve columnar ozone in Peninsular Malaysia was developed to provide air quality information. These results are encouraging and accurate and can be used in early warning of the population to comply with air quality standards. PMID:24599658

Tan, K C; Lim, H S; Mat Jafri, M Z

2014-06-01

30

Penalized functional regression analysis of white-matter tract profiles in multiple sclerosis.

Diffusion tensor imaging (DTI) enables noninvasive parcellation of cerebral white matter into its component fiber bundles or tracts. These tracts often subserve specific functions, and damage to the tracts can therefore result in characteristic forms of disability. Attempts to quantify the extent of tract-specific damage have been limited in part by substantial spatial variation of imaging properties from one end of a tract to the other, variation that can be compounded by the effects of disease. Here, we develop a "penalized functional regression" procedure to analyze spatially normalized tract profiles, which powerfully characterize such spatial variation. The central idea is to identify and emphasize portions of a tract that are more relevant to a clinical outcome score, such as case status or degree of disability. The procedure also yields a "tract abnormality score" for each tract and MRI index studied. Importantly, the weighting function used in this procedure is constrained to be smooth, and the statistical associations are estimated using generalized linear models. We test the method on data from a cross-sectional MRI and functional study of 115 multiple-sclerosis cases and 42 healthy volunteers, considering a range of quantitative MRI indices, white-matter tracts, and clinical outcome scores, and using training and testing sets to validate the results. We show that attention to spatial variation yields up to 15% (mean across all tracts and MRI indices: 6.4%) improvement in the ability to discriminate multiple sclerosis cases from healthy volunteers. Our results confirm that comprehensive analysis of white-matter tract-specific imaging data improves with knowledge and characterization of the normal spatial variation. PMID:21554962

Goldsmith, Jeff; Crainiceanu, Ciprian M; Caffo, Brian S; Reich, Daniel S

2011-07-15

31

Investigations upon the indefinite rolls quality assurance in multiple regression analysis

International Nuclear Information System (INIS)

The quality of rolling-mill rolls has been enhanced mainly through improvements in the chemical composition of roll materials. Achieving an optimal chemical composition can be a technically efficient way to assure service properties, since the material from which rolling-mill rolls are manufactured is of great importance in this respect. This paper continues to present the scientific results of our experimental research on rolling-mill rolls. The basic research contains concrete elements of immediate practical use in metallurgical enterprises for improving roll quality, with the ultimate aim of increasing durability and safety in service. This paper presents an analysis of the chemical composition and its influence on the mechanical properties of indefinite cast iron rolls. We present some mathematical correlations and graphical interpretations between the hardness (on the working surface and on the necks) and the chemical composition. The double and triple correlations are helpful in foundry practice, as they allow us to determine variation boundaries for the chemical composition with a view to obtaining optimal hardness values. We suggest a mathematical interpretation of the influence of the chemical composition on the hardness of these indefinite rolls. To this end we use multiple regression analysis, which can be an important statistical tool for investigating relationships between variables. The results of the mathematical modeling can be described through a number of multi-component equations determined for spaces with 3 and 4 dimensions. The regression surfaces, level curves and volumes of variation can also be represented and interpreted by technologists, who may treat them as correlation diagrams between the analyzed variables. These research results can therefore be used by engineering teams in the foundry and rolling-mill sectors for the quality assurance of rolls from the production phase onward, as well as in service, which in turn leads to the quality assurance of the rolled products. (Author) 16 refs.

32

International Nuclear Information System (INIS)

The calculated >1-MeV pressure vessel fluence is used to determine the fracture toughness and integrity of the reactor pressure vessel. It is therefore of the utmost importance to ensure that the fluence prediction is accurate and unbiased. In practice, this assurance is provided by comparing the predictions of the calculational methodology with an extensive set of accurate benchmarks. A benchmarking database is used to provide an estimate of the overall average measurement-to-calculation (M/C) bias in the calculations (). This average is used as an ad-hoc multiplicative adjustment to the calculations to correct for the observed calculational bias. However, this average only provides a well-defined and valid adjustment of the fluence if the M/C data are homogeneous; i.e., the data are statistically independent and there is no correlation between subsets of M/C data. Typically, the identification of correlations between the errors in the database M/C values is difficult because the correlation is of the same magnitude as the random errors in the M/C data and varies substantially over the database. In this paper, an evaluation of a reactor dosimetry benchmark database is performed to determine the statistical validity of the adjustment to the calculated pressure vessel fluence. Physical mechanisms that could potentially introduce a correlation between the subsets of M/C ratios are identified and included in a multiple regression analysis of the M/C data. Rigorous statistical criteria are used to evaluate the homogeneity of the M/C data and determine the validity of the adjustment. For the database evaluated, the M/C data are found to be strongly correlated with dosimeter response threshold energy and dosimeter location (e.g., cavity versus in-vessel). 
It is shown that because of the inhomogeneity in the M/C data, for this database, the benchmark data do not provide a valid basis for adjusting the pressure vessel fluence. The statistical criteria and methods employed in this analysis are generic and may be applied in benchmarking applications where the M/C comparisons are used to determine an adjustment of the calculations.

33

Regression analysis by example

Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

Chatterjee, Samprit

2012-01-01

34

Directory of Open Access Journals (Sweden)

Full Text Available Multiple linear regression (MLR) was used to build a linear quantitative structure-property relationship (QSPR) model for the prediction of the molar diamagnetic susceptibility (χm) of 140 diverse organic compounds, using three significant descriptors calculated from the molecular structures alone and selected by the stepwise regression method. Stepwise regression was employed to develop a regression equation based on 100 training compounds, and predictive ability was tested on 40 compounds reserved for that purpose. The stability of the proposed model was validated using Leave-One-Out cross-validation and a randomization test. Application of the developed model to a testing set of 40 organic compounds demonstrates that the new model is reliable, with good predictive accuracy and a simple formulation. By applying the MLR method we can predict the test set (40 compounds) with a Q2ext of 0.9894 and an average root mean square error (RMSE) of 2.2550. The model applicability domain was always verified by the leverage approach in order to propose reliable predicted data. The prediction results are in good agreement with the experimental values.
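
The Leave-One-Out validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single descriptor instead of three, uses made-up data, and takes Q2 = 1 - PRESS / SS_tot with SS_tot computed around the overall mean, which is one common convention. The function names `fit_line` and `loo_q2` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_q2(xs, ys):
    """Leave-one-out Q2: refit without case i, then predict case i."""
    press = 0.0
    for i in range(len(xs)):
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Illustrative data only (one hypothetical descriptor, six compounds).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
observed = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
q2 = loo_q2(descriptor, observed)
print(f"LOO Q2 = {q2:.4f}")
```

A Q2 close to 1 means the predictions of held-out compounds are nearly as good as the fit itself; a Q2 far below the fitted R2 would suggest the model depends heavily on individual compounds.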

S. Saaidpour

2012-03-01

35

Error analysis of dimensionless scaling experiments with multiple points using linear regression

International Nuclear Information System (INIS)

A general method of error estimation in the case of multiple point dimensionless scaling experiments, using linear regression and standard error propagation, is proposed. The method reduces to the previous result of Cordey (2009 Nucl. Fusion 49 052001) in the case of a two-point scan. On the other hand, if the points follow a linear trend, it explains how the estimated error decreases as more points are added to the scan. Based on the analytical expression that is derived, it is argued that for a low number of points, adding points to the ends of the scanned range, rather than the middle, results in a smaller error estimate. (letter)
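
The claim about where to add scan points can be checked with the textbook standard error of a fitted slope, se = sigma / sqrt(sum((x_i - mean(x))^2)). The sketch below is only that textbook formula, assuming equal and independent errors `sigma` on every measurement; it is not the full error-propagation expression derived in the letter, and the scan positions are illustrative.

```python
import math


def slope_standard_error(xs, sigma):
    """Standard error of the OLS slope when every y value has the same error sigma."""
    mean_x = sum(xs) / len(xs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sigma / math.sqrt(sxx)


# Scanned range normalised to [0, 1]; sigma = 1 so results are in units of sigma.
two_point = slope_standard_error([0.0, 1.0], sigma=1.0)
middle_added = slope_standard_error([0.0, 0.5, 1.0], sigma=1.0)
end_added = slope_standard_error([0.0, 0.0, 1.0], sigma=1.0)

print(f"two-point scan:        {two_point:.4f}")
print(f"third point in middle: {middle_added:.4f}")
print(f"third point at an end: {end_added:.4f}")
```

Under these assumptions a third point at the centre of the range leaves the slope error unchanged, because it adds nothing to the spread of x about its mean, whereas a third point at either end reduces it.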

36

Directory of Open Access Journals (Sweden)

Full Text Available This article presents the possibility of using multiple regression analysis (MRA) and a dynamic neural network (DNN) for prediction of the stability of Hydrocortisone 100 mg (in the form of hydrocortisone sodium succinate freeze-dried powder for injection packed into a dual chamber container). Degradation products of hydrocortisone sodium succinate: free hydrocortisone and related substances (impurities A, B, C, D and E; unspecified impurities and total impurities) were followed during stress and formal stability studies. All data obtained during stability studies were used for in silico modeling, with multiple regression models as well as dynamic neural networks, in order to compare predicted and observed results. High values of the coefficient of determination (0.95-0.99) were obtained using MRA and DNN, so both methods are powerful tools for in silico stability studies, but superiority of DNN over mathematical modeling of degradation was also confirmed.

Vujić Zorica B.

2012-01-01

37

Directory of Open Access Journals (Sweden)

Full Text Available This study was carried out to detect nitrogen content in lettuce leaves rapidly and non-destructively using visible and near infrared (VIS-NIR) hyperspectral imaging technology. Principal Component Analysis (PCA) was performed on the average spectra to reduce the spectral dimensionality, and the principal components (PCs) were extracted as the input vectors of prediction models. Partial Least Square Regression (PLSR), Back Propagation Artificial Neural Network (BP-ANN), Extreme Learning Machine (ELM) and Support Vector Machine Regression (SVR) were each applied to relate the nitrogen content to the corresponding PCs to build the prediction models of nitrogen content. R2p of the PLSR model for nitrogen content was 0.91 and RMSEP was 0.32. The BP model of structure 5-2-1 with R2p of 0.92 and RMSEP of 0.21, the ELM model of structure 5-10-1 with R2p of 0.95 and RMSEP of 0.19, and the SVR model for nitrogen with R2p of 0.96 and RMSEP of 0.18 all achieved good prediction performance. Compared with the other three models, the SVR model has the best performance for predicting nitrogen content in lettuce leaves. This work demonstrated that the hyperspectral imaging technique coupled with PCA-SVR shows considerable promise for nondestructive detection of nitrogen content in lettuce leaves.

Sun Jun

2013-01-01

38

The bioavailability of rare earth elements (REEs) in soils was evaluated, based on the combination of chemical fractionation and multiple regression analysis. REEs in soils were partitioned by a sequential extraction procedure into water soluble (F(ws)), exchangeable (F(ec)), bound to carbonates (F(cb)), bound to organic matter (F(om)), bound to Fe-Mn oxides (F(fm)) and residual (F(rd)) fractions. Alfalfa (Medicago sativa Linn.) had been grown on the soils in a pot-culture experiment under greenhouse conditions for 35 days. The concentrations of REEs in the fractions and the plant were determined by inductively coupled plasma-mass spectrometry (ICP-MS). Chemical fractionation showed that the F(ws) fraction of REEs was less than 0.1% and that the residual fraction F(rd) was the dominant form, at more than 60% in soils. Bioaccumulation of REEs was observed in alfalfa. REE availability to the plant was evaluated by multiple regression analysis. The F(ws), F(ec), F(cb) and F(om) fractions were significantly correlated with REE uptake by alfalfa. But the exchangeable Pr(F(ec)) was significantly correlated with Pr concentration in alfalfa. F(ec), F(cb) and F(om) greatly contributed to La and Nd bioavailability; F(ec) and F(om) to Ce, Gd and Dy; F(ec) and F(cb) to Yb; and F(ws), F(ec) and F(om) to total REEs. This meant that the bioavailability of different species of REEs varied with the individual REE. The results of this study indicated that the sequential extraction procedure, in conjunction with multiple regression analysis, may be useful for the prediction of plant uptake of REEs from soils. PMID:10665441

Cao, X; Wang, X; Zhao, G

2000-01-01

39

Directory of Open Access Journals (Sweden)

Full Text Available The purpose of this study was to quantitatively evaluate Akahori's preoperative classification of cubital tunnel syndrome. We analyzed the results for 57 elbows that were treated by a simple decompression procedure from 1997 to 2004. The relationship between each item of Akahori's preoperative classification and clinical stage was investigated based on the parameter distribution. We evaluated Akahori's classification system using multiple regression analysis, and investigated the association between the stage and treatment results. The usefulness of the regression equation was evaluated by analysis of variance of the expected and observed scores. In the parameter distribution, each item of Akahori's classification was mostly associated with the stage, but it was difficult to judge the severity of palsy. In the mathematical evaluation, the most effective item in determining the stage was sensory conduction velocity. It was demonstrated that the established regression equation was highly reliable (R = 0.922). Akahori's preoperative classification can also be used in postoperative classification, and this classification was correlated with postoperative prognosis. Our results indicate that Akahori's preoperative classification is a suitable system. It is reliable, reproducible and well-correlated with the postoperative prognosis. In addition, the established prediction formula is useful to reduce the diagnostic complexity of Akahori's classification.

Nishida, Keiichiro

2013-02-01

40

Introduction to regression analysis

This book is an introduction to regression analysis for upper division and graduate students in science, engineering, social science and medicine. The emphasis is on the classical linear model using least squares estimation and inference. In addition, topics of current interest, such as regression diagnostics, ridge and logistic regression are treated as well. In contrast to other books at this level, the theoretical foundation of the subject is presented in some detail based on extensive use of matrix algebra. Throughout the text model building and evaluation are emphasised and illustrated wi

GOLBERG, M

2003-01-01

41

Reliability and Regression Analysis

This applet, by David M. Lane of Rice University, demonstrates how the reliability of X and Y affect various aspects of the regression of Y on X. Java 1.1 is required and a full set of instructions is given in order to get the full value from the applet. Exercises and definitions to key terms are also given to help students understand reliability and regression analysis.

Lane, David M.

2009-02-17

42

Multiple linear regression analysis was used to determine an equation for estimating hot corrosion attack for a series of Ni base cast turbine alloys. The U transform (i.e., 1/sin (% A/100) to the 1/2) was shown to give the best estimate of the dependent variable, y. A complete second degree equation is described for the "centered" weight chemistries for the elements Cr, Al, Ti, Mo, W, Cb, Ta, and Co. In addition, linear terms for the minor elements C, B, and Zr were added for a basic 47 term equation. The best reduced equation was determined by the stepwise selection method with essentially 13 terms. The Cr term was found to be the most important, accounting for 60 percent of the explained variability in hot corrosion attack.

Barrett, C. A.

1985-01-01

43

International Nuclear Information System (INIS)

A two layer perceptron with backpropagation of error is used for quantitative analysis in ICP-AES. The network was trained by emission spectra of two interfering lines of Cd and As and the concentrations of both elements were subsequently estimated from mixture spectra. The spectra of the Cd and As lines were also used to perform multiple linear regression (MLR) via the calculation of the pseudoinverse S+ of the sensitivity matrix S. In the present paper it is shown that there exist close relations between the operation of the perceptron and the MLR procedure. These are most clearly apparent in the correlation between the weights of the backpropagation network and the elements of the pseudoinverse. Using MLR, the confidence intervals over the predictions are exploited to correct for the wavelength shift of the optical device. (orig.)
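
The MLR step via the pseudoinverse can be sketched for the two-component case. The sketch assumes the mixture spectrum is modelled as m = S c, where each row of S holds the pure-component sensitivities of the two elements at one wavelength, so that c = S+ m with S+ = (S^T S)^-1 S^T. The orientation of S, the numbers and the function name are all illustrative assumptions, not taken from the paper.

```python
def pseudoinverse_two_columns(S):
    """Pseudoinverse (S^T S)^-1 S^T of a full-rank n x 2 matrix, returned as 2 x n."""
    a = sum(row[0] * row[0] for row in S)
    b = sum(row[0] * row[1] for row in S)
    d = sum(row[1] * row[1] for row in S)
    det = a * d - b * b
    if det == 0:
        raise ValueError("columns of S are linearly dependent")
    # (S^T S)^-1 = 1/det * [[d, -b], [-b, a]], applied to each row of S.
    return [
        [(d * row[0] - b * row[1]) / det for row in S],
        [(a * row[1] - b * row[0]) / det for row in S],
    ]


# Hypothetical sensitivities of two overlapping lines at four wavelengths.
S = [[1.0, 0.2], [0.8, 0.6], [0.3, 1.0], [0.1, 0.7]]
true_conc = [2.0, 3.0]
mixture = [row[0] * true_conc[0] + row[1] * true_conc[1] for row in S]

S_plus = pseudoinverse_two_columns(S)
estimated = [sum(w * m for w, m in zip(weights, mixture)) for weights in S_plus]
print(estimated)
```

Each row of `S_plus` is a fixed set of weights applied to the measured spectrum, which is the sense in which the abstract relates the pseudoinverse elements to the weights of a linear network.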

44

Diagnostics for multiple regression problems

Energy Technology Data Exchange (ETDEWEB)

In the last 10 to 15 years there has been much work done in trying to improve linear regression results. Individuals have analyzed the susceptibility of least-squares results to values far removed from the center of the independent variable observations. They have studied the problem of heavy-tailed residuals, and they have studied the problem of collinearity. From these studies have come ridge regression techniques, robust regression techniques, regression on principal components, etc. However, many practitioners view these methods with suspicion (and ignorance), and prefer to continue using the usual least-squares procedures to fit their models, even though their results might not be answering the question they think. In reaction to this, statisticians are spending more time analyzing how the individual observations affect the least squares results. In the last few years approximately 10 papers and one text have appeared that address the problem of how to study the influence of the individual observations. This report is a study of the recent work done in linear regression diagnostics. It is concerned with analyzing the effect of one case at a time, since the methods to analyze this situation are relatively straight-forward and are not prohibitive computationally.
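
A one-case-at-a-time diagnostic of the kind surveyed here can be sketched for simple linear regression: compute each observation's leverage, then delete each case in turn and record how much the fitted slope moves. The data are made up, and the function names are hypothetical; the leverage formula h_i = 1/n + (x_i - mean(x))^2 / Sxx is the standard one for a line with intercept.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leverages(xs):
    """Diagonal of the hat matrix for a straight-line fit with intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1.0 / n + (x - mean_x) ** 2 / sxx for x in xs]


def slope_change_on_deletion(xs, ys):
    """Change in the fitted slope when each case is deleted in turn."""
    _, full_slope = fit_line(xs, ys)
    changes = []
    for i in range(len(xs)):
        _, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        changes.append(slope - full_slope)
    return changes


# Illustrative data: the last case lies far from the centre of the x values.
xs = [1.0, 2.0, 3.0, 4.0, 10.0]
ys = [1.1, 1.9, 3.2, 3.9, 4.0]
h = leverages(xs)
delta = slope_change_on_deletion(xs, ys)
for i, (hi, di) in enumerate(zip(h, delta)):
    print(f"case {i}: leverage = {hi:.2f}, slope change if deleted = {di:+.3f}")
```

In this example the remote case has by far the largest leverage, and removing it changes the slope more than removing any other case, which is the kind of influence the usual least-squares summary does not reveal.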

Daly, J.C.

1982-03-01

45

Alang-Sosiya is the largest ship-scrapping yard in the world, established in 1982. Every year an average of 171 ships, with a mean weight of 2.10 × 10^6 (± 7.82 × 10^5) light dead weight tonnage (LDT), are scrapped. Apart from scrapped metals, this yard generates a massive amount of combustible solid waste in the form of waste wood, plastic, insulation material, paper, glass wool, thermocol pieces (polyurethane foam material), sponge, oiled rope, cotton waste, rubber, etc. In this study multiple regression analysis was used to develop predictive models for the energy content of combustible ship-scrapping solid wastes. The scope of work comprised qualitative and quantitative estimation of solid waste samples and performing a sequential selection procedure for isolating variables. Three regression models were developed to correlate the energy content (net calorific values (LHV)) with variables derived from material composition, proximate and ultimate analyses. The performance of these models for this particular waste agrees well with the equations developed by other researchers (Dulong, Steuer, Scheurer-Kestner and Bento's) for estimating the energy content of municipal solid waste. PMID:16009310

Reddy, M Srinivasa; Basha, Shaik; Joshi, H V; Sravan Kumar, V G; Jha, B; Ghosh, P K

2005-01-01

46

Directory of Open Access Journals (Sweden)

Full Text Available After much exertion and care to run an experiment in social science, the analysis of data should not be ruined by an improper analysis. Often, classical methods, like the mean, the usual simple and multiple linear regressions, and the ANOVA require normality and absence of outliers, which rarely occurs in data coming from experiments. To remedy this problem, researchers often use ad-hoc methods like the detection and deletion of outliers. In this tutorial, we will show the shortcomings of such an approach. In particular, we will show that outliers can sometimes be very difficult to detect and that the full inferential procedure is somewhat distorted by such a procedure. A more appropriate and modern approach is to use a robust procedure that provides estimation, inference and testing that are not influenced by outlying observations but correctly describe the structure of the bulk of the data. It can also give a diagnostic of the distance of any point or subject relative to the central tendency. Robust procedures can also be viewed as methods to check the appropriateness of the classical methods. To provide a step-by-step tutorial, we present descriptive analyses that allow researchers to make an initial check on the conditions of application of the data. Next, we compare classical and robust alternatives to ANOVA and regression and discuss their advantages and disadvantages. Finally, we present indices and plots that are based on the residuals of the analysis and can be used to determine if the conditions of application of the analyses are respected. Examples on data from psychological research illustrate each of these points and, for each analysis and plot, R code is provided to allow the readers to apply the techniques presented throughout the article.

Delphine S. Courvoisier

2010-01-01

47

Correlation Weights in Multiple Regression

A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps in developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting…
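
The comparison between correlation weights and OLS weights can be illustrated for two standardized predictors. The sketch assumes population correlations are known (the values are made up) and uses the standard result that a weighted composite w1*x1 + w2*x2 has squared correlation (w1*r1 + w2*r2)^2 / (w1^2 + w2^2 + 2*w1*w2*r12) with the criterion. The function names are hypothetical and the example is not taken from the paper.

```python
def composite_r2(w1, w2, r1, r2, r12):
    """Squared correlation between y and the composite w1*x1 + w2*x2."""
    numerator = (w1 * r1 + w2 * r2) ** 2
    denominator = w1 ** 2 + w2 ** 2 + 2.0 * w1 * w2 * r12
    return numerator / denominator


def ols_weights(r1, r2, r12):
    """Standardized OLS regression weights for two predictors."""
    b1 = (r1 - r2 * r12) / (1.0 - r12 ** 2)
    b2 = (r2 - r1 * r12) / (1.0 - r12 ** 2)
    return b1, b2


# Illustrative population correlations: r1 = corr(y, x1), r2 = corr(y, x2), r12 = corr(x1, x2).
r1, r2, r12 = 0.5, 0.3, 0.2

b1, b2 = ols_weights(r1, r2, r12)
r2_ols = composite_r2(b1, b2, r1, r2, r12)
r2_corr_weights = composite_r2(r1, r2, r1, r2, r12)

print(f"OLS weights:         R2 = {r2_ols:.4f}")
print(f"correlation weights: R2 = {r2_corr_weights:.4f}")
```

OLS weights maximize the squared correlation in the population, so correlation weights can only match or fall below them; in this example the loss is small, which is the kind of condition the abstract describes.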

Waller, Niels G.; Jones, Jeff A.

2010-01-01

48

Directory of Open Access Journals (Sweden)

Full Text Available In this article the authors show the influence of technological parameters and modification treatment on the structural properties of closed skeleton castings. The approach aimed at maximal refinement of the structure and minimal structural diversification. Skeleton castings were manufactured in accordance with the elaborated production technology. Experimental castings were manufactured under variable technological conditions: pouring temperature in the range 953 ÷ 1013 K, mould temperature 293 ÷ 373 K and height of the gating system above casting level 105 ÷ 175 mm. Analysis of metallographic specimens, quantitative analysis of silicon crystals and analysis of the secondary dendrite-arm spacing of the α solution were performed. Average values of stereological parameters were determined for all castings, as were the (B/L) and (P/A) factors. On the basis of the results of the microstructural analysis the authors compare the samples. The aim of the analysis was to select the samples with the least diversification of the degree of structure refinement and the least silicon crystals. On the basis of the microstructural analysis the authors state that sample 5 (AlSi11, Tpour 1013 K, Tmould 333 K, h = 265 mm) has the best structural properties (least diversification of the degree of structure refinement and the least refinement of silicon crystals). A statistical analysis of the structural analysis results was then carried out. On the basis of the statistical analysis the authors state that the best structural properties are obtained for the technological parameters Tpour = 1013 K, Tmould = 373 K and h = 230 mm [4]. The results of the statistical analysis are the prerequisite for optimization studies.

M. Cholewa

2011-07-01

49

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper joins the main properties of joint regression analysis (JRA), a model based on the Finlay-Wilkinson regression to analyse multi-environment trials, and of the additive main effects and multiplicative interaction (AMMI) model. The study compares JRA and AMMI with particular focus on robustness with increasing amounts of randomly selected missing data. The application is made using a data set from a breeding program of durum wheat (Triticum turgidum L., Durum Group) conducted in Port...

Paulo Canas Rodrigues; Dulce Gamito Santinhos Pereira; João Tiago Mexia

2011-01-01

50

On Investment Efficiency of China's Tourism Listed Companies Based on Multiple Regression Analysis

Directory of Open Access Journals (Sweden)

Full Text Available The paper investigates the conditions of efficient investment for China's tourism listed companies and examines how other factors affect the level of investment of these companies, in order to establish a basis for further study of the effect of executive compensation incentives on the investment efficiency of tourism listed companies. Fifteen tourism listed companies from 2002 to 2010 are selected as study samples. On the basis of an analysis of the literature, the paper builds a capital investment model for tourism listed companies with reference to the Richardson expected investment model and then uses it to process and analyze the data with SPSS 17.0 and EXCEL 2010. It is found that the mean residual of the capital investment model for the fifteen companies is -0.000 000 744, with the mean residuals of seven companies less than zero and those of eight companies greater than zero. The minimum and maximum of the mean residuals are -0.040 181 25 (Beijing Capital Tourism Co., Ltd) and 0.036 942 5 (Shenzhen Overseas Chinese Town Co., Ltd.), respectively. ROAi,t-1 (return on assets, p<0.10) and INVi,t-1 (scale of investment, p<0.01) have significant positive correlations with INVi,t, and Agei,t-1 (p<0.05) has a significant negative correlation with INVi,t. This suggests that the fifteen tourism listed companies from 2003 to 2010 under-invested on the whole, with seven companies under-investing and eight over-investing. In addition, the total return on assets and the level of investment of tourism listed companies significantly raise the company's level of investment in the following year, while the listing age significantly inhibits it.

WEI Wei

2013-09-01

51

Directory of Open Access Journals (Sweden)

Full Text Available The estimation of retention factors by correlation equations with physico-chemical properties can be of great help in chromatographic studies. The retention factors were experimentally measured by RP-HPTLC on silica gel impregnated with paraffin oil using two-component solvent systems. The relationships between solute retention and modifier concentration were described by Snyder's linear equation. A quantitative structure-retention relationship was developed for a series of s-triazine compounds by multiple linear regression (MLR) analysis. The MLR procedure was used to model the relationships between the molecular descriptors and the retention of s-triazine derivatives. The physicochemical molecular descriptors were calculated from the optimized structures. The physico-chemical properties were the lipophilicity (log P), connectivity indices (χ), total energy (Et), water solubility (log W), dissociation constant (pKa), molar refractivity (MR), and Gibbs energy (GibbsE) of s-triazines. A high agreement between the experimental and predicted retention parameters was obtained when the dissociation constant and the hydrophilic-lipophilic balance were used as the molecular descriptors. The empirical equations may be successfully used for the prediction of the various chromatographic characteristics of substances with a similar chemical structure. [Project of the Ministry of Science of the Republic of Serbia, No. 31055, No. 172012, No. 172013 and No. 172014]

Jevrić Lidija R.

2013-01-01

52

Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first used to decompose an original time series into several subseries with different scales. Then, principal component analysis (PCA) is used in processing the subseries data in MLR for crude oil price forecasting. Particle swarm optimization (PSO) is used to find the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily crude oil market, West Texas Intermediate (WTI), has been used as the case study. The time series prediction performance of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting of the crude oil price series. PMID:24895666

Shabri, Ani; Samsudin, Ruhaidah

2014-01-01

53

We develop a new method for estimating the biochemistry of plant material using spectroscopy. Normalized band depths calculated from the continuum-removed reflectance spectra of dried and ground leaves were used to estimate their concentrations of nitrogen, lignin, and cellulose. Stepwise multiple linear regression was used to select wavelengths in the broad absorption features centered at 1.73 μm, 2.10 μm, and 2.30 μm that were highly correlated with the chemistry of samples from eastern U.S. forests. Band depths of absorption features at these wavelengths were found to also be highly correlated with the chemistry of four other sites. A subset of data from the eastern U.S. forest sites was used to derive linear equations that were applied to the remaining data to successfully estimate their nitrogen, lignin, and cellulose concentrations. Correlations were highest for nitrogen (R2 from 0.75 to 0.94). The consistent results indicate the possibility of establishing a single equation capable of estimating the chemical concentrations in a wide variety of species from the reflectance spectra of dried leaves. The extension of this method to remote sensing was investigated. The effects of leaf water content, sensor signal-to-noise and bandpass, atmospheric effects, and background soil exposure were examined. Leaf water was found to be the greatest challenge to extending this empirical method to the analysis of fresh whole leaves and complete vegetation canopies. The influence of leaf water on reflectance spectra must be removed to within 10%. Other effects were reduced by continuum removal and normalization of band depths. 
If the effects of leaf water can be compensated for, it might be possible to extend this method to remote sensing data acquired by imaging spectrometers to give estimates of nitrogen, lignin, and cellulose concentrations over large areas for use in ecosystem studies.

Kokaly, R. F.; Clark, R. N.

1999-01-01

54

Enhance-Synergism and Suppression Effects in Multiple Regression

Relations between pairwise correlations and the coefficient of multiple determination in regression analysis are considered. The conditions for the occurrence of enhance-synergism and suppression effects when multiple determination becomes bigger than the total of squared correlations of the dependent variable with the regressors are discussed. It…
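
For two regressors the effect described here can be worked out directly from the pairwise correlations, using the standard identity R2 = (r1^2 + r2^2 - 2*r1*r2*r12) / (1 - r12^2), where r1 and r2 are the correlations of the dependent variable with each regressor and r12 is the correlation between the regressors. The sketch below uses made-up correlation values and a hypothetical function name; it only illustrates when R2 exceeds r1^2 + r2^2.

```python
def r_squared_two_regressors(r1, r2, r12):
    """Coefficient of multiple determination for two regressors."""
    return (r1 ** 2 + r2 ** 2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 ** 2)


cases = {
    # name: (r1, r2, r12), all values illustrative
    "redundancy": (0.4, 0.4, 0.3),
    "enhancement": (0.4, 0.4, -0.3),
    "classical suppression": (0.5, 0.0, 0.5),
}

results = {}
for name, (r1, r2, r12) in cases.items():
    r2_multiple = r_squared_two_regressors(r1, r2, r12)
    sum_of_squares = r1 ** 2 + r2 ** 2
    results[name] = (r2_multiple, sum_of_squares)
    print(f"{name}: R2 = {r2_multiple:.4f}, r1^2 + r2^2 = {sum_of_squares:.4f}")
```

In the redundancy case the regressors overlap and R2 falls below the sum of squared correlations; in the other two cases R2 exceeds it, even though in the suppression case the second regressor is uncorrelated with the dependent variable.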

Lipovetsky, Stan; Conklin, W. Michael

2004-01-01

55

Energy Technology Data Exchange (ETDEWEB)

Artificial neural network analysis is found to be far superior to multiple regression when applied to the evaluation of trap quality in the Northern Kuqa Depression, a gas-rich depression of Tarim Basin in western China. This is because this technique can correlate the complex and non-linear relationship between trap quality and related geological factors, whereas multiple regression can only describe a linear relationship. However, multiple regression can work as an auxiliary tool, as it is suited to high-speed calculations and can indicate the degree of dependence between the trap quality and its related geological factors which artificial neural network analysis cannot. For illustration, we have investigated 30 traps in the Northern Kuqa Depression. For each of the traps, the values of 14 selected geological factors were all known. While geologists were also able to assign individual trap quality values to 27 traps, they were less certain about the values for the other three traps. Multiple regression and artificial neural network analysis were, therefore, respectively used to ascertain these values. Data for the 27 traps were used as known sample data, while the three traps were used as prediction candidates. Predictions from artificial neural network analysis are found to agree with exploration results: where simulation predicted high trap quality, commercial quality flows were afterwards found, and where low trap quality is indicated, no such discoveries have yet been made. On the other hand, multiple regression results indicate the order of dependence of the trap quality on geological factors, which reconciles with what geologists have commonly recognized. We can conclude, therefore, that the application of artificial neural network analysis with the aid of multiple regression to trap evaluation in the Northern Kuqa Depression has been quite successful. 
To ensure the precision of the above mentioned geological factors and their related parameters for each trap, a study of the petroleum system in Kuqa Depression was conducted, which included the partitioning and mechanisms of the Kuqa petroleum system. Three migration models are presented. (author)

Guangren Shi; Xingxi Zhou; Guangya Zhang; Xiaofeng Shi; Honghui Li [Research Institute of Petroleum Exploration and Development, Beijing (China)]

2004-03-01

56

Multiple Regression Analyses in Clinical Child and Adolescent Psychology

A major form of data analysis in clinical child and adolescent psychology is multiple regression. This article reviews issues in the application of such methods in light of the research designs typical of this field. Issues addressed include controlling covariates, evaluation of predictor relevance, comparing predictors, analysis of moderation,…

Jaccard, James; Guilamo-Ramos, Vincent; Johansson, Margaret; Bouris, Alida

2006-01-01

57

Finite mixture models have come to play a very prominent role in modelling data. The finite mixture model is predicated on the assumption that distinct latent groups exist in the population. The finite mixture model therefore is based on a categorical latent variable that distinguishes the different groups. Often in practice distinct sub-populations do not actually exist. For example, disease severity (e.g. depression) may vary continuously and therefore, a distinction of diseased and not-diseased may not be based on the existence of distinct sub-populations. Thus, what is needed is a generalization of the finite mixture's discrete latent predictor to a continuous latent predictor. We cast the finite mixture model as a regression model with a latent Bernoulli predictor. A latent regression model is proposed by replacing the discrete Bernoulli predictor by a continuous latent predictor with a beta distribution. Motivation for the latent regression model arises from applications where distinct latent classes do not exist, but instead individuals vary according to a continuous latent variable. The shapes of the beta density are very flexible and can approximate the discrete Bernoulli distribution. Examples and a simulation are provided to illustrate the latent regression model. In particular, the latent regression model is used to model placebo effect among drug treated subjects in a depression study. PMID:20625443

Tarpey, Thaddeus; Petkova, Eva

2010-07-01

58

Directory of Open Access Journals (Sweden)

Full Text Available In this study we investigated the relationship between the antifungal activity of some benzimidazole derivatives and some absorption, distribution, metabolism and excretion (ADME) parameters. The antifungal activity of the studied compounds against Saccharomyces cerevisiae was expressed as the minimal inhibitory concentration (MIC). A statistically significant quantitative structure-activity relationship (QSAR) model for predicting antifungal activity of the investigated benzimidazole derivatives against Saccharomyces cerevisiae was obtained by multiple linear regression (MLR) using ADME parameters. The quality of the MLR model was validated by the leave-one-out (LOO) technique, as well as by the calculation of the statistical parameters for the developed model, and the results are discussed based on the statistical data. [Project of the Ministry of Science of the Republic of Serbia, No. 172012 and No. 172014]

Kalajdžija Nataša D.

2013-01-01

59

Canonical Analysis as a Generalized Regression Technique for Multivariate Analysis.

The use of characteristic coding (dummy coding) is made in showing solutions to four multivariate problems using canonical analysis. The canonical variates can be themselves analyzed by the use of multiple linear regression. When the canonical variates are used as criteria in a multiple linear regression, the R2 values are equal to 0, where 0 is…

Williams, John D.

60

Directory of Open Access Journals (Sweden)

Full Text Available The thermal inactivation of Enterococcus faecium under isothermal conditions in tryptic soy broth of different pH (4.0, 5.5 and 7.4) was studied. The bacterial cells were more sensitive at higher temperature and in media of low pH. Decimal reduction times at 71ºC were 2.56, 0.39 and 0.03 min at pH 7.4, 5.5 and 4.0, respectively. At all temperatures and pH values assayed, the survival curves obtained were linear. A mathematical model based on first-order kinetics accurately described these survival curves. The relationship between D_T values and temperature was also linear. A mean z-value of 5ºC was established. A multiple linear regression model using four predictor variables (pH, T, pH² and T²) related the log of the D_T value to pH and treatment temperature. The developed tertiary model satisfactorily predicted the heat inactivation of Enterococcus faecium under the treatment conditions investigated.

S. CONDON

2014-06-01
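As a rough illustration of the D- and z-value relationship described in the Enterococcus faecium entry above, the sketch below extrapolates a decimal reduction time to another temperature using the standard log-linear relation log10 D(T) = log10 D_ref - (T - T_ref)/z. The D value at 71ºC and pH 7.4 (2.56 min) and the mean z-value (5ºC) come from the abstract; the function names, the target temperature of 76ºC, and the assumption that the mean z-value applies at this pH are illustrative only and are not taken from the paper.

```python
def d_value(temp_c, d_ref_min, t_ref_c, z_c):
    """Decimal reduction time (min) at temp_c under a log-linear D-T relation."""
    return d_ref_min * 10 ** (-(temp_c - t_ref_c) / z_c)


def survivors_log10(log10_n0, time_min, d_min):
    """First-order kinetics: every D minutes removes one log10 cycle."""
    return log10_n0 - time_min / d_min


D_71 = 2.56  # min at 71 deg C, pH 7.4 (from the abstract)
Z = 5.0      # deg C, mean z-value (from the abstract)

# Hypothetical target temperature, chosen only for illustration.
d_76 = d_value(76.0, D_71, 71.0, Z)
print(f"Estimated D at 76 deg C: {d_76:.3f} min")
```

Raising the temperature by one z-value divides D by ten, which is what the exponent expresses.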

61

Omnibus Hypothesis Testing in Dominance-Based Ordinal Multiple Regression

Often quantitative data in the social sciences have only ordinal justification. Problems of interpretation can arise when least squares multiple regression (LSMR) is used with ordinal data. Two ordinal alternatives are discussed, dominance-based ordinal multiple regression (DOMR) and proportional odds multiple regression. The Q[superscript 2]…

Long, Jeffrey D.

2005-01-01

62

Scientific Electronic Library Online (English)

Full Text Available This paper joins the main properties of joint regression analysis (JRA), a model based on the Finlay-Wilkinson regression to analyse multi-environment trials, and of the additive main effects and multiplicative interaction (AMMI) model. The study compares JRA and AMMI with particular focus on robustness with increasing amounts of randomly selected missing data. The application is made using a data set from a breeding program of durum wheat (Triticum turgidum L., Durum Group) conducted in Portugal. The results of the two models result in similar dominant cultivars (JRA) and winner of mega-environments (AMMI) for the same environments. However, JRA had more stable results with the increase in the incidence rates of missing values.

Paulo Canas, Rodrigues; Dulce Gamito Santinhos, Pereira; João Tiago, Mexia.

2011-12-01

63

Directory of Open Access Journals (Sweden)

Full Text Available This paper joins the main properties of joint regression analysis (JRA), a model based on the Finlay-Wilkinson regression to analyse multi-environment trials, and of the additive main effects and multiplicative interaction (AMMI) model. The study compares JRA and AMMI with particular focus on robustness with increasing amounts of randomly selected missing data. The application is made using a data set from a breeding program of durum wheat (Triticum turgidum L., Durum Group) conducted in Portugal. The results of the two models result in similar dominant cultivars (JRA) and winner of mega-environments (AMMI) for the same environments. However, JRA had more stable results with the increase in the incidence rates of missing values.

Paulo Canas Rodrigues

2011-12-01

64

Four Assumptions of Multiple Regression That Researchers Should Always Test.

Discusses assumptions of multiple regression that are not robust to violation: linearity, reliability of measurement, homoscedasticity, and normality. Stresses the importance of checking assumptions. (SLD)

Osborne, Jason W.; Waters, Elaine

2002-01-01

65

Polylinear regression analysis in radiochemistry

International Nuclear Information System (INIS)

A number of radiochemical problems have been formulated in the framework of polylinear regression analysis, which permits the use of conventional mathematical methods for their solution. The authors have considered features of the use of polylinear regression analysis for estimating the contributions of various sources to the atmospheric pollution, for studying irradiated nuclear fuel, for estimating concentrations from spectral data, for measuring neutron fields of a nuclear reactor, for estimating crystal lattice parameters from X-ray diffraction patterns, for interpreting data of X-ray fluorescence analysis, for estimating complex formation constants, and for analyzing results of radiometric measurements. The problem of estimating the target parameters can be incorrect at certain properties of the system under study. The authors showed the possibility of regularization by adding a fictitious set of data "obtained" from the orthogonal design. To estimate only a part of the parameters under consideration, the authors used incomplete rank models. In this case, it is necessary to take into account the possibility of confounding estimates. An algorithm for evaluating the degree of confounding is presented which is realized using standard software or regression analysis.

66

Directory of Open Access Journals (Sweden)

Full Text Available The aim of this work is to establish a relationship between schistosomiasis prevalence and social-environmental variables, in the state of Minas Gerais, Brazil, through multiple linear regression. The final regression model was established, after a variables selection phase, with a set of spatial variables which contains the summer minimum temperature, human development index, and vegetation type variables. Based on this model, a schistosomiasis risk map was built for Minas Gerais.

Ricardo JPS Guimarães

2006-10-01

67

Synthesis analysis of regression models with a continuous outcome

Digital Repository Infrastructure Vision for European Research (DRIVER)

To estimate the multivariate regression model from multiple individual studies, it would be challenging to obtain results if the input from individual studies only provide univariate or incomplete multivariate regression information. Samsa et al. (J. Biomed. Biotechnol. 2005; 2:113–123) proposed a simple method to combine coefficients from univariate linear regression models into a multivariate linear regression model, a method known as synthesis analysis. However, the validity of this meth...

Zhou, Xiao-hua; Hu, Nan; Hu, Guizhou; Root, Martin

2009-01-01

68

International Nuclear Information System (INIS)

The problem of performing process capability analysis when autocorrelations are present is discussed. It is shown that when the systematic nonrandom phenomenon induced by autocorrelation is ignored, the variance estimate obtained from the original data is no longer an appropriate estimate for use in the process capability analyses. A remedial measure based on an autoregressive integrated moving average model is proposed. It is also shown that the process variance estimated from the residual analysis yields appropriate results for the process capability indices.

69

Retail sales forecasting with application the multiple regression

Directory of Open Access Journals (Sweden)

Full Text Available The article begins with a formulation for predictive learning called the multiple regression model. A theoretical approach to the construction of regression models is described. The key information of the article is the mathematical formulation of the linear forecast equation that estimates the multiple regression model. Calculation of the forecast value of the dependent variable under the influence of the independent variables is explained. This paper presents retail sales forecasting with multiple regression model estimation. Some of the most important decisions a retailer makes can be informed by multiple regression. Recently, the retail environment has been changing because of expected consumer income and advertising costs. Checks of the model's goodness of fit and statistical significance are explored in the article. Finally, the quantitative value of the retail sales forecast based on the multiple regression model is calculated.

Kuzhda, Tetyana

2012-05-01

70

Leaf chlorophyll content provides valuable information about physiological status of plants; it is directly linked to photosynthetic potential and primary production. In vitro assessment by wet chemical extraction is the standard method for leaf chlorophyll determination. This measurement is expensive, laborious, and time consuming. Over the years alternative methods, rapid and non-destructive, have been explored. The aim of this work was to evaluate the applicability of a fast and non-invasive field method for estimation of chlorophyll content in quinoa and amaranth leaves based on RGB components analysis of digital images acquired with a standard SLR camera. Digital images of leaves from different genotypes of quinoa and amaranth were acquired directly in the field. Mean values of each RGB component were evaluated via image analysis software and correlated to leaf chlorophyll provided by standard laboratory procedure. Single and multiple regression models using RGB color components as independent variables have been tested and validated. The performance of the proposed method was compared to that of the widely used non-destructive SPAD method. Sensitivity of the best regression models for different genotypes of quinoa and amaranth was also checked. Color data acquisition of the leaves in the field with a digital camera was quick, more effective, and lower cost than SPAD. The proposed RGB models provided better correlation (highest R²) and prediction (lowest RMSEP) of the true value of foliar chlorophyll content and had a lower amount of noise in the whole range of chlorophyll studied compared with SPAD and other leaf image processing based models when applied to quinoa and amaranth. PMID:24442792

Riccardi, M; Mele, G; Pulvento, C; Lavini, A; d'Andria, R; Jacobsen, S-E

2014-06-01

71

The use of multiple linear regression in property valuation

Directory of Open Access Journals (Sweden)

Full Text Available Property appraisal is of great importance for a country and its economy. Nowadays, a successful land management system could not be imagined without the subsystem related to the market economy. Having information about land and its values offers broad possibilities for the market economy and strongly influences development of the real estate market. Special attention should be paid to mass appraisal methods and their use in developing the tax system and a framework for an appropriate property appraisal system. Multiple regression analysis is just one of the methods used for this purpose, and this article is focused on its characteristics and advantages in mass appraisal system development.

Marko Pejić

2013-05-01

72

A Constrained Linear Estimator for Multiple Regression

"Improper linear models" (see Dawes, Am. Psychol. 34:571-582, "1979"), such as equal weighting, have garnered interest as alternatives to standard regression models. We analyze the general circumstances under which these models perform well by recasting a class of "improper" linear models as "proper" statistical models with a single predictor. We…

Davis-Stober, Clintin P.; Dana, Jason; Budescu, David V.

2010-01-01

73

Estimation of transport airplane aerodynamics using multiple stepwise regression

This paper presents an application of multiple stepwise regression to the flight test data of a typical transport airplane. The flight test data was carefully preprocessed to eliminate aliasing, time skews and high frequency noise. The data consisted both of basic certification maneuvers, such as wind-up-turns and maneuvers suitable for parameter estimation, such as responses to elevator pulses and doublets. It is shown that the results of multiple stepwise regression techniques compare favorably with the results obtained from maximum likelihood estimation. Finally, it is concluded that multiple stepwise regression could be a fast economical way to estimate transport airplane aerodynamics.

Keskar, D. A.; Klein, V.; Batterson, J. G.

1985-01-01

74

A Comparison of Best Model Selection Criteria in Multiple Regression.

The all-possible subset approach is recommended as an alternative over stepwise methods for selecting the best set of predictor variables for multiple regression. Several criteria are available for selecting the best subset model. These are compared with the principal component regression (PCR) method to investigate their usefulness for subset…

Schumacker, Randall E.

75

Regression Commonality Analysis: A Technique for Quantitative Theory Building

When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…

Nimon, Kim; Reio, Thomas G., Jr.

2011-01-01

76

Using Cigarette Data for An Introduction to Multiple Regression

This article, created by Lauren McIntyre of North Carolina State University, describes a dataset containing information for twenty-five brands of domestic cigarettes. The information collected includes measurements of weight, tar, nicotine and carbon monoxide. The dataset can be used to illustrate multiple regression, outliers, and collinearity. Speaking to this, the author states: "The dataset is useful for introducing the ideas of multiple regression and provides examples of an outlier and a pair of collinear variables."

Mcintyre, Lauren

2009-02-12

77

International Nuclear Information System (INIS)

Various statistical techniques were used on five-year data from 1998-2002 of average humidity, rainfall, maximum and minimum temperatures, respectively. The relationships to regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit, to our polynomial regression analysis time series (PRATS). The correlation to multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, respectively, which yielded homoscedasticity. We also employed Bartlett's test for homogeneity of variances on five-year data of rainfall and humidity, respectively, which showed that the variances in the rainfall data were not homogeneous while those in the humidity data were homogeneous. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)
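The Breusch-Pagan check mentioned in the entry above can be sketched in a few lines. This is a minimal illustration of Koenker's studentized variant (LM = n·R² from regressing the squared residuals on the predictor) for a single predictor, not the authors' procedure; the data are synthetic and the helper names are hypothetical. With one predictor the statistic is compared with a chi-square distribution with 1 degree of freedom, whose upper tail equals erfc(sqrt(LM/2)).

```python
import math


def simple_ols(x, y):
    """Intercept and slope of y ~ 1 + x by ordinary least squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(x, y):
    """Coefficient of determination of the simple regression y ~ 1 + x."""
    a, b = simple_ols(x, y)
    my = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 0.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot


def breusch_pagan(x, y):
    """Studentized Breusch-Pagan LM statistic and p-value (1 predictor)."""
    a, b = simple_ols(x, y)
    sq_resid = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    lm = max(len(x) * r_squared(x, sq_resid), 0.0)
    p_value = math.erfc(math.sqrt(lm / 2.0))  # chi-square(1) upper tail
    return lm, p_value


# Synthetic data whose error spread grows with x (heteroscedastic by design).
x = list(range(1, 21))
y = [2 + 0.5 * xi + 0.3 * xi * (-1) ** xi for xi in x]
lm, p = breusch_pagan(x, y)
print(f"LM = {lm:.2f}, p = {p:.4f}")
```

A small p-value points to non-constant error variance; a large one is consistent with the homoscedasticity the abstract reports.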

78

In the first part of this work [1] a field operational test (FOT) on micro-HEVs (hybrid electric vehicles) and conventional vehicles was introduced. Valve-regulated lead-acid (VRLA) batteries in absorbent glass mat (AGM) technology and flooded batteries were applied. The FOT data were analyzed by kernel density estimation. In this publication multiple regression analysis is applied to the same data. Square regression models without interdependencies are used. Hereby, capacity loss serves as dependent parameter and several battery-related and vehicle-related parameters as independent variables. Battery temperature is found to be the most critical parameter. It is proven that flooded batteries operated in the conventional power system (CPS) degrade faster than VRLA-AGM batteries in the micro-hybrid power system (MHPS). A smaller number of FOT batteries were applied in a vehicle-assigned test design where the test battery is repeatedly mounted in a unique test vehicle. Thus, vehicle category and specific driving profiles can be taken into account in multiple regression. Both parameters have only secondary influence on battery degradation, instead, extended vehicle rest time linked to low mileage performance is more serious. A tear-down analysis was accomplished for selected VRLA-AGM batteries operated in the MHPS. Clear indications are found that pSoC-operation with periodically fully charging the battery (refresh charging) does not result in sulphation of the negative electrode. Instead, the batteries show corrosion of the positive grids and weak adhesion of the positive active mass.

Schaeck, S.; Karspeck, T.; Ott, C.; Weirather-Koestner, D.; Stoermer, A. O.

2011-03-01

79

Vehicle Travel Time Predication based on Multiple Kernel Regression

Directory of Open Access Journals (Sweden)

Full Text Available With the rapid development of transportation and the logistics economy, vehicle travel time prediction and planning have become an important topic in logistics. Travel time prediction, which is indispensable for traffic guidance, has become a key issue for researchers in this field. At present, the prediction of travel time is mainly short-term prediction, and the prediction methods include artificial neural networks, the Kalman filter and the support vector regression (SVR) method, etc. However, these algorithms still have some shortcomings, such as high computational complexity, slow convergence rate, etc. This paper exploits the learning ability of multiple kernel learning regression (MKLR) in nonlinear prediction and applies MKLR-based logistics planning to vehicle travel time prediction. The method for vehicle travel time prediction includes the following steps: (1) preprocessing historical data; (2) selecting an appropriate kernel function, training on the historical data and performing analysis; (3) predicting the vehicle travel time based on the trained model. The experimental results show that, in a comparison of different prediction methods, the vehicle travel time prediction method proposed in this paper achieves higher accuracy than other methods. It also illustrates the feasibility and effectiveness of the proposed prediction method.

Wenjing Xu

2014-07-01

80

Steganalysis of LSB Image Steganography using Multiple Regression and Auto Regressive (AR) Model

Directory of Open Access Journals (Sweden)

Full Text Available The staggering growth in communication technology and usage of public domain channels (i.e. the Internet) has greatly facilitated transfer of data. However, such open communication channels have greater vulnerability to security threats causing unauthorized information access. Traditionally, encryption is used to realize the communication security. However, important information is not protected once decoded. Steganography is the art and science of communicating in a way which hides the existence of the communication. Important information is firstly hidden in a host data, such as digital image, text, video or audio, etc., and then transmitted secretly to the receiver. Steganalysis is another important topic in information hiding which is the art of detecting the presence of steganography. In this paper a novel technique for the steganalysis of images has been presented. The proposed technique uses an auto-regressive model to detect the presence of the hidden messages, as well as to estimate the relative length of the embedded messages. Various auto-regressive parameters are used to classify cover images as well as stego images with the help of an SVM classifier. Multiple regression analysis of the cover carrier along with the stego carrier has been carried out in order to find out the existence of the negligible amount of the secret message. Experimental results demonstrate the effectiveness and accuracy of the proposed technique.

Souvik Bhattacharyya

2011-07-01

81

Functional linear regression via canonical analysis

We study regression models for the situation where both dependent and independent variables are square-integrable stochastic processes. Questions concerning the definition and existence of the corresponding functional linear regression models and some basic properties are explored for this situation. We derive a representation of the regression parameter function in terms of the canonical components of the processes involved. This representation establishes a connection between functional regression and functional canonical analysis and suggests alternative approaches for the implementation of functional linear regression analysis. A specific procedure for the estimation of the regression parameter function using canonical expansions is proposed and compared with an established functional principal component regression approach. As an example of an application, we present an analysis of mortality data for cohorts of medflies, obtained in experimental studies of aging and longevity.

He, Guozhong; Wang, Jane-Ling; Yang, Wenjing; 10.3150/09-BEJ228

2011-01-01

82

REPRESENTATIVE VARIABLES IN A MULTIPLE REGRESSION MODEL

Directory of Open Access Journals (Sweden)

Full Text Available There are presented econometric models developed for analysis of banking exclusion of the economic crisis. Access to public goods and services is a condition „sine qua non” for open and efficient society. Availability of banking and payment of the entire population without discrimination in our opinion should be the primary objective of public service policy.

Barbu Bogdan POPESCU

2013-02-01

83

Significant Tests of Coefficient Multiple Regressions by using Permutation Methods

Directory of Open Access Journals (Sweden)

Full Text Available Tests of significance of a single partial regression coefficient in a multiple regression model are often made in situations where the standard assumptions underlying the probability calculation (for example, the assumption of normality of the random error term) do not hold. When the random error term fails to fulfill some of these assumptions, one needs to resort to other nonparametric methods to carry out statistical inferences. Permutation methods are a branch of nonparametric methods. This study compared the empirical type one error of different permutation strategies that have been proposed for testing nullity of a partial regression coefficient in a multiple regression model, using simulation, and shows that the type one error of the Freedman and Lane strategy is lower than that of the other methods.

Ali Shadrokh

2011-01-01
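The Freedman and Lane strategy named in the entry above is usually described as permuting the residuals of the reduced model (the model without the predictor under test). The sketch below is a minimal illustration of that idea for a model with one predictor of interest x and one nuisance covariate z; it is not the simulation code of the paper. The data, function names, number of permutations and seed are all hypothetical. The partial t statistic is obtained by residualizing on z, which gives the same coefficient and residual sum of squares as the full regression y ~ 1 + x + z.

```python
import math
import random


def residualize(v, z):
    """Residuals of v after simple OLS regression on z (with intercept)."""
    n = len(v)
    mv, mz = sum(v) / n, sum(z) / n
    slope = (sum((a - mz) * (b - mv) for a, b in zip(z, v))
             / sum((a - mz) ** 2 for a in z))
    return [b - mv - slope * (a - mz) for a, b in zip(z, v)]


def partial_t(y, x, z):
    """t statistic for the coefficient of x in y ~ 1 + x + z."""
    rx, ry = residualize(x, z), residualize(y, z)
    sxx = sum(a * a for a in rx)
    b = sum(a * c for a, c in zip(rx, ry)) / sxx
    rss = sum((c - b * a) ** 2 for a, c in zip(rx, ry))
    se = math.sqrt(rss / (len(y) - 3) / sxx)
    return b / se


def freedman_lane(y, x, z, n_perm=999, seed=0):
    """Two-sided permutation p-value for the partial effect of x."""
    rng = random.Random(seed)
    t_obs = abs(partial_t(y, x, z))
    resid = residualize(y, z)                       # reduced model y ~ 1 + z
    fitted = [yi - ri for yi, ri in zip(y, resid)]
    hits = 0
    for _ in range(n_perm):
        shuffled = resid[:]
        rng.shuffle(shuffled)                       # permute reduced residuals
        y_star = [f + r for f, r in zip(fitted, shuffled)]
        if abs(partial_t(y_star, x, z)) >= t_obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic data with a strong partial effect of x (illustration only).
z = [float(i) for i in range(1, 13)]
x = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0]
noise = [0.2, -0.1, 0.3, -0.3, 0.1, -0.2, 0.0, 0.25, -0.15, 0.05, -0.25, 0.1]
y = [1 + 2 * xi + 0.5 * zi + e for xi, zi, e in zip(x, z, noise)]
p_value = freedman_lane(y, x, z)
print(f"Permutation p-value for x: {p_value:.3f}")
```

Counting the observed statistic among the permutations (the +1 in numerator and denominator) keeps the p-value strictly positive.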

84

Directory of Open Access Journals (Sweden)

Full Text Available In the last few decades, techniques such as Artificial Neural Networks and Fuzzy Inference Systems were used for developing predictive models to estimate the required parameters. Recently, Soft Computing techniques have been used as an alternative statistical tool. Determination of the nature of financial time series data is difficult, expensive, time consuming and involves complex tests. In this paper, we use the Multi Layer Perceptron and Radial Basis Functions of Artificial Neural Networks and the Adaptive Neuro Fuzzy Inference System for prediction of S% (Financial Stress percent) of financial time series data and compare them with the traditional statistical tool of Multiple Regression. The accuracies of the Artificial Neural Network and Adaptive Neuro Fuzzy Inference System techniques are evaluated as relatively similar. It is found that the Radial Basis Functions constructed exhibit higher performance than the Multi Layer Perceptron, Adaptive Neuro Fuzzy Inference System and Multiple Regression for predicting S%. The performance comparison shows that the Soft Computing paradigm is a promising tool for minimizing uncertainties in financial time series data. Further, Soft Computing also minimizes the potential inconsistency of correlations.

Arindam Chaudhuri

2012-09-01

85

Regression Analysis and the Sociological Imagination

Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.

De Maio, Fernando

2014-01-01

86

Modeling Oil Palm Yield Using Multiple Linear Regression and Robust M-regression

Directory of Open Access Journals (Sweden)

Full Text Available This study shows how a multiple linear regression model can be used to model palm oil yield. The methods are illustrated by examining the time series data of foliar nutrient compositions as one of the independent variables and fresh fruit bunch as the dependent variable. Other independent variables include the nutrient balance ratio and major nutrient composition. This modeling approach is capable of identifying the significant contribution of each independent variable in improving the modeling performance. We find that the quantile-quantile plot demonstrates the existence of outliers, and this directs us to use robust M-regression for removing the negative impact of outliers. Results show that robust regression in this case gives better results than conventional regression in modeling oil palm yield.

Azme Khamis

2006-01-01
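To illustrate why robust M-regression can help when outliers are present, as in the oil palm entry above, the sketch below fits a straight line with Huber weights by iteratively reweighted least squares and compares it with ordinary least squares. It is a minimal sketch under stated assumptions: the abstract does not say which M-estimator or scale estimate was used, so the Huber weight function, the tuning constant 1.345, the median-absolute-residual scale and the synthetic data are all choices made here for illustration.

```python
import statistics


def weighted_line(x, y, w):
    """Intercept and slope of a weighted least-squares line."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_fit(x, y, c=1.345, n_iter=50):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    w = [1.0] * len(x)
    a, b = weighted_line(x, y, w)
    for _ in range(n_iter):
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for normal errors.
        scale = statistics.median([abs(r) for r in resid]) / 0.6745
        if scale == 0:
            break
        # Huber weights: 1 inside the threshold, shrinking outside it.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in resid]
        a, b = weighted_line(x, y, w)
    return a, b


# Synthetic data: true line y = 1 + 2x with one gross outlier at the end.
x = [float(i) for i in range(10)]
y = [1 + 2 * xi + 0.1 * (-1) ** i for i, xi in enumerate(x)]
y[-1] = 100.0
a_ols, b_ols = weighted_line(x, y, [1.0] * len(x))
a_hub, b_hub = huber_fit(x, y)
print(f"OLS slope {b_ols:.2f}, Huber slope {b_hub:.2f}")
```

The outlier pulls the OLS slope far from 2, while the Huber fit progressively downweights it.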

87

AN EFFECTIVE TECHNIQUE OF MULTIPLE IMPUTATION IN NONPARAMETRIC QUANTILE REGRESSION

Directory of Open Access Journals (Sweden)

Full Text Available In this study, we consider the nonparametric quantile regression model with the covariates Missing at Random (MAR). Multiple imputation is becoming an increasingly popular approach for analyzing missing data, which combined with quantile regression is not well-developed. We propose an effective and accurate two-stage multiple imputation method for the model based on the quantile regression, which consists of initial imputation in the first stage and multiple imputation in the second stage. The estimation procedure makes full use of the entire dataset to achieve increased efficiency and we show the proposed two-stage multiple imputation estimator to be asymptotically normal. In a simulation study, we compare the performance of the proposed imputation estimator with the Complete Case (CC) estimator and other imputation estimators, e.g., the regression imputation estimator and the k-Nearest-Neighbor imputation estimator. We conclude that the proposed estimator is robust to the initial imputation and illustrates more desirable performance than other comparative methods. We also apply the proposed multiple imputation method to an AIDS clinical trial data set to show its practical application.

Yanan Hu

2014-01-01

88

Digital Repository Infrastructure Vision for European Research (DRIVER)

Both linear neural network and multiple linear regression models can be used for multi-factor analysis and forecasting, but the data of the multiple linear regression model are required to meet such conditions as independence and normality, while the data of the linear neural network are only required to have a linear relationship. This article uses the same set of data to establish respectively a linear neural network model and a multiple linear regression model, compares the abilities of fi...

Jianhua Wu; Jianhui Wu; Guoli Wang; Xiaohong Wang

2011-01-01

89

Use of analysis of covariance or multiple regression can lead to Type I errors in pretest-posttest nonequivalent control group designs. Analysis of the interaction term of a repeated model analysis of variance was shown to help identify areas of significant change. The standardized difference score, effect size, suggested meaningful differences.…

Karabinus, Robert A

1983-01-01

90

Directional Hypotheses with the Multiple Linear Regression Approach.

Two well known directional (one-tailed) tests of significance, mean difference and correlation coefficient, are presented within the multiple linear regression framework. Adjustments on the computed probability level are indicated. The case for a directional interaction research hypothesis is defended. Conservative adjustments on the computed…

McNeil, Keith A.; Beggs, Donald L.

91

Constituents with linear radiance gradients with concentration may be quantified from signals which contain nonlinear atmospheric and surface reflection effects for both homogeneous and non-homogeneous water bodies provided accurate data can be obtained and nonlinearities are constant with wavelength. Statistical parameters must be used which give an indication of bias as well as total squared error to insure that an equation with an optimum combination of bands is selected. It is concluded that the effect of error in upwelled radiance measurements is to reduce the accuracy of the least square fitting process and to increase the number of points required to obtain a satisfactory fit. The problem of obtaining a multiple regression equation that is extremely sensitive to error is discussed.

Whitlock, C. H., III

1977-01-01

92

On relationship between regression models and interpretation of multiple regression coefficients

In this paper, we consider the problem of interpreting the coefficients of a linear regression equation in the case of correlated predictors. It is shown that in general there is no natural way of interpreting these coefficients similar to the case of a single predictor. Nevertheless, we suggest linear transformations of the predictors that reduce multiple regression to simple regression while retaining the coefficient of the variable of interest. The new variable can be treated as the part of the old variable that has no linear statistical dependence on the other variables in the model.

Varaksin, A N

2012-01-01
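The transformation described in the abstract above can be illustrated by residualization. The following is a minimal sketch with two predictors and made-up data. The helper names `solve` and `ols` are hypothetical, and the construction (regress the predictor of interest on the other predictor and keep the residual) is one standard way to obtain a variable with no linear dependence on the others, not necessarily the exact transformation used in the paper.

```python
def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination
    with partial pivoting (assumes A is non-singular)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Made-up data: x1 and x2 are correlated predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 8.2, 8.8, 13.1, 13.9]

# Coefficient of x1 in the multiple regression y ~ 1 + x1 + x2.
b_full = ols([x1, x2], y)

# Part of x1 that has no linear dependence on x2 (residual of x1 ~ 1 + x2).
a0, a1 = ols([x2], x1)
r = [u - a0 - a1 * v for u, v in zip(x1, x2)]

# Simple regression slope of y on the residualized variable.
slope = sum(ri * yi for ri, yi in zip(r, y)) / sum(ri * ri for ri in r)

print(b_full[1], slope)
```

The two printed values agree up to rounding error, which is why the coefficient of x1 can be read as the effect of the part of x1 not linearly explained by x2.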

93

Linear regression analysis theory and computing

This volume presents in detail the fundamental theories of linear regression analysis and diagnosis, as well as the relevant statistical computing techniques so that readers are able to actually model the data using the methods and techniques described in the book. It covers the fundamental theories in linear regression analysis and is extremely useful for future research in this area. The examples of regression analysis using the Statistical Application System (SAS) are also included. This book is suitable for graduate students who are either majoring in statistics/biostatistics or using line

Yan, Xin

2009-01-01

94

Weighted regression analysis for comparing varietal adaptation.

The normally used joint linear regression analysis (OLS) is not appropriate for comparing estimates of stability parameters of varieties when the error variances of site means are heterogeneous. Weighted regression analysis (WLS), in these situations, yields more precise estimates of stability parameters. A comparison of the two analytical methods using the grain yield (kg ha⁻¹) data of 12 varieties and one hybrid of pearl millet [Pennisetum typhoides (Burm.) S. & H.], tested at 26 sites in India, revealed that the weighted regression analysis yields more efficient estimates of regression coefficients (b_i) than the ordinary regression analysis, and that the standard errors of b_i values were reduced by up to 43%. The estimated b_i differed with the two procedures. Not only was the number of varieties with b_i significantly deviating from unity greater with weighted regression analysis (five varieties) than with ordinary regression analysis (one variety), but the classification of varieties as possessing general or specific adaptation also differed with the two procedures. PMID:24221324

Virk, D S; Virk, P S; Mangat, B K; Harinarayana, G

1991-04-01
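The contrast between ordinary and weighted regression in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the variety yield is regressed on a site mean, each site is weighted by the inverse of its error variance, and all numbers are made up. The function name `weighted_line` and the data are hypothetical and do not come from the paper.

```python
def weighted_line(x, y, w):
    """Intercept a and slope b minimizing sum_i w_i * (y_i - a - b * x_i)**2."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ym - b * xm, b


# Hypothetical data: site mean yield, one variety's yield, site error variance.
site_mean = [800.0, 1200.0, 1500.0, 2100.0, 2600.0]
variety_yield = [700.0, 1250.0, 1400.0, 2300.0, 3100.0]
error_variance = [50.0, 60.0, 400.0, 900.0, 2500.0]

# OLS treats every site equally; WLS down-weights the noisier sites.
ols_fit = weighted_line(site_mean, variety_yield, [1.0] * len(site_mean))
wls_fit = weighted_line(site_mean, variety_yield,
                        [1.0 / v for v in error_variance])

print("OLS slope:", ols_fit[1])
print("WLS slope:", wls_fit[1])
```

Setting all weights to 1 recovers the ordinary least-squares line, so a single routine covers both procedures being compared.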

95

Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…

Li, Spencer D.

2011-01-01

96

International Nuclear Information System (INIS)

The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy

97

Application of wavelet-based multiple linear regression model to rainfall forecasting in Australia

In this study, a wavelet-based multiple linear regression model is applied to forecast monthly rainfall in Australia by using monthly historical rainfall data and climate indices as inputs. The wavelet-based model is constructed by incorporating the multi-resolution analysis (MRA) with the discrete wavelet transform and multiple linear regression (MLR) model. The standardized monthly rainfall anomaly and large-scale climate index time series are decomposed using MRA into a certain number of component subseries at different temporal scales. The hierarchical lag relationship between the rainfall anomaly and each potential predictor is identified by cross correlation analysis with a lag time of at least one month at different temporal scales. The components of predictor variables with known lag times are then screened with a stepwise linear regression algorithm to be selectively included into the final forecast model. The MRA-based rainfall forecasting method is examined with 255 stations over Australia, and compared to the traditional multiple linear regression model based on the original time series. The models are trained with data from the 1959-1995 period and then tested in the 1996-2008 period for each station. The performance is compared with observed rainfall values, and evaluated by common statistics of relative absolute error and correlation coefficient. The results show that the wavelet-based regression model provides considerably more accurate monthly rainfall forecasts for all of the selected stations over Australia than the traditional regression model.

He, X.; Guan, H.; Zhang, X.; Simmons, C.

2013-12-01

98

Polylinear regression analysis in problems of radiochemistry

International Nuclear Information System (INIS)

Some radiochemical problems are formulated as problems of polylinear regression analysis, which makes it possible to solve them with standard software. The peculiarities of applying regression analysis are considered for estimating the contributions of different sources of atmospheric contamination, investigating irradiated nuclear fuel, estimating concentrations from spectroscopy data, measuring the neutron fields of a nuclear reactor, estimating crystal lattice parameters from diffraction patterns, interpreting X-ray fluorescence analysis data, estimating complex formation constants, and analyzing the results of radiometric measurements. Incomplete models are used to estimate only a part of the required parameters. In this case, possible mixing of the estimates should be taken into account. An algorithm for estimating this mixing, implemented with standard regression analysis programs, is presented. 9 refs

99

Introducing Evolutionary Computing in Regression Analysis

A typical upper level undergraduate or first year graduate level regression course syllabus treats model selection with various stepwise regression methods. Here we implement evolutionary computing for subset model selection and accomplish two goals: i) introduce students to the powerful optimization method of genetic algorithms, and ii) transform a regression analysis course into a regression and modeling course without requiring any additional time or software commitment. Furthermore, we employed the Akaike Information Criterion (AIC) as a measure of model fitness instead of the commonly used R-square. The model selection tool uses Excel, which makes the procedure accessible to a very wide spectrum of interdisciplinary students with no specialized software requirement. An Excel macro, to be used as an instructional tool, is freely available through the author's website.

Akman, Olcay
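To make the role of AIC as a fitness measure in the abstract above concrete, here is a minimal sketch that scores every predictor subset by AIC. It deliberately replaces the genetic algorithm with an exhaustive search, which is only feasible for a handful of predictors, and it uses the Gaussian form AIC = n ln(RSS/n) + 2k up to an additive constant. The data and the helper names `solve`, `rss` and `aic` are hypothetical; the article's own tool is an Excel macro.

```python
from itertools import combinations
from math import log


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (assumes A is non-singular)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def rss(columns, y):
    """Residual sum of squares of y on an intercept plus the given columns."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(c * v for c, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))


def aic(columns, y):
    """Gaussian AIC up to a constant; k counts the intercept and the slopes."""
    n, k = len(y), len(columns) + 1
    return n * log(rss(columns, y) / n) + 2 * k


# Made-up data: y depends on predictors 0 and 2 plus small noise.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0]
x3 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
noise = [0.2, -0.1, 0.0, 0.1, -0.2, 0.1, 0.0, -0.1]
y = [2 * a + 3 * c + e for a, c, e in zip(x1, x3, noise)]

predictors = [x1, x2, x3]
scores = {}
for size in range(len(predictors) + 1):
    for subset in combinations(range(len(predictors)), size):
        scores[subset] = aic([predictors[j] for j in subset], y)

best = min(scores, key=scores.get)  # lower AIC means a better model
print("best subset (predictor indices):", best)
```

A genetic algorithm would use the same `aic` value as its fitness function but explore the subsets by selection, crossover and mutation rather than enumerating all of them.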

100

Directory of Open Access Journals (Sweden)

Full Text Available In ordinary statistical methods, multiple outliers in a multiple linear regression model are detected sequentially, one after another, and smearing and masking effects give misleading results. If the potential multiple outliers can be detected simultaneously, smearing and masking effects can be avoided. Such multiple-case outlier detection is combinatorial in nature: 2^N-N-1 sets of possible outliers need to be tested, where N is the number of data points. This exhaustive search is practically impossible. In this paper, we have used a quantum-inspired evolutionary algorithm (QEA) for multiple-case outlier detection in the multiple linear regression model. A Bayesian information criterion based fitness function incorporating an extra penalty for the number of potential outliers has been used for identifying the most appropriate set of potential outliers. Experimental results with 10 widely referred datasets from the statistical literature show that the QEA overcomes the effect of smearing and masking and effectively detects the most appropriate set of outliers.

Salena Akter

2010-12-01
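The count 2^N-N-1 quoted in the abstract above can be checked directly. One reading, assumed here, is that it counts all subsets of the N data points except the empty set and the N single-point sets, that is, every candidate set with at least two members. The function names below are hypothetical.

```python
from math import comb


def candidate_outlier_sets(n):
    """Closed form from the abstract: 2^n - n - 1."""
    return 2 ** n - n - 1


def candidate_outlier_sets_by_size(n):
    """Same count obtained by summing over subset sizes k = 2..n."""
    return sum(comb(n, k) for k in range(2, n + 1))


# The count grows exponentially, which is why exhaustive search is impractical.
for n in (10, 20, 50):
    print(n, candidate_outlier_sets(n))
```

Already at N = 50 the number of candidate sets exceeds 10^15, which motivates a heuristic search such as the evolutionary algorithm used in the paper.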

101

A multiple regression model for the Ft. Calhoun reactor coolant pump system

International Nuclear Information System (INIS)

Multiple regression analysis is one of the most widely used of all statistical tools. In this research paper, we introduce an application of fitting a multiple regression model to reactor coolant pump (RCP) data. The primary purpose of this research is to correlate the results obtained by Design of Experiments (DOE) and regression model fitting. The idea behind using a regression model is also to gain more detailed information from the RCP data than is provided by DOE. In engineering science, statistical quality control techniques have traditionally been applied to control manufacturing processes. An application to commercial nuclear power plant maintenance and control is presented that can greatly improve plant safety and reliability. The results obtained show that six out of ten parameters are within control specification limits and four parameters are not in a state of statistical control. The four out-of-control parameters adversely affect the regression model fitting, and the final prediction equation therefore does not accurately predict future responses. The analysis concludes that, in order to fit the best regression model, one has to remove all out-of-control points from the data set, including dropping a variable from the model, to obtain better prediction of the response variable. (author)

102

Multiple Linear Regression for Extracting Phrase Translation Pairs

Directory of Open Access Journals (Sweden)

Full Text Available Phrase translation pairs are very useful for bilingual lexicography, machine translation system, cross-lingual information retrieval and many applications in natural language processing. Phrase translation pairs are always extracted from bilingual sentence pairs. In this paper, we extract phrase translation pairs based on word alignment results of Chinese-English bilingual sentence pairs and parsing trees of Chinese sentences, in order to decrease the influence of the grammar disagreement between Chinese and English. Discriminative features for phrase translation pairs are proposed to evaluate extracted ones in this paper, including translation literality, phrase alignment probability and phrase length difference. Multiple linear regression model combined with N-best strategy will be employed to filter phrase translation pairs, in order to improve the evaluating and filtering performance. Experimental results indicate that the filtering performance of phrase alignment probability is best in three kinds of discriminative features for evaluating Chinese-English phrase translation pairs. After multiple linear regression model combined with N-best strategy is used, its F1 achieves 86.24%.

Chun-Xiang Zhang

2011-05-01

103

Directory of Open Access Journals (Sweden)

Full Text Available Problem statement: This study presents a novel method for determining the average winding temperature rise of transformers under predetermined field operating conditions. The rise in winding temperature was determined from the estimated values of winding resistance during the heat run test conducted as per the IEC standard. Approach: The estimation of hot resistance was modeled using Multiple Variable Regression (MVR), Multiple Polynomial Regression (MPR) and soft computing techniques such as Artificial Neural Network (ANN) and Adaptive Neuro Fuzzy Inference System (ANFIS). The modeled hot resistance helps to find the load losses at any load condition without a complicated measurement setup in transformers. Results: These techniques were applied to hot resistance estimation for a dry-type transformer using the input variables cold resistance, ambient temperature and temperature rise. The results are compared and show good agreement between measured and computed values. Conclusion: The proposed methods were verified using experimental results obtained from a temperature rise test performed on a 55 kVA dry-type transformer.

M. Srinivasan

2012-01-01

104

Multiple predictor smoothing methods for sensitivity analysis.

Energy Technology Data Exchange (ETDEWEB)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (1) locally weighted regression (LOESS), (2) additive models, (3) projection pursuit regression, and (4) recursive partitioning regression. The indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present.

Helton, Jon Craig; Storlie, Curtis B.

2006-08-01

105

Multiple predictor smoothing methods for sensitivity analysis

International Nuclear Information System (INIS)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (1) locally weighted regression (LOESS), (2) additive models, (3) projection pursuit regression, and (4) recursive partitioning regression. The indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

106

Directory of Open Access Journals (Sweden)

Full Text Available Research Aim: This study examined the effect of credit card holders' negative and positive perceptions of credit cards on their satisfaction and exit behaviors. Method: We first assessed customers' attitudes towards the use of credit cards by means of Exploratory Factor Analysis; we then assessed the effects of the seven factors identified on credit card satisfaction and on the intention not to use credit cards in the future by means of Multiple Regression Analysis. Findings and Result: The perception that credit cards give the customer confidence was found to have the largest increasing effect on satisfaction. Positive attitudes towards the use of credit cards were found to have the largest decreasing effect on exit behavior. Key Words: Credit Card, Consumer's Satisfaction, Exit Behaviors, Exploratory Factor Analysis, Multiple Regression Analysis

M. S. Talha ARSLAN

2009-12-01

107

Robust mediation analysis based on median regression.

Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. PMID:24079925

Yuan, Ying; Mackinnon, David P

2014-03-01

108

Directory of Open Access Journals (Sweden)

Full Text Available The activity of a selected class of DPP4 inhibitors was preliminarily assessed using chemical descriptors derived from AM1-optimized geometries. Using a multiple linear regression model, it was found that ΔE0, LUMO energy, area, molecular weight and ΔH0 are the significant descriptors that can adequately assess the binding affinity of the compounds. The derived multiple linear regression (MLR) model was validated using rigorous statistical analysis. The preliminary model suggests that bulky and electrophilic inhibitors are desired.

Jose Isagani Janairo

2011-08-01

109

Digital Repository Infrastructure Vision for European Research (DRIVER)

Landslides are a natural hazard that causes much damage to the environment. Depending on the landform, several factors can cause a landslide. This research addresses the methodology for landslide susceptibility mapping using multiple regression analysis and GIS tools. Based on the initial hypothesis, ten factors were recognized as influencing landslides, namely geology, slope, aspect, distance from roads, faults and drainage network, soil capability, land use and rainfall. Crossin...

Ebrahim Omidvar; Karim Solaimani; Somayeh Mashari

2012-01-01

110

Directory of Open Access Journals (Sweden)

Full Text Available Unified Multiple Linear Regression (UMLR) is a nonlinear programming model that unifies all kinds of multiple linear regression models, such as Principal Components Regression, Ridge Regression, Robust Regression and constrained regression. Although UMLR has exhibited excellent performance in some real applications, its optimization procedure is not yet satisfactory. This study proposes a novel Granular Computing-Particle Swarm Optimization (Grc-PSO) algorithm, which introduces granular computing into standard PSO and is used to optimize the UMLR model. The experimental results show that the solution obtained by the Grc-PSO algorithm fits the real situation much better than those of other state-of-the-art algorithms.

Chen Su-Fen

2013-01-01

111

Regression Analysis for the Social Sciences

The book provides graduate students in the social sciences with the basic skills that they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly-available social science data or drawn from the literature; thorough integration of teaching statistical theory with teaching data processing and analysis; teaching of both SAS and Stata "side-by-side" and use of chapter exercises in which students practice programming

Gordon, Rachel A A

2012-01-01

112

Multiple regression models for the lower heating value of municipal solid waste in Taiwan.

A multiple regression analysis was used to develop two predictive models of lower heating value (LHV) for municipal solid waste (MSW), using 180 samples gathered from cities and counties in Taiwan during 2001-2002. These models are referred to as the original proposed model (OPM) and the simplified model (SM). The coefficients of multiple determination for the OPM and SM were 0.983 and 0.975, respectively. To verify the feasibility of the models, a demonstration program based on sampling of MSW in Kaohsiung City was conducted. As a result, the OPM showed superior precision in terms of relative percentage deviation (RPD) and mean absolute percentage error (MAPE), when compared to the conventional models based on the proximate analysis, physical composition and ultimate analysis. The SM was derived by neglecting the three minor physical components used in the OPM. The resulting SM was less precise than the OPM, but it was still acceptable, with a precision level better than the conventional models. It was concluded that the predictability of empirical models could be improved significantly through selection of the appropriate physical components and multiple regression analysis. PMID:17234326

Chang, Y F; Lin, C J; Chyan, J M; Chen, I M; Chang, J E

2007-12-01

113

Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity

Directory of Open Access Journals (Sweden)

Full Text Available While multicollinearity may increase the difficulty of interpreting multiple regression results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret multiple regression effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret multiple regression effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses.

Amanda Kraha

2012-03-01

114

Validity and Cross-Validity of Metric and Nonmetric Multiple Regression.

Questions are raised concerning differences between traditional metric multiple regression, which assumes all variables to be measured on interval scales, and nonmetric multiple regression. The ordinal model is generally superior in fitting derivation samples but the metric technique fits better than the nonmetric in cross-validation samples.…

MacCallum, Robert C.; And Others

1979-01-01

115

Forecasting Gold Prices Using Multiple Linear Regression Method

Directory of Open Access Journals (Sweden)

Full Text Available Problem statement: Forecasting is a management function that assists decision making. It is also described as the process of estimation in unknown future situations. More generally it is known as prediction, which refers to the estimation of time series or longitudinal data. Gold is a precious yellow commodity once used as money. It was made illegal in the USA 41 years ago, but is now once again accepted as a potential currency. The demand for this commodity is on the rise. Approach: The objective of this study was to develop a forecasting model for predicting gold prices based on economic factors such as inflation, currency price movements and others. Following the melt-down of the US dollar, investors are putting their money into gold because gold plays an important role as a stabilizing influence for investment portfolios. Due to the increase in demand for gold in Malaysia and other parts of the world, it is necessary to develop a model that reflects the structure and pattern of the gold market and forecasts movements of the gold price. The most appropriate approach to the understanding of gold prices is the Multiple Linear Regression (MLR) model. MLR is a study of the relationship between a single dependent variable and one or more independent variables, in this case with the gold price as the single dependent variable. The fitted MLR model will be used to predict future gold prices. A naive model known as "forecast-1" was taken as the benchmark for evaluating the performance of the model. Results: Many factors determine the price of gold and, based on "a hunch of experts", several economic factors were identified as influencing gold prices. Variables such as the Commodity Research Bureau future index (CRB), USD/Euro Foreign Exchange Rate (EUROUSD), Inflation rate (INF), Money Supply (M1), New York Stock Exchange (NYSE), Standard and Poor 500 (SPX), Treasury Bill (T-BILL) and US Dollar index (USDX) were considered to have influence on the prices. Parameter estimation for the MLR was carried out using the Statistical Package for the Social Sciences (SPSS), with Mean Square Error (MSE) as the fitness function to determine forecast accuracy. Conclusion: Two models were considered. The first model considered all possible independent variables. The model appeared to be useful for predicting the price of gold, with 85.2% of sample variation in monthly gold prices explained by the model. The second model considered the following four independent variables to be significant: CRB lagged one, EUROUSD lagged one, INF lagged two and M1 lagged two. In terms of prediction, the second model achieved a high level of predictive accuracy. The amount of variance explained was about 70%, and the regression coefficients also provide a means of assessing the relative importance of individual variables in the overall prediction of the gold price.

Z. Ismail

2009-01-01
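The abstract above evaluates its regression models against a naive benchmark using mean square error and uses predictors lagged by one or two periods. The following is a minimal sketch of both ideas, assuming that "forecast-1" means "predict each period with the previous period's value"; the abstract does not define it, so this is an assumption. The price series and the helper names `mse` and `lagged` are hypothetical.

```python
def mse(actual, predicted):
    """Mean square error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def lagged(series, k):
    """Values of the series k periods earlier, aligned with series[k:]."""
    if k < 1 or k >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return series[:-k]


# Hypothetical monthly prices, for illustration only.
prices = [900.0, 920.0, 915.0, 940.0, 960.0]

# Naive benchmark (assumed meaning of forecast-1): last month's price.
actual = prices[1:]
naive_prediction = lagged(prices, 1)
benchmark_mse = mse(actual, naive_prediction)

print("benchmark MSE:", benchmark_mse)
```

A fitted regression model is only worth using for forecasting if its out-of-sample MSE is lower than this benchmark value; the same `lagged` helper would build a predictor column such as CRB lagged one or INF lagged two.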

116

Functional linear regression via canonical analysis

Digital Repository Infrastructure Vision for European Research (DRIVER)

We study regression models for the situation where both dependent and independent variables are square-integrable stochastic processes. Questions concerning the definition and existence of the corresponding functional linear regression models and some basic properties are explored for this situation. We derive a representation of the regression parameter function in terms of the canonical components of the processes involved. This representation establishes a connection betw...

He, Guozhong; Müller, Hans-Georg; Wang, Jane-Ling; Yang, Wenjing

2011-01-01

117

Multiple Regression Model for Compressive Strength Prediction of High Performance Concrete

Directory of Open Access Journals (Sweden)

Full Text Available A mathematical model for predicting the compressive strength of high performance concrete was developed by statistical analysis of concrete data obtained from the experimental work done in this study. The multiple non-linear regression model yielded an excellent correlation coefficient for the prediction of compressive strength at different ages (3, 7, 14, 28 and 91 days). The coefficient of correlation was 99.99% for the strength at each age. The model also gives high correlation for strength prediction of concrete with different types of curing.

M. F.M. Zain

2009-01-01

118

Presents an overview of nonparametric regression as it applies to differential item functioning analysis and then provides three examples to illustrate how nonparametric regression can be applied to multilingual, multicultural data to study group differences. (SLD)

Gierl, Mark J.; Bolt, Daniel M.

2001-01-01

119

Regression Discontinuity Designs with Multiple Rating-Score Variables

In the absence of a randomized control trial, regression discontinuity (RD) designs can produce plausible estimates of the treatment effect on an outcome for individuals near a cutoff score. In the standard RD design, individuals with rating scores higher than some exogenously determined cutoff score are assigned to one treatment condition; those…

Reardon, Sean F.; Robinson, Joseph P.

2012-01-01

120

Regression models are being used to quantify the effect of an exposure on an outcome, while adjusting for potential confounders. While the type of regression model to be used is determined by the nature of the outcome variable, e.g. linear regression has to be applied for continuous outcome variables, all regression models can handle any kind of exposure variables. However, some fundamentals of representation of the exposure in a regression model and also some potential pitfalls have to be kept in mind in order to obtain meaningful interpretation of results. The objective of this educational paper was to illustrate these fundamentals and pitfalls, using various multiple regression models applied to data from a hypothetical cohort of 3000 patients with chronic kidney disease. In particular, we illustrate how to represent different types of exposure variables (binary, categorical with two or more categories and continuous), and how to interpret the regression coefficients in linear, logistic and Cox models. We also discuss the linearity assumption in these models, and show how wrongly assuming linearity may produce biased results and how flexible modelling using spline functions may provide better estimates. PMID:24366898

Leffondré, Karen; Jager, Kitty J; Boucquemont, Julie; Stel, Vianda S; Heinze, Georg

2014-10-01

121

Directory of Open Access Journals (Sweden)

Full Text Available Both linear neural network and multiple linear regression models can be used for multi-factor analysis and forecasting, but the data of the multiple linear regression model are required to meet such conditions as independence and normality, while the data of the linear neural network are only required to have a linear relationship. This article uses the same set of data to establish respectively a linear neural network model and a multiple linear regression model, compares the abilities of fitting and forecasting of the two kinds of models, and consequently, comes to the conclusion that the linear neural network method has a stronger fitting ability and a more stable ability of prediction so that it can be further applied and promoted in the analyzing and forecasting of continuous data factors.

Guoli Wang

2011-10-01

122

Analysis of genome-wide association data by large-scale Bayesian logistic regression

Digital Repository Infrastructure Vision for European Research (DRIVER)

Single-locus analysis is often used to analyze genome-wide association (GWA) data, but such analysis is subject to severe multiple comparisons adjustment. Multivariate logistic regression is proposed to fit a multi-locus model for case-control data. However, when the sample size is much smaller than the number of single-nucleotide polymorphisms (SNPs) or when correlation among SNPs is high, traditional multivariate logistic regression breaks down. To accommodate the scale o...

Wang Yuanjia; Sha Nanshi; Fang Yixin

2009-01-01

123

Evaluating novel agent effects in multiple-treatments meta-regression.

Multiple-treatments meta-analyses are increasingly used to evaluate the relative effectiveness of several competing regimens. In some fields which evolve with the continuous introduction of new agents over time, it is possible that in trials comparing older with newer regimens the effectiveness of the latter is exaggerated. Optimism bias, conflicts of interest and other forces may be responsible for this exaggeration, but its magnitude and impact, if any, needs to be formally assessed in each case. Whereas such novelty bias is not identifiable in a pair-wise meta-analysis, it is possible to explore it in a network of trials involving several treatments. To evaluate the hypothesis of novel agent effects and adjust for them, we developed a multiple-treatments meta-regression model fitted within a Bayesian framework. When there are several multiple-treatments meta-analyses for diverse conditions within the same field/specialty with similar agents involved, one may consider either different novel agent effects in each meta-analysis or may consider the effects to be exchangeable across the different conditions and outcomes. As an application, we evaluate the impact of modelling and adjusting for novel agent effects for chemotherapy and other non-hormonal systemic treatments for three malignancies. We present the results and the impact of different model assumptions to the relative ranking of the various regimens in each network. We established that multiple-treatments meta-regression is a good method for examining whether novel agent effects are present and estimation of their magnitude in the three worked examples suggests an exaggeration of the hazard ratio by 6 per cent (2-11 per cent). PMID:20687172

Salanti, Georgia; Dias, Sofia; Welton, Nicky J; Ades, A E; Golfinopoulos, Vassilis; Kyrgiou, Maria; Mauri, Davide; Ioannidis, John P A

2010-10-15

124

MULTIPLE REGRESSION MODELS FOR HINDCASTING AND FORECASTING MIDSUMMER HYPOXIA IN THE GULF OF MEXICO

A new suite of multiple regression models were developed that describe the relationship between the area of bottom water hypoxia along the northern Gulf of Mexico and Mississippi-Atchafalaya River nitrate concentration, total phosphorus (TP) concentration, and discharge. Variabil...

125

Validation of Simulation Models: Regression Analysis Revisited

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper proves that it is wrong to require that regressing a model's outputs on the observed real outcomes gives a 45 degrees line through the origin (unit slope, zero intercept). Therefore this paper proposes an alternative requirement: the responses of the model and the real system should have the same means and the same variances. To test whether this requirement is satisfied, a novel statistical procedure is derived. This procedure regresses the differences of simulated and real response...

Kleijnen, J. P. C.; Bettonvil, B. W. M.; Groenendaal, W. J. H.

1996-01-01

126

Regression tree approach to studying factors influencing acoustic voice analysis.

Multiple factors influence voice quality measurements (VQM) obtained during an acoustic voice assessment including: gender, intrasubject variability, microphone, environmental noise (type and level), data acquisition (DA) system, and analysis software. This study used regression trees to investigate the order and relative importance of these factors on VQM including interaction effects of the factors and how the outcome differs when the acoustic environment is controlled for noise. Twenty normophonic participants provided 20 voice samples each, which were recorded synchronously on five DA systems combined with six different microphones. The samples were mixed with five noise types at eight signal-to-noise ratio (SNR) levels. The resulting 80,000 audio samples were analyzed for fundamental frequency (F(0)), jitter and shimmer using three software analysis systems: MDVP, PRAAT, and TF32 (CSpeech). Fifteen regression trees and their Variable Importance Measures were utilized to analyze the data. The analyses confirmed that all of the factors listed above were influential. The results suggest that gender, intrasubject variability, and microphone were significant influences on F(0). Software systems and gender were highly influential on measurements of jitter and shimmer. Environmental noise was shown to be the prominent factor that affects VQM when SNR levels are below 30 dB. PMID:16825780

Deliyski, Dimitar D; Shaw, Heather S; Evans, Maegan K; Vesselinov, Roumen

2006-01-01

127

Egg hatchability prediction by multiple linear regression and artificial neural networks

Digital Repository Infrastructure Vision for European Research (DRIVER)

An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. Network trainings were made by the backpropagation algorithm based on data obtained from industrial incubations. The ANN model was chosen as it produced data that fit better the experimental data as compared to the multiple linear regression model, which used coefficients determi...

Bolzan, AC; Machado, RAF; Piaia, JCZ

2008-01-01

128

Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and aspect ratio were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression. Soil types were highly correlated with the geologic units, and no significant relationships existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.

Ohlmacher, G.C.; Davis, J.C.

2003-01-01
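
The logistic regression step described in the entry above can be illustrated with a minimal sketch. It assumes a single predictor (slope angle) and synthetic data: the sample size, the coefficients -4.0 and 0.15, and the names `fit_logistic`, `slopes` and `landslide` are hypothetical and are not taken from the study, which also used geologic units and GIS cells.

```python
import math
import random


def sigmoid(z):
    """Logistic function, with the argument clipped to avoid overflow in exp."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, n_iter=25):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by Newton-Raphson; return (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w               # entries of the information matrix X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det   # Newton step, 2x2 inverse written out
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic cells (assumption): steeper cells are more likely to contain a landslide.
rng = random.Random(1)
slopes = [rng.uniform(0.0, 40.0) for _ in range(2000)]
landslide = [1 if rng.random() < sigmoid(-4.0 + 0.15 * s) else 0 for s in slopes]

b0, b1 = fit_logistic(slopes, landslide)
p_gentle = sigmoid(b0 + b1 * 5.0)    # fitted probability for a 5 degree slope
p_steep = sigmoid(b0 + b1 * 35.0)    # fitted probability for a 35 degree slope
print(f"b0={b0:.2f}, b1={b1:.3f}, P(5 deg)={p_gentle:.3f}, P(35 deg)={p_steep:.3f}")
```

Evaluating the fitted probability for every cell is what would turn such a model into a hazard map; a categorical predictor such as geologic unit would enter as indicator variables.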

129

The multivariate effects of Na, K, Mg and Ca as nitrates on the electrothermal atomisation of manganese, cadmium and iron were studied by multiple linear regression modelling. Since the models proved to efficiently predict the effects of the considered matrix elements in a wide range of concentrations, they were applied to correct the interferences occurring in the determination of trace elements in seawater after pre-concentration of the analytes. In order to obtain a statistically significant number of samples, a large volume of the certified seawater reference materials CASS-3 and NASS-3 was treated with Chelex-100 resin; then, the chelating resin was separated from the solution, divided into several sub-samples, each of them was eluted with nitric acid and analysed by electrothermal atomic absorption spectrometry (for trace element determinations) and inductively coupled plasma optical emission spectrometry (for matrix element determinations). To minimise any other systematic error besides that due to matrix effects, accuracy of the pre-concentration step and contamination levels of the procedure were checked by inductively coupled plasma mass spectrometric measurements. Analytical results obtained by applying the multiple linear regression models were compared with those obtained with other calibration methods, such as external calibration using acid-based standards, external calibration using matrix-matched standards and the analyte addition technique. Empirical models proved to efficiently reduce interferences occurring in the analysis of real samples, allowing an improvement of accuracy better than for other calibration methods.

Grotti, Marco; Abelmoschi, Maria Luisa; Soggia, Francesco; Tiberiade, Christian; Frache, Roberto

2000-12-01

130

Directory of Open Access Journals (Sweden)

Landslides are a natural hazard that causes considerable damage to the environment. Depending on the landform, several factors can trigger a landslide. This research presents a methodology for landslide susceptibility mapping using multiple regression analysis and GIS tools. Based on the initial hypothesis, ten factors were identified as influencing landslides, including geology, slope, aspect, distance from roads, faults and the drainage network, soil capability, land use and rainfall. Crossing the investigated parameters with the observed landslides indicated that three factors (distance from the channel network, distance from faults and rainfall) have no major effect on the observed landslides in the Tajan area. To quantify the parameters as weighting factors, the coverage of landslides in the different observations was determined. A stepwise method was then used for the statistical analysis. Slope, aspect, distance from roads and soil capability were found to be the most influential factors, in that order.

Somayeh Mashari

2012-07-01

131

Functional linear regression analysis for longitudinal data

Digital Repository Infrastructure Vision for European Research (DRIVER)

We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may b...

Yao, Fang; Müller, Hans-Georg; Wang, Jane-Ling

2006-01-01

132

Linear regression analysis of survival data with missing censoring indicators

Digital Repository Infrastructure Vision for European Research (DRIVER)

Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sam...

Wang, Qihua; Dinse, Gregg E.

2011-01-01

133

Regression analysis as a tool for identification of industrial processes

Energy Technology Data Exchange (ETDEWEB)

Use of mathematical methods is discussed in economic analysis of underground coal mining. A method for description of operations in underground mining is analyzed. Method for development of static characteristics of processes in underground mining is evaluated. Regression analysis is characterized as most suitable for analysis of underground mining. Methods for plotting regression functions and construction of mathematical models of processes and operations in underground coal mining are evaluated. Recommendations for use of regression analysis for optimization of underground mining are made. 4 references.

Bruski, J.

1976-07-01

134

Neutron multiplicity analysis tool

International Nuclear Information System (INIS)

I describe the capabilities of the EXCOM (EXcel based COincidence and Multiplicity) calculation tool, which is used to analyze experimental or simulated neutron multiplicity data. The input to the program is the count-rate data (including the multiplicity distribution) for a measurement, the isotopic composition of the sample and relevant dates. The program carries out deadtime correction and background subtraction and then performs a number of analyses: passive calibration curve, known alpha and multiplicity analysis. The latter is done with both the point model and the weighted point model. In the current application EXCOM carries out the rapid analysis of Monte Carlo calculated quantities and allows the user to determine the magnitude of sample perturbations that lead to systematic errors. Neutron multiplicity counting is an assay method used in the analysis of plutonium for safeguards applications. It is widely used in nuclear material accountancy by international (IAEA) and national inspectors. The method uses the measurement of the correlations in a pulse train to extract information on the spontaneous fission rate in the presence of neutrons from (α,n) reactions and induced fission. The measurement is relatively simple to perform and gives results very quickly (≤ 1 hour). By contrast, destructive analysis techniques are extremely costly and time consuming (several days). By improving the achievable accuracy of neutron multiplicity counting, a nondestructive analysis technique, it could be possible to reduce the use of destructive analysis measurements required in safeguards applications. The accuracy of a neutron multiplicity measurement can be affected by a number of variables such as density, isotopic composition, chemical composition and moisture in the material.
In order to determine the magnitude of these effects on the measured plutonium mass a calculational tool, EXCOM, has been produced using VBA within Excel. This program was developed to help speed the analysis of Monte Carlo neutron transport simulation (MCNP) data, and only requires the count-rate data to calculate the mass of material using INCC's analysis methods instead of the full neutron multiplicity distribution required to run analysis in INCC. This paper describes what is implemented within EXCOM, including the methods used, how the program corrects for deadtime, and how uncertainty is calculated. This paper also describes how to use EXCOM within Excel.

135

Directory of Open Access Journals (Sweden)

Polyethylene glycol (PEG) is the most common preservative in use for bulking and maintaining structural integrity in waterlogged wood. Conservators therefore need to be able to determine PEG concentrations in wood in a non-destructive manner. We present a study highlighting the application of infrared spectroscopy coupled with multivariate analysis techniques to predict the concentration of polyethylene glycol 400 (PEG-400) and water simultaneously. This technique uses attenuated total reflectance (ATR) spectroscopy and unconstrained stepwise multiple linear regression (SMLR) analysis for prediction of multiple components in archaeological wood. Using this model we have calculated the concentration of PEG-400 and water in treated archaeological waterlogged wood samples.

Rohan PATEL

2012-03-01

136

Bootstrapping regression parameters in multivariate survival analysis.

Bootstrap methods are proposed for estimating sampling distributions and associated statistics for regression parameters in multivariate survival data. We use an Independence Working Model (IWM) approach, fitting margins independently, to obtain consistent estimates of the parameters in the marginal models. Resampling procedures, however, are applied to an appropriate joint distribution to estimate covariance matrices, make bias corrections, and construct confidence intervals. The proposed methods allow for fixed or random explanatory variables, the latter case using extensions of existing resampling schemes (Loughin, 1995), and they permit the possibility of random censoring. An application is shown for the viral positivity time data previously analyzed by Wei, Lin, and Weissfeld (1989). A simulation study of small-sample properties shows that the proposed bootstrap procedures provide substantial improvements in variance estimation over the robust variance estimator commonly used with the IWM. PMID:9384620

Loughin, T M; Koehler, K J

1997-01-01
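
The resampling idea for random explanatory variables mentioned in the entry above can be sketched in a much simpler setting. The block below is not the authors' Independence Working Model procedure and involves no survival data or censoring: it assumes an ordinary simple linear regression on synthetic data and resamples (x, y) pairs to estimate the standard error and a percentile interval for the slope. All names and numbers are hypothetical.

```python
import random
import statistics


def ols_slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return sxy / sxx


def bootstrap_slope(pairs, n_boot=500, seed=0):
    """Pairs bootstrap: resample whole (x, y) observations with replacement.

    Returns (standard error, lower, upper) where lower/upper are the
    2.5% and 97.5% percentiles of the bootstrap slopes.
    """
    rng = random.Random(seed)
    n = len(pairs)
    reps = []
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        reps.append(ols_slope(sample))
    reps.sort()
    se = statistics.stdev(reps)
    lo = reps[int(0.025 * n_boot)]
    hi = reps[int(0.975 * n_boot) - 1]
    return se, lo, hi


# Synthetic data (assumption): y = 1 + 2x + standard normal noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)))

slope_hat = ols_slope(data)
se, lo, hi = bootstrap_slope(data)
print(f"slope={slope_hat:.3f}, bootstrap SE={se:.3f}, 95% interval=({lo:.3f}, {hi:.3f})")
```

Resampling whole observations keeps each response paired with its own covariates, which is why this scheme suits random rather than fixed explanatory variables.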

137

Overdispersed logistic regression for SAGE: Modelling multiple groups and covariates

Digital Repository Infrastructure Vision for European Research (DRIVER)

Background Two major identifiable sources of variation in data derived from the Serial Analysis of Gene Expression (SAGE) are within-library sampling variability and between-library heterogeneity within a group. Most published methods for identifying differential expression focus on just the sampling variability. In recent work, the problem of assessing differential expression between two groups of SAGE libraries has been addressed by introducing a beta-binomial hier...

Morris Jeffrey S; Deng Li; Baggerly Keith A; Marcelo, Aldaz C.

2004-01-01

138

PREG: A Computer Program for Poisson Regression Analysis.

PREG is a FORTRAN program that computes maximum likelihood estimates for a regression analysis when the dependent variable is a count that follows the Poisson distribution. This documentation contains a detailed description of the algorithm for maximum li...

E. L. Frome

1981-01-01
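
Maximum likelihood estimation for a Poisson regression, as described in the entry above, can be sketched as follows. This is a generic Newton-Raphson illustration in Python for a log-linear model with one predictor, not the PREG algorithm (whose description is truncated in this record). The data are synthetic and the names `fit_poisson` and `sample_poisson` are hypothetical.

```python
import math
import random


def sample_poisson(mu, rng):
    """Draw one Poisson(mu) count with Knuth's multiplication method (small mu only)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def fit_poisson(xs, ys, n_iter=25):
    """Maximum likelihood fit of log E[y] = b0 + b1*x by Newton-Raphson."""
    b0 = math.log(sum(ys) / len(ys))   # start from the intercept-only fit
    b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            mu = math.exp(b0 + b1 * x)
            g0 += y - mu               # score vector
            g1 += (y - mu) * x
            h00 += mu                  # information matrix entries
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic counts (assumption): log mean = 0.5 + 0.3 * x.
rng = random.Random(7)
xs = [rng.uniform(0.0, 3.0) for _ in range(1000)]
ys = [sample_poisson(math.exp(0.5 + 0.3 * x), rng) for x in xs]

b0, b1 = fit_poisson(xs, ys)
print(f"b0={b0:.3f}, b1={b1:.3f}")
```

With a log link, exp(b1) is the multiplicative change in the expected count for a one-unit increase in x.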

139

Combined linkage and segregation analysis using regressive models.

Digital Repository Infrastructure Vision for European Research (DRIVER)

Regressive models for segregation analysis have been extended to include multivariate data and linked marker loci. The new models have been applied to data from two pedigrees segregating a gene for cardiovascular disease.

Bonney, G. E.; Lathrop, G. M.; Lalouel, J. M.

1988-01-01

140

Multiple predictor smoothing methods for sensitivity analysis: Example results

International Nuclear Information System (INIS)

The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described in the first part of this presentation: (i) locally weighted regression (LOESS), (ii) additive models, (iii) projection pursuit regression, and (iv) recursive partitioning regression. In this, the second and concluding part of the presentation, the indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present

141

Inferring gene expression dynamics via functional regression analysis

Directory of Open Access Journals (Sweden)

Background Temporal gene expression profiles characterize the time-dynamics of expression of specific genes and are increasingly collected in current gene expression experiments. In the analysis of experiments where gene expression is obtained over the life cycle, it is of interest to relate temporal patterns of gene expression associated with different developmental stages to each other to study patterns of long-term developmental gene regulation. We use tools from functional data analysis to study dynamic changes by relating temporal gene expression profiles of different developmental stages to each other. Results We demonstrate that functional regression methodology can pinpoint relationships that exist between temporal gene expression profiles for different life cycle phases and incorporates dimension reduction as needed for these high-dimensional data. By applying these tools, gene expression profiles for pupa and adult phases are found to be strongly related to the profiles of the same genes obtained during the embryo phase. Moreover, one can distinguish between gene groups that exhibit relationships with positive and others with negative associations between later life and embryonal expression profiles. Specifically, we find a positive relationship in expression for muscle development related genes, and a negative relationship for strictly maternal genes for Drosophila, using temporal gene expression profiles. Conclusion Our findings point to specific reactivation patterns of gene expression during the Drosophila life cycle which differ in characteristic ways between various gene groups. Functional regression emerges as a useful tool for relating gene expression patterns from different developmental stages, and avoids the problems with large numbers of parameters and multiple testing that affect alternative approaches.

Leng Xiaoyan

2008-01-01

142

Egg hatchability prediction by multiple linear regression and artificial neural networks

Scientific Electronic Library Online (English)

An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. The network was trained with the backpropagation algorithm on data obtained from industrial incubations. The ANN model was chosen because it fit the experimental data better than the multiple linear regression model, whose coefficients were determined by the least squares method. The simulation results of these approaches indicate that this ANN can be used for incubation performance prediction.

Bolzan, AC; Machado, RAF; Piaia, JCZ

2008-06-01

143

Egg hatchability prediction by multiple linear regression and artificial neural networks

Directory of Open Access Journals (Sweden)

An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. The network was trained with the backpropagation algorithm on data obtained from industrial incubations. The ANN model was chosen because it fit the experimental data better than the multiple linear regression model, whose coefficients were determined by the least squares method. The simulation results of these approaches indicate that this ANN can be used for incubation performance prediction.

AC Bolzan

2008-06-01
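
The multiple linear regression baseline in the entry above, with coefficients determined by least squares, can be sketched with the normal equations. This is an illustrative sketch only: the two predictors and the noise-free toy data are assumptions, not incubation measurements, and the function names are hypothetical.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                 # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, ..., bk] from the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(XtX, Xty)


def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]


def mse(y, y_hat):
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)


def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Toy data (assumption): y = 2 + 3*x1 - x2 exactly, so the fit should recover it.
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
y = [2.0 + 3.0 * x1 - x2 for x1, x2 in X]

beta = fit_ols(X, y)
y_hat = predict(beta, X)
print(beta, mse(y, y_hat), r_squared(y, y_hat))
```

The same `mse` and `r_squared` helpers could score the predictions of any competing model, such as a neural network, on held-out data, which is the kind of comparison the abstract reports.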

144

Evaluating Productivity Index in a Gas Well Using Regression Analysis

Directory of Open Access Journals (Sweden)

In this study, a new approach is introduced to augment existing correlations for the analysis of the Productivity Index of a gas well. The Modified Isochronal test method is used in this analysis. The Productivity Index trend of the gas well is evaluated from the test data. Regression Analysis is used to develop a correlation, which is then used to evaluate and forecast the future Productivity Index trend. The back pressure equation of the Simplified Analysis method is also used to examine the test data. The Inflow Performance Relationship (IPR) data generated are compared with those generated from Regression Analysis. In the Regression Analysis, the pseudo-pressure approach evaluates the Productivity Index of the gas well more accurately than the pressure-squared approach. The bottom-hole pressure method using the pressure-squared approach under Regression Analysis generated a better estimate of IPR data than any other method. The Productivity Index values evaluated from the Regression Analysis are quite approximate and can be used to establish a deliverability equation for the gas well.

Tobuyei Christopher

2014-06-01

145

Linear regression analysis of survival data with missing censoring indicators.

Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial. PMID:20559722

Wang, Qihua; Dinse, Gregg E

2011-04-01

146

Joint regression analysis of correlated data using Gaussian copulas.

This article concerns a new joint modeling approach for correlated data analysis. Utilizing Gaussian copulas, we present a unified and flexible machinery to integrate separate one-dimensional generalized linear models (GLMs) into a joint regression analysis of continuous, discrete, and mixed correlated outcomes. This essentially leads to a multivariate analogue of the univariate GLM theory and hence an efficiency gain in the estimation of regression coefficients. The availability of joint probability models enables us to develop a full maximum likelihood inference. Numerical illustrations are focused on regression models for discrete correlated data, including multidimensional logistic regression models and a joint model for mixed normal and binary outcomes. In the simulation studies, the proposed copula-based joint model is compared to the popular generalized estimating equations, which is a moment-based estimating equation method to join univariate GLMs. Two real-world data examples are used in the illustration. PMID:18510653

Song, Peter X-K; Li, Mingyao; Yuan, Ying

2009-03-01

147

Background stratified Poisson regression analysis of cohort data

Digital Repository Infrastructure Vision for European Research (DRIVER)

Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approa...

Richardson, David B.; Langholz, Bryan

2012-01-01

148

A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants

A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…

Cooper, Paul D.

2010-01-01

149

Calculation of U, Ra, Th and K contents in uranium ore by multiple linear regression method

International Nuclear Information System (INIS)

A multiple linear regression method was used to analyze γ spectra of uranium ore samples and to calculate the contents of U, Ra, Th, and K. Compared with the inverse matrix method, its advantage is that no standard samples of pure U, Ra, Th and K are needed to obtain the response coefficients.

150

We report a case of tumor regression of multiple bone metastases from breast carcinoma after administration of strontium-89 chloride. This case suggests that strontium-89 chloride can not only relieve bone metastases pain not responsive to analgesics, but may also have a tumoricidal effect on bone metastases. PMID:25298863

Heianna, Joichi; Miyauchi, Takaharu; Endo, Wataru; Miura, Naoki; Terui, Kazuyuki; Kamata, Syuichi; Hashimoto, Manabu

2014-05-01

151

Directory of Open Access Journals (Sweden)

This paper introduces a statistical model for the 2G GSM communication system. A multiple regression formula is used to calculate path loss. It is assumed that hb, W and ? are three statistical variables. We use the Nakagami distribution to model hb and W, and the uniform distribution to model ?.

Meenal Sharma

2011-07-01

152

Background stratified Poisson regression analysis of cohort data.

Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models. PMID:22193911

Richardson, David B; Langholz, Bryan

2012-03-01

153

In this work, we introduce multiplicative drift analysis as a suitable way to analyze the runtime of randomized search heuristics such as evolutionary algorithms. We give a multiplicative version of the classical drift theorem. This allows easier analyses in those settings where the optimization progress is roughly proportional to the current distance to the optimum. To display the strength of this tool, we regard the classical problem of how the (1+1) Evolutionary Algorithm optimizes an arbitrary linear pseudo-Boolean function. Here, we first give a relatively simple proof for the fact that any linear function is optimized in expected time $O(n \log n)$, where $n$ is the length of the bit string. Afterwards, we show that in fact any such function is optimized in expected time at most $(1+o(1))\,1.39\,e\,n \ln(n)$, again using multiplicative drift analysis. We also prove a corresponding lower bound of $(1-o(1))\,e\,n \ln(n)$ which actually holds for all functions with a unique global optimum. We further demons...

Doerr, Benjamin; Winzen, Carola

2011-01-01
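
For readers unfamiliar with the algorithm analyzed in the abstract above, here is a minimal sketch of the (1+1) Evolutionary Algorithm on a linear pseudo-Boolean function, assuming positive weights and standard bit mutation with rate 1/n. The weights, seed, and iteration cap are hypothetical choices for illustration; the sketch only runs the algorithm and does not reproduce the drift analysis.

```python
import random

def one_plus_one_ea(weights, rng, max_iters=1_000_000):
    """Maximize f(x) = sum(w_i * x_i) over bit strings with the (1+1) EA.

    Returns (best_bits, iterations_used). Assumes all weights are positive,
    so the all-ones string is the unique optimum.
    """
    n = len(weights)
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = sum(w * b for w, b in zip(weights, x))
    optimum = sum(weights)
    for t in range(1, max_iters + 1):
        # Standard bit mutation: flip each bit independently with probability 1/n.
        y = [b ^ 1 if rng.random() < 1.0 / n else b for b in x]
        fy = sum(w * b for w, b in zip(weights, y))
        if fy >= fx:  # keep the offspring if it is not worse
            x, fx = y, fy
        if fx == optimum:
            return x, t
    return x, max_iters

rng = random.Random(0)
weights = list(range(1, 31))  # hypothetical positive integer weights, n = 30
best, iters = one_plus_one_ea(weights, rng)
print("iterations until optimum:", iters)
```

Averaging `iters` over many seeds and several values of n is one way to compare observed runtimes with the $n \ln n$ growth stated in the abstract.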

154

Applying Multiple Linear Regression and Neural Network to Predict Bank Performance

Directory of Open Access Journals (Sweden)

Full Text Available Globalization and technological advancement have created a highly competitive market in the banking and finance industry. Performance of the industry depends heavily on the accuracy of the decisions made at managerial level. This study uses the multiple linear regression technique and a feed-forward artificial neural network to predict bank performance. The study then evaluates the performance of the two techniques with the goal of finding a powerful tool for predicting bank performance. Data of thirteen banks for the period 2001-2006 were used in the study. ROA was used as a measure of bank performance, and hence is the dependent variable for the multiple linear regression. Seven variables including liquidity, credit risk, cost to income ratio, size, concentration ratio, inflation and GDP were used as independent variables. Under supervised learning, the dependent variable, ROA, was used as the target output for the artificial neural network. Seven inputs corresponding to the seven predictor variables were used for pattern recognition at the training phase. Experimental results from the multiple linear regression show that two variables, credit risk and cost to income ratio, are significant in determining bank performance. These two variables were found to explain about 60.9 percent of the total variation in the data with a mean square error (MSE) of 0.330. The artificial neural network was found to give optimal results by using thirteen hidden neurons. Testing results show that the seven inputs explain about 66.9 percent of the total variation in the data with a very low MSE of 0.00687. Performance of both methods is measured by mean square prediction error (MSPR) at the validation stage. The MSPR value for the neural network is lower than the MSPR value for multiple linear regression (0.0061 against 0.6190). The study concludes that the artificial neural network is the more powerful tool in predicting bank performance.

Nor Mazlina Abu Bakar

2009-09-01

155

Ratio Versus Regression Analysis: Some Empirical Evidence in Brazil

Directory of Open Access Journals (Sweden)

Full Text Available This work compares the traditional methodology for ratio analysis, applied to a sample of Brazilian firms, with the alternative one of regression analysis, for both cross-industry and intra-industry samples. The structural validity of the traditional methodology was tested through a model that represents its analogous regression format. The data are from 156 Brazilian public companies in nine industrial sectors for the year 1997. The results provide weak empirical support for the traditional ratio methodology, as it was verified that the validity of this methodology may differ between ratios.

Newton Carneiro Affonso da Costa Jr.

2004-06-01

156

Analysis of Sting Balance Calibration Data Using Optimized Regression Models

Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm's two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm's term selection process.

Ulbrich, N.; Bader, Jon B.

2010-01-01

157

The multivariate interference effects due to nitrates of sodium, potassium, magnesium and calcium on electrothermal atomization of manganese were modelled by using the multiple linear regression method in conjunction with a suitable experimental design. Since the model proved to be able to efficiently predict the simultaneous effects of the considered salts in a wide range of concentrations, it was applied to provide a computational correction of the matrix effects occurring in the ETAAS analysis of manganese in seawater, after preconcentration of the analyte on Chelex-100 resin and elution with nitric acid. Preliminary results showed that the matrix effects are significantly reduced, leading to an improvement in accuracy of the ETAAS analysis.

Grotti, Marco; Leardi, Riccardo; Gnecco, Cinzia; Frache, Roberto

1999-05-01

158

Regression analysis of creep-rupture data: a practical approach

International Nuclear Information System (INIS)

A generalized linear regression approach to the analysis of creep and creep-rupture data appears to have great promise for future applications. Uncertainties in predictions of creep behavior can be large due to heat treatment, heat-to-heat and other variations in properties. For types 304 and 316 stainless steels and for 2 1/4 Cr--1 Mo steel these uncertainties can be reduced by using regression models that include terms involving the ultimate tensile strength or 100-hr rupture strength of a given heat. A model for Alloy 800H was developed to predict the middle of the scatter band on behavior. Regression analysis of single heat data sets for a variety of materials yielded generally good results. Extrapolation of any model must be done with extreme caution. Possible metallurgical instabilities or changes in creep mechanism can cause serious errors in extrapolated results

159

Directory of Open Access Journals (Sweden)

Full Text Available The retention behavior and lipophilicity parameters of some antipsychotics were determined using reversed-phase thin layer chromatography. Quantitative structure-activity relationship studies have been performed to correlate the molecular characteristics of the observed compounds with their retention as well as with their chromatographically determined lipophilicity parameters. The effect of different organic modifiers (acetone, tetrahydrofuran, and methanol) has been studied. The retention of the investigated compounds decreases linearly with increasing concentration of organic modifier. The chemical structures of the antipsychotics have been characterized by molecular descriptors which are calculated from the structure and related to the chromatographically determined lipophilicity parameters by multiple linear regression analysis. This approach gives us the possibility to gain insight into factors responsible for the retention as well as the lipophilicity of the investigated set of compounds. The most prominent factors affecting lipophilicity of the investigated substances are solubility, energy of the highest occupied molecular orbital, and energy of the lowest unoccupied molecular orbital. The obtained models were used for interpretation of the lipophilicity of the investigated compounds. The prediction results are in good agreement with the experimental values. This study provides good information about pharmacologically important physico-chemical parameters of the observed antipsychotics relevant to variations in molecular lipophilicity and chromatographic behavior. The established QSAR models could be helpful in the design of novel multitarget antipsychotic compounds.

Danica S. Perušković

2014-04-01

160

Multiple regression as a preventive tool for determining the risk of Legionella spp.

Directory of Open Access Journals (Sweden)

Full Text Available To determine the interrelationship between health & hygiene conditions for prevention of legionellosis, the composition of materials used in water distribution systems, the water origin and Legionella pneumophila risk. Material and methods. A descriptive study and multiple regression analysis were performed on a sample of golf course sprinkler irrigation systems (n=31) pertaining to hotels located on the Costa del Sol (Malaga, Spain). The study was carried out in 2009. Results. There was a significant linear relation, with all the independent variables contributing significantly (p<0.05) to the model’s fit. The relationship between water type and the risk of Legionella, as well as between the material composition and the latter, is linear and positive. In contrast, the relationship between health-hygiene conditions and Legionella risk is linear and negative. Conclusion. The characterization of Legionella pneumophila concentration, as defined by the risk in water and through use of the predictive method, can contribute to the consideration of new influence variables in the development of the agent, resulting in improved control and prevention of the disease.

Enrique Gea-Izquierdo

2012-04-01

161

Aerosol optical depth (AOD) from AERONET data has a very fine resolution but air pollution index (API), visibility and relative humidity from the ground truth measurements are coarse. To obtain the local AOD in the atmosphere, the relationship between these three parameters and AOD was determined using multiple regression analysis. The data of the southwest monsoon period (August to September, 2012) taken in Penang, Malaysia, were used to establish a quantitative relationship in which the AOD is modeled as a function of API, relative humidity, and visibility. The most highly correlated model was used to predict AOD values during the southwest monsoon period. When aerosol is not uniformly distributed in the atmosphere, the predicted AOD can deviate strongly from the measured values. These deviating data can therefore be removed by comparing the predicted AOD values with the actual AERONET data, which helps to investigate whether the non-uniform source of the aerosol is at the ground surface or at a higher altitude. This model can accurately predict AOD only if the aerosol is uniformly distributed in the atmosphere. However, further study is needed to determine whether this model is suitable for predicting AOD not only in Penang but also in other states of Malaysia or even globally.

Tan, F.; Lim, H. S.; Abdullah, K.; Yoon, T. L.; Zubir Matjafri, M.; Holben, B.

2014-02-01

162

The potential of multiple linear regression (MLR) and artificial neural network (ANN) techniques in predicting transient water levels over a groundwater basin were compared. MLR and ANN modeling was carried out at 17 sites in Japan, considering all significant inputs: rainfall, ambient temperature, river stage, 11 seasonal dummy variables, and influential lags of rainfall, ambient temperature, river stage and groundwater level. Seventeen site-specific ANN models were developed, using multi-layer feed-forward neural networks trained with Levenberg-Marquardt backpropagation algorithms. The performance of the models was evaluated using statistical and graphical indicators. Comparison of the goodness-of-fit statistics of the MLR models with those of the ANN models indicated that there is better agreement between the ANN-predicted groundwater levels and the observed groundwater levels at all the sites, compared to the MLR. This finding was supported by the graphical indicators and the residual analysis. Thus, it is concluded that the ANN technique is superior to the MLR technique in predicting spatio-temporal distribution of groundwater levels in a basin. However, considering the practical advantages of the MLR technique, it is recommended as an alternative and cost-effective groundwater modeling tool.

Sahoo, Sasmita; Jha, Madan K.

2013-12-01

163

Application of stepwise multiple regression techniques to inversion of Nimbus 'IRIS' observations.

Exploratory studies with Nimbus-3 infrared interferometer-spectrometer (IRIS) data indicate that, in addition to temperature, such meteorological parameters as geopotential heights of pressure surfaces, tropopause pressure, and tropopause temperature can be inferred from the observed spectra with the use of simple regression equations. The technique of screening the IRIS spectral data by means of stepwise regression to obtain the best radiation predictors of meteorological parameters is validated. The simplicity of application of the technique and the simplicity of the derived linear regression equations - which contain only a few terms - suggest usefulness for this approach. Based upon the results obtained, suggestions are made for further development and exploitation of the stepwise regression analysis technique.

Ohring, G.

1972-01-01

164

Hyperbolic decline-curve analysis using linear regression

Energy Technology Data Exchange (ETDEWEB)

Two rigorous techniques for hyperbolic decline-curve analysis based on the fundamental equations are investigated. The parameter estimation is made more accurate and repeatable than type curve matching if linear regression is used to extract the unknowns D_i, q_i and n from the data. Maximizing the regression coefficient is the criterion used to select the correct parameter values. In application to six field examples, it is seen that the two methods do not always give identical results if there is a lot of scatter in the data. 25 figs., 1 tab., 21 refs.

Towler, B.F.; Bansal, Sitanshu (Dept. of Petroleum Engineering, Univ. of Wyoming, Laramie, WY (United States))

1993-01-01
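
The abstract above does not say which two linearizations the authors use, so the following is only a sketch of one standard possibility. It assumes the Arps hyperbolic form q(t) = q_i (1 + n D_i t)^(-1/n), for which q^(-n) is linear in t: intercept a = q_i^(-n) and slope b = n D_i q_i^(-n). For each trial exponent n a straight line is fitted, and the n with the highest squared correlation is kept. Function names, the grid of exponents, and the synthetic data are hypothetical.

```python
def linfit(xs, ys):
    """Ordinary least-squares line y = a + b*x; returns (a, b, r_squared)."""
    m = len(xs)
    xbar = sum(xs) / m
    ybar = sum(ys) / m
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b, sxy * sxy / (sxx * syy)

def fit_hyperbolic(ts, qs, n_grid):
    """Pick the exponent n whose transform q**(-n) is most nearly linear in t."""
    best = None
    for n in n_grid:
        a, b, r2 = linfit(ts, [q ** (-n) for q in qs])
        if a > 0 and (best is None or r2 > best[0]):
            best = (r2, n, a, b)
    r2, n, a, b = best
    q_i = a ** (-1.0 / n)   # from a = q_i**(-n)
    d_i = b / (n * a)       # from b = n * D_i * q_i**(-n)
    return n, q_i, d_i, r2

# Synthetic, noise-free production data (hypothetical units: rate per month).
ts = [float(t) for t in range(25)]
qs = [1000.0 * (1.0 + 0.5 * 0.1 * t) ** (-1.0 / 0.5) for t in ts]
n_hat, q_i_hat, d_i_hat, r2 = fit_hyperbolic(ts, qs, [k / 10 for k in range(1, 10)])
print(n_hat, q_i_hat, d_i_hat, r2)
```

With scattered field data the squared correlation usually varies slowly with n, which is consistent with the abstract's remark that different methods need not agree when the data are noisy.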

165

User's Guide to the Weighted-Multiple-Linear Regression Program (WREG version 1.0)

Streamflow is not measured at every location in a stream network. Yet hydrologists, State and local agencies, and the general public still seek to know streamflow characteristics, such as mean annual flow or flood flows with different exceedance probabilities, at ungaged basins. The goals of this guide are to introduce and familiarize the user with the weighted multiple-linear regression (WREG) program, and to also provide the theoretical background for program features. The program is intended to be used to develop a regional estimation equation for streamflow characteristics that can be applied at an ungaged basin, or to improve the corresponding estimate at continuous-record streamflow gages with short records. The regional estimation equation results from a multiple-linear regression that relates the observable basin characteristics, such as drainage area, to streamflow characteristics.

Eng, Ken; Chen, Yin-Yu; Kiang, Julie E.

2009-01-01
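
As a rough illustration of the weighted regression described in the WREG entry above, the sketch below fits a weighted least-squares line relating a log-transformed streamflow characteristic to log-transformed drainage area. It is not WREG's implementation: the single predictor, the use of record length in years as the weight, and all data values are assumptions made for the example.

```python
def wls_line(xs, ys, ws):
    """Weighted least-squares line y = a + b*x minimizing sum(w * residual**2)."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b

# Hypothetical gaged basins: log10(drainage area), log10(mean annual flow),
# and years of record used as the weight for each gage.
log_area = [1.0, 1.5, 2.0, 2.6, 3.1]
log_flow = [1.28, 1.75, 2.07, 2.61, 2.95]
years = [12.0, 45.0, 30.0, 60.0, 8.0]

a, b = wls_line(log_area, log_flow, years)

# Apply the regional equation at a hypothetical ungaged basin.
log_area_ungaged = 2.3
print("predicted log10(flow):", a + b * log_area_ungaged)
```

Gages with long records pull the fitted line toward themselves, which is the intended effect of weighting by how well each site's streamflow characteristic is known.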

166

Directory of Open Access Journals (Sweden)

Full Text Available In this study, we propose a Leverage Based Near-Neighbor (LBNN) method where prior information on the structure of the heteroscedastic error is not required. In the proposed LBNN method, weights are determined not from the near-neighbor values of the explanatory variables, but from their corresponding leverage values so that it can be readily applied to a multiple regression model. Both the empirical and Monte Carlo simulation results show that the LBNN method offers substantial improvement over the existing methods. The LBNN has significantly reduced the standard errors of the estimates and also the standard errors of residuals for both simple and multiple linear regression models. Hence, the LBNN can be established as one reliable alternative approach to other existing methods that deal with heteroscedastic errors when the form of heteroscedasticity is unknown.

H. Midi

2009-01-01
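
The LBNN abstract above builds its weights from leverage values but does not give the weighting rule, so the sketch below shows only the leverage computation itself, for the simple-regression case where the diagonal of the hat matrix has the closed form h_i = 1/n + (x_i - mean)^2 / Sxx. The function name and data are hypothetical.

```python
def leverages(xs):
    """Diagonal of the hat matrix for a straight-line fit with intercept."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return [1.0 / n + (x - xbar) ** 2 / sxx for x in xs]

xs = [1.0, 2.0, 3.0, 4.0, 10.0]  # hypothetical predictor with one remote point
h = leverages(xs)
for x, hi in zip(xs, h):
    print(x, round(hi, 3))
```

The leverages always sum to the number of fitted parameters (two here), and the remote observation at x = 10 carries most of that total.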

167

Directory of Open Access Journals (Sweden)

Adequate dietary calcium intake in combination with maintaining daily physical activity, increasing educational level, decreasing birth rate, and duration of breast-feeding may contribute to healthy bones and play a role in practical prevention of osteoporosis in Southeast Anatolia. In addition, the findings of the present study indicate that the use of a multivariate statistical method such as multiple logistic regression in osteoporosis, which may be influenced by many variables, is better than univariate statistical evaluation.

Zeki Akkus

2005-09-01

168

Multiple-regression equations for estimating low flows at ungaged stream sites in Ohio

This report presents multiple-regression equations for estimating selected low-flow characteristics for most unregulated Ohio streams at sites where little or no discharge data are available. The equations relate combinations of drainage area, main-channel length, main-channel slope, average basin elevation, forested area, average annual precipitation, and an index of infiltration to low flows with durations of 7 and 30 days and average recurrence intervals of 2 and 10 years. Data from 132 long-term continuous-record gaging stations and partial-record sites in Ohio were used in the analyses. Multiple-regression analyses were first performed by using data from all 132 sites in an attempt to develop equations that would be applicable statewide. Standard errors for the statewide equations were too high (111 to 189 percent) for them to be of practical use in estimating low streamflows. Data for the state were then subdivided into five regions, and multiple-regression equations were developed for each region. Standard errors for four of the five regions improved, and ranged from 43 to 106 percent. Standard errors for region 5 remained high (74 to 129 percent). The multiple-regression equations presented in this report are not applicable to streams with significant low-flow regulation. The equations also are not applicable if (1) the site has been gaged and low-flow estimates have been developed from gaging-station records, (2) low flow can be estimated by the drainage-area transference method from data for a nearby gaged site, or (3) a sufficient number of partial-record measurements made at the site can be adequately correlated with concurrent base flows at a suitable index station.

Koltun, G.F.; Schwartz, R.R.

1987-01-01

169

Regression-based sib pair linkage analysis for binary traits.

The Haseman-Elston (HE) regression method offers a mathematically and computationally simpler alternative to variance-components (VC) models for the linkage analysis of quantitative traits. However, current versions of HE regression and VC models are not optimised for binary traits. Here, we present a modified HE regression and a liability-threshold VC model for binary-traits. The new HE method is based on the regression of a linear combination of the trait squares and the trait cross-product on the proportion of alleles identical by descent (IBD) at the putative locus, for sibling pairs. We have implemented both the new HE regression-based method and have performed analytic and simulation studies to assess its type 1 error rate and power under a range of conditions. These studies showed that the new HE method is well-behaved under the null hypothesis in large samples, is more powerful than both the original and the revisited HE methods, and is approximately equivalent in power to the liability-threshold VC model. PMID:12931051

Zeegers, Maurice P A; Rice, John P; Rijsdijk, Frühling V; Abecasis, Goncalo R; Sham, Pak C

2003-01-01

170

Least Squares Adjustment: Linear and Nonlinear Weighted Regression Analysis

DEFF Research Database (Denmark)

This note primarily describes the mathematics of least squares regression analysis as it is often used in geodesy including land surveying and satellite positioning applications. In these fields regression is often termed adjustment. The note also contains a couple of typical land surveying and satellite positioning application examples. In these application areas we are typically interested in the parameters in the model typically 2- or 3-D positions and not in predictive modelling which is often the main concern in other regression analysis applications. Adjustment is often used to obtain estimates of relevant parameters in an over-determined system of equations which may arise from deliberately carrying out more measurements than actually needed to determine the set of desired parameters. An example may be the determination of a geographical position based on information from a number of Global Navigation Satellite System (GNSS) satellites also known as space vehicles (SV). It takes at least four SVs to determine the position (and the clock error) of a GNSS receiver. Often more than four SVs are used and we use adjustment to obtain a better estimate of the geographical position (and the clock error) and to obtain estimates of the uncertainty with which the position is determined. Regression analysis is used in many other fields of application both in the natural, the technical and the social sciences. Examples may be curve fitting, calibration, establishing relationships between different variables in an experiment or in a survey, etc. Regression analysis is probably one the most used statistical techniques around. Dr. Anna B. O. Jensen provided insight and data for the Global Positioning System (GPS) example. Matlab code and sections that are considered as either traditional land surveying material or as advanced material are typeset with smaller fonts. Comments in general or on for example unavoidable typos, shortcomings and errors are most welcome.

Nielsen, Allan Aasbjerg

2007-01-01
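
The note described above is about adjusting over-determined measurements; a minimal sketch of that idea, not taken from the note itself, is given below. Two distances and their sum are each measured once, giving three observations for two unknowns, and the normal equations are solved directly. The measurements are hypothetical, all observations are assumed to have equal weight, and residuals are defined here as observed minus adjusted.

```python
def adjust_two_params(A, obs):
    """Least-squares adjustment for two unknowns via the normal equations.

    A: list of rows (a1, a2) of the design matrix; obs: list of observations.
    Returns ((x1, x2), residuals, variance_factor).
    """
    n11 = sum(r[0] * r[0] for r in A)
    n12 = sum(r[0] * r[1] for r in A)
    n22 = sum(r[1] * r[1] for r in A)
    c1 = sum(r[0] * l for r, l in zip(A, obs))
    c2 = sum(r[1] * l for r, l in zip(A, obs))
    det = n11 * n22 - n12 * n12
    x1 = (n22 * c1 - n12 * c2) / det
    x2 = (n11 * c2 - n12 * c1) / det
    residuals = [l - (r[0] * x1 + r[1] * x2) for r, l in zip(A, obs)]
    redundancy = len(obs) - 2  # observations minus unknowns
    variance_factor = sum(v * v for v in residuals) / redundancy
    return (x1, x2), residuals, variance_factor

# Hypothetical measurements in metres: distance a, distance b, and a + b.
A = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
obs = [10.0, 20.0, 30.3]
(x1, x2), v, s0_sq = adjust_two_params(A, obs)
print(x1, x2, v, s0_sq)
```

The 0.3 m misclosure between the separate distances and the measured sum is spread evenly over the three observations, and the variance factor gives a first estimate of measurement uncertainty from the one redundant observation.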

171

Directory of Open Access Journals (Sweden)

Full Text Available A random regression model for daily feed intake and a conventional multiple trait animal model for the four traits average daily gain on test (ADG), feed conversion ratio (FCR), carcass lean content and meat quality index were combined to analyse data from 1 449 castrated male Large White pigs performance tested in two French central testing stations in 1997. Group housed pigs fed ad libitum with electronic feed dispensers were tested from 35 to 100 kg live body weight. A quadratic polynomial in days on test was used as a regression function for weekly means of daily feed intake and to describe its residual variance. The same fixed (batch) and random (additive genetic, pen and individual permanent environmental) effects were used for regression coefficients of feed intake and single measured traits. Variance components were estimated by means of a Bayesian analysis using Gibbs sampling. Four Gibbs chains were run for 550 000 rounds each, from which 50 000 rounds were discarded as the burn-in period. Estimates of posterior means of covariance matrices were calculated from the remaining two million samples. Low heritabilities of linear and quadratic regression coefficients and their unfavourable genetic correlations with other performance traits reveal that altering the shape of the feed intake curve by direct or indirect selection is difficult.

Künzi Niklaus

2002-01-01

172

Multiple quantitative trait loci Haseman-Elston regression using all markers on the entire genome.

The Haseman-Elston (HE) regression, developed in the 1970s, remains in common use to detect genetic linkage between a quantitative trait and a genetic marker. Although the technique has been improved in a number of ways, it predicts a high rate of false positive quantitative trait locus (QTL) because it is based on a single-QTL model. We have extended the original HE regression to multi-QTL HE (MQHE) regression, so that all markers across the entire genome can be exploited simultaneously. The parameters have been estimated by the penalized maximum likelihood method, and several response variables for phenotypic difference have been compared in order to optimize the procedure. The method has been tested by simulation in a pedigree population of maize inbred lines of known ancestry. These simulations show that the trait product is the optimal response variable for phenotypic difference. The false positive rate produced by the MQHE regression is substantially lower than that generated by either variance component analysis or the original HE regression. The MQHE regression, with the trait product as the response variable, represents a significant improvement on existing methods for QTL mapping in a set of inbred lines (or cultivars) of known ancestry. PMID:18563308

Zhang, Yuan-Ming; Lü, Hai-Yan; Yao, Li-Li

2008-09-01

173

Early cost estimating for road construction projects using multiple regression techniques

Directory of Open Access Journals (Sweden)

Full Text Available The objective of this study is to develop early cost estimating models for road construction projects using multiple regression techniques, based on 131 sets of data collected in the West Bank in Palestine. As the cost estimates are required at early stages of a project, consideration was given to the fact that the input data for the required regression model should be easily extracted from sketches or the scope definition of the project. Eleven regression models are developed to estimate the total cost of a road construction project in US dollars; 5 of them include bid quantities as input variables and 6 include road length and road width. The coefficient of determination r² for the developed models ranges from 0.92 to 0.98, which indicates that the values predicted by the models fit the real-life data well. The mean absolute percentage error (MAPE) of the developed regression models ranges from 13% to 31%; these results compare favorably with past research, which has shown that estimate accuracy in the early stages of a project is between ±25% and ±50%.

Ibrahim Mahamid

2011-12-01
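
The two accuracy measures quoted in the cost-estimating abstract above can be computed as in the following sketch, assuming the usual definitions: the coefficient of determination as one minus the ratio of residual to total sum of squares, and MAPE as the mean of absolute relative errors expressed in percent. The cost figures are hypothetical and unrelated to the study's data.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical project costs in thousand US dollars.
actual = [100.0, 200.0, 400.0, 800.0]
predicted = [110.0, 180.0, 440.0, 720.0]
print("r squared:", r_squared(actual, predicted))
print("MAPE (%):", mape(actual, predicted))
```

In this example every estimate is off by 10 percent, yet the coefficient of determination is still about 0.97, which shows why the two measures are reported side by side.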

174

Regression analysis of milk production traits in simmental cows

Digital Repository Infrastructure Vision for European Research (DRIVER)

The relationship between milk production traits over whole lactations was evaluated across three generations of Simmental cows, i.e. between daughters, dams and grand dams, by a phenotypic regression analysis with whole lactation traits in the daughter generation being used as the dependent variables (x1), and those in the dam and grand dam generations being used as the independent variables (x2 and x3). The results were obtained from a sample of 1170 daugh...

Petrović M.D.; Bogdanović V.; Petrović M.M.; Rakonjac S.

2011-01-01

175

An analysis of the least median of squares regression problem

Digital Repository Infrastructure Vision for European Research (DRIVER)

The optimization problem that arises out of the least median of squared residuals method in linear regression is analyzed. To simplify the analysis, the problem is replaced by an equivalent one of minimizing the median of absolute residuals. A useful representation of the last problem is given to examine properties of the objective function and estimate the number of its local minima. It is shown that the exact number of local minima is equal to $ {p+\lfloor (n-1)/2 \rfloor ...

Krivulin, Nikolai

2012-01-01
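
To make the objective in the abstract above concrete, the sketch below evaluates the median of squared residuals for every line passing through two data points and keeps the best one. This pairwise search is a common heuristic and only approximates the least median of squares fit; it is not the analysis from the paper, and the function name and data are hypothetical.

```python
from itertools import combinations
from statistics import median

def lms_line(points):
    """Approximate least-median-of-squares line by searching lines through point pairs.

    Returns (criterion, intercept, slope), where criterion is the median
    squared residual of the selected line.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        crit = median((y - (intercept + slope * x)) ** 2 for x, y in points)
        if best is None or crit < best[0]:
            best = (crit, intercept, slope)
    return best

# Seven points on y = 1 + 2x plus three gross outliers (hypothetical data).
points = [(x, 1 + 2 * x) for x in range(7)] + [(7, 0), (8, -5), (9, 40)]
crit, intercept, slope = lms_line(points)
print(intercept, slope, crit)
```

Because more than half of the points lie exactly on one line, the median squared residual of that line is zero and the three outliers have no influence, which is the robustness property that motivates the method. The many candidate lines with different criterion values also hint at why the objective has numerous local minima.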

176

Poisson Regression Analysis of Illness and Injury Surveillance Data

Energy Technology Data Exchange (ETDEWEB)

The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational database, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra-Poisson variation. The R open source software environment for statistical computing and graphics is used for analysis. Additional details about R and the data that were used in this report are provided in an Appendix. Information on how to obtain R and utility functions that can be used to duplicate results in this report is provided.

Frome E.L., Watkins J.P., Ellis E.D.

2012-12-12
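
The over-dispersion check mentioned in the report above can be sketched in a simplified setting. The report uses R and a log-linear main effects model; the Python sketch below instead assumes a single grouping factor, for which the fitted Poisson rate in each group is simply total events divided by total person-time. It then forms the Pearson statistic and divides by the residual degrees of freedom, the usual method-of-moments estimate of the dispersion. Group labels and counts are hypothetical.

```python
def pearson_dispersion(groups):
    """Method-of-moments dispersion estimate for a one-factor Poisson rate model.

    groups: dict mapping a group label to a list of (events, person_time) cells.
    Each group is assumed to contain at least one event.
    Returns (dispersion, pearson_chi2, degrees_of_freedom).
    """
    chi2 = 0.0
    n_cells = 0
    for cells in groups.values():
        rate = sum(y for y, _ in cells) / sum(t for _, t in cells)  # Poisson MLE
        for y, t in cells:
            mu = rate * t  # fitted count for this cell
            chi2 += (y - mu) ** 2 / mu
            n_cells += 1
    dof = n_cells - len(groups)  # cells minus one fitted rate per group
    return chi2 / dof, chi2, dof

# Hypothetical absence counts and person-years for two occupational groups.
groups = {
    "group_A": [(4, 100.0), (6, 100.0), (5, 100.0)],
    "group_B": [(10, 100.0), (30, 100.0)],
}
phi, chi2, dof = pearson_dispersion(groups)
print("dispersion estimate:", phi)
```

A value near 1 is what a Poisson model predicts; here the estimate is well above 1, driven by the second group, and in a quasi-likelihood analysis the standard errors would be inflated by the square root of this factor.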

177

The use of weighted multiple linear regression to estimate QTL-by-QTL epistatic effects

Directory of Open Access Journals (Sweden)

Full Text Available Knowledge of the nature and magnitude of gene effects, as well as their contribution to the control of metric traits, is important in formulating efficient breeding programs for the improvement of plant genetics. Information concerning a genetic parameter such as the additive-by-additive epistatic effect can be useful in traditional breeding. This report describes the results obtained by applying weighted multiple linear regression to estimate the parameter connected with an additive-by-additive epistatic interaction. Three weight variants were used: (1) standard weights based on estimated variances, (2) different weights for minimal, maximal and other lines, and (3) different weights for extreme and other lines. The approach described here combines two methods of estimation, one based on phenotypic observations and the other using molecular marker data. The comparison was done using Monte Carlo simulations. The results show that the application of weighted regression to the marker data yielded estimates similar to those obtained by phenotypic methods.

Jan Bocianowski

2012-01-01

178

Accurate projections of stratospheric ozone are required, because ozone changes impact on exposures to ultraviolet radiation and on tropospheric climate. Unweighted multi-model ensemble mean (uMMM) projections from chemistry-climate models (CCMs) are commonly used to project ozone in the 21st century, when ozone-depleting substances are expected to decline and greenhouse gases expected to rise. Here, we address the question whether Antarctic total column ozone projections in October given by the uMMM of CCM simulations can be improved by using a process-oriented multiple diagnostic ensemble regression (MDER) method. This method is based on the correlation between simulated future ozone and selected key processes relevant for stratospheric ozone under present-day conditions. The regression model is built using an algorithm that selects those process-oriented diagnostics which explain a significant fraction of the spread in the projected ozone among the CCMs. The regression model with observed diagnostics is then used to predict future ozone and associated uncertainty. The precision of our method is tested in a pseudo-reality, i.e. the prediction is validated against an independent CCM projection used to replace unavailable future observations. The test shows that MDER has a higher precision than uMMM, suggesting an improvement in the estimate of future Antarctic ozone. Our method projects that Antarctic total ozone will return to 1980 values around 2060 with the 95% confidence interval ranging from 2040 to 2080. This reduces the range of return dates across the ensemble of CCMs by more than a decade and suggests that the earliest simulated return dates are unlikely. Karpechko, Maraun and Eyring (2013) Improving Antarctic Total Ozone Projections by a Process-Oriented Multiple Diagnostic Ensemble Regression, J. Atmos. Sci. 70: 3959-3976

Karpechko, Aleyey; Maraun, Douglas; Eyring, Veronika

2014-05-01
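
The core idea of regressing projected ozone on a present-day diagnostic across models, then evaluating the fit at the observed diagnostic, can be sketched in a few lines. This is a deliberately simplified, single-diagnostic version; the actual MDER method selects several diagnostics with a stepwise algorithm and also estimates uncertainty. All names and numbers below are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def constrained_projection(model_diagnostic, model_future, observed_diagnostic):
    """Evaluate the across-model regression at the observed diagnostic value."""
    intercept, slope = fit_line(model_diagnostic, model_future)
    return intercept + slope * observed_diagnostic


# Hypothetical ensemble: one present-day diagnostic and one future value per model.
diagnostic = [1.0, 2.0, 3.0, 4.0]
future_ozone = [3.0, 5.0, 7.0, 9.0]
unweighted_mean = sum(future_ozone) / len(future_ozone)          # uMMM-style estimate
constrained = constrained_projection(diagnostic, future_ozone, 3.5)
```

When the observed diagnostic sits away from the ensemble mean of the diagnostic, the regression-based estimate moves away from the unweighted mean accordingly.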

179

Generalized Constrained Multiple Correspondence Analysis.

Proposes a comprehensive approach, generalized constrained multiple correspondence analysis, for imposing both row and column constraints on multivariate discrete data. Each set of discrete data is decomposed into several submatrices and then multiple correspondence analysis is applied to explore relationships among the decomposed submatrices.…

Hwang, Heungsun; Takane, Yoshio

2002-01-01

180

Bayesian residual analysis for beta-binomial regression models

The beta-binomial regression model is an alternative model for the sum of any sequence of equicorrelated binary variables with common probability of success p. In this work a Bayesian perspective of this model is presented, considering different link functions and different correlation structures. A general Bayesian residual analysis for this model, an issue which is often neglected in Bayesian analysis, is presented in order to check the model assumptions. It uses the residuals based on the predicted values obtained by the conditional predictive ordinate [1], the residuals based on the posterior distribution of the model parameters [2] and the Bayesian deviance residual [3].

Pires, Rubiane Maria; Diniz, Carlos Alberto Ribeiro

2012-10-01

181

Regression analysis of failure time data with informative interval censoring.

Interval censoring arises when a subject misses prescheduled visits at which the failure is to be assessed. Most existing approaches for analysing interval-censored failure time data assume that the censoring mechanism is independent of the true failure time. However, there are situations where this assumption may not hold. In this paper, we consider such a situation in which the dependence structure between the censoring variables and the failure time can be modelled through some latent variables and a method for regression analysis of failure time data is proposed. The method makes use of the proportional hazards frailty model and an EM algorithm is presented for estimation. Finite sample properties of the proposed estimators of regression parameters are examined through simulation studies and we illustrate the method with data from an AIDS study. PMID:17072823

Zhang, Zhigang; Sun, Liuquan; Sun, Jianguo; Finkelstein, Dianne M

2007-05-30

182

Morphometric analysis of breast carcinoma regression after radio- and thermoradiotherapy

International Nuclear Information System (INIS)

A study was made of 50 infiltrating breast carcinomas, stages 2-3, treated with a combined method with or without local UHF-hyperthermia. The objects of investigation were cytological and histological specimens and histotopographic tumor slices. Methods of mathematical analysis were employed to study correlations between tumor tissue regression and parameters of parenchymal differentiation in cytological specimens and the area of tumor nodes. Tumor tissue regression and differentiation of cell elements in cytological specimens were correlated in the course of combined treatment based on radiotherapy. An insignificant correlation was found between the volumetric density of the tumor parenchyma preserved after combined treatment with UHF-hyperthermia and parameters of cell differentiation in cytological specimens. A significant positive correlation was found between the area of tumor nodes and the area of necrotic foci developing in them after thermoradiotherapy.

183

Variable selection in multiple linear regression: The influence of individual cases

Directory of Open Access Journals (Sweden)

Full Text Available The influence of individual cases in a data set is studied when variable selection is applied in multiple linear regression. Two different influence measures, based on the C_p criterion and Akaike's information criterion, are introduced. The relative change in the selection criterion when an individual case is omitted is proposed as the selection influence of the specific omitted case. Four standard examples from the literature are considered and the selection influence of the cases is calculated. It is argued that the selection procedure may be improved by taking the selection influence of individual data cases into account.

SJ Steel

2007-12-01

184

Energy Technology Data Exchange (ETDEWEB)

Multiple linear regression analysis has been used to study bidding and production data for Federal offshore oil and gas leases. Policy value conclusions have been stated therefrom. We, firstly, address the applicability of the inherent assumptions of normality and homoscedasticity finding the assumptions unsupported and questioning the statistical inferences which could otherwise be drawn. Secondly, even given the legitimacy of the assumptions and the usual statistical inferences from multiple linear regression results, we show the conclusions are volatilely sensitive. We are led to a strong assertion that quantitative assessment of the assumptions of normality and homoscedasticity be a mandatory requirement for the proper understanding and use, if indeed any is possible, of multiple linear regression analysis results for drawing policy value conclusions from data with the statistical behavior of Federal offshore oil and gas lease data.

Berger, P.D.; Lohrenz, J.

1980-06-01

185

Modern geochemical data sets typically have around 20-30 compositional variables measured on some tens or hundreds of samples. A statistical analysis of data sets with so many variables should take as a priority the reduction of dimensionality of the model, in order to increase its reliability and enhance its interpretation. In the framework of compositional data analysis with multiple regression, such simplification can be achieved by taking some geometric concepts into account. First, the sample space of compositions, the simplex, is given a Euclidean space structure by the compositional operations of perturbation, powering and Aitchison inner product. Then, given some qualitative information on which subcompositions might depend on each explanatory variable, one can decompose the simplex into a set of orthogonal subspaces, in such a way that the composition projected onto each subspace is independent of a subset of the explanatory variables. This is achieved with a series of singular value decomposition computations. The method is applied to a data set of 88 observations of six major oxides in molar proportions, from modern glacial and fluvio-glacial sediments, with grain size ranging from coarse sand to clay. The goal is to assess the influence of chemical weathering processes (expected to impose a linear relation of composition and grain size) against purely physical processes (expected to show step-wise functions following the largest characteristic crystal sizes of specific minerals in the source rock). We exhaustively explore all patterns of uncorrelation of the composition with three explanatory variables: grain size in φ scale, and two step functions for the silt and clay domains. The best pattern, chosen with a likelihood ratio test, has only a smooth trend of (Mg,Fe) vs. (Al,K,Ca+Na) enrichment towards finer grain sizes (explained as differential mechanical behaviour of phyllosilicates vs. feldspar) and coefficients for the two step functions related to the sharp decrease of quartz in silt fractions, and the sudden enrichment of mafic accessory minerals, alteration products and mechanically unstable phyllosilicates in the clay fraction. We could thus be confident that weathering is almost absent in this data set.

Tolosana-Delgado, R.; von Eynatten, H.

2010-05-01
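
The three compositional operations named in the abstract have standard textbook definitions, which the sketch below implements via the centred log-ratio (clr) transform. It is only an illustration of the simplex geometry, not of the paper's subspace decomposition; the function names and example compositions are assumptions.

```python
import math


def closure(x):
    """Rescale positive parts so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]


def perturb(x, y):
    """Perturbation: component-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, y)])


def power(alpha, x):
    """Powering: component-wise power followed by closure."""
    return closure([v ** alpha for v in x])


def clr(x):
    """Centred log-ratio transform: log of each part over the geometric mean."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_inner(x, y):
    """Aitchison inner product, computed as the ordinary dot product of clr vectors."""
    return sum(a * b for a, b in zip(clr(x), clr(y)))


# Hypothetical three-part compositions.
x = [0.6, 0.3, 0.1]
y = [0.2, 0.5, 0.3]
neutral = perturb(x, power(-1.0, x))  # perturbing x by its inverse gives the uniform composition
```

Perturbation plays the role of vector addition and powering the role of scalar multiplication, which is why the inner product is linear in the powering exponent.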

186

Directory of Open Access Journals (Sweden)

Full Text Available Software estimation techniques present an inclusive set of directives for software project developers, project managers and management in order to produce more accurate estimates or predictions for future developments. The estimates also facilitate the allocation of resources for software development. Estimates also smooth the process of re-planning, prioritizing, classifying and reusing projects. Various estimation models are widely used in industry as well as for research purposes. Several comparative studies have been carried out on them, but choosing the best technique is quite intricate. Estimation by Analogy (EbA) is the method of making estimates based on the outcomes of the k most analogous projects. The projects close in distance are potentially similar to the reference project from the repository of projects. This method has been widely accepted and is quite popular, as it imitates the inherent human skill of judging by analogy with similar projects. In this paper, Grey Relational Analysis (GRA) is used as the method for feature selection and also for locating the closest analogous projects to the reference project from the set of projects. The closest k projects are then used to build regression models. Regression techniques such as multiple linear regression, stepwise regression and robust regression are used to estimate the effort from the closest projects.

Arvinder Kaur

2012-01-01
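
The analogy-retrieval step can be sketched with the usual grey relational coefficient, (Δmin + ζΔmax) / (Δ + ζΔmax), averaged over features to give a grade per project. The sketch assumes features already normalised to a common scale and the conventional distinguishing coefficient ζ = 0.5; the function names and sample projects are hypothetical, and the regression step on the retrieved projects is not shown.

```python
def grey_relational_grades(reference, candidates, zeta=0.5):
    """Grey relational grade of each candidate project with respect to the reference."""
    deltas = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every candidate is identical to the reference.
        return [1.0] * len(candidates)
    return [
        sum((d_min + zeta * d_max) / (d + zeta * d_max) for d in row) / len(row)
        for row in deltas
    ]


def closest_projects(reference, candidates, k, zeta=0.5):
    """Indices of the k candidates with the highest grey relational grade."""
    grades = grey_relational_grades(reference, candidates, zeta)
    order = sorted(range(len(candidates)), key=lambda i: grades[i], reverse=True)
    return order[:k]


# Hypothetical normalised feature vectors (e.g. size, team experience, complexity).
reference = [0.5, 0.2, 0.9]
repository = [
    [0.1, 0.8, 0.3],
    [0.5, 0.2, 0.9],
    [0.6, 0.25, 0.8],
]
grades = grey_relational_grades(reference, repository)
analogues = closest_projects(reference, repository, k=2)
```

The efforts of the projects in `analogues` would then serve as training data for the regression models mentioned in the abstract.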

187

Analysis of Impacted Classes and Regression Test Suite Generation

Directory of Open Access Journals (Sweden)

Full Text Available Software needs to be changed over time to deal with new requirements, existing faults and change requests. Changes made to software will inevitably have some unforeseen and undesirable effects on other parts of the software. Software Change Impact Analysis (SCIA) is an approach used to identify the potential effects caused by changes made to software. When a change is requested by the client or user, the software project team aims not only to incorporate that change into the existing system but also to maintain software quality. The paper proposes an approach to find the impact set of the change requested by the user or client. The authors use the impact set of the requested change to prepare the test suite for regression testing. The results of the proposed approach are illustrated with a case study. The approach derives the regression test suite from the impact set, as a subset of the existing test suite of the system.

Aprna Tripathi

2013-03-01

188

International Nuclear Information System (INIS)

Highlights: • We obtained models for estimation of the cetane number of biodiesel. • Twenty-four neural networks using two topologies were evaluated. • The best neural network to predict the cetane number was selected. • The best accuracy was obtained for the selected neural network. Abstract: Models for estimating the cetane number of biodiesel from its fatty acid methyl ester composition using multiple linear regression and artificial neural networks were obtained in this work. To obtain the models, experimental data from literature reports covering 48 and 15 biodiesels were used in the modeling-training step and the validation step, respectively. Twenty-four neural networks using two topologies and different algorithms for the second training step were evaluated. The model obtained using multiple regression was compared with two other models from the literature and was able to predict the cetane number with 89% accuracy, with one outlier. A model to predict the cetane number using an artificial neural network was obtained with accuracy better than 92%, except for one outlier. The best neural network to predict the cetane number was a backpropagation network (11:5:1) using the Levenberg–Marquardt algorithm for the second step of network training, showing R = 0.9544 for the validation data.

189

Multiple Regression Models up to First-order Interaction on Hydrochemistry Properties

Directory of Open Access Journals (Sweden)

Full Text Available This study illustrates the procedure for selecting the best model for estimating Electrical Conductivity (EC) levels based on hydrochemistry properties and natural influencing factors using multiple regression. Six independent variables and two dummy variables were considered in this data set. The Multiple Regression (MR) models included terms up to first-order interaction, giving 57 possible models. This study extends prior research, which generated 63 possible models using the same technique but with no interaction between the independent variables. The process of obtaining the best model from the total of 120 possible models is illustrated. Backward elimination of the variable with the highest p-value was employed to obtain the selected model. The best model includes a combination of single and first-order interaction terms (Li, Mg, Na-SO4, Na-Li, Na-Mg and SO4-Mg). The best model obtained was then verified with the Mean Absolute Percentage Error (MAPE) to measure the model's relative overall fit.

Noraini Abdullah

2012-01-01
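
The verification measure mentioned in the abstract, the Mean Absolute Percentage Error, has a simple standard definition, sketched below. The function name and sample values are illustrative only and do not come from the study.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent. Actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical observed and fitted EC values.
observed = [100.0, 200.0]
fitted = [110.0, 180.0]
error_percent = mape(observed, fitted)  # both points are off by 10%
```

Because each error is scaled by the observed value, MAPE is undefined when an observation is zero and weights small observations heavily.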

190

Estimation of Neutronic Performance in a Hybrid Reactor with Regression Analysis

This study presents a regression analysis method used for prediction and investigation of neutronic performance in a hybrid reactor using UO2 fuel and Flibe (Li2BeF4) coolant. The 235U fraction is increased gradually from 0 to 4% in steps of 1%, and the 6Li fraction within the Flibe coolant is enriched gradually from 7.5% to 30, 60 and 90%. Relations between 235U fuel fraction and lithium (6Li) enrichment are investigated for the estimation of neutronic performance, namely the tritium breeding ratio (TBR), energy multiplication factor (M), total fission rate (Σf), 238U(n,γ) reaction and fissile fuel breeding (FFB) in the hybrid reactor. Regression analysis is performed on the results obtained with the code (XSDRNPM/SCALE5) for TBR, M, Σf, 238U(n,γ) and FFB. The results of the regression analysis and the values obtained with the code (XSDRNPM/SCALE5) are compared with respect to the TBR, M, Σf, 238U(n,γ) and FFB of the reactor. The values calculated from the formulations obtained by regression analysis are found to be in good agreement with the results obtained with the code (XSDRNPM/SCALE5). It is observed that the equations derived from regression analysis provide an accurate computation of the neutronic performance, so these equations could be used for the prediction of TBR, M, Σf, 238U(n,γ) and FFB. In addition, a correlation matrix is calculated to determine the degree of relationship between the variables TBR, M, Σf, 238U(n,γ) and FFB.

Acır, Adem; Alakoç, Nilüfer Pekin; Yıldız, Kadir

2009-12-01

191

Multivariate study and regression analysis of gluten-free granola

Scientific Electronic Library Online (English)

Full Text Available SciELO Brazil | Language: English Abstract in English This study developed a gluten-free granola and evaluated it during storage with the application of multivariate and regression analysis of the sensory and instrumental parameters. The physicochemical, sensory, and nutritional characteristics of a product containing quinoa, amaranth and linseed were [...] evaluated. The crude protein and lipid contents were 97.49 and 122.72 g kg-1 of food, respectively. The polyunsaturated/saturated and n-6:n-3 fatty acid ratios were 2.82 and 2.59:1, respectively. Granola had the best alpha-linolenic acid content, nutritional indices in the lipid fraction, and mineral content. There were good hygienic and sanitary conditions during storage, probably due to the low water activity of the formulation, which helped inhibit microbial growth. The sensory attributes ranged from 'like very much' to 'like slightly', and the regression models were highly fitted and correlated during the storage period. A reduction in the sensory attribute levels and in the physical stability of the product was verified by principal component analysis. The use of the affective acceptance test and instrumental analysis combined with statistical methods allowed us to obtain promising results about the characteristics of gluten-free granola.

Lilian Maria, Pagamunici; Aloisio Henrique Pereira de, Souza; Aline Kirie, Gohara; Alline Aparecida Freitas, Silvestre; Jesuí Vergílio, Visentainer; Nilson Evelázio de, Souza; Sandra Terezinha Marques, Gomes; Makoto, Matsushita.

2014-03-01

193

Isolated Area Load Forecasting using Linear Regression Analysis: Practical Approach

Directory of Open Access Journals (Sweden)

Full Text Available This paper presents an analysis to forecast the loads of an isolated area where the load history is not available or may not represent the realistic demand for electricity. The analysis is done through linear regression and is based on identifying the factors on which electrical load growth depends. To determine these factors, areas are selected whose load growth rate histories are known and whose load growth deciding factors are similar to those of the isolated area. The proposed analysis is applied to an isolated area of Bangladesh called Swandip, where no past history of electrical load demand is available and there is no possibility of connecting the area to the mainland grid system.

M. A. Mahmud

2011-09-01

194

Yearly Streamflow Discharge Analysis Using Functional Regression Models

Earlier spring runoff from snow melt in western North America has been suggested from analysis of both river discharge and snowpack data. This work takes a different approach to detecting evidence of earlier spring onset using a new semi-metric based on yearly streamflow discharge records. New methods of time series analysis for functional data (Ramsay and Silverman, 2005) are presented to analyze the inverse yearly cumulative discharge functions. An algorithm is developed for estimation of a functional regression model that incorporates autocorrelated errors. A framework for choosing the model structure is provided using a functional extension of a model selection criterion. Further, a diagnostic for assessing autocorrelation in the errors is provided. Results based on the analysis of streamflow records for Water Years 1951-2005 from the South Fork of the Boise River are used to illustrate the new techniques.

Greenwood, M. C.; Harper, J. T.; Moore, J. N.

2007-12-01
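
One plausible reading of the "inverse yearly cumulative discharge function" is the day of the water year by which a given fraction of the annual discharge has passed. The sketch below computes that day from daily flows; the definition, function name, and toy series are assumptions for illustration and are not taken from the paper, which works with functional data methods.

```python
def day_of_fraction(daily_flow, p):
    """First day (1-based) on which cumulative flow reaches fraction p of the yearly total."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    total = sum(daily_flow)
    running = 0
    for day, q in enumerate(daily_flow, start=1):
        running += q
        if running >= p * total:
            return day
    return len(daily_flow)  # guards against floating-point shortfall at p = 1


# Hypothetical five-day "years": the second has its peak flow earlier.
late_peak_year = [1, 1, 2, 4, 2]
early_peak_year = [2, 4, 2, 1, 1]
late_half = day_of_fraction(late_peak_year, 0.5)
early_half = day_of_fraction(early_peak_year, 0.5)
```

Under this reading, earlier spring runoff shows up as a smaller day for the same fraction when years are compared.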

195

International Nuclear Information System (INIS)

Objective: To analyze the correlations between liver lipid level determined by liver 3.0 T 1H-MRS in vivo and influencing factors using multiple linear stepwise regression. Methods: The prospective study of liver 1H-MRS was performed with a 3.0 T system and eight-channel torso phased-array coils using a PRESS sequence. Forty-four volunteers were enrolled in this study. Liver spectra were collected with a TR of 1500 ms, TE of 30 ms, volume of interest of 2 cm×2 cm×2 cm, and NSA of 64. The acquired raw proton MRS data were processed using the software program SAGE. For each MRS measurement, using water as the internal reference, the amplitude of the lipid signal was normalized to the sum of the signal from lipid and water to obtain the percentage of lipid within the liver. Descriptive statistics of height, weight, age, BMI, line width and water suppression were recorded, and Pearson analysis was applied to test their relationships. Multiple linear stepwise regression was used to build the statistical model for the prediction of liver lipid content. Results: Age (39.1±12.6) years, body weight (64.4±10.4) kg, BMI (23.3±3.1) kg/m2, line width (18.9±4.4) and water suppression (90.7±6.5)% had significant correlations with liver lipid content (0.00 to 0.96%, median 0.02%); the r values were 0.11, 0.44, 0.40, 0.52 and -0.73, respectively (P<0.05). However, only age, BMI, line width and water suppression entered the multiple linear regression equation. The liver lipid content prediction equation was as follows: Y = 1.395 - (0.021×water suppression) + (0.022×BMI) + (0.014×line width) - (0.004×age); the coefficient of determination was 0.613 and the corrected coefficient of determination was 0.59. Conclusion: The regression model fitted well, since the variables age, BMI, line width and water suppression can explain about 60% of liver lipid content changes. (authors)
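
The reported prediction equation can be evaluated directly. The sketch below plugs in the sample means quoted in the abstract purely as a worked example; the function name and the choice of inputs are illustrative, and units follow the abstract (water suppression in %, BMI in kg/m2, age in years, result in % lipid).

```python
def predict_liver_lipid(water_suppression, bmi, line_width, age):
    """Liver lipid content (%) from the regression equation reported in the abstract."""
    return (1.395
            - 0.021 * water_suppression
            + 0.022 * bmi
            + 0.014 * line_width
            - 0.004 * age)


# Worked example at the reported sample means (illustrative inputs only).
at_means = predict_liver_lipid(water_suppression=90.7, bmi=23.3,
                               line_width=18.9, age=39.1)
# 1.395 - 1.9047 + 0.5126 + 0.2646 - 0.1564 = 0.1111
```

The negative coefficient on water suppression is consistent with the reported correlation of -0.73 between water suppression and lipid content.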

196

Bias due to two-stage residual-outcome regression analysis in genetic association studies.

Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual- or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, first, a residual-outcome is calculated from a regression of the outcome variable on covariates and then the relationship between the adjusted-outcome and the SNP is evaluated by a simple linear regression of the adjusted-outcome on the SNP. In this article, we examine the performance of this two-stage analysis as compared with multiple linear regression (MLR) analysis. Our findings show that when a SNP and a covariate are correlated, the two-stage approach results in a biased genotypic effect and loss of power. Bias is always toward the null and increases with the squared correlation between the SNP and the covariate (r²). For example, for r² = 0, 0.1, and 0.5, two-stage analysis results in, respectively, 0, 10, and 50% attenuation in the SNP effect. As expected, MLR was always unbiased. Since individual SNPs often show little or no correlation with covariates, a two-stage analysis is expected to perform as well as MLR in many genetic studies; however, it produces considerably different results from MLR and may lead to incorrect conclusions when independent variables are highly correlated. While a useful alternative to MLR when the SNP and the covariates are uncorrelated, the two-stage approach has serious limitations. Its use as a simple substitute for MLR should be avoided. PMID:21769934

Demissie, Serkalem; Cupples, L Adrienne

2011-11-01
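
The attenuation described in the abstract can be reproduced with a small simulation. For one covariate, the two-stage slope equals the MLR slope multiplied by (1 - r²), where r² is the sample squared correlation between the SNP-like predictor and the covariate; this follows from the least-squares algebra. The variable names, effect sizes, and the continuous stand-in for a genotype are assumptions made for illustration.

```python
import math
import random


def cross(u, v):
    """Centred cross-product sum of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def simple_fit(u, v):
    """Least-squares fit of v = intercept + slope * u."""
    slope = cross(u, v) / cross(u, u)
    intercept = sum(v) / len(v) - slope * sum(u) / len(u)
    return intercept, slope


def compare_estimators(x, c, y):
    """Return (MLR slope for x, two-stage slope for x, squared correlation of x and c)."""
    sxx, scc, sxc = cross(x, x), cross(c, c), cross(x, c)
    sxy, scy = cross(x, y), cross(c, y)
    b_mlr = (sxy * scc - sxc * scy) / (sxx * scc - sxc ** 2)
    # Stage 1: residual outcome after regressing y on the covariate.
    a1, b1 = simple_fit(c, y)
    residual = [yi - (a1 + b1 * ci) for yi, ci in zip(y, c)]
    # Stage 2: simple regression of the residual outcome on x.
    _, b_two = simple_fit(x, residual)
    return b_mlr, b_two, sxc ** 2 / (sxx * scc)


random.seed(1)
n, beta, gamma, r = 5000, 1.0, 2.0, math.sqrt(0.5)   # hypothetical settings
c = [random.gauss(0, 1) for _ in range(n)]
x = [r * ci + math.sqrt(1 - r ** 2) * random.gauss(0, 1) for ci in c]
y = [beta * xi + gamma * ci + random.gauss(0, 1) for xi, ci in zip(x, c)]
b_mlr, b_two, r2 = compare_estimators(x, c, y)
```

With a population r² of 0.5, the two-stage estimate lands near half of the true effect while the MLR estimate stays near the true value, in line with the 50% attenuation quoted above.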

197

International Nuclear Information System (INIS)

Parameters derived from computer analysis of digital radio-frequency (rf) ultrasound scan data of untreated uveal malignant melanomas were examined for correlations with tumor regression following cobalt-60 plaque. Parameters included tumor height, normalized power spectrum and acoustic tissue type (ATT). Acoustic tissue type was based upon discriminant analysis of tumor power spectra, with spectra of tumors of known pathology serving as a model. Results showed ATT to be correlated with tumor regression during the first 18 months following treatment. Tumors with ATT associated with spindle cell malignant melanoma showed over twice the percentage reduction in height as those with ATT associated with mixed/epithelioid melanomas. Pre-treatment height was only weakly correlated with regression. Additionally, significant spectral changes were observed following treatment. Ultrasonic spectrum analysis thus provides a noninvasive tool for classification, prediction and monitoring of tumor response to cobalt-60 plaque

198

Spontaneous Regression of a Large Hepatocellular Carcinoma with Multiple Lung Metastases

A 75-year-old Japanese man with chronic hepatitis C was found to have a large liver tumor and multiple nodules in the bilateral lungs. We diagnosed the tumor as hepatocellular carcinoma (HCC) with multiple lung metastases based on imaging studies and high titers of HCC tumor markers. Remarkably, without any anticancer treatment or medication, including herbal preparations, the liver tumor decreased in size, and the tumor markers diminished. Moreover, after 1 year, the multiple nodules in the bilateral lungs had disappeared. Fifteen months after the first medical examination, transcatheter arterial chemoembolization (TACE) was performed for the residual HCC. Because local relapse was observed on follow-up computed tomography, a second TACE was performed 13 months after the first one. At 4 years after the second TACE (7 years after the initial medical examination), there was no recurrence of primary or metastatic lesions. Spontaneous regression of HCC is very rare, and its mechanism remains unclear. Understanding the underlying mechanism of this rare phenomenon may offer some hope of finding new therapies, even in advanced metastatic cases. PMID:25228980

Saito, Tamiko; Naito, Masafumi; Matsumura, Yuki; Kita, Hisaaki; Kanno, Tomoyo; Nakada, Yuki; Hamano, Mina; Chiba, Miho; Maeda, Kosaku; Michida, Tomoki; Ito, Toshifumi

2014-01-01

199

Scientific Electronic Library Online (English)

Full Text Available SciELO Brazil | Language: Portuguese This study aimed to estimate multiple linear regression equations with, as explanatory variables, the other traits evaluated in maize experiments and, as response variables, the least significant difference as a percentage of the mean (LSD%) and the mean square of the error (MSe) for grain yield. A total of 610 experiments from the National Competition of Maize Cultivars trial network were used, conducted between 1986 and 1996 (522 experiments) and in 1997 (88 experiments). Two regression equations were estimated with the 522 experiments and validated, using the remaining 88, by simple regression analysis between the actual values and those estimated by the equations. For LSD%, the equation did not estimate the same value as the original formula, whereas for MSe the equation could be used for estimation. The Lilliefors test showed that the MSe values followed the standard normal distribution, and a classification table of MSe values was built from the values observed in the analysis of variance of the experiments and those estimated by the regression equation.

Alessandro Dal’Col, Lúcio; David Ariovaldo, Banzatto; Lindolfo, Storck; Thomas Newton, Martin; Leandro Homrich, Lorentz.

2001-12-01

200

Directory of Open Access Journals (Sweden)

Full Text Available This study aimed to estimate multiple linear regression equations with, as explanatory variables, the other traits evaluated in maize experiments and, as response variables, the least significant difference as a percentage of the mean (LSD%) and the mean square of the error (MSe) for grain yield. A total of 610 experiments from the National Competition of Maize Cultivars trial network were used, conducted between 1986 and 1996 (522 experiments) and in 1997 (88 experiments). Two regression equations were estimated with the 522 experiments and validated, using the remaining 88, by simple regression analysis between the actual values and those estimated by the equations. For LSD%, the equation did not estimate the same value as the original formula, whereas for MSe the equation could be used for estimation. The Lilliefors test showed that the MSe values followed the standard normal distribution, and a classification table of MSe values was built from the values observed in the analysis of variance of the experiments and those estimated by the regression equation.

Alessandro Dal’Col Lúcio

2001-12-01

201

The gross calorific value (GCV) and the proximate, ultimate and chemical analyses of debarked wood in Portugal were studied for future use in the wood pellet industry, and the results were compared with CEN/TS 14961. The relationships between GCV and the ultimate and chemical analyses were determined by backward stepwise multiple regression. The overall hardwood-softwood comparison did not show statistically significant differences for the proximate, ultimate and chemical analyses. Statistically significant differences were found in carbon for National (hardwoods-softwoods); in volatile matter, fixed carbon, carbon and oxygen for (National-tropical) hardwoods; and, in the chemical analysis, in F for National (hardwoods-softwoods) and in Br for (National-tropical) hardwoods. GCV was highly positively related to C (0.79***) and negatively related to O (-0.71***). The final independent variables of the model were (C, O, S, Zn, Ni, Br), with R² = 0.86 and F = 27.68***. Hydrogen did not contribute statistically to the energy content. PMID:20122826

Telmo, C; Lousada, J; Moreira, N

2010-06-01

202

The use of weighted multiple linear regression to estimate QTL-by-QTL epistatic effects

Scientific Electronic Library Online (English)

Knowledge of the nature and magnitude of gene effects, as well as their contribution to the control of metric traits, is important in formulating efficient breeding programs for the improvement of plant genetics. Information concerning a genetic parameter such as the additive-by-additive epistatic effect can be useful in traditional breeding. This report describes the results obtained by applying weighted multiple linear regression to estimate the parameter connected with an additive-by-additive epistatic interaction. Three weight variants were used: (1) standard weights based on estimated variances, (2) different weights for minimal, maximal and other lines, and (3) different weights for extreme and other lines. The approach described here combines two methods of estimation, one based on phenotypic observations and the other using molecular marker data. The comparison was done using Monte Carlo simulations. The results show that the application of weighted regression to the marker data yielded estimates similar to those obtained by phenotypic methods.

Jan, Bocianowski.
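
The first weight variant in the abstract above (weights based on estimated variances) can be illustrated with a minimal weighted least squares sketch. It assumes weights equal to inverse variances; the data, the variances and the function name `weighted_least_squares` are hypothetical and are not taken from the paper.

```python
import numpy as np

def weighted_least_squares(X, y, variances):
    """Minimise sum_i w_i * (y_i - x_i @ beta)**2 with w_i = 1 / variance_i."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sqrt_w = np.sqrt(1.0 / np.asarray(variances, dtype=float))
    # Scaling each row by sqrt(w_i) turns the weighted problem into ordinary least squares.
    beta, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return beta

# Hypothetical, noise-free data: intercept 1.0 and slope 2.0.
X = np.column_stack([np.ones(5), [0.0, 1.0, 2.0, 3.0, 4.0]])
y = 1.0 + 2.0 * X[:, 1]
variances = np.array([1.0, 0.5, 2.0, 1.0, 4.0])  # assumed per-observation variances
beta = weighted_least_squares(X, y, variances)
print(beta)
```

Observations with larger assumed variance receive less weight, which is the general idea behind down-weighting noisier lines.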

203

Directory of Open Access Journals (Sweden)

The objectives of this study were to estimate (co)variance functions for additive genetic and permanent environmental effects, as well as the genetic parameters for milk yield over multiple parities, using random regression models (RRM). Records of 4,757 complete lactations of Murrah breed buffaloes from 12 herds were analyzed. Ages at calving were between 2 and 11 years. The model included the additive genetic and permanent environmental random effects and the fixed effects of contemporary groups (herd, year and calving season) and milking frequency (1 or 2). A cubic regression on Legendre orthogonal polynomials of age was used to model the mean trend. The additive genetic and permanent environmental effects were modeled by Legendre orthogonal polynomials. Residual variances were considered homogeneous or heterogeneous, modeled through variance functions or step functions with 5, 7 or 10 classes. Results from Akaike’s information criterion and Schwarz’s Bayesian information criterion indicated that an RRM with a third-order polynomial for the additive genetic and permanent environmental effects and a step function with 5 classes for residual variances fitted best. Heritability estimates obtained with this model varied from 0.10 to 0.28. Genetic correlations were high between consecutive ages but decreased as the interval between ages increased.

H. Tonhati

2010-02-01
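
As a rough illustration of the Legendre covariates mentioned in the abstract above, the sketch below rescales age at calving to [-1, 1] and evaluates Legendre polynomials up to degree 3. The age range of 2 to 11 years comes from the abstract; the function name is hypothetical, and the polynomials here are unnormalised, whereas random regression software often uses a normalised form.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(ages, degree=3, age_min=2.0, age_max=11.0):
    """Return standardised ages and the matrix of Legendre polynomial values."""
    ages = np.asarray(ages, dtype=float)
    # Map [age_min, age_max] linearly onto [-1, 1], the domain of Legendre polynomials.
    x = 2.0 * (ages - age_min) / (age_max - age_min) - 1.0
    # Column j holds P_j(x) for j = 0..degree (unnormalised).
    return x, legendre.legvander(x, degree)

ages = np.arange(2, 12)            # ages 2, 3, ..., 11 years
x, Phi = legendre_covariates(ages)
print(Phi.shape)                   # one row per age, degree + 1 columns
```

Each row of `Phi` would serve as the covariate vector for one record in a random regression; the fitting itself is not shown here.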

204

Estimation of Saturation Percentage of Soil Using Multiple Regression, ANN, and ANFIS Techniques

Directory of Open Access Journals (Sweden)

The saturation percentage (SP) of soils is an important index in hydrological studies. In this paper, artificial neural networks (ANNs), multiple regression (MR), and an adaptive neural-based fuzzy inference system (ANFIS) were used to estimate the saturation percentage of soils collected from the Boukan region in the northwestern part of Iran. Percent clay, silt, sand and organic carbon (OC) were used to develop the applied methods. In addition, the contribution of each input variable to the estimation of the SP index was assessed. Two performance functions, namely root mean square error (RMSE) and determination coefficient (R2), were used to evaluate the adequacy of the models. The ANFIS method was found to be superior to the other methods. It is therefore proposed that the ANFIS model can be used for reasonable estimation of SP values of soils.

Khaled Ahmad Aali

2009-07-01
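
The two performance functions named in the abstract above can be sketched as follows. The sketch assumes the determination coefficient is defined as 1 - SS_res / SS_tot (some studies report the squared correlation instead), and the observed and predicted values are made up for illustration.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """Determination coefficient, taken here as 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted saturation percentages.
observed = [40.0, 45.0, 50.0, 55.0]
predicted = [42.0, 44.0, 51.0, 53.0]
print(rmse(observed, predicted), r_squared(observed, predicted))
```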

205

Evaluating the Sustainable Development of Agriculture Based on Multiple Linear Regression

Directory of Open Access Journals (Sweden)

Agriculture is the base of the national economy, the rural area is the basic community, and sustainable agricultural development is the foundation of sustainable development for society as a whole. Studying an evaluation index system for the level of sustainable agricultural development and constructing a reasonable evaluation model are significant for path selection and level promotion. An evaluation index system based on input and output was built using multiple regression. The interrelation between agricultural investment in fixed assets and the related output indexes of sustainable agricultural development, including its degree of closeness and pattern of change, was analyzed to find the interrelation mode among the indexes, and a set of comprehensive evaluation methods for sustainable agricultural development was constructed. The evaluation method was used to assess the level of sustainable agricultural development in China’s 31 provinces; it can help local governments understand that level scientifically and provides a scientific basis for decision-making.

Li Qing-xue

2013-01-01

206

Identifying the Factors that Influence Change in SEBD Using Logistic Regression Analysis

Directory of Open Access Journals (Sweden)

Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However, these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent variables. The seminal contribution of John Nelder and Robert Wedderburn (1972) introduced the concept of Generalized Linear Models (GLMs). GLMs overcome the limitations of Normal regression models and accommodate any distribution that is a member of the exponential family. Moreover, these models relate the dependent variable to the linear predictor (non-random component) through any invertible link function. Logistic regression models are GLMs that accommodate categorical dependent variables. They assume a Binomial distribution and the Logit canonical link function. The iteratively re-weighted least squares algorithm using the Fisher scoring technique is employed to maximize the log-likelihood function in GLMs and estimate the model parameters. In this paper, Logistic regression analysis was used to identify the dominant factors that influence change in the social, emotional and behaviour difficulties (SEBD) of Maltese children. The study comprised 486 pupils whose SEBD was assessed by both teachers and parents using the Strengths and Difficulties Questionnaire (Goodman 1997) when the children were aged 6 and 9 years old.

Liberato Camilleri

2013-07-01
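
The iteratively re-weighted least squares idea mentioned in the abstract above can be sketched for a binary logistic model with the canonical logit link, where Fisher scoring coincides with Newton's method. This is a minimal illustration on made-up data, not the study's analysis; the function name and the toy values are assumptions.

```python
import numpy as np

def fit_logistic_irls(X, y, n_iter=25, tol=1e-10):
    """Fit a binary logistic regression by Fisher scoring (IRLS)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        w = mu * (1.0 - mu)                      # binomial variance function
        # Scoring step: solve (X' W X) step = X' (y - mu).
        step = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Hypothetical, non-separable toy data with an intercept column.
x = np.arange(8, dtype=float)
y = np.array([0, 0, 1, 0, 1, 0, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(x), x])
beta = fit_logistic_irls(X, y)
print(beta)
```

At convergence the score vector X'(y - mu) is approximately zero, which is a convenient check that the log-likelihood has been maximised.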

207

Node-Mapping EIT Method Based on Regression Analysis

Directory of Open Access Journals (Sweden)

Medical imaging shows the morphology and function of the body's internal organs intuitively. Electrical Impedance Tomography (EIT) is an emerging medical imaging technology. It has the advantages of simple structure, low cost, no radiological hazard and non-invasiveness. EIT can not only use the impedance differences between different tissues to reconstruct anatomical images, but can also achieve functional imaging from the impedance changes of tissues and organs in different physiological and pathological states, and it is suitable for long-term monitoring. The solution is approximate because of the ill-posedness of the inverse problem. Because of the conflict between image accuracy and computation speed, EIT is still unable to meet the requirements of practical application. By using a regression analysis algorithm, the Node-Mapping Method calculates only the node potentials. The speed of operation and the quality of the reconstructed image are greatly improved.

Jianjun Zhang

2012-12-01

208

A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

Energy Technology Data Exchange (ETDEWEB)

New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive canvas based on parallel coordinates. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

Steed, Chad A [ORNL]; Swan II, J. Edward [Mississippi State University (MSU)]; Fitzpatrick, Patrick J. [Mississippi State University (MSU)]; Jankun-Kelly, T.J. [Mississippi State University (MSU)]

2012-02-01

209

A Quantile Regression Analysis of Micro-lending's Poverty Impact

Directory of Open Access Journals (Sweden)

This paper aims to evaluate the impact of a microlending program on ameliorating measured poverty within its client population, with the aim of improving that impact. We analyze over 18,000 women micro-finance clients of the Negros Women for Tomorrow Foundation (NWTF), drawn from a database that uses the Progress out of Poverty (PPI) Scorecard as a measure of poverty. Analysis using both OLS and quantile multivariate regression models shows how observable borrower attributes affect the ability of clients to reduce their measured poverty. Loan size, duration, and the economic activity supported all have strongly identifiable effects. Moreover, estimates suggest which among the poor are receiving the greatest effective help from the program. Results offer specific advice to the NWTF and other micro-lenders: impact is greatest with fewer, larger loans in particular economic sectors (sari-sari, service and trade) but requires patience, as each additional year increases the client’s average change in poverty score.

Stephen W. Polk

2012-07-01

210

Multivariate Regression Analysis of Gravitational Waves from Rotating Core Collapse

We present a new multivariate regression model for analysis and parameter estimation of gravitational waves observed from well but not perfectly modeled sources such as core-collapse supernovae. Our approach is based on a principal component decomposition of simulated waveform catalogs. Instead of reconstructing waveforms by direct linear combination of physically meaningless principal components, we solve via least squares for the relationship that encodes the connection between chosen physical parameters and the principal component basis. Although our approach is linear, the waveforms' parameter dependence may be non-linear. For the case of gravitational waves from rotating core collapse, we show, using statistical hypothesis testing, that our method is capable of identifying the most important physical parameters that govern waveform morphology in the presence of simulated detector noise. We also demonstrate our method's ability to predict waveforms from a principal component basis given a set of physical ...

Engels, William J; Ott, Christian D

2014-01-01

211

A Logistic Regression Analysis of the Ischemic Heart Disease Risk

Directory of Open Access Journals (Sweden)

The main objective of the present study is to investigate factors that contribute significantly to enhancing the risk of ischemic heart disease. The dependent variable of the study is the diagnosis: whether or not the patient has the disease. Logistic regression analysis is applied to explore the factors affecting the disease. The results of the study show that the factors contributing significantly to the risk of ischemic heart disease are the use of banaspati ghee, living in an urban area, high cholesterol level and the age group of 51 to 60 years. Other significant factors are Apo Protein A, Apo Protein B, cholesterol level, high density Lipo protein, low density Lipo protein, phospholipids, total lipid and uric acid.

Irfana P. Bhatti

2006-01-01

212

A Prediction Model for System Testing Defects using Regression Analysis

Directory of Open Access Journals (Sweden)

This research describes the initial effort of building a prediction model for defects in system testing carried out by an independent testing team. The motivation for such a defect prediction model is to serve as an early quality indicator of the software entering system testing and to assist the testing team in managing and controlling test execution activities. Metrics collected from the phases prior to system testing are identified and analyzed to determine the potential predictors for building the model. The selected metrics are then put into regression analysis to generate several mathematical equations. The equation that has a p-value of less than 0.05, with R-squared and R-squared (adjusted) of more than 90%, is selected as the desired prediction model for system testing defects. This model is verified using new projects to confirm that it is fit for actual implementation.

Muhammad Dhiauddin Mohamed Suffian

2012-07-01

213

International Nuclear Information System (INIS)

The Gauss-Newton algorithm has been used to evaluate tracer binding parameters of RIA by nonlinear regression analysis. The calculations were carried out on the K1003 desk computer. Equations for simple binding models and their derivatives are presented. The advantages of nonlinear regression analysis over linear regression are demonstrated.
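
A Gauss-Newton iteration of the kind described above can be sketched in a few lines. The abstract does not give its binding equations, so the one-site saturation model B = Bmax * x / (K + x) used here is an assumption chosen for illustration, as are the data and starting values.

```python
import numpy as np

def model(params, x):
    """Assumed one-site binding curve: B = Bmax * x / (K + x)."""
    bmax, k = params
    return bmax * x / (k + x)

def jacobian(params, x):
    """Partial derivatives of the model with respect to Bmax and K."""
    bmax, k = params
    return np.column_stack([x / (k + x), -bmax * x / (k + x) ** 2])

def gauss_newton(x, y, params0, n_iter=50, tol=1e-12):
    params = np.asarray(params0, dtype=float)
    for _ in range(n_iter):
        residual = y - model(params, x)
        # Linearise the model and solve the least squares problem J @ step = residual.
        step, *_ = np.linalg.lstsq(jacobian(params, x), residual, rcond=None)
        params = params + step
        if np.max(np.abs(step)) < tol:
            break
    return params

# Hypothetical noise-free data generated with Bmax = 10 and K = 2.
x = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
y = model((10.0, 2.0), x)
fit = gauss_newton(x, y, params0=(8.0, 1.5))
print(fit)
```

Plain Gauss-Newton has no step control, so with noisy data or poor starting values a damped variant may be needed.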

214

Low-Cost Housing in Sabah, Malaysia: A Regression Analysis

Directory of Open Access Journals (Sweden)

Low-cost housing plays a vital role in the development process, especially in providing accommodation to those who are less fortunate and to the lower income group. This effort is also a step towards overcoming the squatter problem, which could cripple the competitive drive of the local community, especially in the state of Sabah, Malaysia. This article looks into the factors influencing low-cost housing in Sabah, namely the government’s budget (allocation) for low-cost housing projects and Sabah’s total population. At the same time, the study attempts to show the implications of the development and economic crises that occurred during the period 1971 to 2000 for the provision of low-cost houses in Sabah. Empirical analyses were conducted using the multiple linear regression method, the stepwise method and the dummy variable approach to demonstrate the link. The empirical results show that the government’s budget for low-cost housing is the main contributor to the provision of low-cost housing in Sabah. The results also suggest that economic growth, namely Gross Domestic Product (GDP), did not have a significant effect on low-cost housing in Sabah. However, almost all major crises that have beset Malaysia’s economy had a significant and consistent effect on low-cost housing in Sabah, especially the financial crisis which occurred in mid-1997.

Dullah Mulok

2009-02-01

215

Building Regression Models: The Importance of Graphics.

Points out reasons for using graphical methods to teach simple and multiple regression analysis. Argues that a graphically oriented approach has considerable pedagogic advantages in the exposition of simple and multiple regression. Shows that graphical methods may play a central role in the process of building regression models. (Author/LS)

Dunn, Richard

1989-01-01

216

International Nuclear Information System (INIS)

In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and ?200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs

217

Integrated analysis of incidence, progression, regression and disappearance probabilities

Directory of Open Access Journals (Sweden)

Background: Age-related maculopathy (ARM) is a leading cause of vision loss in people aged 65 or older. ARM is distinctive in that it is a disease which can transition through incidence, progression, regression and disappearance. The purpose of this study is to develop methodologies for studying the relationship of risk factors with different transition probabilities. Methods: Our framework for studying this relationship includes two different analytical approaches. In the first approach, one can define, model and estimate the relationship between each transition probability and risk factors separately. This approach is similar to constraining a population to a certain disease status at the baseline, and then analyzing the probability of the constrained population to develop a different status. While this approach is intuitive, one risks losing available information while at the same time running into the problem of insufficient sample size. The second approach specifies a transition model for analyzing such a disease. This model provides the conditional probability of a current disease status based upon a previous status, and can therefore jointly analyze all transition probabilities. Throughout the paper, an analysis to determine the birth cohort effect on ARM is used as an illustration. Results and conclusion: This study has found parallel separate and joint analyses to be more enlightening than any analysis in isolation. By implementing both approaches, one can obtain more reliable and more efficient results.

Huang Guan-Hua

2008-06-01

218

Semiparametric regression for periodic longitudinal hormone data from multiple menstrual cycles.

We consider semiparametric regression for periodic longitudinal data. Parametric fixed effects are used to model the covariate effects and a periodic nonparametric smooth function is used to model the time effect. The within-subject correlation is modeled using subject-specific random effects and a random stochastic process with a periodic variance function. We use maximum penalized likelihood to estimate the regression coefficients and the periodic nonparametric time function, whose estimator is shown to be a periodic cubic smoothing spline. We use restricted maximum likelihood to simultaneously estimate the smoothing parameter and the variance components. We show that all model parameters can be easily obtained by fitting a linear mixed model. A common problem in the analysis of longitudinal data is to compare the time profiles of two groups, e.g., between treatment and placebo. We develop a scaled chi-squared test for the equality of two nonparametric time functions. The proposed model and the test are illustrated by analyzing hormone data collected during two consecutive menstrual cycles and their performance is evaluated through simulations. PMID:10783774

Zhang, D; Lin, X; Sowers, M

2000-03-01

219

International Nuclear Information System (INIS)

Risk associated with power generation must be identified to make intelligent choices between alternative power technologies. Radionuclide air stack emissions for a single coal plant and a single nuclear plant are used to compute the single plant leukemia incidence risk and total industry leukemia incidence risk. Leukemia incidence is the response variable as a function of radionuclide bone dose for the six proposed dose response curves considered. During normal operation a coal plant has higher radionuclide emissions than a nuclear plant and the coal industry has a higher leukemia incidence risk than the nuclear industry, unless a nuclear accident occurs. Variation of nuclear accident size allows quantification of the impact of accidents on the total industry leukemia incidence risk comparison. The leukemia incidence risk is quantified as the number of accidents of a given size for the nuclear industry leukemia incidence risk to equal the coal industry leukemia incidence risk. The general linear model is used to develop equations that relate the accident frequency required for equal industry risks to the magnitude of the nuclear emission. Exploratory data analysis revealed that the relationship between the natural log of accident number and the natural log of accident size is linear. (Author)

220

The Analysis of Bootstrap Method in Linear Regression Effect

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper combines the least squares estimate, the least absolute deviation estimate and the least median estimate with the Bootstrap method. When the overall error distribution is unknown or is not normal, we estimate the regression coefficients and their confidence intervals and, through data simulation, find that the Bootstrap method can improve the stability of the regression coefficients and reduce the length of the confidence intervals.

Jiehan Zhu; Ping Jing

2010-01-01

221

The Analysis of Bootstrap Method in Linear Regression Effect

Directory of Open Access Journals (Sweden)

This paper combines the least squares estimate, the least absolute deviation estimate and the least median estimate with the Bootstrap method. When the overall error distribution is unknown or is not normal, we estimate the regression coefficients and their confidence intervals and, through data simulation, find that the Bootstrap method can improve the stability of the regression coefficients and reduce the length of the confidence intervals.

Jiehan Zhu

2010-10-01
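
The general idea of bootstrapping a regression coefficient when the error distribution is not normal can be sketched as below. This is only a case-resampling (pairs) bootstrap around ordinary least squares with a percentile interval; the paper also considers least absolute deviation and least median estimates, which are not shown, and the heavy-tailed toy data are an assumption.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def bootstrap_coefficients(X, y, n_boot=1000, seed=0):
    """Refit OLS on n_boot resamples of the (x_i, y_i) pairs drawn with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample row indices
        draws[b] = ols(X[idx], y[idx])
    return draws

# Hypothetical data with heavy-tailed (Student t, 3 d.f.) errors.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=40)
y = 1.0 + 0.5 * x + rng.standard_t(df=3, size=40)
X = np.column_stack([np.ones_like(x), x])

draws = bootstrap_coefficients(X, y)
lo, hi = np.percentile(draws[:, 1], [2.5, 97.5])   # 95% percentile interval for the slope
print(lo, hi)
```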

222

Use of generalized regression models for the analysis of stress-rupture data

International Nuclear Information System (INIS)

The design of components for operation in an elevated-temperature environment often requires a detailed consideration of the creep and creep-rupture properties of the construction materials involved. Techniques for the analysis and extrapolation of creep data have been widely discussed. The paper presents a generalized regression approach to the analysis of such data. This approach has been applied to multiple heat data sets for types 304 and 316 austenitic stainless steel, ferritic 2 1/4 Cr-1 Mo steel, and the high-nickel austenitic alloy 800H. Analyses of data for single heats of several materials are also presented. All results appear good. The techniques presented represent a simple yet flexible and powerful means for the analysis and extrapolation of creep and creep-rupture data.

223

Use of generalized regression models for the analysis of stress-rupture data

Energy Technology Data Exchange (ETDEWEB)

The design of components for operation in an elevated-temperature environment often requires a detailed consideration of the creep and creep-rupture properties of the construction materials involved. Techniques for the analysis and extrapolation of creep data have been widely discussed. The paper presents a generalized regression approach to the analysis of such data. This approach has been applied to multiple heat data sets for types 304 and 316 austenitic stainless steel, ferritic 2 1/4 Cr-1 Mo steel, and the high-nickel austenitic alloy 800H. Analyses of data for single heats of several materials are also presented. All results appear good. The techniques presented represent a simple yet flexible and powerful means for the analysis and extrapolation of creep and creep-rupture data.

Booker, M.K.

1978-01-01

224

International Nuclear Information System (INIS)

Several MRI features of supratentorial astrocytomas are associated with high histologic grade by statistically significant p values. We sought to apply this information prospectively to a group of astrocytomas in the prediction of tumor grade. We used 10 MRI features of fibrillary astrocytomas from 52 patient studies to develop neural network and multiple linear regression models for practical use in predicting tumor grade. The models were tested prospectively on MR images from 29 patient studies. The performance of the models was compared against that of a radiologist. Neural network accuracy was 61% in distinguishing between low and high grade tumors. Multiple linear regression achieved an accuracy of 59%. Assessment of the images by a radiologist yielded 57% accuracy. We conclude that while certain MRI parameters may be statistically related to astrocytoma histologic grade, neural network and linear regression models cannot reliably use them to predict tumor grade. (orig.)

225

DEFF Research Database (Denmark)

Multiple regression and model building with mediator variables were addressed to avoid double counting when economic values are estimated from data simulated with herd simulation modeling (using the SimHerd model). The simulated incidence of metritis was analyzed statistically as the independent variable, while the traits representing the direct effects of metritis on yield, fertility and occurrence of other diseases were used as mediator variables. The economic value of metritis was estimated to be €78 per 100 cow-years for each 1% increase of metritis in the period of 1-100 days in milk in multiparous cows. The merit of this approach was demonstrated by the fact that the economic value of metritis was estimated to be 81% higher when no mediator variables were included in the multiple regression analysis.

Østergaard, Søren; Ettema, Jehan Frans

226

The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center (NESC) on September 20, 2005. The NESC team performed an independent review of the regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.

Parsons, Vickie S.

2009-01-01

227

This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

Le, Huy; Marcus, Justin

2012-01-01

228

Oil and gas pipeline construction cost analysis and developing regression models for cost estimation

In this study, cost data for 180 pipelines and 136 compressor stations have been analyzed. On the basis of the distribution analysis, regression models have been developed. Material, labor, ROW and miscellaneous costs make up the total cost of pipeline construction. The pipelines are analyzed based on pipeline length, diameter, location, pipeline volume and year of completion. In pipeline construction, labor costs dominate the total costs with a share of about 40%. Multiple non-linear regression models are developed to estimate the component costs of pipelines for various cross-sectional areas, lengths and locations. The compressor stations are analyzed based on capacity, year of completion and location. Unlike the pipeline costs, material costs dominate the total costs in the construction of compressor stations, with an average share of about 50.6%. Land costs have very little influence on the total costs. Similar regression models are developed to estimate the component costs of compressor stations for various capacities and locations.

Thaduri, Ravi Kiran

229

Scientific Electronic Library Online (English)

OBJECTIVE: To systematically review randomized controlled trials comparing the effect of supplementation with multiple micronutrients versus iron and folic acid on pregnancy outcomes in developing countries. METHODS: MEDLINE and EMBASE were searched. Outcomes of interest were birth weight, low birth weight, small size for gestational age, perinatal mortality and neonatal mortality. Pooled relative risks (RRs) were estimated by random effects models. Sources of heterogeneity were explored through subgroup meta-analyses and meta-regression. FINDINGS: Multiple micronutrient supplementation was more effective than iron and folic acid supplementation at reducing the risk of low birth weight (RR: 0.86; 95% confidence interval, CI: 0.79-0.93) and of small size for gestational age (RR: 0.85; 95% CI: 0.78-0.93). Micronutrient supplementation had no overall effect on perinatal mortality (RR: 1.05; 95% CI: 0.90-1.22), although substantial heterogeneity was evident (I² = 58%; P for heterogeneity = 0.008). Subgroup and meta-regression analyses suggested that micronutrient supplementation was associated with a lower risk of perinatal mortality in trials in which >50% of mothers had formal education (RR: 0.93; 95% CI: 0.82-1.06) or in which supplementation was initiated after a mean of 20 weeks of gestation (RR: 0.88; 95% CI: 0.80-0.97). CONCLUSION: Maternal education or gestational age at initiation of supplementation may have contributed to the observed heterogeneous effects on perinatal mortality. The safety, efficacy and effective delivery of maternal micronutrient supplementation require further research.

Kosuke, Kawai; Donna, Spiegelman; Anuraj H, Shankar; Wafaie W, Fawzi.

230

The analytical effect of the number of events per variable (EPV) in a proportional hazards regression analysis was evaluated using Monte Carlo simulation techniques for data from a randomized trial containing 673 patients and 252 deaths, in which seven predictor variables had an original significance level of p [...] EPVs of 2, 5, 10, 15, 20, and 25. For each simulation, a random exponential survival time was generated for each of the 673 patients, and the simulated results were compared with their original counterparts. As EPV decreased, the regression coefficients became more biased relative to the true value; the 90% confidence limits about the simulated values did not have a coverage of 90% for the original value; large sample properties did not hold for variance estimates from the proportional hazards model; and the Z statistics used to test the significance of the regression coefficients lost validity under the null hypothesis. Although a single boundary level for avoiding problems is not easy to choose, the value of EPV = 10 seems most prudent. Below this value for EPV, the results of proportional hazards regression analyses should be interpreted with caution because the statistical model may not be valid. PMID:8543964
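The EPV arithmetic can be checked directly from the counts in the abstract (252 deaths, seven predictors). The short sketch below computes the EPV of the full trial and the number of events each simulated EPV level implies. The function names are hypothetical, and the caution flag simply encodes the abstract's suggested boundary of EPV = 10.

```python
import math


def events_per_variable(n_events, n_predictors):
    """Number of outcome events available per predictor in the model."""
    return n_events / n_predictors


def events_needed(epv, n_predictors):
    """Smallest number of events that reaches the requested EPV."""
    return math.ceil(epv * n_predictors)


def needs_caution(epv, threshold=10):
    """True when EPV falls below the boundary the abstract calls most prudent."""
    return epv < threshold


print(events_per_variable(252, 7))
for level in (2, 5, 10, 15, 20, 25):
    print(level, events_needed(level, 7), needs_caution(level))
```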

Peduzzi, P; Concato, J; Feinstein, A R; Holford, T R

1995-12-01

231

Multiple Regression Analysis Using ANCOVA in University Model

Digital Repository Infrastructure Vision for European Research (DRIVER)

The government of UAE is promoting Dubai as an academic hub. Dubai International Academic City (DIAC) is a free zone area with many national and international universities promoting higher education in almost all disciplines. The aspiration of every graduating student from the university is to get a good placement. In Dubai diverse job opportunities in national and multinational organizations are available. The objective of the paper is to review the placement opportunities in Dubai for the u...

Maneesha; Priti Bajpai

2013-01-01

232

International Nuclear Information System (INIS)

The gamma/beta TLD badge used by OPPD consists of two TLD-700 chips (Harshaw G7 card), one of which (chip #2) is shielded by a 0.102 cm-thick aluminum filter, and the other (chip #1) is unshielded, as shown in Fig. 1. Standard procedure had been to determine the beta dose to the badge by subtracting the response of chip #2 from that of chip #1 and then dividing by a calibrated beta-sensitivity factor; the gamma dose was taken to be the response of chip #2 divided by the chip's gamma-sensitivity factor followed by the subtraction of the background dose. A problem with this procedure is penetration of energetic beta particles through the aluminum filter on chip #2, which causes an over-response. Due to the technique used to obtain the beta dose, this also results in an under-estimate of the beta dose. This problem has been corrected through application of multiple linear regression analysis on a large database of pure gamma (137Cs), pure beta (90Sr), and mixed exposures. The outcome of the analysis is an algorithm that automatically corrects for penetration effects. Performance tests using the ANSI N13.11 standard are presented to show the improvement.
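The standard procedure described above is simple arithmetic, and writing it out shows why beta penetration biases both doses. In the sketch below all names and numbers are hypothetical: sensitivity factors are set to 1 and a made-up fraction of the beta signal is allowed to leak through the filter onto chip #2. The regression-based correction itself is not reproduced, because its coefficients are not given in the abstract.

```python
def standard_doses(chip1, chip2, beta_sensitivity, gamma_sensitivity, background):
    """Beta and gamma dose by the uncorrected two-chip subtraction procedure."""
    beta_dose = (chip1 - chip2) / beta_sensitivity
    gamma_dose = chip2 / gamma_sensitivity - background
    return beta_dose, gamma_dose


def chip_responses(gamma_signal, beta_signal, penetration):
    """Hypothetical chip readings when a fraction of betas passes the filter."""
    chip1 = gamma_signal + beta_signal                # unshielded chip
    chip2 = gamma_signal + penetration * beta_signal  # shielded chip
    return chip1, chip2


# Invented mixed exposure: 50 units of gamma signal, 100 units of beta signal.
ideal = standard_doses(*chip_responses(50.0, 100.0, 0.0), 1.0, 1.0, 0.0)
leaky = standard_doses(*chip_responses(50.0, 100.0, 0.1), 1.0, 1.0, 0.0)
print(ideal, leaky)
```

With penetration the shielded chip reads high, so the gamma dose is overestimated and the beta dose, obtained by subtraction, is underestimated by the same amount.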

233

Penalized-regression-based multimarker genotype analysis of Genetic Analysis Workshop 17 data

Digital Repository Infrastructure Vision for European Research (DRIVER)

Abstract Testing for association between multiple markers and a phenotype can not only capture untyped causal variants in weak linkage disequilibrium with nearby typed markers but also identify the effect of a combination of markers. We propose a sliding window approach that uses multimarker genotypes as variables in a penalized regression. We investigate a penalty with three separate components: (1) a group least absolute shrinkage and selection operator (LASSO) that selects multim...

Ayers Kristin L; Mamasoula Chrysovalanto; Cordell Heather J

2011-01-01

234

Regression analysis of technical parameters affecting nuclear power plant performances

Energy Technology Data Exchange (ETDEWEB)

Since the 1980s many studies have sought to explain the good and bad performance of commercial nuclear power plants (NPPs), but no correlation has yet been found that fully represents plant operational experience. In early works, data availability and the number of operating power stations were both limited; results therefore pointed to specific technical characteristics of NPPs as the main causal factors for successful plant operation. Although these aspects still play a significant role, later studies and observations showed that other factors concerning the management and organization of the plant could instead be predominant when comparing utilities' operational and economic results. Utility quality, in a word, can be used to summarize all the managerial and operational aspects that seem to be effective in determining plant performance. In this paper, operational data from a consistent sample of commercial nuclear power stations, out of the total 433 operating NPPs, are analyzed, focusing mainly on operational experience in the last decade. The sample consists of PWR and BWR technology, operated by utilities located in different countries, including the U.S., Japan, France, Germany and Finland. Multivariate regression is performed using Unit Capability Factor (UCF) as the dependent variable; this factor reflects the effectiveness of plant programs and practices in maximizing the available electrical generation and consequently provides an overall indication of how well plants are operated and maintained. Aspects that may not be real causal factors but which can have a consistent impact on the UCF, such as technology design, supplier, size and age, are included in the analysis as independent variables. (authors)

Ghazy, R.; Ricotti, M. E.; Trueco, P. [Politecnico di Milano, Via La Masa, 34, 20156 Milano (Italy)]

2012-07-01

235

Different aspects of visibility degradation problems in Brisbane were investigated through concurrent visibility monitoring and aerosol sampling programs carried out in 1995. The relationship between the light extinction coefficients and aerosol mass/composition was derived by using multiple linear regression techniques. The visibility properties at different sites in Brisbane were found to be correlated with each other on a daily basis, but not correlated with each other hour by hour. The scattering of light by moisture (bsw) was due to sulphate particles which shift to a larger size under high-humidity conditions. The scattering of light by particulate matter (bsp) was found to be highly correlated with the mass of fine aerosols, in particular the mass of fine soot, sulphate and non-soil K. For the period studied, on average, the total light extinction coefficient (bext) at five sites in Brisbane was 0.65 × 10⁻⁴ m⁻¹, considerably smaller than the values found in other Australian and overseas cities. On average, the major component of bext is bsp (49% of bext), followed by bap (the absorption of light, mainly by fine soot particles, 28%), bsg (Rayleigh scattering, 20%) and bsw (3%). The absorption of light by NO2 (bag) is expected to contribute less than 5% of bext. On average, the percentage contributions of the visibility degrading species to bext (excluding bag) were: soot (53%), sulphate (21%), Rayleigh scattering (20%), non-soil K (2%) and humidity (3%). In terms of visibility degrading sources, motor vehicles (including soot and the secondary products) are expected to contribute more than half of the bext (excluding bag) in Brisbane on average, followed by secondary sulphates (17%) and biomass burning (10%).

Chan, Y. C.; Simpson, R. W.; Mctainsh, G. H.; Vowles, P. D.; Cohen, D. D.; Bailey, G. M.

236

Varying-coefficient functional linear regression

Digital Repository Infrastructure Vision for European Research (DRIVER)

Functional linear regression analysis aims to model regression relations which include a functional predictor. The analog of the regression parameter vector or matrix in conventional multivariate or multiple-response linear regression models is a regression parameter function in one or two arguments. If, in addition, one has scalar predictors, as is often the case in applications to longitudinal studies, the question arises how to incorporate these into a functional regressi...

Wu, Yichao; Fan, Jianqing; Müller, Hans-Georg

2011-01-01

237

Spatial regression analysis on 32 years of total column ozone data

Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1 × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and 2 years of assimilated SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) ozone data (2009-2010). The two-dimensionality in this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on nonseasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO), El Niño-Southern Oscillation (ENSO) and stratospheric alternative halogens which are parameterized by the effective equivalent stratospheric chlorine (EESC). For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to that of a similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. 
The EESC has a significant depleting effect on ozone at mid- and high latitudes, the solar cycle affects ozone positively mostly in the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high northern latitudes, the effect of QBO is positive and negative in the tropics and mid- to high latitudes, respectively, and ENSO affects ozone negatively between 30° N and 30° S, particularly over the Pacific. The contribution of explanatory variables describing seasonal ozone variation is generally large at mid- to high latitudes. We observe ozone increases with potential vorticity and day length and ozone decreases with geopotential height and variable ozone effects due to the polar vortex in regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. The application of several trend models, each with their own pros and cons, yields a large range of recovery rate estimates. Overall these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.

Knibbe, J. S.; van der A, R. J.; de Laat, A. T. J.

2014-08-01

238

Combinatorial Analysis of Multiple Networks

The study of complex networks has been historically based on simple graph data models representing relationships between individuals. However, often reality cannot be accurately captured by a flat graph model. This has led to the development of multi-layer networks. These models have the potential of becoming the reference tools in network data analysis, but require the parallel development of specific analysis methods explicitly exploiting the information hidden in-between the layers and the availability of a critical mass of reference data to experiment with the tools and investigate the real-world organization of these complex systems. In this work we introduce a real-world layered network combining different kinds of online and offline relationships, and present an innovative methodology and related analysis tools suggesting the existence of hidden motifs traversing and correlating different representation layers. We also introduce a notion of betweenness centrality for multiple networks. While some preli...

Magnani, Matteo; Rossi, Luca

2013-01-01

239

Derailments are the most common type of freight-train accidents in the United States. Derailments cause damage to infrastructure and rolling stock, disrupt services, and may cause casualties and harm the environment. Accordingly, derailment analysis and prevention has long been a high priority in the rail industry and government. Despite the low probability of a train derailment, the potential for severe consequences justifies the need to better understand the factors influencing train derailment severity. In this paper, a zero-truncated negative binomial (ZTNB) regression model is developed to estimate the conditional mean of train derailment severity. Recognizing that the mean is not the only statistic describing data distribution, a quantile regression (QR) model is also developed to estimate derailment severity at different quantiles. The two regression models together provide a better understanding of train derailment severity distribution. Results of this work can be used to estimate train derailment severity under various operational conditions and by different accident causes. This research is intended to provide insights regarding development of cost-efficient train safety policies. PMID:23770389
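A zero-truncated negative binomial distribution is a negative binomial conditioned on the count being at least one, which suits a severity measure that cannot be zero once a derailment has occurred. The sketch below writes out its probability mass function and mean under the common NB2 parameterisation (mean `mu`, dispersion `alpha`). The parameterisation, the function names and the numeric values are assumptions for illustration, not taken from the paper.

```python
import math


def nb_pmf(y, mu, alpha):
    """Negative binomial (NB2) probability of count y, mean mu, dispersion alpha."""
    r = 1.0 / alpha
    p = r / (r + mu)
    log_pmf = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
               + r * math.log(p) + y * math.log(1.0 - p))
    return math.exp(log_pmf)


def ztnb_pmf(y, mu, alpha):
    """Probability of count y given that the count is at least one."""
    if y < 1:
        return 0.0
    return nb_pmf(y, mu, alpha) / (1.0 - nb_pmf(0, mu, alpha))


def ztnb_mean(mu, alpha):
    """Conditional mean of the count given that it is at least one."""
    return mu / (1.0 - nb_pmf(0, mu, alpha))


# Hypothetical values: untruncated mean of 8 with dispersion 0.5.
print(ztnb_pmf(1, 8.0, 0.5), ztnb_mean(8.0, 0.5))
```

In a regression setting `mu` would be linked to covariates, typically through a log link, which is not shown here.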

Liu, Xiang; Saat, M Rapik; Qin, Xiao; Barkan, Christopher P L

2013-10-01

240

Stepwise Regression as an Exploratory Data Analysis Procedure.

This paper identifies specific problems with stepwise regression, notes criticisms of stepwise methods by statisticians, suggests appropriate ways in which stepwise procedures can be used, and gives examples of how this can be done. Although the stepwise method has been routinely criticized by statisticians, it is still frequently used in the…

Thayer, Jerome D.

241

Regression Analysis with Block Missing Values and Variables Selection

Directory of Open Access Journals (Sweden)

We consider a regression model when a block of observations is missing, i.e. there is a group of observations with all the explanatory variables or covariates observed and another set of observations with only a block of the variables observed. We propose an estimator of the regression coefficients that is a combination of two estimators, one based on the observations with no missing variables, and the other based on the set of all observations after deleting the block of variables with missing values. The proposed combined estimator will be compared with the uncombined estimators. If the experimenter suspects that the variables with missing values may be deleted, a preliminary test will be performed to resolve the uncertainty. If the preliminary test of the null hypothesis that the regression coefficients of the variables with missing values are equal to zero is accepted, then only the data with no missing values are used for estimating the regression coefficients. Otherwise the combined estimator is used. This gives a preliminary test estimator. The properties of the preliminary test estimator and comparisons of the estimators are studied by a Monte Carlo study.

Chien-Pai Han

2011-07-01

242

Regression analysis using order statistics and their concomitants

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this work we derive the exact joint distribution of linear combinations of order statistics and linear combinations of their concomitants and some auxiliary variables in multivariate normal distribution. By extending the results of Sheikhi and Jamalizadeh we investigate some regression equations. Our results generalize those obtained in previous research by Viana, Lee and Loperfido.

Rasoul Ziaei, Abdul

2014-01-01

243

Analysis on Train Stopping Accuracy based on Regression Algorithms

Directory of Open Access Journals (Sweden)

Stopping accuracy is one of the most important indexes of the efficiency of automatic train operation (ATO) systems. Traditional stopping control algorithms in ATO systems have some drawbacks, as many factors have not been taken into account. The large amount of field-collected data about stopping accuracy contains many factors (e.g. system delays, stopping time, net pressure) which affect stopping accuracy. In this paper, three popular data mining methods are proposed to analyze train stopping accuracy. First, we find fifteen factors which have an impact on stopping accuracy. Then, ridge regression, lasso regression and elastic net regression are employed to build models reflecting the relationship between the fifteen factors and the stopping accuracy. Next, the three models are compared using the Akaike information criterion (AIC), a model selection criterion which considers the trade-off between accuracy and complexity. The computational results show that the elastic net regression model has the best AIC value. Finally, we obtain the parameters which make the train stop more accurately, which can provide a reference for improving stopping accuracy in ATO systems.
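The AIC comparison can be sketched for least-squares models, where, up to an additive constant and under Gaussian errors, AIC = n ln(RSS/n) + 2k. The residual sums of squares and parameter counts below are invented for illustration and are not the paper's results. Counting nonzero coefficients as k is also an assumption, since penalized fits are often scored with an effective degrees-of-freedom estimate instead.

```python
import math


def aic_least_squares(rss, n_obs, n_params):
    """AIC of a least-squares fit under Gaussian errors, up to a constant."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params


# Hypothetical fits on 500 stops: (residual sum of squares, nonzero coefficients).
candidates = {
    "ridge": (412.0, 15),
    "lasso": (420.0, 9),
    "elastic net": (409.0, 11),
}
n_obs = 500
scores = {name: aic_least_squares(rss, n_obs, k)
          for name, (rss, k) in candidates.items()}
# The model with the lowest AIC offers the best accuracy/complexity trade-off.
best = min(scores, key=scores.get)
print(scores, best)
```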

Lin Ma

2014-05-01

244

Functional Multiple-Set Canonical Correlation Analysis

We propose functional multiple-set canonical correlation analysis for exploring associations among multiple sets of functions. The proposed method includes functional canonical correlation analysis as a special case when only two sets of functions are considered. As in classical multiple-set canonical correlation analysis, computationally, the…

Hwang, Heungsun; Jung, Kwanghee; Takane, Yoshio; Woodward, Todd S.

2012-01-01

245

Directory of Open Access Journals (Sweden)

Data from the Interagency Monitoring of Protected Visual Environments (IMPROVE) network are used to estimate organic mass to organic carbon (OM/OC) ratios across the United States by extending previously published multiple regression techniques. Our new methodology addresses common pitfalls of multiple regression including measurement uncertainty, collinearity of covariates, and dataset selection. As expected, summertime OM/OC ratios are larger than wintertime values across the US with all regional median OM/OC values tightly confined between 1.8 and 1.95. Further, we find that OM/OC ratios during the winter are distinctly larger in the eastern US than in the West (regional medians are 1.58, 1.64, and 1.85 in the Great Lakes, southeast, and northeast regions, versus 1.29 and 1.32 in the western and central states). We find less spatial variability in long-term averaged OM/OC ratios across the US (90% of our multiyear regressions predicted OM/OC ratios between 1.37 and 1.94) than previous studies (90% of OM/OC estimates from a previous regression study fell between 1.30 and 2.10). We attribute this difference largely to the inclusion of EC as a covariate in previous regression studies. Due to the collinearity of EC and OC, we believe that up to one-quarter of the OM/OC estimates in a previous study are biased low. In addition to estimating OM/OC ratios, our technique reveals trends that may be contrasted with conventional assumptions regarding nitrate, sulfate, and soil across the IMPROVE network. For example, our regressions show pronounced seasonal and spatial variability in both nitrate volatilization and sulfate neutralization and hydration.

H. Simon

2010-10-01

246

International Nuclear Information System (INIS)

A program, WRANL, is described for the analysis of immunoassays or bioassays which have a logistic dose-response relationship. Responses are transformed to logits and iterative weighted regression analysis is used to obtain log dose-logit response lines for all preparations compared in an assay. Potency estimates of preparations relative to the standard preparation are available for both unweighted and weighted regression analyses together with detailed analysis of variance, estimates of slope and other relevant parameters. The general comparisons of dose-response relationships produced by the program are a feature of particular interest. However, an option which suppresses the more general output is available if the program is to be used for analysis of a 'screening' assay comparing single dilutions or doses of test samples with a standard curve. Data input is designed to permit immediate running of the program by junior personnel. Data output is designed to facilitate record keeping. (Auth.)
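The core transformation, responses to logits followed by iterative weighted regression of logit on log dose, can be sketched as follows. This is not WRANL's code. The weighting scheme, (p(1-p))² with p the fitted response, assumes constant variance on the response scale and is only one plausible choice. The function names and the data are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a response expressed as a fraction between 0 and 1."""
    return math.log(p / (1.0 - p))


def weighted_line_fit(x, y, w):
    """Intercept and slope of the weighted least-squares line of y on x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def fit_log_dose_logit(doses, responses, n_iter=10):
    """Iteratively reweighted fit of logit(response) against log10(dose)."""
    x = [math.log10(d) for d in doses]
    z = [logit(r) for r in responses]
    w = [1.0] * len(x)  # first pass is an unweighted fit
    for _ in range(n_iter):
        a, b = weighted_line_fit(x, z, w)
        fitted = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Assumed weights: inverse variance of the logit for constant response variance.
        w = [(p * (1.0 - p)) ** 2 for p in fitted]
    return a, b


# Hypothetical standard curve lying exactly on logit = -1.5 + 1.0 * log10(dose).
doses = [1.0, 10.0, 100.0, 1000.0]
responses = [1.0 / (1.0 + math.exp(-(-1.5 + 1.0 * math.log10(d)))) for d in doses]
intercept, slope = fit_log_dose_logit(doses, responses)
print(intercept, slope)
```

Fitting each preparation this way gives the log dose-logit response lines, and a relative potency then follows from the horizontal distance between a test line and the standard line.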

247

Measurement and Analysis of Test Suite Volume Metrics for Regression Testing

Directory of Open Access Journals (Sweden)

Regression testing intends to ensure that a software application works as specified after changes are made to it during maintenance. It is an important phase in the software development lifecycle. Regression testing is the re-execution of some subset of test cases that have already been executed. It is an expensive process used to detect defects due to regressions. Regression testing has been used to support software-testing activities and to assure appropriate quality through several versions of a software product during its development and maintenance. Regression testing assures the quality of modified applications. In this proposed work, a study and analysis of metrics related to test suite volume was undertaken. It was shown that the software under test needs more test cases after changes were made to it. A comparative analysis was performed to find the change in test suite size before and after the regression test.

S Raju

2014-01-01

248

A Bayesian Quantile Regression Analysis of Potential Risk Factors for Violent Crimes in USA

Directory of Open Access Journals (Sweden)

Bayesian quantile regression has drawn more attention in widespread applications recently. Yu and Moyeed (2001) proposed an asymmetric Laplace distribution to provide a likelihood-based mechanism for Bayesian inference of quantile regression models. In this work, the primary objective is to evaluate the performance of Bayesian quantile regression compared with simple regression and quantile regression through simulation and with application to a crime dataset from 50 USA states for assessing the effect of potential risk factors on the violent crime rate. This paper also explores improper priors, and conducts sensitivity analysis on the parameter estimates. The data analysis reveals that the percent of population that are single parents always has a significant positive influence on violent crimes occurrence, and Bayesian quantile regression provides a more comprehensive statistical description of this association.
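The link between the asymmetric Laplace distribution (ALD) and quantile regression is that maximizing an ALD likelihood in its location parameter is the same as minimizing the quantile check loss. The sketch below illustrates this for the simplest case, a constant with no covariates, using invented data. The function names are hypothetical, and no priors or posterior sampling are shown.

```python
import math


def check_loss(u, tau):
    """Quantile check loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def ald_pdf(y, mu, sigma, tau):
    """Density of the asymmetric Laplace distribution with skewness tau."""
    return tau * (1.0 - tau) / sigma * math.exp(-check_loss((y - mu) / sigma, tau))


def ald_log_likelihood(ys, mu, sigma, tau):
    """Log-likelihood of a common location mu for all observations."""
    return sum(math.log(ald_pdf(y, mu, sigma, tau)) for y in ys)


def best_constant(ys, tau):
    """Location maximizing the ALD likelihood, searched over the data points."""
    return max(ys, key=lambda m: ald_log_likelihood(ys, m, 1.0, tau))


# Hypothetical sample, invented for illustration.
data = [2.0, 3.5, 1.0, 7.0, 4.0]
print(best_constant(data, 0.5), best_constant(data, 0.9))
```

For tau = 0.5 the maximizer is the sample median. Replacing the constant with a linear predictor gives the likelihood used in Bayesian quantile regression.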

Ming Wang

2012-12-01

249

REGRESSION ANALYSIS OF PRODUCTIVITY USING MIXED EFFECT MODEL

Directory of Open Access Journals (Sweden)

The production plants of a company are located in several areas spread across Middle and East Java. As the production process employs mostly manpower, we suspected that each location has different characteristics affecting productivity. Thus, the production data may have a spatial and hierarchical structure. To fit a linear regression using ordinary techniques, we are required to make some assumptions about the nature of the residuals, i.e. that they are independent, identically and normally distributed. However, these assumptions are rarely fulfilled, especially for data that have a spatial and hierarchical structure. We worked out the problem using a mixed effect model. This paper discusses the construction of a model of productivity and several characteristics in the production line, taking location as a random effect. The simple model with high utility that satisfies the necessary regression assumptions was built using the free statistical software R, version 2.6.1.

Siana Halim

2007-01-01

250

Directory of Open Access Journals (Sweden)

The aim of this study was to investigate the performance of the Multiple Linear Regression (MLR) method in predicting future (next day, next 2 days and next 3 days) PM10 concentration levels in Seberang Perai, Malaysia. The developed model was compared to multiple linear regression models. The model used gaseous (NO2, SO2, CO), PM10 and meteorological parameters (temperature, relative humidity and wind speed) as predictors. Performance indicators such as Prediction Accuracy (PA), Coefficient of Determination (R2), Index of Agreement (IA), Normalized Absolute Error (NAE) and Root Mean Square Error (RMSE) were used to measure the accuracy of the models. The performance indicators were: next day (RMSE = 11.211, NAE = 0.124, PA = 0.927, IA = 0.960, R2 = 0.858), next 2 days (RMSE = 14.652, NAE = 0.155, PA = 0.881, IA = 0.925, R2 = 0.775) and next 3 days (RMSE = 15.611, NAE = 0.167, PA = 0.849, IA = 0.912, R2 = 0.720). Assessment of model performance indicated that the multiple linear regression method can be used for long term PM10 concentration prediction, with the next-day prediction performing best.
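Three of the performance indicators named above have widely used closed forms, sketched below on invented data. The abstract does not give the exact formulas the study used (for example, whether RMSE divides by n or n - 1), so these definitions are assumptions. PA and R2 are left out because their definitions vary between papers.

```python
import math


def rmse(pred, obs):
    """Root mean square error, dividing by the number of observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def nae(pred, obs):
    """Normalized absolute error: total absolute error over total observed."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / sum(obs)


def index_of_agreement(pred, obs):
    """Willmott's index of agreement; 1 means perfect agreement."""
    obar = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for p, o in zip(pred, obs))
    den = sum((abs(p - obar) + abs(o - obar)) ** 2 for p, o in zip(pred, obs))
    return 1.0 - num / den


# Hypothetical observed and predicted PM10 concentrations.
observed = [50.0, 60.0, 70.0]
predicted = [52.0, 58.0, 73.0]
print(rmse(predicted, observed), nae(predicted, observed),
      index_of_agreement(predicted, observed))
```

Lower RMSE and NAE and a higher IA indicate a better model, which is the direction in which the reported next-day figures beat the 2-day and 3-day figures.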

NorAzam Ramli

2012-01-01

251

Rock mass classification systems are one of the most common ways of determining rock mass excavatability and related equipment assessment. However, the strengths and weaknesses of such rating-based classifications have always been questionable. Such classification systems assign quantifiable values to predefined classified geotechnical parameters of rock mass. This causes particular ambiguities, leading to the misuse of such classifications in practical applications. Recently, intelligence system approaches such as artificial neural networks (ANNs) and neuro-fuzzy methods, along with multiple regression models, have been used successfully to overcome such uncertainties. The purpose of the present study is the construction of several models by using an adaptive neuro-fuzzy inference system (ANFIS) method with two data clustering approaches, including fuzzy c-means (FCM) clustering and subtractive clustering, an ANN and non-linear multiple regression to estimate the basic rock mass diggability index. A set of data from several case studies was used to obtain the real rock mass diggability index, which was compared with the values predicted by the constructed models. In conclusion, it was observed that ANFIS based on the FCM model shows higher accuracy and correlation with actual data compared to that of the ANN and multiple regression. As a result, one can use the assimilation of ANNs with fuzzy clustering-based models to construct such rigorous predictor tools.

Saeidi, Omid; Torabi, Seyed Rahman; Ataei, Mohammad

2014-03-01

252

A Nonmonotone Line Search Method for Regression Analysis

Directory of Open Access Journals (Sweden)

In this paper, we propose a nonmonotone line search combined with the search direction of G. L. Yuan and Z. X. Wei (New Line Search Methods for Unconstrained Optimization, Journal of the Korean Statistical Society, 38 (2009), pp. 29-39) for regression problems. The global convergence of the given method will be established under suitable conditions. Numerical results show that the presented algorithm is more competitive than the normal methods.

Gonglin Yuan

2009-03-01

253

Regression analysis of censored data using pseudo-observations

DEFF Research Database (Denmark)

We draw upon a series of articles in which a method based on pseudovalues is proposed for direct regression modeling of the survival function, the restricted mean, and the cumulative incidence function in competing risks with right-censored data. The models, once the pseudovalues have been computed, can be fit using standard generalized estimating equation software. Here we present Stata procedures for computing these pseudo-observations. An example from a bone marrow transplantation study is used to illustrate the method.
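Pseudo-observations are usually defined by a jackknife: for subject i, n times the estimate on the full sample minus (n - 1) times the estimate with subject i left out. The sketch below applies this to the Kaplan-Meier survival estimate at a fixed time. It is written in Python for illustration and is not the Stata procedure the article presents. The function names and the small data set are hypothetical.

```python
def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t); events[i] is 1 for an event, 0 if censored."""
    surv = 1.0
    event_times = sorted({ti for ti, ei in zip(times, events) if ei == 1 and ti <= t})
    for tj in event_times:
        at_risk = sum(1 for ti in times if ti >= tj)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == tj and ei == 1)
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_observations(times, events, t):
    """Jackknife pseudo-observations for the survival probability at time t."""
    n = len(times)
    full = kaplan_meier(times, events, t)
    pseudo = []
    for i in range(n):
        # Re-estimate with subject i removed from the sample.
        loo_times = times[:i] + times[i + 1:]
        loo_events = events[:i] + events[i + 1:]
        pseudo.append(n * full - (n - 1) * kaplan_meier(loo_times, loo_events, t))
    return pseudo


# Hypothetical follow-up times with two censored subjects (event flag 0).
times = [2.0, 4.0, 5.0, 6.0, 8.0, 10.0]
events = [1, 0, 1, 1, 0, 1]
print(pseudo_observations(times, events, 5.5))
```

Without censoring, each pseudo-observation reduces to the indicator that the subject survived past t, which is why the values can stand in for the unobserved outcome in a generalized estimating equation fit.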

Parner, Erik T.; Andersen, Per Kragh

2010-01-01

254

A New Approach in Regression Analysis for Modeling Adsorption Isotherms

Numerous regression approaches to isotherm parameter estimation appear in the literature. Real insight into the proper modeling pattern can be achieved only by testing methods on a very large number of cases. Experimentally, this cannot be done in a reasonable time, so the Monte Carlo simulation method was applied. The objective of this paper is to introduce and compare numerical approaches that involve different levels of knowledge about the noise structure of the analytical method used for initial and equilibrium concentration determination. Six levels of homoscedastic noise and five types of heteroscedastic noise precision models were considered. Performance of the methods was statistically evaluated based on median percentage error and mean absolute relative error in parameter estimates. The present study showed a clear distinction between two cases. When equilibrium experiments are performed only once, for the homoscedastic case, the winning error function is ordinary least squares, while for the case of heteroscedastic noise the use of orthogonal distance regression or Marquardt's percent standard deviation is suggested. It was found that when experiments are repeated three times, the simple method of weighted least squares performed as well as the more complicated orthogonal distance regression method. PMID:24672394

Onjia, Antonije E.

2014-01-01

255

BRGLM, Interactive Linear Regression Analysis by Least Square Fit

International Nuclear Information System (INIS)

1 - Description of program or function: BRGLM is an interactive program written to fit general linear regression models by least squares and to provide a variety of statistical diagnostic information about the fit. Stepwise and all-subsets regression can be carried out also. There are facilities for interactive data management (e.g. setting missing value flags, data transformations) and tools for constructing design matrices for the more commonly-used models such as factorials, cubic splines, and auto-regressions. 2 - Method of solution: The least squares computations are based on the orthogonal (QR) decomposition of the design matrix obtained using the modified Gram-Schmidt algorithm. 3 - Restrictions on the complexity of the problem: The current release of BRGLM allows maxima of 1000 observations, 99 variables, and 3000 words of main memory workspace. For a problem with N observations and P variables, the number of words of main memory storage required is MAX(N*(P+6), N*P+P*P+3*N, and 3*P*P+6*N). Any linear model may be fit although the in-memory workspace will have to be increased for larger problems.
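The method of solution named above, least squares through a QR decomposition computed by modified Gram-Schmidt, can be sketched in a few lines. This is an illustration of the technique, not BRGLM's code. The function names and the data are hypothetical, and the design matrix is assumed to have full column rank.

```python
def mgs_qr(columns):
    """Modified Gram-Schmidt QR of a matrix supplied as a list of columns."""
    n_cols = len(columns)
    q = [list(col) for col in columns]  # working copies, become orthonormal
    r = [[0.0] * n_cols for _ in range(n_cols)]
    for j in range(n_cols):
        r[j][j] = sum(v * v for v in q[j]) ** 0.5
        q[j] = [v / r[j][j] for v in q[j]]
        # Remove the component along q[j] from every later column immediately.
        for k in range(j + 1, n_cols):
            r[j][k] = sum(a * b for a, b in zip(q[j], q[k]))
            q[k] = [b - r[j][k] * a for a, b in zip(q[j], q[k])]
    return q, r


def least_squares(columns, y):
    """Solve min ||X b - y|| by QR: back-substitute R b = Q^T y."""
    q, r = mgs_qr(columns)
    qty = [sum(a * b for a, b in zip(qj, y)) for qj in q]
    p = len(columns)
    beta = [0.0] * p
    for j in reversed(range(p)):
        s = qty[j] - sum(r[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = s / r[j][j]
    return beta


# Hypothetical data lying exactly on y = 1 + 2x; the first column is the intercept.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0 + 2.0 * xi for xi in x]
beta = least_squares([[1.0] * len(x), x], y)
print(beta)
```

Working from the QR factors avoids forming X^T X explicitly, which is generally better conditioned than solving the normal equations.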

256

Analysis of some methods for reduced rank Gaussian process regression

DEFF Research Database (Denmark)

While there is strong motivation for using Gaussian Processes (GPs) due to their excellent performance in regression and classification problems, their computational complexity makes them impractical when the size of the training set exceeds a few thousand cases. This has motivated the recent proliferation of a number of cost-effective approximations to GPs, both for classification and for regression. In this paper we analyze one popular approximation to GPs for regression: the reduced rank approximation. While generally GPs are equivalent to infinite linear models, we show that Reduced Rank Gaussian Processes (RRGPs) are equivalent to finite sparse linear models. We also introduce the concept of degenerate GPs and show that they correspond to inappropriate priors. We show how to modify the RRGP to prevent it from being degenerate at test time. Training RRGPs consists both in learning the covariance function hyperparameters and the support set. We propose a method for learning hyperparameters for a given support set. We also review the Sparse Greedy GP (SGGP) approximation (Smola and Bartlett, 2001), which is a way of learning the support set for given hyperparameters based on approximating the posterior. We propose an alternative method to the SGGP that has better generalization capabilities. Finally we make experiments to compare the different ways of training a RRGP. We provide some Matlab code for learning RRGPs.

Rasmussen, Carl Edward

2005-01-01
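
The claim that a reduced rank GP is a finite sparse linear model can be written out as follows. This is a sketch of the standard subset-of-regressors form, which is assumed here to be what the abstract refers to; the symbols $m$ (support set size), $k$ (covariance function), $K_{mm}$ (covariance matrix of the support inputs), $k_*$ (the vector of $k(x_*, x_i)$) and $\sigma^2$ (noise variance) are notation introduced for this illustration.

```latex
f(x_*) = \sum_{i=1}^{m} \alpha_i \, k(x_*, x_i),
\qquad \alpha \sim \mathcal{N}\!\left(0,\; K_{mm}^{-1}\right),
\qquad y = f(x) + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2)

\operatorname{Var}\!\left[f(x_*)\right] = k_*^{\top} K_{mm}^{-1} k_*
```

Under this form, and with a covariance function that decays with distance, the prior variance shrinks toward zero for test inputs far from the support set, which is one way to read the abstract's remark that degenerate GPs correspond to inappropriate priors.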

257

Digital Repository Infrastructure Vision for European Research (DRIVER)

Different methods for modelling nonlinear systems are investigated in this paper. Neural network (NN) techniques, multiple linear regression (MLR) and principal component regression (PCR) are applied to two nonlinear systems: a sine function and a distillation column. To study these three distinct methods, all the data are taken from simulation and then separated into training, testing and validation sets. Among those different approaches, the NN approach based on the...

Zainal Ahmad; Yong Fei San

2007-01-01

258

Digital Repository Infrastructure Vision for European Research (DRIVER)

The aim of this paper is to generalize permutation methods for multiple testing adjustment of significant partial regression coefficients in a linear regression model used for microarray data. Using a permutation method outlined by Anderson and Legendre [1999] and the permutation P-value adjustment from Simon et al. [2004], the significance of disease related gene expression will be determined and adjusted after accounting for the effects of covariates, which are not restricted to be categori...

Wagner, Brandie D.; Zerbe, Gary O.; Mexal, Sharon; Leonard, Sherry S.

2008-01-01

259

Directory of Open Access Journals (Sweden)

Full Text Available Taking the determination of micro-Fe by 1,10-phenanthroline spectrophotometry as an example, this paper systematically introduces the application of the combinatorial measurement and regression analysis method in instrumental analysis, covering its methodological principle, operating steps and data processing, including: establishing the best linear equation of the calibration curve, establishing the best linear equation of the measurand, and calculating the best value of a concentration. The results showed that, for the mean of three determinations, s = 0 μg/mL, RSD = 0. Results of a preliminary application in basic instrumental analysis (atomic absorption spectrophotometry, ion-selective electrodes, coulometry and polarographic analysis) are briefly introduced and contrasted with the results of normal measurements.

Hongyi Zheng

2014-05-01

260

International Nuclear Information System (INIS)

A plot of lung-cancer rates versus radon exposures in 965 US counties, or in all US states, has a strong negative slope, b, in sharp contrast to the strong positive slope predicted by linear/no-threshold theory. The discrepancy between these slopes exceeds 20 standard deviations (SD). Including smoking frequency in the analysis substantially improves fits to a linear relationship but has little effect on the discrepancy in b, because correlations between smoking frequency and radon levels are quite weak. Including 17 socioeconomic variables (SEV) in multiple regression analysis reduces the discrepancy to 15 SD. Data were divided into segments by stratifying on each SEV in turn, and on geography, and on both simultaneously, giving over 300 data sets to be analyzed individually, but negative slopes predominated. The slope is negative whether one considers only the most urban counties or only the most rural; only the richest or only the poorest; only the richest in the South Atlantic region or only the poorest in that region, etc.; and for all the strata in between. Since this is an ecological study, the well-known problems with ecological studies were investigated and found not to be applicable here. The "ecological fallacy" was shown not to apply in testing a linear/no-threshold theory, and the vulnerability to confounding is greatly reduced when confounding factors are only weakly correlated with radon levels, as is generally the case here. All confounding factors known to correlate with radon and with lung cancer were investigated quantitatively and found to have little effect on the discrepancy.

261

Directory of Open Access Journals (Sweden)

Full Text Available Data from the Interagency Monitoring of Protected Visual Environments (IMPROVE) network are used to estimate organic mass to organic carbon (OM/OC) ratios across the United States by extending previously published multiple regression techniques. Our new methodology addresses common pitfalls of multiple regression including measurement uncertainty, collinearity of covariates, dataset selection, and model selection. As expected, summertime OM/OC ratios are larger than wintertime values across the US, with all regional median OM/OC values tightly confined between 1.80 and 1.95. Further, we find that OM/OC ratios during the winter are distinctly larger in the eastern US than in the West (regional medians are 1.58, 1.64, and 1.85 in the Great Lakes, southeast, and northeast regions, versus 1.29 and 1.32 in the western and central states). We find less spatial variability in long-term averaged OM/OC ratios across the US (90% of our multiyear regressions estimate OM/OC ratios between 1.37 and 1.94) than previous studies (90% fell between 1.30 and 2.10). We attribute this difference largely to the inclusion of EC as a covariate in previous regression studies. Due to the collinearity of EC and OC, we find that up to one-quarter of the OM/OC estimates in a previous study are biased low. Assumptions about OC measurement artifacts add uncertainty to our estimates of OM/OC. In addition to estimating OM/OC ratios, our technique reveals trends that may be contrasted with conventional assumptions regarding nitrate, sulfate, and soil across the IMPROVE network. For example, our regressions show pronounced seasonal and spatial variability in both nitrate volatilization and sulfate neutralization and hydration.

H. Simon

2011-03-01

262

Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales

We propose a novel framework combining detrended fluctuation analysis with standard regression methodology. The method is built on detrended variances and covariances, and it is designed to estimate regression parameters at different scales and under potential non-stationarity and power-law correlations. Selected examples from physics, finance and environmental sciences illustrate the usefulness of the framework.

Kristoufek, Ladislav

2014-01-01
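
A regression built on detrended variances and covariances can be sketched as below. The abstract does not spell out the estimator, so this sketch assumes that the scale-specific slope is the ratio of the detrended covariance of the two series to the detrended variance of the regressor, with linear detrending in non-overlapping boxes; the function names and the synthetic data are hypothetical.

```python
import random


def profile(series):
    """Cumulative sum of deviations from the mean (the DFA 'profile')."""
    mean = sum(series) / len(series)
    out, total = [], 0.0
    for value in series:
        total += value - mean
        out.append(total)
    return out


def linear_residuals(window):
    """Residuals of a least-squares straight line fitted to one box."""
    n = len(window)
    t_mean = (n - 1) / 2
    w_mean = sum(window) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (w - w_mean) for t, w in enumerate(window))
    slope = sxy / sxx
    return [w - (w_mean + slope * (t - t_mean)) for t, w in enumerate(window)]


def detrended_covariance(x, y, scale):
    """Average over non-overlapping boxes of length `scale` (use scale >= 3)."""
    px, py = profile(x), profile(y)
    total, boxes = 0.0, 0
    for start in range(0, len(px) - scale + 1, scale):
        rx = linear_residuals(px[start:start + scale])
        ry = linear_residuals(py[start:start + scale])
        total += sum(a * b for a, b in zip(rx, ry)) / (scale - 1)
        boxes += 1
    return total / boxes


def dfa_slope(x, y, scale):
    """Assumed estimator: detrended covariance over detrended variance."""
    return detrended_covariance(x, y, scale) / detrended_covariance(x, x, scale)


# Synthetic illustration only: y depends on x with slope 2 and no noise.
random.seed(0)
x_demo = [random.gauss(0.0, 1.0) for _ in range(200)]
y_demo = [2.0 * value + 1.0 for value in x_demo]
slope_at_scale_10 = dfa_slope(x_demo, y_demo, 10)
```

Repeating the call over a range of `scale` values gives one slope per scale, which is the sense in which such a framework estimates dependence at different scales.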

263

Seasonal Regression Models for Electricity Consumption Characteristics Analysis

Directory of Open Access Journals (Sweden)

Full Text Available This paper presents seasonal regression models of demand to investigate electricity consumption characteristics. Electricity consumption in commercial areas in Japan is analyzed using meteorological variables, namely temperature and relative humidity. A dummy variable for holidays is also considered. We have developed models for two levels of period to analyze demand characteristics: half-year models and seasonal models. Several options for each model are calculated and validated by statistical tests to obtain better models. As a result, the half-year and seasonal models present explicit information about how the variables affect demand differently in each period. This specific information helps in analyzing the characteristics of the studied commercial demand.

Yusri Syam Akil

2013-01-01

264

Bias due to 2-stage residual-outcome regression analysis in genetic association studies

Digital Repository Infrastructure Vision for European Research (DRIVER)

Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual- or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, first, a residual-outcome is calculated from a regression of the outcome variable on covariates and then the relationship between the adjusted-outco...

Demissie, Serkalem; Cupples, L. Adrienne

2011-01-01

265

Directory of Open Access Journals (Sweden)

Full Text Available Abstract. Background: Multiple imputation is becoming increasingly popular. Theoretical considerations as well as simulation studies have shown that the inclusion of auxiliary variables is generally of benefit. Methods: A simulation study of a linear regression with a response Y and two predictors X1 and X2 was performed on data with n = 50, 100 and 200 using complete cases or multiple imputation with 0, 10, 20, 40 and 80 auxiliary variables. Mechanisms of missingness were either 100% MCAR or 50% MAR + 50% MCAR. Auxiliary variables had low (r = .10) vs. moderate (r = .50) correlations with the X's and Y. Results: The inclusion of auxiliary variables can improve a multiple imputation model. However, inclusion of too many variables leads to downward bias of regression coefficients and decreases precision. When the correlations are low, inclusion of auxiliary variables is not useful. Conclusion: More research on auxiliary variables in multiple imputation should be performed. A preliminary rule of thumb could be that the ratio of variables to cases with complete data should not go below 1 : 3.

Hardt Jochen

2012-12-01

266

Post hoc and planned comparison procedures for interpreting chi-square contingency table test results are presented, and a planned comparison procedure that simplifies the process of creating a contingency table by creating single-degree-of-freedom contrasts through a regression-based approach is proposed. (SLD)

Beasley, T. Mark; Schumacker, Randall E.

1995-01-01

267

Development of a User Interface for a Regression Analysis Software Tool

An easy-to-use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed such that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface's design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen, which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and graphical user interface are discussed in order to illustrate the user interface's overall design approach.

Ulbrich, Norbert Manfred; Volden, Thomas R.

2010-01-01

268

Nineteen variables, including precipitation, soils and geology, land use, and basin morphologic characteristics, were evaluated to develop Iowa regression models to predict total streamflow (Q), base flow (Qb), storm flow (Qs) and base flow percentage (%Qb) in gauged and ungauged watersheds in the state. Discharge records from a set of 33 watersheds across the state for the 1980 to 2000 period were separated into Qb and Qs. Multiple linear regression found that 75.5 percent of long-term average Q was explained by rainfall, sand content, and row crop percentage variables, whereas 88.5 percent of Qb was explained by these three variables plus permeability and floodplain area variables. Qs was explained by average rainfall, and %Qb was a function of row crop percentage, permeability, and basin slope variables. Regional regression models developed for long-term average Q and Qb were adapted to annual rainfall and showed good correlation between measured and predicted values. Combining the regression model for Q with an estimate of mean annual nitrate concentration, a map of potential nitrate loads in the state was produced. Results from this study have important implications for understanding geomorphic and land use controls on streamflow and base flow in Iowa watersheds and similar agriculture-dominated watersheds in the glaciated Midwest. (JAWRA) (Copyright © 2005).

Schilling, K.E.; Wolter, C.F.

2005-01-01

269

Energy Technology Data Exchange (ETDEWEB)

In this paper, the most relevant multiple regression models for sales forecasting of gas stations, developed over the past ten years, are reviewed. The most significant variables related to gas station sales, the types of multiple regression model (linear or non-linear), their most common uses in supporting decision making and their limits are presented. The predictive power of each model and its impact on decision-making, such as sensitivity analysis and confidence intervals for independent variables, are also discussed. Four models are presented, based on studies conducted in South Africa, Portugal and Brazil. In conclusion, suggestions for future developments are presented based on past developments. (author)

Wanke, Peter [Universidade Federal do Rio de Janeiro (UFRJ), RJ (Brazil). Instituto de Pesquisa e Pos-Graduacao em Administracao de Empresas (COPPEAD). Centro de Estudos em Logistica

2004-07-01

270

Directory of Open Access Journals (Sweden)

Full Text Available We compared probability surfaces derived using one set of environmental variables in three Geographic Information Systems (GIS)-based approaches: logistic regression and Akaike's Information Criterion (AIC), Multiple Criteria Evaluation (MCE), and Bayesian Analysis (specifically Dempster-Shafer theory). We used lynx Lynx canadensis as our focal species, and developed our environment relationship model using track data collected in Banff National Park, Alberta, Canada, during winters from 1997 to 2000. The accuracy of the three spatial models was compared using a contingency table method. We determined the percentage of cases in which both presence and absence points were correctly classified (overall accuracy), the failure to predict a species where it occurred (omission error) and the prediction of presence where there was absence (commission error). Our overall accuracy showed the logistic regression approach was the most accurate (74.51%). The multiple criteria evaluation was intermediate (39.22%), while the Dempster-Shafer (D-S) theory model was the poorest (29.90%). However, omission and commission error tell us a different story: logistic regression had the lowest commission error, while D-S theory produced the lowest omission error. Our results provide evidence that habitat modellers should evaluate all three error measures when ascribing confidence in their model. We suggest that for our study area at least, the logistic regression model is optimal. However, where sample size is small or the species is very rare, it may also be useful to explore and/or use a more ecologically cautious modelling approach (e.g. Dempster-Shafer) that would over-predict, protect more sites, and thereby minimize the risk of missing critical habitat in conservation plans [Current Zoology 55(1): 28-40, 2009].

Shelley M. ALEXANDER

2009-02-01
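
The three measures from the contingency table method (overall accuracy, omission error, commission error) can be computed as in the sketch below. The abstract does not give the denominators, so this sketch assumes omission error is taken over observed presences and commission error over observed absences; the function name and the toy presence/absence vectors are made up for illustration and are not the study's data.

```python
def classification_errors(observed, predicted):
    """Summarise a 2 x 2 presence/absence contingency table.

    observed, predicted: equal-length sequences of truthy (presence)
    and falsy (absence) values.
    """
    tp = fn = fp = tn = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            tp += 1      # presence correctly predicted
        elif obs and not pred:
            fn += 1      # omission: species present but not predicted
        elif not obs and pred:
            fp += 1      # commission: presence predicted where absent
        else:
            tn += 1      # absence correctly predicted
    total = tp + fn + fp + tn
    return {
        "overall_accuracy": (tp + tn) / total,
        "omission_error": fn / (tp + fn),      # assumed denominator: presences
        "commission_error": fp / (fp + tn),    # assumed denominator: absences
    }


# Toy example (hypothetical points, 1 = presence, 0 = absence).
observed_points = [1, 1, 1, 0, 0, 0, 0, 1]
predicted_points = [1, 0, 1, 0, 1, 0, 0, 1]
summary = classification_errors(observed_points, predicted_points)
```

Reporting all three values side by side makes the trade-off described in the abstract visible: a model can lead on overall accuracy while another has the lower omission error.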

271

Directory of Open Access Journals (Sweden)

Full Text Available Objective: To explore the risk factors for the complication of acute renal failure (ARF) in war injuries of the limbs. Methods: The clinical data of 352 patients with limb injuries admitted to 303 Hospital of PLA from 1968 to 2002 were retrospectively analyzed. The patients were divided into an ARF group (n=9) and a non-ARF group (n=343) according to the occurrence of ARF, and a case-control study was carried out. Ten factors which might lead to death were analyzed by logistic regression to screen the risk factors for ARF, including causes of trauma, shock after injury, time of admission to hospital after injury, injured sites, combined trauma, number of surgical procedures, presence of foreign matter, features of fractures, amputation, and tourniquet time. Results: Fifteen of the 352 patients died (4.3%); among them, 7 patients (46.7%) died of ARF, 3 (20.0%) of pulmonary embolism, 3 (20.0%) of gas gangrene, and 2 (13.3%) of multiple organ failure. Univariate analysis revealed that shock, time before admission to hospital, amputation and tourniquet time were risk factors for ARF in the wounded with limb injuries, while the logistic regression analysis showed that only amputation was a risk factor for ARF (P < 0.05). Conclusion: ARF is the primary cause of death in the wounded with limb injury. Prompt and accurate treatment and optimal timing of amputation may help decrease the incidence and mortality of ARF in the wounded with severe limb injury and ischemic necrosis.

Chang-zhi CHENG

2011-06-01

272

Multiple neutron spectrum activation analysis

International Nuclear Information System (INIS)

A new nuclear analytical technique based on neutron source spectrum differentiation has been developed to solve complex problems in multinuclide activation analysis with wide nuclide concentration, half-life, cross-section, radioactivity and counting rate ranges. The new technique thus facilitates the isotopic and elemental analysis of various isotopic compositions of uranium for nuclear safeguard purposes and of uranium-thorium mixtures in geological, archaeological and other samples, as well as the identification and quantitative determination of multiple photopeaks in instrumental neutron activation analysis. In the case of short- and medium-lived nuclides the technique is particularly powerful if it is combined with cyclic activation, intermediate sample storage and proper choice of timing sequences and sample sizes, as well as with irradiation position selection for neutron flux adjustment, in order to optimize the experimental conditions. Thus the counting statistics, and consequently the accuracy and sensitivity of the measurements, can be improved; high counting rates and radiation build-up, which could cause dead-time losses and pulse pile-up effects, can be avoided; and timing and sample positioning uncertainties, as well as matrix interferences and other negative effects on the measurements, can be reduced. Recent intercomparisons with other laboratories, such as the Safeguards Analytical Laboratory of the International Atomic Energy Agency, showed that the new technique can provide high accuracy in uranium element and U-235 abundance determination by delayed fission neutron counting after neutron activation, with certain advantages such as non-destructive sample preparation, high throughput and low analytical cost, and thus can complement or even compete with other well-established analytical techniques.

273

PREDICTION OF GROUND VIBRATIONS IN OPENCAST MINE USING NONLINEAR REGRESSION ANALYSIS

Directory of Open Access Journals (Sweden)

Full Text Available The present work deals with the prediction of ground vibrations in an opencast mine using nonlinear regression analysis. It is very important to control the influence of various blast design parameters in the prediction of ground vibrations. Predictions from nonlinear regression analysis have been compared with actual values observed in the field and are very close to the field values. Three cases have been considered and the ground vibrations are predicted. In the second case, the obtained results matched the measured values from the field data very closely. Thus the nonlinear regression model can be applied to the prediction of ground vibrations in an opencast mine.

Dr.Y.SEETHARAMA RAO

2012-09-01

274

Using Negative Binomial Regression Analysis to Predict Software Faults: A Study of Apache Ant

Directory of Open Access Journals (Sweden)

Full Text Available Negative binomial regression has been proposed as an approach to predicting fault-prone software modules. However, little work has been reported on the strengths, weaknesses, and applicability of this method. In this paper, we present an in-depth study of the effectiveness of using negative binomial regression to predict fault-prone software modules under two different conditions, self-assessment and forward assessment. The performance of the negative binomial regression model is also compared with another popular fault prediction model, the binary logistic regression method. The study is performed on six versions of an open-source object-oriented project, Apache Ant. The study shows that (1) the performance of forward assessment is better than or at least as good as the performance of self-assessment; (2) in predicting fault-prone modules, the negative binomial regression model could not outperform the binary logistic regression model; and (3) negative binomial regression is effective in predicting multiple errors in one module.

Liguo Yu

2012-07-01

275

This paper proposes a five-step process by which to analyze whether the salary ratio between junior and senior college faculty exhibits salary compression, a term used to describe an unusually small differential between faculty with different levels of experience. The procedure utilizes commonly used statistical techniques (multiple regression…

Toutkoushian, Robert K.

276

Directory of Open Access Journals (Sweden)

Full Text Available The multiple linear regression formula for the probability of the averaged daily solar energy reaching a specific location on the earth's surface in a calendar month was obtained under the assumption that the arrival process of clouds and solar energy during the day follows the exponential distribution. This formula enables any user to find some of the required information, such as the maximum probability for the averaged daily solar energy and the corresponding amount of cloud. In addition, the cumulative distribution function of this probability was obtained.

Mohammed Mohammed El Genidy

2012-01-01

277

Directory of Open Access Journals (Sweden)

Full Text Available Air is an efficient means of dispersing atmospheric pollutants, and its behavior depends on the atmospheric movements that occur in the troposphere. In Porto Alegre, Rio Grande do Sul State, there is heavy daily traffic and a concentration of industries that may be responsible for atmospheric emissions. In the present work we studied the behavior of daily concentrations of particulate matter (PM10) in this city, considering the influence of meteorological variables. Data analysis was based on descriptive statistics, linear correlation and multiple regression. Data were provided by the State Foundation of Environmental Protection Henrique Luiz Roessler - RS (FEPAM) and the National Institute of Meteorology (INMET). The analysis showed that: the concentrations of PM10, measured every day at 4:00 p.m., did not exceed national air quality standards; the meteorological elements that influenced PM10 concentrations were the daily average wind speed and the daily average radiation, with negative relations, and the daily average air temperature and the north and northwest wind directions, with positive relations. The wind directions that contribute significantly to lowering concentrations at the measured sites are east and southeast.

Angela Radünz Lazzari

2011-01-01

278

The all rocket mode of operation is shown to be a critical factor in the overall performance of a rocket based combined cycle (RBCC) vehicle. An axisymmetric RBCC engine was used to determine specific impulse efficiency values based upon both full flow and gas generator configurations. Design of experiments methodology was used to construct a test matrix and multiple linear regression analysis was used to build parametric models. The main parameters investigated in this study were: rocket chamber pressure, rocket exit area ratio, injected secondary flow, mixer-ejector inlet area, mixer-ejector area ratio, and mixer-ejector length-to-inlet diameter ratio. A perfect gas computational fluid dynamics analysis, using both the Spalart-Allmaras and k-omega turbulence models, was performed with the NPARC code to obtain values of vacuum specific impulse. Results from the multiple linear regression analysis showed that for both the full flow and gas generator configurations increasing mixer-ejector area ratio and rocket area ratio increase performance, while increasing mixer-ejector inlet area ratio and mixer-ejector length-to-diameter ratio decrease performance. Increasing injected secondary flow increased performance for the gas generator analysis, but was not statistically significant for the full flow analysis. Chamber pressure was found to be not statistically significant.

Smith, Timothy D.; Steffen, Christopher J., Jr.; Yungster, Shaye; Keller, Dennis J.

1998-01-01

279

Additive Intensity Regression Models in Corporate Default Analysis

DEFF Research Database (Denmark)

We consider additive intensity (Aalen) models as an alternative to the multiplicative intensity (Cox) models for analyzing the default risk of a sample of rated, nonfinancial U.S. firms. The setting allows for estimating and testing the significance of time-varying effects. We use a variety of model checking techniques to identify misspecifications. In our final model, we find evidence of time-variation in the effects of distance-to-default and short-to-long term debt. Also we identify interactions between distance-to-default and other covariates, and the quick ratio covariate is significant. None of our macroeconomic covariates are significant.

Lando, David; Medhat, Mamdouh

2013-01-01

280

[An applied study on Fourier transform near-infrared whole spectroscopy regression analysis].

In the present paper, 66 wheat samples were used as experimental materials; 33 of them were used for building the quantitative analysis model of protein content, and the rest composed the prediction set. Using the Moore-Penrose matrix, we directly estimated the regression coefficients of the regression analysis model with Fourier transform near-infrared (FTNIR) whole spectroscopy. The samples of the prediction set were analyzed; the correlation coefficient between the prediction values of the near-infrared model and the standard chemical values obtained by Kjeldahl's method is 0.9799, and the average relative error is 1.76%. Using the Moore-Penrose matrix, we can not only obtain the regression coefficients of the near-infrared spectroscopy analysis model, but also know their contribution at every wavelength point. Consequently, we can understand and explain the physical and chemical significance of the FTNIR whole spectroscopy regression model. PMID:16544481

Zhang, Lu-Da; Wang, Tao; Yang, Li-Ming; Zhao, Li-Li; Zhao, Long-Lian; Li, Jun-Hui; Yan, Yan-Lu

2005-12-01
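
Estimating regression coefficients directly with the Moore-Penrose matrix can be written as follows. The notation is introduced here for illustration: $X$ is the $n \times p$ matrix of calibration spectra ($n = 33$ samples, $p$ wavelength points), $y$ the vector of reference protein contents, and $X^{+}$ the Moore-Penrose pseudoinverse. The closed form for $X^{+}$ assumes $p > n$ with linearly independent rows, which is typical of whole-spectrum data but is not stated in the abstract.

```latex
\hat{b} = X^{+} y,
\qquad X^{+} = X^{\top}\left(X X^{\top}\right)^{-1} \quad (p > n,\ \operatorname{rank} X = n),
\qquad \hat{y}_{\text{new}} = x_{\text{new}}^{\top}\,\hat{b}
```

Under that assumption $\hat{b}$ is the minimum-norm coefficient vector that reproduces the calibration values, and it has one entry per wavelength point, which is what allows a contribution to be read off at every wavelength.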

281

Summary In clinical studies, when censoring is caused by competing risks or patient withdrawal, there is always a concern about the validity of treatment effect estimates that are obtained under the assumption of independent censoring. Since dependent censoring is non-identifiable without additional information, the best we can do is a sensitivity analysis to assess the changes of parameter estimates under different degrees of assumed dependent censoring. Such an analysis is especially useful when knowledge about the degree of dependent censoring is available through literature review or expert opinions. In a regression analysis setting, the consequences of falsely assuming independent censoring on parameter estimates are not clear. Neither the direction nor the magnitude of the potential bias can be easily predicted. We provide an approach to do sensitivity analysis for the Cox proportional hazards models. The joint distribution of the failure and censoring times is assumed to be a function of their marginal distributions. This function is called a copula. Under this constraint, we propose an iteration algorithm to estimate the regression parameters and marginal survival functions. Simulation studies show that this algorithm works well. We apply the proposed sensitivity analysis approach to data from an AIDS clinical trial in which 27% of the patients withdrew due to toxicity or at the request of the patient or investigator. PMID:18266895

Huang, Xuelin; Zhang, Nan

2014-01-01

282

Directory of Open Access Journals (Sweden)

Full Text Available Functional MRI studies have revealed changes in default-mode and salience networks in neurodegenerative dementias, especially in Alzheimer’s disease. The purpose of this study was to analyze the whole brain cortex resting state networks in patients with behavioral variant frontotemporal dementia by using resting state functional MRI. The group specific resting state networks were identified by high model order independent component analysis and a dual regression technique was used to detect between-group differences in the resting state networks with p<0.05 threshold corrected for multiple comparisons. A y-concatenation method was used to correct for multiple comparisons for multiple independent components, grey matter differences as well as the voxel level. We found increased connectivity in several networks within patients with bvFTD compared to the control group. The most prominent enhancement was seen in the right frontotemporal area and insula. A significant increase in functional connectivity was also detected in the left dorsal attention network, in anterior paracingulate – a default mode sub-network as well as in the anterior parts of the frontal pole. Notably the increased patterns of connectivity were seen in areas around atrophic regions. The present results demonstrate abnormal increased connectivity in several important brain networks including the dorsal attention network and default-mode network in patients with behavioral variant frontotemporal dementia. These changes may be associated with decline in executive functions and attention as well as apathy, which are the major cognitive and neuropsychiatric defects in patients with frontotemporal dementia.

AnneMarjaRemes

2013-08-01

283

Stepwise regression analysis of an intensive 1-year study of delirium tremens.

An intensive 1-year study was carried out on 41 male patients, mean age 49, mean hospitalization time 49 days, admitted to a special ward of the Beckomberga Hospital with the diagnosis of delirium tremens and 50 concomitant somatic and psychiatric diagnoses (1-9 per capita), and given a standardized treatment. The mean duration of delirium tremens after admission was 2 days; 76% recovered within 48 h. The duration after admission was positively correlated to age and number of previous delirium tremens, negatively correlated to B-haemoglobin and B-haematocrit for laboratory data obtained within the first 24 h, and positively correlated to blood sugar and S-creatinine on data taken within 40 h (Pearson correlation matrix). Stepwise multiple regression (SWR) based on 46 quantitative and dummy variables (the latter used to represent the presence of various concomitant diseases) was employed to identify the factors predicting the duration of delirium tremens. On final SWR analysis, which limited the number of observations to cases with complete observation vectors, the following regression equation was obtained: Duration after admission = 3.57 - 0.93 (S-magnesium) - 0.29 (B-eosinophils) + 0.62 (liver disease), P > 0.05, n = 14. Although the regression coefficients were not statistically significant, S-magnesium, negatively associated with the duration after admission, offered 20% out of the total 38% of explanation given, whereas B-eosinophils, negatively associated, offered 12%, and liver disease, positively associated, 6%. The choice by the SWR program of S-magnesium as the most important factor in predicting the duration of delirium tremens is consistent with clinical evidence that alcohol ingestion causes magnesium diuresis and that magnesium deficiency is present in chronic alcoholism.
In view of this knowledge, it is reasonable to assume that the lack of statistical significance is due to the small sample size rather than to the alternative that no explanation is offered by S-magnesium. Furthermore, B-haemoglobin, S-potassium, S-ASAT, and S-ALAT, known to be characteristically altered in delirium tremens, were found on forcing (a variant of SWR) to be of secondary importance to S-magnesium as explaining factors, whereas blood sugar and S-creatinine derived part of their explaining power from S-magnesium. In conclusion, extensive use of SWR analysis based on 46 potential explaining variables points to serum magnesium concentration as the most important factor in predicting the duration of delirium tremens. PMID:7468290

Stendig-Lindberg, G; Rudy, N

1980-10-01

284

Regression analysis for a bottom-up approach to analyzing semi-prompt fission gamma yields

International Nuclear Information System (INIS)

Highlights: • Fitting the semi-prompt non-resolved photon spectrum after fission. • Energy–time dependence can be factorized. • Physical model, statistical model, sampling procedure. • The best fit is: lognormal for energy and F for time. Abstract: We present an empirical model that describes the yield of gamma rays emitted by fission in the time interval from 20 to 958 ns following a fission event. The analysis is based on experimental data from neutron-induced fission of 235U and 239Pu. The model is devised by first using regression analysis to identify likely patterns in the data and to choose plausible fitting functions. We provide statistical and physical arguments in support of time and energy independence. The intensity of the emitted gamma rays can be described as a bivariate distribution that is the product of independent variates for energy and time. We test several plausible distribution families for the energy and time variates and use maximum likelihood and minimum χ² to estimate distribution parameters. Because of the uncertainty in the experimental data, multiple combinations of variate pairs give rise to a surface that plausibly fits the observations well. The best-fit variate pair turns out to be lognormal in energy and F in time. The findings illustrated in this paper can be used to simulate gamma ray de-excitation from fission in Monte Carlo codes.

285

Digital Repository Infrastructure Vision for European Research (DRIVER)

The aim of this study was to investigate performance of Multiple Linear Regression (MLR) method in predicting future (next day, next 2 days and next 3 days) PM10 concentration levels in Seberang Perai, Malaysia. The developed model was compared to multiple linear regression models. The model used gaseous (NO2, SO2, CO), PM10 and meteorological parameters (temperature, relative humidity and wind speed) as predictors. Performance indicators such as Pr...

NorAzam Ramli; Ahmad Shukri Yahaya; Ahmad Zia Ul-Saufie; Hazrul Abdul Hamid

2012-01-01

286

Relationships defining the ballistic limit of Space Station Freedom's (SSF) dual wall protection systems have been determined. These functions were regressed from empirical data found in Marshall Space Flight Center's (MSFC) Hypervelocity Impact Testing Summary (HITS) for the velocity range between three and seven kilometers per second. A stepwise linear least squares regression was used to determine the coefficients of several expressions that define a ballistic limit surface. Using statistical significance indicators and graphical comparisons to other limit curves, a final set of expressions is recommended for potential use in Probability of No Critical Flaw (PNCF) calculations for Space Station. The three equations listed below represent the mean curves for normal, 45 degree, and 65 degree obliquity ballistic limits, respectively, for a dual wall protection system consisting of a thin 6061-T6 aluminum bumper spaced 4.0 inches from a 0.125 inch thick 2219-T87 rear wall with multiple layer thermal insulation installed between the two walls. Normal obliquity is d(sub c) = 1.0514 v(exp 0.2983) t(sub 1)(exp 0.5228). Forty-five degree obliquity is d(sub c) = 0.8591 v(exp 0.0428) t(sub 1)(exp 0.2063). Sixty-five degree obliquity is d(sub c) = 0.2824 v(exp 0.1986) t(sub 1)(exp -0.3874). Plots of these curves are provided. A sensitivity study on the effects of using these new equations in the probability of no critical flaw analysis indicated a negligible increase in the performance of the dual wall protection system for SSF over the current baseline. The magnitude of the increase was 0.17 percent over 25 years on the MB-7 configuration run with the Bumper II program code.

Jolly, William H.

1992-01-01
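The three ballistic-limit expressions quoted in the record above all share the power-law form d_c = a · v^b · t_1^c, so they can be evaluated with one small function. The sketch below is illustrative only: the coefficients are copied from the abstract, but the function name, the dictionary keys, and the reading of v as impact velocity and t_1 as a wall thickness are assumptions, since the abstract does not define the symbols or their units.

```python
# Coefficients (a, b, c) of d_c = a * v**b * t1**c, copied from the abstract.
# The keys and the function name are hypothetical, chosen for this sketch.
BALLISTIC_LIMIT_COEFFS = {
    "normal": (1.0514, 0.2983, 0.5228),
    "45deg": (0.8591, 0.0428, 0.2063),
    "65deg": (0.2824, 0.1986, -0.3874),
}


def ballistic_limit(obliquity, v, t1):
    """Evaluate the regressed mean curve d_c for one obliquity case.

    v and t1 must be in whatever units the original report used; the
    abstract does not state them, so no unit conversion is attempted here.
    The abstract only covers the range of 3 to 7 km/s.
    """
    a, b, c = BALLISTIC_LIMIT_COEFFS[obliquity]
    return a * v**b * t1**c


# Compare the three mean curves at one (assumed) velocity and thickness.
for case in BALLISTIC_LIMIT_COEFFS:
    print(case, ballistic_limit(case, v=5.0, t1=0.1))
```

A power law of this kind becomes linear after taking logarithms, ln d_c = ln a + b ln v + c ln t_1, which is presumably why a linear least squares procedure could be used to obtain the coefficients.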

287

Energy Technology Data Exchange (ETDEWEB)

Evaluating the economic feasibility of a bio-gasification facility requires an understanding of its unit cost under different production capacities. The objective of this study was to evaluate the unit cost of syngas production at capacities from 60 through 1800 Nm³/h using an economic model with three regression analysis techniques (simple regression, reciprocal regression, and log-log regression). The preliminary result of this study showed that the reciprocal regression analysis technique gave the best-fit curve between unit cost and production capacity, with a sum of error squares (SES) lower than 0.001 and a coefficient of determination (R²) of 0.996. The regression analysis techniques determined the minimum unit cost of syngas production for micro-scale bio-gasification facilities to be $0.052/Nm³, at a capacity of 2,880 Nm³/h. The results of this study suggest that, to reduce cost, facilities should run at a high production capacity. In addition, this technique could contribute a new categorical criterion for evaluating micro-scale bio-gasification facilities from the perspective of economic analysis.

Deng, Yangyang; Parajuli, Prem B.

2011-08-10
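The reciprocal regression named in the record above can be sketched as an ordinary least squares fit of unit cost against 1/capacity, i.e. cost = a + b / capacity. The code below is a minimal illustration under stated assumptions: the function names are hypothetical, and the capacity and cost figures are synthetic values generated from a known curve, not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_reciprocal(capacity, unit_cost):
    """Fit unit_cost = a + b / capacity by regressing on 1 / capacity."""
    return fit_line([1.0 / q for q in capacity], unit_cost)


def r_squared(ys, fitted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic example (NOT the study's data): cost = 0.05 + 6 / capacity.
capacity = [60.0, 120.0, 300.0, 600.0, 1200.0, 1800.0]
unit_cost = [0.05 + 6.0 / q for q in capacity]

a, b = fit_reciprocal(capacity, unit_cost)
fitted = [a + b / q for q in capacity]
r2 = r_squared(unit_cost, fitted)
print(a, b, r2)
```

In this form the intercept a is the cost approached as capacity grows, which matches the abstract's conclusion that unit cost falls at high production capacity. A log-log fit would instead regress ln(cost) on ln(capacity) with the same `fit_line` helper.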

288

Regularized Multiple-Set Canonical Correlation Analysis

Multiple-set canonical correlation analysis (Generalized CANO or GCANO for short) is an important technique because it subsumes a number of interesting multivariate data analysis techniques as special cases. More recently, it has also been recognized as an important technique for integrating information from multiple sources. In this paper, we…

Takane, Yoshio; Hwang, Heungsun; Abdi, Herve

2008-01-01

289

Regression Models for Demand Reduction based on Cluster Analysis of Load Profiles

Energy Technology Data Exchange (ETDEWEB)

This paper provides new regression models for the demand reduction of Demand Response programs, for the purposes of ex ante evaluation of the programs and of screening customers for enrollment in the programs. The proposed regression models employ, as explanatory variables, load sensitivity to outside air temperature and a representative load pattern derived from cluster analysis of customer baseline load. The performance of the proposed models was examined from the viewpoint of the validity of the explanatory variables and the goodness of fit of the regressions, using actual load profile data of Pacific Gas and Electric Company's commercial and industrial customers who participated in the 2008 Critical Peak Pricing program, including Manual and Automated Demand Response.

Yamaguchi, Nobuyuki; Han, Junqiao; Ghatikar, Girish; Piette, Mary Ann; Asano, Hiroshi; Kiliccote, Sila

2009-06-28

290

The effectiveness of multiple linear regression approaches in removing solar, volcanic, and El Nino Southern Oscillation (ENSO) influences from the recent (1979-2012) surface temperature record is examined, using simple energy balance and global climate models (GCMs). These multiple regression methods are found to incorrectly diagnose the underlying signal, particularly in the presence of a deceleration, by generally overestimating the solar cooling contribution to an early 21st century pause while underestimating the warming contribution from the Mt. Pinatubo recovery. In fact, one-box models and GCMs suggest that the Pinatubo recovery has contributed more to post-2000 warming trends than the solar minimum has contributed to cooling over the same period. After adjusting the observed surface temperature record based on the natural-only multi-model mean from several CMIP5 GCMs and an empirical ENSO adjustment, a significant deceleration in the surface temperature increase is found, ranging in magnitude from -0.06 to -0.12 K dec⁻² depending on model sensitivity and the temperature index used. This likely points to internal decadal variability beyond these solar, volcanic, and ENSO influences.

Masters, T.

2013-11-01

291

Directory of Open Access Journals (Sweden)

Full Text Available Many research groups have been studying the contribution of tropical forests to the global carbon cycle and the climatic consequences of replacing forests with pastures. Considering that soil CO2 efflux is the major component of the carbon cycle of the biosphere, this work derived an equation for estimating the soil CO2 efflux of an area of transition forest, using a multiple regression model for time series data of temperature and soil moisture. The study was carried out in the northwest of Mato Grosso, Brazil (11°24.75’S; 55°19.50’W), in a transition forest between cerrado and Amazon forest, 50 km from Sinop county. Each month, throughout one year, soil CO2 efflux, temperature and soil moisture were measured. The annual average soil CO2 efflux was 7.5 ± 0.6 µmol m⁻² s⁻¹ (mean ± SE), and the annual mean soil temperature was 25.06 ± 0.12 ºC (mean ± SE). The study indicated that humidity had a strong influence on soil CO2 efflux; however, the results were more significant using a multiple regression model that estimated the logarithm of soil CO2 efflux, considering time, soil moisture and the interaction between time duration and the inverse of soil temperature.

Carla Maria Abido Valentini

2008-03-01

292

The application of a multiple regression model for aero radiometric data

International Nuclear Information System (INIS)

The data observed in the total channel of high-sensitivity airborne γ-ray spectrometric surveys are selected as the dependent variable, while those of the Th, K and U channels are considered as independent variables, and a linear statistical model is assumed to relate them: $(\mathrm{Total})_i = \beta_0 + \beta_1 (\mathrm{U})_i + \beta_2 (\mathrm{Th})_i + \beta_3 (\mathrm{K})_i + \varepsilon_i$, where $\beta_1$, $\beta_2$, $\beta_3$ are the partial regression coefficients and $\varepsilon_i$ is the error term. The estimated coefficients ($\beta_1$, $\beta_2$, $\beta_3$) are used to check the data acquisition system on board, as well as to predict, on occasion, a more appropriate value of the data in case a single data item is not recorded correctly. (author)
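The linear model in the record above, with the total-channel count regressed on the U, Th and K channel counts, can be fitted by solving the least squares normal equations. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the channel readings are synthetic values invented for illustration, and the normal-equation solver is a generic textbook method rather than the procedure used by the authors.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_total_channel(u, th, k, total):
    """Least squares fit of total = b0 + b1*U + b2*Th + b3*K; returns [b0..b3]."""
    design = [[1.0, ui, thi, ki] for ui, thi, ki in zip(u, th, k)]
    p = 4
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, total)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_total(coeffs, u, th, k):
    """Predicted total-channel value, e.g. to stand in for a bad record."""
    b0, b1, b2, b3 = coeffs
    return b0 + b1 * u + b2 * th + b3 * k


# Synthetic readings (NOT survey data), generated from known coefficients.
u = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
th = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
k = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]
total = [10.0 + 1.5 * a + 0.8 * b + 2.0 * c for a, b, c in zip(u, th, k)]

coeffs = fit_total_channel(u, th, k, total)
print(coeffs, predict_total(coeffs, 2.5, 3.0, 4.0))
```

With real survey data the fit would not be exact, and the residuals between recorded and predicted totals are what would flag a suspect record.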

293

Directory of Open Access Journals (Sweden)

Full Text Available Purpose: This paper aims to present a simple method for synthesizing an empirically based model that permits estimation of the maximum displacement of a plate when the shot peening process values are known. Design/methodology/approach: This approach addresses the difficulty of developing a mathematical model that describes the relationship between the shot peening process variables (shot diameter, impact velocity, static preload and coverage) and the curvature of the piece. Such a model was generated through the application of statistical inference methods (multivariable regression and neural networks) to a set of experimental data concerning the application of peen forming processes to a group of 215 aluminium 7050 alloy rectangular plates. Findings: Although the estimated displacements from both models comply reasonably well with the experimental data, the results showed the superiority of the regression model in terms of accuracy. Research limitations/implications: Shot peen forming, a dieless forming process, is one of the most successful methods for producing slight and smooth curvatures on large panels and plates. Through the application of a regulated blast of small round steel shot on the piece surface, a thin internal layer of residual compressive stress causes the elastic stretching of the shotted surface, giving rise to a permanent non-plastic deformation of the whole piece. Although this forming process has been used since the fifties, especially by the aerospace industry, a scientific method for peen forming process planning has not been developed yet. Originality/value: The model can be used as an engineering tool to aid in setting up a peen forming process in order to produce a desired curvature on a given plate.

S. Delijaicov

2010-12-01

294

Energy Technology Data Exchange (ETDEWEB)

The effects of coal properties on indicators of the hydrogenation process (i.e. level of organic matter conversion, yield of liquid products and of fractions with a boiling point higher than 573 K, and hydrogen consumption) are investigated by multiple regression analysis. Data on the hydrogenation of Donbass, Kuzbass and Kansk-Achinsk coals, obtained by a method developed by the Fossil-Fuel Institute, were used. A quantitative correlation was established which can serve as the basis of a mathematical model for forecasting process indicators in relation to the structural parameters of the raw coal. 9 references.

Gagarin, S.G.; Kirilina, T.A.; Krichko, A.A.; Larina, N.K.; Skripchenko, G.B.; Slivinskaya, I.I.; Shulyakovskaya, L.V.

1984-11-01

295

Multivariate Bayesian Logistic Regression for Analysis of Clinical Study Safety Issues

Digital Repository Infrastructure Vision for European Research (DRIVER)

This paper describes a method for a model-based analysis of clinical safety data called multivariate Bayesian logistic regression (MBLR). Parallel logistic regression models are fit to a set of medically related issues, or response variables, and MBLR allows information from the different issues to "borrow strength" from each other. The method is especially suited to sparse response data, as often occurs when fine-grained adverse events are collected from subjects in studies...

Dumouchel, William

2012-01-01

296

This paper presents the application of regression analysis in measuring the relationship between rainfall area and the Timah Tasoh reservoir level. The trend lines of the rainfall data were checked on a monthly basis. The data collected from 2007 until 2012 were analyzed using JMP software to obtain the regression model. The results showed that Wang Kelian has high humidity within Perlis and has contributed significantly to the high level of the Timah Tasoh reservoir.

Noor, Nor Fashihah Mohd; Adnan, Farah Adibah; Saad, Syafawati Ab.; Zakaria, Haslina Binti; Yazid, Nornadia Binti Mohd; Jalil, Mohd Faizal Ab; Kamarudzaman, Ain Nihla

2014-07-01

297

The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis

DEFF Research Database (Denmark)

This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression function. However, the a priori specification of a functional form involves the risk of choosing one that is not similar to the “true” but unknown relationship between the regressors and the dependent variable. This problem, known as parametric misspecification, can result in biased parameter estimates and hence, also in biased measures, which are derived from the estimated parameters. This, in turn, can result in incorrect economic conclusions and recommendations for managers, politicians and decision makers in general. This PhD thesis focuses on a nonparametric econometric approach that can be used to avoid this problem. The main objective is to investigate the applicability of the nonparametric kernel regression method in applied production analysis. The focus of the empirical analyses included in this thesis is the agricultural sector in Poland. Data on Polish farms are used to investigate practically and politically relevant problems and to illustrate how nonparametric regression methods can be used in applied microeconomic production analysis both in panel data and cross-section data settings. The thesis consists of four papers. The first paper addresses problems of parametric and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences within a nonparametric panel data regression framework. 
The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric kernel methods are well-suited to econometric production analysis and can outperform traditional parametric methods. Although the empirical focus of this thesis is on the application of nonparametric kernel regression in applied production analysis, the findings are also applicable to econometric estimations in general.

Czekaj, Tomasz Gerard

2013-01-01

298

Random Decrement and Regression Analysis of Traffic Responses of Bridges

DEFF Research Database (Denmark)

The topic of this paper is the estimation of modal parameters from ambient data by applying the Random Decrement technique. The data from the Queensborough Bridge over the Fraser River in Vancouver, Canada have been applied. The loads producing the dynamic response are ambient, e.g. wind, traffic and small ground motion. The Random Decrement technique is used to estimate the correlation function or the free decays from the ambient data. From these functions, the modal parameters are extracted using the Ibrahim Time Domain method. The possible influence of the traffic mass load on the bridge is investigated by assuming that the response level of the bridge is dependent on the mass of the vehicle load. The eigenfrequencies of the bridge are estimated as a function of the response level. This indicates the degree of influence of the mass load on the estimated eigenfrequencies. The results of the analysis using the Random Decrement technique are compared with results from an analysis based on fast Fourier transformations.

Asmussen, J. C.; Ibrahim, S. R.

1995-01-01

299

Directory of Open Access Journals (Sweden)

Full Text Available The aim of the present study was to appraise prime dependent variables of ophthalmic patients’ satisfaction in a Nigerian public eye care facility with a view to boosting service uptake. It was a cross-sectional study conducted between March and May 2012 in our centre. Consecutive clinic patients (n=251) who met the study’s criteria were recruited. The patients filled in interviewer-administered structured questionnaires. A total of 251 patients were analyzed, comprising 139 males (55.4%) and 112 females (44.6%), a male:female ratio of 1:0.8. The ages of the patients studied ranged from 17 to 92 years, with a mean of 37.2 ± 15.57 years. Bivariate analysis, validated by multiple logistic regression, showed P values of 0.021, 0.008, 0.036, 0.008 and 0.004 for privacy, comfort during eye exam, fairness (non-partiality), thoroughness of examination and expectation, respectively. Satisfaction with overall quality of services was 80.1%. The services of any eye facility should be patient-driven to attain desired goals; therefore the identified areas of patients’ dissatisfaction should be addressed for effective service uptake.

Emmanuel Olu Megbelayin

2014-02-01

300

Directory of Open Access Journals (Sweden)

Full Text Available Linear Least Squares (LLS) is an approach to modeling in regression analysis, applied for prediction and for quantifying the strength of the relationship between dependent and independent variables. There are a number of methods for solving the LLS problem, but as the data size increases and the system becomes ill-conditioned, the classical methods become costly in time and space, with a decreasing level of accuracy. The proposed work is based on prediction and quantification of the strength of the relationship between fasting sugar and post-prandial (PP) sugar and 73 factors that affect diabetes. Due to the large number of independent variables, this diabetes prediction problem presented similar complexities. The ABS method is an approach proven better than other classical approaches for LLS problems, and the ABS algorithm was applied to solve the LLS problem. Separate regression equations were thus obtained for fasting sugar and PP severity.

Soniya Lalwani

2013-06-01

301

DEFF Research Database (Denmark)

This paper proposes a simple method for estimating emissions and predicted environmental concentrations (PECs) in water and air for organic chemicals that are used in household products and industrial processes. The method has been tested on existing data for 63 organic high-production volume chemicals available in the European Chemicals Bureau risk assessment reports (RARs). The method suggests a simple linear relationship between Henry's Law constant, octanol-water coefficient, use and production volumes, and emissions and PECs on a regional scale in the European Union. Emissions and PECs are a result of a complex interaction between chemical properties, production and use patterns and geographical characteristics. A linear relationship cannot capture these complexities; however, it may be applied at a cost-efficient screening level for suggesting critical chemicals that are candidates for an in-depth risk assessment. Uncertainty measures are not available for the RAR data; however, uncertainties for the applied regression models are given in the paper. Evaluation of the methods reveals that between 79% and 93% of all emission and PEC estimates are within one order of magnitude of the reported RAR values. Bearing in mind that the domain of the method comprises organic industrial high-production volume chemicals, four chemicals, prioritized in the Water Framework Directive and the Stockholm Convention on Persistent Organic Pollutants, were used to test the method for estimated emissions and PECs, with corresponding uncertainty intervals, in air and water at regional EU level.

Fauser, Patrik; Thomsen, Marianne

2010-01-01

302

A three-stage framework for gene expression data analysis by L1-norm support vector regression.

The identification of discriminative genes for categorical phenotypes in microarray gene expression data analysis has been extensively studied, especially for disease diagnosis. In recent biological experiments, continuous phenotypes have also been dealt with. For example, the extent of programmed cell death (apoptosis) can be measured by the level of caspase 3 enzyme. Thus, an effective gene selection method for continuous phenotypes is desirable. In this paper, we describe a three-stage framework for gene expression data analysis based on L1-norm support vector regression (L1-SVR). The first stage ranks genes by recursive multiple feature elimination based on L1-SVR. In the second stage, the minimal genes are determined by a kernel regression, which yields the lowest ten-fold cross-validation error. In the last stage, the final non-linear regression model is built with the minimal genes and optimal parameters found by leave-one-out cross-validation. The experimental results show a significant improvement over the current state-of-the-art approach, i.e., the two-stage process, which consists of the gene selection based on L1-SVR and the third stage of the proposed method. PMID:18048121

Kim, Hyunsoo; Zhou, Jeff X; Morse, Herbert C; Park, Haesun

2005-01-01

303

Treating experimental data of inverse kinetic method by unitary linear regression analysis

International Nuclear Information System (INIS)

The theory of treating experimental data from the inverse kinetic method by unitary linear regression analysis is described. Not only the reactivity but also the effective neutron source intensity can be calculated by this method. A computer code was written based on the inverse kinetic method and unitary linear regression analysis. The data of the zero power facility BFS-1 in Russia were processed and the results were compared. The results show that the reactivity and the effective neutron source intensity can be obtained correctly by treating experimental data from the inverse kinetic method using unitary linear regression analysis, and that the precision of reactivity measurement is improved. The central element efficiency can be calculated by using the reactivity. The results also show that the effect of the external neutron source on reactivity measurement should be considered when the reactor power is low and the intensity of the external neutron source is strong. (authors)

304

DEFF Research Database (Denmark)

The Sea Level Thematic Assembly Center in the EU FP7 MyOcean project aims to build a sea level service for multiple satellite sea level observations at a European level for GMES marine applications. It aims to improve the sea level related products to guarantee the sustainability and the quality of the GMES marine core service. One such added value will be a multivariate regression model of sea level variability from multi-satellite and in-situ tide gauge observations, with the aim of improving future sea level prediction at high spatial and temporal resolution, e.g., for human safety. Tide gauge and satellite altimetry data from the last seventeen years have been compared for an area around the UK, and temporal correlation coefficients between them were calculated. The results are extremely encouraging, as we have shown that the detided signal from the response method correlates to more than 90% with satellite altimetry for nearly all tide gauge stations.

Cheng, Yongcun; Andersen, Ole Baltazar

2010-01-01

305

DEFF Research Database (Denmark)

This paper analyses multivariate high frequency financial data using realized covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis, and covariance. It will be based on a fixed interval of time (e.g., a day or week), allowing the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions, and covariances change through time. In particular we provide confidence intervals for each of these quantities.

Barndorff-Nielsen, Ole Eiler; Shephard, N.

2004-01-01
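The realized quantities discussed in the record above are built from sums of products of high frequency returns over a fixed interval such as one day. The sketch below shows the point estimates only, under stated assumptions: the function names are hypothetical, the return series are made-up numbers, and the asymptotic confidence intervals that the paper derives are not reproduced here.

```python
import math


def realized_covariance(returns_x, returns_y):
    """Sum of products of intraperiod returns over one fixed interval."""
    return sum(rx * ry for rx, ry in zip(returns_x, returns_y))


def realized_regression(returns_x, returns_y):
    """Realized slope of y on x: realized covariance / realized variance of x."""
    return (realized_covariance(returns_x, returns_y)
            / realized_covariance(returns_x, returns_x))


def realized_correlation(returns_x, returns_y):
    """Realized covariance scaled by the two realized volatilities."""
    var_x = realized_covariance(returns_x, returns_x)
    var_y = realized_covariance(returns_y, returns_y)
    return realized_covariance(returns_x, returns_y) / math.sqrt(var_x * var_y)


# Made-up intraday returns for one day (NOT market data); y moves as 2 * x.
day_x = [0.010, -0.020, 0.015, -0.005]
day_y = [2.0 * r for r in day_x]

beta = realized_regression(day_x, day_y)
rho = realized_correlation(day_x, day_y)
print(beta, rho)
```

Computing these quantities separately for each day or week gives a time series of realized regressions and correlations, which is how changes through time can be tracked.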

306

The performance of two QSAR methodologies, namely Multiple Linear Regressions (MLR) and Neural Networks (NN), towards the modeling and prediction of antitubercular activity was evaluated and compared. A data set of 173 potentially active compounds belonging to the hydrazide family and represented by 96 descriptors was analyzed. Models were built with Multiple Linear Regressions (MLR), single Feed-Forward Neural Networks (FFNNs), ensembles of FFNNs and Associative Neural Networks (AsNNs) using four different data sets and different types of descriptors. The predictive ability of the different techniques used was assessed and discussed on the basis of different validation criteria, and the results show in general a better performance of AsNNs in terms of learning ability and prediction of antitubercular behaviors when compared with all other methods. MLRs have, however, the advantage of pinpointing the most relevant molecular characteristics responsible for the behavior of these compounds against Mycobacterium tuberculosis. The best results for the larger data set (94 compounds in training set and 18 in test set) were obtained with AsNNs using seven descriptors (R² of 0.874 and RMSE of 0.437 against R² of 0.845 and RMSE of 0.472 in MLRs, for test set). Counter-Propagation Neural Networks (CPNNs) were trained with the same data sets and descriptors. From the scrutiny of the weight levels in each CPNN and the information retrieved from MLRs, a rational design of potentially active compounds was attempted. Two new compounds were synthesized and tested against M. tuberculosis showing an activity close to that predicted by the majority of the models. PMID:24246731

Ventura, Cristina; Latino, Diogo A R S; Martins, Filomena

2013-01-01

307

Varying-coefficient functional linear regression

Functional linear regression analysis aims to model regression relations which include a functional predictor. The analog of the regression parameter vector or matrix in conventional multivariate or multiple-response linear regression models is a regression parameter function in one or two arguments. If, in addition, one has scalar predictors, as is often the case in applications to longitudinal studies, the question arises how to incorporate these into a functional regression model. We study a varying-coefficient approach where the scalar covariates are modeled as additional arguments of the regression parameter function. This extension of the functional linear regression model is analogous to the extension of conventional linear regression models to varying-coefficient models and shares its advantages, such as increased flexibility; however, the details of this extension are more challenging in the functional case. Our methodology combines smoothing methods with regularization by truncation at a finite numb...

Wu, Yichao; Müller, Hans-Georg (doi:10.3150/09-BEJ231)

2011-01-01

308

The validation of an analytical procedure means the evaluation of some performance criteria such as accuracy, sensitivity, linear range, capability of detection, selectivity, calibration curve, etc. This implies the use of different statistical methodologies, some of them related with statistical regression techniques, which may be robust or not. The presence of outlier data has a significant effect on the determination of sensitivity, linear range or capability of detection amongst others, when these figures of merit are evaluated with non-robust methodologies. In this paper some of the robust methods used for calibration in analytical chemistry are reviewed: the Huber M-estimator; the Andrews, Tukey and Welsh GM-estimators; the fuzzy estimators; the constrained M-estimators, CM; the least trimmed squares, LTS. The paper also shows that the mathematical properties of the least median squares (LMS) regression can be of great interest in the detection of outlier data in chemical analysis. A comparative analysis is made of the results obtained by applying these regression methods to synthetic and real data. There is also a review of some applications where this robust regression works in a suitable and simple way that proves very useful to secure an objective detection of outliers. The use of a robust regression is recommended in ISO 5725-5. PMID:18970799

Ortiz, M Cruz; Sarabia, Luis A; Herrero, Ana

2006-10-15

309

Linear Maximum Likelihood Regression Analysis for Untransformed Log-Normally Distributed Data

Directory of Open Access Journals (Sweden)

Full Text Available Medical research data are often skewed and heteroscedastic. It has therefore become practice to log-transform data in regression analysis, in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large simulation study. Log-normal observations were generated according to the simulation models and parameters were estimated using the new ML method, ordinary least-squares regression (LS) and weighted least-squares regression (WLS). All three methods produced unbiased estimates of parameters and expected response, and ML and WLS yielded smaller standard errors than LS. The approximate normality of the Wald statistic, used for tests of the ML estimates, in most situations produced correct type I error risk. Only ML and WLS produced correct confidence intervals for the estimated expected value. ML had the highest power for tests regarding $\beta_1$.

Sara M. Gustavsson

2012-10-01

310

Scientific Electronic Library Online (English)

Full Text Available SciELO Colombia | Language: Spanish. This introductory work presents and describes several multiple regression models and their respective formulation as goal programming problems. It describes the median regression, weighted median regression, quantile regression, weighted quantile regression, and minimax formulation models. It also describes the dual formulation of these models and presents some simple examples to explain the concepts developed and the applications of such models in engineering and the sciences.

Héctor Andrés, López Ospina; Rafael David, López Ospina.

2010-06-01
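The goal programming view of quantile regression described in the abstract above can be sketched as the following linear program. The notation (responses y_i, covariate vectors x_i, coefficients \beta, quantile level \tau, and deviation variables d_i^{+}, d_i^{-}) is assumed here for illustration and is not taken from the paper.

```latex
% Quantile regression at level \tau as a goal (linear) program.
% d_i^{+} and d_i^{-} are the positive and negative deviations from goal i.
\begin{aligned}
\min_{\beta,\,d^{+},\,d^{-}} \quad
  & \sum_{i=1}^{n} \left( \tau\, d_i^{+} + (1-\tau)\, d_i^{-} \right) \\
\text{subject to} \quad
  & y_i - x_i^{\top}\beta = d_i^{+} - d_i^{-}, \qquad i = 1,\dots,n, \\
  & d_i^{+} \ge 0, \quad d_i^{-} \ge 0 .
\end{aligned}

% Minimax (Chebyshev) regression bounds the largest absolute deviation by t.
\begin{aligned}
\min_{\beta,\,t} \quad & t \\
\text{subject to} \quad & -t \le y_i - x_i^{\top}\beta \le t, \qquad i = 1,\dots,n .
\end{aligned}
```

Setting \tau = 1/2 gives median regression up to a constant factor, and a weighted variant would multiply each term of the first objective by a nonnegative weight w_i. Since both problems are linear programs, each has a dual linear program, which is presumably what the abstract means by the dual formulation.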

312

Spline Nonparametric Regression Analysis of Stress-Strain Curve of Confined Concrete

Directory of Open Access Journals (Sweden)

Full Text Available Due to enormous uncertainties in confinement models associated with the maximum compressive strength and ductility of concrete confined by rectilinear ties, the implementation of spline nonparametric regression analysis is proposed herein as an alternative approach. The statistical evaluation is carried out based on 128 large-scale column specimens of either normal- or high-strength concrete tested under uniaxial compression. The main advantage of this kind of analysis is that it can be applied when the trend of the relation between predictor and response variables is not obvious. The error in the analysis can, therefore, be minimized so that it does not depend on the assumption of a particular shape of the curve. This provides higher flexibility in the application. The results of the statistical analysis indicate that the stress-strain curves of confined concrete obtained from the spline nonparametric regression analysis are in good agreement with the experimental curves available in the literature.

Tavio Tavio

2008-01-01

313

Multiple Imputation, Maximum Likelihood and Fully Bayesian methods are the three most commonly used model-based approaches in missing data problems. Although it is easy to show that when the responses are missing at random (MAR), the complete case analysis is unbiased and efficient, the aforementioned methods are still commonly used in practice for this setting. To examine the performance of and relationships between these three methods in this setting, we derive and investigate small sample and asymptotic expressions of the estimates and standard errors, and fully examine how these estimates are related for the three approaches in the linear regression model when the responses are MAR. We show that when the responses are MAR in the linear model, the estimates of the regression coefficients using these three methods are asymptotically equivalent to the complete case estimates under general conditions. One simulation and a real data set from a liver cancer clinical trial are given to compare the properties of these methods when the responses are MAR. PMID:25309677

Ibrahim, Joseph G.

2014-01-01

314

In a watershed framework, the selection of a particular type of water quality model depends on several factors such as complexity of the process being modeled, input data requirements, modeling objectives, and model applicability. For most applications, process-based simulation models or mechanistic models are routinely used to quantify the response of different hydrologic and water quality processes occurring in a watershed. In a complex watershed, the modeling objectives may require the use of multiple models of varying complexity. For instance, both a watershed-scale loading model and a receiving water model may be needed for a watershed of sufficient complexity in which both point and non-point sources of pollution are being modeled. Recently, inductive or data-driven models have been increasingly used for applications in watershed management. Examples of inductive models range from simple linear regression models to more complex nonlinear models based on artificial neural networks. Both linear and non-linear inductive models can be used to fit a mathematical model to a given data set in order to represent a process. Inductive or data-driven models are becoming more and more popular due to their ease of use and simplicity as substitutes for more process-based models in a number of applications. For instance, inductive models may be preferred where 1) computational expense is a critical issue, 2) the process-based deductive models are overparameterized and cannot be adequately calibrated, 3) budgetary constraints do not allow for a complex deductive model, and 4) quick and simple models are needed for integration into an optimal management framework for evaluating multiple scenarios in a relatively short period of time. Either explicit or implicit inductive models can be developed in such applications.
While implicit inductive models require output from a calibrated mechanistic model of the watershed, explicit inductive models can be easily developed using raw data collected for the process being modeled. More recently, inductive models derived using evolutionary and biological principles are becoming increasingly popular. These include artificial intelligence-based models such as artificial neural networks, genetic algorithms, and genetic programming. This paper will compare these techniques among themselves as well as with a simple baseline technique such as multiple linear regression models for application to water quality modeling in a watershed management framework. Example applications include modeling water quality parameters such as pathogens, dissolved oxygen, total nitrogen, and total phosphorus in an urban watershed.

Tufail, M.; Ormsbee, L.

2006-12-01

315

Partially linear censored quantile regression

Digital Repository Infrastructure Vision for European Research (DRIVER)

Censored regression quantile (CRQ) methods provide a powerful and flexible approach to the analysis of censored survival data when standard linear models are felt to be appropriate. In many cases however, greater flexibility is desired to go beyond the usual multiple regression paradigm. One area of common interest is that of partially linear models: one (or more) of the explanatory covariates are assumed to act on the response through a non-linear function. Here the CRQ approach of Portnoy (...

Neocleous, T.; Portnoy, S.

2009-01-01

316

Modelling of investment development of national economy of Ukraine on basis of regression analysis

Digital Repository Infrastructure Vision for European Research (DRIVER)

The article presents the results of modeling the investment development of the national economy of Ukraine during 2001-2011 on the basis of regression analysis. The influence of investment on the economic development of the national economy of Ukraine is determined. Sectors of the national economy are divided into three groups according to the level of investment impact on economic development.

Kuzmin, O.; Pyrog, O.

2013-01-01

317

Family Background Variables as Instruments for Education in Income Regressions: A Bayesian Analysis

The validity of family background variables instrumenting education in income regressions has been much criticized. In this paper, we use data from the 2004 German Socio-Economic Panel and Bayesian analysis to analyze to what degree violations of the strict validity assumption affect the estimation results. We show that, in case of moderate direct…

Hoogerheide, Lennart; Block, Joern H.; Thurik, Roy

2012-01-01

318

Declining Bias and Gender Wage Discrimination? A Meta-Regression Analysis

The meta-regression analysis reveals that there is a strong tendency for discrimination estimates to fall and that wage discrimination exists against women. The biasing effect of researchers' gender of not correcting for selection bias has weakened and changes in the labor market have made it less important.

Jarrell, Stephen B.; Stanley, T. D.

2004-01-01

319

Meta-regression analysis of commensal and pathogenic Escherichia coli survival in soil and water.

The extent to which pathogenic and commensal E. coli (respectively PEC and CEC) can survive, and which factors predominantly determine the rate of decline, are crucial issues from a public health point of view. The goal of this study was to provide a quantitative summary of the variability in E. coli survival in soil and water over a broad range of individual studies and to identify the most important sources of variability. To that end, a meta-regression analysis on available literature data was conducted. The considerable variation in reported decline rates indicated that the persistence of E. coli is not easily predictable. The meta-analysis demonstrated that for soil and water, the type of experiment (laboratory or field), the matrix subtype (type of water and soil), and temperature were the main factors included in the regression analysis. A higher average decline rate in soil of PEC compared with CEC was observed. The regression models explained at best 57% of the variation in decline rate in soil and 41% of the variation in decline rate in water. This indicates that additional factors, not included in the current meta-regression analysis, are of importance but rarely reported. More complete reporting of experimental conditions may allow future inference on the global effects of these variables on the decline rate of E. coli. PMID:24839874

Franz, Eelco; Schijven, Jack; de Roda Husman, Ana Maria; Blaak, Hetty

2014-06-17

320

Study of quantitative structure - property methods of linear regression analysis and neural networks

Directory of Open Access Journals (Sweden)

Full Text Available The dependence of protonation on the values of molecular descriptors for various classes of organic compounds is modeled by the methods of multidimensional regression analysis and neural networks. The advantage of the neural network method for describing quantitative structure-property relationships is shown.

?.?. ??????

2007-02-01

321

What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

Thomas, Emily H.; Galambos, Nora

2004-01-01

322

Cognitive Differentiation Analysis: A Regression Extension of the Reynolds-Sutrick Model.

Cognitive Differentiation Analysis (CDA) represents a method to measure the correspondence of an individual vector or a composite vector of descriptor ratings to a matrix of pair-wise dissimilarity judgments where both sets of judgments are assumed to be ordinal. The zero intercept regression extension of CDA is described. (TJH)

Reynolds, Thomas J.; Sutrick, Kenneth H.

1988-01-01

323

Two-level Haseman-Elston regression for general pedigree data analysis.

The Haseman-Elston (HE) (Haseman and Elston [1972] Behav Genet 2:3-19) method is widely used in genetic linkage studies for quantitative traits. We propose a new version of the HE regression model, a two-level HE regression model (tHE) in which the variance-covariance structure of family data is modeled under the framework of multiple-level regression. An iterative generalized least squares (IGLS) algorithm is adopted to handle the varying variance-covariance structures across families in a simple fashion. In this way, the tHE can compete favorably with any current version of HE in that it can naturally make use of all the trait information available in any general pedigree, simultaneously incorporate individual-level and pedigree-level covariates, marker genotypes for linkage (i.e., the number of alleles shared identically by descent [IBD]), and marker alleles for association. Under the assumption of normality, the method is asymptotically equivalent to the usual variance component model for detecting linkage. For the situation where the assumption of normality is critical, a robust globally consistent estimator of the quantitative trait locus (QTL) variance is available. Complex genetic mechanisms, including gene-gene interaction, gene-environmental interaction, and imprinting, can be directly modeled in this version of HE regression. PMID:15838848

Wang, Tao; Elston, Robert C

2005-07-01

324

Parent Progeny regression analysis in F2 and F3 generations of rice

Directory of Open Access Journals (Sweden)

Full Text Available Parent progeny regression analysis involving F2 and F3 generations of two crosses in rice was undertaken to estimate the genetic potential transferred from one generation to the other by adopting three levels of selection for single plant yield. Significant positive correlation and regression were observed in both crosses at the positive level of selection (mean + 1 SD) between the F3 mean and the corresponding F2 values, indicating that selection for single plant yield at these levels would be effective in both crosses. It indicates the chances of selecting high yielding genotypes at early generations.

Anilkumar, C. Vanniarajan and J. Ramalingam

2011-12-01

325

Statistical methods in regression and calibration analysis of chromosome aberration data

International Nuclear Information System (INIS)

The method of iteratively reweighted least squares for the regression analysis of Poisson distributed chromosome aberration data is reviewed in the context of other fit procedures used in the cytogenetic literature. As an application of the resulting regression curves, methods for calculating confidence intervals on dose from aberration yield are described and compared, and for the linear-quadratic model a confidence interval is given. Emphasis is placed on the rational interpretation and the limitations of various methods from a statistical point of view. (orig./MG)
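As a rough illustration of iteratively reweighted least squares for Poisson-distributed aberration counts, the sketch below fits a linear-quadratic yield curve, yield = c + alpha*D + beta*D^2. The function names, the starting values, the fixed iteration count and the data are assumptions made for illustration; the abstract does not give the report's actual procedure.

```python
def solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def irls_poisson_lq(doses, cells, counts, iterations=25):
    """Fit yield = c + alpha*D + beta*D^2 to Poisson counts by IRLS.

    Under the Poisson assumption the variance of an observed yield is
    mu / cells, so each point is weighted by cells / mu, with mu taken
    from the previous iteration.
    """
    yields = [k / n for k, n in zip(counts, cells)]
    design = [[1.0, d, d * d] for d in doses]
    mu = [max(y, 1e-6) for y in yields]  # start from the observed yields
    coef = None
    for _ in range(iterations):
        w = [n / m for n, m in zip(cells, mu)]
        xtwx = [[sum(wi * r[p] * r[q] for wi, r in zip(w, design))
                 for q in range(3)] for p in range(3)]
        xtwy = [sum(wi * r[p] * y for wi, r, y in zip(w, design, yields))
                for p in range(3)]
        coef = solve(xtwx, xtwy)
        # Keep fitted means positive so the weights stay finite.
        mu = [max(sum(c * v for c, v in zip(coef, r)), 1e-6) for r in design]
    return coef


# Made-up data (not from the report): dose in Gy, cells scored, aberrations.
doses = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]
cells = [5000, 2000, 1000, 1000, 1000, 1000]
counts = [5, 52, 81, 281, 601, 1041]
c, alpha, beta = irls_poisson_lq(doses, cells, counts)
print(c, alpha, beta)
```

The made-up counts follow the curve 0.001 + 0.02 D + 0.06 D^2 exactly, so the fit should return those coefficients up to rounding. With real data the weights change from one iteration to the next until the estimates settle, and a convergence check would normally replace the fixed iteration count.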

326

Key To Effective English Remedial Education: Intimation Derived From Multiple Regression

With the rapid decrease in younger population, Japanese universities/colleges have to face the challenging task of how to reach the annual quota for incoming students. The admission criteria are debased and students with a broad variety of scholastic abilities are being accepted by higher education institutions. Freshmen's deterioration in academic performances is said to be the most crucial factor hindering the implementation of effective curriculum education. Many universities/colleges have to establish remedial education programs to deal with this problem arising from the limited room for student selection. This paper reports an English remedial education program carried out in Nishinippon Institute of Technology, Japan, examining the validities of its course setting, optimizing the prediction models for students' post-course score changes. The analysis is focused on those determinants proved to be responsible for the improvement of students' English proficiencies, verifying the argument that more effective English remedial education can be realized by conducting appropriate instructions and teaching methodology in courses at different levels.

Zhang, Rong; Ishino, Fukuya

2009-05-01

327

A nanocrystalline SnO2 thin film was synthesized by a chemical bath method. The parameters affecting the energy band gap and surface morphology of the deposited SnO2 thin film were optimized using a semi-empirical method. Four parameters, including deposition time, pH, bath temperature and tin chloride (SnCl2·2H2O) concentration were optimized by a factorial method. The factorial used a Taguchi OA (TOA) design method to estimate certain interactions and obtain the actual responses. Statistical evidences in analysis of variance including high F-value (4,112.2 and 20.27), very low P-value (<0.012 and 0.0478), non-significant lack of fit, the determination coefficient (R2 equal to 0.978 and 0.977) and the adequate precision (170.96 and 12.57) validated the suggested model. The optima of the suggested model were verified in the laboratory and results were quite close to the predicted values, indicating that the model successfully simulated the optimum conditions of SnO2 thin film synthesis. PMID:24509767

Ebrahimiasl, Saeideh; Zakaria, Azmi

2014-01-01

328

This activity focuses on basic ideas of linear regression. It covers creating scatterplots from data, describing the association between two variables, and correlation as a measure of linear association. After this activity students will have the knowledge to create output that yields R-square, the slope and intercept, as well as their interpretations. This activity also covers some of the basics about residual analysis and the fit of the linear regression model in certain settings.

2009-01-28
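The quantities this activity asks students to produce (slope, intercept, correlation, R-square and residuals) can be computed by hand as in the minimal sketch below. The function name and the data are made up for illustration and are not part of the activity.

```python
def simple_linear_regression(xs, ys):
    """Least-squares line y = intercept + slope * x, plus r, R-square
    and the residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5  # correlation coefficient
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return slope, intercept, r, r * r, residuals


# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept, r, r_squared, residuals = simple_linear_regression(xs, ys)
print(slope, intercept, r, r_squared, residuals)
```

For a line with an intercept, the residuals always sum to zero, so a residual plot is judged by its pattern (curvature, funnel shape, isolated points) and not by its average.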

329

Regression analysis for general adaptation in pearl millet using different environmental indices.

Regression analyses on grain yield of 20 hybrid and 13 composite varieties of pearl millet (Pennisetum typhoides (Burm. S. & H.)) evaluated at 19 sites in India were performed to assess their relative stability and to compare different measures of environmental values. A large portion of the significant genotype × environment interactions was attributed to the non-linear component and deviations mean squares (S²di) were a very important parameter for selection of stable varieties. The mean grain yield was positively associated with regression coefficients and deviations mean squares. The hybrids MH 31, MH 35, MH 36 and MH 62 and composite populations MP 16, MP 31 and MP 36 possessed general adaptability. The use of dependent, independent and near-independent measures of environmental values has been found to have little influence on the general interpretation of regression analysis in pearl millet. PMID:24257822

Virk, D S; Singh, N B; Srivastava, M; Harinarayana, G

1984-10-01
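The regression coefficients and deviation mean squares mentioned in the abstract above can be illustrated with the regression-on-environmental-index approach commonly attributed to Eberhart and Russell. The abstract does not name the method, so treating it as that approach is an assumption, as are the function name and the data. The sketch uses a "dependent" environmental index (site mean minus grand mean, computed from the same trials) and omits the pooled-error correction usually subtracted from the deviation mean square.

```python
def stability_parameters(yields):
    """yields maps genotype -> list of mean yields, one per environment.

    Returns genotype -> (mean yield, regression coefficient on the
    environmental index, deviation mean square).
    """
    genotypes = list(yields)
    n_env = len(yields[genotypes[0]])
    env_means = [sum(yields[g][j] for g in genotypes) / len(genotypes)
                 for j in range(n_env)]
    grand_mean = sum(env_means) / n_env
    index = [m - grand_mean for m in env_means]  # dependent environmental index
    sum_sq_index = sum(i * i for i in index)
    result = {}
    for g in genotypes:
        y = yields[g]
        mean_g = sum(y) / n_env
        slope = sum(yj * ij for yj, ij in zip(y, index)) / sum_sq_index
        deviations = [yj - mean_g - slope * ij for yj, ij in zip(y, index)]
        # Deviation mean square, without the pooled-error correction.
        dev_ms = sum(d * d for d in deviations) / (n_env - 2)
        result[g] = (mean_g, slope, dev_ms)
    return result


# Made-up yields for three genotypes at four sites.
yields = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 2.0, 2.0, 2.0],
    "G3": [3.0, 5.0, 7.0, 9.0],
}
params = stability_parameters(yields)
for name, (mean_g, slope, dev_ms) in params.items():
    print(name, mean_g, slope, dev_ms)
```

With a dependent index the regression coefficients average to one across genotypes by construction, so a coefficient near one together with a small deviation mean square is what would be read as general adaptability in this framework.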

330

It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916

Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

2014-01-01
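Quantile regression models such as the Q0.25 model mentioned above are fitted by minimizing the check (pinball) loss instead of squared error. The sketch below shows that loss and, for the simplest case of a constant-only model, which observed value minimizes it. The function names and the data are made up for illustration and have nothing to do with the survey data in the paper.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1 if residual < 0 else 0))


def total_check_loss(ys, prediction, tau):
    """Sum of check losses when every observation gets the same prediction."""
    return sum(check_loss(y - prediction, tau) for y in ys)


def best_constant(ys, tau):
    """Observed value that minimizes the total check loss at level tau."""
    return min(sorted(ys), key=lambda c: total_check_loss(ys, c, tau))


# Made-up scores with one extreme value.
scores = [1, 2, 3, 4, 100]
print(best_constant(scores, 0.5))   # median-like fit
print(best_constant(scores, 0.25))  # lower-quartile fit
```

The asymmetric weights are what make each quantile level target a different part of the distribution: at tau = 0.25 an over-prediction costs three times as much as an under-prediction of the same size, which pushes the fit downward.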

332

Digital Repository Infrastructure Vision for European Research (DRIVER)

Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using...

Anderson, Carl A.; Mcrae, Allan F.; Visscher, Peter M.

2006-01-01

333

Critical Regression Analysis of Real Time Industrial Web Data Set Using Data Mining Tool

Digital Repository Infrastructure Vision for European Research (DRIVER)

In today's fast-paced, highly competitive, volatile and challenging world, companies rely heavily on data analysis, obtained both offline and online, to shape their future strategy and to survive in the market. This paper reviews regression analysis techniques applied to real-time web data to analyse different attributes of interest and to predict possible growth factors for the company, so as to enable the company to make possible strategic decisions for the growth of th...

Kohli, Shruti; Gupta, Ankit

2014-01-01

334

Directory of Open Access Journals (Sweden)

Full Text Available Different methods for modelling nonlinear systems are investigated in this paper. Neural network (NN) techniques, multiple linear regression (MLR) and principal component regression (PCR) are applied to two nonlinear systems: a sine function and a distillation column. To study these three distinct methods, all the data are taken from simulation and then separated into training, testing and validation sets. Among these approaches, the NN approach based on the nonlinear prediction technique gives very good performance in both case studies. It is also shown that the MLR model suffers from glitches due to the collinearity of the input variables, whereas the PCR model shows good results in the prediction output. In conclusion, the NN methods exhibit consistent results with the least sum of squared errors (SSE) on the unseen data compared to the other two techniques.

Zainal Ahmad

2007-10-01

335

DEFF Research Database (Denmark)

Colloids are potential carriers for strongly sorbing chemicals in macroporous soils, but predicting the amount of colloids readily available for facilitated chemical transport is an unsolved challenge. This study addresses potential key parameters and predictive indicators when assessing colloid dispersibility and transport at the field scale. Samples representing three measurement scales (1-2 mm aggregates, intact 100 cm3 rings, and intact 6283 cm3 columns) were retrieved from the topsoil of a 1.69 ha agricultural field in a 15 m × 15 m grid (65 locations) to determine soil dispersibility as well as 24 comparison parameters including textural, chemical, and structural (e.g. air permeability) soil properties. The soil dispersibility was determined (i) using a laser diffraction method on 1-2 mm aggregates equilibrated to an initial matric potential of -100 cm H2O, (ii) using an end-over-end shaking on 6.06 cm (diam.) × 3.48 cm (height) intact soil rings equilibrated to an initial matric potential of -5 cm H2O, and (iii) as the accumulated amount of particles leached from 20 cm × 20 cm intact soil columns after 6.5 hr (60 mm accumulated outflow). At all three scales, soil dispersibility was higher in samples collected from the northern part of the field where the greatest leaching of pesticides was observed in a horizontal well at ~ 3.5 m depth during a 9-year monitoring program. This suggests that the three dispersibility methods used are all relevant for field-scale mapping of areas with enhanced risk of colloid-facilitated transport. Subsequently, using multiple linear regression (MLR) analyses, soil dispersibility was predicted at all three sample scales from the 24 measured, geo-referenced parameters to produce sets of only a few promising indicator parameters for evaluating soil stability and particle mobilization on field scale. The MLR analyses at each scale were separated in predictions using all, only north, and only south locations in the field.
We found that different independent variables were included in the regression models when the sample scale increased from aggregate to column level. Generally, the predictive power of the regression models was better on the 1-2 mm aggregate scale than on the intact 100 cm3 and 20 cm × 20 cm scales. Overall, results suggested that different drivers controlled soil dispersibility at the three scales and the two sub-areas of the field. Predictions of soil dispersibility and the risk of colloid-facilitated chemical transport will therefore need to be highly scale- and area-specific.

Nørgaard, Trine; Katuwal, Sheela

2014-01-01

336

Scientific Electronic Library Online (English)

Full Text Available SciELO Chile | Language: Spanish. Hiring new personnel or reassigning existing personnel to specific tasks is an important decision, since getting it right will determine the survival of the company. In this context, it becomes relevant to have a personnel selection model that considers the ambiguous information and degrees of uncertainty involved in assessing the qualitative ratings of the applicants, and that can deliver accurate and precise results, thus ensuring good job performance and reducing the risk associated with bringing in new people. In this work, a personnel selection model under conditions of uncertainty was developed using fuzzy logic, taking as input the job descriptions of a retail company, with overlapping triangular fuzzy variables. It was compared with a classical multiple regression model. The results showed that, in this case, the multiple regression model is more efficient than the chosen fuzzy logic model.

Carlos A, Díaz-Contreras; Alejandra, Aguilera-Rojas; Nathaly, Guillén-Barrientos.

2014-10-01

337

A new cluster-histo-regression analysis for incremental learning from temporal data chunks

Directory of Open Access Journals (Sweden)

Full Text Available In scenarios where data chunks arrive temporally, a good algorithm for exploratory analysis should be able to generate the knowledge and, when the next chunk of data arrives, simply update it online by accumulating the knowledge derived from the recent chunk. Such an incremental learning process in most cases demands a lot of memory, because all earlier data must be carried along while the knowledge is successively updated. In this research work we propose to employ a novel Cluster-Histo-Regression analysis of the chunk to extract the knowledge for the temporal instant and fuse this knowledge through Histo-Regression-Distance analysis with the already accumulated knowledge. We have designed a methodology which (i) discards all those data samples from the chunk which have participated in the knowledge generation process, (ii) demands a minimum amount of memory to carry the accumulated knowledge and (iii) proposes to carry forward only those limited data samples (referred to as hard samples) which could not contribute to the knowledge generated at that moment. Knowledge of each cluster is represented in the form of a histogram for each dimension of the clustered data and is transformed to a regression line for a compact representation of the knowledge. The regression line parameters of the clusters obtained by incremental augmentation have shown an accuracy of up to 100% for some of the data sets that are considered for experimentation.

Nagabhushan P.

2010-03-01

338

DEFF Research Database (Denmark)

Recently there has been an explosion in the availability of bacterial genomic sequences, making possible now an analysis of genomic signatures across more than 800 different bacterial chromosomes, from a wide variety of environments. Using genomic signatures, we pair-wise compared 867 different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable with AT content, phyla, growth temperature, selective pressure, habitat, sequence size, oxygen requirement and pathogenicity as predictors. Many significant factors were associated with the genomic signature, most notably AT content. Phyla was also an important factor, although considerably less so than AT content. Small improvements to the regression model, although significant, were also obtained by factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. The statistics obtained using hierarchical clustering and multinomial regression analysis indicate that the genomic signature is shaped by many factors, and this may explain the varying ability to classify prokaryotic organisms below genus level.

Ussery, David

2009-01-01

339

International Nuclear Information System (INIS)

Predicting the amount of hospital waste produced is helpful for planning the storage, transportation and disposal stages of hospital waste management. Based on this fact, two predictor models, artificial neural networks (ANNs) and multiple linear regression (MLR), were applied to predict the rate of medical waste generation, both in total and by type (sharp, infectious and general). In this study, a 5-fold cross-validation procedure on a database containing a total of 50 hospitals of Fars province (Iran) was used to verify the performance of the models. Three performance measures, MAR, RMSE and R2, were used to evaluate the models. The MLR, as a conventional model, obtained poor prediction performance measure values. However, MLR distinguished hospital capacity and bed occupancy as the more significant parameters. On the other hand, ANNs, a more powerful model which had not previously been applied to predicting the rate of medical waste generation, showed high performance measure values, in particular an R2 value of 0.99, confirming the good fit of the data. Such satisfactory results could be attributed to the non-linear nature of ANNs, which provides the opportunity to relate independent variables to dependent ones non-linearly. In conclusion, the obtained results showed that our ANN-based model approach is very promising and may play a useful role in developing a better cost-effective strategy for waste management in the future.

340

Mitigating the effects of salt and selenium on water quality in the Grand Valley and lower Gunnison River Basin in western Colorado is a major concern for land managers. Previous modeling indicated means to improve the models by including more detailed geospatial data and a more rigorous method for developing the models. After evaluating all possible combinations of geospatial variables, four multiple linear regression models resulted that could estimate irrigation-season salt yield, nonirrigation-season salt yield, irrigation-season selenium yield, and nonirrigation-season selenium yield. The adjusted r-squared and the residual standard error (in units of log-transformed yield) of the models were, respectively, 0.87 and 2.03 for the irrigation-season salt model, 0.90 and 1.25 for the nonirrigation-season salt model, 0.85 and 2.94 for the irrigation-season selenium model, and 0.93 and 1.75 for the nonirrigation-season selenium model. The four models were used to estimate yields and loads from contributing areas corresponding to 12-digit hydrologic unit codes in the lower Gunnison River Basin study area. Each of the 175 contributing areas was ranked according to its estimated mean seasonal yield of salt and selenium.

Linard, Joshua I.

2013-01-01

341

Directory of Open Access Journals (Sweden)

Full Text Available In the present work, support vector machines (SVMs) and multiple linear regression (MLR) techniques were used for quantitative structure–property relationship (QSPR) studies of retention time (tR) in standardized liquid chromatography–UV–mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins), based on molecular descriptors calculated from the optimized 3D structures. By applying missing value, zero and multicollinearity tests with a cutoff value of 0.95, and a genetic algorithm method of variable selection, the most relevant descriptors were selected to build QSPR models. MLR and SVM methods were employed to build the QSPR models. The robustness of the QSPR models was characterized by statistical validation and the applicability domain (AD). The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability measures r2 and q2 are 0.931 and 0.932, respectively, for SVM and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using William’s plot. The effects of different descriptors on the retention times are described.

Fereshteh Shiri

2010-08-01

342

We compared the goodness of fit of three mathematical functions (including: Legendre polynomials, Lidauer-Mäntysaari function and Wilmink function) for describing the lactation curve of primiparous Iranian Holstein cows by using multiple-trait random regression models (MT-RRM). Lactational submodels provided the largest daily additive genetic (AG) and permanent environmental (PE) variance estimates at the end and at the onset of lactation, respectively, as well as low genetic correlations between peripheral test-day records. For all models, heritability estimates were highest at the end of lactation (245 to 305 days) and ranged from 0.05 to 0.26, 0.03 to 0.12 and 0.04 to 0.24 for milk, fat and protein yields, respectively. Generally, the genetic correlations between traits depend on how far apart they are or whether they are on the same day in any two traits. On average, genetic correlations between milk and fat were the lowest and those between fat and protein were intermediate, while those between milk and protein were the highest. Results from all criteria (Akaike's and Schwarz's Bayesian information criterion, and -2*logarithm of the likelihood function) suggested that a model with 2 and 5 coefficients of Legendre polynomials for AG and PE effects, respectively, was the most adequate for fitting the data. PMID:25228285

Kheirabadi, Khabat; Rashidi, Amir; Alijani, Sadegh; Imumorin, Ikhide

2014-11-01

343

Groping Toward Linear Regression Analysis: Newton's Analysis of Hipparchus' Equinox Observations

In 1700, Newton, in designing a new universal calendar contained in the manuscripts known as Yahuda MS 24 from Jewish National and University Library at Jerusalem and analyzed in our recent article in Notes & Records Royal Society (59 (3), Sept 2005, pp. 223-54), attempted to compute the length of the tropical year using the ancient equinox observations reported by a famous Greek astronomer Hipparchus of Rhodes, ten in number. Though Newton had a very thin sample of data, he obtained a tropical year only a few seconds longer than the correct length. The reason lies in Newton's application of a technique similar to modern regression analysis. Actually he wrote down the first of the two so-called "normal equations" known from the Ordinary Least Squares method. Newton also had a vague understanding of qualitative variables. This paper concludes by discussing open historico-astronomical problems related to the inclination of the Earth's axis of rotation. In particular, ignorance about the long-range variation...

Belenkiy, Ari

2008-01-01

344

Analysis of the Evolution of the Gross Domestic Product by Means of Cyclic Regressions

Directory of Open Access Journals (Sweden)

Full Text Available In this article, we carry out an analysis of the regularity of the Gross Domestic Product of a country, in our case the United States. The analysis is based on a new method: cyclic regressions based on the Fourier series of a function. A further change of viewpoint is to consider, instead of the growth rate of GDP, the speed of variation of this rate, computed as a numerical derivative. The obtained results show a cycle of 71 years for this indicator, with a mean square error of 0.93%. The method described allows a prognosis of short-term trends in GDP.

Catalin Angelo Ioan

2011-08-01
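The cyclic-regression idea in the abstract above (a least-squares fit of Fourier terms, applied to a numerically differentiated series) can be sketched as follows. This is only an illustration under stated assumptions: the function name `fit_cyclic_regression`, the known period, and the synthetic series are hypothetical and are not taken from the article, which does not give its exact formulation.

```python
import numpy as np

def fit_cyclic_regression(t, y, period, n_harmonics=1):
    """Least-squares fit of a truncated Fourier series with an assumed, known period."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept, then a cosine and a sine column per harmonic.
    columns = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * np.pi * k * t / period
        columns.append(np.cos(angle))
        columns.append(np.sin(angle))
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    rmse = float(np.sqrt(np.mean((y - fitted) ** 2)))
    return coef, fitted, rmse

# Synthetic series with a 10-step cycle (illustrative only, not GDP data).
t = np.arange(40.0)
y = 3.0 + 1.5 * np.cos(2 * np.pi * t / 10.0) - 0.5 * np.sin(2 * np.pi * t / 10.0)
coef, fitted, rmse = fit_cyclic_regression(t, y, period=10.0)

# "Speed of variation": a numerical derivative of the series by finite differences.
speed = np.gradient(y, t)
```

In practice the period would itself have to be chosen, for example by refitting over a grid of candidate periods and keeping the one with the smallest error.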

345

Directory of Open Access Journals (Sweden)

Full Text Available Credit scoring is a vital topic for banks, since limited financial resources need to be used more effectively. Banks use several credit scoring methods. One of them is to estimate whether a customer applying for credit will make repayments regularly or not. In this study, artificial neural networks and logistic regression analysis were used to support banks’ credit risk prediction and to estimate whether a customer applying for credit will make repayments regularly or not. The results of the study showed that the artificial neural network method is more reliable than logistic regression analysis in estimating a credit applicant’s repayment behaviour.

Hüseyin BUDAK

2012-11-01

346

Directory of Open Access Journals (Sweden)

Full Text Available The objective of this work was to develop a methodology for classifying the ionic composition of irrigation water using multiple linear regression, with electrical conductivity as the dependent variable and the concentrations of cations and anions in the irrigation water as the independent variables; the water was classified according to the weight of each ion in the statistical model. The secondary data source for the research was the database of the Water Analysis and Soil Fertility Laboratory of the Escola Superior de Agricultura de Mossoró (Laboratório de Análise de Água e Fertilidade do Solo, LAAFS/ESAM). All water samples were collected by the farmers of the region where this work was conducted. The regressions were fitted using the stepwise regression procedure, with electrical conductivity as the dependent variable and the analyzed ions (calcium, sodium, potassium, carbonate, bicarbonate and chlorides) as the independent variables in all tested models. The results showed that, under this multiple linear regression criterion, the contribution of each variable to the fitted model varied; the contribution was estimated from the increase in the regression sum of squares as each independent variable was added to the model. The degree of model fit depended on the geological formation of the watershed and on whether the water was collected from a river or from tubular wells. Based on pre-established criteria, waters from the calcareous region of the Chapada do Apodi were classified as calcic-sodic, calcic and chloride when they came from tubular wells, piezometric wells (drilled in unconfined water and known in the region as poço amazonas) and surface water (rivers and lagoons), respectively. Waters from the Baixo Açu region were classified as sodic, magnesian-sodic and sodic for tubular wells (drilled in the Açu sedimentary geological formation), piezometric wells and surface water, respectively.

Celsemy E. Maia

2001-04-01

347

A simulation study of the number of events per variable in logistic regression analysis.

We performed a Monte Carlo study to evaluate the effect of the number of events per variable (EPV) analyzed in logistic regression analysis. The simulations were based on data from a cardiac trial of 673 patients in which 252 deaths occurred and seven variables were cogent predictors of mortality; the number of events per predictive variable was (252/7 =) 36 for the full sample. For the simulations, at values of EPV = 2, 5, 10, 15, 20, and 25, we randomly generated 500 samples of the 673 patients, chosen with replacement, according to a logistic model derived from the full sample. Simulation results for the regression coefficients for each variable in each group of 500 samples were compared for bias, precision, and significance testing against the results of the model fitted to the original sample. For EPV values of 10 or greater, no major problems occurred. For EPV values less than 10, however, the regression coefficients were biased in both positive and negative directions; the large sample variance estimates from the logistic model both overestimated and underestimated the sample variance of the regression coefficients; the 90% confidence limits about the estimated values did not have proper coverage; the Wald statistic was conservative under the null hypothesis; and paradoxical associations (significance in the wrong direction) were increased. Although other factors (such as the total number of events, or sample size) may influence the validity of the logistic model, our findings indicate that low EPV can lead to major problems. PMID:8970487

Peduzzi, P; Concato, J; Kemper, E; Holford, T R; Feinstein, A R

1996-12-01
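The events-per-variable (EPV) arithmetic used in the simulation study above can be written out as a short sketch. The function names are hypothetical, and the default threshold of 10 is simply the value below which the abstract reports problems; it is not a universal rule.

```python
import math

def events_per_variable(n_events, n_predictors):
    """EPV: number of outcome events divided by number of candidate predictors."""
    return n_events / n_predictors

def max_predictors(n_events, min_epv=10):
    """Largest number of predictors that keeps EPV at or above min_epv."""
    return math.floor(n_events / min_epv)

# Figures quoted in the abstract: 252 deaths, seven predictors.
epv_full = events_per_variable(252, 7)   # 252 / 7 = 36.0
limit = max_predictors(252)              # floor(252 / 10) = 25
```

Note that EPV counts events (here, deaths), not patients: the 673-patient sample supports far fewer predictors than its size alone might suggest.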

348

A logistic normal multinomial regression model for microbiome compositional data analysis.

Changes in the human microbiome are associated with many human diseases. Next generation sequencing technologies make it possible to quantify the microbial composition without the need for laboratory cultivation. One important problem of microbiome data analysis is to identify the environmental/biological covariates that are associated with different bacterial taxa. Taxa count data in microbiome studies are often over-dispersed and include many zeros. To account for such over-dispersion, we propose to use an additive logistic normal multinomial regression model to associate the covariates with bacterial composition. The model can naturally account for sampling variabilities and zero observations and also allows for a flexible covariance structure among the bacterial taxa. In order to select the relevant covariates and to estimate the corresponding regression coefficients, we propose a group ℓ1 penalized likelihood estimation method for variable selection and estimation. We develop a Monte Carlo expectation-maximization algorithm to implement the penalized likelihood estimation. Our simulation results show that the proposed method outperforms the group ℓ1 penalized multinomial logistic regression and the Dirichlet multinomial regression models in variable selection. We demonstrate the methods using a data set that links human gut microbiome to micro-nutrients in order to identify the nutrients that are associated with the human gut microbiome enterotype. PMID:24128059

Xia, Fan; Chen, Jun; Fung, Wing Kam; Li, Hongzhe

2013-12-01

349

Digital Repository Infrastructure Vision for European Research (DRIVER)

The paper uses Contingent Valuation to investigate the externalities from linear infrastructures, with a particular concern for their dependence on characteristics of the local context within which they are perceived. We employ Geographical Information Systems and a spatial econometric technique, the Geographic Weighted Regression, integrated in a dichotomous choice CV in order to improve both the sampling design and the econometric analysis of a CV survey. These tools are h...

Frontuto, Vito; Giaccaria, Sergio

2006-01-01

350

DEFF Research Database (Denmark)

The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating eigenvalues and eigenvectors. We give a number of different applications to regression and time series analysis, and show how the reduced rank regression estimator can be derived as a Gaussian maximum likelihood estimator. We briefly mention asymptotic results

Johansen, Søren

2008-01-01
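The reduced rank regression entry above describes an estimator built from eigenvalue calculations. A minimal sketch of the simplest variant is given below, assuming an identity weighting of the responses: fit ordinary least squares, then project the coefficient matrix onto the leading right singular vectors of the fitted values. The Gaussian maximum likelihood version discussed in the abstract is based on canonical correlations instead, so this is an illustration of the idea, not that estimator. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Rank-constrained least-squares coefficients (identity weighting assumed)."""
    # Unconstrained least-squares coefficient matrix, shape (p, q).
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ B_ols
    # Leading right singular vectors of the fitted values span the response
    # directions that the predictors explain best.
    _, _, Vt = np.linalg.svd(fitted, full_matrices=False)
    V_r = Vt[:rank].T
    # Projecting onto those directions gives a coefficient matrix of rank <= rank.
    return B_ols @ V_r @ V_r.T

# Synthetic data with a true rank-1 coefficient matrix (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
B_true = rng.normal(size=(5, 1)) @ rng.normal(size=(1, 4))
Y = X @ B_true + 0.1 * rng.normal(size=(200, 4))
B_hat = reduced_rank_regression(X, Y, rank=1)
```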

351

Robust Outlier Detection in Linear Regression

Digital Repository Infrastructure Vision for European Research (DRIVER)

New methodology of robust outlier detection based on Robustly Studentized Robust Residuals (RSRR) examination is well established in linear regression analysis. Two new robust location estimators of linear regression parameters are developed in simple and multiple cases. Based on these robust estimators we obtain RSRR. We used RSRR to derive a new measure of distance to be used in outlier detection. A graphical display using new measure of distance is constructed for detecting multiple outlie...

Jajo, Nethal K.; Xizhi Wu

2004-01-01

352

Equality Constraints in Multiple Correspondence Analysis.

Application of equality constraints on the categories of a variable is a simple and useful extension of multiple correspondence analysis. Equality is an easy way to incorporate prior knowledge. A procedure to deal with unequal category numbers and with subsets of variables is outlined and illustrated. (SLD)

van Buuren, Stef; de Leeuw, Jan

1992-01-01

353

MULTIPLE PROBIT ANALYSIS WITH A NONZERO BACKGROUND

The 'EM' (Expectation-Maximization) algorithm is applied to probit analysis with multiple independent variables and a nonzero response rate. The equations for the maximum likelihood estimators are relatively simple, and converge in all the cases so far examined. An animal bioassa...

354

1. Central questions of behavioural and evolutionary ecology are what factors influence the reproductive success of dominant breeders and subordinate nonbreeders within animal societies? A complete understanding of any society requires that these questions be answered for all individuals. 2. The clown anemonefish, Amphiprion percula, forms simple societies that live in close association with sea anemones, Heteractis magnifica. Here, we use data from a well-studied population of A. percula to determine the major predictors of reproductive success of dominant pairs in this species. 3. We analyse the effect of multiple predictors on four components of reproductive success, using a relatively new technique from the field of statistical learning: boosted regression trees (BRTs). BRTs have the potential to model complex relationships in ways that give powerful insight. 4. We show that the reproductive success of dominant pairs is unrelated to the presence, number or phenotype of nonbreeders. This is consistent with the observation that nonbreeders do not help or hinder breeders in any way, confirming and extending the results of a previous study. 5. Primarily, reproductive success is negatively related to male growth and positively related to breeding experience. It is likely that these effects are interrelated because males that grow a lot have little breeding experience. These effects are indicative of a trade-off between male growth and parental investment. 6. Secondarily, reproductive success is positively related to female growth and size. In this population, female size is positively related to group size and anemone size, also. These positive correlations among traits likely are caused by variation in site quality and are suggestive of a silver-spoon effect. 7. Noteworthily, whereas reproductive success is positively related to female size, it is unrelated to male size. 
This observation provides support for the size advantage hypothesis for sex change: both individuals maximize their reproductive success when the larger individual adopts the female tactic. 8. This study provides the most complete picture to date of the factors that predict the reproductive success of dominant pairs of clown anemonefish and illustrates the utility of BRTs for analysis of complex behavioural and evolutionary ecology data. PMID:21284624

Buston, Peter M; Elith, Jane

2011-05-01

355

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background About 10-20% of neonates with suspected or proven early onset sepsis (EOS) fail on the empiric antibiotic regimen of ampicillin or penicillin and gentamicin. We aimed to identify clinical and laboratory markers associated with empiric antibiotic treatment failure in neonates with suspected EOS. Methods Maternal and early neonatal characteristics predicting failure of empiric antibiotic treatment were identified by univariate logistic regression analysis from a prospective database of 283 neonates admitted to a neonatal intensive care unit within 72 hours of life and requiring antibiotic therapy with penicillin or ampicillin and gentamicin. Variables identified as significant by univariate analysis were entered into stepwise multiple logistic regression (MLR) analysis and classification and regression tree (CRT) analysis to develop a decision algorithm for clinical application. In order to ensure the earliest possible timing, separate analyses for 24 and 72 hours of age were performed. Results At 24 hours of age, neonates with hypoglycaemia ≤ 2.55 mmol/L together with CRP values > 1.35 mg/L, or those with BW ≤ 678 g, had more than 30% likelihood of treatment failure. In normoglycaemic neonates with higher BW, the best predictors of treatment failure at 24 hours were GA ≤ 27 weeks and, among those with higher GA, WBC ≤ 8.25 × 10^9 L^-1 together with platelet count ≤ 143 × 10^9 L^-1. The algorithm allowed capture of 75% of treatment failure cases with a specificity of 89%. By 72 hours of age, minimum platelet count ≤ 94.5 × 10^9 L^-1 with need for vasoactive treatment, or leukopaenia ≤ 3.5 × 10^9 L^-1, or leukocytosis > 39.8 × 10^9 L^-1, or blood glucose ≤ 1.65 mmol/L allowed capture of 81% of treatment failure cases with a specificity of 88%. The performance of the MLR and CRT models was similar, except for higher specificity of the CRT at 72 h compared to MLR analysis.
Conclusion There is an identifiable group of neonates with high risk of EOS, likely to fail on conventional antibiotic therapy.

Merila Mirjam

2009-11-01

356

Multiple regression models of δ13C and δ15N for fish populations in the eastern Gulf of Mexico

Multiple regression models were created to explain spatial and temporal variation in the δ13C and δ15N values of fish populations on the West Florida Shelf (eastern Gulf of Mexico, USA). Extensive trawl surveys from three time periods were used to acquire muscle samples from seven groundfish species. Isotopic variation (δ13Cvar and δ15Nvar) was calculated as the deviation from the isotopic mean of each fish species. Static spatial data and dynamic water quality parameters were used to create models predicting δ13Cvar and δ15Nvar in three fish species that were caught in the summers of 2009 and 2010. Additional data sets were then used to determine the accuracy of the models for predicting isotopic variation (1) in a different time period (fall 2010) and (2) among four entirely different fish species that were collected during summer 2009. The δ15Nvar model was relatively stable and could be applied to different time periods and species with similar accuracy (mean absolute errors 0.31-0.33‰). The δ13Cvar model had a lower predictive capability and mean absolute errors ranged from 0.42 to 0.48‰. δ15N trends are likely linked to gradients in nitrogen fixation and Mississippi River influence on the West Florida Shelf, while δ13C trends may be linked to changes in algal species, photosynthetic fractionation, and abundance of benthic vs. planktonic basal resources. These models of isotopic variability may be useful for future stable isotope investigations of trophic level, basal resource use, and animal migration on the West Florida Shelf.

Radabaugh, Kara R.; Peebles, Ernst B.

2014-08-01

357

Directory of Open Access Journals (Sweden)

Full Text Available Organophosphorus compounds are a well-known class of toxic chemicals which find their way into the ecosystem owing to their widespread use. Their detection, identification and quantification are a cause of concern the world over. In environmental samples these compounds are detected and estimated through the gas chromatographic response factor. This prompted us to study the quantitative structure-response relationships (QSRR) of the gas chromatographic response factor of organophosphonate esters. In this study, attempts have been made to rationalize the gas chromatographic response factor of twenty-eight organophosphonates in terms of their physicochemical and electronic descriptors. Combinatorial Protocol in Multiple Linear Regression (CP-MLR), a 'filter' based variable selection procedure for model development in structure-activity or property relationship studies, has been used for the variable selection and identification of diverse QSRR models of the GC response factor of organophosphonates. The study has resulted in the identification of ten models (equations), each having two or three descriptors, to account for the response factor of organophosphonates (cross-validated R2, or Q2, of 0.88 to 0.95). The response factor of the compounds is strongly correlated with the total refractivity (TREF), molecular weight (MW) and thermodynamic properties, e.g., enthalpy of vaporization (ENTH). In the study, alkyl groups of these compounds have shown a two-fold influence (namely, steric and branching effects) on the response factor. Also, the study suggests that the polarization of the (d-p)π bond of P=O in these compounds plays a critical role in the formation of the responding species. The steric and electronic properties of organophosphonates play a determining role in the predictive aspect of their gas chromatographic response factor. The study also suggested a mechanism for the formation of the responding species.

Yenamandra S. Prabhakar

2004-03-01

358

DEFF Research Database (Denmark)

Background: The next fifty years will see a drastic increase in the older population. Among other effects, ageing causes a decrease in strength. It is necessary to provide safe and comfortable environments for the elderly. To achieve this, digital human modelling has proved to be a useful and valuable ergonomic tool. Objective: To investigate age and gender effects on the torque-producing ability in the knee and elbow in older adults. To create strength scaled equations based on age, gender, upper/lower limb lengths and masses using multiple linear regression. To reduce the number of dependent parameters based on statistical redundancies, and then validate these equations. Methods: 283 subjects (141 males, 142 females) aged 50-59 years (54.9 +/- 2.9), 60-69 years (65.4 +/- 2.9) and 70-79 years (73.7 +/- 2.7) were tested for maximal voluntary isometric torque of right knee extensors and elbow flexors. Results: Males were significantly stronger than females across all age groups. Elbow peak torque (EPT) was better preserved from the 60s to the 70s, whereas knee peak torque (KPT) reduced significantly (P<0.05) across all age groups. This held true for males and females. Gender, thigh mass and age best predicted KPT (R2=0.60). Gender, forearm mass and age best predicted EPT (R2=0.75). Good cross-validation was established for both elbow and knee models. Conclusion: This cross-sectional study of muscle strength created and validated strength scaled equations of EPT and KPT using only gender, segment mass and age.

D'Souza, Sonia; Rasmussen, John

2012-01-01

359

A wavelet-based latent variable regression (WLVR) method was developed to perform simultaneous quantitative analysis of overlapping spectrophotometric signals. The quality of the noise removal was improved by combining wavelet thresholding with principal component analysis (PCA). A method for selecting the optimum threshold was also developed. Eight error functions were calculated for deducing the number of factors. The latent variables were made by projecting the wavelet-processed signals onto orthogonal basis eigenvectors. Two programs, WMRA and WLVR, were designed to perform wavelet thresholding and simultaneous multicomponent determination. Experimental results showed the WLVR method to be successful even where there was severe overlap of spectra.

Gao, Ling; Ren, Shouxin

2008-12-01

360

Modern near infrared spectroscopy (NIRS), as an indirect analytical technique, is used to carry out quantitative analysis of unknown samples by establishing a model with calibration samples. Taking into account the low sensitivity and poor disturbance rejection of NIRS, a new robust version of the SIMPLS algorithm was constructed in the present paper from a robust covariance matrix for high-dimensional data and robust linear regression. Because SIMPLS is based on the empirical cross-covariance matrix between the response variables and the regressors and on linear least squares regression, its results are affected by abnormal observations in the data set. In order to eliminate their negative impact on the accuracy and reliability of the model, a simple multivariate outlier-detection procedure and a robust estimator for the covariance matrix were embedded in the SIMPLS regression framework, based on the use of information obtained from projections onto the directions that maximize and minimize the kurtosis coefficient of the projected data. Finally, application of the proposed kurtosis-SIMPLS method to NIR analysis was presented, with a comparison to SIMPLS. The results show that the kurtosis-SIMPLS method not only identifies the outliers in the data set at lower computational cost, but also gives better prediction performance and stability for the normal samples. PMID:16961227

Cheng, Zhong; Chen, De-zhao

2006-06-01

361

Directory of Open Access Journals (Sweden)

Full Text Available When the independent variables in a multiple linear regression model are highly linearly correlated, the analysis can be misleading. This happens if the multiple linear regression analysis is based on the common Ordinary Least Squares (OLS) method. In this situation, the ridge regression estimator is suggested instead. We conduct a simulation study to compare the performance of the ridge regression estimator and OLS. We found that the Hoerl and Kennard ridge regression estimation method performs better than the other approaches.

Anwar Fitrianto

2014-01-01
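The contrast between OLS and ridge regression under collinearity, as described in the entry above, can be sketched in a few lines. The ridge constant `k = 1.0` and the synthetic data are arbitrary illustrative assumptions; the Hoerl and Kennard rule for choosing the constant, which the abstract refers to, is not reproduced here.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def ridge(X, y, k):
    """Ridge coefficients: solve (X'X + k I) b = X'y for a given constant k > 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# Two nearly identical predictors (illustrative synthetic data).
rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2])
X = X - X.mean(axis=0)          # centre so that no intercept is needed
y = x1 + x2 + rng.normal(size=n)
y = y - y.mean()

beta_ols = ols(X, y)
beta_ridge = ridge(X, y, k=1.0)  # shrunken, more stable coefficients
```

With predictors this collinear, the OLS coefficients are typically large and of opposite sign, while the ridge coefficients are shrunk toward zero and share the effect more evenly between the two columns.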

362

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Cognitive deficits and multiple psychoactive drug regimens are both common in patients treated for opioid-dependence. Therefore, we examined whether the cognitive performance of patients in opioid-substitution treatment (OST) is associated with their drug treatment variables. Methods Opioid-dependent patients (N = 104) who were treated either with buprenorphine or methadone (n = 52 in each group) were given attention, working memory, verbal, and visual memory tests after they had been in treatment for a minimum of six months. Group-wise results were analysed by analysis of variance. Predictors of cognitive performance were examined by hierarchical regression analysis. Results Buprenorphine-treated patients performed statistically significantly better in a simple reaction time test than methadone-treated ones. No other significant differences between groups in cognitive performance were found. In each OST drug group, approximately 10% of the attention performance could be predicted by drug treatment variables. Use of benzodiazepine medication predicted about 10% of performance variance in working memory. Treatment with more than one other psychoactive drug (other than the opioid or BZD) and frequent substance abuse during the past month predicted about 20% of verbal memory performance. Conclusions Although this study does not prove a causal relationship between multiple prescription drug use and poor cognitive functioning, the results are relevant for psychosocial recovery, vocational rehabilitation, and psychological treatment of OST patients. Especially for patients with BZD treatment, other treatment options should be actively sought.

Rapeli Pekka

2012-11-01

363

Median Regression Analysis of Body Mass Index of Adults in Pakistan

Directory of Open Access Journals (Sweden)

Body Mass Index (BMI) is considered to be the most popular measure of overweight and obesity. Many studies of BMI are limited to computing and interpreting different percentiles of BMI and do not account for the many other covariates affecting BMI. Conventional regression methods estimate how covariates are related to the mean of the dependent variable, but in many situations, as in BMI analysis, we are interested in quantiles rather than mean values. The present study addresses this using median regression. Some important covariates, such as gender, age, marital status, daily working hours, daily exercise routine and number of meat-eaten days per week, are included in the study and found to be significant.

G.R. Pasha

2010-01-01
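Median regression, as used in the abstract above, fits the line that minimizes the sum of absolute residuals rather than squared residuals. The following is a minimal sketch for a single covariate, assuming made-up data and a brute-force search: some optimal least-absolute-deviations line passes through two of the data points, so checking every pair is enough for small samples. This is an illustration only, not the estimation procedure used in the study.

```python
from itertools import combinations


def median_regression(xs, ys):
    """Fit y = intercept + slope * x by minimizing the sum of |residuals|.

    Brute force: try the line through every pair of points with distinct x
    and keep the one with the smallest absolute loss. Fine for tiny samples;
    real analyses would use a linear-programming or quantile-regression solver.
    """
    best = None
    for (xa, ya), (xb, yb) in combinations(zip(xs, ys), 2):
        if xa == xb:
            continue  # vertical line, not a valid regression line
        slope = (yb - ya) / (xb - xa)
        intercept = ya - slope * xa
        loss = sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


# Hypothetical data: four points on y = 2x plus one gross outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 50.0]

intercept, slope = median_regression(xs, ys)
print("median-regression line:", intercept, slope)
```

Unlike a least-squares fit, the fitted line here is not pulled toward the outlier, which is the usual motivation for modelling the median instead of the mean.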

364

Logistic regression analysis of biomarker data subject to pooling and dichotomization.

There is growing interest in pooling specimens across subjects in epidemiologic studies, especially those involving biomarkers. This paper is concerned with regression analysis of epidemiologic data where a binary exposure is subject to pooling and the pooled measurement is dichotomized to indicate either that no subjects in the pool are exposed or that some are exposed, without revealing further information about the exposed subjects in the latter case. The pooling process may be stratified on the disease status (a binary outcome) and possibly other variables but is otherwise assumed random. We propose methods for estimating parameters in a prospective logistic regression model and illustrate these with data from a population-based case-control study of colorectal cancer. Simulation results show that the proposed methods perform reasonably well in realistic settings and that pooling can lead to sizable gains in cost efficiency. We make recommendations with regard to the choice of design for pooled epidemiologic studies. PMID:21953741

Zhang, Z; Liu, A; Lyles, R H; Mukherjee, B

2012-09-28

365

DEFF Research Database (Denmark)

This paper introduces a new estimator to measure the ex-post covariation between high-frequency financial time series under market microstructure noise. We provide an asymptotic limit theory (including feasible central limit theorems) for standard methods such as regression, correlation analysis and covariance, for which we obtain the optimal rate of convergence. We demonstrate some positive semidefinite estimators of the covariation and construct a positive semidefinite estimator of the conditional covariance matrix in the central limit theorem. Furthermore, we indicate how the assumptions on the noise process can be relaxed and how our method can be applied to non-synchronous observations. We also present an empirical study of how high-frequency correlations, regressions and covariances change through time.

Kinnebrock, Silja; Podolskij, Mark

2008-01-01

366

We studied the effect of grazing on the degree of regression of successional vegetation dynamic in a semi-arid Mediterranean matorral. We quantified the spatial distribution patterns of the vegetation by fractal analyses, using the fractal information dimension and spatial autocorrelation measured by detrended fluctuation analyses (DFA). It is the first time that fractal analysis of plant spatial patterns has been used to characterize the regressive ecological succession. Plant spatial patterns were compared over a long-term grazing gradient (low, medium and heavy grazing pressure) and on ungrazed sites for two different plant communities: a middle dense matorral of Chamaerops and Periploca at Sabinar-Romeral and a middle dense matorral of Chamaerops, Rhamnus and Ulex at Requena-Montano. The two communities differed also in the microclimatic characteristics (sea oriented at the Sabinar-Romeral site and inland oriented at the Requena-Montano site). The information fractal dimension increased as we moved from a middle dense matorral to discontinuous and scattered matorral and, finally to the late regressive succession, at Stipa steppe stage. At this stage a drastic change in the fractal dimension revealed a change in the vegetation structure, accurately indicating end successional vegetation stages. Long-term correlation analysis (DFA) revealed that an increase in grazing pressure leads to unpredictability (randomness) in species distributions, a reduction in diversity, and an increase in cover of the regressive successional species, e.g. Stipa tenacissima L. These comparisons provide a quantitative characterization of the successional dynamic of plant spatial patterns in response to grazing perturbation gradient. © 2002 Elsevier Science B.V. All rights reserved.

Alados, C.L.; Pueyo, Y.; Giner, M.L.; Navarro, T.; Escos, J.; Barroso, F.; Cabezudo, B.; Emlen, J.M.

2003-01-01

367

Palatal rugae patterns are relatively unique to an individual and are well protected by the lips, buccal pad of fat and teeth. They are considered to be stable throughout life following completion of growth. Although there is considerable debate on the matter, they can be used successfully in post mortem identification provided an antemortem record exists. Thus the aim of this study was to examine palatal rugae shape among two Indian populations and determine the accuracy in defining the Indian population using logistic regression analysis. The study comprises two groups from geographically different regions of India, with basic origin from Maharashtra and Karnataka states. The sample includes 100 plaster casts equally distributed between the two populations and genders, with age ranging between 18 and 40 years. An impression of the maxillary arch was obtained using alginate impression material and a plaster cast was made. The rugae were delineated on the cast using a sharp graphite pencil under adequate light and magnification and recorded according to the classification given by Kapali et al. and Thomas and Kotze (1983). Chi-square analysis showed a significant difference in wavy, circular and divergent patterns between the two populations. The straight and wavy forms were significant in logistic regression analysis. A predictive value of 71% was obtained in determining the original cases correctly when straight, wavy, curved and circular patterns were assessed. A predictive value of 70% was achieved when all rugae patterns were assessed. The mean number of rugae was greater in females than in males, with the straight pattern showing a statistically significant difference between males and females. Significant differences were recorded among straight, wavy, circular and divergent patterns between the two populations. Consequently this study demonstrates moderate accuracy of palatal rugae pattern using logistic regression analysis in identification of Indians. PMID:22018168

Kotrashetti, Vijayalakshmi S; Hollikatti, Kiran; Mallapur, M D; Hallikeremath, Seema R; Kale, Alka D

2011-11-01

368

Robust best linear estimation for regression analysis using surrogate and instrumental variables.

We investigate methods for regression analysis when covariates are measured with errors. In a subset of the whole cohort, a surrogate variable is available for the true unobserved exposure variable. The surrogate variable satisfies the classical measurement error model, but it may not have repeated measurements. In addition to the surrogate variables that are available among the subjects in the calibration sample, we assume that there is an instrumental variable (IV) that is available for all study subjects. An IV is correlated with the unobserved true exposure variable and hence can be useful in the estimation of the regression coefficients. We propose a robust best linear estimator that uses all the available data, which is the most efficient among a class of consistent estimators. The proposed estimator is shown to be consistent and asymptotically normal under very weak distributional assumptions. For Poisson or linear regression, the proposed estimator is consistent even if the measurement error from the surrogate or IV is heteroscedastic. Finite-sample performance of the proposed estimator is examined and compared with other estimators via intensive simulation studies. The proposed method and other methods are applied to a bladder cancer case-control study. PMID:22285992

Wang, C Y

2012-04-01

369

International Nuclear Information System (INIS)

A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structures of 150 organic drug compounds to their n-octanol-water partition coefficients (log Po/w). Molecular descriptors were derived solely from the 3D structures of the drug molecules. A genetic algorithm was also applied as a variable selection tool in the QSPR analysis. The models were constructed using 110 molecules as a training set, and their predictive ability was tested using 40 compounds. Modeling of log Po/w of these compounds as a function of the theoretically derived descriptors was established by multiple linear regression (MLR). Four descriptors for these compounds are taken as inputs for the model: molecular volume (MV) (geometrical), hydrophilic-lipophilic balance (HLB) (constitutional), hydrogen bond forming ability (HB) (electronic) and polar surface area (PSA) (electrostatic). The use of descriptors calculated only from molecular structure eliminates the need for experimental determination of properties for use in the correlation and allows for the estimation of log Po/w for molecules not yet synthesized. Application of the developed model to a testing set of 40 organic drug compounds demonstrates that the model is reliable, with good predictive accuracy and a simple formulation. The prediction results are in good agreement with the experimental values. The root mean square error of prediction (RMSEP) and squared correlation coefficient (R2) for the MLR model were 0.22 and 0.99, respectively, for the prediction set log Po/w
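The two summary statistics quoted at the end of this abstract, RMSEP and the squared correlation coefficient, can be computed as in the sketch below. The observed and predicted values are invented for illustration and are not taken from the study; the function names are likewise assumptions.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction over a held-out set."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Hypothetical log P values for a small prediction set.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print("RMSEP:", rmsep(observed, predicted))
print("R^2:  ", squared_correlation(observed, predicted))
```

RMSEP is expressed in the units of the response (here log units), whereas the squared correlation is unitless, so the two figures answer different questions about predictive quality.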

370

The mammalian target of rapamycin (mTOR) has an important role in cell growth, proliferation, and survival. mTOR is frequently hyperactivated in cancer, and therefore, it is a clinically validated target for cancer therapy. In this study, we combined exhaustive pharmacophore modeling and quantitative structure-activity relationship (QSAR) analysis to explore the structural requirements for potent mTOR inhibitors employing 210 known mTOR ligands. Genetic function algorithm (GFA) coupled with k nearest neighbor (kNN) and multiple linear regression (MLR) analyses were employed to build self-consistent and predictive QSAR models based on optimal combinations of pharmacophores and physicochemical descriptors. Successful pharmacophores were complemented with exclusion spheres to optimize their receiver operating characteristic curve (ROC) profiles. Optimal QSAR models and their associated pharmacophore hypotheses were validated by identification and experimental evaluation of several new promising mTOR inhibitory leads retrieved from the National Cancer Institute (NCI) structural database. The most potent hit illustrated an IC50 value of 48 nM. PMID:24050502

Khanfar, Mohammad A; Taha, Mutasem O

2013-10-28

371

Directory of Open Access Journals (Sweden)

Aim: The study aimed to determine the factors associated with periodontal disease (at different levels of severity) by using different regression models for ordinal data. Design: A cross-sectional design was employed using clinical examination and a "questionnaire with interview" method. Materials and Methods: The study was conducted from June 2008 to October 2008 in Dharwad, Karnataka, India. It involved a systematic random sample of 1760 individuals aged 18-40 years. The periodontal disease examination was conducted using the Community Periodontal Index for Treatment Needs (CPITN). Statistical Analysis Used: Regression models for ordinal data with different built-in link functions were used to determine the factors associated with periodontal disease. Results: The study findings indicated that the ordinal regression models with four built-in link functions (logit, probit, Clog-log and nlog-log) displayed similar results, with negligible differences in the significant factors associated with periodontal disease. Factors such as religion, caste, sources of drinking water, timings for sweet consumption, timings for cleaning or brushing the teeth and materials used for brushing teeth were significantly associated with periodontal disease in all ordinal models. Conclusions: The ordinal regression model with Clog-log is a better fit for determining the significant factors associated with periodontal disease than models with logit, probit and nlog-log built-in link functions. Factors such as caste and time for sweet consumption are negatively associated with periodontal disease, whereas religion, sources of drinking water, timings for cleaning or brushing the teeth and materials used for brushing teeth are significantly and positively associated with periodontal disease.

Javali Shivalingappa

2010-01-01
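The four link functions compared in the abstract above differ only in how a linear predictor is mapped to a cumulative probability. The sketch below writes out the inverse of each link using the standard textbook definitions; the function name and the dictionary keys are assumptions for illustration, and no data from the study are used.

```python
import math


def inverse_link(name, eta):
    """Map a linear predictor eta to a cumulative probability in (0, 1)."""
    if name == "logit":
        return 1.0 / (1.0 + math.exp(-eta))
    if name == "probit":
        # Standard normal CDF expressed through the error function.
        return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))
    if name == "cloglog":
        # Complementary log-log: eta = log(-log(1 - p)).
        return 1.0 - math.exp(-math.exp(eta))
    if name == "nloglog":
        # Negative log-log: eta = -log(-log(p)).
        return math.exp(-math.exp(-eta))
    raise ValueError("unknown link: " + name)


for link in ("logit", "probit", "cloglog", "nloglog"):
    values = [round(inverse_link(link, eta), 3) for eta in (-2.0, 0.0, 2.0)]
    print(link, values)
```

The logit and probit curves are symmetric around a probability of 0.5, while the two log-log links are asymmetric in opposite directions, which is one reason model fit can differ between them when the outcome categories are unevenly populated.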

372

A transverse magnetic field was introduced to the arc plasma in the process of welding stainless steel tubes by high-speed Tungsten Inert Gas (TIG) arc welding without filler wire. The influence of the external magnetic field on welding quality was investigated. Nine sets of parameters were designed by means of an orthogonal experiment. The tensile strength of the welded joint and the form factor of the weld were taken as the main criteria of welding quality. A binary quadratic nonlinear regression equation was established with magnetic induction and Ar gas flow rate as the variables. The residual standard deviation was calculated to assess the accuracy of the regression model. The results showed that the regression model was correct and effective in calculating the tensile strength and aspect ratio of the weld. Two 3D regression models were designed, and the effect of magnetic induction on welding quality was then studied.

Lu, Lin; Chang, Yunlong; Li, Yingmin; He, Youyou

2013-05-01

373

Analysis of dynamic multiplicity fluctuations at PHOBOS

This paper presents the analysis of the dynamic fluctuations in the inclusive charged particle multiplicity measured by PHOBOS for Au+Au collisions at √s_NN = 200 GeV within the pseudo-rapidity range of -3 < η < 3. First the definition of the fluctuation observables used in this analysis is presented, together with a discussion of their physics meaning. Then the procedure for the extraction of dynamic fluctuations is described. Some preliminary results are included to illustrate the correlation features of the fluctuation observable. New dynamic fluctuations results will be available in a later publication.

Chai, Zhengwei; PHOBOS Collaboration; Back, B. B.; Baker, M. D.; Ballintijn, M.; Barton, D. S.; Betts, R. R.; Bickley, A. A.; Bindel, R.; Budzanowski, A.; Busza, W.; Carroll, A.; Chai, Z.; Decowski, M. P.; García, E.; George, N.; Gulbrandsen, K.; Gushue, S.; Halliwell, C.; Hamblen, J.; Heintzelman, G. A.; Henderson, C.; Hofman, D. J.; Hollis, R. S.; Holynski, R.; Holzman, B.; Iordanova, A.; Johnson, E.; Kane, J. L.; Katzy, J.; Khan, N.; Kucewicz, W.; Kulinich, P.; Kuo, C. M.; Lin, W. T.; Manly, S.; McLeod, D.; Mignerey, A. C.; Nouicer, R.; Olszewski, A.; Pak, R.; Park, I. C.; Pernegger, H.; Reed, C.; Remsberg, L. P.; Reuter, M.; Roland, C.; Roland, G.; Rosenberg, L.; Sagerer, J.; Sarin, P.; Sawicki, P.; Skulski, W.; Steinberg, P.; Stephans, G. S. F.; Sukhanov, A.; Tang, J. L.; Trzupek, A.; Vale, C.; van Nieuwenhuizen, G. J.; Verdier, R.; Wolfs, F. L. H.; Wosiek, B.; Wozniak, K.; Wuosmaa, A. H.; Wyslouch, B.

2005-01-01

374

Depression in Multiple Sclerosis: A Longitudinal Analysis

Digital Repository Infrastructure Vision for European Research (DRIVER)

High rates of depression are documented in persons with multiple sclerosis (MS), but few studies examine depression over time. This analysis considered data from 607 persons with MS over a seven-year period as part of an ongoing longitudinal study of quality of life in chronic illness. Latent growth curve analysis was used to examine trajectories in depression and the effects of covariates such as age, time since diagnosis of MS, type of MS, and functional limitations on the extent to which d...

Beal, Claudia; Stuifbergen, Alexa K.; Sands, Dolores V.; Brown, Adama

2007-01-01

375

Multiple relational analysis method for uranium mineral metallogenic prediction

International Nuclear Information System (INIS)

After an introduction to the basic principle of relational analysis, a multiple relational analysis method for uranium mineral resources is proposed. The multiple relational analysis prediction method is especially efficient where known ore deposits or ore-bearing units are scarce. Where other prediction methods fail, the multiple relational analysis method proves to work well, with reliability and accuracy. It is fully illustrated with the examples presented

376

Energy Technology Data Exchange (ETDEWEB)

A software tool was created in Fiscal Year 2010 (FY10) that enables multiple-regression correction of well water levels for river-stage effects. This task was conducted as part of the Remediation Science and Technology project of CH2MHILL Plateau Remediation Company (CHPRC). This document contains an overview of the correction methodology and a user’s manual for Multiple Regression in Excel (MRCX) v.1.1. It also contains a step-by-step tutorial that shows users how to use MRCX to correct river effects in two different wells. This report is accompanied by an enclosed CD that contains the MRCX installer application and files used in the tutorial exercises.

Mackley, Rob D.; Spane, Frank A.; Pulsipher, Trenton C.; Allwardt, Craig H.

2010-09-01

377

Estimating the causes of traffic accidents using logistic regression and discriminant analysis.

Factors that affect traffic accidents have been analysed in various ways. In this study, we use the methods of logistic regression and discriminant analysis to determine the damages due to injury and non-injury accidents in the Eskisehir Province. Data were obtained from the accident reports of the General Directorate of Security in Eskisehir; 2552 traffic accidents between January and December 2009 were investigated regarding whether they resulted in injury. According to the results, the effects of traffic accidents were reflected in the variables. These results provide a wealth of information that may aid future measures toward the prevention of undesired results. PMID:23837801

Karacasu, Murat; Ergül, Barış; Altin Yavuz, Arzu

2014-12-01

378

An electro-optical device called an oculometer which tracks a subject's lookpoint as a time function has been used to collect data in a real-time simulation study of instrument landing system (ILS) approaches. The data describing the scanning behavior of a pilot during the instrument approaches have been analyzed by use of a stepwise regression analysis technique. A statistically significant correlation between pilot workload, as indicated by pilot ratings, and scanning behavior has been established. In addition, it was demonstrated that parameters derived from the scanning behavior data can be combined in a mathematical equation to provide a good representation of pilot workload.

Waller, M. C.

1976-01-01

379

LINEAR REGRESSION MODEL IN THE ANALYSIS OF THE GROSS DOMESTIC PRODUCT

Directory of Open Access Journals (Sweden)

As we observe the evolutionary trend of the global economy, it becomes evident that a strict analysis of the evolution of a single micro- or macroeconomic indicator is no longer enough to describe the corresponding phenomenon. The emphasis shifts towards the analysis of the correlations existing between two or more indicators, which can offer much stronger insight into the economic phenomenon. We propose to use the simple linear regression model, a relatively easy and very effective way to establish the correlation between two economic indicators. Measuring the factor’s influence on the indicator will offer additional information on the phenomenon they describe.

Constantin ANGHELACHE

2011-12-01
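The simple linear regression model proposed in the abstract above has a closed-form least-squares solution. The sketch below shows it on invented numbers; the variable names and data are assumptions for illustration and do not come from the paper.

```python
def simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical indicator pairs, e.g. a factor x and a GDP-like response y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]  # exactly y = 1 + 2x, so the fit is exact

intercept, slope = simple_linear_regression(x, y)
print("intercept:", intercept, "slope:", slope)
```

The slope measures how much the response indicator changes per unit change in the explanatory indicator, which is the "factor's influence" the abstract refers to.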

380

KINETIC ANALYSIS OF HIGH-NITROGEN ENERGETIC MATERIALS USING MULTIVARIATE NONLINEAR REGRESSION

Energy Technology Data Exchange (ETDEWEB)

New high-nitrogen energetic materials were synthesized by Hiskey and Naud. J. Opfermann reported a new tool for finding the probable model of complex reactions using multivariate non-linear regression analysis of DSC and TGA data from several measurements run at different heating rates. This study takes the kinetic parameters from the different steps and determines which reaction step is responsible for the runaway reaction by comparing predicted results from the Frank-Kamenetskii equation with the critical temperature found experimentally using the modified Henkin test.

Campbell, M. S. (Mary Stinecipher); Rabie, R. L. (Ronald L.); Diaz-Acosta, I. (Irina); Pulay, P. (Peter)

2001-01-01

381

International Nuclear Information System (INIS)

Statistical analysis of the properties of powder, compacts and wire of tungsten VA was made to determine optimum conditions for plastic working of tungsten and its alloys. The data were collected on 29 parameters and processed on a "Minsk-22" computer. Correlations were found between wire structure and such factors as hardness and density of compacts, fractional composition and volume weight of powder, and others. A regression equation was obtained which connected the structure of 0.52 mm wire with a number of parameters of the initial material

382

Analysis of reactor noise by multi-variate auto-regressive model

International Nuclear Information System (INIS)

The multi-variate auto-regressive model has recently been applied to the noise analysis of nuclear reactor systems. From this standpoint, a system identification study was performed at the Japan Power Demonstration Reactor-2 (JPDR-2), 45 MWt, using pseudo-random signals. The aim of this paper is to further extend and refine this identification problem based on the measured data. Emphasis is on the fact that the results obtained by the non-parametric method can be justified by the parametric one. The feedback map is also elucidated by estimating the noise contribution rate. Results of computation show the effectiveness of the procedure. (author)

383

Soil colour and spectral analysis employing linear regression models I. Effect of organic matter

Directory of Open Access Journals (Sweden)

This work investigates whether the soil reflectance spectral analysis used to calculate the colour characteristics (hue, value, chroma) of soil can be carried out using linear regression models, so that comparison of colour characteristics subsequently becomes possible and statistically documented. To this end the colour of soil samples was calculated through spectrum reflectance in the visible region of dry smooth-rubbed soil samples smaller than 250 mm. The colour parameters of the CIE system assessed by analysis of the spectrum reflectance were converted into Munsell colour system characteristics. Regression in accordance with the piecewise linear model was then applied to the spectrum data. The processing indicated that this model is capable of making satisfactory predictions, above all of the value and secondarily of the chroma of the soil samples. Statistically significant differences in the colour characteristics of horizons of the same profile were detected through the application of the nested model. These differences cannot be detected using the tables of the Munsell colour system. Finally, in each region of the spectrum, qualitative analysis of the effect of the organic matter on the soil colour characteristics was performed, demonstrating its active role in determining the readings for value and chroma.

Moustakas N.K.

2004-03-01

384

International Nuclear Information System (INIS)

Purpose: The goal of this study was to maximize the discrimination between benign and malignant masses in patients with sonographically indeterminate ovarian lesions by means of unenhanced and contrast-enhanced MR imaging, and to develop a computer-assisted diagnosis system. Material and Methods: Findings in precontrast and Gd-DTPA contrast-enhanced MR images of 104 patients with 115 sonographically indeterminate ovarian masses were analyzed, and the results were correlated with histopathological findings. Of 115 lesions, 65 were benign (23 cystadenomas, 13 complex cysts, 11 teratomas, 6 fibrothecomas, 12 others) and 50 were malignant (32 ovarian carcinomas, 7 metastatic tumors of the ovary, 4 carcinomas of the fallopian tubes, 7 others). A logistic regression analysis was performed to discriminate between benign and malignant lesions, and a model of a computer-assisted diagnosis was developed. This model was prospectively tested in 75 cases of ovarian tumors found at other institutions. Results: From the univariate analysis, the following parameters were selected as significant for predicting malignancy (p ≤ 0.05): a solid or cystic mass with a large solid component or wall thickness greater than 3 mm; complex internal architecture; ascites; and bilaterality. Based on these parameters, a model of a computer-assisted diagnosis system was developed with the logistic regression analysis. To distinguish benign from malignant lesions, the maximum cut-off point was obtained between 0.47 and 0.51. In a prospective application of this model, 87% of the lesions were accurately identified as benign or malignant. (orig.)
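The decision rule described in this abstract, a logistic regression score compared against a cut-off, can be sketched as follows. The intercept and coefficients are entirely hypothetical and are not the values fitted in the study; only the four binary findings and the idea of a cut-off near 0.5 come from the abstract.

```python
import math

# Hypothetical coefficients for the four binary findings named in the
# abstract. These numbers are invented for illustration only.
INTERCEPT = -2.0
COEFFICIENTS = {
    "solid_component_or_thick_wall": 1.5,
    "complex_architecture": 1.0,
    "ascites": 1.2,
    "bilaterality": 0.8,
}


def malignancy_probability(findings):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum of b_i * x_i)))."""
    eta = INTERCEPT + sum(
        COEFFICIENTS[name] * float(present) for name, present in findings.items()
    )
    return 1.0 / (1.0 + math.exp(-eta))


def classify(probability, cutoff=0.5):
    """Label a lesion by comparing its predicted probability to a cut-off."""
    return "malignant" if probability >= cutoff else "benign"


no_findings = dict.fromkeys(COEFFICIENTS, False)
two_findings = dict(no_findings,
                    solid_component_or_thick_wall=True,
                    complex_architecture=True)

for case in (no_findings, two_findings):
    p = malignancy_probability(case)
    print(round(p, 3), classify(p))
```

The default cut-off of 0.5 is an assumption that falls inside the 0.47 to 0.51 range reported in the abstract; moving it trades sensitivity against specificity.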

385

Previous studies have identified associations between traffic exposures and a variety of adverse health effects, but many of these studies relied on proximity measures rather than measured or modeled concentrations of specific air pollutants, complicating interpretability of the findings. An increasing number of studies have used land-use regression (LUR) or other techniques to model small-scale variability in concentrations of specific air pollutants. However, these studies have generally considered a limited number of pollutants, focused on outdoor concentrations (or indoor concentrations of ambient origin) when indoor concentrations are better proxies for personal exposures, and have not taken full advantage of statistical methods for source apportionment that may have provided insight about the structure of the LUR models and the interpretability of model results. Given these issues, the primary objective of our study was to determine predictors of indoor and outdoor residential concentrations of multiple traffic-related air pollutants within an urban area, based on a combination of central site monitoring data; geographic information system (GIS) covariates reflecting traffic and other outdoor sources; questionnaire data reflecting indoor sources and activities that affect ventilation rates; and factor-analytic methods to better infer source contributions. As part of a prospective birth cohort study assessing asthma etiology in urban Boston, we collected indoor and/or outdoor 3-to-4 day samples of nitrogen dioxide (NO2) and fine particulate matter with an aerodynamic diameter ≤ 2.5 μm (PM2.5) at 44 residences during multiple seasons of the year from 2003 through 2005. We performed reflectance analysis, x-ray fluorescence spectroscopy (XRF), and high-resolution inductively coupled plasma-mass spectrometry (ICP-MS) on particle filters to estimate the concentrations of elemental carbon (EC), trace elements, and water-soluble metals, respectively. 
We derived multiple indicators of traffic using Massachusetts Highway Department (MHD) data and traffic counts collected outside the residences where the air monitoring was conducted. We used a standardized questionnaire to collect data on home characteristics and occupant behaviors. Additional housing information was collected through property tax records. Ambient concentrations of pollutants as well as meteorological data were collected from centrally located ambient monitors. We used GIS-based LUR models to explain spatial and temporal variability in residential outdoor concentrations of PM2.5, EC, and NO2. We subsequently derived latent-source factors for residential outdoor concentrations using confirmatory factor analysis constrained to nonnegative loadings. We developed LUR models to determine whether GIS covariates and other predictors explain factor variability and thereby support initial factor interpretations. To evaluate indoor concentrations, we developed physically interpretable regression models that explored the relationship between measured indoor and outdoor concentrations, relying on questionnaire data to characterize indoor sources and activities. Because outdoor pollutant concentrations measured directly outside of homes are unlikely to be available for most large epidemiologic studies, we developed regression models to explain indoor concentrations of PM2.5, EC, and NO2 as a function of other, more readily available data: GIS covariates, questionnaire data reflecting both sources and ventilation, and central site monitoring data. As we did for outdoor concentrations, we then derived latent-source factors for residential indoor concentrations and developed regression models explaining variability in these indoor latent-source factors. 
Finally, to provide insight about the effects of improved characterization of exposures for the results of subsequent epidemiologic investigations, we developed a simulation framework to quantitatively compare the implications of using exposure models derived from validation studies with the use of other surrogate models w

Levy, Jonathan I; Clougherty, Jane E; Baxter, Lisa K; Houseman, E Andres; Paciorek, Christopher J

2010-12-01

386

International Nuclear Information System (INIS)

An uncertainty analysis method is proposed here, which uses the Fourier Amplitude Sensitivity Test (FAST) and the Stepwise Regression Technique (SRT). This method is a compromise between the approximation methods [response surface method (RSM) or moments method] and the Monte Carlo method (MCM). It is concluded that: 1. FAST gives the partial variance for each input parameter, which can be used as a global sensitivity ranking of the input parameters, with a moderate number of sampling points compared to crude MCM. 2. SRT is a good tool for constructing the first- or second-order response surface model used later, consisting of the comparatively important parameters. 3. The combined uncertainty analysis method using FAST and SRT can be used for uncertainty/sensitivity analysis of large computer codes at moderate cost, and it will be a useful tool for analyzing the feasibility of newly developed, highly uncertain system models

387

Digital Repository Infrastructure Vision for European Research (DRIVER)

Regularized regression techniques for linear regression have been developed over the last few decades to reduce the flaws of ordinary least squares regression with regard to prediction accuracy. In this paper, new methods for using regularized regression in model choice are introduced, and we identify the conditions in which regularized regression improves our ability to discriminate between models. We applied all the five methods that use penalty-based (regularization) shrinkage to h...

Doreswamy; Vastrad, Chanabasayya M.

2013-01-01

388

Directory of Open Access Journals (Sweden)

This article discusses applications of stepwise and hierarchical multiple regression analyses to research in organizational psychology. Strategies for identifying type I and II errors, and solutions to potential problems that may arise from such errors, are proposed. In addition, phenomena such as suppression, complementarity, and redundancy are reviewed. The article presents examples of research where these phenomena occurred, and the manner in which they were explained by researchers. Some applications of multiple regression analyses to studies involving between-variable interactions are presented, along with tests used to analyze the presence of linearity among variables. Finally, some suggestions are provided for dealing with limitations implicit in multiple regression analyses (stepwise and hierarchical).

Gardênia Abbad

2002-01-01

389

Variable (or wavelength) selection plays an important role in the quantitative analysis of near-infrared (NIR) spectra. A method based on genetic algorithm interval partial least squares regression (GAiPLS) combined with the successive projections algorithm (SPA) was proposed for variable selection in NIR spectroscopy. GAiPLS was used to select informative interval regions across the spectrum, and then SPA was employed to select the most informative variables and to minimize collinearity between those variables in the model. The performance of the proposed method was compared with the full-spectrum model, conventional interval partial least squares regression (iPLS), and backward interval partial least squares regression (BiPLS) for modeling the NIR data sets of pigments in cucumber leaf samples. The multiple linear regression (MLR) model was obtained with eight variables for chlorophylls and five variables for carotenoids selected by SPA. When the SPA model was applied to the prediction of the validation set, the correlation coefficients between the values predicted by MLR and the measured values for the validation data set (r(p)) were 0.917 for chlorophylls and 0.932 for carotenoids. Results show that the proposed method was able to select important wavelengths from the NIR spectra and make the prediction more robust and accurate in quantitative analysis. PMID:20615293

Zou, Xiaobo; Zhao, Jiewen; Mao, Hanpin; Shi, Jiyong; Yin, Xiaopin; Li, Yanxiao

2010-07-01

390

Directory of Open Access Journals (Sweden)

Air is an efficient means of dispersing atmospheric pollutants, and its behavior depends on the atmospheric movements that occur in the troposphere. In Porto Alegre, Rio Grande do Sul State, there is heavy daily traffic and a concentration of industries that may be responsible for atmospheric emissions. In the present work we studied the behavior of daily concentrations of particulate matter (PM10) in this city, considering the influence of meteorological variables. Data analysis was based on descriptive statistics, linear correlation and multiple regression. Data were provided by the State Foundation of Environmental Protection Henrique Luiz Roessler - RS (FEPAM) and the National Institute of Meteorology (INMET). The analysis showed that the concentrations of PM10, measured every day at 4:00 p.m., did not exceed national standards for air quality; the meteorological elements negatively related to PM10 concentrations were the daily average wind speed and the daily average radiation; those positively related were the daily average air temperature and the north and northwest wind directions. The wind directions that contribute significantly to lower concentrations at the measured sites are east and southeast.

Rosana de Cassia de Souza Schneider

2011-03-01

391

International Nuclear Information System (INIS)

Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem that can be treated as a regression problem with multiple inputs and multiple outputs, and which can be solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important task for pre-processing the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. Also, the genetic algorithm and the simplex optimization method are invoked to determine the hyper-parameters of the SVR. Based on the realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) can perform better than that without feature extraction (single SVR) in terms of the reconstruction of the TMPs on epi- and endocardial surfaces. Moreover, compared with the PCA-SVR, the KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.

392

Regression analysis of mixed recurrent-event and panel-count data.

In event history studies concerning recurrent events, two types of data have been extensively discussed. One is recurrent-event data (Cook and Lawless, 2007. The Analysis of Recurrent Event Data. New York: Springer), and the other is panel-count data (Zhao and others, 2010. Nonparametric inference based on panel-count data. Test 20, 1-42). In the former case, all study subjects are monitored continuously; thus, complete information is available for the underlying recurrent-event processes of interest. In the latter case, study subjects are monitored periodically; thus, only incomplete information is available for the processes of interest. In reality, however, a third type of data could occur in which some study subjects are monitored continuously, but others are monitored periodically. When this occurs, we have mixed recurrent-event and panel-count data. This paper discusses regression analysis of such mixed data and presents two estimation procedures for the problem. One is a maximum likelihood estimation procedure, and the other is an estimating equation procedure. The asymptotic properties of both resulting estimators of regression parameters are established. Also, the methods are applied to a set of mixed recurrent-event and panel-count data that arose from a Childhood Cancer Survivor Study and motivated this investigation. PMID:24648408

Zhu, Liang; Tong, Xinwei; Sun, Jianguo; Chen, Manhua; Srivastava, Deo Kumar; Leisenring, Wendy; Robison, Leslie L

2014-07-01

393

Gamma regression improves Haseman-Elston and variance components linkage analysis for sib-pairs.

Existing standard methods of linkage analysis for quantitative phenotypes rest on the assumptions of either ordinary least squares (Haseman and Elston [1972] Behav. Genet. 2:3-19; Sham and Purcell [2001] Am. J. Hum. Genet. 68:1527-1532) or phenotypic normality (Almasy and Blangero [1998] Am. J. Hum. Genet. 68:1198-1199; Kruglyak and Lander [1995] Am. J. Hum. Genet. 57:439-454). The limitations of both these methods lie in the specification of the error distribution in the respective regression analyses. In ordinary least squares regression, the residual distribution is misspecified as being independent of the mean level. Using variance components and assuming phenotypic normality, the dependency on the mean level is correctly specified, but the remaining residual coefficient of variation is constrained a priori. Here it is shown that these limitations can be addressed (for a sample of unselected sib-pairs) using a generalized linear model based on the gamma distribution, which can be readily implemented in any standard statistical software package. The generalized linear model approach can emulate variance components when phenotypic multivariate normality is assumed (Almasy and Blangero [1998] Am. J. Hum Genet. 68: 1198-1211) and is therefore more powerful than ordinary least squares, but has the added advantage of being robust to deviations from multivariate normality and provides (often overlooked) model-fit diagnostics for linkage analysis. PMID:14748009

Barber, Mathew J; Cordell, Heather J; MacGregor, Alex J; Andrew, Toby

2004-02-01

394

Directory of Open Access Journals (Sweden)

Medium density fiberboard (MDF) panels are appropriate for many exterior and interior industrial applications. The degree of surface roughness of MDF plays an important role, since any surface irregularities will affect the final quality of the product. In the present study, regression models were developed to predict surface roughness in drilling MDF panels with carbide step drills. In the development of the predictive models, the drilling parameters of spindle speed, feed rate and drill diameter were considered as model variables. For this purpose, Taguchi's design of experiments was carried out in order to collect surface roughness values. The orthogonal array (OA) and analysis of variance (ANOVA) are employed to study the surface roughness characteristics in the drilling of MDF panels. The objective is to establish a correlation between spindle speed, feed rate and drill diameter and the surface roughness of an MDF panel. The experiments are conducted as per the Taguchi L27 orthogonal array with different cutting conditions. ANOVA and the F-test were used to check the validity of the regression model and to determine the significant parameters affecting the surface roughness. The statistical analysis showed that the feed rate was the most influential parameter for surface roughness. The microstructure of the drilled surfaces was also studied by scanning electron microscopy (SEM). The SEM investigations revealed that drilling MDF panels with a step drill produces surface striations and waviness, which increased significantly with feed rate.

M.I. Rizwan Jamal

2012-01-01

395

A least trimmed square regression method for second level FMRI effective connectivity analysis.

We present a least trimmed square (LTS) robust regression method to combine different runs/subjects for second/high level effective connectivity analysis. The basic idea of this method is to treat the extreme nonlinear model variability as outliers if they exceed a certain threshold. A bootstrap method for the LTS estimation is employed to detect model outliers. We compared the LTS robust method with a non-robust method using simulated and real datasets. The difference between LTS and the non-robust method for second level effective connectivity analysis is significant, suggesting the conventional non-robust method is easily affected by the model variability from the first level analysis. In addition, after these outliers are detected and excluded for the high level analysis, the model coefficients of the second level are combined within the framework of a mixed model. The variance of the mixed model is estimated using the Newton-Raphson (NR) type Levenberg-Marquardt algorithm. Three sets of real data are adopted to compare conventional methods which do not include random effects in the analysis with a mixed model for second level effective connectivity analysis. The results show that the conventional method is significantly different from the mixed model when greater model variability exists, suggesting there is a strong random effect, and the mixed model should be employed for the second level effective connectivity analysis. PMID:23093379

Li, Xingfeng; Coyle, Damien; Maguire, Liam; McGinnity, Thomas Martin

2013-01-01
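
The least trimmed squares idea in the abstract above (fit only the best-fitting subset and treat the rest as outliers) can be sketched as follows. This is a minimal illustration for a single predictor that assumes a simple concentration-step iteration started from the ordinary least squares fit; it is not the bootstrap-based procedure the authors describe, a single start may reach only a local optimum, and the names `ols_fit`, `lts_fit` and the toy data are hypothetical.

```python
def ols_fit(points):
    """Ordinary least squares line through (x, y) pairs; returns (slope, intercept)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    slope = sxy / sxx
    return slope, my - slope * mx


def lts_fit(points, h, max_iter=20):
    """Approximate least trimmed squares fit that keeps the h best-fitting points."""
    slope, intercept = ols_fit(points)  # start from the non-robust fit
    subset = None
    for _ in range(max_iter):
        # Rank observations by squared residual under the current line.
        ranked = sorted(
            range(len(points)),
            key=lambda i: (points[i][1] - (slope * points[i][0] + intercept)) ** 2,
        )
        new_subset = sorted(ranked[:h])
        if new_subset == subset:  # subset stopped changing: converged
            break
        subset = new_subset
        slope, intercept = ols_fit([points[i] for i in subset])
    return slope, intercept, subset


# Toy data: y = 2x + 1 with one gross outlier at x = 9 (hypothetical values).
data = [(float(x), 2.0 * x + 1.0) for x in range(9)] + [(9.0, 100.0)]
naive_slope, naive_intercept = ols_fit(data)
robust_slope, robust_intercept, kept = lts_fit(data, h=8)
```

On this toy set the ordinary fit is pulled strongly toward the outlier, while the trimmed fit recovers the underlying line and leaves the outlying point out of the retained subset.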

396

International Nuclear Information System (INIS)

Observing how the equipment and piping systems installed in an operating nuclear power plant behave during earthquakes is very important for evaluating and confirming the adequacy and the safety margin expected at the design stage. By analyzing observed earthquake records, valuable data on the behavior of these systems during earthquakes can be obtained, and information about their aseismic design parameters can be extracted. From these viewpoints, an earthquake observation system was installed in a reactor building in an operating plant. Up to now, the records of three earthquakes have been obtained with this system. In this paper, an example of the analysis of earthquake records is shown; the main purpose of the analysis was to evaluate the vibration mode, natural frequency and damping factor of this piping system. Prior to the earthquake record analysis, an eigenvalue analysis of this piping system was performed. Auto-regressive analysis was applied to the observed acceleration time history, which was obtained with a piping system installed in an operating BWR. The results of the earthquake record analysis agreed well with the results of the eigenvalue analysis. (Kako, I.)
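
How an auto-regressive fit can yield a natural frequency and a damping factor is sketched below. The sketch assumes a single vibration mode modeled by an AR(2) process and a noise-free synthetic record; the abstract does not give the model order or the data, so the function names, the sampling interval and the signal parameters are all illustrative assumptions.

```python
import cmath
import math


def fit_ar2(x):
    """Least-squares fit of x[t] = a1 * x[t-1] + a2 * x[t-2]; returns (a1, a2)."""
    idx = range(2, len(x))
    s11 = sum(x[t - 1] * x[t - 1] for t in idx)
    s12 = sum(x[t - 1] * x[t - 2] for t in idx)
    s22 = sum(x[t - 2] * x[t - 2] for t in idx)
    b1 = sum(x[t] * x[t - 1] for t in idx)
    b2 = sum(x[t] * x[t - 2] for t in idx)
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 normal equations
    return (b1 * s22 - b2 * s12) / det, (s11 * b2 - s12 * b1) / det


def modal_parameters(a1, a2, dt):
    """Natural frequency (Hz) and damping ratio from the AR(2) characteristic root."""
    z = (a1 + cmath.sqrt(a1 * a1 + 4.0 * a2)) / 2.0  # root of z^2 - a1*z - a2 = 0
    s = cmath.log(z) / dt                            # equivalent continuous-time pole
    omega = abs(s)                                   # undamped natural frequency, rad/s
    return omega / (2.0 * math.pi), -s.real / omega


# Synthetic free-decay response (assumed values, not plant data): 5 Hz mode, 2 % damping.
dt = 0.01
f_true, zeta_true = 5.0, 0.02
omega_n = 2.0 * math.pi * f_true
omega_d = omega_n * math.sqrt(1.0 - zeta_true ** 2)
signal = [
    math.exp(-zeta_true * omega_n * k * dt) * math.cos(omega_d * k * dt)
    for k in range(200)
]

a1, a2 = fit_ar2(signal)
freq_hz, zeta = modal_parameters(a1, a2, dt)
```

A measured acceleration record would contain noise and several modes, so a higher model order and some form of order selection would normally be needed; the two-coefficient case is used here only because the mapping from coefficients to modal parameters is easy to follow.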

397

Directory of Open Access Journals (Sweden)

Polycyclic aromatic hydrocarbons (PAHs) are ubiquitous contaminants found in the environment. Immunoassays represent useful analytical methods to complement traditional analytical procedures for PAHs. Cross-reactivity (CR) is a very useful measure for evaluating the extent of cross-reaction of a cross-reactant in immunoreactions and immunoassays. The quantitative relationships between the molecular properties and the CR of PAHs were established by stepwise multiple linear regression, principal component regression and partial least squares regression, using the data of two commercial enzyme-linked immunosorbent assay (ELISA) kits. The objective is to find the most important molecular properties that affect the CR, and to predict the CR by multiple regression methods. The results show that the physicochemical, electronic and topological properties of the PAH molecules have an integrated effect on the CR properties for the two ELISAs, among which molar solubility (*S*_{m}) and the valence molecular connectivity index (^{3}χ^{v}) are the most important factors. The regression equations obtained for the Ris^{C} kit are all statistically significant (*p* < 0.005) and show satisfactory ability for predicting CR values, while the equations for the RaPID kit are all non-significant (*p* > 0.05) and not suitable for prediction. This is probably because the Ris^{C} immunoassay employs a monoclonal antibody, while the RaPID kit is based on a polyclonal antibody. Considering the important effect of solubility on the CR values, the cross-reaction potential (CRP) is calculated and used as a complement to CR for the evaluation of cross-reactions in immunoassays. Only compounds with both high CR and high CRP can cause intense cross-reactions in immunoassays.

Yan-Feng Zhang

2012-07-01

398

Death rates from lung cancer in men are higher in Andalusia than in other Spanish regions. This study describes lung cancer mortality rates and their trends in Andalusia from 1975 through 2008. Data on lung cancer mortality were obtained from the Death Registry of Andalusia. For each gender, age group-specific and standardized (overall and truncated) rates were calculated by the direct method using the world standard population. Joinpoint regression analysis was used to identify points where a significant change in trends occurred. In men, short-term trends for age-standardized mortality rates (ASMRs) declined significantly from 2004 through 2008 for each age group < 80 years old. In women, the segmented joinpoint analysis showed a decrease from 1975 through 1998 in ASMRs (overall) (-0.6%, P < 0.05), followed by a marked increase (4.6%, P < 0.05). A decrease in male versus female mortality due to lung cancer is evident in Andalusia (Spain). PMID:21678025

Cayuela, Aurelio; Rodríguez-Domínguez, Susana; Jara-Palomares, Luis; Otero-Candelera, Remedios; López-Campos, Jose Luis; Vigil, Eduardo

2012-09-01
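
The joinpoint idea used in the abstract above, locating the point where a trend changes slope, can be sketched as a grid search over candidate change points. This is a simplified illustration with exactly one joinpoint and a continuous piecewise-linear model; joinpoint software typically also tests how many joinpoints are warranted, which is omitted here. The function names and the synthetic series are hypothetical and are not the Andalusian mortality data.

```python
def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    coef = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (m[i][n] - tail) / m[i][i]
    return coef


def fit_one_joinpoint(x, y, candidates):
    """Fit y = b0 + b1*x + b2*max(0, x - k) for each candidate k; return the best (sse, k, coef)."""
    best = None
    for k in candidates:
        rows = [[1.0, xi, max(0.0, xi - k)] for xi in x]
        xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
        coef = solve(xtx, xty)
        sse = sum(
            (yi - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yi in zip(rows, y)
        )
        if best is None or sse < best[0]:
            best = (sse, k, coef)
    return best


# Synthetic series: slow decline for 12 years, then a marked increase (hypothetical values).
years = [float(t) for t in range(20)]           # years since the start of the series
rates = [10.0 - 0.5 * t + 1.5 * max(0.0, t - 12.0) for t in years]

sse, joinpoint, (b0, b1, b2) = fit_one_joinpoint(years, rates, range(2, 18))
```

Here `b1` is the slope before the joinpoint and `b1 + b2` the slope after it. Fitting the logarithm of the rate instead of the rate itself would give slopes that convert to annual percent changes of the kind quoted in the abstract.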

399

An automated event tree analysis system for estimating the probability of short term volcanic activity is presented. The algorithm is driven by a suite of empirical statistical models that are derived through logistic regression. Each model is constructed from a multidisciplinary dataset that was assembled from a collection of historic volcanic unrest episodes. The dataset consists of monitoring measurements (e.g. InSAR, seismic), source modeling results, and historic eruption activity. This provides a simple mechanism for simultaneously accounting for the geophysical changes occurring within the volcano and the historic behavior of analog volcanoes. The algorithm is extensible and can be easily recalibrated to include new or additional monitoring, modeling, or historic information. Standard cross validation techniques are employed to optimize its forecasting capabilities. Analysis results from several recent volcanic unrest episodes are presented.

Junek, W. N.; Jones, W. L.; Woods, M. T.

2011-12-01

400

Directory of Open Access Journals (Sweden)

In this study, the economic factors affecting the import of forest industry products were investigated. The data used in the time series analysis covered a period of 18 years, from 1985 to 2002. A double-log linear function was used to analyze the import model. The imported forest industry products in Turkey were considered to be a function of domestic production value, domestic prices, national income per capita, lagged import value (t-1), exchange rate (TL/$) and export values. The parameters were estimated using regression analysis. The results indicated that imported forest industry products in Turkey have largely been affected by the national income per capita, domestic prices, export values and exchange rate variables.

Metin Akay

2006-01-01
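
The double-log specification described in the abstract above can be written out as a sketch. The symbols are assumptions introduced here for illustration and do not come from the paper: $M_t$ is the import value, $Q_t$ the domestic production value, $P_t$ domestic prices, $Y_t$ national income per capita, $E_t$ the exchange rate (TL/\$) and $X_t$ the export value.

```latex
\ln M_t = \beta_0 + \beta_1 \ln Q_t + \beta_2 \ln P_t + \beta_3 \ln Y_t
        + \beta_4 \ln M_{t-1} + \beta_5 \ln E_t + \beta_6 \ln X_t + \varepsilon_t
```

Because both sides are in logarithms, each coefficient $\beta_j$ can be read as an elasticity: the approximate percentage change in imports associated with a one percent change in the corresponding variable, holding the others fixed.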

401

Statistical learning method in regression analysis of simulated positron spectral data

International Nuclear Information System (INIS)

Positron lifetime spectroscopy is a non-destructive tool for the detection of radiation-induced defects in nuclear reactor materials. This work concerns the applicability of the support vector machine (SVM) method for input data compression in the neural network analysis of positron lifetime spectra. It has been demonstrated that the SVM technique can be successfully applied to regression analysis of positron spectra. A substantial data compression, to about 50 % and 8 % of the whole training set with two and three spectral components respectively, has been achieved together with a high accuracy of the spectra approximation. However, some parameters in the SVM approach, such as the insensitivity zone ε and the penalty parameter C, have to be chosen carefully to obtain good performance. (author)
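
The role of the two parameters named in the abstract above can be illustrated with the ε-insensitive loss used in standard support vector regression. This is a minimal sketch of the textbook loss, not the authors' implementation; the function names and numbers are hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the tube |y_true - y_pred| <= eps, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def penalized_error(targets, predictions, eps, c):
    """Data-fit term of the SVR objective: C times the summed epsilon-insensitive loss."""
    return c * sum(
        epsilon_insensitive_loss(t, p, eps) for t, p in zip(targets, predictions)
    )


# Hypothetical values: the first point lies inside the tube, the second outside.
inside = epsilon_insensitive_loss(1.0, 1.05, eps=0.1)
outside = epsilon_insensitive_loss(1.0, 1.30, eps=0.1)
total = penalized_error([1.0, 1.0], [1.05, 1.30], eps=0.1, c=10.0)
```

Points that fall inside the tube contribute nothing to the loss, and only the remaining points end up as support vectors. A wider ε therefore tends to leave fewer support vectors, which is presumably the mechanism behind the data compression reported in the abstract, while C sets how heavily deviations outside the tube are penalized.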

402

Modelling and analysis of turbulent datasets using Auto Regressive Moving Average processes

We introduce a novel way to extract information from turbulent datasets by applying an Auto Regressive Moving Average (ARMA) statistical analysis. Such analysis goes well beyond the analysis of the mean flow and of the fluctuations and links the behavior of the recorded time series to a discrete version of a stochastic differential equation which is able to describe the correlation structure in the dataset. We introduce a new index