Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
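The stepwise selection procedure described above can be sketched in a few lines of numpy. This is an illustrative forward selection on synthetic data, not the program's actual FORTRAN IV implementation; the R²-gain stopping rule stands in for the confidence-level test, and all variable names and data are invented for the example:

```python
import numpy as np

def r_squared(X, y):
    # R^2 of an OLS fit with intercept on the given predictor columns.
    Xd = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def forward_select(X, y, min_gain=0.01):
    # Greedily add the predictor that most improves R^2; stop when the
    # improvement falls below min_gain (a stand-in for a significance test).
    chosen, best_r2 = [], 0.0
    while len(chosen) < X.shape[1]:
        rest = [j for j in range(X.shape[1]) if j not in chosen]
        scores = {j: r_squared(X[:, chosen + [j]], y) for j in rest}
        j_best = max(scores, key=scores.get)
        if scores[j_best] - best_r2 < min_gain:
            break
        chosen.append(j_best)
        best_r2 = scores[j_best]
    return chosen, best_r2

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                     # 5 candidate predictors
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.1, size=200)
selected, r2 = forward_select(X, y)               # keeps only the informative columns
```

The final model retains only the statistically useful predictors, mirroring the "minimal set of variables" idea in the abstract.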
MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM
Directory of Open Access Journals (Sweden)
Erika KULCSÁR
2009-12-01
This paper analyses the relationship between GDP in the hotels and restaurants sector, as the dependent variable, and the following independent variables: overnight stays in establishments of touristic reception, arrivals in establishments of touristic reception, and investments in the hotels and restaurants sector, over the analysis period 1995-2007. Using multiple regression analysis, I found that investments and tourist arrivals are significant predictors of the GDP dependent variable. Based on these results, I identified those components of the marketing mix which, in my opinion, require investment and could contribute to the positive development of tourist arrivals in establishments of touristic reception.
Applied multiple regression correlation analysis for the behavioral sciences
Cohen, Patricia; Aiken, Leona S
2014-01-01
This classic text on multiple regression is noted for its nonmathematical, applied, and data-analytic approach. Readers profit from its verbal-conceptual exposition and frequent use of examples. The applied emphasis provides clear illustrations of the principles and worked examples of the types of applications that are possible. Researchers learn how to specify regression models that directly address their research questions. An overview of the fundamental ideas of multiple regression and a review of bivariate correlation and regression and other elementary statistical concepts provide a strong foundation for understanding the rest of the text. The third edition features an increased emphasis on graphics and the use of confidence intervals and effect size measures, and an accompanying CD with data for most of the numerical examples along with the computer code for SPSS, SAS, and SYSTAT. Applied Multiple Regression serves as both a textbook for graduate students and as a reference tool for researchers.
Multiple Regression Analysis Using ANCOVA in University Model
Directory of Open Access Journals (Sweden)
Maneesha
2013-09-01
The government of the UAE is promoting Dubai as an academic hub. Dubai International Academic City (DIAC) is a free zone area with many national and international universities promoting higher education in almost all disciplines. The aspiration of every graduating student is to get a good placement, and in Dubai diverse job opportunities in national and multinational organizations are available. The objective of the paper is to review the placement opportunities in Dubai for universities offering programs in engineering. The paper studies the effect of three independent variables, namely cumulative grade point average (CGPA), engineering discipline, and type of job offered to graduating students, on the dependent variable salary. The engineering disciplines under study are Mechanical, Electronics and Communication, Computer Science, and Electrical and Electronics Engineering. The types of jobs taken into consideration are marketing, technical marketing, design, and logistics. The concepts of analysis of covariance (ANCOVA) and multiple regression are used for the review of placement opportunities vis-à-vis the salary structure.
Regression analysis for multiple-disease group testing data.
Zhang, Boan; Bilder, Christopher R; Tebbs, Joshua M
2013-12-10
Group testing, where individual specimens are composited into groups to test for the presence of a disease (or other binary characteristic), is a procedure commonly used to reduce the costs of screening a large number of individuals. Group testing data are unique in that only group responses may be available, but inferences are needed at the individual level. A further methodological challenge arises when individuals are tested in groups for multiple diseases simultaneously, because unobserved individual disease statuses are likely correlated. In this paper, we propose new regression techniques for multiple-disease group testing data. We develop an expectation-solution based algorithm that provides consistent parameter estimates and natural large-sample inference procedures. We apply our proposed methodology to chlamydia and gonorrhea screening data collected in Nebraska as part of the Infertility Prevention Project and to prenatal infectious disease screening data from Kenya. PMID:23703944
An improved multiple linear regression and data analysis computer program package
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
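As a rough illustration of the statistics such a package tabulates (regression coefficients and their t-statistics), here is a minimal numpy version; the data, seed, and tolerances are invented for the example and are not drawn from NEWRAP:

```python
import numpy as np

def ols_summary(X, y):
    # OLS coefficients plus per-coefficient t-statistics.
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])      # intercept + predictors
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    sigma2 = resid @ resid / (n - p - 1)       # residual variance
    cov = sigma2 * np.linalg.inv(Xd.T @ Xd)    # coefficient covariance
    return beta, beta / np.sqrt(np.diag(cov))

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = 3.0 + 1.0 * X[:, 0] + rng.normal(scale=0.5, size=100)
beta, t = ols_summary(X, y)
# X[:, 1] is pure noise, so its t-statistic should be far smaller
# in magnitude than that of X[:, 0].
```

A full package would add the probability levels, ANOVA table, and residual plots listed in the abstract.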
Modeling the energy content of municipal solid waste using multiple regression analysis
Energy Technology Data Exchange (ETDEWEB)
Liu, J.I. [Kaohsiung Department of Environmental Protection (Taiwan, Province of China); Paode, R.D.; Holsen, T.M. [Illinois Inst. of Technology, Chicago, IL (United States)
1996-07-01
In this research multiple regression analysis was used to develop predictive models of the energy content of municipal solid waste (MSW). The scope of work included collecting waste samples in Kaohsiung City, Taiwan, characterizing the waste, and performing a stepwise forward selection procedure for isolating variables. Two regression models were developed to correlate the energy content with variables derived from physical composition and ultimate analysis. The performance of these models for this particular waste was superior to that of equations developed by other researchers (e.g., Dulong, Steuer) for estimating energy content. Attempts at developing regression models from proximate analysis data were not successful. 6 refs., 8 figs., 2 tabs.
Multiple Regression Analysis of Sib-Pair Data on Reading to Detect Quantitative Trait Loci.
Fulker, D. W.; And Others
1991-01-01
Applies an extension of an earlier multiple regression model for twin analysis to the problem of detecting linkage in a quantitative trait. Detects a number of possible linkages, indicating that the approach is effective. Discusses detecting genotype-environment interaction and the issue of power. (RS)
Modeling of retardance in ferrofluid with Taguchi-based multiple regression analysis
Lin, Jing-Fung; Wu, Jyh-Shyang; Sheu, Jer-Jia
2015-03-01
The citric acid (CA) coated Fe3O4 ferrofluids are prepared by a co-precipitation method and the magneto-optical retardance property is measured by a Stokes polarimeter. Optimization and multiple regression of retardance in ferrofluids are carried out by combining the Taguchi method and Excel. From the nine tests over four parameters, comprising pH of suspension, molar ratio of CA to Fe3O4, volume of CA, and coating temperature, the influence sequence and the excellent program are found. Multiple regression analysis and an F-test on the significance of the regression equation are performed. The model F value is found to be much larger than the critical F value, indicating that the regression is significant. Substituting the excellent program into the equation, a retardance of 32.703° is obtained, higher than the highest value in the tests by 11.4%.
Forecasting SASX-10 Index Using Multiple Regression Based on Principal Component Analysis
Directory of Open Access Journals (Sweden)
Adnan Rovčanin
2015-02-01
In this paper we forecast the SASX-10 Index (Sarajevo Stock Exchange Index 10) using multiple regression based on Principal Component Analysis scores (PCAS). To forecast the stock market index SASX-10, as the dependent variable, we use multiple regression with various macroeconomic indicators as independent variables, in order to investigate which indicators significantly affect the performance of stocks actively traded on the Bosnia and Herzegovina (B&H) financial market. Initially, the sample covered 17 macroeconomic factors as independent variables, from which we retained 9 statistically significant factors (p < 0.05). We then used multiple regression based on PCA scores to establish a meaningful relationship among the explanatory variables identified through the empirical analysis, considering the available research studies. This paper provides an econometric analysis of the valuation of the SASX-10 Index. Principal Component Analysis was used to reduce the large number of explanatory variables and to address the multicollinearity problem among the independent variables. The main objective of this study was to forecast the value of the SASX-10 Index using a multivariate statistical approach, Principal Component Analysis, to classify predictor variables according to their interrelationships. For this purpose, PCA scores of the 9 macroeconomic indicators were used as independent variables in a multiple linear regression model for prediction of the SASX-10 Index. The results show that the empirical characteristics of the SASX-10 Index are determined by the CPI, BIRS Index, SASX-10t-1 Index, CROBX10 Index, ATX Index, FTSE Italian STAR Index, SBITOP Index, KM/HRK and M1. Finally, we created four models with their loss functions, compared the loss functions of all forecasting models, and found that the model Forecast 1 has the minimum loss. 81.10% of the variation in SASX-10 can be explained by the explanatory variables. Accordingly, we forecast the SASX-10 Index closing price for the period 01/12/2014 through 31/12/2014 using the four models. Key words: Forecasting; SASX-10 index; Multiple regression analysis; Principal component analysis
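The PCA-scores-then-regression pipeline described above (principal component regression) can be sketched with numpy. The nine collinear "indicators" below are synthetic stand-ins, not the paper's macroeconomic series:

```python
import numpy as np

def pca_scores(X, k):
    # Scores of the first k principal components of the centered data.
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def pcr_predict(X, y, k):
    # Principal component regression: OLS of y on the first k scores.
    Z = np.column_stack([np.ones(len(X)), pca_scores(X, k)])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return Z @ beta

rng = np.random.default_rng(2)
base = rng.normal(size=(150, 3))                      # 3 latent factors
# Nine highly collinear "indicators" built from the three factors.
X = np.column_stack([base + 0.05 * rng.normal(size=(150, 3)) for _ in range(3)])
y = base @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=150)
yhat = pcr_predict(X, y, k=3)
r2 = 1.0 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```

Regressing on a few component scores rather than the raw collinear columns is exactly how PCR sidesteps the multicollinearity problem the abstract mentions.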
Directory of Open Access Journals (Sweden)
A. Shirvani
2005-10-01
Since the fluctuations of the Persian Gulf Sea Surface Temperature (PGSST) have a significant effect on winter precipitation, water resources, and agricultural production in the south-western parts of Iran, the possibility of predicting the winter SST was evaluated with a multiple regression model. The time series of PGSSTs for all seasons during 1947-1992 were considered as predictors, and the time series of MSSTs during 1948-1993 as the predictand. For the purpose of data reduction and principal component extraction, principal components analysis was applied. Only the scores of the first four PCs (PC1 to PC4), which accounted for the total variance in the predictor field, were considered as the input for the regression analysis. To find the dependency of each principal component on the first time series of the PGSST, Varimax rotation analysis was applied. The results indicated that PC1 to PC4 are indicators of temperature changes during winter, autumn, spring, and summer, respectively. According to the regression model, the components PC1, PC2 and PC4 were significant at the 5% level, but PC3 was insignificant. The results indicated that the significant variables account for 33.5% of the total variance in the winter PGSSTs. It became obvious that for the prediction of the winter PGSST, the PGSST during the winter of the previous year has particular importance. Autumn and summer temperatures also play a role in the prediction of the winter PGSST.
A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis
Taneja, Abhishek
2011-01-01
The growing volume of data creates an interesting challenge and a need for data analysis tools that discover regularities in the data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two techniques is compared on the following parameters: mean square error (MSE), R-square, R-square adjusted, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters have been computed using various data mining tools such as SPSS, XLstat, Stata, and MS-Excel. It is seen that for all the given datasets, factor analysis outperforms multiple linear regression.
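A few of the comparison criteria named above (MSE, RMSE, R², adjusted R²) are straightforward to compute; this sketch uses toy numbers that are purely illustrative:

```python
import numpy as np

def fit_metrics(y, yhat, n_params):
    # MSE, RMSE, R^2 and adjusted R^2 for a fitted model with
    # n_params predictors (condition number etc. omitted).
    n = len(y)
    resid = y - yhat
    mse = (resid ** 2).mean()
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - (resid ** 2).sum() / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)
    return {"MSE": mse, "RMSE": np.sqrt(mse), "R2": r2, "R2_adj": r2_adj}

y = np.array([1.0, 2.0, 3.0, 4.0])
yhat = np.array([1.1, 1.9, 3.2, 3.8])
m = fit_metrics(y, yhat, n_params=1)
```

Adjusted R² penalizes extra predictors, which matters when comparing models with different numbers of retained variables, as this study does.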
International Nuclear Information System (INIS)
In this study, thermodynamic and statistical analyses were performed on a gas turbine system, to assess the impact of some important operating parameters like CIT (Compressor Inlet Temperature), PR (Pressure Ratio) and TIT (Turbine Inlet Temperature) on its performance characteristics such as net power output, energy efficiency, exergy efficiency and fuel consumption. Each performance characteristic was enunciated as a function of operating parameters, followed by a parametric study and optimization. The results showed that the performance characteristics increase with an increase in the TIT and a decrease in the CIT, except fuel consumption which behaves oppositely. The net power output and efficiencies increase with the PR up to certain initial values and then start to decrease, whereas the fuel consumption always decreases with an increase in the PR. The results of exergy analysis showed the combustion chamber as a major contributor to the exergy destruction, followed by stack gas. Subsequently, multiple regression models were developed to correlate each of the response variables (performance characteristic) with the predictor variables (operating parameters). The regression model equations showed a significant statistical relationship between the predictor and response variables. (author)
Lorenzetti, G.; Foresta, A.; Palleschi, V.; Legnaioli, S.
2009-09-01
The recent development of mobile instrumentation, specifically devoted to in situ analysis and study of museum objects, allows the acquisition of many LIBS spectra in a very short time. However, such a large amount of data calls for new analytical approaches which guarantee a prompt analysis of the results obtained. In this communication, we will present and discuss the advantages of statistical analytical methods, such as Partial Least Squares multiple regression algorithms, versus the classical calibration curve approach. PLS algorithms allow one to obtain in real time the information on the composition of the objects under study; this feature of the method, compared to the traditional off-line analysis of the data, is extremely useful for optimizing the measurement times and the number of points associated with the analysis. In fact, the real-time availability of the compositional information makes it possible to concentrate attention on the most `interesting' parts of the object, without over-sampling zones which would not provide useful information for scholars or conservators. Some examples of the applications of this method will be presented, including the studies recently performed by the researchers of the Applied Laser Spectroscopy Laboratory on museum bronze objects.
Roundy, Paul E.; Frank, William M.
2004-12-01
Multiple linear regression models with nonlinear power terms may be applied to find relationships between interacting wave modes that may be characterized by different frequencies. Such regression techniques have been explored in other disciplines, but they have not been used in the analysis of atmospheric circulations. In this study, such a model is developed to predict anomalies of westward-moving intraseasonal precipitable water by utilizing the first through fourth powers of a time series of outgoing longwave radiation that is filtered for eastward propagation and for the temporal and spatial scales of the tropical intraseasonal oscillations. An independent and simpler compositing method is applied to show that the results of this multiple linear regression model provide a better description of the actual relationships between eastward- and westward-moving intraseasonal modes than a regression model that includes only the linear predictor. A statistical significance test is applied to the coefficients of the multiple linear regression model, and they are found to be significant over broad regions of the Tropics. Correlations between the predictors are shown to not significantly influence results for this case. Results show that this regression model reveals physical relationships between eastward- and westward-moving intraseasonal modes. The physical interpretation of these regression relationships is given in a companion paper.
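The design described above, regressing on the first through fourth powers of a single predictor series, has this general shape; the synthetic series below stands in for the filtered OLR time series, and the coefficients are invented for the example:

```python
import numpy as np

def power_regression(x, y, max_power=4):
    # Regress y on x, x^2, x^3, x^4 plus an intercept.
    X = np.column_stack([x ** p for p in range(1, max_power + 1)])
    Xd = np.column_stack([np.ones(len(x)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return beta

rng = np.random.default_rng(3)
x = rng.uniform(-1.0, 1.0, size=300)          # stand-in predictor series
y = 0.5 * x + 2.0 * x ** 2 + rng.normal(scale=0.05, size=300)
beta = power_regression(x, y)
# beta[1] and beta[2] should recover the linear and quadratic terms.
```

The higher-power columns are correlated with the linear one, which is why the abstract checks that predictor correlations do not distort the results.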
International Nuclear Information System (INIS)
Polycyclic aromatic hydrocarbons (PAHs) are contaminants that reside mainly in surface soils. Dietary intake of plant-based foods can make a major contribution to total PAH exposure. Little information is available on the relationship between root morphology and plant uptake of PAHs. An understanding of the plant root morphologic and compositional factors that affect root uptake of contaminants is important and can inform both agricultural (chemical contamination of crops) and engineering (phytoremediation) applications. Five crop plant species are grown hydroponically in solutions containing the PAH phenanthrene. Measurements are taken for 1) phenanthrene uptake, 2) root morphology – specific surface area, volume, surface area, tip number and total root length and 3) root tissue composition – water, lipid, protein and carbohydrate content. These factors are compared through Pearson's correlation and multiple linear regression analysis. The major factors which promote phenanthrene uptake are specific surface area and lipid content. -- Highlights: •There is no correlation between phenanthrene uptake and total root length or water content. •Specific surface area and lipid content are the most crucial factors for phenanthrene uptake. •The contribution of specific surface area is greater than that of lipid.
Multiple Logistic Regression Analysis of Cigarette Use among High School Students
Adwere-Boamah, Joseph
2011-01-01
A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…
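A binary logistic regression of the kind used in the study can be sketched with plain gradient ascent on the log-likelihood; the predictors here are synthetic stand-ins, not the YRBS survey variables:

```python
import numpy as np

def logistic_fit(X, y, steps=2000, lr=0.5):
    # Binary logistic regression fitted by gradient ascent
    # on the average log-likelihood.
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        beta += lr * Xd.T @ (y - p) / len(y)
    return beta

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 2))
true = np.array([-0.5, 2.0, 0.0])               # intercept, strong, null
p = 1.0 / (1.0 + np.exp(-(true[0] + X @ true[1:])))
y = (rng.uniform(size=500) < p).astype(float)
beta = logistic_fit(X, y)
# beta[1] should recover a clearly positive effect; beta[2] near zero.
```

In practice one would use a library fitter (Newton-Raphson / IRLS) and report odds ratios, but the model being estimated is the same.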
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: ""This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable."" -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
Directory of Open Access Journals (Sweden)
S. Saaidpour
2012-03-01
Multiple linear regression (MLR) was used to build a linear quantitative structure-property relationship (QSPR) model for the prediction of the molar diamagnetic susceptibility (χm) for 140 diverse organic compounds, using three significant descriptors calculated from the molecular structures alone and selected by the stepwise regression method. Stepwise regression was employed to develop a regression equation based on 100 training compounds, and predictive ability was tested on 40 compounds reserved for that purpose. The stability of the proposed model was validated using leave-one-out cross-validation and a randomization test. Application of the developed model to a testing set of 40 organic compounds demonstrates that the new model is reliable, with good predictive accuracy and simple formulation. By applying the MLR method we can predict the test set (40 compounds) with a Q2ext of 0.9894 and an average root mean square error (RMSE) of 2.2550. The model applicability domain was verified by the leverage approach in order to propose reliable predicted data. The prediction results are in good agreement with the experimental values.
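The leave-one-out validation statistic (Q²) used above can be computed as follows; the model and data here are generic OLS stand-ins, not the paper's molecular descriptors:

```python
import numpy as np

def loo_q2(X, y):
    # Leave-one-out cross-validated Q^2 for an OLS model with intercept.
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        Xd = np.column_stack([np.ones(n - 1), X[keep]])
        beta, *_ = np.linalg.lstsq(Xd, y[keep], rcond=None)
        pred = np.concatenate([[1.0], X[i]]) @ beta
        press += (y[i] - pred) ** 2                 # prediction error sum
    return 1.0 - press / ((y - y.mean()) ** 2).sum()

rng = np.random.default_rng(5)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=60)
q2 = loo_q2(X, y)
```

Q² close to 1 means each held-out compound is predicted well by a model fitted without it, which is a stronger check than the in-sample R².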
Nishida, Keiichiro; Honda, Mitsugi; Hashizume, Hiroyuki; Arita, Seizaburo; Watanabe, Masutaka; Ozaki, Toshifumi
2013-01-01
The purpose of this study was to quantitatively evaluate Akahori's preoperative classification of cubital tunnel syndrome. We analyzed the results for 57 elbows that were treated by a simple decompression procedure from 1997 to 2004. The relationship between each item of Akahori's preoperative classification and clinical stage was investigated based on the parameter distribution. We evaluated Akahori's classification system using multiple regression analysis, and investigated the association ...
Multiple Regressions in Analysing House Price Variations
Directory of Open Access Journals (Sweden)
Aminah Md Yusof
2012-03-01
The application of rigorous statistical analysis to aid investment decision making is gaining momentum in the United States of America as well as the United Kingdom. Nonetheless, in Malaysia the response from local academicians has been rather slow, and even slower as far as practitioners are concerned. This paper illustrates how Multiple Regression Analysis (MRA) and its extension, hedonic regression analysis, have been used to explain price variation for selected houses in Malaysia. Each attribute theoretically identified as a price determinant is priced, and the perceived contribution of each is explicitly shown. The paper demonstrates how statistical analysis is capable of analyzing property investment by considering multiple determinants. The consideration of various characteristics, being more rigorous, enables better investment decision making.
Multiple regression analysis of factors that may influence middle school science scores
Glover, Judith
The purpose of this quantitative multiple regression study was to determine whether a relationship existed between Maryland State Assessment (MSA) reading scores, MSA math scores, gender, ethnicity, age, and MSA science scores. Also examined was if MSA reading scores, MSA math scores, gender, ethnicity, and age can be used in combination or alone to predict a passing score on the MSA science test and which variable, if any, had the most influence on science MSA scores. Both math and reading MSA scores were positively correlated with science MSA scores. Ethnicity was correlated with science MSA scores, but may have been confounded by socio-economic status. Age and gender were not correlated with science MSA scores. When the variables were combined, results showed that math MSA scores followed by reading MSA scores had the most predictive influence upon science MSA scores. Ethnicity, gender, and age had the least predictive influence. The findings of this study may serve as a catalyst for improving student achievement in science through changes in instructional methodology and curriculum design thereby increasing the number of students pursuing science careers.
Application of multiple regression analysis to forecasting South Africa's electricity demand
Scientific Electronic Library Online (English)
Renee, Koen; Jennifer, Holloway.
2014-11-01
In a developing country such as South Africa, understanding the expected future demand for electricity is very important in various planning contexts. It is specifically important to understand how expected scenarios regarding population or economic growth can be translated into corresponding future [...] electricity usage patterns. This paper discusses a methodology for forecasting long-term electricity demand that was specifically developed for applying to such scenarios. The methodology uses a series of multiple regression models to quantify historical patterns of electricity usage per sector in relation to patterns observed in certain economic and demographic variables, and uses these relationships to derive expected future electricity usage patterns. The methodology has been used successfully to derive forecasts used for strategic planning within a private company as well as to provide forecasts to aid planning in the public sector. This paper discusses the development of the modelling methodology, provides details regarding the extensive data collection and validation processes followed during the model development, and reports on the relevant model fit statistics. The paper also shows that the forecasting methodology has to some extent been able to match the actual patterns, and therefore concludes that the methodology can be used to support planning by translating changes relating to economic and demographic growth, for a range of scenarios, into a corresponding electricity demand. The methodology therefore fills a particular gap within the South African long-term electricity forecasting domain.
Practical Session: Multiple Linear Regression
Clausel, M.; Grégoire, G.
2014-01-01
Three exercises are proposed to illustrate simple linear regression. The first investigates the influence of several factors on atmospheric pollution; it was proposed by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data from 20 U.S. cities. Exercise 2 is an introduction to model selection, whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 were proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).
Keith, Timothy Z
2014-01-01
Multiple Regression and Beyond offers a conceptually oriented introduction to multiple regression (MR) analysis and structural equation modeling (SEM), along with analyses that flow naturally from those methods. By focusing on the concepts and purposes of MR and related methods, rather than the derivation and calculation of formulae, this book introduces material to students more clearly, and in a less threatening way. In addition to illuminating content necessary for coursework, the accessibility of this approach means students are more likely to be able to conduct research using MR or SEM--a
Keat, Sim Chong; Chun, Beh Boon; San, Lim Hwee; Jafri, Mohd Zubir Mat
2015-04-01
Climate change due to carbon dioxide (CO2) emissions is one of the most complex challenges threatening our planet. The issue is considered a matter of great international concern, primarily attributed to the use of different fossil fuels. In this paper, a regression model is used to analyze the causal relationship between CO2 emissions and energy consumption in Malaysia, using time series data for the period 1980-2010. The equations were developed using a regression model based on the eight major sources that contribute to CO2 emissions: non-energy use, Liquefied Petroleum Gas (LPG), diesel, kerosene, refinery gas, Aviation Turbine Fuel (ATF) and Aviation Gasoline (AV Gas), fuel oil, and motor petrol. Part of the data was used to fit the regression model (1980-2000) and part to validate it (2001-2010). Comparison of the model predictions with the measured data showed a high correlation coefficient (R2 = 0.9544), indicating the model's accuracy and efficiency. These results can be used in early warning systems that help the population comply with air quality standards.
REVAAM Model to determine a company's value by multiple valuation and linear regression analysis
Luis G. Acosta-Calzado; Humberto Murrieta-Romo; Carlos Acosta-Calzado
2010-01-01
This paper shows an alternative model to the widely used method of multiple valuation (or relative valuation) in order to calculate the value of a company by using either the Price Earnings (PE) and/or the Enterprise Value to Earnings Before Interest, Taxes, Depreciation and Amortization (EV/EBITDA). When calculating multiples, analysts tend to consider average multiples within an industry and apply them directly to the target company; however, we believe that ...
Barrett, C. A.
1985-01-01
Multiple linear regression analysis was used to determine an equation for estimating hot corrosion attack for a series of Ni-base cast turbine alloys. The U transform, i.e., sin⁻¹((%A/100)^(1/2)), was shown to give the best estimate of the dependent variable, y. A complete second-degree equation is described for the "centered" weight chemistries of the elements Cr, Al, Ti, Mo, W, Cb, Ta, and Co. In addition, linear terms for the minor elements C, B, and Zr were added, giving a basic 47-term equation. The best reduced equation was determined by the stepwise selection method, with essentially 13 terms. The Cr term was found to be the most important, accounting for 60 percent of the explained variability in hot corrosion attack.
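Assuming the U transform is the standard arcsine square-root form for percentage data, sin⁻¹((%A/100)^(1/2)) (an interpretation of the abstract's notation, not confirmed by it), it can be written as:

```python
import numpy as np

def u_transform(pct_attack):
    # Arcsine square-root transform of percent attack %A, assumed here
    # to be the "U transform" the abstract refers to.
    return np.arcsin(np.sqrt(np.asarray(pct_attack, dtype=float) / 100.0))

# 0% attack maps to 0 and 100% maps to pi/2; the transform stabilizes
# the variance of percentage data before regression.
```

Variance-stabilizing the percentage response in this way is a common prerequisite for least-squares fitting of attack data.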
Seber, George A F
2012-01-01
Concise, mathematically clear, and comprehensive treatment of the subject. * Expanded coverage of diagnostics and methods of model fitting. * Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models. * More than 200 problems throughout the book plus outline solutions for the exercises. * This revision has been extensively class-tested.
Directory of Open Access Journals (Sweden)
M. Cholewa
2011-07-01
In this article the authors show the influence of technological parameters and modification treatment on the structural properties of closed skeleton castings, aiming for maximal refinement of structure and minimal structural diversification. Skeleton castings were manufactured in accordance with the elaborated production technology, under variable technological conditions: pouring temperature in the range 953 ÷ 1013 K, mould temperature 293 ÷ 373 K, and height of the gating system above the casting level 105 ÷ 175 mm. Analysis of metallographic specimens, quantitative analysis of silicon crystals, and secondary dendrite arm spacing analysis of the α solution were performed. Average values of stereological parameters for all castings were determined, and the (B/L) and (P/A) factors were calculated. On the basis of the microstructural analysis the authors compared the samples, selecting for the least diversification of the degree of structure refinement and the smallest silicon crystals. The authors state that sample 5 (AlSi11, Tpour 1013 K, Tmould 333 K, h – 265 mm) has the best structural properties (least diversification of the degree of structure refinement and the finest silicon crystals). Statistical analysis of the structural results was then performed; on its basis the authors state that the best structural properties correspond to the technological parameters Tpour = 1013 K, Tmould = 373 K and h = 230 mm [4]. The results of the statistical analysis are a prerequisite for optimization studies.
Giovanis, Eleftherios
2009-01-01
This paper examines the factors that contribute in the most explanatory and efficient way to health expenditures in Greece. Two methods are applied: multiple regressions and vector error correction models are estimated, and unit root tests are applied to determine the order at which the variables are stationary. Because the available data are yearly and cover only the short period 1985-2006, the sample is small, so a bootstrap simulation is applied to improve the estimations.
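The bootstrap idea used above for small samples can be sketched with ordinary least squares: resample the observation pairs with replacement and re-estimate the coefficient each time. This is a minimal illustration with synthetic data, not the paper's health-expenditure variables.

```python
# Bootstrap percentile interval for an OLS slope on a small yearly sample.
# All data are synthetic; the sample size (22) mimics a 1985-2006 span.
import random

random.seed(0)

x = list(range(22))
y = [2.0 * xi + 1.0 + random.gauss(0, 3.0) for xi in x]  # true slope = 2

def ols_slope(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return sxy / sxx

# Resample (x, y) pairs with replacement and re-estimate the slope.
slopes = []
for _ in range(1000):
    idx = [random.randrange(len(x)) for _ in x]
    slopes.append(ols_slope([x[i] for i in idx], [y[i] for i in idx]))

slopes.sort()
ci = (slopes[25], slopes[975])  # 95% percentile interval for the slope
```

The spread of the resampled slopes stands in for the sampling distribution that the small sample cannot reveal directly.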
Computing multiple-output regression quantile regions.
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2012-01-01
Roč. 56, č. 4 (2012), s. 840-853. ISSN 0167-9473 R&D Projects: GA MŠk(CZ) 1M06047 Institutional research plan: CEZ:AV0Z10750506 Keywords : halfspace depth * multiple-output regression * parametric linear programming * quantile regression Subject RIV: BA - General Mathematics Impact factor: 1.304, year: 2012 http://library.utia.cas.cz/separaty/2012/SI/siman-0376413.pdf
DART: Dropouts meet Multiple Additive Regression Trees
Rashmi, K. V.; Gilad-Bachrach, Ran
2015-01-01
Multiple Additive Regression Trees (MART), an ensemble model of boosted regression trees, is known to deliver high prediction accuracy for diverse tasks, and it is widely used in practice. However, it suffers from an issue which we call over-specialization, wherein trees added at later iterations tend to impact the prediction of only a few instances and make a negligible contribution towards the remaining instances. This negatively affects the performance of the model on unseen da...
Kokaly, R.F.; Clark, R.N.
1999-01-01
We develop a new method for estimating the biochemistry of plant material using spectroscopy. Normalized band depths calculated from the continuum-removed reflectance spectra of dried and ground leaves were used to estimate their concentrations of nitrogen, lignin, and cellulose. Stepwise multiple linear regression was used to select wavelengths in the broad absorption features centered at 1.73 μm, 2.10 μm, and 2.30 μm that were highly correlated with the chemistry of samples from eastern U.S. forests. Band depths of absorption features at these wavelengths were found to also be highly correlated with the chemistry of four other sites. A subset of data from the eastern U.S. forest sites was used to derive linear equations that were applied to the remaining data to successfully estimate their nitrogen, lignin, and cellulose concentrations. Correlations were highest for nitrogen (R² from 0.75 to 0.94). The consistent results indicate the possibility of establishing a single equation capable of estimating the chemical concentrations in a wide variety of species from the reflectance spectra of dried leaves. The extension of this method to remote sensing was investigated. The effects of leaf water content, sensor signal-to-noise and bandpass, atmospheric effects, and background soil exposure were examined. Leaf water was found to be the greatest challenge to extending this empirical method to the analysis of fresh whole leaves and complete vegetation canopies. The influence of leaf water on reflectance spectra must be removed to within 10%. Other effects were reduced by continuum removal and normalization of band depths.
If the effects of leaf water can be compensated for, it might be possible to extend this method to remote sensing data acquired by imaging spectrometers to give estimates of nitrogen, lignin, and cellulose concentrations over large areas for use in ecosystem studies.
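The stepwise selection used above can be sketched as forward selection by R² gain: repeatedly add the predictor that most improves the fit, stopping when the gain is negligible. The data below are synthetic stand-ins for the band depths, not the spectra from the study.

```python
# Forward stepwise selection of predictors by R^2 gain (minimal sketch).
import numpy as np

rng = np.random.default_rng(1)
n, p = 60, 6
X = rng.normal(size=(n, p))
# Only columns 0 and 3 actually drive the response in this toy setup.
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=n)

def r2(cols):
    # R^2 of an OLS fit with intercept plus the chosen columns.
    A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

selected, best = [], 0.0
while len(selected) < p:
    gains = {c: r2(selected + [c]) for c in range(p) if c not in selected}
    c, score = max(gains.items(), key=lambda kv: kv[1])
    if score - best < 0.01:      # stop when the R^2 gain is negligible
        break
    selected.append(c)
    best = score
```

A production stepwise procedure would use an F-test or cross-validation rather than a fixed R² threshold, but the control flow is the same.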
On directional multiple-output quantile regression.
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2011-01-01
Roč. 102, č. 2 (2011), s. 193-212. ISSN 0047-259X R&D Projects: GA MŠk(CZ) 1M06047 Grant ostatní: Commission EC(BE) Fonds National de la Recherche Scientifique Institutional research plan: CEZ:AV0Z10750506 Keywords : multivariate quantile * quantile regression * multiple-output regression * halfspace depth * portfolio optimization * value-at-risk Subject RIV: BA - General Mathematics Impact factor: 0.879, year: 2011 http://library.utia.cas.cz/separaty/2011/SI/siman-0364128.pdf
Directory of Open Access Journals (Sweden)
Kalajdžija Nataša D.
2013-01-01
Full Text Available In this study we investigated the relationship between the antifungal activity of some benzimidazole derivatives and selected absorption, distribution, metabolism and excretion (ADME) parameters. The antifungal activity of the studied compounds against Saccharomyces cerevisiae was expressed as the minimal inhibitory concentration (MIC). A statistically significant quantitative structure-activity relationship (QSAR) model for predicting the antifungal activity of the investigated benzimidazole derivatives against Saccharomyces cerevisiae was obtained by multiple linear regression (MLR) using ADME parameters. The quality of the MLR model was validated by the leave-one-out (LOO) technique, as well as by calculation of the statistical parameters for the developed model, and the results are discussed on the basis of the statistical data. [Projekat Ministarstva nauke Republike Srbije, br. 172012 i br. 172014]
Tarpey, Thaddeus; Petkova, Eva
2010-07-01
Finite mixture models have come to play a very prominent role in modelling data. The finite mixture model is predicated on the assumption that distinct latent groups exist in the population. The finite mixture model therefore is based on a categorical latent variable that distinguishes the different groups. Often in practice distinct sub-populations do not actually exist. For example, disease severity (e.g. depression) may vary continuously and therefore, a distinction of diseased and not-diseased may not be based on the existence of distinct sub-populations. Thus, what is needed is a generalization of the finite mixture's discrete latent predictor to a continuous latent predictor. We cast the finite mixture model as a regression model with a latent Bernoulli predictor. A latent regression model is proposed by replacing the discrete Bernoulli predictor by a continuous latent predictor with a beta distribution. Motivation for the latent regression model arises from applications where distinct latent classes do not exist, but instead individuals vary according to a continuous latent variable. The shapes of the beta density are very flexible and can approximate the discrete Bernoulli distribution. Examples and a simulation are provided to illustrate the latent regression model. In particular, the latent regression model is used to model placebo effect among drug treated subjects in a depression study. PMID:20625443
A general framework for multiple linear regression
Blanco, Víctor; Puerto, Justo; Salmerón, Román
2015-01-01
This paper presents a family of new methods for estimating the coefficients in multiple linear regression models. The novelty consists in considering distance-based residuals instead of the usual vertical distances and in the use of different forms of aggregation criteria for those residuals. The most popular methods found in the specialized literature can be cast within this family as particular choices of the residuals and the aggregation criteria. Mathematical programming ...
A Dirty Model for Multiple Sparse Regression
Jalali, Ali; Sanghavi, Sujay
2011-01-01
Sparse linear regression -- finding an unknown vector from linear measurements -- is now known to be possible with fewer samples than variables, via methods like the LASSO. We consider the multiple sparse linear regression problem, where several related vectors -- with partially shared support sets -- have to be recovered. A natural question in this setting is whether one can use the sharing to further decrease the overall number of samples required. A line of recent research has studied the use of \\ell_1/\\ell_q norm block-regularizations with q>1 for such problems; however these could actually perform worse in sample complexity -- vis a vis solving each problem separately ignoring sharing -- depending on the level of sharing. We present a new method for multiple sparse linear regression that can leverage support and parameter overlap when it exists, but not pay a penalty when it does not. A very simple idea: we decompose the parameters into two components and regularize these differently. We show both theore...
Directory of Open Access Journals (Sweden)
S. CONDON
2014-06-01
Full Text Available The thermal inactivation of Enterococcus faecium under isothermal conditions in tryptic soy broth of different pH (4.0, 5.5 and 7.4) was studied. The bacterial cells were more sensitive at higher temperature and in media of low pH. Decimal reduction times at 71ºC were 2.56, 0.39 and 0.03 min at pH 7.4, 5.5 and 4.0, respectively. At all temperatures and pH values assayed, the survival curves obtained were linear, and a mathematical model based on first-order kinetics accurately described them. The relationship between DT values and temperature was also linear, and a mean z-value of 5ºC was established. A multiple linear regression model using four predictor variables (pH, T, pH² and T²) related the log of the DT value to pH and treatment temperature. The developed tertiary model satisfactorily predicted the heat inactivation of Enterococcus faecium under the treatment conditions investigated.
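A model with the predictor set (pH, T, pH², T²) is an ordinary least-squares fit on a quadratic design matrix. The sketch below uses invented coefficients and synthetic data, not the paper's values, purely to show the shape of such a fit.

```python
# Quadratic response-surface fit: log10(D) ~ b0 + b1*pH + b2*T + b3*pH^2 + b4*T^2.
# Data and coefficients are synthetic, for illustration only.
import numpy as np

rng = np.random.default_rng(7)
pH = rng.uniform(4.0, 7.4, size=40)
T = rng.uniform(60.0, 75.0, size=40)
logD = 8.0 + 0.9 * pH - 0.35 * T + 0.001 * T**2 + rng.normal(scale=0.05, size=40)

# Design matrix with linear and squared predictor terms.
A = np.column_stack([np.ones_like(pH), pH, T, pH**2, T**2])
beta, *_ = np.linalg.lstsq(A, logD, rcond=None)

def predict(ph, t):
    # Predicted log10 decimal-reduction time at the given pH and temperature.
    return beta @ np.array([1.0, ph, t, ph**2, t**2])
```

In practice pH and pH² (and T and T²) are highly collinear, so centering the predictors before squaring is a common refinement.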
Multiple linear regression for isotopic measurements
Garcia Alonso, J. I.
2012-04-01
There are two typical applications of isotopic measurements: the detection of natural variations in isotopic systems and the detection of man-made variations using enriched isotopes as indicators. For both types of measurement, accurate and precise isotope ratio measurements are required. For the so-called non-traditional stable isotopes, multicollector ICP-MS instruments are usually applied. In many cases, chemical separation procedures are required before accurate isotope measurements can be performed. The off-line separation of Rb and Sr or Nd and Sm is the classical procedure employed to eliminate isobaric interferences before multicollector ICP-MS measurement of Sr and Nd isotope ratios; this procedure also allows matrix separation, so that precise and accurate Sr and Nd isotope ratios can be obtained. In our laboratory we have evaluated the separation of Rb-Sr and Nd-Sm isobars by liquid chromatography with on-line multicollector ICP-MS detection. The combination of this chromatographic procedure with multiple linear regression of the raw chromatographic data resulted in Sr and Nd isotope ratios with precisions and accuracies typical of off-line sample preparation procedures. On the other hand, methods for labelling individual organisms (such as a given plant, fish or animal) are required for population studies. We have developed a dual isotope labelling procedure which can be unique to a given individual, can be inherited in living organisms and is stable. The detection of the isotopic signature is also based on multiple linear regression. The labelling of fish and its detection in otoliths by Laser Ablation ICP-MS will be discussed using trout and salmon as examples. In conclusion, isotope measurement procedures based on multiple linear regression can be a viable alternative in multicollector ICP-MS measurements.
Directory of Open Access Journals (Sweden)
Hukharnsusatrue, A.
2005-11-01
Full Text Available The objective of this research is to compare methods for estimating multiple regression coefficients in the presence of multicollinearity among the independent variables. The estimation methods are the Ordinary Least Squares method (OLS), the Restricted Least Squares method (RLS), the Restricted Ridge Regression method (RRR) and the Restricted Liu method (RL), both when the restrictions are true and when they are not. The study used the Monte Carlo simulation method; the experiment was repeated 1,000 times under each situation. The analyzed results are as follows. CASE 1: The restrictions are true. In all cases, the RRR and RL methods have a smaller Average Mean Square Error (AMSE) than the OLS and RLS methods, respectively. The RRR method provides the smallest AMSE when the level of correlation is high, and also provides the smallest AMSE for all levels of correlation and all sample sizes when the standard deviation is equal to 5. However, the RL method provides the smallest AMSE when the level of correlation is low or middle, except that for a standard deviation equal to 3 and small sample sizes the RRR method provides the smallest AMSE. The AMSE varies, most to least respectively, with the level of correlation, the standard deviation and the number of independent variables, but inversely with the sample size. CASE 2: The restrictions are not true. In all cases, the RRR method provides the smallest AMSE, except that for a standard deviation equal to 1 and an error of restrictions equal to 5%, the OLS method provides the smallest AMSE when the level of correlation is low or medium and the sample size is large, while for small sample sizes the RL method provides the smallest AMSE. In addition, when the error of restrictions is increased, the OLS method provides the smallest AMSE for all levels of correlation and all sample sizes, except when the level of correlation is high and the sample size is small.
Moreover, in the cases where the OLS method provides the smallest AMSE, the RLS method mostly has a smaller AMSE than the RRR and RL methods when the level of correlation is low or medium and the sample size is large. The AMSE varies, most to least respectively, with the error of restrictions, the level of correlation, the standard deviation and the number of independent variables, but inversely with the sample size, except that the error of restrictions does not affect the AMSE of the OLS method.
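The core of such a Monte Carlo comparison, stripped of the restrictions, is repeatedly simulating collinear data and averaging the squared coefficient error for each estimator. This sketch compares plain OLS against ordinary ridge on a much smaller scale than the study; the design and numbers are invented.

```python
# Monte Carlo comparison of AMSE for OLS vs ridge under collinear predictors.
import numpy as np

rng = np.random.default_rng(3)
beta_true = np.array([1.0, 2.0, -1.0])
n, reps, lam = 30, 200, 1.0
amse_ols = amse_ridge = 0.0

for _ in range(reps):
    z = rng.normal(size=n)
    # Highly correlated predictors: shared component plus small noise.
    X = np.column_stack([z + 0.1 * rng.normal(size=n) for _ in range(3)])
    y = X @ beta_true + rng.normal(size=n)
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    b_rdg = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
    amse_ols += np.sum((b_ols - beta_true) ** 2) / reps
    amse_ridge += np.sum((b_rdg - beta_true) ** 2) / reps
```

With near-collinear columns the OLS coefficient variance explodes along the poorly identified directions, which is exactly the effect that makes ridge-type estimators win in the simulations above.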
Lochner, Christine; Seedat, Soraya; Hemmings, Sian M J; Moolman-Smook, Johanna C; Kidd, Martin; Stein, Dan J
2007-01-01
Dissociation is defined as the disruption of the usually integrated functions of consciousness, such as memory, identity, and perceptions of the environment. Causes include various psychological, neurological and neurobiological mechanisms, none of which have been consistently supported. To our knowledge, the role of gene-environment interactions in dissociative experiences in obsessive-compulsive disorder (OCD) has not previously been investigated. Eighty-three Caucasian patients (29 male, 54 female) with a principal diagnosis of OCD were included. The Dissociative Experiences Scale was used to assess dissociation. The role of childhood trauma (assessed with the Childhood Trauma Questionnaire), and a functional 44-bp insertion/deletion polymorphism in the promoter region of the serotonin transporter, or 5-HTT, in mediating dissociation, was investigated using multiple regression analysis and path analysis using the partial least squares model. Both analyses indicated that an interaction between physical neglect and the S/S genotype of the 5-HTT gene significantly predicted dissociation in patients with OCD. Dissociation may be a predictor of poorer treatment outcome in patients with OCD; therefore, a better understanding of the mechanisms that underlie this phenomenon may be useful. Here, two different but related statistical techniques (multiple regression and partial least squares), confirmed that physical neglect and the 5-HTT genotype jointly play a role in predicting dissociation in OCD. PMID:17943026
Yano, Kentaro; Mita, Suzune; Morimoto, Kaori; Haraguchi, Tamami; Arakawa, Hiroshi; Yoshida, Miyako; Yamashita, Fumiyoshi; Uchida, Takahiro; Ogihara, Takuo
2015-09-01
P-glycoprotein (P-gp) regulates the absorption of many drugs in the gastrointestinal tract and their accumulation in tumor tissues, but the basis of substrate recognition by P-gp remains unclear. Bitter-tasting phenylthiocarbamide, which stimulates taste receptor 2 member 38 (T2R38), increases P-gp activity and is a substrate of P-gp. This led us to hypothesize that bitterness intensity might be a predictor of P-gp-inhibitor/substrate status. Here, we measured the bitterness intensity of a panel of P-gp substrates and nonsubstrates with various taste sensors, and used multiple linear regression analysis to examine the relationship between P-gp-inhibitor/substrate status and various physical properties, including the intensity of bitter taste measured with the taste sensor. We calculated the first principal component score (PC1) as the representative value of bitterness, as all the taste sensors' outputs were significantly correlated. The P-gp substrates showed remarkably greater mean bitterness intensity than the non-P-gp substrates. We found that the Km values of P-gp substrates were correlated with molecular weight, log P, and the PC1 value, and that the coefficient of determination (R²) of the linear regression equation was 0.63. This relationship might be useful as an aid to predicting P-gp substrate status at an early stage of drug discovery. © 2014 Wiley Periodicals, Inc. and the American Pharmacists Association J Pharm Sci 104:2789-2794, 2015. PMID:25545612
Directory of Open Access Journals (Sweden)
St Leger Antony S
2005-02-01
Full Text Available Abstract Background There is a small but growing body of literature highlighting inequities in GP practice prescribing rates for many drug therapies. The aim of this paper is to further explore the equity of prescribing for five major CHD drug groups and to determine how much of the variation in GP practice prescribing rates can be explained by a range of healthcare needs indicators (HCNIs). Methods The study involved a cross-sectional secondary analysis in four primary care trusts (PCTs 1–4) in the North West of England, including 132 GP practices. Prescribing rates (average daily quantities per registered patient aged over 35 years) and HCNIs were developed for all GP practices. Analysis was undertaken using multiple linear regression. Results Between 22–25% of the variation in prescribing rates for statins, beta-blockers and bendrofluazide was explained in the multiple regression models. Slightly more variation was explained for ACE inhibitors (31.6%) and considerably more for aspirin (51.2%). Prescribing rates were positively associated with CHD hospital diagnoses and procedures for all drug groups other than ACE inhibitors. The proportion of patients aged 55–74 years was positively related to all prescribing rates other than aspirin, which was instead positively related to the proportion of patients aged >75 years. However, prescribing rates for statins and ACE inhibitors were negatively associated with the proportion of patients aged >75 years, in addition to the proportion of patients from minority ethnic groups. Prescribing rates for aspirin, bendrofluazide and all CHD drugs combined were negatively associated with deprivation. Conclusion Although around 25–50% of the variation in prescribing rates was explained by HCNIs, this varied markedly between PCTs and drug groups.
Prescribing rates were generally characterised by both positive and negative associations with HCNIs, suggesting possible inequities in prescribing rates on the basis of ethnicity, deprivation and the proportion of patients aged over 75 years (for statins and ACE inhibitors, but not for aspirin).
Retail sales forecasting with application the multiple regression
Directory of Open Access Journals (Sweden)
Kuzhda, Tetyana
2012-05-01
Full Text Available The article begins with the formulation of a predictive-learning model, the multiple regression model. A theoretical approach to the construction of regression models is described. The key contribution of the article is the mathematical formulation of the forecast linear equation that estimates the multiple regression model. The calculation of the quantitative forecast value of the dependent variable under the influence of the independent variables is explained. The paper presents retail sales forecasting with estimation of a multiple regression model; some of the most important decisions a retailer makes can be informed by the results of multiple regression. The changing retail environment is driven by expected consumer income and advertising costs. Checking the model for goodness of fit and statistical significance is also explored in the article. Finally, the quantitative value of the retail sales forecast based on the multiple regression model is calculated.
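The forecast step described above amounts to fitting the linear equation on historical records and plugging planned values of the independent variables into it. This sketch uses invented income/advertising/sales figures, not the article's data.

```python
# Fit sales on income and advertising, then forecast from planned values.
# All numbers are synthetic and purely illustrative.
import numpy as np

income = np.array([10.0, 11.0, 12.5, 13.0, 14.2, 15.0, 16.1, 17.0])
advert = np.array([1.0, 1.2, 1.1, 1.5, 1.4, 1.8, 1.7, 2.0])
sales = 5.0 + 2.0 * income + 8.0 * advert  # exact linear relation for the sketch

# Forecast linear equation: sales = b0 + b1*income + b2*advert.
A = np.column_stack([np.ones_like(income), income, advert])
b0, b1, b2 = np.linalg.lstsq(A, sales, rcond=None)[0]

# Forecast sales for a planned income of 18.0 and advertising budget of 2.2.
forecast = b0 + b1 * 18.0 + b2 * 2.2
```

Because the toy data are exactly linear, the fit recovers the generating coefficients; real retail data would add a residual error term and prediction intervals.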
International Nuclear Information System (INIS)
The problem of performing process capability analysis when autocorrelation is present is discussed. It is shown that when the systematic nonrandom behaviour induced by autocorrelation is ignored, the variance estimate obtained from the original data is no longer an appropriate estimate for use in process capability analyses. A remedial measure based on an autoregressive integrated moving average (ARIMA) model is proposed. It is also shown that the process variance estimated from the residual analysis yields appropriate results for the process capability indices
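The remedial idea can be sketched with the simplest autocorrelated case: fit an AR(1) model (standing in for the full ARIMA of the abstract) and take the capability sigma from its residuals rather than from the raw sample. Spec limits and parameters below are invented.

```python
# Residual-based sigma for process capability under autocorrelation.
import numpy as np

rng = np.random.default_rng(5)
phi, sigma_e = 0.8, 1.0
x = np.zeros(500)
for t in range(1, 500):                      # simulate an AR(1) process
    x[t] = phi * x[t - 1] + rng.normal(scale=sigma_e)

raw_sd = x.std(ddof=1)                       # inflated by autocorrelation

# Fit AR(1) by lag-1 least squares and use the residual sigma instead.
phi_hat = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)
resid = x[1:] - phi_hat * x[:-1]
resid_sd = resid.std(ddof=1)                 # close to the innovation sigma

# Capability index computed with the residual-based sigma (toy spec limits).
USL, LSL = 4.5, -4.5
cp_resid = (USL - LSL) / (6 * resid_sd)
cp_raw = (USL - LSL) / (6 * raw_sd)
```

For a stationary AR(1), the raw standard deviation is sigma_e / sqrt(1 - phi²), so ignoring the autocorrelation understates capability, which is the point the abstract makes.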
Riccardi, M; Mele, G; Pulvento, C; Lavini, A; d'Andria, R; Jacobsen, S-E
2014-06-01
Leaf chlorophyll content provides valuable information about physiological status of plants; it is directly linked to photosynthetic potential and primary production. In vitro assessment by wet chemical extraction is the standard method for leaf chlorophyll determination. This measurement is expensive, laborious, and time consuming. Over the years alternative methods, rapid and non-destructive, have been explored. The aim of this work was to evaluate the applicability of a fast and non-invasive field method for estimation of chlorophyll content in quinoa and amaranth leaves based on RGB components analysis of digital images acquired with a standard SLR camera. Digital images of leaves from different genotypes of quinoa and amaranth were acquired directly in the field. Mean values of each RGB component were evaluated via image analysis software and correlated to leaf chlorophyll provided by standard laboratory procedure. Single and multiple regression models using RGB color components as independent variables have been tested and validated. The performance of the proposed method was compared to that of the widely used non-destructive SPAD method. Sensitivity of the best regression models for different genotypes of quinoa and amaranth was also checked. Color data acquisition of the leaves in the field with a digital camera was quick, more effective, and lower cost than SPAD. The proposed RGB models provided better correlation (highest R²) and prediction (lowest RMSEP) of the true value of foliar chlorophyll content and had a lower amount of noise in the whole range of chlorophyll studied compared with SPAD and other leaf image processing based models when applied to quinoa and amaranth. PMID:24442792
DEFF Research Database (Denmark)
Riccardi, M.; Mele, G.
2014-01-01
Leaf chlorophyll content provides valuable information about physiological status of plants; it is directly linked to photosynthetic potential and primary production. In vitro assessment by wet chemical extraction is the standard method for leaf chlorophyll determination. This measurement is expensive, laborious, and time consuming. Over the years alternative methods, rapid and non-destructive, have been explored. The aim of this work was to evaluate the applicability of a fast and non-invasive field method for estimation of chlorophyll content in quinoa and amaranth leaves based on RGB components analysis of digital images acquired with a standard SLR camera. Digital images of leaves from different genotypes of quinoa and amaranth were acquired directly in the field. Mean values of each RGB component were evaluated via image analysis software and correlated to leaf chlorophyll provided by standard laboratory procedure. Single and multiple regression models using RGB color components as independent variables have been tested and validated. The performance of the proposed method was compared to that of the widely used non-destructive SPAD method. Sensitivity of the best regression models for different genotypes of quinoa and amaranth was also checked. Color data acquisition of the leaves in the field with a digital camera was quick, more effective, and lower cost than SPAD. The proposed RGB models provided better correlation (highest R²) and prediction (lowest RMSEP) of the true value of foliar chlorophyll content and had a lower amount of noise in the whole range of chlorophyll studied compared with SPAD and other leaf image processing based models when applied to quinoa and amaranth.
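A multiple regression on mean RGB components has this minimal shape: regress the lab-measured chlorophyll on the per-leaf mean R, G and B values and check R². Everything below is synthetic; real coefficients must come from calibration against wet-chemistry assays.

```python
# Multiple linear regression of chlorophyll on mean R, G, B image components.
# Synthetic data: greener/darker leaves are given more chlorophyll.
import numpy as np

rng = np.random.default_rng(11)
n = 30
R = rng.uniform(40, 120, n)
G = rng.uniform(80, 180, n)
B = rng.uniform(20, 80, n)
chl = 60.0 - 0.20 * R + 0.10 * G - 0.05 * B + rng.normal(scale=1.0, size=n)

A = np.column_stack([np.ones(n), R, G, B])
coef, *_ = np.linalg.lstsq(A, chl, rcond=None)
pred = A @ coef

# Coefficient of determination of the fitted model.
ss_res = np.sum((chl - pred) ** 2)
ss_tot = np.sum((chl - chl.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
```

Validation against held-out leaves (RMSEP), as done in the study, is what distinguishes a usable calibration from an overfit one.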
Fuzzy multiple linear regression: A computational approach
Juang, C. H.; Huang, X. H.; Fleming, J. W.
1992-01-01
This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.
International Nuclear Information System (INIS)
A retrospective analysis of 965 patients with invasive cervix cancer treated by radiation therapy between 1976 and 1981 was performed in order to evaluate prognostic factors for disease-free survival (DFS) and pelvic control. FIGO stage was the most powerful prognostic factor followed by radiation dose and treatment duration (P values = 0.0001). If the analysis was limited to patients treated with radical doses of 75 Gy or more, dose was no longer significant. Young age at diagnosis, non-squamous histology and transfusion during treatment were also adverse prognostic factors for survival and control. Para-aortic nodal involvement on lymphogram was associated with a reduction in DFS (P = 0.0027), whereas pelvic lymph node involvement alone was not. In patients with Stage I and IIA disease, tumour size was the most powerful prognostic factor for survival (P = 0.0001) and the extent of pelvic sidewall involvement was significant in patients with Stage III tumours (P = 0.007). Histological grade appeared to be a predictive factor but was only recorded in 712 patients. These features should be considered in the staging of patients and in the design of clinical trials
Directory of Open Access Journals (Sweden)
G. Ugrasen
2014-05-01
Full Text Available Wire Electrical Discharge Machining (WEDM) is a specialized thermal machining process capable of accurately machining parts of varying hardness or complex shapes with sharp edges that are very difficult to machine by mainstream machining processes. In WEDM a specific wire run-off speed is applied to compensate for wear and avoid wire breakage. Since the workpiece generally stays stationary and short discharge durations are applied, the relative displacement between wire and workpiece during a single discharge is very small. This study outlines the development of a model and its application to optimize WEDM machining parameters using the Taguchi technique, which is based on robust design, and presents electrode wear estimation in wire EDM. EN-8 and EN-19 were machined using different process parameters based on an L16 orthogonal array. Among the process parameters, voltage and flush rate were kept constant, while bed speed, current, pulse-on time and pulse-off time were varied. Molybdenum wire with a diameter of 0.18 mm was used as the electrode. Electrode wear was measured using a universal measuring machine. Estimation and comparison of electrode wear were carried out using multiple regression analysis (MRA) and the group method of data handling (GMDH) technique. From the results it was observed that the estimated electrode wear correlates better with the measured electrode wear for MRA than for GMDH
Energy Technology Data Exchange (ETDEWEB)
Nakagawa, S. [Maizuru National College of Technology, Kyoto (Japan); Kenmoku, Y.; Sakakibara, T. [Toyohashi University of Technology, Aichi (Japan); Kawamoto, T. [Shizuoka University, Shizuoka (Japan). Faculty of Engineering
1996-10-27
Study is under way for more accurate solar radiation prediction to enhance the efficiency of solar energy utilization. Using a technique for roughly estimating the day's clearness index from the weather forecast, the forecast weather (consisting of weather conditions such as 'clear' and 'cloudy' and modifiers such as 'afterward,' 'temporary,' and 'intermittent') has been quantified relative to the clearness index; this quantity is named the 'weather index' for the purpose of this article. The highest error rates in the weather index occur on cloudy days, i.e., for weather index values of 0.2-0.5. It has also been found that there is a high correlation between the clearness index and the north-south wind direction component. A multiple regression analysis has therefore been carried out for estimating the clearness index from the maximum temperature and the north-south wind direction component. Compared with estimating the clearness index from the weather index alone, estimation using the weather index and maximum temperature achieves a 3% improvement throughout the year. Estimation using the weather index and the north-south wind direction component yields a 2% improvement in summer and a 5% or higher improvement in winter. 2 refs., 6 figs., 4 tabs.
Spatial regression analysis on 32 years total column ozone data
J. S. Knibbe; R. J. van der A; Laat, A.T.J., de
2014-01-01
A multiple-regression analysis has been performed on 32 years of total ozone column data, spatially gridded at a 1° × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979–2008) and two years of assimilated SCIAMACHY ozone data (2009–2010). The two-dimensionality of this data set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory pow...
International Nuclear Information System (INIS)
Various statistical techniques were used on five years of data (1998-2002) of average humidity, rainfall, and maximum and minimum temperatures. Regression analysis time series (RATS) relationships were developed to determine the overall trend of these climate parameters, on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination, as a measure of goodness of fit, for our polynomial regression analysis time series (PRATS). Multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) correlations were also developed to decipher the interdependence of the weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our polynomial regression (PR) fit. The Breusch-Pagan test was applied to MLR and MLRATS, respectively, and yielded homoscedasticity. We also employed Bartlett's test for homogeneity of variances on five years of rainfall and humidity data, which showed that the variances in the rainfall data were not homogeneous while those in the humidity data were. Our results on regression and regression analysis time series show the best fit for prediction modeling of climatic data for Quetta, Pakistan. (author)
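A minimal sketch of the goodness-of-fit computation described above (a polynomial regression fit scored by the coefficient of determination). The five annual values below are invented stand-ins, not the Quetta data:

```python
import numpy as np

# Hypothetical five-year series (invented values, not the study's data)
years = np.arange(1998, 2003)                 # 1998-2002
t = years - years[0]                          # time index 0..4
rain = np.array([210.0, 198.0, 240.0, 225.0, 260.0])

coeffs = np.polyfit(t, rain, deg=2)           # quadratic trend: b2*t^2 + b1*t + b0
fitted = np.polyval(coeffs, t)

ss_res = np.sum((rain - fitted) ** 2)         # residual sum of squares
ss_tot = np.sum((rain - rain.mean()) ** 2)    # total sum of squares
r_squared = 1.0 - ss_res / ss_tot             # coefficient of determination
```

Because the quadratic fit includes an intercept, `ss_res` can never exceed `ss_tot`, so `r_squared` lies in [0, 1].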
Vehicle Travel Time Prediction based on Multiple Kernel Regression
Directory of Open Access Journals (Sweden)
Wenjing Xu
2014-07-01
Full Text Available With the rapid development of transportation and the logistics economy, vehicle travel time prediction and planning have become an important topic in logistics. Travel time prediction, which is indispensable for traffic guidance, has become a key issue for researchers in this field. At present, travel time prediction is mainly short-term prediction, and the prediction methods include artificial neural networks, the Kalman filter, and the support vector regression (SVR) method, among others. However, these algorithms still have some shortcomings, such as high computational complexity and slow convergence rates. This paper exploits the learning ability of multiple kernel learning regression (MKLR) for nonlinear prediction, applying MKLR to vehicle travel time prediction in logistics planning. The method includes the following steps: (1) preprocessing the historical data; (2) selecting an appropriate kernel function, training on the historical data, and performing the analysis; (3) predicting the vehicle travel time with the trained model. The experimental results, comparing different prediction methods, show that the vehicle travel time prediction method proposed in this paper achieves higher accuracy than other methods. They also illustrate the feasibility and effectiveness of the proposed prediction method.
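A rough illustration of kernel-based travel time regression. This is plain kernel ridge regression with a single Gaussian kernel, a simplified stand-in for the paper's MKLR; the trip features (distance, departure hour), travel times, and kernel width are all invented:

```python
import numpy as np

def gaussian_kernel(A, B, gamma=0.5):
    # Pairwise squared distances between rows of A and B -> RBF kernel matrix
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Toy history: [distance km, departure hour] -> travel time (minutes)
X = np.array([[5.0, 8.0], [5.0, 14.0], [12.0, 8.0], [12.0, 14.0]])
y = np.array([22.0, 14.0, 45.0, 30.0])

lam = 1e-3                                        # ridge regularisation
K = gaussian_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)  # dual coefficients

x_new = np.array([[5.0, 8.5]])                    # a trip close to the first one
pred = gaussian_kernel(x_new, X) @ alpha
```

MKLR would additionally learn a weighted combination of several such kernels; the single-kernel version shown here keeps only the prediction mechanics.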
On relationship between regression models and interpretation of multiple regression coefficients
Varaksin, A. N.; Panov, V. G.
2012-01-01
In this paper, we consider the problem of interpreting linear regression coefficients in the case of correlated predictors. It is shown that, in general, there is no natural way to interpret these coefficients analogous to the single-predictor case. Nevertheless, we suggest linear transformations of the predictors that reduce the multiple regression to a simple one while retaining the coefficient of the variable of interest. The new variable can be treated as the part of the old var...
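The transformation idea above can be checked numerically via the Frisch-Waugh-Lovell theorem, one concrete transformation of this kind: the multiple-regression coefficient on the variable of interest equals the simple-regression slope of the response on the part of that variable orthogonal to the other predictors. Data here are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.8 * x1 + rng.normal(size=200)     # deliberately correlated with x1
y = 2.0 * x1 - 1.0 * x2 + rng.normal(size=200)

# Full multiple regression of y on (1, x1, x2)
X = np.column_stack([np.ones_like(x1), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# Residualise x1 on (1, x2), then run a simple regression of y on that residual
Z = np.column_stack([np.ones_like(x2), x2])
x1_res = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
slope = np.dot(x1_res, y) / np.dot(x1_res, x1_res)

# beta[1] and slope coincide (up to floating-point error)
```

The transformed predictor `x1_res` is exactly the "part of the old variable" free of the other predictor's influence.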
Steganalysis of LSB Image Steganography using Multiple Regression and Auto Regressive (AR) Model
Directory of Open Access Journals (Sweden)
Souvik Bhattacharyya
2011-07-01
Full Text Available The staggering growth in communication technology and the usage of public-domain channels (i.e., the Internet) has greatly facilitated the transfer of data. However, such open communication channels are highly vulnerable to security threats, allowing unauthorized information access. Traditionally, encryption is used to ensure communication security, but important information is no longer protected once decoded. Steganography is the art and science of communicating in a way which hides the existence of the communication: important information is first hidden in host data, such as a digital image, text, video or audio, and then transmitted secretly to the receiver. Steganalysis is another important topic in information hiding; it is the art of detecting the presence of steganography. In this paper a novel technique for image steganalysis is presented. The proposed technique uses an auto-regressive model to detect the presence of hidden messages, as well as to estimate the relative length of the embedded messages. Various auto-regressive parameters are used to classify cover images and stego images with the help of an SVM classifier. A multiple regression analysis of the cover carrier along with the stego carrier has been carried out in order to detect the existence of even a negligible amount of secret message. Experimental results demonstrate the effectiveness and accuracy of the proposed technique.
REPRESENTATIVE VARIABLES IN A MULTIPLE REGRESSION MODEL
Directory of Open Access Journals (Sweden)
Barbu Bogdan POPESCU
2013-02-01
Full Text Available Econometric models developed for the analysis of banking exclusion during the economic crisis are presented. Access to public goods and services is a condition „sine qua non” for an open and efficient society. In our opinion, the availability of banking and payment services to the entire population, without discrimination, should be the primary objective of public service policy.
Applied regression analysis a research tool
Pantula, Sastry; Dickey, David
1998-01-01
Least squares estimation, when used appropriately, is a powerful research tool. A deeper understanding of the regression concepts is essential for achieving optimal benefits from a least squares analysis. This book builds on the fundamentals of statistical methods and provides appropriate concepts that will allow a scientist to use least squares as an effective research tool. Applied Regression Analysis is aimed at the scientist who wishes to gain a working knowledge of regression analysis. The basic purpose of this book is to develop an understanding of least squares and related statistical methods without becoming excessively mathematical. It is the outgrowth of more than 30 years of consulting experience with scientists and many years of teaching an applied regression course to graduate students. Applied Regression Analysis serves as an excellent text for a service course on regression for non-statisticians and as a reference for researchers. It also provides a bridge between a two-semester introduction to...
Computing multiple-output regression quantile regions from projection quantiles.
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2012-01-01
Roč. 27, č. 1 (2012), s. 29-49. ISSN 0943-4062 R&D Projects: GA MŠk(CZ) 1M06047 Institutional research plan: CEZ:AV0Z10750506 Keywords : directional quantile * halfspace depth * multiple-output regression * parametric programming * quantile regression Subject RIV: BA - General Mathematics Impact factor: 0.482, year: 2012 http://library.utia.cas.cz/separaty/2012/SI/siman-0376414.pdf
Chen Su-Fen
2013-01-01
Unified Multiple Linear Regression (UMLR) is a nonlinear programming model that unifies all kinds of multiple linear regression models, such as principal components regression, ridge regression, robust regression and constrained regression. Although UMLR has exhibited excellent performance in some real applications, the optimization procedure is not yet satisfactory. This study proposes a novel Granular Computing-Particle Swarm Optimization (Grc-PSO) algorithm by ...
Interpreting Multiple Linear Regression: A Guidebook of Variable Importance
Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim
2012-01-01
Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to reflect an overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…
Whitlock, C. H., III
1977-01-01
Constituents whose radiance gradients are linear with concentration may be quantified from signals which contain nonlinear atmospheric and surface reflection effects, for both homogeneous and non-homogeneous water bodies, provided accurate data can be obtained and the nonlinearities are constant with wavelength. Statistical parameters must be used which give an indication of bias as well as total squared error, to ensure that an equation with an optimum combination of bands is selected. It is concluded that the effect of error in upwelled radiance measurements is to reduce the accuracy of the least squares fitting process and to increase the number of points required to obtain a satisfactory fit. The problem of obtaining a multiple regression equation that is extremely sensitive to error is discussed.
Directory of Open Access Journals (Sweden)
Halil Ibrahim Cebeci
2009-12-01
Full Text Available This study explores the relationship between student performance and instructional design. The research was conducted at the E-Learning School at a university in Turkey. A list of design factors with potential influence on student success was created through a review of the literature and interviews with relevant experts, and from this the five most important design factors were chosen. The experts scored 25 university courses on the extent to which they demonstrated the chosen design factors. Multiple-regression and supervised artificial neural network (ANN) models were used to examine the relationship between student grade point averages and the scores on the five design factors. The results indicated that there is no statistical difference between the two models. Both models identified the use of examples and applications as the most influential factor. The ANN model provided more information and was used to predict the course-specific factor values required for a desired level of success.
Predicting share price by using Multiple Linear Regression.
Forslund, Gustaf; Åkesson, David
2013-01-01
The aim of the project was to design a multiple linear regression model and use it to predict the share’s closing price for 44 companies listed on the OMX Stockholm stock exchange’s Large Cap list. The model is intended to be used as a day trading guideline i.e. today’s information is used to predict tomorrow’s closing price. The regression was done in Microsoft Excel 2010[18] by using its built-in function LINEST. The LINEST-function uses the dependent variable y and all the covariates x to ...
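A minimal analogue of the LINEST step described above: an ordinary least-squares fit of tomorrow's close on today's covariates. The choice of covariates (previous close and trading volume) and all prices below are invented for illustration, not OMX data:

```python
import numpy as np

# Toy daily series (invented): closing prices and volume in millions of shares
close = np.array([100.0, 101.5, 101.0, 102.8, 103.1, 104.0])
volume = np.array([1.2, 0.9, 1.1, 1.4, 1.0, 1.3])

# Today's (close, volume) predicts tomorrow's close: align by shifting one day
X = np.column_stack([np.ones(5), close[:-1], volume[:-1]])
y = close[1:]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # same coefficients LINEST returns

# One-step-ahead guideline: plug in today's values
today = np.array([1.0, close[-1], volume[-1]])
tomorrow_hat = today @ beta
```

Excel's LINEST solves exactly this least-squares problem; `np.linalg.lstsq` is its programmatic counterpart.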
Li, Spencer D.
2011-01-01
Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…
Joint regression analysis and AMMI model applied to oat improvement
Oliveira, A.; Oliveira, T. A.; Mejza, S.
2012-09-01
In our work we present an application of some biometrical methods useful in genotype stability evaluation, namely the AMMI model, Joint Regression Analysis (JRA) and multiple comparison tests. A genotype stability analysis of oat (Avena sativa L.) grain yield was carried out using data of the Portuguese Plant Breeding Board, on a sample of 22 different genotypes during the years 2002, 2003 and 2004 in six locations. In Ferreira et al. (2006) the authors state the relevance of regression models and of the Additive Main Effects and Multiplicative Interactions (AMMI) model to study and estimate phenotypic stability effects. As computational techniques we use the zigzag algorithm to estimate the regression coefficients and the agricolae package available in R software for the AMMI model analysis.
Modeling Oil Palm Yield Using Multiple Linear Regression and Robust M-regression
Azme Khamis; Zuhaimy Ismail; Khalid Haron; Ahmad Tarmizi Mohammed
2006-01-01
This study shows how a multiple linear regression model can be used to model palm oil yield. The methods are illustrated by examining time series data of foliar nutrient compositions as one of the independent variables and fresh fruit bunch as the dependent variable. Other independent variables include the nutrient balance ratio and major nutrient composition. This modeling approach is capable of identifying the significant contribution of each independent variable in improving the modelin...
International Nuclear Information System (INIS)
The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan, and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy.
A multiple regression model for the Ft. Calhoun reactor coolant pump system
International Nuclear Information System (INIS)
Multiple regression analysis is one of the most widely used of all statistical tools. In this paper, we introduce an application of fitting a multiple regression model to reactor coolant pump (RCP) data. The primary purpose of this research is to correlate the results obtained by Design of Experiments (DOE) with those of regression model fitting; the idea behind using a regression model is to gain more detailed information from the RCP data than is provided by DOE. In engineering science, statistical quality control techniques have traditionally been applied to control manufacturing processes. An application to commercial nuclear power plant maintenance and control is presented that can greatly improve plant safety and reliability. The results obtained show that six out of ten parameters are within the control specification limits and four parameters are not in a state of statistical control. The four out-of-control parameters adversely affect the regression model fit, so the final prediction equation does not accurately predict future responses. The analysis concludes that, in order to fit the best regression model, one has to remove all out-of-control points from the data set, including dropping a variable from the model, to obtain better prediction of the response variable. (author)
A multiple regression model for the Ft. Calhoun reactor coolant pump system
Energy Technology Data Exchange (ETDEWEB)
Patel, B.; Heising, C.D. [Iowa State Univ. of Science and Technology, Ames, IA (United States)
1996-10-01
Forecasting relativistic electron flux using dynamic multiple regression models
Wei, H.-L.; Billings, S. A.; Surjalal Sharma, A.; Wing, S.; Boynton, R. J.; Walker, S. N.
2011-02-01
The forecast of high energy electron fluxes in the radiation belts is important because the exposure of modern spacecraft to high energy particles can result in significant damage to onboard systems. A comprehensive physical model of processes related to electron energisation that can be used for such a forecast has not yet been developed. In the present paper a systems identification approach is exploited to deduce a dynamic multiple regression model that can be used to predict the daily maximum of high energy electron fluxes at geosynchronous orbit from data. It is shown that the model developed provides reliable predictions.
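A toy dynamic regression in this spirit: regress the daily maximum on its own lag plus a lagged external driver. The use of solar wind speed as the driver, and both series, are invented stand-ins for illustration, not spacecraft data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
v = rng.normal(450.0, 50.0, n)               # synthetic driver (e.g. wind speed)

# Simulate a series that genuinely follows a dynamic regression
flux = np.zeros(n)
for t in range(1, n):
    flux[t] = 0.7 * flux[t - 1] + 0.01 * v[t - 1] + rng.normal(0.0, 0.1)

# One-step-ahead dynamic regression: flux[t] ~ 1, flux[t-1], v[t-1]
X = np.column_stack([np.ones(n - 1), flux[:-1], v[:-1]])
y = flux[1:]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta[1] and beta[2] recover roughly 0.7 and 0.01 from the simulation
```

The fitted coefficients can then be applied to yesterday's observations to forecast today's maximum, mirroring the paper's prediction scheme.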
Multiple predictor smoothing methods for sensitivity analysis.
Energy Technology Data Exchange (ETDEWEB)
Helton, Jon Craig; Storlie, Curtis B.
2006-08-01
The use of multiple predictor smoothing methods in sampling-based sensitivity analyses of complex models is investigated. Specifically, sensitivity analysis procedures based on smoothing methods employing the stepwise application of the following nonparametric regression techniques are described: (1) locally weighted regression (LOESS), (2) additive models, (3) projection pursuit regression, and (4) recursive partitioning regression. The indicated procedures are illustrated with both simple test problems and results from a performance assessment for a radioactive waste disposal facility (i.e., the Waste Isolation Pilot Plant). As shown by the example illustrations, the use of smoothing procedures based on nonparametric regression techniques can yield more informative sensitivity analysis results than can be obtained with more traditional sensitivity analysis procedures based on linear regression, rank regression or quadratic regression when nonlinear relationships between model inputs and model predictions are present.
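A bare-bones locally weighted regression (LOESS-style) smoother, the first of the nonparametric techniques listed above, simplified to a single evaluation point. The tricube kernel and nearest-neighbour span follow the standard LOESS recipe; the data are a noise-free sine for a clear check:

```python
import numpy as np

def loess_point(x0, x, y, span=0.5):
    """Local linear fit at x0 using tricube weights on the nearest span*n points."""
    n = len(x)
    k = max(2, int(span * n))
    d = np.abs(x - x0)
    idx = np.argsort(d)[:k]                           # k nearest neighbours
    w = (1 - (d[idx] / d[idx].max()) ** 3) ** 3       # tricube weights
    X = np.column_stack([np.ones(k), x[idx]])
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y[idx])
    return beta[0] + beta[1] * x0

x = np.linspace(0.0, np.pi, 40)
y = np.sin(x)
y_hat = loess_point(np.pi / 2, x, y)                  # close to sin(pi/2) = 1
```

In a sensitivity analysis one would apply such a smoother stepwise across model inputs and compare the variance each smoothed predictor explains.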
Fiscal Multipliers: A Meta Regression Analysis
Gechert, Sebastian; Will, Henner
2012-01-01
Since the fiscal expansion during the Great Recession of 2008-2009 and the current European consolidation and austerity measures, the analysis of fiscal multiplier effects is back on the scientific agenda. The number of empirical studies is growing fast, tackling the issue with manifold model classes, identification strategies, and specifications. While a plurality of methods seems to be a good idea for addressing a complicated issue, the results are far from consensus. We apply meta regression analysi...
Directory of Open Access Journals (Sweden)
M. Srinivasan
2012-01-01
Full Text Available Problem statement: This study presents a novel method for determining the average winding temperature rise of transformers under predetermined field operating conditions. The rise in winding temperature was determined from estimated values of winding resistance during the heat run test conducted per the IEC standard. Approach: The estimation of hot resistance was modeled using Multiple Variable Regression (MVR), Multiple Polynomial Regression (MPR) and soft computing techniques such as an Artificial Neural Network (ANN) and an Adaptive Neuro Fuzzy Inference System (ANFIS). The modeled hot resistance helps to find the load losses at any load situation without a complicated measurement set-up in transformers. Results: These techniques were applied to hot resistance estimation for a dry-type transformer using the input variables cold resistance, ambient temperature and temperature rise. The results are compared and show good agreement between measured and computed values. Conclusion: The proposed methods were verified using experimental results obtained from a temperature rise test performed on a 55 kVA dry-type transformer.
Precipitation interpolation in mountainous regions using multiple linear regression
Hay, L.; Viger, R.; McCabe, G.
1998-01-01
Multiple linear regression (MLR) was used to spatially interpolate precipitation for simulating runoff in the Animas River basin of southwestern Colorado. MLR equations were defined for each time step using measured precipitation as dependent variables. Explanatory variables used in each MLR were derived for the dependent variable locations from a digital elevation model (DEM) using a geographic information system. The same explanatory variables were defined for a 5 × 5 km grid of the DEM. For each time step, the best MLR equation was chosen and used to interpolate precipitation onto the 5 × 5 km grid. The gridded values of precipitation provide a physically-based estimate of the spatial distribution of precipitation and result in reliable simulations of daily runoff in the Animas River basin.
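The interpolation step can be sketched as fitting the MLR at gauge locations and then applying the fitted equation to every grid cell. Here a single explanatory variable (elevation) stands in for the full set of DEM-derived variables, and all numbers are invented:

```python
import numpy as np

# Invented gauge data for one time step: elevation (km) -> precipitation (mm)
gauge_elev_km = np.array([1.8, 2.1, 2.6, 3.0, 3.3])
gauge_precip = np.array([4.0, 5.1, 6.9, 8.2, 9.5])

# Fit the MLR (here a simple regression on elevation) at the gauges
X = np.column_stack([np.ones_like(gauge_elev_km), gauge_elev_km])
beta, *_ = np.linalg.lstsq(X, gauge_precip, rcond=None)

# Apply the fitted equation to a toy 2x2 grid of DEM elevations
grid_elev_km = np.array([[1.9, 2.4], [2.9, 3.4]])
grid_precip = beta[0] + beta[1] * grid_elev_km
```

In the study this fit is redone at every time step and the equation with the best fit is the one pushed out to the 5 × 5 km grid.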
Multiple Retrieval Models and Regression Models for Prior Art Search
Lopez, Patrice; Romary, Laurent
2009-01-01
This paper presents the system called PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS) realized for the IP track of CLEF 2009. Our approach presents three main characteristics: 1. The usage of multiple retrieval models (KL, Okapi) and term index definitions (lemma, phrase, concept) for the three languages considered in the present track (English, French, German) producing ten different sets of ranked results. 2. The merging of the different results based on mul...
Functional linear regression analysis for longitudinal data
Yao, F; Wang, J L; Yao, Fang; Müller, Hans-Georg; Wang, Jane-Ling
2005-01-01
We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may be modeled as a discrete random number and, accordingly, only a finite and asymptotically nonincreasing number of measurements are available for each subject or experimental unit. We propose a functional regression approach for this situation, using functional principal component analysis, where we estimate the functional principal component scores through conditional expectations. This allows the prediction of an unobserved response trajectory from sparse measurements of a predictor trajectory. The resulting technique is flexible and allow...
Assessment method of study program: Results from regression analysis
Hamid, Mohd Rashid Bin Ab; Mohamed, Mohd Rusllim Bin; Mustafa, Zainol
2015-02-01
Assessment is an important part of any university program, and various assessment approaches have been used to determine students' grades for their subjects. This article therefore discusses an empirical study aimed at finding the best solution for determining student grades. Several predictors of students' grades, i.e., total marks, were identified: coursework marks, mid-semester marks and final exam marks. Raw data from the database for a particular semester at a university on the east coast of Malaysia were used for this purpose. Correlational analysis was used to determine the strength of the association between the three predictors and the criterion variable, and multiple regression analysis was used to find the best regression model for the purpose of the study. Implications of the study are also discussed.
Regression Analysis for the Social Sciences
Gordon, Rachel A A
2012-01-01
The book provides graduate students in the social sciences with the basic skills that they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly-available social science data or drawn from the literature; thorough integration of teaching statistical theory with teaching data processing and analysis; teaching of both SAS and Stata "side-by-side"; and use of chapter exercises in which students practice programming
An Effect Size for Regression Predictors in Meta-Analysis
Aloe, Ariel M.; Becker, Betsy Jane
2012-01-01
A new effect size representing the predictive power of an independent variable from a multiple regression model is presented. The index, denoted as r[subscript sp], is the semipartial correlation of the predictor with the outcome of interest. This effect size can be computed when multiple predictor variables are included in the regression model…
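The semipartial correlation r_sp described above can be computed directly: correlate the outcome with the part of the focal predictor that is orthogonal to the remaining predictors. A sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=300)                 # focal predictor
x2 = 0.6 * x1 + rng.normal(size=300)      # correlated covariate
y = 1.5 * x1 + 0.5 * x2 + rng.normal(size=300)

# Residualise x1 on the remaining predictor (with intercept)
Z = np.column_stack([np.ones_like(x2), x2])
x1_res = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]

# Semipartial correlation: outcome vs the unique part of x1
r_sp = np.corrcoef(x1_res, y)[0, 1]
```

Unlike the partial correlation, only the predictor (not the outcome) is residualised, so r_sp squared is the increment to R^2 from adding x1 last.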
Local bilinear multiple-output quantile/depth regression.
Czech Academy of Sciences Publication Activity Database
Hallin, M.; Lu, Z.; Paindaveine, D.; Šiman, Miroslav
2015-01-01
Roč. 21, č. 3 (2015), s. 1435-1466. ISSN 1350-7265 R&D Projects: GA MŠk(CZ) 1M06047 Institutional support: RVO:67985556 Keywords : conditional depth * growth chart * halfspace depth * local bilinear regression * multivariate quantile * quantile regression * regression depth Subject RIV: BA - General Mathematics Impact factor: 1.161, year: 2014 http://library.utia.cas.cz/separaty/2015/SI/siman-0446857.pdf
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
Forecasting Gold Prices Using Multiple Linear Regression Method
Directory of Open Access Journals (Sweden)
Z. Ismail
2009-01-01
Full Text Available Problem statement: Forecasting is a management function that assists decision making. It is also described as the process of estimation in unknown future situations; in more general terms it is commonly known as prediction, which refers to estimation of time series or longitudinal data. Gold is a precious yellow commodity once used as money. It was made illegal in the USA 41 years ago, but is now once again accepted as a potential currency. The demand for this commodity is on the rise. Approach: The objective of this study was to develop a forecasting model for predicting gold prices based on economic factors such as inflation, currency price movements and others. Following the melt-down of the US dollar, investors are putting their money into gold because gold plays an important role as a stabilizing influence for investment portfolios. Due to the increase in demand for gold in Malaysia and other parts of the world, it is necessary to develop a model that reflects the structure and pattern of the gold market and forecasts movement of the gold price. The most appropriate approach to the understanding of gold prices is the Multiple Linear Regression (MLR) model. MLR is a study of the relationship between a single dependent variable and one or more independent variables, in this case with the gold price as the single dependent variable. The fitted MLR model will be used to predict future gold prices. A naive model known as 'forecast-1' was considered as a benchmark against which to evaluate the performance of the model. Results: Many factors determine the price of gold, and based on 'a hunch of experts', several economic factors had been identified to have influence on the gold prices.
Variables such as the Commodity Research Bureau futures index (CRB), USD/Euro foreign exchange rate (EUROUSD), inflation rate (INF), money supply (M1), New York Stock Exchange (NYSE), Standard and Poor 500 (SPX), Treasury Bill (T-BILL) and US Dollar index (USDX) were considered to have influence on the prices. Parameter estimation for the MLR was carried out using the Statistical Package for the Social Sciences (SPSS), with mean square error (MSE) as the fitness function to determine forecast accuracy. Conclusion: Two models were considered. The first model considered all possible independent variables; it appeared to be useful for predicting the price of gold, with 85.2% of the sample variation in monthly gold prices explained by the model. The second model considered the following four independent variables to be significant: CRB lagged one, EUROUSD lagged one, INF lagged two and M1 lagged two. In terms of prediction, the second model achieved a high level of predictive accuracy: the amount of variance explained was about 70%, and the regression coefficients also provide a means of assessing the relative importance of individual variables in the overall prediction of the gold price.
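A sketch of the second model's lagged-regressor structure, using two of the named variables (CRB and EUROUSD, each lagged one period). All series below are invented toy numbers, not real market data, and the MSE check mirrors the SPSS fitness function:

```python
import numpy as np

# Invented monthly series (toy stand-ins for real market data)
gold = np.array([930.0, 945.0, 960.0, 950.0, 975.0, 990.0, 1005.0])
crb = np.array([400.0, 405.0, 412.0, 408.0, 418.0, 425.0, 430.0])
eurusd = np.array([1.30, 1.31, 1.33, 1.32, 1.35, 1.36, 1.38])

# Align so that this month's gold is explained by last month's regressors
y = gold[1:]
X = np.column_stack([np.ones(len(y)), crb[:-1], eurusd[:-1]])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = X @ beta
mse = np.mean((y - fitted) ** 2)          # fitness measure, as in the study
```

The full model would add INF and M1 at lag two in exactly the same way, as extra shifted columns of `X`.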
Functional linear regression via canonical analysis
He, Guozhong; Müller, Hans-Georg; Wang, Jane-Ling; Yang, Wenjing
2011-01-01
We study regression models for the situation where both dependent and independent variables are square-integrable stochastic processes. Questions concerning the definition and existence of the corresponding functional linear regression models and some basic properties are explored for this situation. We derive a representation of the regression parameter function in terms of the canonical components of the processes involved. This representation establishes a connection betw...
Spatial regression analysis on 32 years total column ozone data
Directory of Open Access Journals (Sweden)
J. S. Knibbe
2014-02-01
Full Text Available Multiple-regressions analysis have been performed on 32 years of total ozone column data that was spatially gridded with a 1° × 1.5° resolution. The total ozone data consists of the MSR (Multi Sensor Reanalysis; 1979–2008 and two years of assimilated SCIAMACHY ozone data (2009–2010. The two-dimensionality in this data-set allows us to perform the regressions locally and investigate spatial patterns of regression coefficients and their explanatory power. Seasonal dependencies of ozone on regressors are included in the analysis. A new physically oriented model is developed to parameterize stratospheric ozone. Ozone variations on non-seasonal timescales are parameterized by explanatory variables describing the solar cycle, stratospheric aerosols, the quasi-biennial oscillation (QBO, El Nino (ENSO and stratospheric alternative halogens (EESC. For several explanatory variables, seasonally adjusted versions of these explanatory variables are constructed to account for the difference in their effect on ozone throughout the year. To account for seasonal variation in ozone, explanatory variables describing the polar vortex, geopotential height, potential vorticity and average day length are included. Results of this regression model are compared to that of similar analysis based on a more commonly applied statistically oriented model. The physically oriented model provides spatial patterns in the regression results for each explanatory variable. The EESC has a significant depleting effect on ozone at high and mid-latitudes, the solar cycle affects ozone positively mostly at the Southern Hemisphere, stratospheric aerosols affect ozone negatively at high Northern latitudes, the effect of QBO is positive and negative at the tropics and mid to high-latitudes respectively and ENSO affects ozone negatively between 30° N and 30° S, particularly at the Pacific. 
The contribution of explanatory variables describing seasonal ozone variation is generally large at mid to high latitudes. We observe ozone-contributing effects for potential vorticity and day length, a negative effect on ozone for geopotential height, and variable ozone effects due to the polar vortex in the regions to the north and south of the polar vortices. Recovery of ozone is identified globally. However, recovery rates and uncertainties strongly depend on choices that can be made in defining the explanatory variables. In particular, the recovery rates over Antarctica might not be statistically significant. Furthermore, the results show no spatially homogeneous pattern as to which regression model and explanatory variables provide the best fit to the data and the most accurate estimates of the recovery rates. Overall, these results suggest that care has to be taken in determining ozone recovery rates, in particular for the Antarctic ozone hole.
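The per-cell regression idea can be sketched in a few lines. This is an illustrative sketch, not the paper's code: the grid sizes, regressor proxies and coefficients are hypothetical, and all per-cell least-squares fits are solved in one call by flattening the spatial axes.

```python
import numpy as np

rng = np.random.default_rng(0)

n_months, n_lat, n_lon = 120, 4, 6                    # toy grid, hypothetical sizes
X = np.column_stack([
    np.ones(n_months),                                # intercept
    np.sin(2 * np.pi * np.arange(n_months) / 12),     # seasonal proxy
    rng.standard_normal(n_months),                    # e.g. solar-cycle proxy
    rng.standard_normal(n_months),                    # e.g. QBO proxy
])

# Synthetic "ozone" field with its own coefficient map per grid cell.
true_beta = rng.standard_normal((X.shape[1], n_lat, n_lon))
ozone = np.einsum('tk,kij->tij', X, true_beta) \
    + 0.1 * rng.standard_normal((n_months, n_lat, n_lon))

# One lstsq call fits every cell: reshape (time, lat, lon) -> (time, cells).
beta_hat, *_ = np.linalg.lstsq(X, ozone.reshape(n_months, -1), rcond=None)
beta_hat = beta_hat.reshape(X.shape[1], n_lat, n_lon)  # spatial coefficient maps
```

The resulting `beta_hat` slices play the role of the paper's spatial patterns of regression coefficients, one map per explanatory variable.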
Vehicle Travel Time Prediction based on Multiple Kernel Regression
Wenjing Xu
2014-01-01
With the rapid development of transportation and the logistics economy, vehicle travel time prediction and planning have become an important topic in logistics. Travel time prediction, which is indispensable for traffic guidance, has become a key issue for researchers in this field. At present, the prediction of travel time is mainly short-term prediction, and the prediction methods include artificial neural networks, the Kalman filter and the support vector regression (SVR) method, etc. However, these algo...
Multiple regression technique for Pth degree polynomials with and without linear cross products
Davis, J. W.
1973-01-01
A multiple regression technique was developed by which the nonlinear behavior of specified independent variables can be related to a given dependent variable. The polynomial expression can be of Pth degree and can incorporate N independent variables. Two cases are treated such that mathematical models can be studied both with and without linear cross products. The resulting surface fits can be used to summarize trends for a given phenomenon and provide a mathematical relationship for subsequent analysis. To implement this technique, separate computer programs were developed for the case without linear cross products and for the case incorporating such cross products, which evaluate the various constants in the model regression equation. In addition, the significance of the estimated regression equation is considered, and the standard deviation, the F statistic, the maximum absolute percent error, and the average of the absolute values of the percent error are evaluated. The computer programs and their manner of utilization are described. Sample problems are included to illustrate the use and capability of the technique, showing the output formats and typical plots comparing computer results to each set of input data.
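A minimal sketch of such a fit, assuming a design matrix built from pure powers plus optional pairwise products; the function and variable names are illustrative stand-ins, not the original FORTRAN programs:

```python
import numpy as np
from itertools import combinations

def poly_design(X, degree, cross_products=False):
    """Columns: 1, then x_j^p for p = 1..degree, then optional x_i * x_j."""
    cols = [np.ones(len(X))]
    for p in range(1, degree + 1):
        cols.extend(X[:, j] ** p for j in range(X.shape[1]))
    if cross_products:
        cols.extend(X[:, i] * X[:, j]
                    for i, j in combinations(range(X.shape[1]), 2))
    return np.column_stack(cols)

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))
# Noise-free toy surface: y = 1 + 2*x1 - 3*x2^2 + 0.5*x1*x2
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] ** 2 + 0.5 * X[:, 0] * X[:, 1]

A = poly_design(X, degree=2, cross_products=True)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
# Column order: 1, x1, x2, x1^2, x2^2, x1*x2
```

Dropping `cross_products=True` reproduces the "without linear cross products" case with a smaller design matrix.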
Ohlmacher, G.C.; Davis, J.C.
2003-01-01
Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and slope aspect were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression. Soil types were highly correlated with the geologic units, and no significant relationships existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.
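The core statistical step can be sketched as follows. This is not the authors' ArcView workflow: the predictor values and coefficients are hypothetical, and the fit uses a plain Newton–Raphson (IRLS) logistic regression.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson (IRLS) for binary logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ beta, -30, 30)))
        W = p * (1.0 - p)
        # Newton step: beta += (X' W X)^{-1} X' (y - p)
        beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return beta

rng = np.random.default_rng(2)
n = 500
slope = rng.uniform(0, 30, n)                    # hypothetical slope angle (deg)
shale = rng.integers(0, 2, n).astype(float)      # hypothetical geology indicator
X = np.column_stack([np.ones(n), slope, shale])
true = np.array([-4.0, 0.15, 1.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true))).astype(float)

beta = fit_logistic(X, y)
# Landslide probability for a steep shale cell vs a gentle non-shale cell
p_steep = 1.0 / (1.0 + np.exp(-np.array([1.0, 25.0, 1.0]) @ beta))
p_gentle = 1.0 / (1.0 + np.exp(-np.array([1.0, 5.0, 0.0]) @ beta))
```

Evaluating the fitted probability over every grid cell would yield the hazard map described in the abstract.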
Comparison of Fuzzy Inference System and Multiple Regression to Predict Synthetic Envelopes Clogging
Bakhtiar Karimi; Farhad Mirzaei; Mohammad Javad Nahvinia; Behnam Ababaei
2010-01-01
Geo-synthetic materials are being used with acceptable performance in soil and water projects worldwide. Geotextiles are one of the categories of geo-synthetics used in drainage systems. The first generation of geotextiles was used in the late 1950s as an alternative to gravel envelopes. In this research, two methods (multiple regression and fuzzy inference system) are evaluated for predicting synthetic envelope clogging. In the multiple regression method, the correlation coefficients for PP450, PP700 an...
Validation of Simulation Models: Regression Analysis Revisited
Kleijnen, J.P.C.; Bettonvil, B.W.M.; Groenendaal, W.J.H. van
1996-01-01
This paper proves that it is wrong to require that regressing a model's outputs on the observed real outcomes gives a 45-degree line through the origin (unit slope, zero intercept). Therefore this paper proposes an alternative requirement: the responses of the model and the real system should have the same means and the same variances. To test whether this requirement is satisfied, a novel statistical procedure is derived. This procedure regresses the differences of simulated and real response...
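The proposed check can be sketched as follows, under the assumption that the procedure regresses the differences on the (centered) sums of simulated and real responses: a zero intercept then corresponds to equal means and a zero slope to equal variances. All numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
real = rng.normal(10.0, 2.0, 100)
sim = rng.normal(10.0, 2.0, 100)      # a "valid" model: same mean and variance

d, s = sim - real, sim + real
A = np.column_stack([np.ones_like(s), s - s.mean()])   # orthogonal columns
(b0, b1), *_ = np.linalg.lstsq(A, d, rcond=None)

# t statistics for intercept (mean difference) and slope (variance difference);
# the simple se formula is valid here because the columns are orthogonal.
resid = d - A @ np.array([b0, b1])
se = np.sqrt(resid @ resid / (len(d) - 2) / np.sum(A ** 2, axis=0))
t_stats = np.array([b0, b1]) / se     # compare against t-distribution quantiles
```

For a valid model neither t statistic should be significant; a nonzero intercept or slope flags unequal means or variances, respectively.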
Linear regression analysis of survival data with missing censoring indicators
Wang, Qihua; Dinse, Gregg E.
2010-01-01
Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sam...
Neutron multiplicity analysis tool
Energy Technology Data Exchange (ETDEWEB)
Stewart, Scott L [Los Alamos National Laboratory
2010-01-01
I describe the capabilities of the EXCOM (EXcel based COincidence and Multiplicity) calculation tool which is used to analyze experimental data or simulated neutron multiplicity data. The input to the program is the count-rate data (including the multiplicity distribution) for a measurement, the isotopic composition of the sample and relevant dates. The program carries out deadtime correction and background subtraction and then performs a number of analyses. These are: passive calibration curve, known alpha and multiplicity analysis. The latter is done with both the point model and with the weighted point model. In the current application EXCOM carries out the rapid analysis of Monte Carlo calculated quantities and allows the user to determine the magnitude of sample perturbations that lead to systematic errors. Neutron multiplicity counting is an assay method used in the analysis of plutonium for safeguards applications. It is widely used in nuclear material accountancy by international (IAEA) and national inspectors. The method uses the measurement of the correlations in a pulse train to extract information on the spontaneous fission rate in the presence of neutrons from (α,n) reactions and induced fission. The measurement is relatively simple to perform and gives results very quickly (≤1 hour). By contrast, destructive analysis techniques are extremely costly and time consuming (several days). By improving the achievable accuracy of neutron multiplicity counting, a nondestructive analysis technique, it could be possible to reduce the use of destructive analysis measurements required in safeguards applications. The accuracy of a neutron multiplicity measurement can be affected by a number of variables such as density, isotopic composition, chemical composition and moisture in the material. In order to determine the magnitude of these effects on the measured plutonium mass a calculational tool, EXCOM, has been produced using VBA within Excel.
This program was developed to help speed the analysis of Monte Carlo neutron transport simulation (MCNP) data, and only requires the count-rate data to calculate the mass of material using INCC's analysis methods instead of the full neutron multiplicity distribution required to run analysis in INCC. This paper describes what is implemented within EXCOM, including the methods used, how the program corrects for deadtime, and how uncertainty is calculated. This paper also describes how to use EXCOM within Excel.
Survival Analysis with Multivariate adaptive Regression Splines
Kriner, Monika
2007-01-01
Multivariate adaptive regression splines (MARS) are a useful tool to identify linear and nonlinear e?ects and interactions between two covariates. In this dissertation a new proposal to model survival type data with MARS is introduced. Martingale and deviance residuals of a Cox PH model are used as response in a common MARS approach to model functional forms of covariate e?ects as well as possible interactions in a data-driven way. Simulation studies prove that the new method yields a bett...
Catchment Area Analysis Using Bayesian Regression Modeling
Wang, Aobo; Wheeler, David C
2015-01-01
A catchment area (CA) is the geographic area and population from which a cancer center draws patients. Defining a CA allows a cancer center to describe its primary patient population and assess how well it meets the needs of cancer patients within the CA. A CA definition is required for cancer centers applying for National Cancer Institute (NCI)-designated cancer center status. In this research, we constructed both diagnosis and diagnosis/treatment CAs for the Massey Cancer Center (MCC) at Virginia Commonwealth University. We constructed diagnosis CAs for all cancers based on Virginia state cancer registry data and Bayesian hierarchical logistic regression models. We constructed a diagnosis/treatment CA using billing data from MCC and a Bayesian hierarchical Poisson regression model. To define CAs, we used exceedance probabilities for county random effects to assess unusual spatial clustering of patients diagnosed or treated at MCC after adjusting for important demographic covariates. We used the MCC CAs to compare patient characteristics inside and outside the CAs. Among cancer patients living within the MCC CA, patients diagnosed at MCC were more likely to be minority, female, uninsured, or on Medicaid. PMID:25983542
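The exceedance-probability step can be illustrated with hypothetical posterior draws: given MCMC samples of a county's random effect, the probability that the effect exceeds zero flags unusual clustering of patients. County names, parameters and the cutoff below are made up.

```python
import numpy as np

rng = np.random.default_rng(4)

# Pretend these are 4000 posterior draws of each county's random effect u_j
# from a fitted Bayesian hierarchical regression model.
posterior = {
    "County A": rng.normal(0.8, 0.3, 4000),   # clearly elevated effect
    "County B": rng.normal(0.0, 0.3, 4000),   # no excess clustering
}

# Exceedance probability P(u_j > 0 | data), estimated from the draws.
exceed = {county: float(np.mean(draws > 0)) for county, draws in posterior.items()}

# Counties whose exceedance probability passes a chosen cutoff enter the CA.
in_ca = [county for county, p in exceed.items() if p > 0.9]
```

The cutoff 0.9 is one plausible choice; the appropriate threshold would be set by the analyst's tolerance for false inclusions.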
Auto-Regressive Independent Process Analysis without Combinatorial Efforts
Szabo, Z.
2010-01-01
We treat the problem of searching for hidden multi-dimensional independent auto-regressive processes (Auto-Regressive Independent Process Analysis, AR-IPA). Independent Subspace Analysis (ISA) can be used to solve the AR-IPA task. The so-called separation theorem simplifies the ISA task considerably: the theorem enables one to reduce the task to 1-dimensional Blind Source Separation (BSS) task followed by the grouping of the coordinates. However, the grouping of the coordinates still involves...
MONEY DEMAND IN ROMANIAN ECONOMY, USING MULTIPLE REGRESSION METHOD AND UNRESTRICTED VAR MODEL
Mariana KAZNOVSKY
2008-01-01
The paper describes the money demand in the Romanian economy using two econometric models. The first model consists of a multiple regression of money demand on the monthly inflation rate, the Industrial Production Index and the RON/Euro foreign exchange rate. The second model (an Unrestricted Vector AutoRegressive model) is applied to the same variables used in the first model. Identifying a statistically strong model, capable of stable estimations for the money demand function in Romania's economy cons...
Regression analysis of multivariate grouped survival data.
Guo, S W; Lin, D Y
1994-09-01
Multivariate failure time data arise when each study subject may experience several types of event or when there are clusterings of observational units such that failure times within the same cluster are correlated. The failure times are often subject to interval grouping or have truly discrete measurements. In this paper, the marginal distribution for each discrete failure time variable is formulated by a grouped-data version of the proportional hazards model while the dependence structure is unspecified. Generalized estimating equations in the spirit of Liang and Zeger (1986, Biometrika 73, 13-22) are proposed to estimate the regression parameters and survival probabilities. The resulting estimators are consistent and asymptotically normal. Robust estimators for the limiting covariance matrices are constructed. Simulation studies demonstrate that the asymptotic approximations are adequate for practical use and that ignoring the intracluster dependence in the variance-covariance estimation would lead to invalid statistical inference. A psychological experiment is provided for illustration. PMID:7981390
Digital Repository Service at National Institute of Oceanography (India)
Balachandran, K.K.; Jayalakshmy, K.V.; Laluraj, C.M.; Nair, M.; Joseph, T.; Sheeba, P.
2008-01-01
-parametric procedures. Further, the step-up multiple regression analysis is an exploratory procedure that is sufficiently powerful to detect both abrupt and gradual patterns, though it may not necessarily signify cause and effect that actually occur (Ludwig, Reynolds... February (Figure 4.d) and November (Figure 4.e,f), suggesting that temperature sets the condition for optimal metabolic activity, proportional to the abundance of flagellates and succession of diatom species (Fisher and Gray, 1983). This also underlines...
Cost Analysis and Tradeoffs in Regression Testing using FSMWeb
Directory of Open Access Journals (Sweden)
Seif Azghandi
2014-07-01
Full Text Available Web applications have become software commodities of choice due to advances in internet and wireless communications. Web applications need to be tested during new development, and thereafter during maintenance when changes are introduced. Models can be used to represent the desired behavior or to represent the desired testing strategies and testing environment. FSMWeb is a black-box, model-based testing and regression testing approach for web applications. This paper elaborates and extends previous work on FSMWeb's features, and introduces patching as a regression testing technique in which test cases are repaired (patched) rather than fully regenerated. Patching may lead to cost savings in regression testing. These enhancements lead to the construction and analysis of cost models used to compare various regression testing approaches, resulting in the selection of a regression testing approach that achieves a favorable (reduced) cost. The determination of favorable cost is subject to a number of assumptions and tradeoffs.
Egg hatchability prediction by multiple linear regression and artificial neural networks
Scientific Electronic Library Online (English)
AC, Bolzan; RAF, Machado; JCZ, Piaia.
2008-06-01
Full Text Available An artificial neural network (ANN) was compared with a multiple linear regression statistical method to predict hatchability in an artificial incubation process. A feedforward neural network architecture was applied. Network trainings were made by the backpropagation algorithm based on data obtained [...] from industrial incubations. The ANN model was chosen as it produced data that fit the experimental data better than the multiple linear regression model, which used coefficients determined by the least-squares method. The simulation results indicate that this ANN can be used for incubation performance prediction.
Sykas, Dimitris; Karathanassi, Vassilia
2015-06-01
This paper presents a new method for automatically determining the optimum regression model, which enables the estimation of a parameter. The concept lies in the combination of k spectral pre-processing algorithms (SPPAs) that enhance spectral features correlated to the desired parameter. Initially a pre-processing algorithm takes as input a single spectral signature and transforms it according to the SPPA function. A k-step combination of SPPAs uses k pre-processing algorithms serially. The result of each SPPA is used as input to the next SPPA, and so on until the k desired pre-processed signatures are reached. These signatures are then used as input to three different regression methods: Normalized band Difference Regression (NDR), Multiple Linear Regression (MLR) and Partial Least Squares Regression (PLSR). Three Simple Genetic Algorithms (SGAs) are used, one for each regression method, for the selection of the optimum combination of k SPPAs. The performance of the SGAs is evaluated based on the RMS error of the regression models. The evaluation not only indicates the selection of the optimum SPPA combination but also the regression method that produces the optimum prediction model. The proposed method was applied on soil spectral measurements in order to predict Soil Organic Matter (SOM). In this study, the maximum value assigned to k was 3. PLSR yielded the highest accuracy, while NDR's accuracy was satisfactory given its low complexity. The MLR method showed severe drawbacks due to noise-induced collinearity among the spectral bands. Most of the regression methods required a 3-step combination of SPPAs for achieving the highest performance. The selected pre-processing algorithms were different for each regression method, since each regression method handles the explanatory variables in a different way.
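The serial-combination idea can be sketched as follows; the three SPPAs (log, square root, standard normal variate) and the input signature are illustrative stand-ins, not the paper's algorithm set, and the genetic-algorithm search is replaced by simple enumeration of all chains up to k = 3.

```python
import numpy as np
from itertools import product

# Hypothetical spectral pre-processing algorithms (SPPAs).
sppas = {
    "log": lambda s: np.log(np.clip(s, 1e-6, None)),
    "sqrt": lambda s: np.sqrt(np.clip(s, 0.0, None)),
    "snv": lambda s: (s - s.mean()) / s.std(),   # standard normal variate
}

def apply_chain(signature, chain):
    """Apply the SPPAs serially: each output feeds the next algorithm."""
    for name in chain:
        signature = sppas[name](signature)
    return signature

# All ordered chains of length 1..3 (3 + 9 + 27 = 39 candidates).
chains = [c for k in (1, 2, 3) for c in product(sppas, repeat=k)]

sig = np.linspace(0.5, 2.0, 50)                  # toy spectral signature
processed = {chain: apply_chain(sig, chain) for chain in chains}
```

In the full method each processed signature would be scored by the RMS error of an NDR, MLR or PLSR model, and a genetic algorithm would search this chain space instead of enumerating it.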
Joint regression analysis of correlated data using Gaussian copulas.
Song, Peter X-K; Li, Mingyao; Yuan, Ying
2009-03-01
This article concerns a new joint modeling approach for correlated data analysis. Utilizing Gaussian copulas, we present a unified and flexible machinery to integrate separate one-dimensional generalized linear models (GLMs) into a joint regression analysis of continuous, discrete, and mixed correlated outcomes. This essentially leads to a multivariate analogue of the univariate GLM theory and hence an efficiency gain in the estimation of regression coefficients. The availability of joint probability models enables us to develop a full maximum likelihood inference. Numerical illustrations are focused on regression models for discrete correlated data, including multidimensional logistic regression models and a joint model for mixed normal and binary outcomes. In the simulation studies, the proposed copula-based joint model is compared to the popular generalized estimating equations, which is a moment-based estimating equation method to join univariate GLMs. Two real-world data examples are used in the illustration. PMID:18510653
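A minimal sketch of the copula construction itself, assuming SciPy is available: correlated latent normals are pushed through the standard normal CDF to get dependent uniforms, which are then mapped through each margin's inverse CDF. The marginals and correlation below are illustrative, and this shows only sampling, not the paper's maximum likelihood estimator.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
rho = 0.6
L = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))

Z = rng.standard_normal((20000, 2)) @ L.T   # correlated latent normals
U = norm.cdf(Z)                             # uniforms with Gaussian dependence

# Each margin keeps its own distribution while sharing the copula dependence:
y_cont = norm.ppf(U[:, 0], loc=2.0, scale=1.5)   # continuous (normal) outcome
y_bin = (U[:, 1] > 0.7).astype(int)              # binary outcome, P(y = 1) = 0.3
```

The pair `(y_cont, y_bin)` is a mixed continuous/binary outcome with correctly specified marginals and positive dependence induced solely through the copula.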
Doerr, Benjamin; Winzen, Carola
2011-01-01
In this work, we introduce multiplicative drift analysis as a suitable way to analyze the runtime of randomized search heuristics such as evolutionary algorithms. We give a multiplicative version of the classical drift theorem. This allows easier analyses in those settings where the optimization progress is roughly proportional to the current distance to the optimum. To display the strength of this tool, we regard the classical problem of how the (1+1) Evolutionary Algorithm optimizes an arbitrary linear pseudo-Boolean function. Here, we first give a relatively simple proof of the fact that any linear function is optimized in expected time $O(n \log n)$, where $n$ is the length of the bit string. Afterwards, we show that in fact any such function is optimized in expected time at most $(1+o(1))\,1.39\,e\,n\ln n$, again using multiplicative drift analysis. We also prove a corresponding lower bound of $(1-o(1))\,e\,n\ln n$, which actually holds for all functions with a unique global optimum. We further demons...
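An empirical companion to these bounds, for the special case of the OneMax linear function (this experiment is illustrative and not from the paper): the (1+1) EA flips each bit independently with probability 1/n and accepts the offspring if it is not worse, and its average optimization time should be on the order of e·n·ln n.

```python
import random

def one_plus_one_ea(n, rng):
    """Run the (1+1) EA on OneMax; return the number of iterations to optimum."""
    x = [rng.randint(0, 1) for _ in range(n)]
    steps = 0
    while sum(x) < n:
        # Standard bit mutation: flip each bit independently with prob. 1/n.
        y = [b ^ (rng.random() < 1.0 / n) for b in x]
        if sum(y) >= sum(x):          # accept if not worse (elitist selection)
            x = y
        steps += 1
    return steps

rng = random.Random(42)
n = 50
runs = [one_plus_one_ea(n, rng) for _ in range(20)]
avg = sum(runs) / len(runs)           # e * n * ln n is about 530 for n = 50
```

The multiplicative drift argument formalizes exactly this behavior: the expected distance to the optimum shrinks by a constant factor per step, in proportion to the current distance.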
ANALYSIS OF DEPENDENTLY CENSORED DATA BASED ON QUANTILE REGRESSION
Ji, Shuang; Peng, Limin; Li, Ruosha; Lynn, Michael J.
2014-01-01
Dependent censoring occurs in many biomedical studies and poses considerable methodological challenges for survival analysis. In this work, we develop a new approach for analyzing dependently censored data by adopting quantile regression models. We formulate covariate effects on the quantiles of the marginal distribution of the event time of interest. Such a modeling strategy can accommodate a more dynamic relationship between covariates and survival time compared to traditional regression mo...
Robust In-Car Speech Recognition Based on Nonlinear Multiple Regressions
Directory of Open Access Journals (Sweden)
Itakura Fumitada
2007-01-01
Full Text Available We address issues for improving handsfree speech recognition performance in different car environments using a single distant microphone. In this paper, we propose a nonlinear multiple-regression-based enhancement method for in-car speech recognition. In order to develop a data-driven in-car recognition system, we develop an effective algorithm for adapting the regression parameters to different driving conditions. We also devise a model compensation scheme by synthesizing the training data using the optimal regression parameters and by selecting the optimal HMM for the test speech. Based on isolated word recognition experiments conducted in 15 real car environments, the proposed adaptive regression approach shows an advantage in average relative word error rate (WER) reductions of 52.5% and 14.8%, compared to the original noisy speech and the ETSI advanced front end, respectively.
Meenal Sharma; Rakesh Mohan
2011-01-01
This paper introduces a statistical model using statistical methods for 2G GSM communication systems. A multiple regression formula is used to calculate path loss. It is assumed that hb, W and ? are three statistical variables. We use a Nakagami distribution to model hb and W, and a uniform distribution to model ?.
A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants
Cooper, Paul D.
2010-01-01
A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…
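The underlying least-squares idea can be sketched outside Excel as well: the vibronic level energies are modeled as E(v) = Te + ωe(v + 1/2) − ωexe(v + 1/2)², which is linear in the regressors (v + 1/2) and (v + 1/2)². The constants below are hypothetical placeholders, not measured iodine values.

```python
import numpy as np

# Hypothetical molecular constants (cm^-1) used to generate toy level energies.
we, wexe, Te = 125.0, 0.75, 15770.0
v = np.arange(0, 40)
E = Te + we * (v + 0.5) - wexe * (v + 0.5) ** 2

# Multiple linear regression on the two transformed regressors, as LINEST would do.
A = np.column_stack([np.ones_like(v, dtype=float), v + 0.5, (v + 0.5) ** 2])
coef, *_ = np.linalg.lstsq(A, E, rcond=None)
Te_fit, we_fit, wexe_fit = coef[0], coef[1], -coef[2]
```

With real spectra the energies carry measurement noise, and the same regression additionally yields standard errors for each constant.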
Miyauchi, Takaharu; Endo, Wataru; Miura, Naoki; Terui, Kazuyuki; Kamata, Syuichi; Hashimoto, Manabu
2014-01-01
We report a case of tumor regression of multiple bone metastases from breast carcinoma after administration of strontium-89 chloride. This case suggests that strontium-89 chloride can not only relieve bone metastases pain not responsive to analgesics, but may also have a tumoricidal effect on bone metastases. PMID:25298863
Due to the complexity of the processes contributing to beach bacteria concentrations, many researchers rely on statistical modeling, among which multiple linear regression (MLR) modeling is most widely used. Despite its ease of use and interpretation, there may be time dependence...
Evaluation Applications of Regression Analysis with Time-Series Data.
Veney, James E.
1993-01-01
The application of time series analysis is described, focusing on the use of regression analysis for analyzing time series in a way that may make it more readily available to an evaluation practice audience. Practical guidelines are suggested for decision makers in government, health, and social welfare agencies. (SLD)
The Sage handbook of regression analysis and causal inference
Best, Henning
2014-01-01
Covering both general and advanced aspects of multivariate methods, this handbook focuses on regression analysis of cross-sectional and longitudinal data with an emphasis on causal analysis and provides readers with an introduction to and exploration of a large range of techniques.
Applying Multiple Linear Regression and Neural Network to Predict Bank Performance
Directory of Open Access Journals (Sweden)
Nor Mazlina Abu Bakar
2009-09-01
Full Text Available Globalization and technological advancement have created a highly competitive market in the banking and finance industry. Performance of the industry depends heavily on the accuracy of the decisions made at managerial level. This study uses the multiple linear regression technique and a feedforward artificial neural network in predicting bank performance. The study aims to predict bank performance using multiple linear regression and a neural network. The study then evaluates the performance of the two techniques with a goal to find a powerful tool for predicting bank performance. Data of thirteen banks for the period 2001-2006 were used in the study. ROA was used as a measure of bank performance, and hence is the dependent variable for the multiple linear regression. Seven variables including liquidity, credit risk, cost to income ratio, size, concentration ratio, inflation and GDP were used as independent variables. Under supervised learning, the dependent variable, ROA, was used as the target output for the artificial neural network. Seven inputs corresponding to the seven predictor variables were used for pattern recognition in the training phase. Experimental results from the multiple linear regression show that two variables, credit risk and cost to income ratio, are significant in determining bank performance. The two variables were found to explain about 60.9 percent of the total variation in the data, with a mean square error (MSE) of 0.330. The artificial neural network was found to give optimal results using thirteen hidden neurons. Testing results show that the seven inputs explain about 66.9 percent of the total variation in the data, with a very low MSE of 0.00687. Performance of both methods is measured by the mean square prediction error (MSPR) at the validation stage. The MSPR value for the neural network is lower than the MSPR value for multiple linear regression (0.0061 against 0.6190).
The study concludes that artificial neural network is the more powerful tool in predicting bank performance.
Yu, Binbing
2013-01-01
Relative survival is the standard measure of excess mortality due to cancer in population-based cancer survival studies. In relative survival analysis, the observed hazard for cancer patients is the sum of the expected hazard for the general cancer-free population and the excess hazard associated with a cancer diagnosis. Previous models for relative survival analysis have assumed that the excess hazard rate is related to covariates by additive or multiplicative regression models. In this pape...
Seo, Min Seok; Kim, Ja Kyung; Shim, Jae Yong
2015-09-01
We report a case of regression of multiple pulmonary metastases, which originated from hepatocellular carcinoma after treatment with intravenous administration of high-dose vitamin C. A 74-year-old woman presented to the clinic for her cancer-related symptoms such as general weakness and anorexia. After undergoing initial transarterial chemoembolization (TACE), local recurrence with multiple pulmonary metastases was found. She refused further conventional therapy, including sorafenib tosylate (Nexavar). She did receive high doses of vitamin C (70 g), which were administered into a peripheral vein twice a week for 10 months, and multiple pulmonary metastases were observed to have completely regressed. She then underwent subsequent TACE, resulting in remission of her primary hepatocellular carcinoma. PMID:26256994
Ratio Versus Regression Analysis: Some Empirical Evidence in Brazil
Directory of Open Access Journals (Sweden)
Newton Carneiro Affonso da Costa Jr.
2004-06-01
Full Text Available This work compares the traditional methodology for ratio analysis, applied to a sample of Brazilian firms, with the alternative of regression analysis, for both cross-industry and intra-industry samples. The structural validity of the traditional methodology was tested through a model that represents its analogous regression format. The data are from 156 Brazilian public companies in nine industrial sectors for the year 1997. The results provide weak empirical support for the traditional ratio methodology, as it was verified that the validity of this methodology may differ between ratios.
Tan, F.; Lim, H. S.; Abdullah, K.; Yoon, T. L.; Zubir Matjafri, M.; Holben, B.
2014-02-01
Aerosol optical depth (AOD) from AERONET data has a very fine resolution, but the air pollution index (API), visibility and relative humidity from ground truth measurements are coarse. To obtain the local AOD in the atmosphere, the relationship between AOD and these three parameters was determined using multiple regression analysis. Data from the southwest monsoon period (August to September 2012) taken in Penang, Malaysia, were used to establish a quantitative relationship in which the AOD is modeled as a function of API, relative humidity and visibility. The most highly correlated model was used to predict AOD values during the southwest monsoon period. When the aerosol is not uniformly distributed in the atmosphere, the predicted AOD can deviate strongly from the measured values. These deviating data can be removed by comparing the predicted AOD values with the actual AERONET data, which helps to investigate whether the non-uniform aerosol source is at the ground surface or at a higher altitude. This model can accurately predict AOD only if the aerosol is uniformly distributed in the atmosphere. However, further study is needed to determine whether this model is suitable for predicting AOD not only in Penang but also in other states of Malaysia, or even globally.
International Nuclear Information System (INIS)
Much attention is focused on increasing the energy efficiency to decrease fuel costs and CO2 emissions throughout industrial sectors. The ORC (organic Rankine cycle) is a relatively simple but efficient process that can be used for this purpose by converting low and medium temperature waste heat to power. In this study we propose four linear regression models to predict the maximum obtainable thermal efficiency for simple and recuperated ORCs. A previously derived methodology is able to determine the maximum thermal efficiency among many combinations of fluids and processes, given the boundary conditions of the process. Hundreds of optimised cases with varied design parameters are used as observations in four multiple regression analyses. We analyse the model assumptions, prediction abilities and extrapolations, and compare the results with recent studies in the literature. The models are in agreement with the literature, and they present an opportunity for accurate prediction of the potential of an ORC to convert heat sources with temperatures from 80 to 360 °C, without detailed knowledge or need for simulation of the process. - Highlights: • The maximum thermal efficiency of ORCs in hundreds of cases was analysed. • Multiple regression models were derived to predict the maximum obtainable efficiency of ORCs. • Using only key design parameters, the maximum obtainable efficiency can be evaluated. • The regression models decrease the resources needed to evaluate the maximum potential. • The models are statistically strong and in good agreement with the literature
User's Guide to the Weighted-Multiple-Linear Regression Program (WREG version 1.0)
Eng, Ken; Chen, Yin-Yu; Kiang, Julie E.
2009-01-01
Streamflow is not measured at every location in a stream network. Yet hydrologists, State and local agencies, and the general public still seek to know streamflow characteristics, such as mean annual flow or flood flows with different exceedance probabilities, at ungaged basins. The goals of this guide are to introduce and familiarize the user with the weighted multiple-linear regression (WREG) program, and to also provide the theoretical background for program features. The program is intended to be used to develop a regional estimation equation for streamflow characteristics that can be applied at an ungaged basin, or to improve the corresponding estimate at continuous-record streamflow gages with short records. The regional estimation equation results from a multiple-linear regression that relates the observable basin characteristics, such as drainage area, to streamflow characteristics.
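The core computation WREG builds on, a weighted multiple-linear regression relating basin characteristics to a streamflow statistic, can be sketched as follows. The basin characteristics, flow values and weights below are invented for illustration; WREG's actual weighting accounts for record length and cross-correlation between gages.

```python
import numpy as np

# Hypothetical regional data set: log(mean annual flow) modeled from
# log(drainage area) and a basin-elevation index at five gaged basins.
X = np.array([[1.0, 2.1, 0.30],
              [1.0, 2.8, 0.60],
              [1.0, 3.2, 0.40],
              [1.0, 3.9, 0.70],
              [1.0, 4.4, 0.50]])          # intercept, log(area), elevation index
y = np.array([1.9, 2.6, 3.1, 3.7, 4.2])   # log(mean annual flow)

# Weights, e.g. proportional to record length at each gage (assumed values).
w = np.array([10.0, 25.0, 40.0, 15.0, 30.0])
W = np.diag(w)

# Weighted least squares: beta = (X' W X)^-1 X' W y
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# Apply the regional estimation equation at an "ungaged" basin.
x_new = np.array([1.0, 3.5, 0.55])
print("coefficients:", beta)
print("predicted log-flow at ungaged basin:", x_new @ beta)
```

The same normal-equations form underlies ordinary least squares (all weights equal) and the generalized variants WREG offers.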
Regression Analysis between Properties of Subgrade Lateritic Soil
Afeez Adefemi BELLO
2012-01-01
The results of a study that used regression analysis to seek correlations between index properties and the California Bearing Ratio (CBR) of some lateritic soils within Osogbo town in South Western Nigeria are presented. For an appreciable conclusion to be established, lateritic soil samples were collected from eight (8) different borrow pits within the town, and various laboratory tests including Atterberg limits, gradation analysis, California Bearing Ratio, compaction...
An Algorithm to Estimate Continuous-time Traffic Speed Using Multiple Regression Model
Xin Jin; Suk-Kyo Hong; Qiang Ma
2006-01-01
In this study we present a novel algorithm to estimate continuous-time traffic speed data using multiple regression based on correlated speeds, and compare its results with other baseline missing-speed prediction methods on real freeway traffic speed data. Since this approach has greater generalization ability for the given real speed data, it is believed that the model will also perform well in other time-series missing-data estimation settings.
Multiple regression model for thermal inactivation of Listeria monocytogenes in liquid food products
Lieverloo, J.H.M.; de Roode, M.; Fox, M.B.; Zwietering, M.H.; Wells-Bennik, M.H.J.
2013-01-01
A multiple regression model was constructed for thermal inactivation of Listeria monocytogenes in liquid food products, based on 802 sets of data with 51 different strains and 6 cocktails of strains published from 1984 to 2010. Significant variables, other than inactivation temperature, were pH, sodium chloride content, sugar content, the temperature of growth or storage before inactivation, and a heat shock before inactivation. The constructed model for thermal inactivation of L. ...
Multiple linear regression MOS for short-term wind power forecast
Ranaboldo, Matteo
2011-01-01
Short-term (0 - 36 h ahead) wind power forecast is a central issue for the correct management of a grid connected wind farm. A combination of physical and statistical treatments to post-process Numerical Weather Predictions (NWP) outputs is needed for successful short-term wind power forecasts. One of the most promising and effective approaches for statistical treatment is the Model Output Statistics (MOS) technique. In this study a MOS based on multiple linear regression is proposed: the mod...
Estimation of Saturation Percentage of Soil Using Multiple Regression, ANN, and ANFIS Techniques
Khaled Ahmad Aali; Masoud Parsinejad; Bizhan Rahmani
2009-01-01
The saturation percentage (SP) of soils is an important index in hydrological studies. In this paper, artificial neural networks (ANNs), multiple regression (MR), and an adaptive neural-based fuzzy inference system (ANFIS) were used for estimation of the saturation percentage of soils collected from the Boukan region in the northwestern part of Iran. Percent clay, silt, sand and organic carbon (OC) were used to develop the applied methods. In addition, contributions of each input variable were asse...
Least Squares Adjustment: Linear and Nonlinear Weighted Regression Analysis
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg
2007-01-01
This note primarily describes the mathematics of least squares regression analysis as it is often used in geodesy, including land surveying and satellite positioning applications. In these fields regression is often termed adjustment. The note also contains a couple of typical land surveying and satellite positioning application examples. In these application areas we are typically interested in the parameters of the model, typically 2- or 3-D positions, and not in predictive modelling, which is often the main concern in other regression analysis applications. Adjustment is often used to obtain estimates of relevant parameters in an over-determined system of equations, which may arise from deliberately carrying out more measurements than are actually needed to determine the set of desired parameters. An example is the determination of a geographical position based on information from a number of Global Navigation Satellite System (GNSS) satellites, also known as space vehicles (SV). It takes at least four SVs to determine the position (and the clock error) of a GNSS receiver. Often more than four SVs are used, and we use adjustment to obtain a better estimate of the geographical position (and the clock error) and to obtain estimates of the uncertainty with which the position is determined. Regression analysis is used in many other fields of application in the natural, technical and social sciences. Examples include curve fitting, calibration, and establishing relationships between different variables in an experiment or in a survey. Regression analysis is probably one of the most used statistical techniques around. Dr. Anna B. O. Jensen provided insight and data for the Global Positioning System (GPS) example. Matlab code and sections that are considered either traditional land surveying material or advanced material are typeset with smaller fonts. Comments in general, or on, for example, unavoidable typos, shortcomings and errors, are most welcome.
Early cost estimating for road construction projects using multiple regression techniques
Directory of Open Access Journals (Sweden)
Ibrahim Mahamid
2011-12-01
Full Text Available The objective of this study is to develop early cost estimating models for road construction projects using multiple regression techniques, based on 131 sets of data collected in the West Bank in Palestine. As cost estimates are required at the early stages of a project, consideration was given to the fact that the input data for the required regression model should be easily extracted from sketches or the scope definition of the project. Eleven regression models are developed to estimate the total cost of a road construction project in US dollars; 5 of them include bid quantities as input variables and 6 include road length and road width. The coefficient of determination r2 for the developed models ranges from 0.92 to 0.98, which indicates that the values predicted by the models fit the real-life data well. The values of the mean absolute percentage error (MAPE) of the developed regression models range from 13% to 31%; these results compare favorably with past research, which has shown that estimate accuracy in the early stages of a project is between ±25% and ±50%.
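The two goodness-of-fit figures the entry quotes, the coefficient of determination r2 and MAPE, can be computed as in the sketch below. The cost data and coefficients are synthetic stand-ins, not the study's Palestinian data set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the study's data: total cost modeled from
# road length (km) and width (m); coefficients are illustrative only.
n = 131
length = rng.uniform(1, 10, n)
width = rng.uniform(4, 12, n)
cost = 50_000 * length + 8_000 * width * length + rng.normal(0, 20_000, n)

X = np.column_stack([np.ones(n), length, width * length])
beta, *_ = np.linalg.lstsq(X, cost, rcond=None)
pred = X @ beta

# Coefficient of determination r2
ss_res = np.sum((cost - pred) ** 2)
ss_tot = np.sum((cost - cost.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# Mean absolute percentage error (MAPE)
mape = np.mean(np.abs((cost - pred) / cost)) * 100
print(f"r2 = {r2:.3f}, MAPE = {mape:.1f}%")
```

Note that a high r2 on the fitting sample does not by itself establish the out-of-sample accuracy that MAPE on held-out projects would measure.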
A Logistic Regression Analysis of the Ischemic Heart Disease Risk
Irfana P. Bhatti; Heman D. Lohano; Zafar A. Pirzado; Imran A. Jafri
2006-01-01
The main objective of the present study is to investigate factors that contribute significantly to enhancing the risk of ischemic heart disease. The dependent variable of the study is the diagnosis - whether the patient has the disease or does not. Logistic regression analysis is applied to explore the factors affecting the disease. The results of the study show that the factors contributing significantly to enhancing the risk of ischemic heart disease are the use of banaspati gh...
Entrepreneurship programs in developing countries: A meta regression analysis
Cho, Yoonyoung; Trionfi Honorati, Maddalena
2013-01-01
This paper provides a synthetic and systematic review on the effectiveness of various entrepreneurship programs in developing countries. We adopt a meta-regression analysis using 37 impact evaluation studies that were in the public domain by March 2012, and draw out several lessons on the design of the programs. We observe a wide variation in program effectiveness across different interventions depending on outcomes, types of beneficiaries, and country context. Overall, entrepreneurship progr...
Robust regression applied to fractal/multifractal analysis.
Portilla, F.; Valencia, J. L.; Tarquis, A. M.; Saa-Requejo, A.
2012-04-01
Fractal and multifractal are concepts that have grown increasingly popular in soil analysis in recent years, along with the development of fractal models. One of the common steps is to calculate the slope of a linear fit, usually by the least squares method. This should not pose a special problem; however, in many situations with experimental data the researcher has to select the range of scales at which to work, neglecting the remaining points, in order to achieve the good linearity that this type of analysis requires. Robust regression is a form of regression analysis designed to circumvent some limitations of traditional parametric and non-parametric methods. With this method we do not have to assume that an outlier is simply an extreme observation drawn from the tail of a normal distribution that does not compromise the validity of the regression results. In this work we have evaluated the capacity of robust regression to select the points of the experimental data to be used, trying to avoid subjective choices. Based on this analysis we have developed a new working methodology that involves two basic steps: • Evaluating the improvement of the linear fit when consecutive points are eliminated, based on the p-value of R, thereby taking into account the implications of reducing the number of points. • Evaluating the significance of the difference between the slope fitted with the two extreme points included and the slope fitted with the remaining points. We compare the results of applying this methodology with those of the commonly used least squares approach. The data selected for these comparisons come from experimental soil roughness transects and from simulations based on the midpoint displacement method with added trends and noise. The results are discussed, indicating the advantages and disadvantages of each methodology. Acknowledgements Funding provided by CEIGRAM (Research Centre for the Management of Agricultural and Environmental Risks) and by the Spanish Ministerio de Ciencia e Innovación (MICINN) through project no.
AGL2010-21501/AGR is greatly appreciated.
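Robust regression of the kind discussed above can be sketched with a Huber-weighted iteratively reweighted least squares (IRLS) fit. The log-log scaling data and the outlier below are synthetic, and production work would normally use a tested implementation (e.g. statsmodels' RLM) rather than this minimal loop.

```python
import numpy as np

def huber_irls(x, y, delta=1.345, iters=50):
    """Fit y = a + b*x by iteratively reweighted least squares with
    Huber weights, which down-weight points with large residuals."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # OLS starting point
    for _ in range(iters):
        r = y - X @ beta
        # Robust scale estimate from the median absolute deviation
        s = np.median(np.abs(r - np.median(r))) / 0.6745 or 1.0
        u = np.abs(r / s)
        w = np.where(u <= delta, 1.0, delta / u)         # Huber weights
        W = np.diag(w)
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta

# Synthetic log-log scaling data with a true slope of 1.8 and one
# outlier at the coarsest scale breaking the linearity.
logscale = np.arange(1.0, 9.0)
logcount = 1.8 * logscale + 0.5
logcount[-1] += 6.0   # outlier

a, b = huber_irls(logscale, logcount)
print("robust slope estimate:", b)   # less biased by the outlier than OLS
```

Down-weighting the outlier pulls the slope back toward the underlying scaling exponent, which is the property exploited when selecting the fitting range objectively.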
Poisson Regression Analysis of Illness and Injury Surveillance Data
Energy Technology Data Exchange (ETDEWEB)
Frome, E.L.; Watkins, J.P.; Ellis, E.D.
2012-12-12
The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. 
In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra-Poisson variation. The R open source software environment for statistical computing and graphics is used for analysis. Additional details about R and the data that were used in this report are provided in an Appendix. Information on how to obtain R and utility functions that can be used to duplicate results in this report are provided.
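The log-linear rate model described above can be sketched with a small Fisher-scoring (IRLS) fit. The entry's actual analyses use R; here the strata, event counts and person-years are invented for illustration, with the log of person-time entering as an offset.

```python
import numpy as np

def poisson_irls(X, y, offset, iters=25):
    """Fit a log-linear Poisson rate model  log(mu) = offset + X beta
    (offset = log person-time at risk) by Fisher scoring / IRLS."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        eta = offset + X @ beta
        mu = np.exp(eta)
        # Working response and weights for the log link
        z = (eta - offset) + (y - mu) / mu
        W = np.diag(mu)
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ z)
    return beta

# Illustrative strata: absence counts y over person-years t in two age groups.
t = np.array([100., 150., 120., 180.])    # person-years at risk
age = np.array([0., 0., 1., 1.])          # 0 = younger, 1 = older
y = np.array([5., 9., 18., 25.])          # absence counts
X = np.column_stack([np.ones(4), age])

beta = poisson_irls(X, y, offset=np.log(t))
rate_ratio = np.exp(beta[1])
print("older/younger event-rate ratio:", rate_ratio)
```

With a single binary covariate the fitted rate ratio reduces to the ratio of pooled crude rates, which makes the result easy to check by hand; over-dispersion diagnostics would then compare the residual deviance against its degrees of freedom.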
The use of weighted multiple linear regression to estimate QTL-by-QTL epistatic effects
Scientific Electronic Library Online (English)
Bocianowski, Jan
Full Text Available Knowledge of the nature and magnitude of gene effects, as well as their contribution to the control of metric traits, is important in formulating efficient breeding programs for the improvement of plant genetics. Information concerning a genetic parameter such as the additive-by-additive epistatic effect can be useful in traditional breeding. This report describes the results obtained by applying weighted multiple linear regression to estimate the parameter connected with an additive-by-additive epistatic interaction. Three weight variants were used: (1) standard weights based on estimated variances, (2) different weights for minimal, maximal and other lines, and (3) different weights for extreme and other lines. The approach described here combines two methods of estimation, one based on phenotypic observations and the other using molecular marker data. The comparison was done using Monte Carlo simulations. The results show that the application of weighted regression to the marker data yielded estimates similar to those obtained by phenotypic methods.
Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.
2013-01-01
This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
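The sampling-distribution fact such a test relies on can be illustrated: under the null hypothesis of no linear relationship (and Gaussian errors), R2 from an OLS fit with k regressors and n observations follows a Beta(k/2, (n - k - 1)/2) distribution, so R2 can serve directly as a test statistic. The data below are simulated under the null; this is a generic sketch, not necessarily the article's own procedure.

```python
import numpy as np
from scipy import stats

n, k = 30, 3
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = rng.normal(size=n)            # unrelated to X: the null is true

# Ordinary least squares fit and coefficient of determination
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

# p-value from the Beta reference distribution
p = stats.beta.sf(r2, k / 2, (n - k - 1) / 2)
print(f"R2 = {r2:.3f}, p = {p:.3f}")
```

This p-value is algebraically identical to that of the overall F test of the regression, since F is a monotone transformation of R2.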
Directory of Open Access Journals (Sweden)
Arvinder Kaur
2012-01-01
Full Text Available Software estimation techniques present an inclusive set of directives for software project developers, project managers and management in order to produce more accurate estimates or predictions for future developments. The estimates also facilitate allocation of resources for software development, and smooth the processes of re-planning, prioritizing, classification and reuse of projects. Various estimation models are widely used in industry as well as for research purposes. Several comparative studies have been executed on them, but choosing the best technique is quite intricate. Estimation by Analogy (EbA) is the method of making estimations based on the outcomes of the k most analogous projects. Projects close in distance are potentially similar to the reference project from the repository of projects. This method has been widely accepted and is quite popular, as it imitates human beings' inherent skill of judging by analogy with similar projects. In this paper, Grey Relational Analysis (GRA) is used as the method for feature selection and also for locating the projects closest to the reference project from the set of projects. The closest k projects are then used to build regression models. Regression techniques such as multiple linear regression, stepwise regression and robust regression are used to estimate the effort from the closest projects.
Tolosana-Delgado, R.; von Eynatten, H.
2010-05-01
Modern geochemical data sets typically have around 20-30 compositional variables measured on some tens or hundreds of samples. A statistical analysis of data sets with so many variables should prioritize reducing the dimensionality of the model, in order to increase its reliability and enhance its interpretation. In the framework of compositional data analysis with multiple regression, such simplification can be achieved by taking some geometric concepts into account. First, the sample space of compositions, the simplex, is given a Euclidean space structure by the compositional operations of perturbation, powering and the Aitchison inner product. Then, given some qualitative information on which subcompositions might depend on each explanatory variable, one can decompose the simplex into a set of orthogonal subspaces, in such a way that the composition projected onto each subspace is independent of a subset of the explanatory variables. This is achieved with a series of singular value decomposition computations. The method is applied to a data set of 88 observations of six major oxides in molar proportions, from modern glacial and fluvio-glacial sediments, with grain size ranging from coarse sand to clay. The goal is to assess the influence of chemical weathering processes (expected to impose a linear relation between composition and grain size) against purely physical processes (expected to show step-wise functions following the largest characteristic crystal sizes of specific minerals in the source rock). We exhaustively explore all patterns of uncorrelation of the composition with three explanatory variables: grain size in φ scale, and two step functions for the silt and clay domains. The best pattern, chosen with a likelihood ratio test, has only a smooth trend of (Mg,Fe) vs. (Al,K,Ca+Na) enrichment towards finer grain sizes, explained as differential mechanical behaviour of phyllosilicates vs. feldspar, and coefficients for the two step functions related to the sharp decrease of quartz in silt fractions and the sudden enrichment of mafic accessory minerals, alteration products and mechanically unstable phyllosilicates in the clay fraction. We can thus be confident that weathering is almost absent in this data set.
Arch Height: A Regression Analysis of Different Measuring Parameters
Directory of Open Access Journals (Sweden)
Hironmoy Roy
2011-07-01
Full Text Available Rationale: For measuring the height of the arch of the foot, either the standing navicular height or the talar height of the medial longitudinal arch was accepted in earlier days, whereas the 'standing normalised navicular height' is taken as the yardstick by modern authors. Being troublesome and time consuming, these measurements are often not practical in a busy OPD schedule; instead, the arch height is measured in the supine posture. Objectives: This study therefore aimed to derive the regression between the standing arch-height values and their supine counterparts, so that the former can be predicted easily from the latter. Methodology: The study was carried out among 103 adult subjects within the purview of North Bengal Medical College & Hospital. From x-ray films of their feet in supine and standing postures, the navicular and talar heights were determined and the records analysed. Result: Statistically significant correlation followed by regression analysis yielded simple linear regression equations for predicting the standing arch-height values from the supine values, derived separately for males and females. Conclusion: Thus, from a known supine arch-height value, we can derive the respective standing arch height, as well as the 'standing normalised navicular height' indirectly, avoiding the troublesome maneuver in regular practice. The present study therefore recommends this method in clinical practice as a more rational and practical approach to estimating arch height.
International Nuclear Information System (INIS)
Highlights: • We obtained models for estimation of the cetane number of biodiesel. • Twenty-four neural networks using two topologies were evaluated. • The best neural network for predicting the cetane number was selected. • The best accuracy was obtained for the selected neural network. - Abstract: Models for estimating the cetane number of biodiesel from its fatty acid methyl ester composition were obtained in this work using multiple linear regression and artificial neural networks. To obtain models predicting the cetane number, experimental data from literature reports covering 48 and 15 biodiesels in the modeling-training step and the validation step, respectively, were taken. Twenty-four neural networks using two topologies and different algorithms for the second training step were evaluated. The model obtained using multiple regression was compared with two other models from the literature and was able to predict the cetane number with 89% accuracy, with one outlier observed. A model to predict the cetane number using an artificial neural network was obtained with accuracy better than 92%, except for one outlier. The best neural network for predicting the cetane number was a backpropagation network (11:5:1) using the Levenberg-Marquardt algorithm for the second step of network training, showing R = 0.9544 for the validation data.
Parisi Kern, Andrea; Ferreira Dias, Michele; Piva Kulakowski, Marlova; Paulo Gomes, Luciana
2015-05-01
Reducing construction waste is becoming a key environmental issue in the construction industry. The quantification of waste generation rates in the construction sector is an invaluable management tool in supporting mitigation actions. However, the quantification of waste can be a difficult process because of the specific characteristics and the wide range of materials used in different construction projects. Large variations are observed in the methods used to predict the amount of waste generated because of the range of variables involved in construction processes and the different contexts in which these methods are employed. This paper proposes a statistical model to determine the amount of waste generated in the construction of high-rise buildings by assessing the influence of design process and production system, often mentioned as the major culprits behind the generation of waste in construction. Multiple regression was used to conduct a case study based on multiple sources of data of eighteen residential buildings. The resulting statistical model produced dependent (i.e. amount of waste generated) and independent variables associated with the design and the production system used. The best regression model obtained from the sample data resulted in an adjusted R(2) value of 0.694, which means that it predicts approximately 69% of the factors involved in the generation of waste in similar constructions. Most independent variables showed a low determination coefficient when assessed in isolation, which emphasizes the importance of assessing their joint influence on the response (dependent) variable. PMID:25704604
Multivariate study and regression analysis of gluten-free granola
Directory of Open Access Journals (Sweden)
Lilian Maria Pagamunici
2014-03-01
Full Text Available This study developed a gluten-free granola and evaluated it during storage with the application of multivariate and regression analysis of the sensory and instrumental parameters. The physicochemical, sensory, and nutritional characteristics of a product containing quinoa, amaranth and linseed were evaluated. The crude protein and lipid contents ranged from 97.49 and 122.72 g kg-1 of food, respectively. The polyunsaturated/saturated, and n-6:n-3 fatty acid ratios ranged from 2.82 and 2.59:1, respectively. Granola had the best alpha-linolenic acid content, nutritional indices in the lipid fraction, and mineral content. There were good hygienic and sanitary conditions during storage; probably due to the low water activity of the formulation, which contributed to inhibit microbial growth. The sensory attributes ranged from 'like very much' to 'like slightly', and the regression models were highly fitted and correlated during the storage period. A reduction in the sensory attribute levels and in the product physical stabilisation was verified by principal component analysis. The use of the affective test acceptance and instrumental analysis combined with statistical methods allowed us to obtain promising results about the characteristics of gluten-free granola.
Hassanzadeh, S.; Hosseinibalam, F.; Omidvari, M.
2008-04-01
Data on seven meteorological variables (relative humidity, wet temperature, dry temperature, maximum temperature, minimum temperature, ground temperature and sun radiation time) and ozone values were used for statistical analysis. The meteorological variables and ozone values were analyzed using both multiple linear regression and principal component methods. Data for the period 1999-2004 were analyzed jointly using both methods. For all periods, the temperature-related variables were highly correlated with each other, but all were negatively correlated with relative humidity. Multiple regression analysis was used to fit the ozone values using the meteorological variables as predictors. A variable selection method based on high loadings of varimax-rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model. In 1999, 2001 and 2002 one of the meteorological variables was weakly but predominantly influenced by the ozone concentrations. However, the model did not predict that the meteorological variables for the year 2000 were not predominantly influenced by the ozone concentrations, which points to variation in sun radiation. This could be due to other factors that were not explicitly considered in this study.
KAYAALP, G. Tamer
1999-01-01
In animal breeding, when there is a relationship between the dependent (Y) and independent (X) variables, regression analysis is applied. But when one of the variables has one or more missing observations, regression analysis cannot be applied. This paper illustrates and discusses a regression analysis in which the independent variable (X) has a missing observation.
International Nuclear Information System (INIS)
Objective: To analyze the correlations between the liver lipid level determined in vivo by 3.0 T 1H-MRS and its influencing factors using multiple linear stepwise regression. Methods: A prospective study of liver 1H-MRS was performed with a 3.0 T system and eight-channel torso phased-array coils using the PRESS sequence. Forty-four volunteers were enrolled in this study. Liver spectra were collected with a TR of 1500 ms, TE of 30 ms, volume of interest of 2 cm×2 cm×2 cm, and NSA of 64. The acquired raw proton MRS data were processed using the software program SAGE. For each MRS measurement, using water as the internal reference, the amplitude of the lipid signal was normalized to the sum of the signals from lipid and water to obtain the percentage of lipid within the liver. Descriptive statistics of height, weight, age, BMI, line width and water suppression were recorded, and Pearson analysis was applied to test their relationships. Multiple linear stepwise regression was used to build the statistical model for the prediction of liver lipid content. Results: Age (39.1±12.6) years, body weight (64.4±10.4) kg, BMI (23.3±3.1) kg/m2, line width (18.9±4.4) and water suppression (90.7±6.5)% had significant correlations with liver lipid content (0.00 to 0.96%, median 0.02%); the r values were 0.11, 0.44, 0.40, 0.52 and -0.73, respectively (P<0.05). However, only age, BMI, line width and water suppression entered the multiple linear regression equation. The liver lipid content prediction equation was: Y = 1.395 - (0.021×water suppression) + (0.022×BMI) + (0.014×line width) - (0.004×age), with a coefficient of determination of 0.613 and a corrected coefficient of determination of 0.590. Conclusion: The regression model fitted well, since the variables age, BMI, line width and water suppression can explain about 60% of the variation in liver lipid content. (authors)
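The stepwise-selection step used above can be sketched generically as forward selection by partial-F p-value. The synthetic data below only mimic the study's setting (water suppression and BMI carry signal, age and line width are noise); all values, effect sizes and variable names are illustrative, not the study's data.

```python
import numpy as np
from scipy import stats

def forward_stepwise(X, y, names, alpha=0.05):
    """Forward stepwise selection: in each round, add the candidate
    predictor with the smallest partial-F p-value, while it is < alpha."""
    n = len(y)
    selected, remaining = [], list(range(X.shape[1]))

    def rss(cols):
        A = np.column_stack([np.ones(n), X[:, cols]])
        coef = np.linalg.lstsq(A, y, rcond=None)[0]
        return np.sum((y - A @ coef) ** 2)

    while remaining:
        rss0 = rss(selected)
        best_j, best_p = None, 1.0
        for j in remaining:
            rss1 = rss(selected + [j])
            df2 = n - len(selected) - 2      # residual df with candidate added
            F = (rss0 - rss1) / (rss1 / df2)
            p = stats.f.sf(F, 1, df2)
            if p < best_p:
                best_j, best_p = j, p
        if best_p >= alpha:
            break
        selected.append(best_j)
        remaining.remove(best_j)
    return [names[j] for j in selected]

rng = np.random.default_rng(2)
n = 44
ws = rng.normal(90, 6, n)      # water suppression (%)
bmi = rng.normal(23, 3, n)
age = rng.normal(39, 12, n)
lw = rng.normal(19, 4, n)      # line width
lipid = 1.4 - 0.02 * ws + 0.02 * bmi + rng.normal(0, 0.05, n)

X = np.column_stack([ws, bmi, age, lw])
chosen = forward_stepwise(X, lipid, ["water_supp", "BMI", "age", "line_width"])
print("entered the model:", chosen)
```

Real stepwise procedures often combine forward entry with backward removal; this sketch shows only the entry criterion.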
Harrell, Jr., Frank E.
2015-01-01
This highly anticipated second edition features new chapters and sections, 225 new references, and comprehensive R software. In keeping with the previous edition, this book is about the art and science of data analysis and predictive modeling, which entails choosing and using multiple tools. Instead of presenting isolated techniques, this text emphasizes problem solving strategies that address the many issues arising when developing multivariable models using real data and not standard textbook examples. It includes imputation methods for dealing with missing data effectively, methods for fitting nonlinear relationships and for making the estimation of transformations a formal part of the modeling process, methods for dealing with "too many variables to analyze and not enough observations," and powerful model validation techniques based on the bootstrap. The reader will gain a keen understanding of predictive accuracy, and the harm of categorizing continuous predictors or outcomes. This text realistically...
Determination of ventilatory threshold through quadratic regression analysis.
Gregg, Joey S; Wyatt, Frank B; Kilgore, J Lon
2010-09-01
Ventilatory threshold (VT) has been used to measure physiological occurrences in athletes through gas-analysis models with limited accuracy. The purpose of this study is to establish a mathematical model to more accurately detect the ventilatory threshold using the ventilatory equivalent of carbon dioxide (VE/VCO2) and the ventilatory equivalent of oxygen (VE/VO2). The methodology is primarily a mathematical analysis of data. The raw data used were archived from the cardiorespiratory laboratory in the Department of Kinesiology at Midwestern State University. Procedures for archived data collection included breath-by-breath gas analysis averaged every 20 seconds (ParVoMedics, TrueMax 2400). A ramp protocol on a Velotron bicycle ergometer was used, with work increased by 25 W/min beginning at 150 W, until volitional fatigue. The subjects consisted of 27 healthy, trained cyclists ranging in age from 18 to 50 years. All subjects signed a university-approved informed consent before testing. Graphic scatterplots and statistical regression analyses were performed to establish the crossover and subsequent dissociation of VE/VO2 and VE/VCO2. Polynomial trend lines along the scatterplots for VE/VO2 and VE/VCO2 were used because of the high correlation coefficient and coefficient of determination. The equations derived from the scatterplots and trend lines were quadratic in nature because they have a polynomial degree of 2. A graphing calculator in conjunction with a spreadsheet was used to find the exact point of intersection of the 2 trend lines. After the quadratic regression analysis, the exact point of VE/VO2 and VE/VCO2 crossover was established as the VT. This application will allow investigators to more accurately determine the VT in subsequent research. PMID:20802290
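The crossover-detection procedure described above (quadratic trend lines fitted to VE/VO2 and VE/VCO2, then intersected algebraically) can be sketched as follows; the data are synthetic, constructed so the true crossover lies at x = 6, and are not the study's measurements:

```python
import numpy as np

# Synthetic ventilatory-equivalent curves against workload x; the two
# quadratics are constructed to intersect at exactly x = 6.
x = np.linspace(0, 10, 21)
ve_vo2  = 0.3 * x**2 - 0.7 * x + 23.6   # VE/VO2: rises steeply after threshold
ve_vco2 = 0.1 * x**2 + 0.1 * x + 26.0   # VE/VCO2: rises more slowly

p1 = np.polyfit(x, ve_vo2, 2)    # quadratic trend line for VE/VO2
p2 = np.polyfit(x, ve_vco2, 2)   # quadratic trend line for VE/VCO2

# Crossover: real roots of the difference polynomial within the measured range
roots = np.roots(p1 - p2)
crossover = [r.real for r in roots if abs(r.imag) < 1e-9 and 0 <= r.real <= 10]
```

Taking roots of the difference polynomial replaces the paper's graphing-calculator step with an exact algebraic solution; only roots inside the measured workload range are kept as candidate thresholds.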
dos Santos, T. S.; Mendes, D.; Torres, R. R.
2015-08-01
Several studies have been devoted to dynamic and statistical downscaling for analysis of both climate variability and climate change. This paper introduces an application of artificial neural networks (ANN) and multiple linear regression (MLR) by principal components to estimate rainfall in South America. This method is proposed for downscaling monthly precipitation time series over South America for three regions: the Amazon, Northeastern Brazil and the La Plata Basin, which is one of the regions of the planet that will be most affected by the climate change projected for the end of the 21st century. The downscaling models were developed and validated using CMIP5 model output and observed monthly precipitation. We used GCM experiments for the 20th century (RCP Historical; 1970-1999) and two scenarios (RCP 2.6 and 8.5; 2070-2100). The model test results indicate that the ANN significantly outperforms the MLR downscaling of monthly precipitation variability.
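A minimal sketch of the MLR-by-principal-components step described above: reduce a gridded predictor field to its leading principal components, then regress a precipitation series on those components. The low-rank synthetic data below are an assumption standing in for CMIP5 output:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 120, 15                               # months x gridded predictor variables

# Synthetic predictors sharing 3 latent large-scale modes (not real GCM data)
F = rng.normal(size=(n, 3))                  # latent modes
W = rng.normal(size=(3, p))
X = F @ W + 0.1 * rng.normal(size=(n, p))
y = 2.0 * F[:, 0] - 1.0 * F[:, 1] + 0.1 * rng.normal(size=n)  # "precipitation"

# Principal components via SVD of the centered predictor matrix
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 3
scores = Xc @ Vt[:k].T                       # PC scores used as regression inputs

# Multiple linear regression on the PC scores (with intercept)
A = np.column_stack([np.ones(n), scores])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```

Regressing on a few PC scores rather than all grid points avoids the collinearity of neighbouring grid cells, which is the usual motivation for this two-step design.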
Chaloulakou, Archontoula; Grivas, Georgios; Spyrellis, Nikolas
2003-10-01
Particulate atmospheric pollution in urban areas is considered to have significant impact on human health. Therefore, the ability to make accurate predictions of ambient particulate concentrations is important for improving public awareness and air quality management. This study examines the possibility of using neural network methods as tools for forecasting daily average concentrations of particulate matter with aerodynamic diameter <10 μm (PM10), providing an alternative to the statistical models widely used to date. Based on a data inventory from a fixed central site in Athens, Greece, spanning a two-year period, and using mainly meteorological variables as inputs, neural network models and multiple linear regression models were developed and evaluated. The comparison statistics used indicate that the neural network approach has an edge over the regression models, expressed both in terms of prediction error (root mean square error values lower by 8.2-9.4%) and of episodic prediction ability (false alarm rate values lower by 7-13%). The results demonstrate that artificial neural networks (ANNs), if properly trained and formed, can provide adequate solutions to particulate pollution prognostic demands. PMID:14604327
DEFF Research Database (Denmark)
Larsen, Ulrik; Pierobon, Leonardo
2014-01-01
Much attention is focused on increasing energy efficiency to decrease fuel costs and CO2 emissions throughout industrial sectors. The ORC (organic Rankine cycle) is a relatively simple but efficient process that can be used for this purpose by converting low- and medium-temperature waste heat to power. In this study we propose four linear regression models to predict the maximum obtainable thermal efficiency for simple and recuperated ORCs. A previously derived methodology is able to determine the maximum thermal efficiency among many combinations of fluids and processes, given the boundary conditions of the process. Hundreds of optimised cases with varied design parameters are used as observations in four multiple regression analyses. We analyse the model assumptions, prediction abilities and extrapolations, and compare the results with recent studies in the literature. The models are in agreement with the literature, and they present an opportunity for accurate prediction of the potential of an ORC to convert heat sources with temperatures from 80 to 360 °C, without detailed knowledge or need for simulation of the process. © 2013 Elsevier Ltd. All rights reserved
Directory of Open Access Journals (Sweden)
Avval Zhila Mohajeri
2015-01-01
Full Text Available This paper deals with developing a linear quantitative structure-activity relationship (QSAR) model for predicting the RSK inhibition activity of some new compounds. A dataset consisting of 62 pyrazino [1,2-?] indole, diazepino [1,2-?] indole, and imidazole derivatives with known inhibitory activities was used. The multiple linear regression (MLR) technique, combined with the stepwise (SW) and the genetic algorithm (GA) methods as variable selection tools, was employed. To further check the stability, robustness and predictability of the proposed models, internal and external validation techniques were used. Comparison of the results obtained indicates that the GA-MLR model is superior to the SW-MLR model and that it is applicable for designing novel RSK inhibitors.
Estimation of Saturation Percentage of Soil Using Multiple Regression, ANN, and ANFIS Techniques
Directory of Open Access Journals (Sweden)
Khaled Ahmad Aali
2009-07-01
Full Text Available The saturation percentage (SP) of soils is an important index in hydrological studies. In this paper, artificial neural networks (ANNs), multiple regression (MR), and an adaptive neural-based fuzzy inference system (ANFIS) were used for estimation of the saturation percentage of soils collected from the Boukan region in the northwestern part of Iran. Percent clay, silt, sand and organic carbon (OC) were used to develop the applied methods. In addition, the contribution of each input variable to the estimation of the SP index was assessed. Two performance functions, namely root mean square error (RMSE) and the determination coefficient (R2), were used to evaluate the adequacy of the models. The ANFIS method was found to be superior to the other methods. It is therefore proposed that the ANFIS model can be used for reasonable estimation of the SP values of soils.
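The two performance functions named above can be sketched as plain NumPy helpers; the observation and prediction values below are illustrative only, not the study's soil data:

```python
import numpy as np

def rmse(y_obs, y_pred):
    """Root mean square error between observations and predictions."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_obs - y_pred) ** 2)))

def r_squared(y_obs, y_pred):
    """Determination coefficient: 1 - SS_residual / SS_total."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return float(1 - ss_res / ss_tot)

# Illustrative SP observations and model predictions (hypothetical values)
obs  = [40.0, 55.0, 62.0, 48.0, 70.0]
pred = [42.0, 53.0, 60.0, 50.0, 69.0]
```

Model comparisons of the kind reported above then reduce to selecting the method with the lowest RMSE and highest R2 on held-out data.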
Logistic regression analysis on the risk factors of radiation pneumonitis
International Nuclear Information System (INIS)
Objective: To identify the risk factors of radiation pneumonitis (RP). Methods: A retrospective study was conducted on 101 patients with radiation pneumonitis using SPSS 8.0 software. Factors evaluated included: gender, age, pathology, clinical stage, irradiation dose, irradiation field size, history of smoking, cardiovascular disease, bronchitis, surgery, chemotherapy, lung infection, atelectasis, obstructive infection and pleural effusion. Univariate analysis was performed using the Chi-square test and multivariate analysis was performed using a logistic regression model. Results: Univariate analysis identified 10 significant factors, including pulmonary infection, atelectasis, obstructive infection, cardiovascular disease, bronchitis, chemotherapy, irradiation dose, number of days of radiation and irradiation field size, as leading to radiation pneumonitis. Multivariate analysis showed that 9 factors: pulmonary infection, obstructive infection, atelectasis, pleural effusion, bronchitis, cardiovascular disease, chemotherapy, irradiation dose, and irradiation field size, were independent factors. Conclusion: Comprehensive consideration of accompanying diseases, chemotherapy, dose, field size, etc. during the planning of radiotherapy can minimize the possibility of developing radiation pneumonitis
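A hedged sketch of the multivariate step: the study used SPSS, but the same kind of logistic model can be fitted by plain gradient descent. The patient data below are simulated, and the two factors shown ("dose", "infection") are stand-ins for the study's variables:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
dose = rng.normal(60, 10, n)              # hypothetical irradiation-dose scale
infection = rng.integers(0, 2, n)         # pulmonary infection yes/no

# Assumed true model used only to simulate pneumonitis outcomes
logit_true = -6.0 + 0.08 * dose + 0.9 * infection
y = (rng.random(n) < 1 / (1 + np.exp(-logit_true))).astype(float)

# Standardize the continuous predictor for well-conditioned gradient descent
dose_z = (dose - dose.mean()) / dose.std()
X = np.column_stack([np.ones(n), dose_z, infection])

beta = np.zeros(3)
for _ in range(5000):                     # gradient ascent on the log-likelihood
    p = 1 / (1 + np.exp(-X @ beta))
    beta += 0.5 * X.T @ (y - p) / n

odds_ratios = np.exp(beta[1:])            # ORs for dose (per SD) and infection
```

Exponentiated coefficients give the odds ratios clinicians read from such models; both simulated factors come out as risk factors (OR > 1), mirroring the structure of the reported analysis.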
A Visual Analytics Approach for Correlation, Classification, and Regression Analysis
Energy Technology Data Exchange (ETDEWEB)
Steed, Chad A [ORNL; SwanII, J. Edward [Mississippi State University (MSU); Fitzpatrick, Patrick J. [Mississippi State University (MSU); Jankun-Kelly, T.J. [Mississippi State University (MSU)
2013-01-01
New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel-coordinates-based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. This chapter provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.
A Visual Analytics Approach for Correlation, Classification, and Regression Analysis
Energy Technology Data Exchange (ETDEWEB)
Steed, Chad A [ORNL; SwanII, J. Edward [Mississippi State University (MSU); Fitzpatrick, Patrick J. [Mississippi State University (MSU); Jankun-Kelly, T.J. [Mississippi State University (MSU)
2012-02-01
New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel-coordinates-based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.
A Quantile Regression Analysis of Micro-lending's Poverty Impact
Directory of Open Access Journals (Sweden)
Stephen W. Polk
2012-07-01
Full Text Available This paper aims to evaluate the impact of a microlending program on ameliorating measured poverty within its client population, with the aim of improving that impact. We analyze over 18,000 women microfinance clients of the Negros Women for Tomorrow Foundation (NWTF), a database using the Progress out of Poverty Index (PPI) Scorecard as a measure of poverty. Analysis using both OLS and quantile multivariate regression models shows how observable borrower attributes affect the ability of clients to reduce their measured poverty. Loan size, duration, and the economic activity supported all have strongly identifiable effects. Moreover, the estimates suggest which among the poor are receiving the greatest effective help from the program. The results offer specific advice to the NWTF and other micro-lenders: impact is greatest with fewer, larger loans in particular economic sectors (sari-sari, service and trade) but requires patience, as each additional year increases the client's average change in poverty score.
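The quantile-regression idea used above can be sketched by minimizing the pinball (check) loss with subgradient descent; the loan and outcome data below are synthetic, not the NWTF database, and the heteroscedastic effect is an assumption chosen to make the median and upper-quantile slopes differ:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
loan = rng.uniform(0, 10, n)                 # hypothetical loan-size variable
# Outcome whose spread grows with loan size, so quantile slopes differ
change = 1.0 + 0.5 * loan + rng.normal(scale=0.2 + 0.3 * loan, size=n)

def fit_quantile(x, y, tau, lr=0.05, iters=20000):
    """Linear quantile regression by subgradient descent on the pinball loss."""
    a, b = 0.0, 0.0                          # intercept, slope
    for _ in range(iters):
        r = y - (a + b * x)
        g = np.where(r > 0, tau, tau - 1.0)  # subgradient of the check function
        a += lr * g.mean()
        b += lr * (g * x).mean()
    return a, b

a50, b50 = fit_quantile(loan, change, 0.5)   # conditional median
a90, b90 = fit_quantile(loan, change, 0.9)   # upper quantile
```

The point of the quantile approach, as in the paper, is that the loan-size effect at the 90th percentile can differ from the median effect, something a single OLS slope cannot show.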
Low-Cost Housing in Sabah, Malaysia: A Regression Analysis
Directory of Open Access Journals (Sweden)
Dullah Mulok
2009-02-01
Full Text Available Low-cost housing plays a vital role in the development process, especially in providing accommodation to those who are less fortunate and the lower income group. This effort is also a step towards overcoming the squatter problem, which could cripple the competitive drive of the local community, especially in the state of Sabah, Malaysia. This article attempts to examine the factors influencing low-cost housing in Sabah, namely the government's budget (allocation for low-cost housing projects) and Sabah's total population. At the same time, this study attempts to show the implications of the development and economic crises which occurred during the period 1971 to 2000 for the provision of low-cost houses in Sabah. Empirical analyses were conducted using the multiple linear regression method, stepwise regression and the dummy variable approach to demonstrate the link. The empirical results show that the government's budget for low-cost housing is the main contributor to the provision of low-cost housing in Sabah. The results also suggest that economic growth, namely Gross Domestic Product (GDP), did not have a significant effect on low-cost housing in Sabah. However, almost all major crises that have beset Malaysia's economy had a significant and consistent effect on low-cost housing in Sabah, especially the financial crisis which occurred in mid-1997.
Deconinck, E; Zhang, M H; Petitet, F; Dubus, E; Ijjaali, I; Coomans, D; Vander Heyden, Y
2008-02-18
The use of some unconventional non-linear modeling techniques, i.e. classification and regression trees and multivariate adaptive regression splines-based methods, was explored to model the blood-brain barrier (BBB) passage of drugs and drug-like molecules. The data set contains BBB passage values for 299 structural and pharmacological diverse drugs, originating from a structured knowledge-based database. Models were built using boosted regression trees (BRT) and multivariate adaptive regression splines (MARS), as well as their respective combinations with stepwise multiple linear regression (MLR) and partial least squares (PLS) regression in two-step approaches. The best models were obtained using combinations of MARS with either stepwise MLR or PLS. It could be concluded that the use of combinations of a linear with a non-linear modeling technique results in some improved properties compared to the individual linear and non-linear models and that, when the use of such a combination is appropriate, combinations using MARS as non-linear technique should be preferred over those with BRT, due to some serious drawbacks of the BRT approaches. PMID:18243869
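The MARS-with-MLR combination described above can be caricatured in a few lines: generate hinge basis functions max(0, x - t) at candidate knots, then fit them by ordinary least squares. The data, knot grid and kink location below are illustrative assumptions, not the BBB dataset:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(-3, 3, 300)
# Kinked "activity" truth: flat below 0, linear above (plus noise)
y = np.where(x > 0, 2.0 * x, 0.0) + rng.normal(scale=0.2, size=300)

# MARS-style hinge basis at a grid of candidate knots, fitted linearly
knots = np.linspace(-2, 2, 9)
H = np.column_stack([np.ones_like(x)] + [np.maximum(0.0, x - t) for t in knots])
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

y_hat = H @ beta
r2 = 1 - ((y - y_hat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```

The hinge terms let an otherwise linear least-squares fit capture the non-linearity, which is the essence of pairing MARS-type basis expansion with MLR or PLS as in the abstract (real MARS also prunes basis functions, omitted here for brevity).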
Accounting for data errors discovered from an audit in multiple linear regression.
Shepherd, Bryan E; Yu, Chang
2011-09-01
A data coordinating team performed onsite audits and discovered discrepancies between the data sent to the coordinating center and that recorded at sites. We present statistical methods for incorporating audit results into analyses. This can be thought of as a measurement error problem, where the distribution of errors is a mixture with a point mass at 0. If the error rate is nonzero, then even if the mean of the discrepancy between the reported and correct values of a predictor is 0, naive estimates of the association between two continuous variables will be biased. We consider scenarios where there are (1) errors in the predictor, (2) errors in the outcome, and (3) possibly correlated errors in the predictor and outcome. We show how to incorporate the error rate and magnitude, estimated from a random subset (the audited records), to compute unbiased estimates of association and proper confidence intervals. We then extend these results to multiple linear regression where multiple covariates may be incorrect in the database and the rate and magnitude of the errors may depend on study site. We study the finite sample properties of our estimators using simulations, discuss some practical considerations, and illustrate our methods with data from 2815 HIV-infected patients in Latin America, of whom 234 had their data audited using a sequential auditing plan. PMID:21281274
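The bias described above can be reproduced in a small simulation: predictor errors drawn from a mixture with a point mass at 0 (most records correct) have mean 0, yet still attenuate the naive slope. The error rate, error scale and true slope below are assumptions, not the study's audited values:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50000
x_true = rng.normal(size=n)
y = 2.0 * x_true + rng.normal(scale=0.5, size=n)   # assumed true slope of 2.0

# 20% of database records hold an erroneous predictor value; the errors
# average to 0, so the mean discrepancy between database and truth is 0
err = np.where(rng.random(n) < 0.2, rng.normal(scale=1.5, size=n), 0.0)
x_db = x_true + err

def slope(x, y):
    """OLS slope of y on x."""
    return float(np.cov(x, y)[0, 1] / np.var(x, ddof=1))

naive = slope(x_db, y)    # biased toward 0 despite zero-mean errors
gold  = slope(x_true, y)  # close to the assumed true slope
```

The attenuation factor is var(x) / (var(x) + rate × error-variance), here 1/1.45, which is exactly the phenomenon the paper's audit-based correction is designed to undo.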
Directory of Open Access Journals (Sweden)
T.A. Renaldy
2011-01-01
Full Text Available Multiple regression and artificial neural network models were developed for the prediction of particulate matter. The performance of the multiple regression models was assessed. For the development of the neural network models, a feed-forward network with a back-propagation learning algorithm was used for training. The performance of the neural network was determined in terms of the correlation coefficient (R) and the Mean Square Error (MSE). The optimum number of hidden neurons was found by seeking the lowest value of MSE and the highest value of R. The results indicated that the network can predict particulate concentrations better than the multiple regression models.
International Nuclear Information System (INIS)
In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and about 200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to similar but more interpretable results for most audiences. 12 refs., 8 figs., 10 tabs
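The elementary CART step, a single variance-reducing split found by exhaustive search, can be contrasted with a straight-line fit on synthetic threshold-type data (the kind of relationship where, as the abstract notes, trees outperform untransformed regression):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0, 1, 400)
# Threshold-type response, e.g. a construction characteristic that flips
# exposure from low to high past some value (synthetic, true cut at 0.6)
y = np.where(x > 0.6, 10.0, 1.0) + rng.normal(scale=0.5, size=400)

def best_split(x, y):
    """Return (split value, SSE) minimizing summed within-node variance."""
    best = (None, np.inf)
    for t in np.unique(x)[1:]:
        left, right = y[x < t], y[x >= t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[1]:
            best = (t, sse)
    return best

t, sse_tree = best_split(x, y)
sse_line = float(((y - np.poly1d(np.polyfit(x, y, 1))(x)) ** 2).sum())
```

A full CART recursively applies this split search to each child node and then prunes; even the single split above recovers the true threshold and beats the linear fit on step-like data, no transformation required.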
Regression Analysis of Variables Describing Poultry Meat Supply in European Countries
Directory of Open Access Journals (Sweden)
Simonič Miro
2012-11-01
Full Text Available In this paper, based on the analysis of official FAOSTAT and EUROSTAT data on poultry meat for 38 European countries for the years 2007 and 2009, two hypotheses were examined. First, considering four clustering variables on poultry meat, i.e. production, export and import in kg/capita, as well as the producer price in US $/t, and using descriptive exploratory and cluster analysis, the hypothesis that clusters of countries may be recognized was confirmed. As a result, six clusters of similar countries were distinguished. Second, based on multiple regression analysis, this paper shows that there is a statistically significant dependence of poultry meat production on the export and import of that kind of meat, all measured in kg/capita. There is also a high correlation between production, as the dependent variable, and each of the two independent variables.
Framing a Nuclear Emergency Plan using Qualitative Regression Analysis
International Nuclear Information System (INIS)
Following the safety maintenance issues arising from the post-Fukushima disaster, and given the lack of literature on disaster scenario investigation and theory development, this study addresses the initial difficulty of defining the research purpose, which relates to the content and problem setting of the phenomenon. The research design therefore follows an inductive approach, in which primary findings and written reports are interpreted and coded qualitatively. These data are classified inductively through thematic analysis to develop a conceptual framework related to several theoretical lenses. Moreover, framing the expected emergency plan framework as improvised business process models involves abstracting and simplifying large amounts of unstructured data. The structural methods of Qualitative Regression Analysis (QRA) and the Work System snapshot were applied to shape the data into the proposed model conceptualization through rigorous analyses. These methods were helpful in organising and summarizing the snapshot into an 'as-is' work system that is recommended as a 'to-be' work system for business process modelling. We conclude that these methods are useful for developing a comprehensive and structured research framework for future enhancement in business process simulation. (author)
Regression Analysis between Properties of Subgrade Lateritic Soil
Directory of Open Access Journals (Sweden)
Afeez Adefemi BELLO
2012-12-01
Full Text Available The results of a study that used regression analysis to investigate correlations between index properties and the California Bearing Ratio (CBR) of some lateritic soils within Osogbo town in South Western Nigeria are presented. For an appreciable conclusion to be established, lateritic soil samples were collected from eight (8) different borrow pits within the town, and various laboratory tests, including Atterberg limits, gradation analysis, California Bearing Ratio, compaction and specific gravity, were performed on the soil samples. Various linear relationships between index properties and the CBR of the samples were investigated, and predictive equations estimating CBR from the experimental index values were developed. The findings indicate that good correlation exists between the two groups (i.e. index properties and CBR values). However, the CBR values computed from the models are only to be used for preliminary design, in view of simplicity and economy, and are not acceptable alternatives to laboratory testing because of the anisotropic nature of lateritic soil and its heterogeneity.
International Nuclear Information System (INIS)
Risk associated with power generation must be identified to make intelligent choices between alternative power technologies. Radionuclide air stack emissions for a single coal plant and a single nuclear plant are used to compute the single-plant leukemia incidence risk and the total industry leukemia incidence risk. Leukemia incidence is the response variable as a function of radionuclide bone dose for the six proposed dose-response curves considered. During normal operation a coal plant has higher radionuclide emissions than a nuclear plant, and the coal industry has a higher leukemia incidence risk than the nuclear industry, unless a nuclear accident occurs. Varying the nuclear accident size allows quantification of the impact of accidents on the total industry leukemia incidence risk comparison. The leukemia incidence risk is quantified as the number of accidents of a given size required for the nuclear industry leukemia incidence risk to equal the coal industry leukemia incidence risk. The general linear model is used to develop equations that relate the accident frequency required for equal industry risks to the magnitude of the nuclear emission. Exploratory data analysis revealed that the relationship between the natural log of accident number and the natural log of accident size is linear. (Author)
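The reported log-log linearity can be sketched as follows; the data below are generated from an assumed power law (coefficients invented for illustration), not the study's emissions data:

```python
import numpy as np

rng = np.random.default_rng(5)

# Assumed power-law relationship N = c * S^b with c = 50, b = -0.8,
# plus multiplicative lognormal noise (all illustrative values)
size = np.logspace(0, 4, 30)                        # accident size / emission magnitude
number = 50.0 * size ** -0.8 * np.exp(rng.normal(scale=0.1, size=30))

# Linear regression in log-log space recovers the power-law parameters
b, ln_c = np.polyfit(np.log(size), np.log(number), 1)
```

Because ln(N) = ln(c) + b·ln(S), an ordinary straight-line fit on the logged variables estimates both the exponent and the prefactor, which is presumably why the exploratory analysis above found the log-log relationship linear.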
Adaptive regression analysis: theory and applications in econometrics
Directory of Open Access Journals (Sweden)
J. García Pérez
2003-01-01
Full Text Available In this work we (a) discuss some theoretical and computational difficulties of regression analysis of dependences describing the behaviour of heterogeneous systems, (b) offer a set of new techniques adapted to regression analysis of heterogeneous dependences, and (c) demonstrate the advantages of applying these new techniques in econometrics.
Parmenter, Brett A; Testa, S Marc; Schretlen, David J; Weinstock-Guttman, Bianca; Benedict, Ralph H B
2010-01-01
The Minimal Assessment of Cognitive Function in Multiple Sclerosis (MACFIMS) is a consensus neuropsychological battery with established reliability and validity. One of the difficulties in implementing the MACFIMS in clinical settings is the reliance on manualized norms from disparate sources. In this study, we derived regression-based norms for the MACFIMS, using a unique data set to control for standard demographic variables (i.e., age, age², sex, education). Multiple sclerosis (MS) patients (n = 395) and healthy volunteers (n = 100) did not differ in age, level of education, sex, or race. Multiple regression analyses were conducted on the performance of the healthy adults, and the resulting models were used to predict MS performance on the MACFIMS battery. This regression-based approach identified higher rates of impairment than manualized norms for many of the MACFIMS measures. These findings suggest that there are advantages to developing new norms from a single sample using the regression-based approach. We conclude that the regression-based norms presented here provide a valid alternative for identifying cognitive impairment as measured by the MACFIMS. PMID:19796441
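The regression-based norming approach can be sketched as follows. Only the sample sizes (100 controls, 395 patients) and the predictor set (age, age², sex, education) are taken from the abstract; the demographic coefficients, score scale and the 4-point patient deficit are simulation assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)
n_hc, n_pt = 100, 395

# Simulated healthy controls: score depends on demographics plus noise
age = rng.uniform(20, 60, n_hc)
sex = rng.integers(0, 2, n_hc)
edu = rng.uniform(10, 20, n_hc)
score_hc = 60 - 0.3 * age + 2.0 * sex + 1.0 * edu + rng.normal(scale=3, size=n_hc)

def design(age, sex, edu):
    """Demographic design matrix: intercept, age, age^2, sex, education."""
    return np.column_stack([np.ones(len(age)), age, age ** 2, sex, edu])

# Fit the norms on controls only; residual SD sets the z-score scale
beta, *_ = np.linalg.lstsq(design(age, sex, edu), score_hc, rcond=None)
resid_sd = np.std(score_hc - design(age, sex, edu) @ beta, ddof=5)

# Simulated patients score 4 points below their demographic expectation
age_p = rng.uniform(20, 60, n_pt)
sex_p = rng.integers(0, 2, n_pt)
edu_p = rng.uniform(10, 20, n_pt)
score_p = (60 - 0.3 * age_p + 2.0 * sex_p + 1.0 * edu_p - 4.0
           + rng.normal(scale=3, size=n_pt))

z = (score_p - design(age_p, sex_p, edu_p) @ beta) / resid_sd
impaired = float(np.mean(z < -1.5))       # impairment rate at a -1.5 SD cutoff
```

Each patient is compared with what someone of their own age, sex and education would be expected to score, which is what lets a single normative sample replace manualized norms from disparate sources.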
DEFF Research Database (Denmark)
Østergaard, SØren; Ettema, Jehan Frans
Multiple regression and model building with mediator variables were used to avoid double counting when economic values are estimated from data simulated with herd simulation modeling (the SimHerd model). The simulated incidence of metritis was analyzed statistically as the independent variable, while the traits representing the direct effects of metritis on yield, fertility and occurrence of other diseases were used as mediator variables. The economic value of metritis was estimated to be €78 per 100 cow-years for each 1% increase of metritis in the period of 1-100 days in milk in multiparous cows. The merit of this approach was demonstrated, since the economic value of metritis was estimated to be 81% higher when no mediator variables were included in the multiple regression analysis
Hardt Jochen; Herke Max; Leonhart Rainer
2012-01-01
Abstract Background Multiple imputation is becoming increasingly popular. Theoretical considerations as well as simulation studies have shown that the inclusion of auxiliary variables is generally of benefit. Methods A simulation study of a linear regression with a response Y and two predictors X1 and X2 was performed on data with n = 50, 100 and 200 using complete cases or multiple imputation with 0, 10, 20, 40 and 80 auxiliary variables. Mechanisms of missingness were either 100% MCAR or 50...
Parsons, Vickie S.
2009-01-01
The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center (NESC) on September 20, 2005. The NESC team performed an independent review of the regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.
International Nuclear Information System (INIS)
Several MRI features of supratentorial astrocytomas are associated with high histologic grade at statistically significant p values. We sought to apply this information prospectively to a group of astrocytomas in the prediction of tumor grade. We used 10 MRI features of fibrillary astrocytomas from 52 patient studies to develop neural network and multiple linear regression models for practical use in predicting tumor grade. The models were tested prospectively on MR images from 29 patient studies. The performance of the models was compared against that of a radiologist. Neural network accuracy was 61% in distinguishing between low- and high-grade tumors. Multiple linear regression achieved an accuracy of 59%. Assessment of the images by a radiologist yielded 57% accuracy. We conclude that while certain MRI parameters may be statistically related to astrocytoma histologic grade, neural network and linear regression models cannot reliably use them to predict tumor grade. (orig.)
Le, Huy; Marcus, Justin
2012-01-01
This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…
Supply and Demand of Jeneberang River Aggregate Using Multiple Regression Model
Directory of Open Access Journals (Sweden)
Aryanti Virtanti Anas
2013-07-01
Full Text Available Aggregate plays an important role in developing infrastructure because it is the major raw material used in construction such as roads, hospitals, schools, factories, homes and other buildings. Sand and gravel are essential sources of aggregate and are often exploited from the active channels of river systems. The Jeneberang River is one of the main rivers in South Sulawesi Province, located in Gowa Regency, and is mined in order to fulfill the aggregate demand of Gowa Regency and Makassar City. Supply and demand are economic phenomena affected by several factors, so this research aims to (1) determine the factors influencing aggregate supply and demand, and (2) develop supply and demand models. Data were obtained from the Central Bureau of Statistics of Gowa Regency and Makassar City, and the Department of Mines and Energy, Gowa Regency, for eleven years (2001-2011). In this research, aggregate supply and demand were modeled using the multiple regression method. First, the relationship between supply and its influencing factors was established, followed by demand and its factors. Second, the supply and demand models were estimated using SPSS. The results showed that the models can be used to accurately estimate the supply and demand of aggregate using the established relationships among the influencing factors. Supply of aggregate was affected by several factors including price, number of trucks, number of mining companies and mining permit area, while price, GDP, income per capita, length of road, number of buildings and economic growth had a high influence on the demand rate.
Dental malocclusion and body posture in young subjects: A multiple regression study
Scientific Electronic Library Online (English)
Giuseppe, Perinetti; Luca, Contardo; Armando, Silvestrini-Biavati; Lucia, Perdoni; Attilio, Castaldo.
Full Text Available OBJECTIVES: Controversial results have been reported on potential correlations between the stomatognathic system and body posture. We investigated whether malocclusal traits correlate with body posture alterations in young subjects to determine possible clinical applications. METHODS: A total of 122 [...] subjects, including 86 males and 36 females (age range of 10.8-16.3 years), were enrolled. All subjects tested negative for temporomandibular disorders or other conditions affecting the stomatognathic system, except malocclusion. A dental occlusion assessment included phase of dentition, molar class, overjet, overbite, anterior and posterior crossbite, scissorbite, mandibular crowding and dental midline deviation. In addition, body posture was recorded through static posturography using a vertical force platform. Recordings were performed under two conditions, namely, i) mandibular rest position (RP) and ii) dental intercuspidal position (ICP). Posturographic parameters included the projected sway area and velocity and the antero-posterior and right-left load differences. Multiple regression models were run for both recording conditions to evaluate associations between each malocclusal trait and posturographic parameters. RESULTS: All of the posturographic parameters had large variability and were very similar between the two recording conditions. Moreover, a limited number of weakly significant correlations were observed, mainly for overbite and dentition phase, when using multivariate models. CONCLUSION: Our current findings, particularly with regard to the use of posturography as a diagnostic aid for subjects affected by dental malocclusion, do not support the existence of clinically relevant correlations between malocclusal traits and body posture.
Dental malocclusion and body posture in young subjects: A multiple regression study
Directory of Open Access Journals (Sweden)
Giuseppe Perinetti
2010-01-01
Full Text Available OBJECTIVES: Controversial results have been reported on potential correlations between the stomatognathic system and body posture. We investigated whether malocclusal traits correlate with body posture alterations in young subjects to determine possible clinical applications. METHODS: A total of 122 subjects, including 86 males and 36 females (age range of 10.8-16.3 years), were enrolled. All subjects tested negative for temporomandibular disorders or other conditions affecting the stomatognathic system, except malocclusion. A dental occlusion assessment included phase of dentition, molar class, overjet, overbite, anterior and posterior crossbite, scissorbite, mandibular crowding and dental midline deviation. In addition, body posture was recorded through static posturography using a vertical force platform. Recordings were performed under two conditions, namely, i) mandibular rest position (RP) and ii) dental intercuspidal position (ICP). Posturographic parameters included the projected sway area and velocity and the antero-posterior and right-left load differences. Multiple regression models were run for both recording conditions to evaluate associations between each malocclusal trait and posturographic parameters. RESULTS: All of the posturographic parameters had large variability and were very similar between the two recording conditions. Moreover, a limited number of weakly significant correlations were observed, mainly for overbite and dentition phase, when using multivariate models. CONCLUSION: Our current findings, particularly with regard to the use of posturography as a diagnostic aid for subjects affected by dental malocclusion, do not support the existence of clinically relevant correlations between malocclusal traits and body posture.
A simplified procedure of linear regression in a preliminary analysis
Directory of Open Access Journals (Sweden)
Silvia Facchinetti
2013-05-01
Full Text Available The analysis of a large statistical data-set can be guided by the study of a particularly interesting variable Y (the regressand) and an explicative variable X, chosen from the remaining variables, jointly observed. The study gives a simplified procedure to obtain the functional link between the variables, y = y(x), by partitioning the data-set into m subsets, in which the observations are synthesized by location indices (mean or median) of X and Y. Polynomial models for y(x) of order r are considered to verify the characteristics of the given procedure; in particular we assume r = 1 and 2. The distributions of the parameter estimators are obtained by simulation when the fitting is done for m = r + 1. Comparisons of the results, in terms of distribution and efficiency, are made with the results obtained by the ordinary least squares method. The study also gives some considerations on the consistency of the parameters estimated by the given procedure.
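For the straight-line case (r = 1, hence m = r + 1 = 2 subsets), the procedure above reduces to summarising each half of the data by a location index and drawing the line through the two summary points. A minimal sketch, using the median as the location index and invented data:

```python
# Hedged sketch of the partition procedure for r = 1: split the sorted data
# into m = 2 subsets, summarise each by (median X, median Y), and join the
# two summary points with a line. The data are invented.
from statistics import median

def partition_fit(xs, ys):
    pairs = sorted(zip(xs, ys))
    half = len(pairs) // 2
    lo, hi = pairs[:half], pairs[half:]
    x1, y1 = median(p[0] for p in lo), median(p[1] for p in lo)
    x2, y2 = median(p[0] for p in hi), median(p[1] for p in hi)
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1          # slope, intercept

xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]       # roughly y = 2x
slope, intercept = partition_fit(xs, ys)
print(round(slope, 2))                     # -> 1.97
```

The paper's point is that such location-index fits trade some efficiency against ordinary least squares for simplicity; the simulation study quantifies that trade-off.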
Demin, S.; Panichev, O.; Nefedyev, Y.
2013-09-01
This abstract discusses a new method for studying time series, based on statistical physics and including Fourier analysis, correlation analysis, elements of fractal analysis, and regression modeling.
International Nuclear Information System (INIS)
The gamma/beta TLD badge used by OPPD consists of two TLD-700 chips (Harshaw G7 card), one of which (chip #2) is shielded by a 0.102 cm-thick aluminum filter, and the other (chip #1) is unshielded, as shown in Fig. 1. Standard procedure had been to determine the beta dose to the badge by subtracting the response of chip #2 from that of chip #1 and then dividing by a calibrated beta-sensitivity factor; the gamma dose was taken to be the response of chip #2 divided by the chip's gamma-sensitivity factor, followed by subtraction of the background dose. A problem with this procedure is penetration of energetic beta particles through the aluminum filter on chip #2, which causes an over-response. Due to the technique used to obtain the beta dose, this also results in an underestimate of the beta dose. This problem has been corrected through application of multiple linear regression analysis on a large database of pure gamma (137Cs), pure beta (90Sr), and mixed exposures. The outcome of the analysis is an algorithm that automatically corrects for penetration effects. Performance tests using the ANSI N13.11 standard are presented to show the improvement.
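The correction described above amounts to treating each chip reading as a linear combination of the true gamma and beta doses and inverting the resulting 2x2 system. The sensitivity coefficients below are invented for illustration; the real values come from the regression on the 137Cs/90Sr calibration database:

```python
# Hedged sketch of dose unfolding for a two-chip badge. The sensitivity
# matrix S (including the 15% beta-penetration factor for the filtered
# chip) is invented, not OPPD's calibrated values.

def unfold(r1, r2, s):
    """Invert [r1, r2] = S @ [gamma, beta] for a 2x2 sensitivity matrix."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    gamma = (s[1][1] * r1 - s[0][1] * r2) / det
    beta = (s[0][0] * r2 - s[1][0] * r1) / det
    return gamma, beta

S = [[1.00, 0.90],   # chip #1 (unshielded): responds to gamma and beta
     [1.00, 0.15]]   # chip #2 (filtered): full gamma, 15% beta penetration

# Simulate readings from known doses, then recover them.
true_gamma, true_beta = 2.0, 5.0
r1 = S[0][0] * true_gamma + S[0][1] * true_beta
r2 = S[1][0] * true_gamma + S[1][1] * true_beta
g, b = unfold(r1, r2, S)
print(round(g, 6), round(b, 6))   # -> 2.0 5.0
```

The naive subtraction procedure corresponds to assuming the beta-penetration entry is zero; when it is not, the regression-derived matrix inversion removes the resulting bias.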
Change Impact Analysis Based Regression Testing of Web Services
Chaturvedi, Animesh
2014-01-01
Reducing the effort required to make changes in web services is one of the primary goals in web service projects maintenance and evolution. Normally, functional and non-functional testing of a web service is performed by testing the operations specified in its WSDL. The regression testing is performed by identifying the changes made thereafter to the web service code and the WSDL. In this thesis, we present a tool-supported approach to perform efficient regression testing of...
Analysis of Nonlinear Regression Models: A Cautionary Note
Peddada, Shyamal D.; Haseman, Joseph K.
2006-01-01
Regression models are routinely used in many applied sciences for describing the relationship between a response variable and an independent variable. Statistical inferences on the regression parameters are often performed using the maximum likelihood estimators (MLE). In the case of nonlinear models the standard errors of MLE are often obtained by linearizing the nonlinear function around the true parameter and by appealing to large sample theory. In this article we demonstrate, through comp...
Linear Maximum Likelihood Regression Analysis for Untransformed Log-Normally Distributed Data
Sara M. Gustavsson; Sandra Johannesson; Gerd Sallsten; Eva M. Andersson
2012-01-01
Medical research data are often skewed and heteroscedastic. It has therefore become common practice to log-transform data in regression analysis in order to stabilize the variance. Regression analysis on log-transformed data estimates the relative effect, whereas it is often the absolute effect of a predictor that is of interest. We propose a maximum likelihood (ML)-based approach to estimate a linear regression model on log-normal, heteroscedastic data. The new method was evaluated with a large si...
Varying-coefficient functional linear regression
Wu, Yichao; Fan, Jianqing; Müller, Hans-Georg
2011-01-01
Functional linear regression analysis aims to model regression relations which include a functional predictor. The analog of the regression parameter vector or matrix in conventional multivariate or multiple-response linear regression models is a regression parameter function in one or two arguments. If, in addition, one has scalar predictors, as is often the case in applications to longitudinal studies, the question arises how to incorporate these into a functional regressi...
Regression analysis of technical parameters affecting nuclear power plant performances
International Nuclear Information System (INIS)
Since the 1980s many studies have been conducted in order to explain good and bad performances of commercial nuclear power plants (NPPs), but as yet no definite correlation has been found to be fully representative of plant operational experience. In early works, data availability and the number of operating power stations were both limited; therefore, results suggested that specific technical characteristics of NPPs were the main causal factors for successful plant operation. Although these aspects still play a significant role, later studies and observations showed that other factors concerning management and organization of the plant could instead be predominant when comparing utilities' operational and economic results. Utility quality, in a word, can be used to summarize all the managerial and operational aspects that seem to be effective in determining plant performance. In this paper operational data of a consistent sample of commercial nuclear power stations, out of the total 433 operating NPPs, are analyzed, focusing mainly on the last decade of operational experience. The sample consists of PWR and BWR technology, operated by utilities located in different countries, including the U.S., Japan, France, Germany, and Finland. Multivariate regression is performed using the Unit Capability Factor (UCF) as the dependent variable; this factor reflects the effectiveness of plant programs and practices in maximizing available electrical generation and consequently provides an overall indication of how well plants are operated and maintained. Aspects that may not be true causal factors but which can have a consistent impact on the UCF, such as technology design, supplier, size and age, are included in the analysis as independent variables. (authors)
Combinatorial Analysis of Multiple Networks
Magnani, Matteo; Rossi, Luca
2013-01-01
The study of complex networks has been historically based on simple graph data models representing relationships between individuals. However, often reality cannot be accurately captured by a flat graph model. This has led to the development of multi-layer networks. These models have the potential of becoming the reference tools in network data analysis, but require the parallel development of specific analysis methods explicitly exploiting the information hidden in-between the layers and the availability of a critical mass of reference data to experiment with the tools and investigate the real-world organization of these complex systems. In this work we introduce a real-world layered network combining different kinds of online and offline relationships, and present an innovative methodology and related analysis tools suggesting the existence of hidden motifs traversing and correlating different representation layers. We also introduce a notion of betweenness centrality for multiple networks. While some preli...
Buffalos milk yield analysis using random regression models
Directory of Open Access Journals (Sweden)
A.S. Schierholt
2010-02-01
Full Text Available Data comprising 1,719 milk yield records from 357 females (predominantly Murrah breed, daughters of 110 sires, with births from 1974 to 2004, obtained from the Programa de Melhoramento Genético de Bubalinos (PROMEBUL and from records of the EMBRAPA Amazônia Oriental - EAO herd, located in Belém, Pará, Brazil, were used to compare random regression models for estimating variance components and predicting breeding values of the sires. The data were analyzed by different models using Legendre polynomial functions from second to fourth orders. The random regression models included the effects of herd-year, month of parity, and date of the control; regression coefficients for age of females (in order to describe the fixed part of the lactation curve; and random regression coefficients related to the direct genetic and permanent environment effects. The comparisons among the models were based on the Akaike Information Criterion. The random regression model using third-order Legendre polynomials with four classes of the environmental effect was the one that best described the additive genetic variation in milk yield. The heritability estimates varied from 0.08 to 0.40. The genetic correlation between milk yields at younger ages was close to unity, but at older ages it was low.
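In random regression models like the one above, each age at test-day is mapped to a row of Legendre polynomial covariates before fitting. A minimal sketch of building that basis (the age range and order are invented for illustration, not taken from the PROMEBUL data):

```python
# Hedged sketch: evaluate the Legendre polynomial basis used as covariates
# in random regression test-day models. Ages are rescaled to [-1, 1] and
# the polynomials generated by the Bonnet recurrence
# (n+1) P_{n+1}(t) = (2n+1) t P_n(t) - n P_{n-1}(t).

def legendre_basis(age, age_min, age_max, order):
    t = 2.0 * (age - age_min) / (age_max - age_min) - 1.0  # rescale to [-1, 1]
    P = [1.0, t]
    for n in range(1, order):
        P.append(((2 * n + 1) * t * P[n] - n * P[n - 1]) / (n + 1))
    return P[: order + 1]

# Invented example: age 36 months within an assumed 24-120 month range,
# third-order basis (four covariates P0..P3).
row = legendre_basis(age=36, age_min=24, age_max=120, order=3)
print([round(v, 3) for v in row])   # -> [1.0, -0.75, 0.344, 0.07]
```

Stacking one such row per record gives the design matrix for both the fixed lactation curve and the random genetic/permanent-environment regressions; mixed-model software then estimates the covariance structure.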
Multiple logistic regression model of signalling practices of drivers on urban highways
Puan, Othman Che; Ibrahim, Muttaka Na'iya; Zakaria, Rozana
2015-05-01
Giving a signal is a way of informing other road users, especially conflicting drivers, of a driver's intention to change his/her movement course. Other users are exposed to hazardous situations and risks of accident if the driver who changes course fails to give a signal as required. This paper describes the application of a logistic regression model for the analysis of drivers' signalling practices on multilane highways based on possible factors affecting the driver's decision, such as driver's gender, vehicle type, vehicle speed and traffic flow intensity. Data pertaining to the analysis of such factors were collected manually. More than 2000 drivers who performed a lane-changing manoeuvre while driving on two sections of multilane highways were observed. Findings from the study show that a relatively large proportion of drivers failed to give any signal when changing lane. The result of the analysis indicates that although the proportion of drivers who failed to signal prior to a lane-changing manoeuvre is high, the degree of compliance of female drivers is better than that of male drivers. A binary logistic model was developed to represent the probability of a driver providing a signal indication prior to a lane-changing manoeuvre. The model indicates that driver's gender, type of vehicle driven, speed of vehicle and traffic volume influence the driver's decision to provide a signal indication prior to a lane-changing manoeuvre on a multilane urban highway. In terms of types of vehicles driven, about 97% of motorcyclists failed to comply with the signal indication requirement. The proportion of non-compliant drivers under stable traffic flow conditions is much higher than when the flow is relatively heavy. This is consistent with the data, which indicate a high degree of non-compliance when the average speed of the traffic stream is relatively high.
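A binary logistic model of this kind can be sketched with plain gradient ascent on the Bernoulli log-likelihood. The single predictor and the observations below are invented (the study's model used gender, vehicle type, speed and flow):

```python
# Hedged sketch of fitting a binary logistic regression by stochastic
# gradient ascent. Outcome 1 = driver signalled; the sole predictor is an
# invented standardised speed (the abstract reports less signalling at
# higher speeds, so the fitted slope should be negative).
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    w = [0.0] * (len(X[0]) + 1)            # intercept + one weight per feature
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - p                    # gradient of the log-likelihood
            w[0] += lr * err
            for j, xj in enumerate(xi):
                w[j + 1] += lr * err * xj
    return w

X = [[-1.0], [-0.5], [0.0], [0.5], [1.0]]   # standardised speed (invented)
y = [1, 1, 1, 0, 0]                          # 1 = signal given
w = fit_logistic(X, y)
p_slow = sigmoid(w[0] + w[1] * -1.0)
p_fast = sigmoid(w[0] + w[1] * 1.0)
print(w[1] < 0, p_slow > 0.5, p_fast < 0.5)  # -> True True True
```

Statistical packages fit the same model by iteratively reweighted least squares and additionally report odds ratios and Wald tests for each factor.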
Yang, Jianhong; Yi, Cancan; Xu, Jinwu; Ma, Xianghong
2015-05-01
A new LIBS quantitative analysis method based on adaptive selection of analytical lines and a Relevance Vector Machine (RVM) regression model is proposed. First, a scheme for adaptively selecting analytical lines is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines to be used as input variables of the regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of the analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentrations are given in the form of confidence intervals of the predictive distribution, which is helpful for evaluating the uncertainty contained in the measured spectra. Chromium concentration analysis experiments on 23 certified standard high-alloy steel samples were carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experimental results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and modeling robustness than methods based on partial least squares regression, artificial neural networks and standard support vector machines.
Innovation sources and productivity : a quantile regression analysis
Segarra Blasco, Agustí
2007-01-01
This paper explores the effects of two main sources of innovation —intramural and external R&D— on the productivity level in a sample of 3,267 Catalan firms. The data set used is based on the official innovation survey of Catalonia which was a part of the Spanish sample of CIS4, covering the years 2002-2004. We compare empirical results by applying usual OLS and quantile regression techniques both in manufacturing and services industries. In quantile regression, results suggest different patt...
INFLUENCE OF TOURISM SECTOR IN ALBANIAN GDP: ESTIMATION USING MULTIPLE REGRESSION METHOD
Directory of Open Access Journals (Sweden)
Eglantina HYSA
2012-06-01
Full Text Available In recent years, the tourism sector has grown significantly in Albania, since after 1990 Albania passed from a centralized economy to a liberal one. The tourism sector plays an important role in economic and social development. The contributions of this sector reflect directly into the generation of national income. The two main components describing tourism movements are the number of tourists and the number of overnights in hotels. Investments in this sector could be expected to have a strong positive influence on the country's GDP. This study seeks to identify the influence of tourist numbers, their overnights in hotels, and capital investment spending by all sectors directly involved in tourism on the tourism sector's total contribution to the gross domestic product of Albania during 1996-2009. A regression analysis has been performed taking GDP generated by the tourism sector as the dependent variable and capital investment, tourist numbers and overnights in hotels as independent variables. Although all the variables were found to be positively related, the variable 'overnights of foreigners and Albanians in hotels' was found to be insignificant.
Scientific Electronic Library Online (English)
H., Jang; E., Topal; Y., Kawamura.
2015-05-01
Full Text Available Unplanned dilution and ore loss directly influence not only the productivity of underground stopes, but also the profitability of the entire mining process. Stope dilution is a result of complex interactions between a number of factors, and cannot be predicted prior to mining. [...] In this study, unplanned dilution and ore loss prediction models were established using multiple linear and nonlinear regression analysis (MLRA and MNRA), as well as an artificial neural network (ANN) method based on 1067 datasets with ten causative factors from three underground longhole stoping mines in Western Australia. Models were established for individual mines, as well as a general model that includes all of the mine data-sets. The correlation coefficient (R) was used to evaluate the methods, and the values for MLRA, MNRA, and ANN compared with the general model were 0.419, 0.438, and 0.719, respectively. Considering that the current unplanned dilution and ore loss prediction for the mines investigated yielded an R of 0.088, the ANN model results are noteworthy. The proposed ANN model can be used directly as a practical tool to predict unplanned dilution and ore loss in mines, which will not only enhance productivity, but will also be beneficial for stope planning and design.
Analysis on Train Stopping Accuracy based on Regression Algorithms
Directory of Open Access Journals (Sweden)
Lin Ma
2014-05-01
Full Text Available Stopping accuracy is one of the most important indexes of efficiency of automatic train operation (ATO) systems. Traditional stopping control algorithms in ATO systems have some drawbacks, as many factors have not been taken into account. In the large amount of field-collected data about stopping accuracy there are many factors (e.g. system delays, stopping time, net pressure) which affect stopping accuracy. In this paper, three popular data mining methods are proposed to analyze train stopping accuracy. First, we identify fifteen factors which have an impact on stopping accuracy. Then, ridge regression, lasso regression and elastic net regression are employed to mine models reflecting the relationship between the fifteen factors and the stopping accuracy. The three models are compared using the Akaike information criterion (AIC), a model selection criterion that considers the trade-off between accuracy and complexity. The computational results show that the elastic net regression model has the best performance in terms of AIC. Finally, we obtain the parameters which make the train stop more accurately, providing a reference for improving stopping accuracy in ATO systems.
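The AIC-based selection step above can be sketched in miniature: fit a penalized regression for several penalty values and keep the one with the lowest AIC. For brevity this sketch uses one predictor and plain ridge rather than the paper's fifteen-factor elastic net, and the data are invented:

```python
# Hedged sketch: one-predictor ridge regression selected by AIC, using the
# effective degrees of freedom sxx/(sxx + lam) + 1 as the parameter count.
# All data are invented.
import math
import random

random.seed(1)
xs = [i / 10 for i in range(30)]
ys = [1.5 * x + random.gauss(0, 0.2) for x in xs]   # clean linear signal

def ridge_aic(xs, ys, lam):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)                        # ridge-shrunk slope
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    df = sxx / (sxx + lam) + 1                       # effective parameters
    return n * math.log(rss / n) + 2 * df

lams = [0.0, 1.0, 10.0, 100.0]
best = min(lams, key=lambda lam: ridge_aic(xs, ys, lam))
print(best)
```

With a strong clean signal like this, AIC selects no shrinkage; with many weak, correlated predictors (as in the paper's fifteen factors) a nonzero penalty typically wins, which is the point of the comparison.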
Grades, Gender, and Encouragement: A Regression Discontinuity Analysis
Owen, Ann L.
2010-01-01
The author employs a regression discontinuity design to provide direct evidence on the effects of grades earned in economics principles classes on the decision to major in economics and finds a differential effect for male and female students. Specifically, for female students, receiving an A for a final grade in the first economics class is…
Combining regression analysis and air quality modelling to predict benzene concentration levels
Vlachokostas, Ch.; Achillas, Ch.; Chourdakis, E.; Moussiopoulos, N.
2011-05-01
State of the art epidemiological research has found consistent associations between traffic-related air pollution and various outcomes, such as respiratory symptoms and premature mortality. However, many urban areas are characterised by the absence of the necessary monitoring infrastructure, especially for benzene (C6H6), which is a known human carcinogen. The use of environmental statistics combined with air quality modelling can be of vital importance in order to assess air quality levels of traffic-related pollutants in an urban area in the case where there are no available measurements. This paper aims at developing and presenting a reliable approach, in order to forecast C6H6 levels in urban environments, demonstrated for Thessaloniki, Greece. Multiple stepwise regression analysis is used and a strong statistical relationship is detected between C6H6 and CO. The adopted regression model is validated in order to depict its applicability and representativeness. The presented results demonstrate that the adopted approach is capable of capturing C6H6 concentration trends and should be considered as complementary to air quality monitoring.
Analysis of nonlinear regression models: a cautionary note.
Peddada, Shyamal D; Haseman, Joseph K
2005-01-01
Regression models are routinely used in many applied sciences for describing the relationship between a response variable and an independent variable. Statistical inferences on the regression parameters are often performed using the maximum likelihood estimators (MLE). In the case of nonlinear models the standard errors of MLE are often obtained by linearizing the nonlinear function around the true parameter and by appealing to large sample theory. In this article we demonstrate, through computer simulations, that the resulting asymptotic Wald confidence intervals cannot be trusted to achieve the desired confidence levels. Sometimes they could underestimate the true nominal level and are thus liberal. Hence one needs to be cautious in using the usual linearized standard errors of MLE and the associated confidence intervals. PMID:18648618
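The simulation evidence described above can be sketched as follows: repeatedly generate data from a nonlinear model, form the linearisation-based (Wald) interval for the parameter, and count how often it covers the truth. The model y = exp(theta*x) + noise and all settings below are invented for illustration; the paper's point is that the attained coverage need not match the nominal 95%:

```python
# Hedged sketch of a Monte Carlo coverage check for Wald intervals in a
# nonlinear regression. Least squares (= Gaussian ML) is computed by a
# crude grid search; the SE comes from the linearisation df/dtheta.
import math
import random

random.seed(7)
xs = [i / 10 for i in range(1, 11)]
THETA, SIGMA, REPS = 1.0, 0.3, 200

def sse(theta, ys):
    return sum((y - math.exp(theta * x)) ** 2 for x, y in zip(xs, ys))

covered = 0
for _ in range(REPS):
    ys = [math.exp(THETA * x) + random.gauss(0, SIGMA) for x in xs]
    theta_hat = min((t / 200 for t in range(400)), key=lambda t: sse(t, ys))
    sigma_hat = math.sqrt(sse(theta_hat, ys) / (len(xs) - 1))
    # Wald SE from the linearisation df/dtheta = x * exp(theta * x)
    info = sum((x * math.exp(theta_hat * x)) ** 2 for x in xs)
    se = sigma_hat / math.sqrt(info)
    if abs(theta_hat - THETA) <= 1.96 * se:
        covered += 1

coverage = covered / REPS
print(coverage)
```

Whether the attained coverage falls short of 0.95 depends on the curvature of the model and the sample size; the article demonstrates settings where the shortfall is substantial.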
Model performance analysis and model validation in logistic regression
Directory of Open Access Journals (Sweden)
Rosa Arboretti Giancristofaro
2007-10-01
Full Text Available In this paper a new model validation procedure for a logistic regression model is presented. First, we give a brief review of different techniques of model validation. Next, we define a number of properties required for a model to be considered "good", and a number of quantitative performance measures. Lastly, we describe a methodology for assessing the performance of a given model, using an example taken from a management study.
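Two of the standard quantitative performance measures alluded to above can be computed directly from a fitted logistic model's predicted probabilities on a hold-out sample: classification accuracy at the 0.5 threshold and the Brier score (mean squared error of the probabilities). The probabilities and outcomes below are invented:

```python
# Hedged sketch of two validation measures for a logistic model; the
# predicted probabilities and observed outcomes are invented.

probs    = [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.2, 0.1]   # model's P(y = 1)
outcomes = [1,   1,   0,   1,   0,   1,    0,   0]      # observed y

n = len(probs)
# Accuracy: fraction of cases where thresholding at 0.5 matches the outcome.
accuracy = sum((p >= 0.5) == bool(y) for p, y in zip(probs, outcomes)) / n
# Brier score: mean squared error of the probabilities (lower is better).
brier = sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / n
print(accuracy, round(brier, 4))   # -> 0.75 0.1666
```

A full validation procedure, as in the paper, also examines calibration and checks that these measures hold up on data not used for fitting.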
Grades, gender, and encouragement: A regression discontinuity analysis
Owen, Ann L.
2008-01-01
This study employs a regression discontinuity design in order to provide direct evidence on the effects of grades earned in economics principles classes on the decision to major in economics and finds a differential effect for male and female students. Specifically, for female students, receiving an “A” for a final grade in the first economics class is associated with a meaningful increase in the probability of majoring in economics, even after controlling for the numerical grade earned in t...
Analysis of some methods for reduced rank Gaussian process regression
DEFF Research Database (Denmark)
Quinonero-Candela, J.; Rasmussen, Carl Edward
2005-01-01
While there is strong motivation for using Gaussian Processes (GPs) due to their excellent performance in regression and classification problems, their computational complexity makes them impractical when the size of the training set exceeds a few thousand cases. This has motivated the recent proliferation of a number of cost-effective approximations to GPs, both for classification and for regression. In this paper we analyze one popular approximation to GPs for regression: the reduced rank approximation. While generally GPs are equivalent to infinite linear models, we show that Reduced Rank Gaussian Processes (RRGPs) are equivalent to finite sparse linear models. We also introduce the concept of degenerate GPs and show that they correspond to inappropriate priors. We show how to modify the RRGP to prevent it from being degenerate at test time. Training RRGPs consists both in learning the covariance function hyperparameters and the support set. We propose a method for learning hyperparameters for a given support set. We also review the Sparse Greedy GP (SGGP) approximation (Smola and Bartlett, 2001), which is a way of learning the support set for given hyperparameters based on approximating the posterior. We propose an alternative method to the SGGP that has better generalization capabilities. Finally we make experiments to compare the different ways of training a RRGP. We provide some Matlab code for learning RRGPs.
BRGLM, Interactive Linear Regression Analysis by Least Square Fit
International Nuclear Information System (INIS)
1 - Description of program or function: BRGLM is an interactive program written to fit general linear regression models by least squares and to provide a variety of statistical diagnostic information about the fit. Stepwise and all-subsets regression can be carried out also. There are facilities for interactive data management (e.g. setting missing value flags, data transformations) and tools for constructing design matrices for the more commonly-used models such as factorials, cubic splines, and auto-regressions. 2 - Method of solution: The least squares computations are based on the orthogonal (QR) decomposition of the design matrix obtained using the modified Gram-Schmidt algorithm. 3 - Restrictions on the complexity of the problem: The current release of BRGLM allows maxima of 1000 observations, 99 variables, and 3000 words of main memory workspace. For a problem with N observations and P variables, the number of words of main memory storage required is MAX(N*(P+6), N*P+P*P+3*N, 3*P*P+6*N). Any linear model may be fit, although the in-memory workspace will have to be increased for larger problems.
International Nuclear Information System (INIS)
A plot of lung-cancer rates versus radon exposures in 965 US counties, or in all US states, has a strong negative slope, b, in sharp contrast to the strong positive slope predicted by linear/no-threshold theory. The discrepancy between these slopes exceeds 20 standard deviations (SD). Including smoking frequency in the analysis substantially improves fits to a linear relationship but has little effect on the discrepancy in b, because correlations between smoking frequency and radon levels are quite weak. Including 17 socioeconomic variables (SEV) in multiple regression analysis reduces the discrepancy to 15 SD. Data were divided into segments by stratifying on each SEV in turn, on geography, and on both simultaneously, giving over 300 data sets to be analyzed individually, but negative slopes predominated. The slope is negative whether one considers only the most urban counties or only the most rural; only the richest or only the poorest; only the richest in the South Atlantic region or only the poorest in that region; and for all the strata in between. Since this is an ecological study, the well-known problems with ecological studies were investigated and found not to be applicable here. The "ecological fallacy" was shown not to apply in testing a linear/no-threshold theory, and the vulnerability to confounding is greatly reduced when confounding factors are only weakly correlated with radon levels, as is generally the case here. All confounding factors known to correlate with radon and with lung cancer were investigated quantitatively and found to have little effect on the discrepancy.
Development of a User Interface for a Regression Analysis Software Tool
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
An easy-to-use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed such that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface's design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen, which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and graphical user interface are discussed in order to illustrate the user interface's overall design approach.
Energy Technology Data Exchange (ETDEWEB)
Wanke, Peter [Universidade Federal do Rio de Janeiro (UFRJ), RJ (Brazil). Instituto de Pesquisa e Pos-Graduacao em Administracao de Empresas (COPPEAD). Centro de Estudos em Logistica
2004-07-01
In this paper, the most relevant multiple regression models for sales forecasting at gas stations, developed over the past ten years, are reviewed. The most significant variables related to gas station sales, the types of multiple regression models used (linear or non-linear), the most common uses in supporting decision making, and their limits are presented. The predictive power of each model and its impact on decision making, such as sensitivity analysis and confidence intervals for independent variables, are also discussed. Four models are presented, based on studies conducted in South Africa, Portugal and Brazil. In conclusion, suggestions for future developments are presented. (author)
Schilling, K.E.; Wolter, C.F.
2005-01-01
Nineteen variables, including precipitation, soils and geology, land use, and basin morphologic characteristics, were evaluated to develop Iowa regression models to predict total streamflow (Q), base flow (Qb), storm flow (Qs) and base flow percentage (%Qb) in gauged and ungauged watersheds in the state. Discharge records from a set of 33 watersheds across the state for the 1980 to 2000 period were separated into Qb and Qs. Multiple linear regression found that 75.5 percent of long term average Q was explained by rainfall, sand content, and row crop percentage variables, whereas 88.5 percent of Qb was explained by these three variables plus permeability and floodplain area variables. Qs was explained by average rainfall and %Qb was a function of row crop percentage, permeability, and basin slope variables. Regional regression models developed for long term average Q and Qb were adapted to annual rainfall and showed good correlation between measured and predicted values. Combining the regression model for Q with an estimate of mean annual nitrate concentration, a map of potential nitrate loads in the state was produced. Results from this study have important implications for understanding geomorphic and land use controls on streamflow and base flow in Iowa watersheds and similar agriculture dominated watersheds in the glaciated Midwest. (JAWRA) (Copyright © 2005).
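The kind of multiple linear regression fit described in the abstract above can be sketched with ordinary least squares in numpy. The predictor names echo the paper's variables (rainfall, sand content, row crop percentage), but the data and coefficients below are synthetic assumptions, not the study's values:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical predictors standing in for the paper's variables.
rain = rng.uniform(600, 1100, n)   # annual rainfall, mm
sand = rng.uniform(5, 60, n)       # sand content, percent
crop = rng.uniform(10, 90, n)      # row crop percentage

# Synthetic "true" relationship for long-term average streamflow Q.
true_beta = np.array([-50.0, 0.35, 1.2, 0.8])  # intercept + 3 slopes
X = np.column_stack([np.ones(n), rain, sand, crop])
Q = X @ true_beta + rng.normal(0, 5, n)

# Ordinary least squares fit.
beta_hat, *_ = np.linalg.lstsq(X, Q, rcond=None)

# Coefficient of determination R^2 of the fitted model.
resid = Q - X @ beta_hat
r2 = 1 - resid.var() / Q.var()
print(beta_hat.round(2), round(r2, 3))
```

With clean synthetic data the estimated slopes land close to the generating values; the percentage-of-variance-explained figures quoted in the abstract correspond to the R² of such a fit.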
Point Estimates and Confidence Intervals for Variable Importance in Multiple Linear Regression
Thomas, D. Roland; Zhu, PengCheng; Decady, Yves J.
2007-01-01
The topic of variable importance in linear regression is reviewed, and a measure first justified theoretically by Pratt (1987) is examined in detail. Asymptotic variance estimates are used to construct individual and simultaneous confidence intervals for these importance measures. A simulation study of their coverage properties is reported, and an…
Use of Structure Coefficients in Published Multiple Regression Articles: Beta Is Not Enough.
Courville, Troy; Thompson, Bruce
2001-01-01
Reviewed articles published in the "Journal of Applied Psychology" (JAP) to determine how interpretations might have differed if standardized regression coefficients and structure coefficients (or bivariate "r"s of predictors with the criterion) had been interpreted. Summarizes some dramatic misinterpretations or incomplete interpretations.…
Directory of Open Access Journals (Sweden)
Shelley M. ALEXANDER
2009-02-01
Full Text Available We compared probability surfaces derived using one set of environmental variables in three Geographic Information Systems (GIS)-based approaches: logistic regression and Akaike's Information Criterion (AIC), Multiple Criteria Evaluation (MCE), and Bayesian Analysis (specifically Dempster-Shafer theory). We used lynx Lynx canadensis as our focal species, and developed our environment relationship model using track data collected in Banff National Park, Alberta, Canada, during winters from 1997 to 2000. The accuracy of the three spatial models was compared using a contingency table method. We determined the percentage of cases in which both presence and absence points were correctly classified (overall accuracy), the failure to predict a species where it occurred (omission error) and the prediction of presence where there was absence (commission error). Our overall accuracy showed the logistic regression approach was the most accurate (74.51%). The multiple criteria evaluation was intermediate (39.22%), while the Dempster-Shafer (D-S) theory model was the poorest (29.90%). However, omission and commission error tell us a different story: logistic regression had the lowest commission error, while D-S theory produced the lowest omission error. Our results provide evidence that habitat modellers should evaluate all three error measures when ascribing confidence in their model. We suggest that for our study area at least, the logistic regression model is optimal. However, where sample size is small or the species is very rare, it may also be useful to explore and/or use a more ecologically cautious modelling approach (e.g. Dempster-Shafer) that would over-predict, protect more sites, and thereby minimize the risk of missing critical habitat in conservation plans. [Current Zoology 55(1): 28–40, 2009]
Scientific Electronic Library Online (English)
W., Sun; G. X., Meng; Q., Ye; H. L., Jin; J. Z., Zhang.
2012-04-01
Full Text Available Carrying out regression analysis of gas leakage in pressure-relief valves (PRVs), to obtain accurate leakage flows and leakage trends, is helpful in assessing the reliability of a PRV. Classic support vector regression (SVR) is an excellent regression model and has been widely used in various fields. However, the standard SVR model performs regression using leakage data alone, without considering elements closely related to the leakage. In this paper a regression model based on support vector regression plus (SVR+) is put forward to perform leakage regression of PRVs, in which particle swarm optimization (PSO) is used to select optimum parameters of SVR+, termed PSO_SVR+. The experimental results demonstrate that the proposed model, taking the difference between the inlet and outlet pressures of the PRV as hidden information, achieves higher regression precision than SVR can provide. This article also investigates the effects of PSO and a genetic algorithm on the performance of the regression model (SVR+ or SVR).
Selection of Higher Order Regression Models in the Analysis of Multi-Factorial Transcription Data
Prazeres da Costa, Olivia; Hoffman, Arthur; Rey, Johannes W; Mansmann, Ulrich; Buch, Thorsten; Tresch, Achim
2014-01-01
INTRODUCTION: Many studies examine gene expression data that has been obtained under the influence of multiple factors, such as genetic background, environmental conditions, or exposure to diseases. The interplay of multiple factors may lead to effect modification and confounding. Higher order linear regression models can account for these effects. We present a new methodology for linear model selection and apply it to microarray data of bone marrow-derived macrophages. This experiment invest...
Intra-industry spillovers from inward FDI: A meta-regression analysis
Havránek, Tomáš
2008-01-01
The present paper conducts a meta-analysis of literature on intra-industry spillovers from foreign direct investment, using 97 different outcomes from 67 empirical studies. Apart from the traditional approach, robust meta-regression, random-effects model, and normal probability regression are employed. Results of combined significance analysis are mixed but it is evident that studies published in leading academic journals tend to report rather insignificant results. Our findings suggest that ...
Economic growth and electricity consumption: Auto regressive distributed lag analysis
Scientific Electronic Library Online (English)
Melike E, Bildirici; Tahsin, Bakirtas; Fazil, Kayikci.
Full Text Available Knowledge of the direction of causality between electricity consumption and economic growth is of primary importance if appropriate energy policies and energy conservation measures are to be devised. This study estimates the causality relationship between electricity consumption and economic growth at per capita and aggregate levels. The study uses the price and income elasticities of total electricity demand and industrial demand by using the auto regressive distributed lag (ARDL) method for some developed and developing countries, including the US, UK, Canada, Japan, China, India, Brazil, Italy, France, Turkey and South Africa. There is evidence to support the growth hypothesis for the US, China, Canada and Brazil. There is evidence to support the conservation hypothesis for India, Turkey, South Africa, Japan, UK, France and Italy.
Using Negative Binomial Regression Analysis to Predict Software Faults: A Study of Apache Ant
Directory of Open Access Journals (Sweden)
Liguo Yu
2012-07-01
Full Text Available Negative binomial regression has been proposed as an approach to predicting fault-prone software modules. However, little work has been reported on the strengths, weaknesses, and applicability of this method. In this paper, we present an in-depth study investigating the effectiveness of using negative binomial regression to predict fault-prone software modules under two different conditions, self-assessment and forward assessment. The performance of the negative binomial regression model is also compared with another popular fault prediction model, the binary logistic regression method. The study is performed on six versions of an open-source object-oriented project, Apache Ant. The study shows that (1) the performance of forward assessment is better than, or at least as good as, that of self-assessment; (2) in predicting fault-prone modules, the negative binomial regression model could not outperform the binary logistic regression model; and (3) negative binomial regression is effective in predicting multiple errors in one module.
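The binary logistic regression baseline mentioned above can be sketched with a hand-rolled Newton-Raphson fit of the logistic log-likelihood. The module metrics (lines of code, prior-release fault count) and coefficients below are illustrative assumptions, not values from the Apache Ant study:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400

# Hypothetical fault-proneness predictors for n software modules.
loc = rng.uniform(50, 2000, n)   # lines of code
prior = rng.poisson(2, n)        # faults in the prior release

X = np.column_stack([np.ones(n), loc / 1000.0, prior])
true_w = np.array([-2.0, 1.5, 0.8])
p_true = 1 / (1 + np.exp(-(X @ true_w)))
y = rng.binomial(1, p_true)      # 1 = module is fault-prone

# Newton-Raphson iterations for the logistic maximum likelihood estimate.
w = np.zeros(3)
for _ in range(25):
    p = 1 / (1 + np.exp(-(X @ w)))
    grad = X.T @ (y - p)                     # score vector
    W = p * (1 - p)                          # variance weights
    hess = (X * W[:, None]).T @ X            # observed information
    w = w + np.linalg.solve(hess, grad)

accuracy = np.mean((1 / (1 + np.exp(-(X @ w))) > 0.5) == y)
print(w.round(2), round(accuracy, 3))
```

In a self-assessment setting the model is fit and evaluated on the same version, as here; forward assessment would fit on one version and score the next.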
Smith, Timothy D.; Steffen, Christopher J., Jr.; Yungster, Shaye; Keller, Dennis J.
1998-01-01
The all rocket mode of operation is shown to be a critical factor in the overall performance of a rocket based combined cycle (RBCC) vehicle. An axisymmetric RBCC engine was used to determine specific impulse efficiency values based upon both full flow and gas generator configurations. Design of experiments methodology was used to construct a test matrix and multiple linear regression analysis was used to build parametric models. The main parameters investigated in this study were: rocket chamber pressure, rocket exit area ratio, injected secondary flow, mixer-ejector inlet area, mixer-ejector area ratio, and mixer-ejector length-to-inlet diameter ratio. A perfect gas computational fluid dynamics analysis, using both the Spalart-Allmaras and k-omega turbulence models, was performed with the NPARC code to obtain values of vacuum specific impulse. Results from the multiple linear regression analysis showed that for both the full flow and gas generator configurations increasing mixer-ejector area ratio and rocket area ratio increase performance, while increasing mixer-ejector inlet area ratio and mixer-ejector length-to-diameter ratio decrease performance. Increasing injected secondary flow increased performance for the gas generator analysis, but was not statistically significant for the full flow analysis. Chamber pressure was found to be not statistically significant.
Directory of Open Access Journals (Sweden)
Angela Radünz Lazzari
2011-01-01
Full Text Available Air is an efficient medium for the dispersal of atmospheric pollutants, and its behavior depends on the atmospheric movements that occur in the troposphere. In Porto Alegre, Rio Grande do Sul State, heavy daily traffic and a concentration of industries may be responsible for atmospheric emissions. In the present work we studied the behavior of daily concentrations of particulate matter (PM10) in this city, considering the influence of meteorological variables. Data analysis was performed using descriptive statistics, linear correlation and multiple regression. Data were provided by the State Foundation of Environmental Protection Henrique Luiz Roessler - RS (FEPAM) and the National Institute of Meteorology (INMET). The analysis showed that the PM10 concentrations, measured daily at 4:00 p.m., did not exceed national air quality standards. The meteorological elements that influenced PM10 concentrations were the daily average wind speed and the daily average radiation, with negative relations, and the daily average air temperature and the north and northwest wind directions, with positive relations. The wind directions that contribute significantly to lowering concentrations at the measured sites are east and southeast.
Regularized Multiple-Set Canonical Correlation Analysis
Takane, Yoshio; Hwang, Heungsun; Abdi, Herve
2008-01-01
Multiple-set canonical correlation analysis (Generalized CANO or GCANO for short) is an important technique because it subsumes a number of interesting multivariate data analysis techniques as special cases. More recently, it has also been recognized as an important technique for integrating information from multiple sources. In this paper, we…
Regression analysis for a bottom-up approach to analyzing semi-prompt fission gamma yields
International Nuclear Information System (INIS)
Highlights: • Fitting the semi-prompt non-resolved photon spectrum after fission. • Energy–time dependence can be factorized. • Physical model, statistical model, sampling procedure. • The best fit is: lognormal for energy and F for time. - Abstract: We present an empirical model that describes the yield of gamma rays emitted by fission in the time interval from 20 to 958 ns following a fission event. The analysis is based on experimental data from neutron-induced fission of 235U and 239Pu. The model is devised by first using regression analysis to identify likely patterns in the data and to choose plausible fitting functions. We provide statistical and physical arguments in support of time and energy independence. The intensity of the emitted gamma rays can be described as a bivariate distribution that is the product of independent variates for energy and time. We test several plausible distribution families for the energy and time variates and use maximum likelihood and minimum χ² to estimate distribution parameters. Because of the uncertainty in the experimental data, multiple combinations of variate pairs give rise to surfaces that fit the observations plausibly well. The best-fit variate turns out to be lognormal in energy and F in time. The findings illustrated in this paper can be used to simulate gamma-ray de-excitation from fission in Monte Carlo codes.
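Since the best-fit energy variate above is lognormal, its maximum-likelihood fit has a convenient closed form: the MLE parameters are the mean and standard deviation of the log-transformed data. A sketch on synthetic data (the parameter values are illustrative, not the paper's estimates):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic gamma-ray energies drawn from a lognormal distribution,
# standing in for the paper's energy variate (parameters are invented).
mu_true, sigma_true = -0.5, 0.6
energies = rng.lognormal(mean=mu_true, sigma=sigma_true, size=5000)

# Closed-form maximum-likelihood estimates for the lognormal:
# sample mean and standard deviation of the log data.
log_e = np.log(energies)
mu_hat = log_e.mean()
sigma_hat = log_e.std()   # MLE uses the 1/n normalization
print(round(mu_hat, 3), round(sigma_hat, 3))
```

For distribution families without closed-form estimators (such as the F variate in time), the same fit would be done by numerically maximizing the log-likelihood or minimizing χ², as the abstract describes.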
Ilander, Aki; Väisänen, Ari
2010-01-01
A multiple linear regression technique was used to evaluate and correct the matrix interferences in the determination of As and Pb concentrations in fly ashes by inductively coupled plasma optical emission spectrometry. The direct determination of As and Pb in SRM 1633b by ICP-OES failed to obtain the certified concentrations, except in a couple of cases. However, it proved possible to use the multiple linear regression (MLR) technique to correct the determined concentrations to a satisfactor...
Additive Intensity Regression Models in Corporate Default Analysis
DEFF Research Database (Denmark)
Lando, David; Medhat, Mamdouh
2013-01-01
We consider additive intensity (Aalen) models as an alternative to the multiplicative intensity (Cox) models for analyzing the default risk of a sample of rated, nonfinancial U.S. firms. The setting allows for estimating and testing the significance of time-varying effects. We use a variety of model checking techniques to identify misspecifications. In our final model, we find evidence of time-variation in the effects of distance-to-default and short-to-long-term debt. We also identify interactions between distance-to-default and other covariates, and find the quick-ratio covariate to be significant. None of our macroeconomic covariates are significant.
International Nuclear Information System (INIS)
Highlights: • Thermodynamic models of simple and regenerative cycles are defined. • Exergy destruction rate of different components was determined. • Impact of important operating parameters on cycles' characteristics was determined. • Multiple polynomial regression models were developed. • Optimization for optimal operating parameters was performed. - Abstract: In this paper, thermo-environmental, economic and regression analyses of simple and regenerative gas turbine cycles are exhibited. Firstly, thermodynamic models for both cycles are defined; the exergy destruction rate of different components is determined and a parametric study is carried out to investigate the effects of compressor inlet temperature, turbine inlet temperature and compressor pressure ratio on the parameters that measure the cycles' performance, environmental impact and costs. Subsequently, multiple polynomial regression (MPR) models are developed to correlate important response variables with predictor variables, and finally optimization is performed for optimal operating conditions. The results of the parametric study show a significant impact of operating parameters on the performance parameters, environmental impact and costs. According to the exergy analysis, the combustion chamber and exhaust stack are the two major sites where the largest exergy destruction/losses occur. Also, the total exergy destruction in the regenerative cycle is relatively lower, thereby resulting in a higher exergy efficiency of the cycle. The MPR models also appear to be good estimators of the response variables, as indicated by their very high R² values. Finally, these models are used to determine the optimal operating parameters, which maximize the cycles' performance and minimize CO2 emissions and costs.
Regression Models for Demand Reduction based on Cluster Analysis of Load Profiles
Energy Technology Data Exchange (ETDEWEB)
Yamaguchi, Nobuyuki; Han, Junqiao; Ghatikar, Girish; Piette, Mary Ann; Asano, Hiroshi; Kiliccote, Sila
2009-06-28
This paper provides new regression models for demand reduction in Demand Response programs, for the purposes of ex ante evaluation of the programs and of screening customers for enrollment. The proposed regression models employ, as explanatory variables, load sensitivity to outside air temperature and representative load patterns derived from cluster analysis of customer baseline load. The models were examined for the validity of their explanatory variables and the goodness of fit of the regressions, using actual load profile data from Pacific Gas and Electric Company's commercial and industrial customers who participated in the 2008 Critical Peak Pricing program, including Manual and Automated Demand Response.
Directory of Open Access Journals (Sweden)
H. Jalilvand
2008-01-01
Full Text Available This study was done in the Forest Park of Noor. To determine the tree-ring response to climatic variation, 35 cores were taken from a dominant natural stand of common ash (Fraxinus excelsior L.). The aim of the study was to identify, using multiple regression models, which climatic variables affect the ring-width growth of ash in the current growing year and in previous years (one, two and three years before the current growing year) in northern Iran. In total, 85 annual, monthly and growing-season climatic variables of precipitation, temperature, heat index, evapotranspiration and water balance were analyzed. The best multiple regression models explained 83 percent of the total variance of the growth of common ash. The results show that the growth of common ash was more strongly related to the previous years' climatic variations than to those of the current year. The most effective climatic variations were those of the first and second preceding years (55%). Evapotranspiration of July and September and precipitation of May in the second preceding year, and precipitation of March in the third preceding year, all positively affected the growth of this species. This study revealed that ash favors warmer conditions in the early and middle growing season when moisture is available, along with precipitation in the months of the early growing season (Ordibehesht-Khordad) of the two previous years.
Masters, T.
2013-11-01
The effectiveness of multiple linear regression approaches in removing solar, volcanic, and El Nino Southern Oscillation (ENSO) influences from the recent (1979-2012) surface temperature record is examined, using simple energy balance and global climate models (GCMs). These multiple regression methods are found to incorrectly diagnose the underlying signal - particularly in the presence of a deceleration - by generally overestimating the solar cooling contribution to an early 21st century pause while underestimating the warming contribution from the Mt. Pinatubo recovery. In fact, one-box models and GCMs suggest that the Pinatubo recovery has contributed more to post-2000 warming trends than the solar minimum has contributed to cooling over the same period. After adjusting the observed surface temperature record based on the natural-only multi-model mean from several CMIP5 GCMs and an empirical ENSO adjustment, a significant deceleration in the surface temperature increase is found, ranging in magnitude from -0.06 to -0.12 K decade⁻² depending on model sensitivity and the temperature index used. This likely points to internal decadal variability beyond these solar, volcanic, and ENSO influences.
Quantile regression for the statistical analysis of immunological data with many non-detects
Eilers Paul HC; Röder Esther; Savelkoul Huub FJ; van Wijk Roy
2012-01-01
Background: Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Methods and results: Quantile regression, a genera...
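The core of quantile regression is minimization of the pinball (check) loss, and this is also why it tolerates non-detects: quantiles above the censored fraction are unchanged by left-censoring. A minimal sketch for a constant predictor, with synthetic data and an invented detection limit:

```python
import numpy as np

def pinball(theta, y, tau):
    """Average pinball (check) loss of the constant predictor theta."""
    r = y - theta
    return np.mean(np.where(r >= 0, tau * r, (tau - 1) * r))

rng = np.random.default_rng(3)
y = rng.lognormal(0.0, 1.0, 2000)            # synthetic measurements
detection_limit = np.quantile(y, 0.3)
y_censored = np.maximum(y, detection_limit)  # 30% non-detects set to the limit

# Grid search for the theta minimizing the tau = 0.75 pinball loss.
tau = 0.75
grid = np.linspace(y.min(), y.max(), 4000)
losses = [pinball(t, y_censored, tau) for t in grid]
theta_hat = grid[np.argmin(losses)]

# Despite 30% non-detects, the 0.75 quantile is recovered correctly.
print(round(theta_hat, 3), round(np.quantile(y, tau), 3))
```

In full quantile regression, theta is replaced by a linear predictor in the covariates and the same loss is minimized over the coefficients.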
International Nuclear Information System (INIS)
The primary treatment goal of radiotherapy for paragangliomas of the head and neck region (HNPGLs) is local control of the tumor, i.e. stabilization of tumor volume. Interestingly, regression of tumor volume has also been reported. Up to the present, no meta-analysis has been performed giving an overview of regression rates after radiotherapy in HNPGLs. The main objective was to perform a systematic review and meta-analysis to assess regression of tumor volume in HNPGL patients after radiotherapy. A second outcome was local tumor control. The design of the study is a systematic review and meta-analysis. PubMed, EMBASE, Web of Science, COCHRANE and Academic Search Premier and references of key articles were searched in March 2012 to identify potentially relevant studies. Considering the indolent course of HNPGLs, only studies with ≥12 months follow-up were eligible. Main outcomes were the pooled proportions of regression and local control after radiotherapy as initial, combined (i.e. directly post-operatively or post-embolization) or salvage treatment (i.e. after initial treatment has failed) for HNPGLs. A meta-analysis was performed with an exact likelihood approach using a logistic regression with a random effect at the study level. Pooled proportions with 95% confidence intervals (CI) are reported. Fifteen studies were included, concerning a total of 283 jugulotympanic HNPGLs in 276 patients. Pooled regression proportions for initial, combined and salvage treatment were respectively 21%, 33% and 52% in radiosurgery studies and 4%, 0% and 64% in external beam radiotherapy studies. Pooled local control proportions for radiotherapy as initial, combined and salvage treatment ranged from 79% to 100%. Radiotherapy for jugulotympanic paragangliomas results in excellent local tumor control and therefore is a valuable treatment for these types of tumors.
The effects of radiotherapy on regression of tumor volume remain ambiguous, although the data suggest that regression can be achieved at least in some patients. More research is needed to identify predictors for treatment success
Application of Binary Regression Analysis in the Prescription Pattern of Antidepressants
Dr.Indrajit Banerjee, MBBS, MD; Dr.Indraneel Banerjee, MBBS, MS, MRCS; Bedanta Roy; Dr.Brijesh Sathian MD(AM), PhD.
2013-01-01
Background: In Nepal, several research studies report results using percentages or cross-tabulation, but the use of logistic regression methodology lags behind among researchers. Objectives: The main objective of this study was to examine the role of logistic regression analysis in the prescription pattern of antidepressants in a tertiary care center for hospitalized patients in Western Nepal. Methods: A hospital-based study was done between 1st October 2009 and 31st March 2010 at Ps...
Multiple correspondence analysis and related methods
Greenacre, Michael
2006-01-01
CORRESPONDENCE ANALYSIS AND RELATED METHODS IN PRACTICE, Jörg Blasius and Michael Greenacre: A simple example; Basic method; Concepts of correspondence analysis; Stacked tables; Multiple correspondence analysis; Categorical principal components analysis; Active and supplementary variables; Multiway data; Content of the book. FROM SIMPLE TO MULTIPLE CORRESPONDENCE ANALYSIS, Michael Greenacre: Canonical correlation analysis; Geometric approach; Supplementary points; Discussion and conclusions. DIVIDED BY A COMMON LANGUAGE: ANALYZING AND VISUALIZING TWO-WAY ARRAYS, John C. Gower: Introduction: two-way tables and data matri
The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard
2013-01-01
This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression function. However, the a priori specification of a functional form involves the risk of choosing one that is not similar to the “true” but unknown relationship between the regressors and the dependent variable. This problem, known as parametric misspecification, can result in biased parameter estimates and hence, also in biased measures, which are derived from the estimated parameters. This, in turn, can result in incorrect economic conclusions and recommendations for managers, politicians and decision makers in general. This PhD thesis focuses on a nonparametric econometric approach that can be used to avoid this problem. The main objective is to investigate the applicability of the nonparametric kernel regression method in applied production analysis. The focus of the empirical analyses included in this thesis is the agricultural sector in Poland. Data on Polish farms are used to investigate practically and politically relevant problems and to illustrate how nonparametric regression methods can be used in applied microeconomic production analysis both in panel data and cross-section data settings. The thesis consists of four papers. The first paper addresses problems of parametric and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences within a nonparametric panel data regression framework. 
The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric kernel methods are well-suited to econometric production analysis and can outperform traditional parametric methods. Although the empirical focus of this thesis is on the application of nonparametric kernel regression in applied production analysis, the findings are also applicable to econometric estimations in general.
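The simplest member of the family of nonparametric kernel regression methods discussed in the thesis abstract above is the Nadaraya-Watson estimator, a locally weighted average that requires no functional form. A minimal sketch on synthetic data (the "production function" and the bandwidth choice are illustrative assumptions):

```python
import numpy as np

def nw_regress(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson estimator with a Gaussian kernel."""
    d = (x_eval[:, None] - x_train[None, :]) / bandwidth
    w = np.exp(-0.5 * d ** 2)                  # kernel weights
    return (w @ y_train) / w.sum(axis=1)       # locally weighted average

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 300)                      # e.g. a farm input level
y = 5 * np.log1p(x) + rng.normal(0, 0.5, 300)    # unknown "true" relationship

x_grid = np.linspace(1, 9, 50)
y_hat = nw_regress(x, y, x_grid, bandwidth=0.5)

# The nonparametric fit tracks the true curve without assuming its form.
err = np.max(np.abs(y_hat - 5 * np.log1p(x_grid)))
print(round(err, 3))
```

In practice the bandwidth is chosen by cross-validation rather than fixed by hand, and the same weighting idea extends to multivariate regressors.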
Nonlinear Robust Regression Using Kernel Principal Component Analysis and R-Estimators
Directory of Open Access Journals (Sweden)
Antoni Wibowo
2011-09-01
Full Text Available In recent years, many algorithms based on kernel principal component analysis (KPCA) have been proposed, including kernel principal component regression (KPCR). KPCR can be viewed as a non-linearization of principal component regression (PCR), which uses ordinary least squares (OLS) for estimating its regression coefficients. We use PCR to dispose of the negative effects of multicollinearity in regression models. However, it is well known that the main disadvantage of OLS is its sensitivity to the presence of outliers. Therefore, KPCR can be inappropriate for data sets containing outliers. In this paper, we propose a novel nonlinear robust technique using a hybridization of KPCA and R-estimators. The proposed technique is compared to KPCR and gives better results than KPCR.
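The KPCR baseline described above can be sketched in numpy: build an RBF kernel matrix, center it (the kernel PCA step), take the leading eigenvectors as nonlinear components, then run OLS on them. The kernel width, component count, and test function are illustrative choices, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 150
x = rng.uniform(-3, 3, (n, 1))
y = np.sin(x[:, 0]) + rng.normal(0, 0.1, n)    # nonlinear target

# RBF kernel matrix.
gamma = 0.5
sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
K = np.exp(-gamma * sq)

# Center the kernel matrix in feature space (kernel PCA step).
one = np.ones((n, n)) / n
Kc = K - one @ K - K @ one + one @ K @ one

# Leading principal components in feature space.
vals, vecs = np.linalg.eigh(Kc)
idx = np.argsort(vals)[::-1][:8]               # keep 8 components
alphas = vecs[:, idx] / np.sqrt(np.abs(vals[idx]))
Z = Kc @ alphas                                # projected training data

# Ordinary least squares on the kernel principal components (KPCR).
Zc = np.column_stack([np.ones(n), Z])
beta, *_ = np.linalg.lstsq(Zc, y, rcond=None)
fit = Zc @ beta
r2 = 1 - np.sum((y - fit) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(r2, 3))
```

The paper's robust variant would replace the final OLS step with an R-estimator so that outliers in y do not dominate the coefficient fit.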
Random Decrement and Regression Analysis of Traffic Responses of Bridges
DEFF Research Database (Denmark)
Asmussen, J. C.; Ibrahim, S. R.
1996-01-01
The topic of this paper is the estimation of modal parameters from ambient data by applying the Random Decrement technique. Data from the Queensborough Bridge over the Fraser River in Vancouver, Canada, have been used. The loads producing the dynamic response are ambient, e.g. wind, traffic and small ground motions. The Random Decrement technique is used to estimate the correlation functions, or free decays, from the ambient data. From these functions, the modal parameters are extracted using the Ibrahim Time Domain method. The possible influence of the traffic mass load on the bridge is investigated by assuming that the response level of the bridge depends on the mass of the vehicle load. The eigenfrequencies of the bridge are estimated as a function of the response level, which indicates the degree of influence of the mass load on the estimated eigenfrequencies. The results of the analysis using the Random Decrement technique are compared with results from an analysis based on fast Fourier transforms.
Treating experimental data of inverse kinetic method by unitary linear regression analysis
International Nuclear Information System (INIS)
The theory of treating experimental data from the inverse kinetic method by unitary linear regression analysis is described. Not only the reactivity but also the effective neutron source intensity can be calculated by this method. A computer code was written based on the inverse kinetic method and unitary linear regression analysis. Data from the zero-power facility BFS-1 in Russia were processed and the results compared. The results show that the reactivity and the effective neutron source intensity can be obtained correctly by treating inverse-kinetic-method data with unitary linear regression analysis, and that the precision of the reactivity measurement is improved. The central element efficiency can be calculated from the reactivity. The results also show that the effect of the external neutron source on the reactivity measurement should be considered when the reactor power is low and the external source is strong. (authors)
DEFF Research Database (Denmark)
Fauser, Patrik; Thomsen, Marianne
2010-01-01
This paper proposes a simple method for estimating emissions and predicted environmental concentrations (PECs) in water and air for organic chemicals that are used in household products and industrial processes. The method has been tested on existing data for 63 organic high-production volume chemicals available in the European Chemicals Bureau risk assessment reports (RARs). The method suggests a simple linear relationship between Henry's Law constant, octanol-water coefficient, use and production volumes, and emissions and PECs on a regional scale in the European Union. Emissions and PECs are a result of a complex interaction between chemical properties, production and use patterns and geographical characteristics. A linear relationship cannot capture these complexities; however, it may be applied at a cost-efficient screening level for suggesting critical chemicals that are candidates for an in-depth risk assessment. Uncertainty measures are not available for the RAR data; however, uncertainties for the applied regression models are given in the paper. Evaluation of the methods reveals that between 79% and 93% of all emission and PEC estimates are within one order of magnitude of the reported RAR values. Bearing in mind that the domain of the method comprises organic industrial high-production volume chemicals, four chemicals, prioritized in the Water Framework Directive and the Stockholm Convention on Persistent Organic Pollutants, were used to test the method for estimated emissions and PECs, with corresponding uncertainty intervals, in air and water at regional EU level.
DEFF Research Database (Denmark)
Cheng, Yongcun; Andersen, Ole Baltazar
2010-01-01
The Sea Level Thematic Assembly Center in the EU FP7 MyOcean project aims to build a sea level service for multiple satellite sea level observations at a European level for GMES marine applications. It aims to improve sea-level-related products to guarantee the sustainability and quality of the GMES marine core service. One such added value will be a multivariate regression model of sea level variability from multi-satellite and in-situ tide gauge observations, aiming at improved future high spatial and temporal sea level prediction, e.g. for human safety. Tide gauge and satellite altimetry data from the last seventeen years have been compared for an area around the UK, and temporal correlation coefficients between them were calculated. The results are extremely encouraging: the detided signal from the response method correlates at more than 90% with satellite altimetry for nearly all tide gauge stations.
Scientific Electronic Library Online (English)
Carlos A, Díaz-Contreras; Alejandra, Aguilera-Rojas; Nathaly, Guillén-Barrientos.
2014-10-01
The incorporation of new personnel, or the reallocation of existing personnel to specific tasks, is an important decision, since its correctness will determine the survival of the company. In this context it becomes relevant to have a personnel selection model that considers the ambiguous information and degrees of uncertainty associated with assessing the qualitative evaluations of applicants, and that can deliver accurate and precise results, thus ensuring good performance in the position and reducing the risk that comes with incorporating new people. In this work, a personnel selection model under conditions of uncertainty was developed by applying fuzzy logic, using as input the job descriptions of a retail company, with overlapping triangular fuzzy variables. It was compared with a classical multiple regression model. The results showed that, in this case, the multiple regression model is more efficient than the chosen fuzzy logic model.
Baghi, Q; Bergé, J; Christophe, B; Touboul, P; Rodrigues, M
2015-01-01
The analysis of physical measurements often copes with highly correlated noise and interruptions caused by outliers, saturation events or transmission losses. We assess the impact of missing data on the performance of linear regression analysis involving the fit of modeled or measured time series. We show that data gaps can significantly degrade the precision of the regression parameter estimation in the presence of colored noise, due to frequency leakage of the noise power. We present a regression method which cancels this effect and estimates the parameters of interest with a precision comparable to the complete-data case, even if the noise power spectral density (PSD) is not known a priori. The method is based on an autoregressive (AR) fit of the noise, which allows us to build an approximate generalized least squares estimator approaching the minimal variance bound. The method, which can be applied to any similar data processing, is tested on simulated measurements of the MICROSCOPE space mission, whos...
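The AR-fit-then-GLS idea can be illustrated at its simplest: an AR(1) noise model and no data gaps. This is a one-lag Cochrane-Orcutt sketch on synthetic data, not the MICROSCOPE pipeline, and all names and numbers are assumptions:

```python
import numpy as np

def ar1_gls(X, y):
    """Approximate GLS with AR(1) noise: fit OLS, estimate the AR coefficient
    rho from the residuals, quasi-difference the data (Cochrane-Orcutt) so the
    transformed noise is nearly white, and refit by least squares."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_ols
    rho = (resid[:-1] @ resid[1:]) / (resid[:-1] @ resid[:-1])  # lag-1 AR fit
    Xs, ys = X[1:] - rho * X[:-1], y[1:] - rho * y[:-1]
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return rho, beta

# Linear trend buried in strongly colored (AR(1), rho = 0.8) noise
rng = np.random.default_rng(2)
n = 2000
t = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n), t])
e = np.zeros(n)
for i in range(1, n):
    e[i] = 0.8 * e[i - 1] + rng.normal(scale=0.1)
y = X @ np.array([1.0, 2.0]) + e
rho, beta = ar1_gls(X, y)
```

The method in the abstract generalizes this along two axes: a higher-order AR model of the noise PSD, and a treatment of the gap-induced frequency leakage that a plain quasi-differencing step does not address.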
Wang, Chong; Sun, Qun; Wahab, Magd Abdel; Zhang, Xingyu; Xu, Limin
2015-09-01
Rotary cup brushes mounted on each side of a road sweeper undertake heavy debris removal tasks, but their characteristics have not been well understood until recently. A Finite Element (FE) model that can analyze brush deformation and predict brush characteristics has been developed to investigate sweeping efficiency and to assist controller design. However, the FE model requires a large amount of CPU time to simulate each brush design and operating scenario, which may limit its application in a real-time system. This study develops a mathematical regression model to summarize the FE-modeled results. The complex brush load characteristic curves were statistically analyzed to quantify the effects of cross-section, length, mounting angle, displacement, rotational speed, etc. The data were then fitted by a multiple-variable regression model using the maximum likelihood method. The fitted results showed good agreement with the FE analysis results and experimental results, suggesting that the mathematical regression model may be used directly in a real-time system to predict the characteristics of different brushes under varying operating conditions. The methodology may also be used in the design and optimization of rotary brush tools. PMID:26123978
Partially linear censored quantile regression
Neocleous, T.; Portnoy, S.
2009-01-01
Censored regression quantile (CRQ) methods provide a powerful and flexible approach to the analysis of censored survival data when standard linear models are felt to be appropriate. In many cases however, greater flexibility is desired to go beyond the usual multiple regression paradigm. One area of common interest is that of partially linear models: one (or more) of the explanatory covariates are assumed to act on the response through a non-linear function. Here the CRQ approach of Portnoy (...
Energy Technology Data Exchange (ETDEWEB)
Han, Bing; Jing, Hongyuan; Liu, Jianping; Wu, Zhangzhong [PetroChina Pipeline RandD Center, Langfang, Hebei (China); Hao, Jianbin [School of Petroleum Engineering, Southwest Petroleum University, Chengdu, Sichuan (China)
2010-07-01
Landslides seriously affect the integrity of oil and gas pipelines in the tough terrain of Western China. This paper introduces a method for solving the axial stress, using numerical simulation and regression analysis, for pipelines subjected to landslides. Numerical simulation is performed to analyze how pipe stresses vary with five vulnerability assessment indexes: the distance between the pipeline and the landslide tail; the thickness of the landslide; the inclination angle of the landslide; the pipeline length passing through the landslide; and the buried depth of the pipeline. A pipeline passing through a landslide in southwest China was selected as an example to verify the feasibility and effectiveness of this method. The method is practically applicable, but large numbers of examples would be needed to better verify its reliability, and it should be modified accordingly. It also only considers the case where the pipeline is perpendicular to the primary slip direction of the landslide.
Quantile regression for the statistical analysis of immunological data with many non-detects
Directory of Open Access Journals (Sweden)
Eilers Paul HC
2012-07-01
Background: Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Methods and results: Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an application to real data from a clinical trial. We show that by using quantile regression, groups can be compared and meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Conclusion: Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.
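Quantile regression can be written as a small linear program, which makes the tolerance to extreme values concrete. A minimal sketch, assuming SciPy is available; the data are synthetic and heavy-tailed, not the trial data:

```python
import numpy as np
from scipy.optimize import linprog

def quantreg(X, y, tau):
    """Quantile regression as a linear program: minimize the pinball loss
    sum(tau*u + (1-tau)*v) subject to X@b + u - v = y, with u, v >= 0."""
    n, p = X.shape
    c = np.concatenate([np.zeros(p), np.full(n, tau), np.full(n, 1.0 - tau)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]

# Heavy-tailed synthetic data; fit the median and the 0.9 quantile lines
rng = np.random.default_rng(5)
x = rng.uniform(0.0, 10.0, 300)
X = np.column_stack([np.ones(300), x])
y = 1.0 + 2.0 * x + rng.standard_t(df=3, size=300)
b50 = quantreg(X, y, 0.5)   # median line
b90 = quantreg(X, y, 0.9)   # 90th-percentile line
```

Because the median fit only depends on which side of the line each point falls, replacing the lowest observations by any value below the detection limit leaves `b50` unchanged, which is exactly why the method tolerates many non-detects.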
Cherry, Kevin M; Peplinski, Brandon; Kim, Lauren; Wang, Shijun; Lu, Le; Zhang, Weidong; Liu, Jianfei; Wei, Zhuoshi; Summers, Ronald M
2015-01-01
Given the potential importance of marginal artery localization in automated registration in computed tomography colonography (CTC), we have devised a semi-automated method of marginal vessel detection employing sequential Monte Carlo tracking (also known as particle filtering tracking) by multiple cue fusion based on intensity, vesselness, organ detection, and minimum spanning tree information for poorly enhanced vessel segments. We then employed a random forest algorithm for intelligent cue fusion and decision making which achieved high sensitivity and robustness. After applying a vessel pruning procedure to the tracking results, we achieved statistically significantly improved precision compared to a baseline Hessian detection method (2.7% versus 75.2%, p<0.001). This method also showed statistically significantly improved recall rate compared to a 2-cue baseline method using fewer vessel cues (30.7% versus 67.7%, p<0.001). These results demonstrate that marginal artery localization on CTC is feasible by combining a discriminative classifier (i.e., random forest) with a sequential Monte Carlo tracking mechanism. In so doing, we present the effective application of an anatomical probability map to vessel pruning as well as a supplementary spatial coordinate system for colonic segmentation and registration when this task has been confounded by colon lumen collapse. PMID:25461335
Scientific Electronic Library Online (English)
E, CORNWELL.
2006-03-01
In the QSRR discipline, a novel, easy-to-use parameter (Vc) was designed to evaluate classical topological indices (W, ¹chi, Z, MTI) and two new-generation ones (Xu, ¹chih). Regression between Vc and ¹chih presented a correlation index (r) of 0.9992, a surprisingly high value in comparison with those commonly found in the QSPR/QSAR discipline. Through the Vc parameter, an approach to multiple regression with three independent variables is presented. A model set of 35 saturated hydrocarbons was used.
Automatic regression analysis for use in a complex system of evaluation of plant genetic resources
Directory of Open Access Journals (Sweden)
Attila T. SZABO
1984-08-01
In accordance with the general requirements regarding computerization in gene banks and germplasm research, a computer program has been compiled for the analysis of univariate response in crop germplasm evaluation. The program is written in COBOL and runs on a FELIX C-256 computer. The different modules of the program allow for: (1) data control and error listing; (2) computation of the regression function; (3) listing of the differences between measured and computed values; (4) sorting of the individual samples; (5) construction of two-dimensional scattergrams of measured values with simultaneous representation of the regression line; (6) listing of examined samples in the sequence required in evaluation.
Methods and applications of linear models regression and the analysis of variance
Hocking, Ronald R
2013-01-01
Praise for the Second Edition"An essential desktop reference book . . . it should definitely be on your bookshelf." -Technometrics A thoroughly updated book, Methods and Applications of Linear Models: Regression and the Analysis of Variance, Third Edition features innovative approaches to understanding and working with models and theory of linear regression. The Third Edition provides readers with the necessary theoretical concepts, which are presented using intuitive ideas rather than complicated proofs, to describe the inference that is appropriate for the methods being discussed. The book
Fast algorithm of the robust Gaussian regression filter for areal surface analysis
International Nuclear Information System (INIS)
In this paper, the general model of the Gaussian regression filter for areal surface analysis is explored. The intrinsic relationships between the linear Gaussian filter and the robust filter are addressed, and a general mathematical solution for this model is presented. Based on this technique, a fast algorithm is created. Both simulated and practical engineering data (stochastic and structured) have been used in testing the fast algorithm. Results show that, with the same accuracy, the processing time of second-order nonlinear regression filters for a dataset of 1024 × 1024 points has been reduced from several hours with traditional algorithms to several seconds.
Grégoire, G.
2014-01-01
Logistic regression is originally intended to explain the relationship between the probability of an event and a set of covariables. The model's coefficients can be interpreted via odds and odds ratios, which are presented in the introduction of the chapter. The observations may be obtained individually, in which case we speak of binary logistic regression; when they are grouped, the logistic regression is said to be binomial. In our presentation we mainly focus on the binary case. For statistical inference the main tool is the maximum likelihood methodology: we present the Wald, Rao and likelihood ratio results and their use to compare nested models. The problems we intend to deal with are essentially the same as in multiple linear regression: testing a global effect, testing individual effects, selection of variables to build a model, measuring the fit of the model, prediction of new values, etc. The methods are demonstrated on data sets using R. Finally we briefly consider the binomial case and the situation where we are interested in several events, that is, polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
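The maximum-likelihood fit behind these tests is a short Newton-Raphson loop. The chapter uses R; the sketch below is an equivalent NumPy illustration on synthetic data, with assumed coefficient values:

```python
import numpy as np

def logit_fit(X, y, n_iter=25):
    """Maximum-likelihood fit of binary logistic regression by Newton-Raphson
    (equivalently IRLS). exp(beta) gives the odds ratios used for interpretation."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        W = p * (1.0 - p)                        # IRLS weights
        # Newton step: beta += (X' W X)^{-1} X' (y - p)
        beta = beta + np.linalg.solve((X * W[:, None]).T @ X, X.T @ (y - p))
    return beta

# Synthetic binary outcome with known effect: log-odds = -0.5 + 1.2 x
rng = np.random.default_rng(3)
x = rng.normal(size=5000)
X = np.column_stack([np.ones(5000), x])
y = (rng.random(5000) < 1.0 / (1.0 + np.exp(0.5 - 1.2 * x))).astype(float)
beta = logit_fit(X, y)
odds_ratio = np.exp(beta[1])   # multiplicative change in odds per unit of x
```

The Wald test mentioned above divides each coefficient by its standard error, obtained from the inverse of the same `X' W X` information matrix at convergence.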
Progressive methods in multiple criteria decision analysis
Meyer, Patrick
2007-01-01
Our work mainly focuses on the study and development of progressive methods in the field of Multiple Criteria Decision Analysis, i.e., iterative procedures presenting partial conclusions to the Decision Maker that can be refined at further steps of the analysis. The thesis is divided into three parts. The first is intended as a general analysis of the concept of progressiveness. The last two parts develop progressive methods related first to Multiattribute Value Theory and sec...
Pradhan, B.; Buchroithner, M. F.; Mansor, S.
2009-04-01
This paper presents the assessment results of three spatially based probabilistic models using Geoinformation Techniques (GIT) for landslide susceptibility analysis on Penang Island in Malaysia. Landslide locations within the study area were identified by interpreting aerial photographs and satellite images, supported by field surveys. Maps of the topography, soil type, lineaments and land cover were constructed from the spatial data sets. Nine landslide-related factors were extracted from the spatial database, and the neural network, frequency ratio and logistic regression coefficients of each factor were computed. Landslide susceptibility maps were drawn for the study area using neural network, frequency ratio and logistic regression models. For verification, the results of the analyses were compared with actual landslide locations in the study area. The verification results show that the frequency ratio model provides higher prediction accuracy than the ANN and logistic regression models.
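Of the three models, the frequency ratio is the simplest: for each class of a factor map, the share of landslide cells in the class divided by the share of all cells in the class. An illustrative sketch on synthetic raster data (not the Penang dataset; class labels and counts are invented):

```python
import numpy as np

def frequency_ratio(factor_class, landslide):
    """Frequency ratio of each factor class: share of landslide cells in the
    class divided by the share of all cells in the class. FR > 1 means the
    class is over-represented among landslides (higher susceptibility)."""
    fr = {}
    for c in np.unique(factor_class):
        in_c = factor_class == c
        fr[c] = (landslide[in_c].sum() / landslide.sum()) / (in_c.sum() / in_c.size)
    return fr

# Hypothetical raster flattened to 1D: class 1 (e.g. steep slopes) hosts
# 90 of 100 landslide cells but only half of all cells.
cls = np.array([0] * 500 + [1] * 500)
slides = np.zeros(1000)
slides[:10] = 1          # 10 landslide cells on class 0
slides[500:590] = 1      # 90 landslide cells on class 1
fr = frequency_ratio(cls, slides)   # fr[1] = 1.8, fr[0] = 0.2
```

A susceptibility map is then obtained by summing, for each cell, the FR values of the classes it falls into across all factor maps.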
Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen
2014-01-01
It is estimated that mainland Chinese tourists travelling to Taiwan bring annual revenues of 400 billion NTD to the Taiwanese economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is a focus of concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis from grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as the dependent variable in a quantile regression analysis, to discuss the relationship between the dependent variable and the independent variables under different quantiles. Finally, this study discussed the predictive accuracy of the mean (least squares) regression model and each quantile regression model, as a reference for researchers. The analysis results showed that, in addition to occupation and age, other variables could also affect the overall satisfaction of mainland tourists. The overall predictive accuracy of the quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916
Gene-level pharmacogenetic analysis on survival outcomes using gene-trait similarity regression
Tzeng, Jung-Ying; Lu, Wenbin; Hsu, Fang-Chi
2014-01-01
Gene/pathway-based methods are drawing significant attention due to their usefulness in detecting rare and common variants that affect disease susceptibility. The biological mechanism of drug responses indicates that a gene-based analysis has even greater potential in pharmacogenetics. Motivated by a study from the Vitamin Intervention for Stroke Prevention (VISP) trial, we develop a gene-trait similarity regression for survival analysis to assess the effect of a gene or pat...
Anderson, Carl A; McRae, Allan F.; Visscher, Peter M.
2006-01-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using...
Todem, David; Kim, KyungMann; Fine, Jason; Peng, Limin
2010-01-01
We propose a family of regression models to adjust for nonrandom dropouts in the analysis of longitudinal outcomes with fully observed covariates. The approach conceptually focuses on generalized linear models with random effects. A novel formulation of a shared random effects model is presented and shown to provide a dropout selection parameter with a meaningful interpretation. The proposed semiparametric and parametric models are made part of a sensitivity analysis to delineate the range of...
DEFF Research Database (Denmark)
NØrgaard, Trine; MØldrup, Per
2014-01-01
Colloids are potential carriers for strongly sorbing chemicals in macroporous soils, but predicting the amount of colloids readily available for facilitated chemical transport is an unsolved challenge. This study addresses potential key parameters and predictive indicators for assessing colloid dispersibility and transport at the field scale. Samples representing three measurement scales (1-2 mm aggregates, intact 100 cm3 rings, and intact 6283 cm3 columns) were retrieved from the topsoil of a 1.69 ha agricultural field in a 15 m × 15 m grid (65 locations) to determine soil dispersibility as well as 24 comparison parameters including textural, chemical, and structural (e.g. air permeability) soil properties. Soil dispersibility was determined (i) using a laser diffraction method on 1-2 mm aggregates equilibrated to an initial matric potential of -100 cm H2O, (ii) using end-over-end shaking of 6.06 cm (diam.) × 3.48 cm (height) intact soil rings equilibrated to an initial matric potential of -5 cm H2O, and (iii) as the accumulated amount of particles leached from 20 cm × 20 cm intact soil columns after 6.5 h (60 mm accumulated outflow). At all three scales, soil dispersibility was higher in samples collected from the northern part of the field, where the greatest leaching of pesticides was observed in a horizontal well at ~3.5 m depth during a 9-year monitoring program. This suggests that the three dispersibility methods used are all relevant for field-scale mapping of areas with enhanced risk of colloid-facilitated transport. Subsequently, using multiple linear regression (MLR) analyses, soil dispersibility was predicted at all three sample scales from the 24 measured, geo-referenced parameters to produce sets of only a few promising indicator parameters for evaluating soil stability and particle mobilization at the field scale. The MLR analyses at each scale were separated into predictions using all locations, only northern locations, and only southern locations in the field. We found that different independent variables were included in the regression models when the sample scale increased from aggregate to column level. Generally, the predictive power of the regression models was better at the 1-2 mm aggregate scale than at the intact 100 cm3 and 20 cm × 20 cm scales. Overall, the results suggested that different drivers controlled soil dispersibility at the three scales and in the two sub-areas of the field. Predictions of soil dispersibility and the risk of colloid-facilitated chemical transport will therefore need to be highly scale- and area-specific.
Bagher Arayesh; Sayed J. Hosseini
2010-01-01
Problem statement: The purpose of this study was the regression analysis of factors affecting people's participation in protecting, revitalizing, developing and using renewable natural resources in Ilam province. Approach: This study was a causal-comparative and applied one. The sample was taken from natural resources users, with a sample size of 317 users. For sample selection, stratified, cluster and multiple sampling were utilized. The main tools for gathering...
Regression analysis to predict growth performance from dietary net energy in growing-finishing pigs.
Nitikanchana, S; Dritz, S S; Tokach, M D; DeRouchey, J M; Goodband, R D; White, B J
2015-06-01
Data from 41 trials with multiple energy levels (285 observations) were used in a meta-analysis to predict growth performance based on dietary NE concentration. Nutrient and energy concentrations in all diets were estimated using the NRC ingredient library. Predictor variables examined for best-fit models using the Akaike information criterion included linear and quadratic terms of NE, BW, CP, standardized ileal digestible (SID) Lys, crude fiber, NDF, ADF, fat, ash, and their interactions. The initial best-fit models included interactions between NE and CP or SID Lys. After removal of the observations that fed SID Lys below the suggested requirement, these terms were no longer significant. Including dietary fat in the model with NE and BW significantly improved the G:F prediction model, indicating that NE may underestimate the influence of fat on G:F. The meta-analysis indicated that, as long as diets are adequate in other nutrients (i.e., Lys), dietary NE is adequate to predict changes in ADG across different dietary ingredients and conditions. The analysis indicates that ADG increases with increasing dietary NE and BW but decreases when BW is above 87 kg. The G:F ratio improves with increasing dietary NE and fat but decreases with increasing BW. The regression equations were then evaluated by comparing the actual and predicted performance of 543 finishing pigs in 2 trials fed 5 dietary treatments, which included 3 different levels of NE obtained by adding wheat middlings, soybean hulls, dried distillers grains with solubles (DDGS; 8 to 9% oil), or choice white grease (CWG) to a corn-soybean meal-based diet. Diets were 1) 30% DDGS, 20% wheat middlings, and 4 to 5% soybean hulls (low energy); 2) 20% wheat middlings and 4 to 5% soybean hulls (low energy); 3) a corn-soybean meal diet (medium energy); 4) diet 2 supplemented with 3.7% CWG to equalize the NE level to diet 3 (medium energy); and 5) a corn-soybean meal diet with 3.7% CWG (high energy).
Only small differences were observed between predicted and observed values of ADG and G:F except for the low-energy diet containing the greatest fiber content (30% DDGS diet), where ADG and G:F were overpredicted by 3 to 6%. Therefore, the prediction equations provided a good estimation of the growth rate and feed efficiency of growing-finishing pigs fed different levels of dietary NE except for the pigs fed the low-energy diet containing the greatest fiber content. PMID:26115270
Directory of Open Access Journals (Sweden)
Baxter Lisa K
2008-05-01
Background: There is a growing body of literature linking GIS-based measures of traffic density to asthma and other respiratory outcomes. However, no consensus exists on which traffic indicators best capture variability in different pollutants or within different settings. As part of a study on childhood asthma etiology, we examined variability in outdoor concentrations of multiple traffic-related air pollutants within urban communities, using a range of GIS-based predictors and land use regression techniques. Methods: We measured fine particulate matter (PM2.5), nitrogen dioxide (NO2), and elemental carbon (EC) outside 44 homes representing a range of traffic densities and neighborhoods across Boston, Massachusetts and nearby communities. Multiple three- to four-day average samples were collected at each home during winters and summers from 2003 to 2005. Traffic indicators were derived using Massachusetts Highway Department data and direct traffic counts. Multivariate regression analyses were performed separately for each pollutant, using traffic indicators, land use, meteorology, site characteristics, and central site concentrations. Results: PM2.5 was strongly associated with the central site monitor (R2 = 0.68). Additional variability was explained by total roadway length within 100 m of the home, smoking or grilling near the monitor, and block-group population density (R2 = 0.76). EC showed greater spatial variability, especially during winter months, and was predicted by roadway length within 200 m of the home. The influence of traffic was greater under low wind speed conditions, and concentrations were lower during summer (R2 = 0.52). NO2 showed significant spatial variability, predicted by population density and roadway length within 50 m of the home, modified by site characteristics (obstruction), and with higher concentrations during summer (R2 = 0.56). Conclusion: Each pollutant examined displayed somewhat different spatial patterns within urban neighborhoods and was differently related to local traffic and meteorology. Our results indicate a need for multi-pollutant exposure modeling to disentangle causal agents in epidemiological studies, and further investigation of site-specific and meteorological modification of the traffic-concentration relationship in urban neighborhoods.
Zhang, Hui; Schaubel, Douglas E; Kalbfleisch, John D
2011-03-01
Case-cohort sampling is a commonly used and efficient method for studying large cohorts. Most existing methods of analysis for case-cohort data have concerned the analysis of univariate failure time data. However, clustered failure time data are commonly encountered in public health studies. For example, patients treated at the same center are unlikely to be independent. In this article, we consider methods based on estimating equations for case-cohort designs for clustered failure time data. We assume a marginal hazards model, with a common baseline hazard and common regression coefficient across clusters. The proposed estimators of the regression parameter and cumulative baseline hazard are shown to be consistent and asymptotically normal, and consistent estimators of the asymptotic covariance matrices are derived. The regression parameter estimator is easily computed using any standard Cox regression software that allows for offset terms. The proposed estimators are investigated in simulation studies, and demonstrated empirically to have increased efficiency relative to some existing methods. The proposed methods are applied to a study of mortality among Canadian dialysis patients. PMID:20560939
Application of Binary Regression Analysis in the Prescription Pattern of Antidepressants
Directory of Open Access Journals (Sweden)
Dr.Indrajit Banerjee, MBBS, MD
2013-05-01
Background: In Nepal several research studies are reported using percentages or cross-tabulation methods, but the use of logistic regression methodology in research lags behind among researchers. Objectives: The main objective of this study was to explore the role of logistic regression analysis in the prescription pattern of antidepressants in a tertiary care center among hospitalized patients in Western Nepal. Methods: A hospital-based study was done between 1st October 2009 and 31st March 2010 at the Psychiatry Ward of Manipal Teaching Hospital, Nepal. The Z test, Chi-square test and binary logistic regression were used for the analysis. We calculated odds ratios (OR) and their 95% confidence intervals (95% CI). P-value 10000, 2.63 times more in Hindus and 1.197 times more in Brahmins than in any other ethnic group; a 9.179 times greater tendency to prescribe antidepressants by trade name was found for unemployed compared with employed patients in Nepal. Conclusion: Binary logistic regression plays an important role in understanding the drug utilization pattern of mood elevators in Western Nepal.
MRI Texture Analysis in Multiple Sclerosis
Yunyan Zhang
2011-01-01
Multiple sclerosis (MS) is a complicated disease characterized by heterogeneous pathology that varies across individuals. Accurate identification and quantification of pathological changes may facilitate a better understanding of disease pathogenesis and progression and help identify novel therapies for MS patients. Texture analysis evaluates interpixel relationships that generate characteristic organizational patterns in an image, many of which are beyond the ability of visual perception. Gi...
Directory of Open Access Journals (Sweden)
Fereshteh Shiri
2010-08-01
Full Text Available In the present work, support vector machine (SVM) and multiple linear regression (MLR) techniques were used for quantitative structure–property relationship (QSPR) studies of the retention time (tR) in standardized liquid chromatography–UV–mass spectrometry of 67 mycotoxins (aflatoxins, trichothecenes, roquefortines and ochratoxins), based on molecular descriptors calculated from the optimized 3D structures. By applying missing-value, zero and multicollinearity tests with a cutoff value of 0.95, and the genetic algorithm method of variable selection, the most relevant descriptors were selected, and the MLR and SVM methods were employed to build the QSPR models. The robustness of the QSPR models was characterized by statistical validation and the applicability domain (AD). The prediction results from the MLR and SVM models are in good agreement with the experimental values. The correlation and predictability, measured by r2 and q2, are 0.931 and 0.932, respectively, for SVM and 0.923 and 0.915, respectively, for MLR. The applicability domain of the model was investigated using William's plot. The effects of the different descriptors on the retention times are described.
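The fit and cross-validated statistics reported above can be reproduced in outline as follows. This is a minimal sketch with synthetic data standing in for the 67 mycotoxins and their descriptors; the coefficients and noise level are assumptions for illustration, not values from the paper.

```python
# Sketch: r2 (fit) and q2 (leave-one-out cross-validated) for an MLR QSPR model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(67, 4))                      # 67 compounds, 4 selected descriptors
true_beta = np.array([2.0, -1.0, 0.5, 0.8])       # illustrative descriptor weights
t_r = X @ true_beta + rng.normal(scale=0.3, size=67)   # synthetic retention times

model = LinearRegression().fit(X, t_r)
r2 = r2_score(t_r, model.predict(X))              # correlation of the fit
q2 = r2_score(t_r, cross_val_predict(LinearRegression(), X, t_r,
                                     cv=LeaveOneOut()))  # predictability
print(f"r2 = {r2:.3f}, q2 = {q2:.3f}")
```

As in the abstract, q2 from cross-validation is the predictability check that guards against an over-optimistic fitted r2.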
Linard, Joshua I.
2013-01-01
Mitigating the effects of salt and selenium on water quality in the Grand Valley and lower Gunnison River Basin in western Colorado is a major concern for land managers. Previous modeling indicated means to improve the models by including more detailed geospatial data and a more rigorous method for developing the models. After evaluating all possible combinations of geospatial variables, four multiple linear regression models resulted that could estimate irrigation-season salt yield, nonirrigation-season salt yield, irrigation-season selenium yield, and nonirrigation-season selenium yield. The adjusted r-squared and the residual standard error (in units of log-transformed yield) of the models were, respectively, 0.87 and 2.03 for the irrigation-season salt model, 0.90 and 1.25 for the nonirrigation-season salt model, 0.85 and 2.94 for the irrigation-season selenium model, and 0.93 and 1.75 for the nonirrigation-season selenium model. The four models were used to estimate yields and loads from contributing areas corresponding to 12-digit hydrologic unit codes in the lower Gunnison River Basin study area. Each of the 175 contributing areas was ranked according to its estimated mean seasonal yield of salt and selenium.
International Nuclear Information System (INIS)
Prediction of the amount of hospital waste production will be helpful in the storage, transportation and disposal stages of hospital waste management. Based on this fact, two predictive models, artificial neural networks (ANNs) and multiple linear regression (MLR), were applied to predict the rate of medical waste generation, both in total and separately for the sharp, infectious and general categories. In this study, a 5-fold cross-validation procedure on a database of 50 hospitals of Fars province (Iran) was used to verify the performance of the models. Three performance measures, MAR, RMSE and R2, were used to evaluate the models. The MLR, as a conventional model, obtained poor values of the prediction performance measures. However, MLR distinguished hospital capacity and bed occupancy as the more significant parameters. On the other hand, ANNs, as a more powerful model that had not previously been introduced for predicting the rate of medical waste generation, showed high performance measure values, especially an R2 value of 0.99, confirming the good fit of the data. Such satisfactory results could be attributed to the non-linear nature of ANNs in problem solving, which provides the opportunity for relating independent variables to dependent ones non-linearly. In conclusion, the obtained results showed that our ANN-based modeling approach is very promising and may play a useful role in developing a better cost-effective strategy for waste management in the future.
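A minimal sketch of the MLR-versus-ANN comparison under 5-fold cross-validation follows. The data are synthetic stand-ins for the 50 hospitals (two predictors loosely labelled capacity and occupancy, with a deliberately non-linear response), and MAE is used where the paper reports MAR; every number here is illustrative.

```python
# Sketch: score MLR and a small neural network with 5-fold cross-validation.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import cross_val_predict, KFold
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(50, 2))               # 50 hospitals, 2 predictors
y = 5 * X[:, 0] * X[:, 1] + X[:, 0] ** 2 + rng.normal(scale=0.05, size=50)

cv = KFold(n_splits=5, shuffle=True, random_state=1)
scores = {}
for name, est in [("MLR", LinearRegression()),
                  ("ANN", MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000,
                                       random_state=1))]:
    pred = cross_val_predict(est, X, y, cv=cv)
    scores[name] = (mean_absolute_error(y, pred),
                    np.sqrt(mean_squared_error(y, pred)),
                    r2_score(y, pred))
for name, (mae, rmse, r2) in scores.items():
    print(f"{name}: MAE={mae:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
```

Because the response is non-linear in the predictors, the linear model's cross-validated scores illustrate the kind of gap the study attributes to the non-linear nature of ANNs.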
Analysis of the Evolution of the Gross Domestic Product by Means of Cyclic Regressions
Directory of Open Access Journals (Sweden)
Catalin Angelo Ioan
2011-08-01
Full Text Available In this article we carry out an analysis of the regularity of the Gross Domestic Product of a country, in our case the United States. The analysis is based on a new method – cyclic regressions built on the Fourier series of a function. Another point of view is to consider, instead of the growth rate of GDP, the speed of variation of this rate, computed as a numerical derivative. The obtained results show a 71-year cycle for this indicator, the mean square error being 0.93%. The method described allows a prognosis of short-term trends in GDP.
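The cyclic-regression idea, fitting a truncated Fourier series by ordinary least squares, can be sketched as follows. The series, noise level and number of harmonics are illustrative assumptions; only the 71-year cycle length is taken from the abstract.

```python
# Sketch: least-squares fit of a truncated Fourier series to an annual series.
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(80.0)                                  # years of observations
period = 71.0                                        # cycle length from the abstract
y = 2.0 + 1.5 * np.sin(2 * np.pi * t / period) \
        + 0.7 * np.cos(4 * np.pi * t / period) \
        + rng.normal(scale=0.1, size=t.size)         # synthetic indicator

# Design matrix: intercept plus a sine/cosine pair for each harmonic
harmonics = 2
cols = [np.ones_like(t)]
for k in range(1, harmonics + 1):
    cols += [np.sin(2 * np.pi * k * t / period),
             np.cos(2 * np.pi * k * t / period)]
A = np.column_stack(cols)

coef, *_ = np.linalg.lstsq(A, y, rcond=None)
fit = A @ coef
rmse = np.sqrt(np.mean((y - fit) ** 2))
print(f"RMSE = {rmse:.3f}")
```

Because the basis is linear in its coefficients, the "cyclic regression" reduces to an ordinary least-squares problem once the period is fixed.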
Regression And Time Series Analysis Of Loan Default At Minescho Cooperative Credit Union Tarkwa
Directory of Open Access Journals (Sweden)
Otoo
2015-08-01
Full Text Available Lending in the form of loans is a principal business activity for banks, credit unions and other financial institutions, and loans form a substantial portion of a bank's assets. However, when these loans are defaulted on, this tends to have serious effects on the financial institutions. This study sought to determine the trend of, and to forecast, loan default at Minescho Credit Union, Tarkwa. Secondary data from the Credit Union were analyzed using regression analysis and the Box-Jenkins method of time series analysis. From the regression analysis there was a moderately strong relationship between the amount of loan default and time, and the amount of loan default had an increasing trend. The two-year forecast of the amount of loan default oscillated initially and remained constant from 2016 onwards.
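The trend part of such an analysis can be sketched as follows; the loan-default series is synthetic, not the credit union's data, and the Box-Jenkins modelling step is omitted for brevity.

```python
# Sketch: least-squares trend of loan default over time, with an extrapolation.
import numpy as np

rng = np.random.default_rng(3)
months = np.arange(48.0)                        # four years of monthly data
default = 1000 + 55 * months + rng.normal(scale=400, size=48)  # synthetic amounts

b1, b0 = np.polyfit(months, default, 1)         # default = b0 + b1 * t
corr = np.corrcoef(months, default)[0, 1]       # strength of the time relationship
forecast = b0 + b1 * np.arange(48, 72)          # two-year extrapolation of the trend
print(f"slope = {b1:.1f} per month, r = {corr:.2f}")
```

A positive slope with a moderately strong correlation is the pattern the abstract describes; in practice the residuals would then be handed to an ARIMA-type model for the oscillating short-term component.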
Directory of Open Access Journals (Sweden)
Hüseyin BUDAK
2012-11-01
Full Text Available Credit scoring is a vital topic for banks, since there is a need to use limited financial resources more effectively. There are several credit scoring methods that are used by banks. One of them is to estimate whether a credit-demanding customer's repayment order will be regular or not. In this study, artificial neural networks and logistic regression analysis have been used to support banks' credit risk prediction and to estimate whether a credit-demanding customer's repayment order will be regular or not. The results of the study showed that the artificial neural networks method is more reliable than logistic regression analysis when estimating a credit-demanding customer's repayment order.
Anwar Fitrianto; Lee Ceng Yik
2014-01-01
When independent variables have high linear correlation in a multiple linear regression model, the analysis can be misleading if it is based on the common ordinary least squares (OLS) method. In this situation, a ridge regression estimator is suggested instead. We conduct a simulation study to compare the performance of the ridge regression estimator and OLS. We found that the Hoerl and Kennard ridge regression estimation method has better performan...
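The OLS-versus-ridge behaviour under multicollinearity can be illustrated with a short simulation; the data and the penalty value are assumptions for the sketch, not the study's settings.

```python
# Sketch: two nearly collinear predictors make OLS coefficients unstable,
# while ridge regression shrinks them toward stable, near-equal values.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(4)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)      # almost perfectly correlated
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.5, size=200)    # the truth involves x1 only

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)
print("OLS coefficients:  ", ols.coef_.round(2))    # can split wildly between x1, x2
print("Ridge coefficients:", ridge.coef_.round(2))  # near-equal, summing to about 3
```

OLS can attribute the effect arbitrarily between the collinear pair, with large offsetting coefficients; the ridge penalty forces a stable split whose sum still recovers the total effect.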
Scientific Electronic Library Online (English)
Celsemy E., Maia; Elís R.C. de, Morais; Maurício de, Oliveira.
2001-04-01
Full Text Available Objetivou-se, com o presente trabalho, desenvolver uma metodologia para classificação da composição iônica da água de irrigação, através da regressão linear múltipla, tendo-se, como variável dependente, a condutividade elétrica e, como variáveis independentes, as concentrações de cátions e ânions da água de irrigação, classificada de acordo com o peso de cada íon no modelo estatístico. A fonte secundária de dados para a pesquisa foi o Banco de Dados do Laboratório de Análise de Água e Fertilidade do Solo, da Escola Superior de Agricultura de Mossoró (LAAFS/ESAM). As regressões foram ajustadas utilizando-se o método da seleção por etapas, conhecido como the stepwise regression procedure, no qual a variável dependente foi a condutividade elétrica e, como variáveis independentes, os íons determinados pela análise físico-química da água. Os resultados mostraram que, empregando-se este critério de regressão linear múltipla, havia variação na contribuição de cada variável no modelo ajustado, cuja estimativa era baseada no aumento da soma de quadrado, devido à regressão, à medida em que se incorporava, ao modelo, cada variável independente. Em função de critérios preestabelecidos, águas provenientes de mananciais da região da Chapada do Apodi foram classificadas como cálcica-sódica, cálcica e cloretada, quando provinham de poço tubular, de poço amazonas e rio, respectivamente. As águas oriundas da região do Baixo Açu foram classificadas como sódica, magnesiana-sódica e sódica, para as águas de poço tubular, poço amazonas e rio, respectivamente. Abstract in english This work was conducted with the objective of developing a methodology for classification of the ionic composition of the irrigation water using multiple linear regression. A Stepwise Regression Analysis model was tested, using electrical conductivity as the dependent variable and the analyzed ions calcium, sodium, potassium, carbonate, bicarbonate and chlorides as the independent variables in all tested models. All water samples were collected by the farmers of the region where this work was conducted. The regression models were adjusted using the water analysis database from the ESAM's Analysis Laboratory (Laboratório de Análises de Água e Fertilidade do Solo da Escola Superior de Agricultura de Mossoró - LAAFS/ESAM). The linear model, adjusted using the Stepwise Regression Procedure, shows that the degree of adjustment of each tested model depends upon the geological formation of the watersheds and whether the water is collected in a river or in tubular wells. The classification of the water in the calcareous region of the Chapada do Apodi is calcic-sodic, calcic or chloride if the source was a tubular well, a piezometric well (drilled in unconfined water, denominated in the region as poço amazonas) or surface river and lagoon water, respectively. In the Baixo Açu region, these waters were classified as sodic, magnesian-sodic or sodic depending on whether the source was a tubular well (drilled in the Açu sedimentary geological formation), a piezometric well or superficial water, respectively.
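The stepwise (forward) selection used here can be sketched as follows, adding at each step the ion that most increases the regression sum of squares; the ion concentrations, effect sizes and the stopping cutoff are illustrative assumptions, not values from the study.

```python
# Sketch: forward stepwise selection of ions as predictors of conductivity.
import numpy as np

rng = np.random.default_rng(12)
n = 120
ions = {
    "Na": rng.uniform(0, 20, n),
    "Ca": rng.uniform(0, 10, n),
    "Cl": rng.uniform(0, 25, n),
    "K":  rng.uniform(0, 2, n),
}
ec = 0.9 * ions["Na"] + 0.7 * ions["Cl"] + 0.3 * ions["Ca"] \
     + rng.normal(scale=0.5, size=n)            # K is deliberately irrelevant

def rss(X, y):
    """Residual sum of squares of an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y))] + X)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sum((y - A @ beta) ** 2)

selected, remaining = [], list(ions)
current_rss = np.sum((ec - ec.mean()) ** 2)
while remaining:
    trial = {v: rss([ions[s] for s in selected] + [ions[v]], ec)
             for v in remaining}
    best = min(trial, key=trial.get)
    if current_rss - trial[best] < 0.05 * current_rss:  # illustrative cutoff
        break                                           # negligible gain: stop
    selected.append(best)
    remaining.remove(best)
    current_rss = trial[best]
print("entered the model in order:", selected)
```

Each variable's contribution is judged by the increase in the regression sum of squares when it enters, mirroring the criterion described in the abstract.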
Coherence Motivated Sampling and Convergence Analysis of Least-Squares Polynomial Chaos Regression
Hampton, Jerrad; Doostan, Alireza
2014-01-01
Independent sampling of orthogonal polynomial bases via Monte Carlo is of interest for uncertainty quantification of models, using Polynomial Chaos (PC) expansions. It is known that bounding the spectral radius of a random matrix consisting of PC samples, yields a bound on the number of samples necessary to identify coefficients in the PC expansion via solution to a least-squares regression problem. We present a related analysis which guarantees a mean square convergence usi...
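A one-dimensional sketch of least-squares polynomial chaos regression follows, using probabilists' Hermite polynomials for a standard normal input; the expansion order, sample size and coefficients are illustrative assumptions.

```python
# Sketch: recover PC coefficients from Monte Carlo samples by least squares.
import numpy as np
from numpy.polynomial.hermite_e import hermevander

rng = np.random.default_rng(13)
order, n_samples = 3, 400
xi = rng.normal(size=n_samples)                  # Monte Carlo input samples
true_coef = np.array([1.0, 0.5, 0.25, 0.0])      # PC coefficients for He_0..He_3
y = hermevander(xi, order) @ true_coef           # noise-free model evaluations

Psi = hermevander(xi, order)                     # matrix of PC basis samples
coef_hat, *_ = np.linalg.lstsq(Psi, y, rcond=None)
print(coef_hat.round(3))
```

The conditioning of `Psi`, governed by the spectral-radius argument in the abstract, is what determines how many samples the least-squares recovery needs.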
Murtagh, Fionn; Spagat, Michael; A. Restrepo, Jorge
2009-01-01
We first pursue the study of how hierarchy provides a well-adapted tool for the analysis of change. Then, using a time sequence-constrained hierarchical clustering, we develop the practical aspects of a new approach to wavelet regression. This provides a new way to link hierarchical relationships in a multivariate time series data set with external signals. Violence data from the Colombian conflict in the years 1990 to 2004 is used throughout. We conclude with some proposals...
Estimation of Output Disturbance in Auto-Regressive Model via Independent Component Analysis
Tanaka, R.; Kawaguchi, K.; Endo, J.; Shibasaki, H.; Hikichi, Y.; Ishida, Y.
2013-01-01
This paper explains and demonstrates how to estimate an output disturbance in an auto-regressive model. This method uses the independent component analysis (ICA) technique, which restores source signals from their linear mixtures under the assumption that the source signals are mutually independent. The estimation is achieved by a model whose source signals consist of input and output disturbance, and observed signals consist of input and output. To solve the ICA problem, a natural gradient m...
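The ICA step can be sketched with scikit-learn's FastICA, a stand-in for the natural-gradient algorithm the paper uses; the sources (an input and a disturbance) and the mixing matrix are illustrative assumptions.

```python
# Sketch: recover two mutually independent source signals from linear mixtures.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(11)
t = np.linspace(0, 8, 2000)
s_input = np.sign(np.sin(3 * t))                 # square-wave input signal
s_dist = rng.laplace(size=t.size)                # non-Gaussian disturbance
S = np.column_stack([s_input, s_dist])

A = np.array([[1.0, 0.5], [0.3, 1.0]])           # assumed mixing matrix
X = S @ A.T                                      # observed (mixed) signals

ica = FastICA(n_components=2, max_iter=1000, random_state=0)
S_hat = ica.fit_transform(X)                     # estimates, up to scale and order
corr = np.abs(np.corrcoef(S.T, S_hat.T)[:2, 2:]) # match estimates to true sources
print(corr.round(2))
```

ICA recovers the sources only up to permutation and scaling, so each true source is matched to whichever estimate it correlates with most strongly.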
Automatic regression analysis for use in a complex system of evaluation of plant genetic resources
Attila T. SZABO; Cs. ARKOSSY
1984-01-01
In accordance with the general requirements regarding computerization in gene banks and germplasm research, a computer program has been compiled for the analysis of univariate response in crop germplasm evaluation. The program is written in COBOL and runs on a FELIX C-256 computer. The different modules of the program allow for: (1) data control and error listing; (2) computation of the regression function; (3) listing of the differences between the values measured and computed; (4) sorting o...
The effects of exchange rate variability on international trade: a Meta-Regression Analysis
Ćorić, Bruno; Pugh, Geoffrey Thomas
2008-01-01
The trade effects of exchange rate variability have been an issue in international economics for the past 30 years. The contribution of this paper is to apply meta-regression analysis (MRA) to this empirical literature. On average, exchange rate variability exerts a negative effect on international trade. Yet MRA confirms the view that this result is highly conditional, by identifying factors that help to explain why estimated trade effects vary from significantly negative ...
Real Estate and the Stock Market: A Meta-Regression Analysis
GURDGIEV, CONSTANTIN; LUCEY, BRIAN MICHAEL
2011-01-01
The real estate finance literature provides diverse and contradictory findings regarding the relationship between the real estate market and the stock market. Despite the importance of this relationship to the economy in general, relatively little is known of what causes such differences. In this paper, through applying the technique of meta-regression analysis to the empirical studies in the area, a significant step is made towards objectively integrating and synthesising the results and ident...
Chamroukhi, Faicel; Glotin, Hervé; Samé, Allou
2013-01-01
In this paper, we study the modeling and the classification of functional data presenting regime changes over time. We propose a new model-based functional mixture discriminant analysis approach based on a specific hidden process regression model that governs the regime changes over time. Our approach is particularly adapted to handle the problem of complex-shaped classes of curves, where each class is potentially composed of several sub-classes, and to deal with the regime ...
Czech Academy of Sciences Publication Activity Database
Trnka, M.; Žalud, Z.; Semerádová, Daniela; Dubrovský, Martin
Brno : Česká bioklimatologická společnost, 2002 - (Rožnovský, J.; Litschmann, T.), s. - ISBN 80-85813-99-8. [14. Česko-slovenská bioklimatologická konference. Lednice na Moravě (CZ), 02.09.2002-04.09.2002] R&D Projects: GA ČR GA521/02/0827 Institutional research plan: CEZ:AV0Z3042911 Keywords: regression analysis * spring barley Subject RIV: DG - Atmosphere Sciences, Meteorology
Ruman, M; Olkowska, E; Kozioł, K; Absalon, D; Matysik, M; Polkowska, Ż
2014-03-01
Monitoring contamination in river water is an expensive procedure, particularly for developing countries where pollution is a significant problem. This study was conducted to provide a pollution monitoring strategy that reduces the cost of laboratory analysis. The new monitoring strategy was designed as a result of cluster and regression analysis on field data collected from an industrially influenced river. Pollution sources in the study site were coal mining, metallurgy, the chemical industry, and metropolitan sewage. This river resembles those in other areas of the world, including developing countries where environmental monitoring is financially constrained. Data were collected on the variability of contaminant concentrations during four seasons at the same points on tributaries of the river. The variables described in the study are pH, electrical conductivity, inorganic ions, trace elements, and selected organic pollutants. These variables were divided into groups using cluster analysis. These groups were then tested using regression models to identify how the behavior of one variable changes in relation to another. It was found that up to 86.8% of the variability of one parameter could be determined by another in the dataset. We adopted 60, 65, and 70% determination levels (R2) for accepting a regression model. As a result, monitoring could be reduced by 15 variables (60% level) or 10 variables (65 and 70% levels) out of 43, which comprises 35 and 23% of the monitored variable total. Cost reduction would be most effective if trace elements or organic pollutants were excluded from monitoring, because these are the constituents most expensive to analyze. PMID:25602676
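The reduction logic, dropping a variable whenever a retained variable predicts it above a chosen determination level, can be sketched as follows with synthetic stand-in variables (a conductivity-chloride link built in, and an unrelated nitrate series).

```python
# Sketch: drop from monitoring any variable predictable from a retained one.
import numpy as np

rng = np.random.default_rng(5)
n = 60
cond = rng.uniform(200, 1500, n)                      # electrical conductivity
chloride = 0.4 * cond + rng.normal(scale=40, size=n)  # strongly tied to EC
nitrate = rng.uniform(0, 30, n)                       # unrelated variable
data = {"conductivity": cond, "chloride": chloride, "nitrate": nitrate}

threshold = 0.65                 # one of the adopted determination levels (R^2)
retained, redundant = [], []
for name in data:
    # Simple linear regression R^2 equals the squared correlation coefficient
    if any(np.corrcoef(data[name], data[kept])[0, 1] ** 2 >= threshold
           for kept in retained):
        redundant.append(name)   # predictable from a retained variable: drop
    else:
        retained.append(name)
print("keep monitoring:", retained)
print("can be dropped:", redundant)
```

In the study the candidate pairs come from cluster analysis rather than an exhaustive scan, but the acceptance rule for a regression model is the same determination-level test.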
Buston, Peter M; Elith, Jane
2011-05-01
1. Central questions of behavioural and evolutionary ecology are: what factors influence the reproductive success of dominant breeders and of subordinate nonbreeders within animal societies? A complete understanding of any society requires that these questions be answered for all individuals. 2. The clown anemonefish, Amphiprion percula, forms simple societies that live in close association with sea anemones, Heteractis magnifica. Here, we use data from a well-studied population of A. percula to determine the major predictors of reproductive success of dominant pairs in this species. 3. We analyse the effect of multiple predictors on four components of reproductive success, using a relatively new technique from the field of statistical learning: boosted regression trees (BRTs). BRTs have the potential to model complex relationships in ways that give powerful insight. 4. We show that the reproductive success of dominant pairs is unrelated to the presence, number or phenotype of nonbreeders. This is consistent with the observation that nonbreeders do not help or hinder breeders in any way, confirming and extending the results of a previous study. 5. Primarily, reproductive success is negatively related to male growth and positively related to breeding experience. It is likely that these effects are interrelated, because males that grow a lot have little breeding experience. These effects are indicative of a trade-off between male growth and parental investment. 6. Secondarily, reproductive success is positively related to female growth and size. In this population, female size is also positively related to group size and anemone size. These positive correlations among traits are likely caused by variation in site quality and are suggestive of a silver-spoon effect. 7. Notably, whereas reproductive success is positively related to female size, it is unrelated to male size.
This observation provides support for the size advantage hypothesis for sex change: both individuals maximize their reproductive success when the larger individual adopts the female tactic. 8. This study provides the most complete picture to date of the factors that predict the reproductive success of dominant pairs of clown anemonefish and illustrates the utility of BRTs for analysis of complex behavioural and evolutionary ecology data. PMID:21284624
A regression analysis of the effect of energy use in agriculture
International Nuclear Information System (INIS)
This study investigates the impact of energy use on the productivity of Turkey's agriculture. It reports the results of a regression analysis of the relationship between energy use and agricultural productivity. The study is based on the analysis of yearbook data for the period 1971-2003. Agricultural productivity was specified as a function of energy consumption (TOE) and gross additions of fixed assets during the year. Least squares (LS) estimation was employed to estimate the equation parameters. The data for this study come from the State Institute of Statistics (SIS) and the Ministry of Energy of Turkey.
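The least-squares specification can be sketched as follows; the series are synthetic, not the SIS/Ministry yearbook data, and the coefficients are illustrative assumptions.

```python
# Sketch: LS fit of productivity on energy consumption and fixed-asset additions.
import numpy as np

rng = np.random.default_rng(7)
years = 33                                       # annual data, 1971-2003
energy = np.linspace(5, 20, years) + rng.normal(scale=0.5, size=years)   # TOE-like
assets = rng.uniform(1, 4, years)                # gross additions of fixed assets
productivity = 1.2 * energy + 2.5 * assets + rng.normal(scale=0.8, size=years)

A = np.column_stack([np.ones(years), energy, assets])
coef, *_ = np.linalg.lstsq(A, productivity, rcond=None)
b0, b_energy, b_assets = coef
print(f"productivity = {b0:.2f} + {b_energy:.2f}*energy + {b_assets:.2f}*assets")
```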
Baghi, Quentin; Métris, Gilles; Bergé, Joël; Christophe, Bruno; Touboul, Pierre; Rodrigues, Manuel
2015-03-01
The analysis of physical measurements often copes with highly correlated noises and interruptions caused by outliers, saturation events, or transmission losses. We assess the impact of missing data on the performance of linear regression analysis involving the fit of modeled or measured time series. We show that data gaps can significantly alter the precision of the regression parameter estimation in the presence of colored noise, due to the frequency leakage of the noise power. We present a regression method that cancels this effect and estimates the parameters of interest with a precision comparable to the complete-data case, even if the noise power spectral density (PSD) is not known a priori. The method is based on an autoregressive fit of the noise, which allows us to build an approximate generalized least squares estimator approaching the minimal variance bound. The method, which can be applied to any similar data processing, is tested on simulated measurements of the MICROSCOPE space mission, whose goal is to test the weak equivalence principle (WEP) with a precision of 10^-15. In this particular context the signal of interest is the WEP violation signal expected to be found around a well-defined frequency. We test our method with different gap patterns and noise of known PSD and find that the results agree with the mission requirements, decreasing the uncertainty by a factor of 60 with respect to ordinary least squares methods. We show that it also provides a test of significance to assess the uncertainty of the measurement.
Correlation Study and Regression Analysis of Drinking Water Quality in Kashan City, Iran
Directory of Open Access Journals (Sweden)
Mohammad Mehdi HEYDARI
2013-06-01
Full Text Available Chemical and statistical regression analysis of drinking water samples at five fields (21 sampling wells) in the hot and dry climate of Kashan city, central Iran, was carried out. Samples were collected from October 2006 to May 2007 (25-30 °C). Comparing the results with the drinking water quality standards issued by the World Health Organization (WHO), it is found that some of the water samples are not potable. Hydrochemical facies determined using a Piper diagram indicate that in most parts of the city the chemical character of the water is dominated by NaCl. All samples showed sulfate and sodium ion contents higher, and K+ and F- contents lower, than the permissible limits. A strongly positive correlation is observed between TDS and EC (R = 0.995) and between Ca2+ and TH (R = 0.948). The results showed that the following regression relations have the same correlation coefficients: (I) pH-TH and EC-TH (R = 0.520), (II) NO3--pH and TH-pH (R = 0.520), (III) Ca2+-SO42-, TH-SO42- and Cl--SO42- (R = 0.630). The results revealed that systematic calculation of correlation coefficients between water parameters and regression analysis provide a useful means for rapid monitoring of water quality.
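The correlation-and-regression screening can be sketched as follows; the sample values are synthetic stand-ins for the 21 wells, with the TDS-EC and Ca-TH relationships built in by assumption.

```python
# Sketch: pairwise correlations between water-quality parameters, plus the
# regression relation for the strongest pair.
import numpy as np

rng = np.random.default_rng(8)
n = 21                                           # sampling wells
tds = rng.uniform(500, 2500, n)                  # total dissolved solids, mg/L
ec = tds / 0.64 + rng.normal(scale=30, size=n)   # EC tracks TDS almost exactly
ca = rng.uniform(40, 200, n)
th = 2.5 * ca + rng.normal(scale=25, size=n)     # hardness driven by calcium

r_tds_ec = np.corrcoef(tds, ec)[0, 1]
r_ca_th = np.corrcoef(ca, th)[0, 1]
slope, intercept = np.polyfit(tds, ec, 1)        # EC = slope*TDS + intercept
print(f"R(TDS,EC) = {r_tds_ec:.3f}, R(Ca,TH) = {r_ca_th:.3f}")
print(f"EC = {slope:.2f}*TDS + {intercept:.1f}")
```

Once such regression relations are established, a cheap measurement (for example EC) can serve as a rapid proxy for the parameters it predicts, which is the monitoring shortcut the abstract describes.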
DEFF Research Database (Denmark)
Kinnebrock, Silja; Podolskij, Mark
2008-01-01
This paper introduces a new estimator to measure the ex-post covariation between high-frequency financial time series under market microstructure noise. We provide an asymptotic limit theory (including feasible central limit theorems) for standard methods such as regression, correlation analysis and covariance, for which we obtain the optimal rate of convergence. We demonstrate some positive semidefinite estimators of the covariation and construct a positive semidefinite estimator of the conditional covariance matrix in the central limit theorem. Furthermore, we indicate how the assumptions on the noise process can be relaxed and how our method can be applied to non-synchronous observations. We also present an empirical study of how high-frequency correlations, regressions and covariances change through time.
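The ex-post covariation measure can be sketched, without microstructure noise or asynchronicity, as the realized covariance of two simulated high-frequency return series, from which a realized regression slope and correlation follow; the factor structure is an illustrative assumption.

```python
# Sketch: realized covariance, regression slope, and correlation from
# high-frequency returns.
import numpy as np

rng = np.random.default_rng(14)
n = 23400                                        # one trading day of 1-second returns
common = rng.normal(scale=1e-4, size=n)          # shared factor
r1 = common + rng.normal(scale=5e-5, size=n)
r2 = 0.8 * common + rng.normal(scale=5e-5, size=n)

X = np.column_stack([r1, r2])
RC = X.T @ X                                     # realized covariance matrix
beta = RC[0, 1] / RC[0, 0]                       # realized regression of r2 on r1
corr = RC[0, 1] / np.sqrt(RC[0, 0] * RC[1, 1])   # realized correlation
print(f"realized beta = {beta:.3f}, realized correlation = {corr:.3f}")
```

The paper's contribution lies in making such estimators robust when microstructure noise and non-synchronous observation are added to this idealized picture.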
Sub-pixel estimation of tree cover and bare surface densities using regression tree analysis
Directory of Open Access Journals (Sweden)
Carlos Augusto Zangrando Toneli
2011-09-01
Full Text Available Sub-pixel analysis is capable of generating continuous fields, which represent the spatial variability of certain thematic classes. The aim of this work was to develop numerical models to represent the variability of tree cover and bare surfaces within the study area. This research was conducted in the riparian buffer within a watershed of the São Francisco River in the North of Minas Gerais, Brazil. IKONOS and Landsat TM imagery were used with the GUIDE algorithm to construct the models. The results were two index images derived with regression trees for the entire study area, one representing tree cover and the other representing bare surface. The use of non-parametric and non-linear regression tree models presented satisfactory results to characterize wetland, deciduous and savanna patterns of forest formation.
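The regression-tree estimation of a continuous field can be sketched with a generic regression tree (the GUIDE algorithm itself is not reimplemented here); the spectral bands and cover fractions are synthetic.

```python
# Sketch: predict a per-pixel tree-cover fraction from two spectral bands
# with a regression tree.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(15)
n = 2000
red = rng.uniform(0, 1, n)                       # synthetic band reflectances
nir = rng.uniform(0, 1, n)
ndvi = (nir - red) / (nir + red + 1e-9)          # vegetation index
cover = np.clip(0.5 + 0.5 * ndvi + rng.normal(scale=0.05, size=n), 0, 1)

X = np.column_stack([red, nir])
tree = DecisionTreeRegressor(max_depth=6, random_state=0).fit(X, cover)
score = tree.score(X, cover)                     # training R^2
print(f"training R^2 = {score:.2f}")
```

Applied per pixel, the fitted tree yields the continuous index image of cover fraction that sub-pixel analysis aims for, without assuming a parametric or linear form.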
Lee, C. Y.; Tippett, M. K.; Sobel, A. H.; Camargo, S. J.
2014-12-01
We are working towards the development of a new statistical-dynamical downscaling system to study the influence of climate on tropical cyclones (TCs). The first step is development of an appropriate model for TC intensity as a function of environmental variables. We approach this issue with a stochastic model consisting of a multiple linear regression model (MLR) for 12-hour intensity forecasts as a deterministic component, and a random error generator as a stochastic component. Similar to the operational Statistical Hurricane Intensity Prediction Scheme (SHIPS), MLR relates the surrounding environment to storm intensity, but with only essential predictors calculated from monthly-mean NCEP reanalysis fields (potential intensity, shear, etc.) and from persistence. The deterministic MLR is developed with data from 1981-1999 and tested with data from 2000-2012 for the Atlantic, Eastern North Pacific, Western North Pacific, Indian Ocean, and Southern Hemisphere basins. While the global MLR's skill is comparable to that of the operational statistical models (e.g., SHIPS), the distribution of the predicted maximum intensity from deterministic results has a systematic low bias compared to observations; the deterministic MLR creates almost no storms with intensities greater than 100 kt. The deterministic MLR can be significantly improved by adding the stochastic component, based on the distribution of random forecasting errors from the deterministic model compared to the training data. This stochastic component may be thought of as representing the component of TC intensification that is not linearly related to the environmental variables. We find that in order for the stochastic model to accurately capture the observed distribution of maximum storm intensities, the stochastic component must be auto-correlated across 12-hour time steps. 
This presentation also includes a detailed discussion of the distributions of other TC-intensity related quantities, as well as the inter-annual variability of predicted storm intensity in the form of accumulated cyclone energy (ACE). Applying this stochastic model in conjunction with global climate model fields is an ongoing task.
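The two-component structure, a deterministic MLR tendency plus an autocorrelated stochastic term, can be sketched as follows; the regression coefficients, AR(1) parameters and environmental values are illustrative assumptions, not the fitted model.

```python
# Sketch: 12-hourly intensity steps from a deterministic linear tendency
# plus an AR(1) error that is correlated across time steps.
import numpy as np

rng = np.random.default_rng(9)

def step(intensity, potential_intensity, shear, eps_prev, rho=0.7, sigma=3.0):
    """One 12-hour step: MLR tendency + autocorrelated stochastic component."""
    det = 0.05 * (potential_intensity - intensity) - 0.4 * shear
    eps = rho * eps_prev + rng.normal(scale=sigma * np.sqrt(1 - rho ** 2))
    return intensity + det + eps, eps

v, eps = 35.0, 0.0               # initial intensity (kt) and error state
track = [v]
for _ in range(40):              # 20 days of 12-hour steps
    v, eps = step(v, potential_intensity=140.0, shear=5.0, eps_prev=eps)
    v = max(v, 0.0)
    track.append(v)
print(f"peak intensity: {max(track):.0f} kt")
```

The autocorrelation of the error term is the feature the abstract highlights: without it, the simulated maximum intensities fail to reach the observed tail of strong storms.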
Analysis of designed experiments by stabilised PLS Regression and jack-knifing
DEFF Research Database (Denmark)
Martens, Harald; Høy, M.
2001-01-01
Pragmatic, visually oriented methods for assessing and optimising bi-linear regression models are described, and applied to PLS Regression (PLSR) analysis of multi-response data from controlled experiments. The paper outlines some ways to stabilise the PLSR method to extend its range of applicability to the analysis of effects in designed experiments. Two ways of passifying unreliable variables are shown. A method for estimating the reliability of the cross-validated prediction error RMSEP is demonstrated. Some recently developed jack-knifing extensions are illustrated, for estimating the reliability of the linear and bi-linear model parameter estimates. The paper illustrates how the obtained PLSR "significance" probabilities are similar to those from conventional factorial ANOVA, but the PLSR is shown to give important additional overview plots of the main relevant structures in the multi-response data. The study is part of an ongoing effort to establish a cognitively simple and versatile approach to multivariate data analysis, with reliability assessment based on the data at hand, and with little need for abstract distribution theory [H. Martens, M. Martens, Multivariate Analysis of Quality. An Introduction, Wiley, Chichester, UK, 2001].
Scientific Electronic Library Online (English)
Roberto, Baeza-Serrato; José Antonio, Vázquez-López.
2014-06-01
Full Text Available Uno de los supuestos principales del análisis de regresión lineal es la existencia de una relación de causalidad entre las variables analizadas, sin que el análisis de regresión lo permita demostrar. Esta investigación demuestra la causalidad entre las variables analizadas a través de la construcción y análisis de la retroalimentación entre las variables en estudio, plasmada en un diagrama causal y validado a través de simulación dinámica. Una de las principales contribuciones de esta investigación es la propuesta de utilizar un enfoque de dinámica de sistemas, para desarrollar un método de transición de un modelo de regresión lineal múltiple predictivo a un modelo de regresión no lineal simple explicativo, que incrementa el nivel de predicción del modelo. El error cuadrático medio (ECM) es utilizado como criterio de predicción. La validación se realizó con tres modelos de regresión lineal obtenidos experimentalmente en una empresa del sector textil, mostrando una alternativa para incrementar la fiabilidad en los modelos de predicción. Abstract in english One of the main assumptions of the linear regression analysis is the existence of a causal relationship between the variables analyzed, which the regression analysis does not demonstrate. This paper demonstrates the causality between the variables analyzed through the construction and analysis of the feedback from the variables under study, expressed in a causal diagram and validated through dynamic simulation. The major contribution of this research is the proposal of the use of the system dynamics approach to develop a method of transition from a multiple regression predictive model to a simpler nonlinear regression explanatory model, which increases the level of prediction of the model. The mean square error (MSE) is taken as a criterion for prediction.
The validation in the transition model was performed with three linear regression models obtained experimentally in a textile company, showing a method for increasing the reliability of prediction models.
Ryu, Duchwan; Li, Erning; Mallick, Bani K
2011-06-01
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. PMID:20880012
Diversity Performance Analysis on Multiple HAP Networks
Directory of Open Access Journals (Sweden)
Feihong Dong
2015-06-01
Full Text Available One of the main design challenges in wireless sensor networks (WSNs is achieving a high-data-rate transmission for individual sensor devices. The high altitude platform (HAP is an important communication relay platform for WSNs and next-generation wireless networks. Multiple-input multiple-output (MIMO techniques provide the diversity and multiplexing gain, which can improve the network performance effectively. In this paper, a virtual MIMO (V-MIMO model is proposed by networking multiple HAPs with the concept of multiple assets in view (MAV. In a shadowed Rician fading channel, the diversity performance is investigated. The probability density function (PDF and cumulative distribution function (CDF of the received signal-to-noise ratio (SNR are derived. In addition, the average symbol error rate (ASER with BPSK and QPSK is given for the V-MIMO model. The system capacity is studied for both perfect channel state information (CSI and unknown CSI individually. The ergodic capacity with various SNR and Rician factors for different network configurations is also analyzed. The simulation results validate the effectiveness of the performance analysis. It is shown that the performance of the HAPs network in WSNs can be significantly improved by utilizing the MAV to achieve overlapping coverage, with the help of the V-MIMO techniques.
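The diversity gain discussed above can be illustrated with a small Monte Carlo experiment: BPSK symbol error rate with one receive branch versus two-branch maximal-ratio combining. Rayleigh fading is used here as a simple stand-in for the shadowed Rician channel of the paper, and the SNR and sample counts are illustrative.

```python
import numpy as np

# Monte Carlo comparison of 1-branch vs 2-branch MRC diversity for BPSK.
rng = np.random.default_rng(1)
n, snr_db = 200_000, 8.0
snr = 10 ** (snr_db / 10)

bits = rng.integers(0, 2, n)
s = 2 * bits - 1                       # BPSK symbols +/-1

def ser(branches):
    # Complex Rayleigh fading per branch, AWGN, then maximal-ratio combining.
    h = (rng.normal(size=(branches, n)) +
         1j * rng.normal(size=(branches, n))) / np.sqrt(2)
    w = (rng.normal(size=(branches, n)) +
         1j * rng.normal(size=(branches, n))) / np.sqrt(2 * snr)
    r = h * s + w
    combined = np.sum(np.conj(h) * r, axis=0)
    return float(np.mean((combined.real > 0) != (bits == 1)))

ser1, ser2 = ser(1), ser(2)            # diversity order 1 vs 2
```

As expected from the diversity analysis, the two-branch error rate is substantially lower at the same per-branch SNR.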
Energy Technology Data Exchange (ETDEWEB)
Mackley, Rob D.; Spane, Frank A.; Pulsipher, Trenton C.; Allwardt, Craig H.
2010-09-01
A software tool was created in Fiscal Year 2010 (FY10) that enables multiple-regression correction of well water levels for river-stage effects. This task was conducted as part of the Remediation Science and Technology project of CH2MHILL Plateau Remediation Company (CHPRC). This document contains an overview of the correction methodology and a user’s manual for Multiple Regression in Excel (MRCX) v.1.1. It also contains a step-by-step tutorial that shows users how to use MRCX to correct river effects in two different wells. This report is accompanied by an enclosed CD that contains the MRCX installer application and files used in the tutorial exercises.
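The correction idea behind MRCX can be sketched in a few lines: regress the observed well water level on river stage (plus a slow trend term), then subtract the fitted river contribution. The series below are synthetic; MRCX itself is an Excel tool, not this code.

```python
import numpy as np

# Synthetic daily series: a well level driven by a slow aquifer trend and
# a river-stage signal, plus measurement noise.
rng = np.random.default_rng(2)
t = np.arange(365.0)
river = np.sin(2 * np.pi * t / 60) + 0.3 * rng.normal(size=t.size)
trend = 0.002 * t
well = 10.0 + trend + 0.6 * river + 0.05 * rng.normal(size=t.size)

# Multiple regression: intercept, trend, and river stage as predictors.
X = np.column_stack([np.ones_like(t), t, river])
beta, *_ = np.linalg.lstsq(X, well, rcond=None)

# Subtract the fitted river contribution to obtain the corrected level.
corrected = well - beta[2] * river

corr_before = float(np.corrcoef(well, river)[0, 1])
corr_after = float(np.corrcoef(corrected, river)[0, 1])
```

After correction, the residual well level is essentially uncorrelated with river stage, leaving the aquifer trend visible.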
Analysis of reactor noise by multi-variate auto-regressive model
International Nuclear Information System (INIS)
The multi-variate auto-regressive model has recently been applied to the noise analysis of nuclear reactor systems. From this standpoint, a system identification study was performed at the Japan Power Demonstration Reactor-2 (JPDR-2), 45 MWt, using pseudo-random signals. The aim of this paper is to further extend and refine this identification problem based on the measured data. Emphasis is on the fact that the results obtained by the non-parametric method can be justified by the parametric one. Elucidation of the feedback map is also made by estimating the noise contribution rate. Results of computation show the effectiveness of the procedure. (author)
KINETIC ANALYSIS OF HIGH-NITROGEN ENERGETIC MATERIALS USING MULTIVARIATE NONLINEAR REGRESSION
Energy Technology Data Exchange (ETDEWEB)
Campbell, M. S. (Mary Stinecipher); Rabie, R. L. (Ronald L.); Diaz-Acosta, I. (Irina); Pulay, P. (Peter)
2001-01-01
New high-nitrogen energetic materials were synthesized by Hiskey and Naud. J. Opfermann reported a new tool for finding the probable model of complex reactions using multivariate non-linear regression analysis of DSC and TGA data from several measurements run at different heating rates. This study takes the kinetic parameters from the different steps and determines which reaction step is responsible for the runaway reaction, by comparing results predicted from the Frank-Kamenetskii equation with the critical temperature found experimentally using the modified Henkin test.
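A Frank-Kamenetskii critical-temperature calculation of the kind mentioned above can be sketched as follows. The dimensionless parameter δ(T) is compared with the critical value δ_c (about 3.32 for a sphere); runaway is predicted where δ exceeds δ_c. All material parameters below are invented for illustration, not the values for any real energetic material.

```python
import math

R = 8.314          # gas constant, J/(mol K)
E = 1.6e5          # activation energy, J/mol (illustrative)
Q = 2.0e6          # heat of reaction, J/kg (illustrative)
A = 1.0e15         # Arrhenius pre-exponential factor, 1/s (illustrative)
rho = 1800.0       # density, kg/m^3 (illustrative)
lam = 0.3          # thermal conductivity, W/(m K) (illustrative)
a = 0.01           # characteristic radius, m
delta_c = 3.32     # critical Frank-Kamenetskii parameter for a sphere

def delta(T):
    # Frank-Kamenetskii parameter at ambient temperature T (kelvin).
    return rho * Q * E * a**2 * A / (lam * R * T**2) * math.exp(-E / (R * T))

# delta(T) increases monotonically over this range, so bisect for the
# temperature at which delta crosses delta_c.
lo, hi = 300.0, 700.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if delta(mid) < delta_c:
        lo = mid
    else:
        hi = mid
T_crit = 0.5 * (lo + hi)
```

In the study's workflow, the kinetic parameters (E, A) for each decomposition step would come from the multivariate regression fit, and the predicted T_crit would be compared against the modified-Henkin-test result.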
International Nuclear Information System (INIS)
The computer program FREQFIT is designed to perform regression and statistical chi-squared goodness of fit analysis on one-dimensional or two-dimensional data. The program features an interactive user dialogue, numerous help messages, an option for screen or line printer output, and the flexibility to use practically any commercially available graphics package to create plots of the program's results. FREQFIT is written in Microsoft QuickBASIC, for IBM-PC compatible computers. A listing of the QuickBASIC source code for the FREQFIT program, a user manual, and sample input data, output, and plots are included. 6 refs., 1 fig
Mandal, Nilrudra; Doloi, Biswanath; Mondal, Biswanath
2015-05-01
In the present study, an attempt has been made to apply the Taguchi parameter design method and regression analysis to optimize the cutting conditions for surface finish while machining AISI 4340 steel with newly developed yttria-based Zirconia Toughened Alumina (ZTA) inserts. These inserts are prepared through a wet chemical co-precipitation route followed by a powder metallurgy process. Experiments have been carried out based on an L9 orthogonal array with three parameters (cutting speed, depth of cut and feed rate) at three levels (low, medium and high). Based on the mean response and signal-to-noise ratio (SNR), with the smaller-the-better criterion, the optimal cutting condition was found to be A3B1C1, i.e. a cutting speed of 420 m/min, a depth of cut of 0.5 mm and a feed rate of 0.12 m/min. Analysis of Variance (ANOVA) is applied to find the significance and percentage contribution of each parameter. The mathematical model of surface roughness has been developed using regression analysis as a function of the above-mentioned independent variables. The predicted values from the developed model and the experimental values are found to be very close to each other, justifying the significance of the model. A confirmation run has been carried out at the 95 % confidence level to verify the optimized result, and the values obtained are within the prescribed limits.
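The smaller-the-better signal-to-noise ratio used in Taguchi analysis is SNR = -10·log10(mean(y_i²)). A minimal sketch, with invented surface-roughness readings (an actual L9 array would give three runs per level for each factor):

```python
import math

def snr_smaller_is_better(values):
    # Taguchi smaller-the-better S/N ratio in decibels.
    return -10.0 * math.log10(sum(v * v for v in values) / len(values))

# Hypothetical Ra readings (micrometres) at three cutting-speed levels.
levels = {
    "A1 (low speed)":    [1.42, 1.38, 1.51],
    "A2 (medium speed)": [1.10, 1.21, 1.05],
    "A3 (high speed)":   [0.71, 0.66, 0.74],
}
snr = {name: snr_smaller_is_better(v) for name, v in levels.items()}
best = max(snr, key=snr.get)   # highest SNR = least roughness
```

Repeating this per factor and taking the best level of each gives the optimal combination (A3B1C1 in the study).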
Lu, Lee-Jane W.; Nishino, Thomas K.; Khamapirad, Tuenchit; Grady, James J; Leonard, Morton H.; Brunder, Donald G.
2007-01-01
Breast density (the percentage of fibroglandular tissue in the breast) has been suggested to be a useful surrogate marker for breast cancer risk. It is conventionally measured from screen-film mammographic images by a labor-intensive histogram segmentation method (HSM). We have adapted and modified the HSM for measuring breast density from raw digital mammograms acquired by full-field digital mammography. Multiple regression model analyses showed that many of the instrument parameters for ac...
Macrina, Francesco; Puddu, Paolo Emilio; Sciangula, Alfonso; Trigilia, Fausto; Totaro, Marco; Miraldi, Fabio; Toscano, Francesca; Cassese, Mauro; Toscano, Michele
2009-01-01
Background: There are few comparative reports on the overall accuracy of neural networks (NN), assessed only versus multiple logistic regression (LR), for predicting events in cardiovascular surgery studies, and none has been performed among acute aortic dissection (AAD) Type A patients. Objectives: We aimed to investigate the potential of a large series of risk factors to predict 30-day mortality in AAD Type A patients, comparing the overall performance of NN versus LR. Methods: We investigated 121 plus 87 AAD Type A patients consecutively operated on during 7 years in two Centres. Forced and stepwise NN and LR solutions were obtained and compared using the receiver operating characteristic area under the curve (AUC) with 95% confidence intervals (CI), and Gini’s coefficients. Both NN and LR models were re-applied to data from the second Centre, to adhere to a methodological imperative with NN. Results: Forced LR solutions provided AUC 87.9±4.1% (CI: 80.7 to 93.2%) and 85.7±5.2% (CI: 78.5 to 91.1%) in the first and second Centre, respectively. The stepwise NN solution of the first Centre had AUC 90.5±3.7% (CI: 83.8 to 95.1%). The Gini’s coefficients for the LR and NN stepwise solutions of the first Centre were 0.712 and 0.816, respectively. When the LR and NN stepwise solutions were re-applied to the second Centre data, the Gini’s coefficients were, respectively, 0.761 and 0.850. Few predictors were selected in common by the LR and NN models: the presence of pre-operative shock, intubation and neurological symptoms, the need for continuous dialysis in the immediate post-operative period, and the quantity of post-operative bleeding in the first 24 h. The length of extracorporeal circulation, post-operative chronic renal failure and the year of surgery were specifically detected by NN. Conclusions: In contrast to the International Registry of AAD, operative and immediate post-operative factors were seen as potential predictors of short-term mortality.
We report a higher overall predictive accuracy with NN than with LR. However, the list of potential risk factors to predict 30-day mortality after AAD Type A by NN model is not enlarged significantly. PMID:19657459
International Nuclear Information System (INIS)
The records of three earthquakes which had induced significant responses in the piping system were obtained with the earthquake observation system. In the present paper, first, the eigenvalue analysis results for the piping system, based on the piping support (boundary) conditions, are described, and second, the frequency and damping factor evaluation results for each vibrational mode are described. In the present study, the Auto-Regressive (AR) analysis method is used in the evaluation of natural frequencies and damping factors. The AR analysis applied here is capable of directly evaluating natural frequencies and damping factors from earthquake records observed on a piping system, without any information on the input motions to the system. (orig./HP)
Regression analysis of growth responses to water depth in three wetland plant species
DEFF Research Database (Denmark)
Sorrell, Brian K; Tanner, Chris C
2012-01-01
Background and aims Plant species composition in wetlands and on lakeshores often shows dramatic zonation which is frequently ascribed to differences in flooding tolerance. This study compared the growth responses to water depth of three species (Phormium tenax, Carex secta, Typha orientalis) differing in depth preferences in wetlands, using non-linear and quantile regression analyses to establish how flooding tolerance can explain field zonation. Methodology Plants were established for 8 months in outdoor cultures in waterlogged soil without standing water, and then randomly allocated to water depths from 0 – 0.5 m. Morphological and growth responses to depth were followed for 54 days before harvest, and then analysed by repeated measures analysis of covariance, and non-linear and quantile regression analysis (QRA), to compare flooding tolerances. Principal results Growth responses to depth differed between the three species, and were non-linear. P. tenax growth rapidly decreased in standing water > 0.25 m depth, C. secta growth increased initially with depth but then decreased at depths > 0.30 m, accompanied by increased shoot height and decreased shoot density, and T. orientalis was unaffected by the 0 – 0.50 m depth range. In P. tenax the decrease in growth was associated with a decrease in the number of leaves produced per ramet and in C. secta the effect of water depth was greatest for the tallest shoots. Allocation patterns were unaffected by depth. Conclusions The responses are consistent with the principle that zonation in the field is primarily structured by competition in shallow water and by physiological flooding tolerance in deep water. Regression analyses, especially QRA, proved to be powerful tools in distinguishing genuine phenotypic responses to water depth from non-phenotypic variation due to size and developmental differences.
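The quantile regression used in this study can be sketched with a simple iteratively reweighted least squares (IRLS) minimization of the pinball (check) loss. This is a stand-in for the QRA software actually used; the depth-growth data below are synthetic.

```python
import numpy as np

# Synthetic growth responses: linear trend in depth with heavy-tailed noise.
rng = np.random.default_rng(3)
n = 300
depth = rng.uniform(0, 0.5, n)                       # water depth, m
growth = 1.0 + 2.0 * depth + rng.laplace(0, 0.3, n)

def quantile_fit(x, y, tau, iters=50, eps=1e-6):
    # Linear quantile regression via IRLS on the pinball loss.
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS start
    for _ in range(iters):
        r = y - X @ beta
        # Asymmetric L1 weights: tau above the line, 1 - tau below.
        w = np.where(r > 0, tau, 1 - tau) / np.maximum(np.abs(r), eps)
        W = X * w[:, None]
        beta = np.linalg.solve(X.T @ W, W.T @ y)
    return beta

b_median = quantile_fit(depth, growth, 0.5)          # median regression
b_q90 = quantile_fit(depth, growth, 0.9)             # upper-quantile line
frac_below = float(np.mean(growth <= b_q90[0] + b_q90[1] * depth))
```

Fitting upper quantiles, as in the study, traces the response of the best-performing plants and separates genuine depth effects from size-related scatter.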
Directory of Open Access Journals (Sweden)
Rosana de Cassia de Souza Schneider
2011-03-01
Full Text Available Air is an efficient means of dispersal of atmospheric pollutants, and its behavior depends on the atmospheric movements that occur in the troposphere. In Porto Alegre, Rio Grande do Sul State, there is heavy daily traffic and a concentration of industries that may be responsible for atmospheric emissions. In the present work we studied the behavior of the daily concentrations of particulate matter (PM10) in this city, considering the influence of meteorological variables. Data analysis was performed using descriptive statistics, linear correlation and multiple regression. Data were provided by the State Foundation of Environmental Protection Henrique Luiz Roessler - RS (FEPAM) and the National Institute of Meteorology (INMET). Based on the analysis it was possible to verify that: the concentrations of PM10, measured every day at 4:00 p.m., did not exceed the national standards for air quality; the meteorological elements that influenced the concentrations of PM10 were the daily average wind speed and the daily average radiation, with negative relationships, and the daily average air temperature and the north and northwest wind directions, with positive relationships. The wind directions that contribute significantly to lowering the concentrations at the measured sites are east and southeast.
Directory of Open Access Journals (Sweden)
Keerthiprasad.K
2014-08-01
Full Text Available In recent years, alloy steels have been widely used in the aerospace and automotive industries. Machining of these materials requires a better understanding of the cutting processes regarding accuracy and efficiency. This study addresses the modelling of the machinability of EN353 and 20MnCr5 materials. Multiple regression analysis (MRA) is used to investigate the influence of several parameters on the thrust force and torque in drilling processes for alloy steel materials. The models were identified by using cutting speed, feed rate, and depth of cut as input data and the thrust force and torque as the output data. The statistical analysis showed that the cutting feed (f) was the most significant parameter in the drilling process, while spindle speed appeared insignificant. Since the spindle speed was insignificant, it can be set either at the highest value to obtain a high material removal rate or at the lowest value to prolong tool life, depending on the needs of the application. The mathematical model is based on power regression modelling, dependent on the three above-mentioned parameters.
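A power regression model of the form F = C·v^a·f^b·d^c, as mentioned above, becomes linear after taking logarithms and can be fitted by ordinary least squares. The sketch below uses made-up exponents, which the log-linear fit recovers from noiseless synthetic data.

```python
import numpy as np

# Synthetic drilling data generated from a known power law.
rng = np.random.default_rng(4)
n = 60
v = rng.uniform(20, 60, n)        # cutting speed
f = rng.uniform(0.05, 0.3, n)     # feed
d = rng.uniform(1.0, 4.0, n)      # depth of cut
C_true, a_true, b_true, c_true = 50.0, 0.2, 0.8, 0.9
F = C_true * v**a_true * f**b_true * d**c_true      # thrust force

# log F = log C + a log v + b log f + c log d  ->  linear least squares.
X = np.column_stack([np.ones(n), np.log(v), np.log(f), np.log(d)])
coef, *_ = np.linalg.lstsq(X, np.log(F), rcond=None)
C_hat, a_hat, b_hat, c_hat = np.exp(coef[0]), *coef[1:]
```

With real measurements the fit would not be exact, and the residuals on the log scale indicate how well the power-law form holds.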
Energy Technology Data Exchange (ETDEWEB)
Jiang Mingfeng; Wang Yaming [College of Electronics and Informatics, Zhejiang Sci-Tech University, Hangzhou 310018 (China); Zhu Lingyan [Dongfang College, Zhejiang University of Finance and Economics, Hangzhou, 310018 (China); Xia Ling; Shou Guofa; Liu Feng [Department of Biomedical Engineering, Zhejiang University, Hangzhou 310027 (China); Crozier, Stuart, E-mail: peterjiang0517@163.com, E-mail: jiang.mingfeng@hotmail.com [School of Information Technology and Electrical Engineering, University of Queensland, St Lucia, Brisbane, Queensland 4072 (Australia)
2011-03-21
Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem, which can be treated as a regression problem with multiple inputs and outputs and solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important pre-processing task for the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. In addition, the genetic algorithm and the simplex optimization method are invoked to determine the hyper-parameters of the SVR. Based on a realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) performs better than that without feature extraction (single SVR) in terms of the reconstruction of the TMPs on the epi- and endocardial surfaces. Moreover, compared with PCA-SVR, the KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.
Jiang, Mingfeng; Zhu, Lingyan; Wang, Yaming; Xia, Ling; Shou, Guofa; Liu, Feng; Crozier, Stuart
2011-03-01
Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem, which can be treated as a regression problem with multiple inputs and outputs and solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important pre-processing task for the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. In addition, the genetic algorithm and the simplex optimization method are invoked to determine the hyper-parameters of the SVR. Based on a realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) performs better than that without feature extraction (single SVR) in terms of the reconstruction of the TMPs on the epi- and endocardial surfaces. Moreover, compared with PCA-SVR, the KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.
International Nuclear Information System (INIS)
Non-invasively reconstructing the transmembrane potentials (TMPs) from body surface potentials (BSPs) constitutes one form of the inverse ECG problem, which can be treated as a regression problem with multiple inputs and outputs and solved using the support vector regression (SVR) method. In developing an effective SVR model, feature extraction is an important pre-processing task for the original input data. This paper proposes the application of principal component analysis (PCA) and kernel principal component analysis (KPCA) to the SVR method for feature extraction. In addition, the genetic algorithm and the simplex optimization method are invoked to determine the hyper-parameters of the SVR. Based on a realistic heart-torso model, the equivalent double-layer source method is applied to generate the data set for training and testing the SVR model. The experimental results show that the SVR method with feature extraction (PCA-SVR and KPCA-SVR) performs better than that without feature extraction (single SVR) in terms of the reconstruction of the TMPs on the epi- and endocardial surfaces. Moreover, compared with PCA-SVR, the KPCA-SVR features good approximation and generalization ability when reconstructing the TMPs.
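The PCA pre-processing step described above can be sketched with a numpy SVD. A plain linear least-squares regressor stands in here for the SVR stage, just to show how compressed component scores feed the regression; all data are synthetic.

```python
import numpy as np

# Inputs with only k latent directions of real variation, plus small noise.
rng = np.random.default_rng(5)
n, p, k = 400, 30, 3                 # samples, input dimension, components
latent = rng.normal(size=(n, k))
mixing = rng.normal(size=(k, p))
X = latent @ mixing + 0.01 * rng.normal(size=(n, p))
y = latent @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=n)

# PCA via SVD of the centred inputs; keep the top-k component scores.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:k].T               # compressed features (n x k)

# Regression stage on the compressed features (linear stand-in for SVR).
Z = np.column_stack([np.ones(n), scores])
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
r2 = 1 - np.sum((y - Z @ beta) ** 2) / np.sum((y - y.mean()) ** 2)
```

Because the top components capture the latent structure, 3 features predict nearly as well as all 30 inputs would; KPCA replaces the linear projection with a kernel-space one.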
International Nuclear Information System (INIS)
The observation of the equipment and piping systems installed in an operating nuclear power plant during earthquakes is very important for evaluating and confirming the adequacy and safety margin expected at the design stage. By analyzing observed earthquake records, valuable data concerning the behavior of these systems in earthquakes can be obtained, and information about their aseismic design parameters can be extracted. From these viewpoints, an earthquake observation system was installed in a reactor building of an operating plant. Up to now, the records of three earthquakes have been obtained with this system. In this paper, an example of the analysis of earthquake records is shown; the main purpose of the analysis was the evaluation of the vibration modes, natural frequencies and damping factors of this piping system. Prior to the earthquake record analysis, an eigenvalue analysis for this piping system was performed. Auto-regressive analysis was applied to the observed acceleration time history obtained from a piping system installed in an operating BWR. The results of the earthquake record analysis agreed well with the results of the eigenvalue analysis. (Kako, I.)
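The core of the auto-regressive modal identification described above can be sketched for a single mode: fit an AR(2) model to the recorded response and convert its discrete poles to a natural frequency and damping factor. The damped-oscillator record below is simulated, not plant data.

```python
import numpy as np

# Simulated single-mode response: 5 Hz natural frequency, 2% damping.
dt, f_n, zeta = 0.01, 5.0, 0.02
w_n = 2 * np.pi * f_n
w_d = w_n * np.sqrt(1 - zeta**2)
t = np.arange(0, 10, dt)
y = np.exp(-zeta * w_n * t) * np.cos(w_d * t)

# Least-squares AR(2) fit: y[n] = a1*y[n-1] + a2*y[n-2].
A = np.column_stack([y[1:-1], y[:-2]])
a1, a2 = np.linalg.lstsq(A, y[2:], rcond=None)[0]

# Discrete pole z -> continuous pole s = ln(z)/dt -> modal parameters.
z = np.roots([1.0, -a1, -a2])[0]
s = np.log(z) / dt
f_est = abs(s) / (2 * np.pi)         # natural frequency, Hz
zeta_est = -s.real / abs(s)          # damping ratio
```

Note that this extracts the modal parameters from the response alone, without knowledge of the input motion, which is the key advantage cited in the paper. Real records contain noise and multiple modes, requiring a higher AR order.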
Integrative Data Analysis: The Simultaneous Analysis of Multiple Data Sets
Curran, Patrick J.; Hussong, Andrea M.
2009-01-01
There are both quantitative and methodological techniques that foster the development and maintenance of a cumulative knowledge base within the psychological sciences. Most noteworthy of these techniques is meta-analysis, which allows for the synthesis of summary statistics drawn from multiple studies when the original data are not available.…
Within-session analysis of the extinction of pavlovian fear-conditioning using robust regression
Directory of Open Access Journals (Sweden)
Vargas-Irwin, Cristina
2010-06-01
Full Text Available Traditionally, the analysis of extinction data in fear-conditioning experiments has involved the use of standard linear models, mostly ANOVA of between-group differences of subjects that have undergone different extinction protocols, pharmacological manipulations or some other treatment. Although some studies report individual differences in quantities such as suppression rates or freezing percentages, these differences are not included in the statistical modeling. Within-subject response patterns are then averaged using coarse-grain time windows, which can overlook these individual performance dynamics. Here we illustrate an alternative analytical procedure consisting of two steps: the estimation of a trend for within-session data and the analysis of group differences in trend as the main outcome. This procedure is tested on real fear-conditioning extinction data, comparing trend estimates via Ordinary Least Squares (OLS) and robust Least Median of Squares (LMS) regression, as well as comparing between-group differences when analyzing mean freezing percentage versus LMS slopes as outcomes.
Junek, W. N.; Jones, W. L.; Woods, M. T.
2011-12-01
An automated event tree analysis system for estimating the probability of short term volcanic activity is presented. The algorithm is driven by a suite of empirical statistical models that are derived through logistic regression. Each model is constructed from a multidisciplinary dataset that was assembled from a collection of historic volcanic unrest episodes. The dataset consists of monitoring measurements (e.g. InSAR, seismic), source modeling results, and historic eruption activity. This provides a simple mechanism for simultaneously accounting for the geophysical changes occurring within the volcano and the historic behavior of analog volcanoes. The algorithm is extensible and can be easily recalibrated to include new or additional monitoring, modeling, or historic information. Standard cross validation techniques are employed to optimize its forecasting capabilities. Analysis results from several recent volcanic unrest episodes are presented.
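One of the empirical logistic-regression models driving such an event-tree analysis can be sketched with a Newton-Raphson maximum-likelihood fit. The "seismicity" feature and the eruption outcomes below are invented for illustration; a real model would combine multiple monitoring and historical predictors.

```python
import numpy as np

# Synthetic unrest dataset: a standardised seismicity feature and a binary
# eruption outcome drawn from a known logistic model.
rng = np.random.default_rng(6)
n = 500
seismicity = rng.normal(0, 1, n)
logit = -1.0 + 2.0 * seismicity
p_true = 1 / (1 + np.exp(-logit))
erupted = (rng.uniform(size=n) < p_true).astype(float)

# Newton-Raphson maximum likelihood for logistic regression.
X = np.column_stack([np.ones(n), seismicity])
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    W = p * (1 - p)
    grad = X.T @ (erupted - p)                # score vector
    hess = (X * W[:, None]).T @ X             # observed information
    beta = beta + np.linalg.solve(hess, grad)

prob_eruption = 1 / (1 + np.exp(-X @ beta))  # fitted probabilities
```

The fitted probabilities are what would populate the branches of the event tree; cross-validation, as in the paper, guards against overfitting the historical episodes.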
Statistical learning method in regression analysis of simulated positron spectral data
International Nuclear Information System (INIS)
Positron lifetime spectroscopy is a non-destructive tool for the detection of radiation-induced defects in nuclear reactor materials. This work concerns the applicability of the support vector machines (SVM) method for input data compression in the neural network analysis of positron lifetime spectra. It has been demonstrated that the SVM technique can be successfully applied to the regression analysis of positron spectra. A substantial data compression, to about 50 % and 8 % of the whole training set for two and three spectral components respectively, has been achieved, together with a high accuracy of the spectra approximation. However, some parameters in the SVM approach, such as the insensitivity zone ε and the penalty parameter C, have to be chosen carefully to obtain good performance. (author)
Modelling and analysis of turbulent datasets using Auto Regressive Moving Average processes
Energy Technology Data Exchange (ETDEWEB)
Faranda, Davide, E-mail: davide.faranda@cea.fr; Dubrulle, Bérengère; Daviaud, François [Laboratoire SPHYNX, Service de Physique de l' Etat Condensé, DSM, CEA Saclay, CNRS URA 2464, 91191 Gif-sur-Yvette (France); Pons, Flavio Maria Emanuele [Dipartimento di Scienze Statistiche, Universitá di Bologna, Via delle Belle Arti 41, 40126 Bologna (Italy); Saint-Michel, Brice [Institut de Recherche sur les Phénomènes Hors Equilibre, Technopole de Chateau Gombert, 49 rue Frédéric Joliot Curie, B.P. 146, 13 384 Marseille (France); Herbert, Éric [Université Paris Diderot - LIED - UMR 8236, Laboratoire Interdisciplinaire des Énergies de Demain, Paris (France); Cortet, Pierre-Philippe [Laboratoire FAST, CNRS, Université Paris-Sud (France)
2014-10-15
We introduce a novel way to extract information from turbulent datasets by applying an Auto Regressive Moving Average (ARMA) statistical analysis. Such analysis goes well beyond the analysis of the mean flow and of the fluctuations and links the behavior of the recorded time series to a discrete version of a stochastic differential equation which is able to describe the correlation structure in the dataset. We introduce a new index Υ that measures the difference between the resulting analysis and the Obukhov model of turbulence, the simplest stochastic model reproducing both Richardson law and the Kolmogorov spectrum. We test the method on datasets measured in a von Kármán swirling flow experiment. We found that the ARMA analysis is well correlated with spatial structures of the flow, and can discriminate between two different flows with comparable mean velocities, obtained by changing the forcing. Moreover, we show that Υ is highest in regions where shear layer vortices are present, thereby establishing a link between deviations from the Kolmogorov model and coherent structures. These deviations are consistent with the ones observed by computing the Hurst exponents for the same time series. We show that some salient features of the analysis are preserved when considering global instead of local observables. Finally, we analyze flow configurations with multistability features where the ARMA technique is efficient in discriminating different stability branches of the system.
Modelling and analysis of turbulent datasets using Auto Regressive Moving Average processes
Faranda, Davide; Pons, Flavio Maria Emanuele; Dubrulle, Bérengère; Daviaud, François; Saint-Michel, Brice; Herbert, Éric; Cortet, Pierre-Philippe
2014-10-01
We introduce a novel way to extract information from turbulent datasets by applying an Auto Regressive Moving Average (ARMA) statistical analysis. Such analysis goes well beyond the analysis of the mean flow and of the fluctuations and links the behavior of the recorded time series to a discrete version of a stochastic differential equation which is able to describe the correlation structure in the dataset. We introduce a new index Υ that measures the difference between the resulting analysis and the Obukhov model of turbulence, the simplest stochastic model reproducing both Richardson law and the Kolmogorov spectrum. We test the method on datasets measured in a von Kármán swirling flow experiment. We found that the ARMA analysis is well correlated with spatial structures of the flow, and can discriminate between two different flows with comparable mean velocities, obtained by changing the forcing. Moreover, we show that Υ is highest in regions where shear layer vortices are present, thereby establishing a link between deviations from the Kolmogorov model and coherent structures. These deviations are consistent with the ones observed by computing the Hurst exponents for the same time series. We show that some salient features of the analysis are preserved when considering global instead of local observables. Finally, we analyze flow configurations with multistability features where the ARMA technique is efficient in discriminating different stability branches of the system.
Modelling and analysis of turbulent datasets using Auto Regressive Moving Average processes
International Nuclear Information System (INIS)
We introduce a novel way to extract information from turbulent datasets by applying an Auto Regressive Moving Average (ARMA) statistical analysis. Such analysis goes well beyond the analysis of the mean flow and of the fluctuations and links the behavior of the recorded time series to a discrete version of a stochastic differential equation which is able to describe the correlation structure in the dataset. We introduce a new index Υ that measures the difference between the resulting analysis and the Obukhov model of turbulence, the simplest stochastic model reproducing both Richardson law and the Kolmogorov spectrum. We test the method on datasets measured in a von Kármán swirling flow experiment. We found that the ARMA analysis is well correlated with spatial structures of the flow, and can discriminate between two different flows with comparable mean velocities, obtained by changing the forcing. Moreover, we show that Υ is highest in regions where shear layer vortices are present, thereby establishing a link between deviations from the Kolmogorov model and coherent structures. These deviations are consistent with the ones observed by computing the Hurst exponents for the same time series. We show that some salient features of the analysis are preserved when considering global instead of local observables. Finally, we analyze flow configurations with multistability features where the ARMA technique is efficient in discriminating different stability branches of the system.
Directory of Open Access Journals (Sweden)
Kao Jau-Tsuen
2006-08-01
Full Text Available Abstract Background The genetic association analysis using haplotypes as basic genetic units is anticipated to be a powerful strategy towards the discovery of genes predisposing to human complex diseases. In particular, the increasing availability of high-resolution genetic markers such as single-nucleotide polymorphisms (SNPs) has made haplotype-based association analysis an attractive alternative to single-marker analysis. Results We consider haplotype association analysis under the population-based case-control study design. A multinomial logistic model is proposed for haplotype analysis with unphased genotype data, which can be decomposed into a prospective logistic model for disease risk as well as a model for the haplotype-pair distribution in the control population. Environmental factors can be readily incorporated, and hence the haplotype-environment interaction can be assessed in the proposed model. The maximum likelihood estimation with unphased genotype data can be conveniently implemented in the proposed model by applying the EM algorithm to a prospective multinomial logistic regression model and ignoring the case-control design. We apply the proposed method to a hypertriglyceridemia study and identify three haplotypes in the apolipoprotein A5 gene that are associated with increased risk of hypertriglyceridemia. A haplotype-age interaction effect is also identified. Simulation studies show that the proposed estimator has satisfactory finite-sample performance. Conclusion Our results suggest that the proposed method can serve as a useful alternative to existing methods and a reliable tool for case-control haplotype-based association analysis.
Kyriakides, Leonidas; Luyten, Hans
2009-01-01
This article reports the results of a study in which the basic regression-discontinuity approach to assess the effect of 1 year of schooling is extended. The data analysis covers the 6 grades of secondary education in Cyprus and thus assesses the contribution of secondary education to the cognitive development of 12- to 18-year-old students. A…
Directory of Open Access Journals (Sweden)
Shuai Wang
2014-10-01
Full Text Available Accurate prediction of the remaining useful life (RUL of lithium-ion batteries is important for battery management systems. Traditional empirical data-driven approaches for RUL prediction usually require multidimensional physical characteristics including the current, voltage, usage duration, battery temperature, and ambient temperature. From a capacity fading analysis of lithium-ion batteries, it is found that the energy efficiency and battery working temperature are closely related to the capacity degradation, which account for all performance metrics of lithium-ion batteries with regard to the RUL and the relationships between some performance metrics. Thus, we devise a non-iterative prediction model based on flexible support vector regression (F-SVR and an iterative multi-step prediction model based on support vector regression (SVR using the energy efficiency and battery working temperature as input physical characteristics. The experimental results show that the proposed prognostic models have high prediction accuracy by using fewer dimensions for the input data than the traditional empirical models.
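The iterative multi-step idea above can be sketched with a plain linear one-step model standing in for SVR: predict the next capacity from the current one, then feed each prediction back in to forecast several cycles ahead. The capacity-fade numbers are invented for illustration.

```python
def fit_line(xs, ys):
    # ordinary least-squares slope and intercept for y = a*x + b
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# hypothetical capacity-fade history (Ah per cycle)
capacity = [2.00, 1.96, 1.93, 1.89, 1.86, 1.83]

# train on (current capacity, next capacity) pairs
a, b = fit_line(capacity[:-1], capacity[1:])

# iterate the one-step model to forecast three cycles ahead
c = capacity[-1]
forecast = []
for _ in range(3):
    c = a * c + b          # each prediction becomes the next input
    forecast.append(round(c, 3))
print(forecast)
```

The real model in the paper replaces `fit_line` with SVR over energy efficiency and working temperature; the roll-forward loop is the "iterative multi-step" part.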
The Use of Logistic Regression in the Analysis of Data Concerning Good Medical Practice
Directory of Open Access Journals (Sweden)
Damon MN
2002-06-01
Full Text Available Logistic regression is one of the most commonly used models of explanatory multivariate analysis in epidemiology. Its use, which has become easier with modern statistical software, allows researchers to control for confounding bias. It measures the odds ratio, a quantification of the probability of association between a given occurrence, represented by a dichotomous variable, and the factors susceptible to influence it, represented by explanatory variables. The choice of explanatory variables integrated into the model is based on previous information on the study subject and is aimed at avoiding confounding factors which have already been identified. The authors explain the fundamental principles of logistic regression and the steps involved in its application. Using two examples (the quality of follow-up care given to diabetics, and in-hospital mortality after acute myocardial infarction), they demonstrate the value this statistical tool can have in studies performed by the medical service of the national health care fund, particularly in studies designed to evaluate professional practice.
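The odds ratio the abstract centers on can be computed two ways on a hypothetical 2x2 table (exposure vs. outcome; all counts invented): directly from the table, and as exp(beta) from a one-predictor logistic regression fitted here by plain gradient ascent. For a single binary predictor the two estimates coincide.

```python
import math

# hypothetical 2x2 table: exposed/event, exposed/no event,
# unexposed/event, unexposed/no event
n11, n10, n01, n00 = 30, 70, 10, 90
or_direct = (n11 * n00) / (n10 * n01)

# one (x=exposure, y=event) row per subject
data = [(1, 1)] * n11 + [(1, 0)] * n10 + [(0, 1)] * n01 + [(0, 0)] * n00

b0 = b1 = 0.0                          # intercept and exposure coefficient
for _ in range(5000):                  # gradient ascent on the log-likelihood
    g0 = g1 = 0.0
    for x, y in data:
        p = 1 / (1 + math.exp(-(b0 + b1 * x)))
        g0 += y - p
        g1 += (y - p) * x
    b0 += 1.0 * g0 / len(data)
    b1 += 1.0 * g1 / len(data)

print(round(or_direct, 3), round(math.exp(b1), 3))  # the two estimates agree
```

In a real analysis the regression earns its keep once further covariates are added, which is exactly the confounding control the abstract describes.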
Improved Regression Analysis of Temperature-Dependent Strain-Gage Balance Calibration Data
Ulbrich, N.
2015-01-01
An improved approach is discussed that may be used to directly include first and second order temperature effects in the load prediction algorithm of a wind tunnel strain-gage balance. The improved approach was designed for the Iterative Method that fits strain-gage outputs as a function of calibration loads and uses a load iteration scheme during the wind tunnel test to predict loads from measured gage outputs. The improved approach assumes that the strain-gage balance is at a constant uniform temperature when it is calibrated and used. First, the method introduces a new independent variable for the regression analysis of the balance calibration data. The new variable is defined as the difference between the uniform temperature of the balance and a global reference temperature. This reference temperature should be the primary calibration temperature of the balance so that, if needed, a tare load iteration can be performed. Then, two temperature-dependent terms are included in the regression models of the gage outputs: the temperature difference itself and the square of the temperature difference. Simulated temperature-dependent data obtained from Triumph Aerospace's 2013 calibration of NASA's ARC-30K five-component semi-span balance is used to illustrate the application of the improved approach.
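A minimal numerical sketch of the regression step (all numbers invented, not the balance's actual calibration data): fit a gage-output model that includes the temperature difference dT = T - T_ref and its square by ordinary least squares via the normal equations.

```python
def solve(A, b):
    # Gaussian elimination with partial pivoting for a small linear system
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ols(rows, y):
    # least squares via the normal equations X'X beta = X'y
    n = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(n)]
    return solve(A, b)

T_ref = 22.0  # assumed primary calibration temperature
# synthetic (load, temperature) points; output generated from
# output = 2.0*load + 0.05*dT - 0.001*dT**2
data = [(L, T) for L in (0.0, 50.0, 100.0) for T in (12.0, 22.0, 32.0)]
y = [2.0 * L + 0.05 * (T - T_ref) - 0.001 * (T - T_ref) ** 2 for L, T in data]
X = [[1.0, L, T - T_ref, (T - T_ref) ** 2] for L, T in data]
coefs = ols(X, y)
print([round(c, 4) for c in coefs])  # recovers the generating coefficients
```

The dT and dT² columns are exactly the two temperature-dependent terms the abstract adds to the gage-output regression model.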
Regression Analysis of Thermal Conductivity Based on Measurements of Compacted Graphite Irons
Selin, Martin; König, Mathias
2009-12-01
A model describing the thermal conductivity of compacted graphite iron (CGI) was created based on the microstructure analysis and thermal conductivity measurements of 76 compacted graphite samples. The thermal conductivity was measured using a laser flash apparatus for seven temperatures ranging between 35 °C and 600 °C. The model was created by solving a linear regression model taking into account the influence of carbon and silicon additions, nodularity, and fractions of ferrite and carbide constituents. Observations and the results from the model indicated a positive influence of the fraction of ferrite in the metal matrix on the thermal conductivity. Increasing the amount of carbon addition while keeping the CE value constant, i.e., at the same time reducing the silicon addition, had a positive effect on the thermal conductivity value. Nodularity is known to reduce the thermal conductivity and this was also confirmed. The fraction of carbides was low in the samples, making their influence slight. A comparison of the thermal conductivity values calculated from the model with measured values showed a good agreement, even on materials not used to solve the linear regression model.
International Nuclear Information System (INIS)
Purpose: The goal of this study was to maximize the discrimination between benign and malignant masses in patients with sonographically indeterminate ovarian lesions by means of unenhanced and contrast-enhanced MR imaging, and to develop a computer-assisted diagnosis system. Material and Methods: Findings in precontrast and Gd-DTPA contrast-enhanced MR images of 104 patients with 115 sonographically indeterminate ovarian masses were analyzed, and the results were correlated with histopathological findings. Of 115 lesions, 65 were benign (23 cystadenomas, 13 complex cysts, 11 teratomas, 6 fibrothecomas, 12 others) and 50 were malignant (32 ovarian carcinomas, 7 metastatic tumors of the ovary, 4 carcinomas of the fallopian tubes, 7 others). A logistic regression analysis was performed to discriminate between benign and malignant lesions, and a model for computer-assisted diagnosis was developed. This model was prospectively tested in 75 cases of ovarian tumors found at other institutions. Results: From the univariate analysis, the following parameters were selected as significant for predicting malignancy (p ≤ 0.05): a solid or cystic mass with a large solid component or wall thickness greater than 3 mm; complex internal architecture; ascites; and bilaterality. Based on these parameters, a model of a computer-assisted diagnosis system was developed with the logistic regression analysis. To distinguish benign from malignant lesions, the maximum cut-off point was obtained between 0.47 and 0.51. In a prospective application of this model, 87% of the lesions were accurately identified as benign or malignant. (orig.)
Multiple tumors. Analysis of 50 patients
International Nuclear Information System (INIS)
The description of multiple primary neoplasms dates from the late nineteenth century; Warren and Gates established the clinicopathological criteria for their diagnosis. The clinical frequency is 1.5 to 5.4% of cancers, and 5 to 11% in autopsy series. In recent years there has been an increase in second tumors, probably due to new staging strategies, closer monitoring of patients, and therapeutic results that have improved survival after the first diagnosis. Objective: Analysis of 50 patients with multiple tumors seen at the HCFF.AA Oncology Service between January 1997 and January 2004. Patients and methods: Patients registered at the H.C.FF.AA with two or more histologically documented malignant tumors were included. Medical records were reviewed for age, sex, date of diagnosis and tumor type. The frequency of these tumors and the interval between their occurrence were analyzed. Results: The 50 included patients represented 2.0% of the 2400 registered patients. The average age was 61 years (36-89 years). The median interval between the first and second tumor was 28 months (0-300). The most common tumors were breast carcinoma (23), non-melanoma skin tumors (15), colon adenocarcinoma (12), prostate (8) and kidney (6). By time of appearance, 10 were synchronous and 40 metachronous. Breast tumors were most often associated with endometrial (5), ovarian (3), colon (3) and kidney (3) tumors. Of the 50 patients, 42 had 2 tumors and 8 had 3 tumors. Conclusions: The frequency of multiple neoplasms in our series and their presentation over time do not differ from those reported by other authors. Monitoring of cancer patients and advances in diagnosis and therapy lead to an increased diagnosis of second tumors and a new therapeutic challenge.
Barbu, N.; Cuculeanu, V.; Stefan, S.
2015-08-01
The aim of this study is to investigate the relationship between the frequency of very warm days (TX90p) in Romania and large-scale atmospheric circulation for winter (December-February) and summer (June-August) between 1962 and 2010. To achieve this, two catalogues from COST733Action were used to derive daily circulation types. Seasonal occurrence frequencies of the circulation types were calculated and used as predictors within a multiple linear regression model (MLRM) for the estimation of winter and summer TX90p values for 85 synoptic stations covering all of Romania. A forward selection procedure was used to find adequate predictor combinations, and those combinations were tested for collinearity. The performance of the MLRMs was quantified by the explained variance. Furthermore, the leave-one-out cross-validation procedure was applied and the root-mean-squared error skill score was calculated at station level in order to obtain reliable evidence of MLRM robustness. From this analysis, it can be stated that the MLRM performance is higher in winter than in summer. This is due to the annual cycle of incoming insolation and to local factors such as orography and surface albedo variations. The MLRM performances exhibit distinct regional variations, with high performance in wintertime for the eastern and southern parts of the country and in summertime for the western part. One can conclude that the MLRM generally captures the TX90p variability quite well and reveals the potential for statistical downscaling of TX90p values based on circulation types.
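The validation step described above can be sketched in a few lines: leave-one-out cross-validation of a one-predictor linear model, scored with an RMSE skill score relative to a climatological (mean) forecast. The station data below are invented.

```python
import math

def fit_line(xs, ys):
    # ordinary least-squares slope and intercept
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# hypothetical (circulation-type frequency, TX90p) pairs for one station
xs = [10, 14, 9, 16, 12, 18, 11, 15]
ys = [4.0, 6.1, 3.6, 7.2, 5.0, 8.3, 4.4, 6.8]

errs, clim_errs = [], []
for i in range(len(xs)):
    tr_x = xs[:i] + xs[i + 1:]         # leave observation i out
    tr_y = ys[:i] + ys[i + 1:]
    a, b = fit_line(tr_x, tr_y)
    errs.append((a * xs[i] + b - ys[i]) ** 2)
    clim = sum(tr_y) / len(tr_y)       # climatology forecast
    clim_errs.append((clim - ys[i]) ** 2)

rmse = math.sqrt(sum(errs) / len(errs))
rmse_clim = math.sqrt(sum(clim_errs) / len(clim_errs))
skill = 1 - rmse / rmse_clim           # > 0: the regression beats climatology
print(round(skill, 2))
```

The paper's MLRM has several predictors per season, but the skill-score logic is the same: a positive score means the circulation-type model outperforms the station mean.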
Analysis of the Multiple SGTR of SMART
Energy Technology Data Exchange (ETDEWEB)
Lee, Seong Wook; Kim, Hee Kyung; Bae, Kyoo Hwan; Choi, Suhn [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)
2011-10-15
An advanced integral pressurized water reactor (PWR), SMART (System-integrated Modular Advanced ReacTor), with a rated thermal power of 330 MW, is under development by KAERI. SMART adopts a helical once-through type steam generator producing superheated steam in normal operation and a passive residual heat removal system (PRHRS) for decay heat removal after reactor shutdown, as shown in Fig. 1. As a design basis event, the single steam generator tube rupture (SGTR) has been analyzed. Recently, the licensing body has required that the advanced reactor demonstrate its capability for the multiple steam generator tube rupture (MSGTR). Therefore, in this study, an analysis of the MSGTR in SMART has been carried out to demonstrate the plant's proper response.
Estimation of Output Disturbance in Auto-Regressive Model via Independent Component Analysis
Directory of Open Access Journals (Sweden)
R. Tanaka
2013-02-01
Full Text Available This paper explains and demonstrates how to estimate an output disturbance in an auto-regressive model. This method uses the independent component analysis (ICA technique, which restores source signals from their linear mixtures under the assumption that the source signals are mutually independent. The estimation is achieved by a model whose source signals consist of input and output disturbance, and observed signals consist of input and output. To solve the ICA problem, a natural gradient method based on mutual information is adopted. As a result, in this simulation, the NRR of our proposed method shows an improvement of about 4.0 [dB] compared with that of a conventional method.
Directory of Open Access Journals (Sweden)
Ashok Kumar Sahoo
2014-04-01
Full Text Available The objective of the study is to assess the performance of a multilayer coated carbide insert in the machining of hardened AISI D2 steel (53 HRC) using the Taguchi design of experiment. The experiment was designed on a Taguchi L27 orthogonal array to predict surface roughness. The S/N ratio and the optimum parametric condition are analysed. An analysis of variance has also been carried out to identify the significant factors affecting surface roughness. Based on the Taguchi S/N ratio and ANOVA, feed is the most influential parameter for surface roughness, followed by cutting speed, whereas depth of cut is least significant. In the regression model, the R2 value of 0.98 indicates that 98% of the total variation is explained by the model, so the developed model can be effectively used to predict the surface roughness in the machining of D2 steel at a 95% confidence level.
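The Taguchi signal-to-noise ratio used above can be sketched directly. Surface roughness is a "smaller-the-better" response, for which S/N = -10·log10(mean(y²)); the roughness replicates below are invented.

```python
import math

def sn_smaller_better(ys):
    # Taguchi S/N ratio for a smaller-the-better characteristic
    return -10 * math.log10(sum(y * y for y in ys) / len(ys))

run_a = [0.82, 0.85, 0.80]   # Ra replicates (micrometres) for one setting
run_b = [1.40, 1.52, 1.45]   # a rougher parameter setting

sn_a = sn_smaller_better(run_a)
sn_b = sn_smaller_better(run_b)
print(round(sn_a, 2), round(sn_b, 2))  # the higher S/N marks the better setting
```

Ranking each factor level by its mean S/N across the L27 runs is what yields the "feed most influential, depth of cut least" conclusion in the abstract.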
Deterministic Assessment of Continuous Flight Auger Construction Durations Using Regression Analysis
Directory of Open Access Journals (Sweden)
Hossam E. Hosny
2015-07-01
Full Text Available One of the primary functions of construction equipment management is to calculate the production rate of equipment, which is a major input to time estimates, cost estimates and overall project planning. Accordingly, it is crucial for stakeholders to be able to compute equipment production rates using an accurate, reliable and easy tool. The objective of this research is to provide a simple model that specialists can use to predict the duration of a proposed Continuous Flight Auger job. The model was obtained using a prioritizing technique based on expert judgment, followed by multi-regression analysis on a representative sample, and was then validated on a selected sample of projects. The average error of the model was calculated to be about 3%-6%.
Shen, Chung-Wei; Chen, Yi-Hau
2015-10-01
Missing observations and covariate measurement error commonly arise in longitudinal data. However, existing methods for model selection in marginal regression analysis of longitudinal data fail to address the potential bias resulting from these issues. To tackle this problem, we propose a new model selection criterion, the Generalized Longitudinal Information Criterion, which is based on an approximately unbiased estimator for the expected quadratic error of a considered marginal model accounting for both data missingness and covariate measurement error. The simulation results reveal that the proposed method performs quite well in the presence of missing data and covariate measurement error. In contrast, naive procedures that ignore such complexity in the data may perform quite poorly. The proposed method is applied to data from the Taiwan Longitudinal Study on Aging to assess the relationship of depression with health and social status in the elderly, accommodating measurement error in the covariate as well as missing observations. PMID:26012353
Yuichi Sarusawa; Kohei Arai
2013-01-01
Regression analysis based method for turbidity and ocean current velocity estimation with remote sensing satellite data is proposed. Through regressive analysis with MODIS data and measured data of turbidity and ocean current velocity, regressive equation which allows estimation of turbidity and ocean current velocity is obtained. With the regressive equation as well as long term MODIS data, turbidity and ocean current velocity trends in Ariake Sea area are clarified. It is also confirmed tha...
The Analysis of Internet Addiction Scale Using Multivariate Adaptive Regression Splines
Directory of Open Access Journals (Sweden)
M Kayri
2010-12-01
Full Text Available Background: Determining the real effects on internet dependency requires an unbiased and robust statistical method. MARS is a relatively new non-parametric method used in the literature for parameter estimation in cause-and-effect research. MARS can both produce legible model curves and make unbiased parametric predictions. Methods: To examine the performance of MARS, its findings were compared to those of Classification and Regression Trees (C&RT), which the literature considers efficient in revealing correlations between variables. The data set for the study is taken from "The Internet Addiction Scale" (IAS), which attempts to reveal addiction levels of individuals. The population of the study consists of 754 secondary school students (301 female, 443 male) with 10 missing data. The MARS 2.0 trial version was used for the MARS analysis, and the C&RT analysis was done with SPSS. Results: MARS obtained six basis functions for the model, and from these the regression equation of the model was found. MARS showed that average daily internet-use time, the purpose of internet use, the students' grade and the mothers' occupations had a significant effect on the predicted variable (P < 0.05). In this comparative study, MARS obtained findings different from C&RT in predicting dependency level. Conclusion: This study observed the extent to which MARS revealed how a variable considered significant changes the character of the model.
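The basis functions MARS builds its model from are piecewise-linear hinges, h(x) = max(0, x - knot), and their mirrors. A toy illustration (knot and coefficients invented, not the study's fitted model):

```python
def hinge(x, knot):
    # MARS-style hinge basis function
    return max(0.0, x - knot)

def mirror_hinge(x, knot):
    return max(0.0, knot - x)

# a hypothetical MARS-style fit of addiction score vs. daily hours online:
# one knot at 2 hours, different slopes on each side
def predict(hours):
    return 10.0 + 4.5 * hinge(hours, 2.0) + 0.8 * mirror_hinge(hours, 2.0)

print(predict(1.0), predict(2.0), predict(5.0))
```

The forward pass of MARS adds such hinge pairs greedily and the backward pass prunes them, which is how it produces the six basis functions mentioned in the abstract while staying readable as an ordinary regression equation.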
Ridge Regression Analysis on the Influential Factors of FDI in Jiangsu Province
Directory of Open Access Journals (Sweden)
Yang CAO
2008-08-01
Full Text Available
As one of China's developed eastern coastal areas, Jiangsu Province has, through the use of foreign capital, not only promoted rapid economic growth, enhanced regional comprehensive competitiveness and promoted employment, but also created a famous new mode of economic development called Sunan. Based on a qualitative analysis of the factors affecting the inflow of foreign capital into Jiangsu, the paper establishes a mathematical model between FDI and major economic indicators in Jiangsu, in accordance with its own characteristics. Then, taking 1992-2006 time-series data as the background, the paper uses the method of ridge regression to analyse the influential factors of FDI in Jiangsu.
Key words: foreign direct investment, ridge regression, factors, Jiangsu
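Ridge regression stabilizes coefficient estimates under multicollinearity by solving (X'X + kI)β = X'y instead of the ordinary normal equations. A minimal sketch for two nearly collinear predictors (all numbers invented, standing in for the correlated economic indicators in the paper):

```python
def ridge2(rows, y, k):
    # closed-form ridge solution for the two-predictor case:
    # (X'X + k*I) beta = X'y, solved with the 2x2 inverse
    s11 = sum(r[0] * r[0] for r in rows) + k
    s12 = sum(r[0] * r[1] for r in rows)
    s22 = sum(r[1] * r[1] for r in rows) + k
    c1 = sum(r[0] * yi for r, yi in zip(rows, y))
    c2 = sum(r[1] * yi for r, yi in zip(rows, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * c1 - s12 * c2) / det, (s11 * c2 - s12 * c1) / det)

# x2 is nearly a copy of x1, so X'X is almost singular
X = [(1.0, 1.01), (2.0, 1.98), (3.0, 3.03), (4.0, 3.97)]
y = [2.1, 4.0, 6.2, 7.9]

for k in (0.0, 0.1, 1.0):
    b = ridge2(X, y, k)
    print(k, [round(v, 2) for v in b])
```

With k = 0 (ordinary least squares) the two coefficients are large and of opposite sign; even a small ridge constant pulls them to similar, stable values, which is exactly why the paper prefers ridge regression for its correlated indicators.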
Applying support vector regression analysis on grip force level-related corticomuscular coherence
DEFF Research Database (Denmark)
Rong, Yao; Han, Xixuan
2014-01-01
Voluntary motor performance is the result of cortical commands driving muscle actions. Corticomuscular coherence can be used to examine the functional coupling or communication between human brain and muscles. To investigate the effects of grip force level on corticomuscular coherence in an accessory muscle, this study proposed an expanded support vector regression (ESVR) algorithm to quantify the coherence between electroencephalogram (EEG) from sensorimotor cortex and surface electromyogram (EMG) from brachioradialis in upper limb. A measure called coherence proportion was introduced to compare the corticomuscular coherence in the alpha (7–15Hz), beta (15–30Hz) and gamma (30–45Hz) band at 25 % maximum grip force (MGF) and 75 % MGF. Results show that ESVR could reduce the influence of deflected signals and summarize the overall behavior of multiple coherence curves. Coherence proportion is more sensitive to grip force level than coherence area. The significantly higher corticomuscular coherence occurred in the alpha (p<0.01) and beta band (p<0.01) during 75 % MGF, but in the gamma band (p<0.01) during 25 % MGF. The results suggest that sensorimotor cortex might control the activity of an accessory muscle for hand grip with increased grip intensity by changing functional corticomuscular coupling at certain frequency bands (alpha, beta and gamma bands).
Directory of Open Access Journals (Sweden)
G. Selvaraju
2013-12-01
Full Text Available Aim: A study was undertaken to develop a forecasting model for predicting bluetongue outbreaks in the North-west agroclimatic zone of Tamil Nadu, India. Materials and Methods: Eleven bluetongue outbreaks were characterised by active and passive surveillance over a period of twelve years and used in this study. Meteorological data comprising maximum and minimum temperatures, relative humidity, rainfall and wind speed were collected and used as the predictor variables in a multiple linear regression model. Results: A multiple linear regression model was developed for the North-west zone of Tamil Nadu. Values of the dependent variable less than or greater than one indicated remote or greater chances of bluetongue outbreaks, respectively. Monthly mean maximum and minimum temperatures of 29.1-31.0 °C and 20.1-22.0 °C, relative humidity at 8.30 h and 17.00 h IST of 80.1-85.0% and 65.1-70.0%, wind speed of 3.1-5.0 km/h and monthly total rainfall of less than 200 mm were identified as the ideal climatic conditions for increased numbers of bluetongue outbreaks in this zone. Conclusion: Based on the values obtained from the prediction model, stakeholders can be warned in a timely manner through the media to institute suitable prophylactic measures against bluetongue and avoid economic losses due to the disease. [Vet World 2013; 6(6): 321-324]
Directory of Open Access Journals (Sweden)
K.Satyanarayana
2013-06-01
Full Text Available The present work deals with the cutting forces and cutting temperature produced during turning of the titanium alloy Ti-6Al-4V with PVD TiN-coated tungsten carbide inserts in a dry environment. First-order mathematical models are developed using multiple regression analysis, and the process parameters are optimized using contour plots. The models presented high determination coefficients (R2 = 0.964 and 0.989), explaining 96.4% and 98.9% of the variability in the cutting force and cutting temperature, which indicates the goodness of fit and high significance of the models. The developed mathematical models correlate the cutting force and temperature with the process parameters to a good degree of approximation. From the contour plots, the optimal parametric combination for the lowest cutting force is v3 (75 m/min) - f1 (0.25 mm/rev); similarly, the optimal combination for minimum temperature is v1 (45 m/min) - f1 (0.25 mm/rev). Cutting speed is found to be the most significant parameter for cutting forces, followed by feed; for cutting temperature, feed is the most influential parameter, followed by cutting speed.
Directory of Open Access Journals (Sweden)
Süleyman Demir
2014-04-01
Full Text Available This study performs a Differential Item Functioning (DIF) analysis in terms of gender and culture on the items of the PISA 2009 mathematics literacy sub-test. The DIF analyses were done using the Mantel-Haenszel, logistic regression and SIBTEST methods. The data for the gender variable came from the responses of 332 students to the items of the mathematics literacy sub-test in the 5th booklet of the PISA 2009 application, whereas the data for the culture variable came from the administration of the 5th booklet in Turkey, Germany, Finland and the United States. In the DIF analysis by gender, four items functioned in favor of boys and only one item in favor of girls. In the DIF analysis by culture, 16 items showing DIF were identified between Turkish and German students, 14 between Turkish and Finnish students, and 18 between Turkish and American students.
A unified framework for association analysis with multiple related phenotypes.
Stephens, Matthew
2013-01-01
We consider the problem of assessing associations between multiple related outcome variables and a single explanatory variable of interest. This problem arises in many settings, including genetic association studies, where the explanatory variable is genotype at a genetic variant. We outline a framework for conducting this type of analysis, based on Bayesian model comparison and model averaging for multivariate regressions. This framework unifies several common approaches to this problem, and includes both standard univariate and standard multivariate association tests as special cases. The framework also unifies the problems of testing for associations and explaining associations - that is, identifying which outcome variables are associated with genotype. This provides an alternative to the usual, but conceptually unsatisfying, approach of resorting to univariate tests when explaining and interpreting significant multivariate findings. The method is computationally tractable genome-wide for modest numbers of phenotypes (e.g. 5-10), and can be applied to summary data, without access to raw genotype and phenotype data. We illustrate the methods on both simulated examples and on a genome-wide association study of blood lipid traits, where we identify 18 potential novel genetic associations that were not identified by univariate analyses of the same data. PMID:23861737
International Nuclear Information System (INIS)
Mixed dissociation constants of four drug acids, i.e. silychristin, silybinin, silydianin and mycophenolate, at ionic strengths I in the range 0.01-0.30 and at temperatures of 25 and 37 deg. C were determined using the SQUAD(84) regression analysis program applied to pH-spectrophotometric titration data. The proposed strategy of efficient experimentation in a protonation constant determination, followed by a computational strategy for the chemical model with a protonation constant determination, is presented for the protonation equilibria of silychristin. The thermodynamic dissociation constant pKaT was estimated by non-linear regression of (pKa, I) data at 25 and 37 deg. C: for silychristin pKa,1T=6.52(16) and 6.62(1), pKa,2T=7.22(13) and 7.41(5), pKa,3T=8.96(9) and 8.94(9), pKa,4T=10.17(7) and 10.03(8), pKa,5T=11.89(4) and 11.63(7); for silybin pKa,1T=7.00(4) and 6.86(5), pKa,2T=8.77(11) and 8.77(3), pKa,3T=9.57(8) and 9.62(1), pKa,4T=11.66(3) and 11.38(1); for silydianin pKa,1T=6.64(7) and 7.10(6), pKa,2T=7.78(5) and 8.93(1), pKa,3T=9.66(9) and 10.06(11), pKa,4T=10.71(7) and 10.77(7), pKa,5T=12.26(5) and 12.14(5); for mycophenolate pKaT=8.32(1) and 8.14(1). Goodness-of-fit tests with various regression diagnostics enabled the reliability of the parameter estimates to be assessed.
Quantile Regression in the Study of Developmental Sciences
Petscher, Yaacov; Logan, Jessica A.R.
2013-01-01
Linear regression analysis is one of the most common techniques applied in developmental research, but only allows for an estimate of the average relations between the predictor(s) and the outcome. This study describes quantile regression, which provides estimates of the relations between the predictor(s) and outcome, but across multiple points of the outcome’s distribution. Using data from the High School and Beyond and U.S. Sustained Effects Study databases, quantile regression is demonstra...
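The distinguishing feature of quantile regression is the pinball (check) loss it minimizes. For a constant-only model, the minimizer of this loss is the sample τ-quantile; the sketch below (scores invented) shows this with a direct search over the data points, where the piecewise-linear loss must attain its minimum.

```python
def pinball(residual, tau):
    # check loss: weights positive and negative residuals asymmetrically
    return tau * residual if residual >= 0 else (tau - 1) * residual

def quantile_fit(ys, tau):
    # the loss is piecewise linear in the constant q, so its minimizer
    # is attained at one of the data points: search them directly
    return min(ys, key=lambda q: sum(pinball(y - q, tau) for y in ys))

scores = [3, 5, 6, 7, 8, 9, 12, 15, 21]
print(quantile_fit(scores, 0.5), quantile_fit(scores, 0.9))  # -> 8 21
```

For τ = 0.5 this recovers the median; larger τ targets the upper tail of the outcome distribution, which is exactly what lets quantile regression describe relations "across multiple points of the outcome's distribution" rather than only the mean.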
Directory of Open Access Journals (Sweden)
Pudji Ismartini
2010-08-01
Full Text Available One of the major problems in modelling social data is multicollinearity, which can have a significant impact on the quality and stability of the fitted regression model. The common classical regression technique using least squares estimates is highly sensitive to multicollinearity. In such problem areas, Partial Least Squares Regression (PLSR) is a useful and flexible tool for statistical model building; however, PLSR can only yield point estimates. This paper constructs interval estimates for the PLSR regression parameters by applying the jackknife technique to poverty data. A SAS macro program is developed to obtain the jackknife interval estimator for PLSR.
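The jackknife interval idea can be sketched with a simple OLS slope standing in for the PLSR coefficient (the paper uses SAS and PLSR; the data here are invented): refit leaving out one observation at a time, then form a normal-approximation interval from the jackknife variance.

```python
import math

def slope(xs, ys):
    # ordinary least-squares slope, a stand-in for a PLSR coefficient
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
           sum((x - mx) ** 2 for x in xs)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.2, 3.9, 6.1, 8.0, 9.8, 12.3, 13.9, 16.2]

n = len(xs)
theta = slope(xs, ys)
# leave-one-out re-estimates
loo = [slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]) for i in range(n)]
mean_loo = sum(loo) / n
var_jack = (n - 1) / n * sum((t - mean_loo) ** 2 for t in loo)
se = math.sqrt(var_jack)
lo, hi = theta - 1.96 * se, theta + 1.96 * se
print(round(theta, 3), round(lo, 3), round(hi, 3))
```

Replacing `slope` with a full PLSR fit on each leave-one-out sample gives the interval estimator the paper implements as a SAS macro.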
Multiple factor analysis with continuous and dichotomous variables
Thanoon, Thanoon Y.; Adnan, Robiah; Saffari, Seyed Ehsan
2014-12-01
In this paper, continuous and dichotomous variables are used in the multiple factor analysis method. When all variables within the same group are continuous, we use principal component analysis in the factor analysis; if all variables within the same group are dichotomous, we use multiple correspondence analysis. Statistical analyses involving eigenvalues, eigenvectors, multiple factor loadings, the RV correlation coefficient and contribution tables are discussed. The proposed procedure is illustrated by lung cancer data consisting of four groups: personal variables, therapeutic variables, nutritional variables and genetic variables. Analyses are done using the XLSTAT program.
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications. Douglas M. Bates and Donald G. Watts. "…an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models…highly recommend[ed]…for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
Generalized multilevel function-on-scalar regression and principal component analysis.
Goldsmith, Jeff; Zipunnikov, Vadim; Schrack, Jennifer
2015-06-01
This manuscript considers regression models for generalized, multilevel functional responses: functions are generalized in that they follow an exponential family distribution and multilevel in that they are clustered within groups or subjects. This data structure is increasingly common across scientific domains and is exemplified by our motivating example, in which binary curves indicating physical activity or inactivity are observed for nearly 600 subjects over 5 days. We use a generalized linear model to incorporate scalar covariates into the mean structure, and decompose subject-specific and subject-day-specific deviations using multilevel functional principal components analysis. Thus, functional fixed effects are estimated while accounting for within-function and within-subject correlations, and major directions of variability within and between subjects are identified. Fixed effect coefficient functions and principal component basis functions are estimated using penalized splines; model parameters are estimated in a Bayesian framework using Stan, a programming language that implements a Hamiltonian Monte Carlo sampler. Simulations designed to mimic the application have good estimation and inferential properties with reasonable computation times for moderate datasets, in both cross-sectional and multilevel scenarios; code is publicly available. In the application we identify effects of age and BMI on the time-specific change in probability of being active over a 24-hour period; in addition, the principal components analysis identifies the patterns of activity that distinguish subjects and days within subjects. PMID:25620473
Huang, Xuelin; Zhang, Nan
2008-01-01
In clinical studies, when censoring is caused by competing risks or patient withdrawal, there is always a concern about the validity of treatment effect estimates that are obtained under the assumption of independent censoring. Since dependent censoring is non-identifiable without additional information, the best we can do is a sensitivity analysis to assess the changes of parameter estimates under different degrees of assumed dependent censoring. Such an analysis is especially useful when kn...
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
The estimation of technical efficiency has generated a vast literature in the field of applied production economics. There are two predominant approaches: the non-parametric and non-stochastic Data Envelopment Analysis (DEA) and the parametric Stochastic Frontier Analysis (SFA). The DEA is criticised because it cannot account for statistical noise such as random production shocks and measurement errors, which are inherent in more or less all production data sets. In contrast, the SFA is criticised because it requires the specification of a functional form, which involves the risk of specifying an unsuitable functional form and thus model misspecification and biased parameter estimates. Given these problems of the DEA and the SFA, Fan, Li and Weersink (1996) proposed a semi-parametric stochastic frontier model that estimates the production function (frontier) by non-parametric regression based on kernel estimators. This approach combines the virtues of the DEA and the SFA while avoiding their drawbacks: it avoids the specification of a functional form and at the same time accounts for statistical noise. More recently, this approach was used by Henderson and Simar (2005), Kumbhakar et al. (2007), and Henningsen and Kumbhakar (2009). The aim of this paper, and its main contribution to the existing literature, is the estimation of semi-parametric stochastic frontier models using a different non-parametric estimation technique: spline regression (Ma et al. 2011). We apply this approach to the Polish dairy sector and use a panel data set of Polish dairy farms from the years 2004-2010. The Polish dairy sector has changed considerably since the integration of Poland into the European Union: the number of dairy producers decreased by one third and the average herd size increased from 3.8 to 5.7 cows per farm within the period 2004-2010.
It is expected that farms with small herds (fewer than 30 dairy cows) will quit and that the number of large farms (with more than 100 dairy cows) will increase. Therefore, a thorough empirical study of the technical efficiency and scale efficiency of Polish dairy farms contributes to the insight into this dynamic process. Furthermore, we compare and evaluate the results of this spline-based semi-parametric stochastic frontier model with results of other semi-parametric stochastic frontier models and of traditional parametric stochastic frontier models. References: Fan, Y., Li, Q. & Weersink, A. (1996), Semiparametric Estimation of Stochastic Production Frontier Models, Journal of Business and Economic Statistics. Henderson, D. J. & Simar, L. (2005), A Fully Nonparametric Stochastic Frontier Model for Panel Data, University of New York. Henningsen, A. & Kumbhakar, S. C. (2009), Semiparametric Stochastic Frontier Analysis: An Application to Polish Farms During Transition, paper presented at the European Workshop on Efficiency and Productivity Analysis (EWEPA) in Pisa, Italy. Kumbhakar, S. C., Park, B. U., Simar, L. & Tsionas, E. G. (2007), Nonparametric Stochastic Frontiers: A Local Maximum Likelihood Approach, Journal of Econometrics. Ma, S., Racine, J. S. & Yang, L. (2011), Spline Regression in the Presence of Categorical Predictors, working paper.
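The two-stage idea can be sketched under the standard normal/half-normal composed-error assumption, with a generic smoothing spline standing in for the paper's categorical-predictor spline estimator. scipy's UnivariateSpline, the smoothing constant, and all data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(1)
n = 2000
x = np.sort(rng.uniform(0, 10, n))
frontier = 2 + np.log1p(x)                   # true production frontier f(x)
u = 0.4 * np.abs(rng.normal(size=n))         # half-normal inefficiency (sigma_u = 0.4)
v = 0.2 * rng.normal(size=n)                 # two-sided statistical noise
y = frontier - u + v

# stage 1: spline regression estimates the conditional mean E[y|x] = f(x) - E[u]
g_hat = UnivariateSpline(x, y, k=3, s=0.12 * n)

# stage 2: method of moments on the residuals (normal/half-normal assumption):
# the third central moment of v - u equals -sqrt(2/pi) * (4/pi - 1) * sigma_u^3
e = y - g_hat(x)
m3 = np.mean((e - e.mean()) ** 3)
c = np.sqrt(2 / np.pi) * (4 / np.pi - 1)
sigma_u_hat = (max(-m3, 0.0) / c) ** (1 / 3)

# shift the conditional mean up by E[u] = sigma_u * sqrt(2/pi) to get the frontier
frontier_hat = g_hat(x) + sigma_u_hat * np.sqrt(2 / np.pi)
print(f"sigma_u_hat = {sigma_u_hat:.2f} (true 0.4)")
```

The shift in the last step is what distinguishes a frontier estimate from an ordinary conditional-mean regression.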
Kügler, S. D.; Polsterer, K.; Hoecker, M.
2015-04-01
Context. In astronomy, new approaches to process and analyze the exponentially increasing amount of data are inevitable. For spectra, such as in the Sloan Digital Sky Survey spectral database, usually templates of well-known classes are used for classification. In case the fitting of a template fails, wrong spectral properties (e.g. redshift) are derived. Validation of the derived properties is the key to understand the caveats of the template-based method. Aims: In this paper we present a method for statistically computing the redshift z based on a similarity approach. This allows us to determine redshifts in spectra for emission and absorption features without using any predefined model. Additionally, we show how to determine the redshift based on single features. As a consequence we are, for example, able to filter objects that show multiple redshift components. Methods: The redshift calculation is performed by comparing predefined regions in the spectra and individually applying a nearest neighbor regression model to each predefined emission and absorption region. Results: The choice of the model parameters controls the quality and the completeness of the redshifts. For ~90% of the analyzed 16 000 spectra of our reference and test sample, a certain redshift can be computed that is comparable to the completeness of SDSS (96%). The redshift calculation yields a precision for every individually tested feature that is comparable to the overall precision of the redshifts of SDSS. Using the new method to compute redshifts, we could also identify 14 spectra with a significant shift between emission and absorption or between emission and emission lines. The results already show the immense power of this simple machine-learning approach for investigating huge databases such as the SDSS.
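The core of such a similarity approach is plain nearest-neighbour regression on the position of a spectral feature. A toy numpy sketch, not the authors' pipeline: the use of the H-alpha line position as the single feature, and all other specifics, are illustrative assumptions.

```python
import numpy as np

def knn_regress(X_train, y_train, X_query, k=5):
    """Predict by averaging the targets of the k nearest training points (1-D features)."""
    d = np.abs(X_train[None, :] - X_query[:, None])   # pairwise distances
    idx = np.argsort(d, axis=1)[:, :k]
    return y_train[idx].mean(axis=1)

rng = np.random.default_rng(2)
z_train = rng.uniform(0.0, 0.5, 500)            # known redshifts
lam0 = 6563.0                                   # rest wavelength of H-alpha (Angstrom)
lam_train = lam0 * (1 + z_train) + rng.normal(0, 1.0, 500)  # observed line positions

z_test = np.array([0.1, 0.25, 0.4])
lam_test = lam0 * (1 + z_test)
z_pred = knn_regress(lam_train, z_train, lam_test, k=5)
print(np.round(z_pred, 3))
```

Because each feature region gets its own regressor in the paper, disagreement between per-feature predictions is what flags multiple redshift components.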
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
Quantile regression is emerging as a popular statistical approach that complements the estimation of conditional mean models. While the latter focuses on only one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even if the mean regression model does not. We provide a short informal introduction to the principle of quantile regression, including an illustrative application from empirical labor market research. This is followed by a brief sketch of the underlying statistical model for linear quantile regression based on a cross-section sample. We summarize various important extensions of the model, including the nonlinear quantile regression model, censored quantile regression, and quantile regression for time-series data. We also discuss a number of more recent extensions of the quantile regression model to censored data, duration data, and endogeneity, and we describe how quantile regression can be used for decomposition analysis. Finally, we identify several key issues that should be addressed by future research, and we provide an overview of quantile regression implementations in major statistics software. Our treatment of the topic is based on the perspective of applied researchers using quantile regression in their empirical work.
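Linear quantile regression minimizes the asymmetric "pinball" loss, which can be cast as a linear program. A minimal self-contained sketch using scipy's linprog (dedicated implementations such as statsmodels' QuantReg offer the same in one call; the data and names here are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def quantile_regression(X, y, tau):
    """Fit min_b sum_i rho_tau(y_i - x_i'b) as an LP with slack variables u, v >= 0."""
    n, p = X.shape
    # variables: [b (p, free), u (n), v (n)]  subject to  y = Xb + u - v
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]

rng = np.random.default_rng(3)
n = 300
x = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x + rng.normal(0, 1.0, n)    # symmetric noise: median line = mean line
X = np.column_stack([np.ones(n), x])

b_med = quantile_regression(X, y, tau=0.5)
print(f"median-fit intercept, slope: {b_med.round(2)}")
```

Refitting with tau = 0.1 and tau = 0.9 and comparing the slopes is exactly the check the abstract describes: identical slopes suggest a pure location shift, diverging slopes indicate heterogeneous effects.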
Kaplan, David
2005-01-01
This article considers the problem of estimating dynamic linear regression models when the data are generated from finite mixture probability density function where the mixture components are characterized by different dynamic regression model parameters. Specifically, conventional linear models assume that the data are generated by a single…
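The finite-mixture-of-regressions setup the abstract describes is typically estimated with the EM algorithm. A hedged numpy sketch for a static two-component mixture (the article's dynamic model is not reproduced; all names and data are illustrative):

```python
import numpy as np

def em_mixture_regression(X, y, K=2, iters=50, seed=0):
    """EM for a K-component mixture of linear regressions with Gaussian errors."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    B = rng.normal(size=(K, p))                        # component coefficients
    sig2 = np.ones(K)
    pi = np.ones(K) / K
    ll_trace = []
    for _ in range(iters):
        # E-step: responsibilities of each component for each observation
        resid = y[:, None] - X @ B.T                   # (n, K)
        dens = np.exp(-0.5 * resid**2 / sig2) / np.sqrt(2 * np.pi * sig2)
        num = pi * dens
        ll_trace.append(np.log(num.sum(axis=1)).sum())
        r = num / num.sum(axis=1, keepdims=True)
        # M-step: weighted least squares per component
        for k in range(K):
            w = np.sqrt(r[:, k])
            B[k] = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]
            sig2[k] = (r[:, k] * (y - X @ B[k])**2).sum() / r[:, k].sum()
        pi = r.mean(axis=0)
    return B, pi, ll_trace

rng = np.random.default_rng(4)
n = 400
x = rng.uniform(-3, 3, n)
comp = rng.random(n) < 0.5                             # hidden component labels
y = np.where(comp, 2.0 + 3.0 * x, -2.0 - 1.0 * x) + rng.normal(0, 0.4, n)
X = np.column_stack([np.ones(n), x])

B, pi, ll = em_mixture_regression(X, y, K=2)
```

A useful sanity check on any EM implementation is that the log-likelihood trace never decreases.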
CATEGORICAL REGRESSION ANALYSIS OF ACUTE INHALATION TOXICITY DATA FOR HYDROGEN SULFIDE
Categorical regression is one of the tools offered by the U.S. EPA for derivation of acute reference exposures (AREs), which are dose-response assessments for acute exposures to inhaled chemicals. Categorical regression is used as a meta-analytical technique to calculate probabi...
International Nuclear Information System (INIS)
This study aims to perform a regression analysis leading to the optimization of the operating conditions of ionic liquid (IL) 1-ethyl-3-methylimidazolium acetate ([EMIM]OAc) pretreatment of sugarcane bagasse (SCB). The structural changes in SCB during pretreatment were also examined. The effects of temperature, time and solid loading on the reducing sugar (RS) yield obtained from enzymatic hydrolysis of pretreated SCB were investigated by applying a Central Composite Design (CCD) of Response Surface Methodology (RSM). Results from the CCD were modeled into a second-order polynomial equation, and the model shows a good correlation between predicted and experimental values. The optimized condition for [EMIM]OAc pretreatment was 145 °C, 15 min and 14 wt% solid loading, with an optimum RS yield of 69.7%. Characterization of the SCB was carried out, and there was no significant difference between the chemical composition of untreated and [EMIM]OAc-pretreated SCB. Pretreated SCB was found to be porous, less crystalline and favorable to enzymatic hydrolysis, as shown by Scanning Electron Microscopy (SEM), X-ray Powder Diffraction (XRD) and Fourier Transform Infrared (FTIR) analyses. In short, [EMIM]OAc pretreatment performs well in improving the RS yield after enzymatic hydrolysis, besides giving desirable structural modification of the pretreated SCB; both are of great benefit to the subsequent downstream processes. Highlights: reliable model prediction of reducing sugar yield; temperature has the most significant effect on [EMIM]OAc pretreatment; high solid loading in [EMIM]OAc pretreatment is feasible; the amorphous, porous structure of pretreated bagasse was confirmed; no significant variation in chemical composition between untreated and pretreated bagasse.
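The RSM step, fitting a second-order polynomial and locating its stationary point, can be sketched as follows. This toy uses two coded factors on a 5x5 factorial grid rather than the study's three-factor central composite design, and all numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
# coded levels of two factors (say, temperature and time) on a 5x5 grid
x1, x2 = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
x1, x2 = x1.ravel(), x2.ravel()

# true quadratic response surface with optimum at (0.5, -0.2) in coded units
y = 70 - 8 * (x1 - 0.5)**2 - 5 * (x2 + 0.2)**2 + rng.normal(0, 0.3, x1.size)

# fit y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2
D = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
b, *_ = np.linalg.lstsq(D, y, rcond=None)

# stationary point: solve [2*b11, b12; b12, 2*b22] s = -[b1; b2]
H = np.array([[2 * b[3], b[5]], [b[5], 2 * b[4]]])
s = np.linalg.solve(H, -b[1:3])
print(f"estimated optimum (coded units): {s.round(2)}")
```

In practice the stationary point is then decoded back to physical units (here it would map to the reported 145 °C, 15 min, 14 wt%) and checked to be a maximum via the eigenvalues of H.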
Vozinaki, Anthi Eirini K.; Karatzas, George P.; Sibetheros, Ioannis A.; Varouchakis, Emmanouil A.
2014-05-01
Damage curves are the most significant component of the flood loss estimation models. Their development is quite complex. Two types of damage curves exist, historical and synthetic curves. Historical curves are developed from historical loss data from actual flood events. However, due to the scarcity of historical data, synthetic damage curves can be alternatively developed. Synthetic curves rely on the analysis of expected damage under certain hypothetical flooding conditions. A synthetic approach was developed and presented in this work for the development of damage curves, which are subsequently used as the basic input to a flood loss estimation model. A questionnaire-based survey took place among practicing and research agronomists, in order to generate rural loss data based on the responders' loss estimates, for several flood condition scenarios. In addition, a similar questionnaire-based survey took place among building experts, i.e. civil engineers and architects, in order to generate loss data for the urban sector. By answering the questionnaire, the experts were in essence expressing their opinion on how damage to various crop types or building types is related to a range of values of flood inundation parameters, such as floodwater depth and velocity. However, the loss data compiled from the completed questionnaires were not sufficient for the construction of workable damage curves; to overcome this problem, a Weighted Monte Carlo method was implemented, in order to generate extra synthetic datasets with statistical properties identical to those of the questionnaire-based data. The data generated by the Weighted Monte Carlo method were processed via Logistic Regression techniques in order to develop accurate logistic damage curves for the rural and the urban sectors. A Python-based code was developed, which combines the Weighted Monte Carlo method and the Logistic Regression analysis into a single code (WMCLR Python code). 
Each WMCLR code execution provided a flow velocity-depth damage curve for a specific land use. More specifically, each WMCLR code execution for the agricultural sector generated a damage curve for a specific crop and for every month of the year, thus relating the damage to any crop with floodwater depth, flow velocity and the growth phase of the crop at the time of flooding. Respectively, each WMCLR code execution for the urban sector developed a damage curve for a specific building type, relating structural damage with floodwater depth and velocity. Furthermore, two techno-economic models were developed in Python programming language, in order to estimate monetary values of flood damages to the rural and the urban sector, respectively. A new Monte Carlo simulation was performed, consisting of multiple executions of the techno-economic code, which generated multiple damage cost estimates. Each execution used the proper WMCLR simulated damage curve. The uncertainty analysis of the damage estimates established the accuracy and reliability of the proposed methodology for the synthetic damage curves' development.
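The Logistic Regression stage of such a workflow might look like the following numpy sketch (not the WMCLR code itself; the synthetic "expert" data and all coefficients are illustrative assumptions): fit a logistic model of damage on floodwater depth and velocity, then read off a damage curve at a fixed velocity.

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Logistic regression by Newton-Raphson (IRLS)."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        W = p * (1 - p)
        H = X.T @ (X * W[:, None])              # observed-information Hessian
        b += np.linalg.solve(H, X.T @ (y - p))
    return b

rng = np.random.default_rng(6)
n = 1000
depth = rng.uniform(0, 3, n)        # floodwater depth (m)
vel = rng.uniform(0, 2, n)          # flow velocity (m/s)
# synthetic expert-style data: P(damage) rises with depth and velocity
eta = -3.0 + 1.8 * depth + 1.2 * vel
dmg = (rng.random(n) < 1 / (1 + np.exp(-eta))).astype(float)

X = np.column_stack([np.ones(n), depth, vel])
b = fit_logistic(X, dmg)

# damage curve: P(damage | depth) at a fixed velocity of 1 m/s
d = np.linspace(0, 3, 7)
curve = 1 / (1 + np.exp(-(b[0] + b[1] * d + b[2] * 1.0)))
print(curve.round(2))
```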
Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.
2014-12-01
This study aims at comparing the performance of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and Giampilieri streams, both stretching for a few kilometres from the Peloritan ridge (eastern Sicily, Italy) to the Ionian Sea. This area was struck on 1 October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly debris flows and debris avalanches, involving the weathered layer of a low- to high-grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; in addition, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-fold tests so as to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in predictor selection and the precision of the susceptibility estimates. All the obtained models for the two methods produced very high predictive performances, with general agreement between BLR and BRT on predictor importance. In particular, the research highlighted that BRT models reached a higher prediction performance than BLR models for RP-based modelling, whilst for the SP-based models the difference in predictive skill between the two methods dropped drastically, converging to an analogous excellent performance.
However, when looking at the precision of the probability estimates, BLR proved to produce more robust models in terms of selected predictors and coefficients, as well as of the dispersion of the estimated probabilities around the mean value for each mapped pixel. This difference in behaviour could be interpreted as the result of overfitting effects, which affect decision tree classification more heavily than logistic regression techniques.
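A generic version of the BLR-versus-BRT comparison can be sketched with scikit-learn on synthetic predictors (not the Sicilian landslide data; the terrain variable names and the interaction term are illustrative assumptions), scoring both models by AUROC:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n = 2000
# synthetic terrain predictors: e.g. slope, curvature, distance to stream
X = rng.normal(size=(n, 3))
eta = 1.5 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 0] * X[:, 2]   # interaction favours trees
y = (rng.random(n) < 1 / (1 + np.exp(-eta))).astype(int)

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

blr = LogisticRegression(max_iter=1000).fit(Xtr, ytr)            # BLR
brt = GradientBoostingClassifier(random_state=0).fit(Xtr, ytr)   # BRT

auc_blr = roc_auc_score(yte, blr.predict_proba(Xte)[:, 1])
auc_brt = roc_auc_score(yte, brt.predict_proba(Xte)[:, 1])
print(f"AUC  BLR: {auc_blr:.3f}   BRT: {auc_brt:.3f}")
```

The random train/test split mirrors the RP validation; the study's SP validation would instead train on one catchment and test on the other.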
Li, Y; Graubard, B I; Huang, P; Gastwirth, J L
2015-02-20
Determining the extent of a disparity, if any, between groups of people, for example, race or gender, is of interest in many fields, including public health for medical treatment and prevention of disease. An observed difference in the mean outcome between an advantaged group (AG) and a disadvantaged group (DG) can be due to differences in the distribution of relevant covariates. The Peters-Belson (PB) method fits a regression model with covariates to the AG to predict, for each DG member, their outcome measure as if they had been from the AG. The difference between the mean predicted and the mean observed outcomes of DG members is the (unexplained) disparity of interest. We focus on applying the PB method to estimate the disparity based on binary/multinomial/proportional odds logistic regression models using data collected from complex surveys with more than one DG. Estimators of the unexplained disparity, an analytic variance-covariance estimator based on the Taylor linearization method, and a Wald test for testing a joint null hypothesis of zero unexplained disparity between two or more minority groups and a majority group are provided. Simulation studies with data selected by simple random sampling and cluster sampling, as well as analyses of disparity in body mass index in the National Health and Nutrition Examination Survey 1999-2004, are conducted. Empirical results indicate that the Taylor linearization variance-covariance estimation is accurate and that the proposed Wald test maintains the nominal level. PMID:25382235
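The core of the Peters-Belson method is easy to sketch for a linear outcome (the paper's setting is logistic models under complex survey designs, which this toy ignores; the covariate and group sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(8)
# advantaged group (AG) and disadvantaged group (DG) with one covariate
n_ag, n_dg = 500, 300
x_ag = rng.normal(2.0, 1.0, n_ag)
x_dg = rng.normal(1.5, 1.0, n_dg)
y_ag = 1.0 + 0.8 * x_ag + rng.normal(0, 0.5, n_ag)
y_dg = 0.5 + 0.8 * x_dg + rng.normal(0, 0.5, n_dg)   # built-in intercept gap of 0.5

# step 1: fit the outcome model in the AG only
X_ag = np.column_stack([np.ones(n_ag), x_ag])
b, *_ = np.linalg.lstsq(X_ag, y_ag, rcond=None)

# step 2: predict DG outcomes as if DG members had been in the AG
X_dg = np.column_stack([np.ones(n_dg), x_dg])
y_dg_pred = X_dg @ b

# unexplained disparity: mean predicted minus mean observed in the DG
disparity = y_dg_pred.mean() - y_dg.mean()
print(f"unexplained disparity = {disparity:.2f} (true gap 0.5)")
```

The covariate difference between groups (means 2.0 vs 1.5) is absorbed by the prediction step, so the estimate isolates the part of the gap the covariates do not explain.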
Comparison of Bayesian and Classical Analysis of Weibull Regression Model: A Simulation Study
Directory of Open Access Journals (Sweden)
İmran KURT ÖMÜRLÜ
2011-01-01
Objective: The purpose of this study was to compare the performance of the classical Weibull Regression Model (WRM) and the Bayesian WRM under varying conditions using Monte Carlo simulations. Material and Methods: Data were generated and each of the classical WRM and the Bayesian WRM was run under varying informative priors and sample sizes using our simulation algorithm. In the simulation studies, sample sizes of n = 50, 100 and 250 were used, and a normal prior distribution was selected for the informative prior on b1. For each situation, 1000 simulations were performed. Results: The Bayesian WRM with a proper informative prior showed good performance with very little bias. The bias of the Bayesian WRM increased as the priors moved away from the true values, at all sample sizes. Furthermore, the Bayesian WRM obtained predictions with smaller standard errors than the classical WRM in both small and large samples, given proper priors. Conclusion: In this simulation study, the Bayesian WRM showed better performance than the classical method when subjective data analysis was performed by taking expert opinion and historical knowledge about the parameters into account. Consequently, the Bayesian WRM should be preferred when reliable informative priors exist; otherwise, the classical WRM should be preferred.
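The classical side of the comparison, Weibull regression by maximum likelihood, can be sketched with scipy (uncensored data for brevity; a Bayesian fit would add priors on the parameters and sample the posterior instead of maximizing; all names and data are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(9)
n = 500
x = rng.normal(size=n)
scale = np.exp(0.5 + 1.0 * x)                 # log-linear scale model, b = (0.5, 1.0)
t = scale * rng.weibull(1.5, n)               # Weibull survival times, shape k = 1.5

def negloglik(theta):
    log_k, b0, b1 = theta
    k = np.exp(log_k)                         # keeps the shape parameter positive
    lam = np.exp(b0 + b1 * x)                 # per-subject Weibull scale
    z = t / lam
    # log-density of Weibull(k, lam): log k - log lam + (k-1) log z - z^k
    return -np.sum(np.log(k) - np.log(lam) + (k - 1) * np.log(z) - z**k)

res = minimize(negloglik, x0=[0.0, 0.0, 0.0], method="BFGS")
k_hat, b0_hat, b1_hat = np.exp(res.x[0]), res.x[1], res.x[2]
print(f"shape {k_hat:.2f} (1.5), b0 {b0_hat:.2f} (0.5), b1 {b1_hat:.2f} (1.0)")
```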
A New Global Regression Analysis Method for the Prediction of Wind Tunnel Model Weight Corrections
Ulbrich, Norbert Manfred; Bridge, Thomas M.; Amaya, Max A.
2014-01-01
A new global regression analysis method is discussed that predicts wind tunnel model weight corrections for strain-gage balance loads during a wind tunnel test. The method determines corrections by combining "wind-on" model attitude measurements with least squares estimates of the model weight and center of gravity coordinates that are obtained from "wind-off" data points. The method treats the least squares fit of the model weight separately from the fit of the center of gravity coordinates. Therefore, it performs two fits of "wind-off" data points and uses the least squares estimator of the model weight as an input for the fit of the center of gravity coordinates. Explicit equations for the least squares estimators of the weight and center of gravity coordinates are derived that simplify the implementation of the method in the data system software of a wind tunnel. In addition, recommendations for sets of "wind-off" data points are made that take typical model support system constraints into account. Explicit equations for the confidence intervals on the model weight and center of gravity coordinates, and two different error analyses of the model weight prediction, are also discussed in the appendices of the paper.
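The two-step scheme, a least squares fit of the weight from "wind-off" force readings whose estimate then feeds a second fit of the center-of-gravity coordinates from the moment readings, can be sketched in two dimensions. This is a simplification of the paper's equations; the geometry, sign conventions, and all numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(10)
W_true, x_cg, z_cg = 250.0, 0.30, 0.05          # weight (N), CG offsets (m)
theta = np.deg2rad(np.linspace(-15, 15, 11))    # wind-off pitch sweep

# simulated wind-off balance readings (2-D body axes, small measurement noise)
normal = W_true * np.cos(theta) + rng.normal(0, 0.5, theta.size)
axial = -W_true * np.sin(theta) + rng.normal(0, 0.5, theta.size)
moment = W_true * (z_cg * np.sin(theta) - x_cg * np.cos(theta)) \
         + rng.normal(0, 0.2, theta.size)

# fit 1: least squares estimate of the model weight from the force readings
A1 = np.concatenate([np.cos(theta), -np.sin(theta)])[:, None]
b1 = np.concatenate([normal, axial])
W_hat = np.linalg.lstsq(A1, b1, rcond=None)[0][0]

# fit 2: CG coordinates, using W_hat as a fixed input (the method's two-step idea)
A2 = W_hat * np.column_stack([-np.cos(theta), np.sin(theta)])
x_hat, z_hat = np.linalg.lstsq(A2, moment, rcond=None)[0]
print(f"W {W_hat:.1f} (250.0), x_cg {x_hat:.3f} (0.300), z_cg {z_hat:.3f} (0.050)")
```

Holding W_hat fixed in the second fit keeps that problem linear in the CG coordinates, which is what makes the explicit estimator equations of the paper possible.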
Li, Shuo; Nyagilo, James O; Dave, Digant P; Gao, Jean X
2013-09-01
Quantitative analysis of Raman spectra using surface-enhanced Raman scattering (SERS) nanoparticles has shown potential and a promising trend of development in in vivo molecular imaging. Partial least squares regression (PLSR) methods have been reported as state-of-the-art methods. However, these approaches rely entirely on the intensities of Raman spectra and cannot avoid the influence of an unstable background. In this paper we design a new continuous wavelet transform based PLSR (CWT-PLSR) algorithm that uses mixing concentrations and the average CWT coefficients of Raman spectra to carry out PLSR. We elaborate and prove how the average CWT coefficients with a Mexican hat mother wavelet are robust representations of Raman peaks, and the method can reduce the influence of unstable baselines and random noise during the prediction process. The algorithm was tested on three Raman spectral data sets with three cross-validation methods in comparison with current leading methods, and the results show its robustness and effectiveness. PMID:23963247
DEFF Research Database (Denmark)
Sørensen, Jens Benn; Badsberg, Jens Henrik
1989-01-01
The prognostic factors for survival in advanced adenocarcinoma of the lung were investigated in a consecutive series of 259 patients treated with chemotherapy. Twenty-eight pretreatment variables were investigated by use of Cox's multivariate regression model, including histological subtypes and degree of differentiation, the new international staging system for lung cancer, and seven laboratory parameters. Staging of the patients included bone marrow examination but was otherwise nonextensive, without routine bone, liver, and brain scans. Factors predicting poor survival were low performance status, stage IV disease, no prior nonradical resection, liver metastases, high values of white blood cell count and lactate dehydrogenase, and low values of aspartate aminotransferase. The nonradical resection may not be a prognostic factor because of the resection itself but may rather serve as an indicator for patients having minimal disease spread. Liver metastases were of limited clinical value as a prognostic factor because they were detected in only seven cases in this patient population. A new Cox analysis ignoring the influence of this variable revealed no variables other than those occurring in the former Cox model to be of importance (performance status, stage, surgical resection, WBC, aspartate aminotransferase, and lactate dehydrogenase). This simplified model appears to be a feasible clinical tool, allowing for prognostic stratification of patients when first the inoperability of the patient is known.
AGE OF BIRDS AT OPTIMAL PRODUCTION OF EGGS: A POLYNOMIAL REGRESSION ANALYSIS
Directory of Open Access Journals (Sweden)
Chigozie Kelechi Acha
2014-08-01
This paper discusses the age of birds at optimal production of eggs. The objective is to determine the age at which birds are at their best in terms of egg production, which may be relevant in improving the output of poultry farmers. To achieve this objective, secondary data on egg production (in grams per day) by age of birds (in weeks), from 18 to 87 weeks between 2008 and 2010, were collected from the poultry farm of the National Root Crop Research Institute (NRCRI), Umudike. Polynomial regression was fitted to determine the appropriate model for the age pattern of egg production among birds aged 18 to 87 weeks. Results of the analysis show that a polynomial of order 3 describes the pattern in the egg production data well, though not fully adequately: the residuals from the fitted polynomial follow an autoregressive process of order 1. Using the fitted model, it was observed that the age of birds at maximum production of eggs is about 44.36 weeks, with a corresponding egg production of about 12.14 grams per day. The birds were also found to be at their best (in terms of egg production) when aged between 34.5 weeks and 54.5 weeks, with egg production of at least 11.07 grams per day. Hence, for optimal production of eggs, it is recommended that birds are not kept far beyond 54.5 weeks.
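The fitting step can be sketched with numpy's polynomial tools on stand-in data shaped like the reported results (the NRCRI series itself is not reproduced, and the AR(1) residual modelling is omitted):

```python
import numpy as np

rng = np.random.default_rng(12)
age = np.arange(18, 88)                                  # weeks 18..87
# stand-in series shaped like the reported results (peak ~44 weeks, ~12 g/day)
prod = 12.14 - 0.004 * (age - 44.36) ** 2 + rng.normal(0, 0.15, age.size)

coef = np.polyfit(age, prod, 3)           # cubic polynomial of production on age

# age at maximum production: evaluate the fit on a fine grid and take the argmax
grid = np.linspace(18, 87, 2000)
best_age = grid[np.argmax(np.polyval(coef, grid))]
print(f"estimated age at peak production: {best_age:.1f} weeks")
```

The grid search avoids the edge cases of solving the derivative analytically; with a cubic one could equally use np.roots on np.polyder(coef) and keep the root inside the observed range.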
Long, Nguyen Phuoc; Huy, Nguyen Tien; Trang, Nguyen Thi Huyen; Luan, Nguyen Thien; Anh, Nguyen Hoang; Nghi, Tran Diem; Van Hieu, Mai; Hirayama, Kenji; Karbwang, Juntra
2014-01-01
BACKGROUND: Ethics is one of the main pillars in the development of science. We performed a JoinPoint regression analysis to analyze the trends of ethical issue research over the past half century. The question is whether ethical issues are neglected despite their importance in modern research.
Analysis of neutral particle emission containing a fast ion tail by use of a nonlinear regression
International Nuclear Information System (INIS)
We present a program for the analysis of neutral particle emission detected by a single channel analyzer which may be easily modified to handle the data from a multichannel analyzer. In particular the program uses a nonlinear regression to fit the data and therefore correctly handles cases where the Maxwellian velocity distribution function is distorted by a high energy ion population.
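Such a fit is a standard nonlinear least squares problem. A hedged sketch with scipy's curve_fit, modelling the distorted distribution as a bulk Maxwellian plus a hotter tail population; the energy grid, temperatures, and the two-exponential functional form are illustrative assumptions, not the program's actual model.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_temp(E, a1, T1, a2, T2):
    """Bulk Maxwellian plus a hotter fast-ion tail (two slopes on a semi-log plot)."""
    return a1 * np.exp(-E / T1) + a2 * np.exp(-E / T2)

def log_two_temp(E, a1, T1, a2, T2):
    # fit in log space so every energy channel carries equal relative weight
    return np.log(two_temp(E, a1, T1, a2, T2))

rng = np.random.default_rng(13)
E = np.linspace(1, 40, 60)                    # energy channels (keV)
counts = two_temp(E, 1000.0, 2.0, 5.0, 10.0) * (1 + rng.normal(0, 0.03, E.size))

p0 = [500.0, 1.5, 1.0, 8.0]                   # rough starting guesses
popt, _ = curve_fit(log_two_temp, E, np.log(counts), p0=p0, maxfev=20000)
T_bulk, T_tail = sorted([popt[1], popt[3]])
print(f"T_bulk {T_bulk:.2f} keV (true 2.0), T_tail {T_tail:.2f} keV (true 10.0)")
```

A single-Maxwellian fit to the same data would be visibly biased at high energy, which is exactly the distortion the program is designed to handle.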
Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois
2013-01-01
This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…
Directory of Open Access Journals (Sweden)
Lançon Christophe
2006-07-01
Background: Data comparing duloxetine with existing antidepressant treatments are limited. A comparison of duloxetine with fluoxetine has been performed, but no comparison with venlafaxine, the other antidepressant in the same therapeutic class with a significant market share, has been undertaken. In the absence of relevant data to assess the place that duloxetine should occupy in the therapeutic arsenal, indirect comparisons are the most rigorous way to go. We conducted a systematic review of the efficacy of duloxetine, fluoxetine and venlafaxine versus placebo in the treatment of Major Depressive Disorder (MDD), and performed indirect comparisons through meta-regressions. Methods: The bibliography of the Agency for Health Care Policy and Research and the CENTRAL, Medline, and Embase databases were interrogated using advanced search strategies based on a combination of text and index terms. The search focused on randomized placebo-controlled clinical trials involving adult patients treated for acute phase Major Depressive Disorder. All outcomes were derived to account for varying placebo responses across studies. The primary outcome was treatment efficacy as measured by Hedges' g effect size. Secondary outcomes were response and dropout rates as measured by log odds ratios. Meta-regressions were run to indirectly compare the drugs. Sensitivity analyses, assessing the influence of individual studies over the results and the influence of patients' characteristics, were run. Results: 22 studies involving fluoxetine, 9 involving duloxetine and 8 involving venlafaxine were selected. Using indirect comparison methodology, estimated effect sizes for efficacy compared with duloxetine were 0.11 [-0.14;0.36] for fluoxetine and 0.22 [0.06;0.38] for venlafaxine. Response log odds ratios were -0.21 [-0.44;0.03] and 0.70 [0.26;1.14], respectively. Dropout log odds ratios were -0.02 [-0.33;0.29] and 0.21 [-0.13;0.55].
Sensitivity analyses showed that the results were consistent. Conclusion: Fluoxetine was not statistically different from duloxetine in either tolerability or efficacy. Venlafaxine was significantly superior to duloxetine in all analyses except dropout rate. In the absence of relevant data from head-to-head comparison trials, the results suggest that venlafaxine is superior to duloxetine and that duloxetine does not differ from fluoxetine.
Pradhan, Biswajeet
Recently, in 2006 and 2007, heavy monsoon rainfall triggered floods along Malaysia's east coast as well as in the southern state of Johor. The hardest hit areas were along the east coast of peninsular Malaysia in the states of Kelantan, Terengganu and Pahang; the city of Johor, in the south, was particularly hard hit. The floods cost nearly a billion ringgit in property and many lives. The extent of damage could have been reduced or minimized if an early warning system had been in place. This paper deals with flood susceptibility analysis using a logistic regression model. We evaluated the flood susceptibility and the effect of flood-related factors along the Kelantan river basin using Geographic Information System (GIS) and remote sensing data. Previously flooded areas were extracted from archived Radarsat images using image processing tools. Flood susceptibility mapping was conducted in the study area along the Kelantan River using Radarsat imagery and then enlarged to 1:25,000 scale. Topographical, hydrological and geological data and satellite images were collected, processed, and constructed into a spatial database using GIS and image processing. The factors chosen that influence flood occurrence were: topographic slope, topographic aspect, topographic curvature, DEM and distance from river drainage, all from the topographic database; flow direction and flow accumulation, extracted from the hydrological database; geology and distance from lineament, taken from the geologic database; land use from SPOT satellite images; soil texture from the soil database; and the vegetation index value from SPOT satellite images. Flood susceptible areas were analyzed and mapped using the probability-logistic regression model. Results indicate that flood prone area mapping can be performed at 1:25,000, which is comparable to some conventional flood hazard map scales.
The flood prone areas delineated on these maps correspond to areas that would be inundated by significant flooding (approximately the 100 year flood). The flood prone area boundaries were generally in agreement with flood hazard maps produced by the Department of Irrigation and Drainage although the latter are somewhat more detailed because of their larger scale.
De Mauro, Alessandro
2006-01-01
PURPOSE: Quality of life in multiple sclerosis has been often measured through the SF-36 questionnaire. In this study, validation of the SF-36 summary scores, its 'physical' component, and its 'mental' component was attempted by exploring the joint predictive power of disability (EDSS score), of anxiety and depression (HADS-A and -D scores, respectively), and of disease duration, progression type, age, gender and marital status. METHOD: The sample consisted of 75 patients suffering from multi...
Directory of Open Access Journals (Sweden)
Cecchini Diego M
2009-11-01
Background: The central nervous system is considered a sanctuary site for HIV-1 replication. Variables associated with HIV cerebrospinal fluid (CSF) viral load in the context of opportunistic CNS infections are poorly understood. Our objective was to evaluate the relation between: (1) CSF HIV-1 viral load and CSF cytological and biochemical characteristics (leukocyte count, protein concentration, cryptococcal antigen titer); (2) CSF HIV-1 viral load and HIV-1 plasma viral load; and (3) CSF leukocyte count and the peripheral blood CD4+ T lymphocyte count. Methods: Our approach was to use a prospective collection and analysis of pre-treatment, paired CSF and plasma samples from antiretroviral-naive HIV-positive patients with cryptococcal meningitis, assisted at the Francisco J. Muñiz Hospital, Buenos Aires, Argentina (period: 2004 to 2006). We measured HIV CSF and plasma levels by polymerase chain reaction using the Cobas Amplicor HIV-1 Monitor Test version 1.5 (Roche). Data were processed with Statistix 7.0 software (linear regression analysis). Results: Samples from 34 patients were analyzed. CSF leukocyte count showed a statistically significant correlation with CSF HIV-1 viral load (r = 0.4, 95% CI = 0.13-0.63, p = 0.01). No correlation was found with the plasma viral load, CSF protein concentration or cryptococcal antigen titer. A positive correlation was found between peripheral blood CD4+ T lymphocyte count and the CSF leukocyte count (r = 0.44, 95% CI = 0.125-0.674, p = 0.0123). Conclusion: Our study suggests that CSF leukocyte count influences CSF HIV-1 viral load in patients with meningitis caused by Cryptococcus neoformans.
International Nuclear Information System (INIS)
There are many opinions on the cause of hypothyroidism after 131I treatment of hyperthyroidism, but few systematic analyses have been reported. Unconditional logistic regression addresses this problem successfully and offers high scientific value and confidence in risk factor analysis. Data from 748 follow-up patients were analysed by unconditional logistic regression. The results showed that half-life and 131I dose were the main determinants of the incidence of hypothyroidism. The degree of confidence is 92.4%.
Analysis of a multiple dispatch algorithm
Holmberg, Johannes
2004-01-01
The development of the new programming language Scream, within the project Software Renaissance, led to the need for a good multiple dispatch algorithm. A multiple dispatch algorithm called Compressed n-dimensional table with row sharing (CNT-RS) was developed from the algorithm Compressed n-dimensional table (CNT). The purpose of CNT-RS was to create a more efficient algorithm. This report is the result of the work to analyse the CNT-RS algorithm. In this report the domain of multiple dispat...
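The idea behind table-driven multiple dispatch can be sketched with a flat lookup table keyed by the tuple of runtime argument types. This is only the uncompressed baseline that algorithms like CNT and CNT-RS compress, not the algorithms from the report; all names below are illustrative.

```python
# Uncompressed dispatch table: maps a tuple of argument types to an
# implementation. CNT-style algorithms compress this n-dimensional
# table; this sketch shows only the baseline lookup semantics.

_dispatch_table = {}

def defmethod(*types):
    """Register an implementation for an exact tuple of argument types."""
    def register(fn):
        _dispatch_table[types] = fn
        return fn
    return register

def dispatch(*args):
    """Select the implementation by the runtime types of all arguments."""
    key = tuple(type(a) for a in args)
    try:
        return _dispatch_table[key](*args)
    except KeyError:
        raise TypeError(f"no method for {key}") from None

@defmethod(int, int)
def _(a, b):
    return a + b

@defmethod(str, int)
def _(a, b):
    return a * b
```

Real multiple dispatch additionally resolves subtype relationships and ambiguities; the compressed-table algorithms exist because this exact-match table grows exponentially in the number of argument positions.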
Directory of Open Access Journals (Sweden)
Jakšić Uroš G.
2014-01-01
Full Text Available This paper deals with the analysis of correlation and regression between the parameters of particle ionizing radiation and the stability characteristics of the irradiated monocrystalline silicon film. Based on the presented theoretical model of correlation and linear regression between two random variables, numeric and real experiments were performed. In the numeric experiment, the effect of alpha radiation on a thin layer of monocrystalline silicon was simulated by observing the number of vacancies along the film depth resulting from a single incident alpha particle. In the real experiment, a thin silicon film was irradiated by alpha particles from a radioactive Am-241 alpha emitter. The observed measures of the radiation effect on the Si film were specific resistance and the concentration of free charge carriers. The results showed close agreement between the numeric and real experiments. Correlation of the observed values was verified by linear regression functions. [Projekat Ministarstva nauke Republike Srbije, br. 171007]
Directory of Open Access Journals (Sweden)
Berhan Asres
2012-05-01
Full Text Available Abstract Background Reports on the sexual behavior of people on antiretroviral therapy (ART) are inconsistent. We selected 14 articles that compared the sexual behavior of people with and without ART for this analysis. Methods We included both cross-sectional studies that compared different ART-naïve and ART-experienced participants and longitudinal studies examining the behavior of the same individuals pre- and post-ART initiation. Meta-analyses were performed both stratified by type of study and combined. Outcome variables assessed for association with ART experience were any sexual activity, unprotected sex and having multiple sexual partners. Random-effect models were applied to determine the overall odds ratios. Sub-group analyses and meta-regression analyses were performed to examine sources of heterogeneity among the studies. Sensitivity analysis was also conducted to evaluate the stability of the overall odds ratio in the presence of outliers. Results The meta-analysis failed to show a statistically significant association of any sexual activity with ART experience. It did, however, show an overall statistically significant reduction with ART experience in any unprotected sex, having multiple sexual partners, and unprotected sex with partners of HIV-negative or unknown HIV status. Meta-regression showed no interaction between duration of ART use or recall period of sexual behavior and the sexual activity variables. However, there was an association between the percentage of married or cohabiting participants included in a study and reductions in the practice of unprotected sex with ART. Conclusion In general, this meta-analysis demonstrated a significant reduction in risky sexual behavior among people on ART in sub-Saharan Africa. Future studies should investigate the reproducibility and continuity of the observed positive behavioural changes as the duration of ART extends to a decade or more.
Rebechi, S R; Vélez, M A; Vaira, S; Perotti, M C
2016-02-01
The aims of the present study were to test the accuracy of the fatty acid ratios established by the Argentinean Legislation to detect adulterations of milk fat with animal fats and to propose a regression model suitable to evaluate these adulterations. For this purpose, 70 milk fat, 10 tallow and 7 lard fat samples were collected and analyzed by gas chromatography. The data were used to simulate arithmetically adulterated milk fat samples at 0%, 2%, 5%, 10% and 15%, for both animal fats. The fatty acid ratios failed to distinguish adulterated milk fats containing less than 15% of tallow or lard. For each adulterant, Multiple Linear Regression (MLR) was applied, and a model was chosen and validated. For that, calibration and validation matrices were constructed employing genuine and adulterated milk fat samples. The models were able to detect adulterations of milk fat at levels greater than 10% for tallow and 5% for lard. PMID:26304443
Nowrouzi, Behdin; Souza, Renan P; Zai, Clement; Shinkai, Takahiro; Monda, Marcellino; Lieberman, Jeffrey; Volvaka, Jan; Meltzer, Herbert Y; Kennedy, James L; De Luca, Vincenzo
2013-03-01
Antipsychotic-induced weight gain is a complex phenomenon with a relevant underlying genetic basis. Polymorphisms of serotonin receptors and related proteins were genotyped in 139 schizophrenia patients and incorporated as covariates in a mixture regression model of weight gain in combination with clinical covariates. The HTR1D rs6300 polymorphism showed marginal significance, conferring risk for obesity (heavy weight gain group) under an additive model. After correcting for multiple testing, all the genetic predictors were non-significant; however, the clinical predictors were associated with the risk of heavy weight gain. These findings suggest a role of ethnicity and olanzapine in increasing the risk for obesity in the heavy weight gain group, with haloperidol protecting against heavy weight gain. The mixture regression model appears to be a useful strategy to highlight different weight gain subgroups that are affected differently by clinical and genetic predictors. PMID:22840963
Analysis of dental caries using generalized linear and count regression models
Directory of Open Access Journals (Sweden)
Javali M. Phil
2013-11-01
Full Text Available Generalized linear models (GLM) are generalizations of linear regression models, which allow fitting regression models to response data in all the sciences, especially the medical and dental sciences, that follow a general exponential family. These are a flexible and widely used class of models. Count data are frequently characterized by overdispersion and excess zeros. Zero-inflated count models provide a parsimonious yet powerful way to model this type of situation. Such models assume that the data are a mixture of two separate data generation processes: one generates only zeros, and the other is either a Poisson or a negative binomial data-generating process. Zero-inflated count regression models such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) regression models have been used to handle dental caries count data with many zeros. We present an evaluation framework for assessing the suitability of applying the GLM, Poisson, NB, ZIP and ZINB models to a dental caries data set where the count data may exhibit evidence of many zeros and overdispersion. Estimation of the model parameters using the method of maximum likelihood is provided. Based on the Vuong test statistic and the goodness-of-fit measure for the dental caries data, the NB and ZINB regression models perform better than other count regression models.
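The two-part mixture the abstract describes can be written down directly. A minimal sketch of the zero-inflated Poisson probability mass function, where pi is the probability of the structural-zero process and lam the Poisson mean (parameter names are illustrative, not from the study):

```python
import math

def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing count k."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: with probability pi a structural zero is
    generated; otherwise the count comes from a Poisson(lam) process."""
    if k == 0:
        return pi + (1 - pi) * poisson_pmf(0, lam)
    return (1 - pi) * poisson_pmf(k, lam)

# Zero inflation raises the probability of a zero count relative to a
# plain Poisson with the same mean component.
print(zip_pmf(0, 0.3, 2.0), poisson_pmf(0, 2.0))
```

Fitting pi and lam by maximum likelihood, and comparing ZIP against NB or ZINB via the Vuong statistic as in the abstract, builds on exactly this mixture density.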
Regression Models: A Brief Introduction
Grégoire, G.
2014-01-01
This brief introduction, without pretension, aims to help non-specialists in statistics find their way among regression models. What are the basic notions of a regression? A regression model can be linear, generalized linear, or nonlinear. Statisticians also speak of parametric, semiparametric and nonparametric regression models. We hope that what lies behind these terms will be made clearer after reading the chapters devoted to simple linear regression, multiple linear regression, logistic regression, survival data and regression, kernel methods... But it can be interesting to have a global view, before reading these chapters, of a rather wide range of regression methods, and to get a first sense of what type of question a particular regression model answers and what can be expected from such a model when modelling the data at hand.
Handling Missing Values with Regularized Iterative Multiple Correspondence Analysis
Josse, Julie; Chavent, Marie; Liquet, Benoit; Husson, François
2012-01-01
A common approach to deal with missing values in multivariate exploratory data analysis consists in minimizing the loss function over all non-missing elements. This can be achieved by EM-type algorithms where an iterative imputation of the missing values is performed during the estimation of the axes and components. This paper proposes such an algorithm, named iterative multiple correspondence analysis, to handle missing values in multiple correspondence analysis (MCA). This algorithm, based ...
Aguilar, I; Tsuruta, S; Misztal, I
2010-06-01
Data included 90,242,799 test day records from first, second and third parities of 5,402,484 Holstein cows and 9,326,754 animals in the pedigree. Additionally, daily temperature humidity indexes (THI) from 202 weather stations were available. The fixed effects included herd test day, age at calving, milking frequency and days in milk classes (DIM). Random effects were additive genetic, permanent environment and herd-year and were fit as random regressions. Covariates included linear splines with four knots at 5, 50, 200 and 305 DIM and a function of THI. Mixed model equations were solved using an iteration on data program with a preconditioned conjugate gradient algorithm. Preconditioners used were diagonal (D), block diagonal due to traits (BT) and block diagonal due to traits and correlated effects (BTCORR). One run included BT with a 'diagonalized' model in which the random effects were reparameterized for diagonal (co)variance matrices among traits (BTDIAG). Memory requirements were 8.7 Gb for D, 10.4 Gb for BT and BTDIAG, and 24.3 Gb for BTCORR. Computing times (rounds) were 14 days (952) for D, 10.7 days (706) for BT, 7.7 days (494) for BTDIAG and 4.6 days (289) for BTCORR. The convergence pattern was strongly influenced by the choice of fixed effects. When sufficient memory is available, the option BTCORR is the fastest and simplest to implement; the next efficient method, BTDIAG, requires additional steps for diagonalization and back-diagonalization. PMID:20536641
Middlebrook, A. M.; Murphy, D. M.; Lee, S.; Lee, S.; Lee, S.; Thomson, D. S.; Thomson, D. S.
2001-12-01
During the Atlanta Supersites project in August 1999, the PALMS (Particle Analysis by Laser Mass Spectrometry) instrument collected over 500,000 individual particle spectra. The Atlanta data were originally analyzed by examining combinations of peaks and relative peak areas [Lee et al., 2001a,b], and a wide range of particle components such as sulfate, nitrate, mineral species, metals, organic species, and elemental carbon were detected. To further study the dataset, a classification program using regression tree analysis was developed and applied. Spectral data were compressed into a lower resolution spectrum (every 0.25 mass units) of the raw data and a list of peak areas (every mass unit). Each spectrum started as a normalized classification vector by itself. If the dot product of two classification vectors was within a certain threshold, they were combined into a new classification. The new classification vector was a normalized running average of the classifications being combined. In subsequent steps, the threshold for combining classifications was continuously lowered until a reasonable number of classifications remained. After the final iteration, each spectrum was compared individually with the entire set of classification vectors. Classifications were also combined manually. The classification results from the Atlanta data are generally consistent with those determined by peak identification. However, the classification program identified specific patterns in the mass spectra that were not found by peak identification and generated new particle types. Furthermore, rare particle types that may affect human health were studied in more detail. A description of the classification program as well as the results for the Atlanta data will be presented. Lee, S.-H., D. M. Murphy, D. S. Thomson, and A. M. 
Middlebrook, Chemical components of single particles measured with particle analysis by laser mass spectrometry (PALMS) during the Atlanta Supersites Project: Focus on organic/sulfate, lead, soot, and mineral particles, J. Geophys. Res., in press, 2001a. Lee, S.-H., D. M. Murphy, D. S. Thomson, and A. M. Middlebrook, Nitrate and oxidized organic ions in single particle mass spectra during the 1999 Atlanta Supersites Project, submitted to J. Geophys. Res., 2001b.
Trends in Multiple Criteria Decision Analysis
Ehrgott, Matthias; Greco, Salvatore
2010-01-01
Multiple Criteria Decision Making (MCDM) is the study of methods and procedures by which concerns about multiple conflicting criteria can be formally incorporated into the management planning process. A key area of research in OR/MS, MCDM is now being applied in many new areas, including GIS systems, AI, and group decision making. This volume is in effect the third in a series of Springer books by these editors (all in the ISOR series), and it brings all the latest developments in MCDM into focus. Looking at developments in the applications, methodologies and foundations of MCDM, it presents r
International Nuclear Information System (INIS)
Most of the applications and the new generation of instruments in X-ray fluorescence spectroscopy require that the results of measurements be known as soon as possible. The present work aims to introduce an alternative solution allowing the direct conversion of spectra obtained by the XRF spectrometer to element concentrations. This solution is the use of PLS regression, which consists of establishing a regression relation [Y]=[X][β]+[E] between concentrations and spectra, where [E] is the residual of the regression. The principle is to determine the coefficient [β] by means of calibration with standard samples. This coefficient is afterwards used to predict element concentrations in unknown samples by a simple matrix multiplication. In order to use this method, we have developed the X-PLS v1.0 software, which implements the PLS regression. This software allows all required tasks to be performed for the calibration and also for the prediction of concentrations in unknown samples. Besides, it gives the possibility to save the calibration results in a file so that they can be used later in adequate conditions. The results obtained during the application of this method in total-reflection X-ray spectrometry confirm that PLS regression can be used as a quantification method in XRF spectroscopy. The graphical representation of the regression coefficient illustrates the existence of a regression relation between spectra and concentrations. We could also prove that satisfactory accuracy can be obtained for the prediction in good conditions. One of these conditions is that the concentration of the element of interest must vary as much as possible in as many samples as possible. This condition was satisfied during the first set of measurements, in which only the elements Cu and Zn had major proportions within 10 samples. Relative errors obtained for prediction were 5.24% for Cu and 7.16% for Zn. For the second set, we used 5 elements (Cu, Zn, As, Se, Ni) in 10 samples.
Results obtained for the predictions had the following relative errors: Cu: 10.85%, Zn: 6.93%, As: 9.42%, Se: 8.25% and Ni: 14.90%.
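The calibrate-then-predict workflow behind [Y]=[X][β]+[E] can be sketched on toy data. Ordinary least squares stands in for PLS here (true PLS first projects the spectra onto a few latent components, which matters when spectral channels outnumber standards); all data below are synthetic, not the Cu/Zn measurements from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def calibrate(X, Y):
    """Estimate beta in Y = X @ beta + E from standard samples
    (least-squares stand-in for the PLS calibration step)."""
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return beta

def predict(X_new, beta):
    """Concentrations of unknown samples: a single matrix product."""
    return X_new @ beta

# Toy 'spectra': 20 standards, 8 channels, 2 analytes (think Cu, Zn).
true_beta = rng.normal(size=(8, 2))
X = rng.normal(size=(20, 8))
Y = X @ true_beta + 0.01 * rng.normal(size=(20, 2))

beta = calibrate(X, Y)
Y_hat = predict(X, beta)
```

The saved calibration described in the abstract corresponds to persisting beta so that later predictions reduce to the matrix multiplication in `predict`.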
Hauk, O; Pulvermüller, F; Ford, M; Marslen-Wilson, W D; Davis, M H
2009-01-01
We applied multiple linear regression analysis to event-related electrophysiological responses to words and pseudowords in a visual lexical decision task, yielding event-related regression coefficients (ERRCs) instead of the traditional event-related potential (ERP) measure. Our main goal was to disentangle the earliest ERP effects of the length of letter strings ("word length") and orthographic neighbourhood size (Coltheart's "N"). With respect to N, existing evidence is still ambiguous with respect to whether effects of N reflect early access to lexico-semantic information, or whether they occur at later decision or verification stages. In the present study, we found distinct neurophysiological manifestations of both N and word length around 100ms after word onset. Importantly, the effect of N distinguished between words and pseudowords, while the effect of word length did not. Minimum norm source estimation revealed the most dominant sources for word length in bilateral posterior brain areas for both words and pseudowords. For N, these sources were more left-lateralised and consistent with perisylvian brain areas, with activation peaks in temporal areas being more anterior for words compared to pseudowords. Our results support evidence for an effect of N at early and elementary stages of word recognition. We discuss the implications of these results for the time line of word recognition processes, and emphasise the value of ERRCs in combination with source analysis in psycholinguistic and cognitive brain research. PMID:18565639
Scientific Electronic Library Online (English)
Daniel F., Campos-Aranda.
2013-08-01
Full Text Available (original abstract in Spanish; English version follows.) When long annual records of runoff, rainfall or floods from a region with similar hydrological response are used to extend a short series through the statistical technique of multiple linear regression (MLR), it is likely that those records, by reason of their intrinsic similarity, will lead to a problem of multicollinearity. This problem should be detected and quantified to know whether it is acceptable, moderate, strong or serious, and alternative solutions sought to fitting the MLR by least squares of the residuals. In this study multicollinearity was diagnosed through variance inflation factors and condition indices based on the eigenvalues. In addition, the biased fitting of the MLR, known as Ridge regression, is presented as an alternative method. A numerical application in the Tempoal river system, of Hydrological Region No. 26 (Pánuco, México), is described to complete the short record of annual runoff volumes of the Platón Sánchez hydrometric station, based on the other four gauging stations that have long records. It is concluded that the principal advantages of Ridge regression are the ease of handling transference with six or more regressors and the simplicity of its implementation and development by means of the Ridge trace.
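The diagnostic and the remedy discussed in this abstract can be sketched on synthetic data: variance inflation factors computed from the inverse correlation matrix flag multicollinearity, and ridge regression shrinks the unstable least-squares solution. This is a toy illustration, not the Tempoal computation.

```python
import numpy as np

def vif(X):
    """Variance inflation factors: diagonal of the inverse correlation
    matrix of the regressors. Values above ~10 signal strong
    multicollinearity."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

def ridge(X, y, lam):
    """Ridge solution beta = (X'X + lam*I)^-1 X'y; lam = 0 gives
    ordinary least squares. Columns are assumed comparable in scale."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)      # nearly collinear regressors
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=200)

v = vif(X)                     # both factors are large here
b_ols = ridge(X, y, 0.0)
b_ridge = ridge(X, y, 10.0)    # shrunk, more stable coefficients
```

Plotting `ridge(X, y, lam)` over a grid of lam values reproduces the Ridge trace mentioned above, the usual device for choosing the shrinkage parameter.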
Application of Robust Regression and Bootstrap in Productivity Analysis of the GERD Variable in EU27
Directory of Open Access Journals (Sweden)
Dagmar Blatná
2014-06-01
Full Text Available GERD is one of the Europe 2020 headline indicators tracked within the Europe 2020 strategy; the headline target is for GERD to reach 3% of GDP within the EU by 2020. Eurostat defines GERD as total gross domestic expenditure on research and experimental development as a percentage of GDP. GERD depends on numerous factors of the general economic background, namely employment, innovation and research, and science and technology. The values of these indicators vary among the European countries, and consequently the occurrence of outliers can be anticipated in corresponding analyses. In such a case, the classical least squares approach can be highly unreliable, and robust regression methods represent an acceptable and useful tool. The aim of the present paper is to demonstrate the advantages of robust regression and the applicability of the bootstrap approach in regression based on both classical and robust methods.
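The combination of a robust estimator with the bootstrap can be sketched on toy data containing one gross outlier. Theil-Sen (the median of pairwise slopes) stands in here for the robust estimators typically used in such analyses, and the data are synthetic, not the EU27 indicators.

```python
import numpy as np

rng = np.random.default_rng(2)

def theil_sen_slope(x, y):
    """Median of all pairwise slopes: resistant to outlying points."""
    n = len(x)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(n) for j in range(i + 1, n)
              if x[j] != x[i]]
    return float(np.median(slopes))

def bootstrap_ci(x, y, stat, n_boot=500, alpha=0.05):
    """Percentile bootstrap confidence interval for a slope statistic."""
    n = len(x)
    reps = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        reps.append(stat(x[idx], y[idx]))
    lo, hi = np.quantile(reps, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

x = np.arange(1.0, 31.0)
y = 2.0 * x + rng.normal(scale=0.5, size=30)
y[0] += 40.0                      # a single gross outlier

slope = theil_sen_slope(x, y)     # stays close to the true slope of 2
lo, hi = bootstrap_ci(x, y, theil_sen_slope)
```

The same resampling loop works for any estimator, which is why the paper can bootstrap both the classical and the robust fits.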
JT-60 configuration parameters for feedback control determined by regression analysis
International Nuclear Information System (INIS)
The stepwise regression procedure was applied to obtain measurement formulas for equilibrium parameters used in the feedback control of JT-60. This procedure automatically selects variables necessary for the measurements, and selects a set of variables which are not likely to be picked up by physical considerations. Regression equations with stable and small multicollinearity were obtained and it was experimentally confirmed that the measurement formulas obtained through this procedure were accurate enough to be applicable to the feedback control of plasma configurations in JT-60. (author)
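A stepwise procedure of the kind described above can be sketched as greedy forward selection on the residual sum of squares. A fixed relative-improvement threshold stands in for the F-to-enter test of the real procedure, and the data are synthetic, not JT-60 measurements.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def forward_select(X, y, tol=0.05):
    """Greedily add the regressor that most reduces RSS; stop when the
    relative improvement drops below tol (a stand-in for an F-test)."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    current = float(y @ y)          # RSS of the empty model
    while remaining:
        scores = {j: rss(X[:, selected + [j]], y) for j in remaining}
        best = min(scores, key=scores.get)
        if (current - scores[best]) / current < tol:
            break
        selected.append(best)
        remaining.remove(best)
        current = scores[best]
    return selected

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=100)

print(forward_select(X, y))   # picks the informative columns first
```

The abstract's observation that such a procedure may select variables no physical argument would suggest follows directly from the criterion: any column that reduces RSS enough enters the model, whatever its physical meaning.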
Approaches to data analysis of multiple-choice questions
Lin Ding; Robert Beichner
2009-01-01
This paper introduces five commonly used approaches to analyzing multiple-choice test data. They are classical test theory, factor analysis, cluster analysis, item response theory, and model analysis. Brief descriptions of the goals and algorithms of these approaches are provided, together with examples illustrating their applications in physics education research. We minimize mathematics, instead placing emphasis on data interpretation using these approaches.
Sparse Group Penalized Integrative Analysis of Multiple Cancer Prognosis Datasets
LIU, Jin; Huang, Jian; Xie, Yang; Ma, Shuangge
2013-01-01
In cancer research, high-throughput profiling studies have been extensively conducted, searching for markers associated with prognosis. Because of the “large d, small n” characteristic, results generated from the analysis of a single dataset can be unsatisfactory. Recent studies have shown that integrative analysis, which simultaneously analyzes multiple datasets, can be more effective than single-dataset analysis and classic meta-analysis. In most of existing integrative analysis, the homoge...
Genetic analysis of multiple sclerosis in Shetland.
Roberts, D F; Roberts, M.J.; Poskanzer, D C
1983-01-01
In a family study of all patients with multiple sclerosis in Shetland the number of inbred patients, although high for Britain and higher than in Orkney, is not higher than the number among controls, and the inbreeding coefficients suggest that there is no recessive involvement of rare genes in the aetiology. The kinship coefficients show close interweaving of ancestries of patients and controls and eliminate from the aetiology any involvement of recently introduced single genes dominant or c...
Directory of Open Access Journals (Sweden)
J. B. Alam, M. Jobair Bin Alam, M. M. Rahman, A. K. Dikshit, S. K. Khan
Full Text Available The study reports the level of traffic-induced noise pollution in Sylhet City. For this purpose noise levels were measured at thirty-seven major locations of the city from 7 am to 11 pm during working days. It was observed that at all locations the noise level remains far above the acceptable limit at all times. The noise levels on the main road near residential, hospital and educational areas were above the recommended level (65 dBA). The predictive equations were found to be 60-70% correlated with the measured noise levels. The study suggests that vulnerable institutions such as schools and hospitals should be located about 60 m away from the roadside unless special arrangements to alleviate sound are used.
Lee, S T; T. T. Yu; W. F. Peng; C. L. Wang
2010-01-01
Seismic-induced landslide hazards are studied using seismic shaking intensity based on the topographic amplification effect. The estimation of the topographic effect includes the theoretical topographic amplification factors and the corresponding amplified ground motion. Digital elevation models (DEM) with a 5-m grid space are used. The logistic regression model and the geographic information system (GIS) are used to perform the seismic landslide hazard analysis. The 99 Peaks area, located 3 ...
Cape John; Whittington Craig; Buszewicz Marta; Wallace Paul; Underwood Lisa
2010-01-01
Abstract Background Psychological therapies provided in primary care are usually briefer than in secondary care. There has been no recent comprehensive review comparing their effectiveness for common mental health problems. We aimed to compare the effectiveness of different types of brief psychological therapy administered within primary care across and between anxiety, depressive and mixed disorders. Methods Meta-analysis and meta-regression of randomized controlled trials of brief psycholog...
Czech Academy of Sciences Publication Activity Database
Farokhi, S.; Shamsuddin, S. M.; Flusser, Jan; Sheikh, U. U.; Khansari, M.; Jafari-Khouzani, K.
2013-01-01
Roč. 22, č. 1 (2013), s. 1-11. ISSN 1017-9909 R&D Projects: GA ČR GAP103/11/1552 Keywords: face recognition * infrared imaging * image moments Subject RIV: JD - Computer Applications, Robotics Impact factor: 0.850, year: 2013 http://library.utia.cas.cz/separaty/2013/ZOI/flusser-rotation and noise invariant near-infrared face recognition by means of zernike moments and spectral regression discriminant analysis.pdf
The analysis of nonstationary time series using regression, correlation and cointegration
DEFF Research Database (Denmark)
Johansen, Søren
2012-01-01
There are simple well-known conditions for the validity of regression and correlation as statistical tools. We analyse by examples the effect of nonstationarity on inference using these methods and compare them to model-based inference using the cointegrated vector autoregressive model. Finally, we analyse some monthly US data on interest rates as an illustration of the methods.
Regression analysis of growth responses to water depth in three wetland plant species
Sorrell, Brian K.; Tanner, Chris C.; Brix, Hans
2012-01-01
Variability in plant flooding tolerance is often associated with differential growth responses to increasing water depth. This study highlights how morphological responses conferring flooding tolerance differ, using non-linear and quantile regression to quantitatively compare flooding-related growth responses of three species.
Sensitivity analysis of M-estimates of nonlinear regression model: Influence of data subsets.
Czech Academy of Sciences Publication Activity Database
Víšek, Jan Ámos
2002-01-01
Roč. 54, č. 2 (2002), s. 261-290. ISSN 0020-3157 Grant ostatní: GA UK(CZ) 255/2000/A EK/FSV Institutional research plan: CEZ:AV0Z1075907 Keywords: robust regression * M-estimate * sensitivity study on deletion of a group of observations Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.386, year: 2002
A Latent Class Regression Analysis of Men's Conformity to Masculine Norms and Psychological Distress
Wong, Y. Joel; Owen, Jesse; Shea, Munyi
2012-01-01
How are specific dimensions of masculinity related to psychological distress in specific groups of men? To address this question, the authors used latent class regression to assess the optimal number of latent classes that explained differential relationships between conformity to masculine norms and psychological distress in a racially diverse…
A Gauss-Newton-Based Broyden’s Class Algorithm for Parameters of Regression Analysis
Xiangrong Li; Xupei Zhao
2011-01-01
In this paper, a Gauss-Newton-based Broyden’s class method for parameters of regression problems is presented. The global convergence of this given method will be established under suitable conditions. Numerical results show that the proposed method is interesting.
Geddes, J.; Freemantle, N.; Harrison, P; Bebbington, P.
2000-01-01
OBJECTIVE: To develop an evidence base for recommendations on the use of atypical antipsychotics for patients with schizophrenia. DESIGN: Systematic overview and meta-regression analyses of randomised controlled trials, as a basis for formal development of guidelines. SUBJECTS: 12 649 patients in 52 randomised trials comparing atypical antipsychotics (amisulpride, clozapine, olanzapine, quetiapine, risperidone, and sertindole) with conventional antipsychotics (usually haloperidol or chlorprom...
An application of nonparametric Cox regression model in reliability analysis: A case study.
Czech Academy of Sciences Publication Activity Database
Volf, Petr
2004-01-01
Roč. 40, č. 5 (2004), s. 639-648. ISSN 0023-5954 R&D Projects: GA ČR GA201/02/0049; GA ČR GA402/01/0539 Institutional research plan: CEZ:AV0Z1075907 Keywords: hazard rate * nonparametric regression * Cox model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004
Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis
Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John
2012-01-01
Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…
On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis
Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas
2011-01-01
The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm
Ulbrich, Norbert Manfred
2013-01-01
A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.
Chen, Ling; Sun, Jianguo
2010-01-01
This paper discusses regression analysis of interval-censored failure time data, which occur in many fields including demographical, epidemiological, financial, medical, and sociological studies. For the problem, we focus on the situation where the survival time of interest can be described by the additive hazards model and a multiple imputation approach is presented for inference. A major advantage of the approach is its simplicity and it can be easily implemented by using the existing softw...
Parsimonious Tensor Response Regression
Li, Lexin; Zhang, Xin
2015-01-01
Aiming at abundant scientific and engineering data with not only high dimensionality but also complex structure, we study the regression problem with a multidimensional array (tensor) response and a vector predictor. Applications include, among others, comparing tensor images across groups after adjusting for additional covariates, which is of central interest in neuroimaging analysis. We propose parsimonious tensor response regression adopting a generalized sparsity princip...
Ordered probit regression analysis of the effect of brand name on beer acceptance by consumers
Scientific Electronic Library Online (English)
Suzana Maria, Della Lucia; Valéria Paula Rodrigues, Minim; Carlos Henrique Osório, Silva; Luis Antonio, Minim; Paula De Aguiar, Cipriano.
2013-09-01
Full Text Available Ordered probit regression was used to analyze data of sensory acceptance tests designed to study the effect of brand name on the acceptability of beer samples. Eight different brands of Pilsen beer were evaluated by 101 consumers in two sessions of acceptance tests: blind evaluation and brand information test. Ordered probit regression, although a relatively sophisticated technique compared to others used to analyze sensory data, was chosen to enable the observation of consumers' behavior using graphical interpretations of estimated probabilities plotted against hedonic scales. It can be concluded that brands B, C, and D had a positive effect on the sensory acceptance of the product, whereas brands A, F, G, and H had a negative influence on consumers' evaluation of the samples. On the other hand, brand E had little influence on consumers' assessment.
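An ordered probit model maps a latent acceptance score through ordered cutpoints into category probabilities, which is what makes the graphical probability interpretations above possible. A minimal sketch in Python (the cutpoints and the brand effect below are hypothetical illustrations, not the study's estimates):

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(xb, cutpoints):
    """Category probabilities P(Y = k) for an ordered probit model.

    xb        : linear predictor x'beta for one observation
    cutpoints : increasing thresholds c_1 < ... < c_{K-1}
    Returns a list of K probabilities that sums to 1.
    """
    cdf = [phi(c - xb) for c in cutpoints]
    probs = [cdf[0]]
    probs += [cdf[k] - cdf[k - 1] for k in range(1, len(cdf))]
    probs.append(1.0 - cdf[-1])
    return probs

# Hypothetical values: a brand effect shifting the latent acceptance score
probs = ordered_probit_probs(xb=0.8, cutpoints=[-1.0, 0.0, 1.0, 2.0])
print(probs, sum(probs))
```

Increasing `xb` (a more favorable brand effect) shifts probability mass toward the higher hedonic categories, which is exactly the pattern plotted in the study.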
A restoration method of medical X-ray images based on an extended regression analysis method
International Nuclear Information System (INIS)
In this paper, a new restoration method for X-ray images with optical blurs and quantum mottles is proposed by considering the physical formation process of X-ray images. More specifically, the optical blurs are first deterministically cleared away by using the transfer characteristic of the laser scanning, the characteristic of the radiographic screen-film system, the logarithmic transformation of the optical density and a digital inverse filter based on the point spread function. Next, the remaining quantum mottles are statistically taken away by using a regression model of newly generalized type with less information loss based on Bayes theorem and a simple experimental regression curve. Finally, in order to confirm the effectiveness of the proposed method, it is concretely applied to an actual medical diagnosis. (author)
Directory of Open Access Journals (Sweden)
Das Sumonkanti
2011-11-01
Full Text Available Abstract Background: The study attempts to develop an ordinal logistic regression (OLR) model to identify the determinants of child malnutrition instead of developing a traditional binary logistic regression (BLR) model, using data from the Bangladesh Demographic and Health Survey 2004. Methods: Based on the weight-for-age anthropometric index (Z-score), child nutrition status is categorized into three groups: severely undernourished (…). Results: All the models determine that age of child, birth interval, mother's education, maternal nutrition, household wealth status, child feeding index, and incidence of fever, ARI & diarrhoea were significant predictors of child malnutrition; however, results of the PPOM were more precise than those of the other models. Conclusion: These findings clearly justify that OLR models (POM and PPOM) are appropriate for finding predictors of malnutrition instead of BLR models.
Lee, Soo Min; Lee, Jae-Won
2014-11-01
In this study, the optimal conditions for biomass torrefaction were determined by comparing the gain in energy content to the weight loss of biomass in the final products. Torrefaction experiments were performed at temperatures ranging from 220 to 280°C using 20-80 min reaction times. Polynomial regression models ranging from the 1st to the 3rd order were used to determine a relationship between the severity factor (SF) and calorific value or weight loss. The intersection of the two regression models for calorific value and weight loss was determined and assumed to be the optimized SF. The optimized SFs for each biomass ranged from 6.056 to 6.372. Optimized torrefaction conditions were determined at various reaction times of 15, 30, and 60 min. The average optimized temperature was 248.55°C for the studied biomass when torrefaction was performed for 60 min. PMID:25266685
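The optimization step described above, fitting separate polynomial models for calorific value and weight loss and taking their intersection as the optimized SF, can be sketched as follows. The data are hypothetical, and normalizing both responses to a common [0, 1] scale before intersecting is an assumption, since the record does not state how the two scales were reconciled:

```python
import numpy as np

# Hypothetical torrefaction responses vs. severity factor (SF)
sf = np.array([5.8, 6.0, 6.2, 6.4, 6.6, 6.8])
calorific = np.array([19.0, 19.6, 20.4, 21.0, 21.5, 21.8])    # MJ/kg, rises with SF
weight_loss = np.array([30.0, 26.5, 23.5, 21.0, 19.0, 17.5])  # %, falls with SF

def norm(y):
    """Scale a response to [0, 1] so the two curves are comparable."""
    return (y - y.min()) / (y.max() - y.min())

# 2nd-order polynomial regression models for each normalized response
p_gain = np.polyfit(sf, norm(calorific), 2)
p_loss = np.polyfit(sf, norm(weight_loss), 2)

# The curves' intersection within the experimental range is taken as the
# optimized SF: a real root of the difference polynomial
roots = np.roots(np.polysub(p_gain, p_loss))
opt = [r.real for r in roots if abs(r.imag) < 1e-9 and sf.min() <= r.real <= sf.max()]
print("optimized SF:", opt)
```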
Can the Eureqa symbolic regression program, computer algebra and numerical analysis help each other?
Stoutemyer, David R
2012-01-01
The Eureqa symbolic regression program has recently received extensive press praise. A representative quote is "There are very clever 'thinking machines' in existence today, such as Watson, the IBM computer that conquered Jeopardy! last year. But next to Eureqa, Watson is merely a glorified search engine." The program was designed to work with noisy experimental data. However, if the data is generated from an expression for which there exists more concise equivalent expressions, sometimes some of the Eureqa results are one or more of those more concise equivalents. If not, perhaps one or more of the returned Eureqa results might be a sufficiently accurate approximation that is more concise than the given expression. Moreover, when there is no known closed form expression, the data points can be generated by numerical methods, enabling Eureqa to find expressions that concisely fit those data points with sufficient accuracy. In contrast to typical regression software, the user does not have to explicitly or imp...
Parallel Approach for Time Series Analysis with General Regression Neural Networks
J.C. Cuevas-Tello; R.A. González-Grimaldo; O. Rodríguez-González; H.G. Pérez-González; O. VitalOchoa
2012-01-01
The accuracy of time delay estimation given pairs of irregularly sampled time series is of great relevance in astrophysics. However, computational time is also important because large data sets must be studied. Besides introducing a new approach for time delay estimation, this paper presents a parallel approach to obtain a fast algorithm for time delay estimation. The neural network architecture that we use is the General Regression Neural Network (GRNN). For the parallel approach, we...
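A GRNN is essentially Nadaraya-Watson kernel regression: each prediction is a Gaussian-weighted average of the training targets, which is why it handles irregular sampling naturally. A minimal sketch with synthetic, irregularly sampled data (the smoothing parameter `sigma` is illustrative):

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, sigma=0.5):
    """General Regression Neural Network (Nadaraya-Watson kernel regression).

    The prediction is a Gaussian-kernel weighted average of training targets;
    sigma is the spread (smoothing) parameter of the pattern units.
    """
    x_query = np.atleast_1d(x_query)
    d2 = (x_query[:, None] - x_train[None, :]) ** 2
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ y_train) / w.sum(axis=1)

# Hypothetical irregularly sampled series
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 40))
y = np.sin(x) + 0.1 * rng.normal(size=40)
pred = grnn_predict(x, y, np.array([2.0, 5.0]), sigma=0.4)
print(pred)
```

Because each query point's weighted sum is independent, the loop over query points parallelizes trivially, which is the property the parallel approach above exploits.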
Sutan Emir Hidayat; Muhamad Abduh
2012-01-01
The 2007/2008 global financial crisis had a significant impact on the performance of the banking industry worldwide. The objective of this study is to assess the impact of the global financial crisis on the financial performance of the Islamic banking industry in Bahrain. Moreover, it also utilizes bank-specific factors as predictors of Islamic bank performance in Bahrain. Panel regression is used to analyze the data. The result shows that LTA, LEQ, and LOHE are significant bank-specific factors...
Javali Shivalingappa; Pandit Parameshwar
2010-01-01
Aim: The study aimed to determine the factors associated with periodontal disease (different levels of severity) by using different regression models for ordinal data. Design: A cross-sectional design was employed using clinical examination and a "questionnaire with interview" method. Materials and Methods: The study was conducted from June 2008 to October 2008 in Dharwad, Karnataka, India. It involved a systematic random sample of 1760 individuals aged 18-40 years. The periodon...
Directory of Open Access Journals (Sweden)
T. Nataraja Moorthy
2013-10-01
Full Text Available Objective: To derive regression equations to estimate stature from footprint lengths of Malay ethnics in peninsular Malaysia. Material and methods: The study was carried out in Universiti Sains Malaysia involving 400 adult Malay subjects (200 males and 200 females) who are staff and students. Informed consent and Human Ethical Approval were obtained. The heights of the individuals were recorded with a portable height measuring device (SECA 208) and footprints were collected using an inkless shoe print kit (Carolina, USA). The data obtained were analyzed with SPSS computer software to derive regression equations to estimate stature from footprint lengths of Malay ethnics in peninsular Malaysia. Results: Investigation revealed that all footprint lengths exhibit a statistically significant positive correlation with stature (p < 0.001). The mean stature is found to be significantly higher in males than in females. All the footprint lengths in males are found to be larger than in females in both left and right feet. The mean second toe–heel footprint lengths, both left and right, are found to be the longest in both males and females. Correlation coefficient (R) values are found to be higher in the pooled sample (0.74–0.78) when compared with males (0.43–0.51) and females (0.53–0.61). Conclusion: The results of this investigation provide population-specific regression equations for stature estimation from footprints (complete and partial) of Malays in peninsular Malaysia. The regression equations derived for the pooled sample can be used to estimate stature when the sex of the footprint's owner is unknown, as in real crime scenarios.
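Regression equations of the kind derived above are ordinary least-squares fits of stature on footprint length. A minimal sketch with hypothetical paired measurements (not the study's Malay data):

```python
# Hypothetical paired measurements: footprint length (cm) and stature (cm)
foot = [24.1, 25.0, 25.6, 26.2, 27.0, 27.8]
stature = [158.0, 162.5, 165.0, 168.0, 171.5, 175.0]

n = len(foot)
mx = sum(foot) / n
my = sum(stature) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(foot, stature))
sxx = sum((x - mx) ** 2 for x in foot)
syy = sum((y - my) ** 2 for y in stature)

b = sxy / sxx                   # slope: cm of stature per cm of footprint
a = my - b * mx                 # intercept
r = sxy / (sxx * syy) ** 0.5    # correlation coefficient (R)

print(f"stature = {a:.1f} + {b:.2f} * footprint, R = {r:.3f}")
```

Sex-specific equations are obtained by fitting the same formulas to the male and female subsamples separately; the pooled fit serves when sex is unknown.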
Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis
Johnson, William L.; Johnson, Annabel M.; Johnson, Jared
2012-01-01
Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict who would pass-or-fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2700 students, Tyler, Texas. Date of the study was the 2011-2012…
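A pass/fail logistic regression of the sort described here can be fit by gradient descent on the negative log-likelihood. A minimal sketch with hypothetical benchmark scores and outcomes (not the STAAR data):

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, iters=2000):
    """Logistic regression by batch gradient descent (no regularization).

    X : (n, p) design matrix (include a column of ones for the intercept)
    y : (n,) labels in {0, 1}
    """
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)  # gradient of the mean log-loss
    return w

# Hypothetical predictor: a benchmark score; outcome: pass (1) / fail (0)
score = np.array([35, 42, 50, 55, 61, 68, 74, 80, 85, 92], dtype=float)
passed = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1], dtype=float)

# Standardize the predictor and add an intercept column
X = np.column_stack([np.ones_like(score), (score - score.mean()) / score.std()])
w = fit_logistic(X, passed)
prob = 1.0 / (1.0 + np.exp(-X @ w))
print(np.round(prob, 2))
```

The fitted probabilities can then be thresholded (e.g. at 0.5) to predict who would pass or fail.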
A Structural Analysis of the Correlated Random Coefficient Wage Regression Model
Belzil, Christian; Hansen, Jörgen
2007-01-01
We estimate a finite mixture dynamic programming model of schooling decisions in which the log wage regression function is set in a random coefficient framework. The model allows for absolute and comparative advantages in the labor market and assumes that the population is composed of 8 unknown types. Overall, labor market skills (as opposed to taste for schooling) appear to be the prime factor explaining schooling attainments. The estimates indicate a higher cross-sectional variance in the r...
Determinants of Academic Attainment in the US: a Quantile regression analysis of test scores
Haile, Getinet; Nguyen, Ngoc Anh
2007-01-01
We investigate the determinants of high school students’ academic attainment in maths, reading and science; focusing particularly on possible effects that ethnicity and family background may have on attainment. Using data from the NELS2000 and employing quantile regression techniques, we find two important results. First, the gaps in maths, reading and science test scores among ethnic groups vary across the conditional quantiles of the measured test scores. Specifically, Blacks...
M. Thirunavukkarasu; G. Kathiravan
2006-01-01
A study was undertaken to estimate the probability of conception in artificially bred bovines based on various animal and environmental factors, using data collected from 2,283 bovines (1,942 cattle and 241 buffaloes) inseminated at 30 artificial insemination centres in six districts of the North-eastern agroclimatic zone of Tamil Nadu State (India). The logistic regression technique was employed to estimate the probability of a particular breedable bovine female not being able to conceive of...
Mobley Lee R; Kuo Tzy-Mey; Urato Matthew; Subramanian Sujha
2010-01-01
Abstract Background Colorectal cancer (CRC) is the second leading cause of cancer death in the United States, and endoscopic screening can both detect and prevent cancer, but utilization is suboptimal and varies across geographic regions. We use multilevel regression to examine the various predictors of individuals' decisions to utilize endoscopic CRC screening. Study subjects are a 100% population cohort of Medicare beneficiaries identified in 2001 and followed through 2005. The outcome vari...
Use of data envelopment analysis and regression for establishing manpower requirements in a bank
L P Fatti
2014-01-01
We describe an approach towards forecasting the manpower requirements in each of the branches of a bank, based on regression models fitted to the sets of efficient branches. DEA is employed to identify the efficient branches within a category, using the numbers of employees in the different grades at each branch as input variables, and the average volumes of different types of work performed by them during a month as output variables. Forecasts of future volumes of work are obtained by fittin...
Wage Inequality and Returns to Education in Turkey: A Quantile Regression Analysis
Tansel, Aysit; Bircan, Fatma
2011-01-01
This paper investigates male wage inequality and its evolution over the 1994-2002 period in Turkey by estimating Mincerian wage equations using OLS and quantile regression techniques. Male wage inequality is high in Turkey. While it declined at the lower end of the wage distribution, it increased at the top end. Education contributed to higher wage inequality through both the within and between dimensions. The within-groups inequality increased and between-groups inequalit...
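Quantile regression fits a line at a chosen quantile tau by minimizing the pinball (check) loss instead of squared error. A minimal sketch using subgradient descent on synthetic heteroscedastic data, illustrating why upper- and lower-quantile slopes diverge when dispersion grows with the covariate (as in the within-groups inequality finding above):

```python
import numpy as np

def quantile_fit(x, y, tau, lr=0.01, iters=5000):
    """Fit y = a + b*x at quantile tau by subgradient descent on pinball loss."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        r = y - (a + b * x)
        # subgradient of the pinball loss w.r.t. the prediction
        g = np.where(r > 0, -tau, 1.0 - tau)
        a -= lr * g.mean()
        b -= lr * (g * x).mean()
    return a, b

# Synthetic wages: dispersion grows with the covariate (heteroscedastic)
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 300)
y = 2.0 + 1.5 * x + rng.normal(0.0, 1.0 + 0.3 * x)

a9, b9 = quantile_fit(x, y, tau=0.9)
a1, b1 = quantile_fit(x, y, tau=0.1)
print(b1, b9)  # lower- vs upper-quantile slope
```

Because the noise scale grows with x, the 0.9-quantile slope exceeds the 0.1-quantile slope, the same pattern as returns to education differing across the conditional wage distribution.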
Schlechtingen, Meik; Ferreira Santos, Ilmar
2011-07-01
This paper presents the research results of a comparison of three different model based approaches for wind turbine fault detection in online SCADA data, by applying developed models to five real measured faults and anomalies. The regression based model as the simplest approach to build a normal behavior model is compared to two artificial neural network based approaches, which are a full signal reconstruction and an autoregressive normal behavior model. Based on a real time series containing two generator bearing damages the capabilities of identifying the incipient fault prior to the actual failure are investigated. The period after the first bearing damage is used to develop the three normal behavior models. The developed or trained models are used to investigate how the second damage manifests in the prediction error. Furthermore the full signal reconstruction and the autoregressive approach are applied to further real time series containing gearbox bearing damages and stator temperature anomalies. The comparison revealed all three models being capable of detecting incipient faults. However, they differ in the effort required for model development and the remaining operational time after first indication of damage. The general nonlinear neural network approaches outperform the regression model. The remaining seasonality in the regression model prediction error makes it difficult to detect abnormality and leads to increased alarm levels and thus a shorter remaining operational period. For the bearing damages and the stator anomalies under investigation the full signal reconstruction neural network gave the best fault visibility and thus led to the highest confidence level.
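The regression-based normal behavior model above reduces to: fit a regression on a healthy operating period, then flag later observations whose prediction error exceeds an alarm threshold. A minimal sketch with simulated SCADA-like data (the variable names, coefficients, and 3-sigma limit are illustrative assumptions):

```python
import numpy as np

# Simulated SCADA-like data: bearing temperature explained by power output
# and ambient temperature during normal operation
rng = np.random.default_rng(2)
n = 500
power = rng.uniform(0.2, 1.0, n)
ambient = rng.uniform(-5, 25, n)
temp = 30 + 25 * power + 0.8 * ambient + rng.normal(0, 0.5, n)

# Fit the normal behavior regression model on the healthy period
X = np.column_stack([np.ones(n), power, ambient])
coef, *_ = np.linalg.lstsq(X, temp, rcond=None)
resid = temp - X @ coef
threshold = 3.0 * resid.std()  # simple 3-sigma alarm limit

# A later observation with an abnormal temperature rise (incipient fault)
x_new = np.array([1.0, 0.7, 10.0])
t_new = 30 + 25 * 0.7 + 0.8 * 10.0 + 4.0  # 4 degrees above normal behavior
alarm = abs(t_new - x_new @ coef) > threshold
print("alarm:", alarm)
```

The neural network approaches in the paper replace the linear predictor with a nonlinear one, which removes the residual seasonality that inflates the regression model's alarm threshold.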
Directory of Open Access Journals (Sweden)
Marco Aurélio Carino Bouzada
2009-09-01
Full Text Available This work describes, through a case study, the problem of forecasting call demand for a specific product at the call center of a large Brazilian company in the sector, Contax, and how it was approached using Multiple Regression with dummy variables. After highlighting and justifying the relevance of the subject, the study presents a brief literature review of demand forecasting methods and their application in call centers. The case is described by first presenting the studied company and then the way it deals with the problem of forecasting call demand for product 103, services related to fixed-line telephony. A Multiple Regression model with dummy variables is then developed to serve as the basis of the proposed demand forecasting process. This model uses available information capable of influencing demand, such as the day of the week, the occurrence of holidays, and the proximity of the date to critical events such as the arrival of the bill at the customer's residence and its due date; it yielded accuracy gains of about 3 percentage points for the studied period when compared with the tool previously in use.
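The dummy-variable regression described in this record can be sketched as follows: weekday effects enter as 0/1 indicator columns next to a trend term, and the model is fit by least squares. The data below are simulated, not Contax's:

```python
import numpy as np

# Simulated daily call volumes with a weekday pattern plus a linear trend
rng = np.random.default_rng(3)
days = 8 * 7
weekday = np.arange(days) % 7  # day 0 taken as Monday (illustrative)
base = np.array([900, 850, 800, 780, 820, 400, 300])  # Mon..Sun levels
calls = base[weekday] + 2.0 * np.arange(days) + rng.normal(0, 20, days)

# Design matrix: intercept, linear trend, and 6 weekday dummies
# (Sunday is the omitted reference category)
dummies = (weekday[:, None] == np.arange(6)[None, :]).astype(float)
X = np.column_stack([np.ones(days), np.arange(days), dummies])
coef, *_ = np.linalg.lstsq(X, calls, rcond=None)

forecast = X @ coef
mape = np.mean(np.abs(calls - forecast) / calls)
print("in-sample MAPE:", round(mape, 3))
```

Holiday and billing-event indicators would be added as further dummy columns in exactly the same way.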
Directory of Open Access Journals (Sweden)
H. M. Worden
2013-03-01
Full Text Available A current obstacle to the Observation System Simulation Experiments (OSSEs) used to quantify the potential performance of future atmospheric composition remote sensing systems is a computationally efficient method to define the scene-dependent vertical sensitivity of measurements as expressed by the retrieval averaging kernels (AKs). We present a method for the efficient prediction of AKs for multispectral retrievals of carbon monoxide (CO) and ozone (O3), based on actual retrievals from MOPITT on EOS-Terra and from TES and OMI on EOS-Aura, respectively. This employs a multiple regression approach for deriving scene-dependent AKs using predictors based on state parameters such as the thermal contrast between the surface and lower atmospheric layers, trace gas volume mixing ratios (VMR), solar zenith angle, water vapor amount, etc. We first compute the singular value decomposition (SVD) for individual cloud-free AKs and retain the first three ranked singular vectors in order to fit the most significant, orthogonal components of the AK in the subsequent multiple regression on a training set of retrieval cases. The resulting fit coefficients are applied to the predictors from a different test set of retrieval cases to reconstruct predicted AKs, which can then be evaluated against the true test set retrieval AKs. By comparing the VMR profile adjustment resulting from the use of the predicted vs. true AKs, we quantify the CO and O3 VMR profile errors associated with the use of the predicted AKs compared to the true AKs that might be obtained from a computationally expensive full retrieval calculation as part of an OSSE. Similarly, we estimate the errors in CO and O3 VMRs from using a single regional average AK to represent all retrievals, which has been a common approximation in chemical OSSEs performed to date. For both CO and O3 in the lower troposphere, we find a significant reduction in error when using the predicted AKs as compared to a single average AK. This study examined data from the continental United States (CONUS) for 2006, but the approach could be applied to other regions and times.
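The SVD-plus-regression scheme above can be sketched on synthetic data: stack the AKs, keep the first three right singular vectors as a basis, regress the three component scores on the state predictors, and reconstruct a predicted AK for a new scene. This is a simplification of the paper's per-AK SVD, and all dimensions, predictors, and noise levels here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)
n_train, n_levels, n_pred = 200, 12, 4

# Hypothetical predictors (thermal contrast, VMR, SZA, water vapor) and
# synthetic averaging kernels driven linearly by them plus noise
P = rng.normal(size=(n_train, n_pred))
basis_true = np.linalg.qr(rng.normal(size=(n_levels, 3)))[0]
W = rng.normal(size=(n_pred, 3))
AK = (P @ W) @ basis_true.T + 0.01 * rng.normal(size=(n_train, n_levels))

# 1) SVD of the stacked AKs; keep the first three singular vectors
U, s, Vt = np.linalg.svd(AK, full_matrices=False)
basis = Vt[:3].T                 # (n_levels, 3) orthogonal basis

# 2) Regress the three component scores on the state predictors
scores = AK @ basis              # (n_train, 3)
X = np.column_stack([np.ones(n_train), P])
C, *_ = np.linalg.lstsq(X, scores, rcond=None)

# 3) Predict an AK for a new scene from its predictors alone
p_new = rng.normal(size=n_pred)
ak_pred = (np.concatenate([[1.0], p_new]) @ C) @ basis.T
ak_true = (p_new @ W) @ basis_true.T
err = np.abs(ak_pred - ak_true).max()
print("max abs error:", err)
```

Because the synthetic AKs are linear in the predictors, the regression recovers them up to the noise level; real AKs are only approximately linear, which is why the paper evaluates the predicted AKs against held-out retrievals.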
Directory of Open Access Journals (Sweden)
H. M. Worden
2013-07-01
Full Text Available A current obstacle to the observation system simulation experiments (OSSEs) used to quantify the potential performance of future atmospheric composition remote sensing systems is a computationally efficient method to define the scene-dependent vertical sensitivity of measurements as expressed by the retrieval averaging kernels (AKs). We present a method for the efficient prediction of AKs for multispectral retrievals of carbon monoxide (CO) and ozone (O3), based on actual retrievals from MOPITT (Measurements Of Pollution In The Troposphere) on the Earth Observing System (EOS)-Terra satellite and from TES (Tropospheric Emission Spectrometer) and OMI (Ozone Monitoring Instrument) on EOS-Aura, respectively. This employs a multiple regression approach for deriving scene-dependent AKs using predictors based on state parameters such as the thermal contrast between the surface and lower atmospheric layers, trace gas volume mixing ratios (VMRs), solar zenith angle, water vapor amount, etc. We first compute the singular value decomposition (SVD) for individual cloud-free AKs and retain the first three ranked singular vectors in order to fit the most significant orthogonal components of the AK in the subsequent multiple regression on a training set of retrieval cases. The resulting fit coefficients are applied to the predictors from a different test set of retrieval cases to reconstruct predicted AKs, which can then be evaluated against the true retrieval AKs from the test set. By comparing the VMR profile adjustment resulting from the use of the predicted vs. true AKs, we quantify the CO and O3 VMR profile errors associated with the use of the predicted AKs compared to the true AKs that might be obtained from a computationally expensive full retrieval calculation as part of an OSSE. Similarly, we estimate the errors in CO and O3 VMRs from using a single regional average AK to represent all retrievals, which has been a common approximation in chemical OSSEs performed to date. For both CO and O3 in the lower troposphere, we find a significant reduction in error when using the predicted AKs as compared to a single average AK. This study examined data from the continental United States (CONUS) for 2006, but the approach could be applied to other regions and times.
Parappagoudar, Mahesh B.; Pratihar, Dilip K.; Datta, Gouranga L.
2008-08-01
A cement-bonded moulding sand system takes a fairly long time to attain the required strength. Hence, the moulds prepared with cement as a bonding material will have to wait a long time for the metal to be poured. In this work, an accelerator was used to accelerate the process of developing the bonding strength. Regression analysis was carried out on the experimental data collected as per statistical design of experiments (DOE) to establish input-output relationships of the process. The experiments were conducted to measure compression strength and hardness (output parameters) by varying the input variables, namely amount of cement, amount of accelerator, water in the form of cement-to-water ratio, and testing time. A two-level full-factorial design was used for linear regression model, whereas a three-level central composite design (CCD) had been utilized to develop non-linear regression model. Surface plots and main effects plots were used to study the effects of amount of cement, amount of accelerator, water and testing time on compression strength, and mould hardness. It was observed from both the linear as well as non-linear models that amount of cement, accelerator, and testing time have some positive contributions, whereas cement-to-water ratio has negative contribution to both the above responses. Compression strength was found to have linear relationship with the amount of cement and accelerator, and non-linear relationship with the remaining process parameters. Mould hardness was seen to vary linearly with testing time and non-linearly with the other parameters. Analysis of variance (ANOVA) was performed to test statistical adequacy of the models. Twenty random test cases were considered to test and compare their performances. Non-linear regression models were found to perform better than the linear models for both the responses. An attempt was also made to express compression strength of the moulding sand system as a function of mould hardness.
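For a two-level full factorial such as the one used for the linear model above, the coded design matrix is orthogonal, so main effects and interactions can be estimated directly by least squares. A minimal sketch with three coded factors and simulated responses (the factor names echo the study, but the effect sizes are illustrative, not the study's estimates):

```python
import numpy as np

# Coded two-level full factorial in 3 factors: cement (A), accelerator (B),
# testing time (C); each factor at -1 (low) and +1 (high)
levels = np.array([[a, b, c] for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)], float)
A, B, C = levels.T

# Simulated compression strength: positive A, B, C effects, small AB interaction
rng = np.random.default_rng(5)
y = 50 + 8 * A + 5 * B + 3 * C + 1.5 * A * B + rng.normal(0, 0.5, 8)

# Main-effects + two-factor-interaction regression model
X = np.column_stack([np.ones(8), A, B, C, A * B, A * C, B * C])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(coef, 2))
```

Fitting quadratic (non-linear) terms, as the study does with its central composite design, requires a third level per factor, since a two-level design cannot distinguish curvature from the intercept.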
Seismic analysis of equipment supported at multiple levels
International Nuclear Information System (INIS)
The present paper reviews the available methods for seismic analysis of multiply supported equipment. It presents a comparative study of seismic responses of typical equipment using different methods of analysis, including a discussion of results. Based on the study, conclusions are drawn and recommendations are made on the use of appropriate analytical techniques to solve multiple-support problems. 9 refs
Multiple scattering problems in heavy ion elastic recoil detection analysis
International Nuclear Information System (INIS)
A number of groups use Heavy Ion Elastic Recoil Detection Analysis (HIERDA) to study materials science problems. Nevertheless, there is no standard methodology for the analysis of HIERDA spectra. To overcome this deficiency we have been establishing codes for 2-dimensional data analysis. A major problem involves the effects of multiple and plural scattering, which are very significant, even for quite thin (~100 nm) layers of the very heavy elements. To examine the effects of multiple scattering we have made comparisons between the small-angle model of Sigmund et al. and TRIM calculations. (authors)
Kahane, Leo H
2007-01-01
Using a friendly, nontechnical approach, the Second Edition of Regression Basics introduces readers to the fundamentals of regression. Accessible to anyone with an introductory statistics background, this book builds from a simple two-variable model to a model of greater complexity. Author Leo H. Kahane weaves four engaging examples throughout the text to illustrate not only the techniques of regression but also how this empirical tool can be applied in creative ways to consider a broad array of topics. New to the Second Edition Offers greater coverage of simple panel-data estimation:
Alston, D. W.
1981-01-01
The objective of this research was to design a statistical model that could perform an error analysis of curve fits of wind tunnel test data using analysis of variance and regression analysis techniques. Four related subproblems were defined, and by solving each of these a solution to the general research problem was obtained. The capabilities of the evolved true statistical model are considered. The least squares fit is used to determine the nature of the force, moment, and pressure data. The order of the curve fit is increased in order to delete the quadratic effect in the residuals. The analysis of variance is used to determine the magnitude and effect of the error factor associated with the experimental data.
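The order-raising step can be illustrated on hypothetical data: a linear fit leaves a clear quadratic effect in the residuals, which vanishes once the fit order matches the data.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 40)
# Hypothetical force measurements with a genuine quadratic component
y = 2 + 3*x + 4*x**2 + rng.normal(0, 0.1, x.size)

# Increase the curve-fit order until the quadratic effect disappears
# from the residuals, mirroring the procedure described above
quads = []
for order in (1, 2):
    coef = np.polyfit(x, y, order)
    resid = y - np.polyval(coef, x)
    quads.append(np.polyfit(x, resid, 2)[0])   # quadratic term left in residuals
    print(order, round(quads[-1], 2))
```

After the degree-2 fit the residuals are orthogonal to the quadratic basis, so the remaining quadratic coefficient is essentially zero.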
Multiple block plane shear slope failure: Part I. Theoretical analysis
Energy Technology Data Exchange (ETDEWEB)
Barron, K.; Stimpson, B.; Kozar, K.
1985-03-01
Multiple block plane shear failures have been occurring in some coal strip mines in Alberta. These failures cause a rapid deterioration of the bench from which the dragline operates, resulting in reduced productivity and increased safety hazards. The first of a three part report, this volume develops a theoretical analysis of multiple block shear failure for the case of two slopes with an intervening bench and a horizontal water table. 1 ref.
Directory of Open Access Journals (Sweden)
L. Monika Moskal
2012-08-01
Full Text Available The characterization of soil attributes using hyperspectral sensors has revealed patterns in soil spectra that are known to respond to mineral composition, organic matter, soil moisture and particle size distribution. Soil samples from different soil horizons of replicated soil series from sites located within Washington and Oregon were analyzed with the FieldSpec Spectroradiometer to measure their spectral signatures across the electromagnetic range of 400 to 1,000 nm. Similarity rankings of individual soil samples reveal differences between replicate series as well as samples within the same replicate series. Using classification and regression tree statistical methods, regression trees were fitted to each spectral response using concentrations of nitrogen, carbon, carbonate and organic matter as the response variables. Statistics resulting from fitted trees were: nitrogen R^{2} 0.91 (p < 0.01) at 403, 470, 687, and 846 nm spectral band widths, carbonate R^{2} 0.95 (p < 0.01) at 531 and 898 nm band widths, total carbon R^{2} 0.93 (p < 0.01) at 400, 409, 441 and 907 nm band widths, and organic matter R^{2} 0.98 (p < 0.01) at 300, 400, 441, 832 and 907 nm band widths. Use of the 400 to 1,000 nm electromagnetic range with regression trees provided a powerful, rapid and inexpensive method for assessing nitrogen, carbon, carbonate and organic matter for upper soil horizons in a nondestructive manner.
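A sketch of the regression-tree idea on synthetic spectra. The reflectance values are hypothetical; only the 470 nm and 846 nm bands carry signal, loosely mimicking two of the reported nitrogen bands.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
wavelengths = np.arange(400, 1001, 1)          # 400-1000 nm range, 1 nm steps
n_samples = 200

# Hypothetical reflectance spectra whose 470 nm and 846 nm bands
# carry the nitrogen signal (assumed, for illustration only)
spectra = rng.uniform(0, 1, (n_samples, wavelengths.size))
nitrogen = 2.0*spectra[:, wavelengths == 470].ravel() \
         + 1.5*spectra[:, wavelengths == 846].ravel() \
         + rng.normal(0, 0.05, n_samples)

tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(spectra, nitrogen)
r2 = tree.score(spectra, nitrogen)             # in-sample R^2
top_band = wavelengths[np.argmax(tree.feature_importances_)]
print(round(r2, 2), top_band)
```

The tree's feature importances point back at the informative bands, which is how band-specific statistics like those above can be read off a fitted tree.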
Felipe, Vivian P S; Silva, Martinho A; Valente, Bruno D; Rosa, Guilherme J M
2015-04-01
The prediction of total egg production (TEP) potential in poultry is an important task to aid optimized management decisions in commercial enterprises. The objective of the present study was to compare different modeling approaches for prediction of TEP in meat type quails (Coturnix coturnix coturnix) using phenotypes such as weight, weight gain, egg production and egg quality measurements. Phenotypic data on 30 traits from two lines (L1, n=180; and L2, n=205) of quail were modeled to predict TEP. Prediction models included multiple linear regression and artificial neural network (ANN). Moreover, Bayesian network (BN) and a stepwise approach were used as variable selection methods. BN results showed that TEP is independent from other earlier expressed traits when conditioned on egg production from 35 to 80 days of age (EP1). In addition, the prediction accuracy was much lower when EP1 was not included in the model. The best predictive model was ANN, after feature selection, showing prediction correlations of r=0.792 and r=0.714 for L1 and L2, respectively. In conclusion, machine learning methods may be useful, but reasonable prediction accuracies are obtained only when partial egg production measurements are included in the model. PMID:25713397
Jalali-Heravi, M; Parastar, F
2000-12-01
A new series of six comprehensive descriptors that represent different features of the gas-liquid partition coefficient, K(L), for commonly used stationary phases is developed. These descriptors can be considered as counterparts of the parameters in the Abraham solvatochromic model of solution. A separate multiple linear regression (MLR) model was developed using the six descriptors for each stationary phase of poly(ethylene glycol adipate) (EGAD), N,N,N',N'-tetrakis(2-hydroxypropyl) ethylenediamine (THPED), poly(ethylene glycol) (Ucon 50 HB 660) (U50HB), di(2-ethylhexyl)phosphoric acid (DEHPA) and tetra-n-butylammonium N,N-(bis-2-hydroxylethyl)-2-aminoethanesulfonate (QBES). The results obtained using these models are in good agreement with experiment and with the results of the empirical model based on the solvatochromic theory. A 6-6-5 neural network was developed using the descriptors appearing in the MLR models as inputs. Comparison of the mean square errors (MSEs) shows the superiority of the artificial neural network (ANN) over the MLR. This indicates that the retention behavior of the molecules on different columns shows some nonlinear characteristics. The experimental solvatochromic parameters proposed by Abraham can be replaced by the calculated descriptors in this work. PMID:11153937
Golmohammadi, Hassan
2009-11-30
A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structures of 141 organic compounds to their octanol-water partition coefficients (log P(o/w)). A genetic algorithm was applied as a variable selection tool. Modeling of log P(o/w) of these compounds as a function of theoretically derived descriptors was established by multiple linear regression (MLR), partial least squares (PLS), and artificial neural network (ANN). The best selected descriptors that appear in the models are: atomic charge weighted partial positively charged surface area (PPSA-3), fractional atomic charge weighted partial positive surface area (FPSA-3), minimum atomic partial charge (Qmin), molecular volume (MV), total dipole moment of the molecule (mu), maximum antibonding contribution of a molecular orbital in the molecule (MAC), and maximum free valency of a C atom in the molecule (MFV). The results obtained showed the ability of the developed artificial neural network to predict partition coefficients of organic compounds. Also, the results revealed the superiority of ANN over the MLR and PLS models. PMID:19360793
Czech Academy of Sciences Publication Activity Database
Strnádel, Ján; Kverka, Miloslav; Horák, Vratislav; Vannucci, Luca; Usvald, Dušan; Hlučilová, Jana; Plánská, Daniela; Váňa, Petr; Reisnerová, H.; Jílek, F.
2007-01-01
Roč. 53, - (2007), s. 216-219. ISSN 0015-5500 R&D Projects: GA ČR GA524/04/0102; GA ČR GD310/03/H147; GA ČR GD523/03/H076; GA AV ČR IAA600450601; GA AV ČR(CZ) IAA500200510; GA MŠk 2B06130 Institutional research plan: CEZ:AV0Z50450515; CEZ:AV0Z50200510 Keywords : cytokines * sarcoma cells * spontaneous regression Subject RIV: FD - Oncology ; Hematology Impact factor: 0.596, year: 2007
International Nuclear Information System (INIS)
A method for the computer processing of the curves of potentiometric differential titration using precipitation reactions is developed. This method is based on transformation of the titration curve into a line of multiphase regression, whose parameters determine the equivalence points and the solubility products of the formed precipitates. The computational algorithm is tested using experimental curves for the titration of solutions containing Hg(II) and Cd(II) by a solution of sodium diethyldithiocarbamate. The random errors (RSD) for the titration of 1×10^-4 M solutions are in the range of 3-6%. 7 refs.; 2 figs.; 1 tab
Davies, Patrick Laurie
2014-01-01
Introduction: Approximate Models; Notation; Two Modes of Statistical Analysis; Towards One Mode of Analysis; Approximation, Randomness, Chaos, Determinism. Approximation: A Concept of Approximation; Approximating a Data Set by a Model; Approximation Regions; Functionals and Equivariance; Regularization and Optimality; Metrics and Discrepancies; Strong and Weak Topologies; On Being (almost) Honest; Simulations and Tables; Degree of Approximation and p-values; Scales; Stability of Analysis; The Choice of En(?, P); Independence; Procedures, Approximation and Vagueness. Discrete Models: The Empirical
Reproducible statistical analysis with multiple languages
DEFF Research Database (Denmark)
Lenth, Russell; Højsgaard, Søren
2011-01-01
This paper describes a system for making reproducible statistical analyses. It differs from other systems for reproducible analysis in several ways. The two main differences are: (1) several statistics programs can be used in the same document; (2) documents can be prepared using OpenOffice or \LaTeX. The main part of this paper is an example showing how to use them together in an OpenOffice text document. The paper also contains some practical considerations on the use of literate programming in statistics.
International Nuclear Information System (INIS)
An analysis of bone tumor incidence in Beagle dogs exposed to 90SrCl2 using logistic regression indicates that the logit of the probability of bone-related neoplasms is described adequately as a linear function of bone dose to death. The bone tumor incidence appears to rise sharply above 5000 rad (50 Gy). When measures of dose rate were added to the dose-response relationship, they were not found to improve the relationship significantly. There were no important differences between male and female dogs in their dose-response relationships. 1 reference, 1 figure, 2 tables
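The logit-linear dose-response model described above can be sketched on simulated data; the dose scale and coefficients here are hypothetical, not the Beagle study's values.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
# Hypothetical bone doses (in krad) and tumour outcomes; the true logit is
# assumed linear in dose, as in the fitted model described above
dose = rng.uniform(0, 20, 500)                     # dose to death, krad
p = 1/(1 + np.exp(-(-5 + 1.0*dose)))               # logit = -5 + 1.0 * dose
tumour = rng.binomial(1, p)

# Near-unpenalised maximum-likelihood logistic fit (large C)
model = LogisticRegression(C=1e6).fit(dose.reshape(-1, 1), tumour)
slope = float(model.coef_[0, 0])
print(round(slope, 2))   # recovered dose coefficient on the logit scale
```

A steep rise in incidence above a threshold is exactly what a linear logit in dose produces once the logit crosses zero.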
Analysis of structures subjected to multiple spectra input
International Nuclear Information System (INIS)
Spectral analysis is currently used in the study of the response of structures under seismic excitation. The characteristics of the excitation, represented by oscillator response spectra, may vary from one support to another. In industrial plants, e.g. in nuclear power plants, this situation occurs particularly for pipes or tall equipment. Spectral analysis is extended to multiple support excitation by considering the relative motion with respect to the instantaneous static (''pseudo static'') response, which is no longer a rigid body motion. A complementary static analysis is therefore introduced, but spectral analysis can be performed under conditions similar to those of uniform excitation. The correction for rigid modes is also considered, as well as the influence of multiple excitation on time history analysis.
Directory of Open Access Journals (Sweden)
Arash Shahin
2011-02-01
Full Text Available The main aim of this paper is to analyze the correlation of service quality gaps and to estimate customer dissatisfaction based on those gaps in the Iran Travel Agency (ITA), one of the international travel agencies of the country. For this purpose, a questionnaire has been designed based on the SERVQUAL approach (perceptions and expectations), which includes five major categories of service quality dimensions, subdivided into 15 dimensions, and an additional question for measuring the overall dissatisfaction. 30 regular customers of the agency have been asked to fill in the questionnaires. The correlation of service quality gaps and then the relationship between overall customer dissatisfaction and major service quality gaps are determined by correlation and regression analysis. The findings imply that the maximum value of gap is related to 'appealing accommodation facilities', which is a part of the dimension of tangibles. The minimum values of the gaps are also related to 'on time delivery' and 'reputation of service'. The correlation analysis did not reveal any significant correlation among the gaps. Ultimately, regression analysis has confirmed and estimated a linear correlation between the gaps of empathy and tangibles and the overall customer dissatisfaction.
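A minimal sketch of the gap analysis: correlation between two hypothetical gap scores, then a two-predictor regression of overall dissatisfaction on them. All values are simulated, not the ITA survey data.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 30  # respondents, mirroring the paper's sample size

# Hypothetical gap scores (perception - expectation) for two dimensions,
# plus an overall-dissatisfaction item assumed to be driven by them
empathy_gap = rng.normal(-1.0, 0.5, n)
tangibles_gap = rng.normal(-1.5, 0.5, n)
dissatisfaction = 3 - 0.8*empathy_gap - 0.6*tangibles_gap + rng.normal(0, 0.2, n)

# Correlation between the gaps, then a two-predictor linear regression
r = np.corrcoef(empathy_gap, tangibles_gap)[0, 1]
X = np.column_stack([np.ones(n), empathy_gap, tangibles_gap])
beta, *_ = np.linalg.lstsq(X, dissatisfaction, rcond=None)
print(round(r, 2), beta.round(2))
```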
Directory of Open Access Journals (Sweden)
S. T. Lee
2010-12-01
Full Text Available Seismic-induced landslide hazards are studied using seismic shaking intensity based on the topographic amplification effect. The estimation of the topographic effect includes the theoretical topographic amplification factors and the corresponding amplified ground motion. Digital elevation models (DEM) with a 5-m grid space are used. The logistic regression model and the geographic information system (GIS) are used to perform the seismic landslide hazard analysis. The 99 Peaks area, located 3 km away from the ruptured fault of the Chi-Chi earthquake, is used to test the proposed hypothesis. An inventory map of earthquake-triggered landslides is used to produce a dependent variable that takes a value of 0 (no landslides) or 1 (landslides). A set of independent parameters is considered, including lithology, elevation, slope gradient, slope aspect, terrain roughness, land use, and Arias intensity (I_{a}) with the topographic effect. Subsequently, logistic regression is used to find the best fitting function to describe the relationship between the occurrence and absence of landslides within an individual grid cell. The results of seismic landslide hazard analysis that includes the topographic effect (AUROC = 0.890) are better than those of the analysis without it (AUROC = 0.874).
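The logistic hazard model with an AUROC check can be sketched as below; the two predictors and their coefficients are hypothetical stand-ins for the full parameter set used in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)
n = 2000
# Hypothetical grid-cell predictors: slope gradient (deg) and Arias intensity
slope = rng.uniform(0, 60, n)
arias = rng.uniform(0, 10, n)
p = 1/(1 + np.exp(-(-6 + 0.08*slope + 0.5*arias)))
landslide = rng.binomial(1, p)          # 0 = absent, 1 = present

X = np.column_stack([slope, arias])
model = LogisticRegression(max_iter=1000).fit(X, landslide)
auroc = roc_auc_score(landslide, model.predict_proba(X)[:, 1])
print(round(auroc, 2))
```

AUROC summarises how well the fitted probabilities rank landslide cells above non-landslide cells, which is how the two model variants above are compared.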
Use of data envelopment analysis and regression for establishing manpower requirements in a bank
Directory of Open Access Journals (Sweden)
L.P. Fatti
2014-01-01
Full Text Available We describe an approach towards forecasting the manpower requirements in each of the branches of a bank, based on regression models fitted to the sets of efficient branches. DEA is employed to identify the efficient branches within a category, using the numbers of employees in the different grades at each branch as input variables, and the average volumes of different types of work performed by them during a month as output variables. Forecasts of future volumes of work are obtained by fitting a model which takes into account branch and seasonal effects, as well as separate trend effects for each of the branches. The models have been tested on data from a large bank, with very encouraging results. The approach holds great promise for use towards a decision support system for managing the bank's total branch manpower requirements.
Parallel Approach for Time Series Analysis with General Regression Neural Networks
Directory of Open Access Journals (Sweden)
J.C. Cuevas-Tello
2012-04-01
Full Text Available The accuracy of time delay estimation given pairs of irregularly sampled time series is of great relevance in astrophysics. However, the computational time is also important because the study of large data sets is needed. Besides introducing a new approach for time delay estimation, this paper presents a parallel approach to obtain a fast algorithm for time delay estimation. The neural network architecture that we use is the General Regression Neural Network (GRNN). For the parallel approach, we use Message Passing Interface (MPI) on a Beowulf-type cluster and on a Cray supercomputer, and we also use the Compute Unified Device Architecture (CUDA™) language on Graphics Processing Units (GPUs). We demonstrate that, with our approach, fast algorithms can be obtained for time delay estimation on large data sets with the same accuracy as state-of-the-art methods.
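A GRNN prediction is essentially a Gaussian-kernel-weighted average of the training targets (the Nadaraya-Watson form). A minimal serial sketch on a hypothetical irregularly sampled series, before any MPI/CUDA parallelisation:

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, sigma=0.1):
    """General Regression Neural Network: a kernel-weighted average of the
    training targets, vectorised over all query points."""
    d2 = (x_query[:, None] - x_train[None, :])**2      # squared distances
    w = np.exp(-d2 / (2*sigma**2))                     # Gaussian pattern layer
    return (w @ y_train) / w.sum(axis=1)               # summation/output layer

# Irregularly sampled time series (hypothetical, not astrophysical data)
rng = np.random.default_rng(6)
t = np.sort(rng.uniform(0, 2*np.pi, 80))
y = np.sin(t) + rng.normal(0, 0.05, t.size)

t_query = np.linspace(0.5, 5.5, 50)
y_hat = grnn_predict(t, y, t_query, sigma=0.2)
err = float(np.max(np.abs(y_hat - np.sin(t_query))))
print(round(err, 2))
```

Because every query is independent, the query loop (here a single matrix product) parallelises trivially across MPI ranks or GPU threads.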
Kang, Seung-Wan; Byun, Gukdo; Park, Hun-Joon
2014-12-01
This paper presents empirical research into the relationship between leader-follower value congruence in social responsibility and the level of ethical satisfaction for employees in the workplace. 163 dyads were analyzed, each consisting of a team leader and an employee working at a large manufacturing company in South Korea. Following current methodological recommendations for congruence research, polynomial regression and response surface modeling methodologies were used to determine the effects of value congruence. Results indicate that leader-follower value congruence in social responsibility was positively related to the ethical satisfaction of employees. Furthermore, employees' ethical satisfaction was stronger when aligned with a leader with high social responsibility. The theoretical and practical implications are discussed. PMID:25539173
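The polynomial-regression step can be sketched as follows: a quadratic surface Z = b0 + b1 L + b2 F + b3 L^2 + b4 LF + b5 F^2 is fitted, and the response-surface curvature along the incongruence line L = -F is read off as b3 - b4 + b5. The data and coefficients here are simulated, not the study's.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 163  # dyads, as in the study
leader = rng.uniform(1, 7, n)     # hypothetical social-responsibility ratings
follower = rng.uniform(1, 7, n)

# Simulated ethical satisfaction peaking where the two values are congruent
satisfaction = 5 - 0.3*(leader - follower)**2 + 0.2*leader + rng.normal(0, 0.3, n)

# Quadratic polynomial regression in both ratings
X = np.column_stack([np.ones(n), leader, follower,
                     leader**2, leader*follower, follower**2])
b, *_ = np.linalg.lstsq(X, satisfaction, rcond=None)

# Curvature along the incongruence line L = -F: negative curvature means
# satisfaction falls off as leader and follower values diverge
curvature = b[3] - b[4] + b[5]
print(round(curvature, 2))
```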
DEFF Research Database (Denmark)
Shirali, Mahmoud; Nielsen, Vivi Hunnicke
Heritability of residual feed intake (RFI) increased from low to high over the growing period in male and female mink. The lowest heritability for RFI (male: 0.04 ± 0.01 standard deviation (SD); female: 0.05 ± 0.01 SD) was in early and the highest heritability (male: 0.33 ± 0.02; female: 0.34 ± 0.02 SD) was achieved at the late growth stages. The genetic correlation between different growth stages for RFI showed a high association (0.91 to 0.98) between early and late growing periods. However, phenotypic correlations were lower from 0.29 to 0.50. The residual variances were substantially higher at the end compared to the early growing period suggesting that heterogeneous residual variance should be considered for analyzing feed efficiency data in mink. This study suggests random regression methods are suitable for analyzing feed efficiency and that genetic selection for RFI in mink is promising.
Laporte, Audrey; Karimova, Alfia; Ferguson, Brian
2010-09-01
The time path of consumption from a rational addiction (RA) model contains information about an individual's tendency to be forward looking. In this paper, we use quantile regression (QR) techniques to investigate whether the tendency to be forward looking varies systematically with the level of consumption of cigarettes. Using panel data, we find that the forward-looking effect is strongest relative to the addiction effect in the lower quantiles of cigarette consumption, and that the forward-looking effect declines and the addiction effect increases as we move toward the upper quantiles. The results indicate that QR can be used to illuminate the heterogeneity in individuals' tendency to be forward looking even after controlling for factors such as education. QR also gives useful information about the differential impact of policy variables, most notably workplace smoking restrictions, on light and heavy smokers. PMID:20730997
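Quantile regression minimises the pinball (check) loss. As a sketch of that objective: for a constant predictor, the minimiser of the pinball loss at level q is the empirical q-th quantile. The consumption data here are hypothetical.

```python
import numpy as np

def pinball_loss(y, pred, q):
    """Check (pinball) loss minimised by quantile regression at quantile q."""
    e = y - pred
    return np.mean(np.maximum(q*e, (q - 1)*e))

rng = np.random.default_rng(8)
cigs = rng.gamma(shape=2.0, scale=8.0, size=5000)   # hypothetical daily cigarettes

# For a constant predictor, the loss at level q = 0.9 (heavy smokers)
# is minimised by the empirical 0.9-quantile of the data
q = 0.9
grid = np.linspace(cigs.min(), cigs.max(), 2000)
best = grid[np.argmin([pinball_loss(cigs, c, q) for c in grid])]
print(round(best, 1), round(np.quantile(cigs, q), 1))
```

Replacing the constant with a linear predictor and minimising the same loss gives the quantile-regression fits used to contrast light and heavy smokers.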
Stratigraphic analysis of Pennsylvania rocks using hierarchy of transgressive-regressive units
Busch, R. M.
1984-06-01
Pennsylvanian stratigraphic sequences are described, interpreted, and correlated using a hierarchy of six scales of allocyclic, time-stratigraphic, transgressive-regressive units (abbreviated T-R units), which are inferred to be the net result of deposition during eustatic cycles of sea level change. The T-R units can be correlated across both marine and nonmarine facies in the Appalachian Basin. This permits differentiation of allocyclic T-R units from autocyclic T-R units or fluvial autocyclic units. The precise correlations also provide a time-stratigraphic framework for very accurate paleogeographic reconstructions. Paleogeographic maps were constructed for successive fifth-order marine events of the Glenshaw Formation (Upper Pennsylvanian) of the Northern Appalachian Basin. The hierarchical T-R unit approach is useful for understanding and predicting the location of marine units, claystones, various types of stratigraphic breaks, and economic mineral deposits.
Parallel Approach for Time Series Analysis with General Regression Neural Networks
Scientific Electronic Library Online (English)
J.C., Cuevas-Tello; R.A., González-Grimaldo; O., Rodríguez-González; H.G., Pérez-González; O., Vital-Ochoa.
2012-04-01
Full Text Available The accuracy of time delay estimation given pairs of irregularly sampled time series is of great relevance in astrophysics. However, the computational time is also important because the study of large data sets is needed. Besides introducing a new approach for time delay estimation, this paper presents a parallel approach to obtain a fast algorithm for time delay estimation. The neural network architecture that we use is the General Regression Neural Network (GRNN). For the parallel approach, we use Message Passing Interface (MPI) on a Beowulf-type cluster and on a Cray supercomputer, and we also use the Compute Unified Device Architecture (CUDA™) language on Graphics Processing Units (GPUs). We demonstrate that, with our approach, fast algorithms can be obtained for time delay estimation on large data sets with the same accuracy as state-of-the-art methods.
Arch Index: An Easier Approach for Arch Height (A Regression Analysis)
Directory of Open Access Journals (Sweden)
Hironmoy Roy
2012-04-01
Full Text Available Background: Arch-height estimation, though usually practiced in the supine posture, is neither correct nor scientific as referred to in the literature, which favours standing x-rays or the arch index as a yardstick. In fact, standing x-rays can be excused for being troublesome in a busy OPD, but an ink footprint on a simple graph sheet can be documented, as it is easier, cheaper and requires almost no machinery or expertise. Objective: This study aimed to redefine the inter-relationship of the radiological standing arch heights with the arch index through correlation and regression, so that from the latter we can derive the radiographic standing arch-height values indirectly, avoiding the actual maneuver. Methods: The study involved 103 adult subjects attending a tertiary care hospital of North Bengal. From the standing x-rays of the foot, the standing navicular and talar heights were measured and 'normalised' with the foot length. In parallel, footprints were also obtained for the arch index. Finally, the variables were analysed with SPSS software. Result: The arch index showed significant negative correlations and simple linear regressions with standing navicular height and standing talar height, as well as with standing normalised navicular and talar heights, analysed in both sexes separately, with supporting mathematical equations. Conclusion: To measure the standing arch height in a busy OPD, it is wise to obtain the footprint first. Once the arch index is known, it can be put into the equations derived here to predict the preferred standing arch heights in either sex.
Cognitive analysis of multiple sclerosis utilizing fuzzy cluster means
Directory of Open Access Journals (Sweden)
Imianvan Anthony Agboizebeta
2012-02-01
Full Text Available Multiple sclerosis, often called MS, is a disease that affects the central nervous system (the brain and spinal cord). Myelin provides insulation for nerve cells, improves the conduction of impulses along the nerves and is important for maintaining the health of the nerves. In multiple sclerosis, inflammation causes the myelin to disappear. Genetic factors, environmental issues and viral infection may also play a role in developing the disease. MS is characterized by life threatening symptoms such as loss of balance, hearing problems and depression. The application of Fuzzy Cluster Means (FCM, or Fuzzy C-Means) analysis to the diagnosis of different forms of multiple sclerosis is the focal point of this paper. Application of cluster analysis involves a sequence of methodological and analytical decision steps that enhances the quality and meaning of the clusters produced. Uncertainties associated with analysis of multiple sclerosis test data are eliminated by the system.
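A minimal Fuzzy C-Means implementation using the standard alternating membership/centroid updates; the two-group data below are hypothetical stand-ins, not MS test data.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=100, seed=0):
    """Minimal Fuzzy C-Means: alternate fuzzy-membership and centroid updates."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))      # memberships, rows sum to 1
    p = 2 / (m - 1)
    for _ in range(iters):
        # Weighted centroids: v_j = sum_k u_jk^m x_k / sum_k u_jk^m
        centers = (U.T**m @ X) / (U.T**m).sum(axis=1, keepdims=True)
        d = np.linalg.norm(X[:, None] - centers[None], axis=2) + 1e-12
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        U = 1.0 / (d**p * (1.0/d**p).sum(axis=1, keepdims=True))
    return U, centers

# Hypothetical two-group symptom-score data
rng = np.random.default_rng(9)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
U, centers = fuzzy_c_means(X)
print(np.sort(centers[:, 0]).round(1))   # cluster centres near 0 and 3
```

Unlike hard k-means, each sample keeps graded memberships in every cluster, which is what lets FCM express diagnostic uncertainty.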
Weisberg, Sanford
2013-01-01
Praise for the Third Edition ""...this is an excellent book which could easily be used as a course text...""-International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
International Nuclear Information System (INIS)
In Baguio City, Philippines, a mountainous city of 252,386 people where 61% of motor vehicles use diesel fuel, ambient particulate matter less than 2.5 µm (PM2.5) and 10 µm (PM10) in aerodynamic diameter and carbon monoxide (CO) were measured at 30 street-level locations for 15 min apiece during the early morning (4:50-6:30 am), morning rush hour (6:30-9:10 am) and afternoon rush hour (3:40-5:40 pm) in December 2004. Environmental observations (e.g. traffic-related variables, building/roadway designs, wind speed and direction, etc.) at each location were noted during each monitoring event. Multiple regression models were formulated to determine which pollution sources and environmental factors significantly affect ground-level PM2.5, PM10 and CO concentrations. The models showed statistically significant relationships between traffic and early morning particulate air pollution [PM2.5 (p = 0.021) and PM10 (p = 0.048)], traffic and morning rush hour CO (p = 0.048), traffic and afternoon rush hour CO (p = 0.034) and wind and early morning CO (p = 0.044). The mean early morning, street-level PM2.5 (110 ± 8 µg/m3; mean ± 1 standard error) was not significantly different (p-value > 0.05) from either rush hour PM2.5 concentration (morning = 98 ± 7 µg/m3; afternoon = 107 ± 5 µg/m3) due to nocturnal inversions, in spite of a 100% increase in automotive density during rush hours. Early morning street-level CO (3.0 ± 1.7 ppm) differed from morning rush hour CO (4.1 ± 2.3 ppm) (p = 0.039) and afternoon rush hour CO (4.5 ± 2.2 ppm) (p = 0.007). Additionally, PM2.5, PM10, CO, nitrogen dioxide (NO2) and select volatile organic compounds were continuously measured at a downtown, third-story monitoring station along a busy roadway for 11 days. Twenty-four-hour average ambient concentrations were: PM2.5 = 72.9 ± 21 µg/m3; CO = 2.61 ± 0.6 ppm; NO2 = 27.7 ± 1.6 ppb; benzene = 8.4 ± 1.4 µg/m3; ethylbenzene = 4.6 ± 2.0 µg/m3; p-xylene = 4.4 ± 1.9 µg/m3; m-xylene = 10.2 ± 4.4 µg/m3; o-xylene = 7.5 ± 3.2 µg/m3.
The multiple regression models suggest that traffic and wind in Baguio City, Philippines significantly affect street-level pollution concentrations. Ambient PM2.5 levels measured are above USEPA daily (65 µg/m3) and Filipino/USEPA annual standards (15 µg/m3) with concentrations of a magnitude rarely seen in most countries except in areas where local topography plays a significant role in air pollution entrapment. The elevated pollution concentrations present and the diesel-rich nature of motor vehicle emissions are important pertaining to human exposure and health information and as such warrant public health concern. (author)
Directory of Open Access Journals (Sweden)
Guo Junqiao
2008-09-01
Full Text Available Abstract Background The effects of climate variations on bacillary dysentery incidence have attracted increasing concern. However, the multi-collinearity among meteorological factors affects the accuracy of their correlation with bacillary dysentery incidence. Methods As a remedy, a modified method combining ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. Results All weather indicators (temperatures, precipitation, evaporation and relative humidity) showed positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987–1996, relative humidity, temperatures and air pressure affected the transmission of bacillary dysentery. During this period, all meteorological factors were divided into three categories: relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Conclusion Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from giving more consideration to local climate variations.
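A sketch of the two techniques combined: closed-form ridge regression on collinear weather indicators, then hierarchical clustering of the indicators by correlation distance. The indicator structure below is simulated to echo the reported three-class grouping; none of it is the study's data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(10)
n = 120
# Hypothetical monthly indicators with built-in collinearity:
# humidity ~ precipitation, temperature ~ evaporation, pressure independent
precip = rng.normal(0, 1, n)
humidity = precip + rng.normal(0, 0.2, n)
temp = rng.normal(0, 1, n)
evap = temp + rng.normal(0, 0.2, n)
pressure = rng.normal(0, 1, n)
X = np.column_stack([humidity, precip, temp, evap, pressure])
incidence = 2*temp + humidity - pressure + rng.normal(0, 0.5, n)

# Ridge regression, closed form: (X'X + lambda*I)^-1 X'y, which stabilises
# the coefficients that plain least squares would inflate under collinearity
lam = 1.0
beta = np.linalg.solve(X.T @ X + lam*np.eye(5), X.T @ incidence)

# Hierarchical clustering of the five indicators by correlation distance
D = 1 - np.abs(np.corrcoef(X.T))
labels = fcluster(linkage(D[np.triu_indices(5, 1)], method="average"),
                  t=3, criterion="maxclust")
print(beta.round(1), labels)
```

The cluster labels recover the humidity/precipitation pair, the temperature/evaporation pair, and pressure on its own, paralleling the three classes reported above.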
Hu, Meng; Clark, Kelsey L; Gong, Xiajing; Noudoost, Behrad; Li, Mingyao; Moore, Tirin; Liang, Hualou
2015-06-10
Inferotemporal (IT) neurons are known to exhibit persistent, stimulus-selective activity during the delay period of object-based working memory tasks. Frontal eye field (FEF) neurons show robust, spatially selective delay period activity during memory-guided saccade tasks. We present a copula regression paradigm to examine neural interaction of these two types of signals between areas IT and FEF of the monkey during a working memory task. This paradigm is based on copula models that can account for both marginal distribution over spiking activity of individual neurons within each area and joint distribution over ensemble activity of neurons between areas. Considering the popular GLMs as marginal models, we developed a general and flexible likelihood framework that uses the copula to integrate separate GLMs into a joint regression analysis. Such joint analysis essentially leads to a multivariate analog of the marginal GLM theory and hence efficient model estimation. In addition, we show that Granger causality between spike trains can be readily assessed via the likelihood ratio statistic. The performance of this method is validated by extensive simulations, and compared favorably to the widely used GLMs. When applied to spiking activity of simultaneously recorded FEF and IT neurons during working memory task, we observed significant Granger causality influence from FEF to IT, but not in the opposite direction, suggesting the role of the FEF in the selection and retention of visual information during working memory. The copula model has the potential to provide unique neurophysiological insights about network properties of the brain. PMID:26063909
Directory of Open Access Journals (Sweden)
Yingqiang Ding
2012-04-01
Full Text Available A new hybrid algorithm named EODT-LS-SVR, based on least squares support vector regression (LS-SVR) with the wavelet-based EODT algorithm as a preprocessing tool, is proposed for removing interferences and developing high-precision quantitative models in near-infrared (NIR) spectra. The EODT-LS-SVR algorithm is composed of two steps. In the first step, the preprocessing algorithm named EODT, which combines the ideas of wavelet packet transform (WPT), orthogonal signal correction (OSC) and information theory, is employed for the extraction of characteristic analyte information through multi-scale analysis. An entropy-based baseline signal removing (EBSR) algorithm is applied to remove the baseline of the spectra based on information theory with WPT-based analysis, and then the information orthogonal to the analyte concentrations is removed by the OSC algorithm in each frequency band of the spectra. In the second step, the LS-SVR method, coupled with grid search and particle swarm optimization (PSO) for parameter optimization, is used to enhance the quality of the regression models. The EODT-LS-SVR algorithm was validated on two NIR spectral datasets, one used for measuring the fat concentration of milk and the other for measuring the oil content of corn. The comparison of prediction results demonstrated that the performance of calibration models developed by the EODT-LS-SVR algorithm is better than that of models developed by other conventional algorithms, showing its high efficiency and quality for quantitative model development in NIR spectra of complex samples.
Energy Technology Data Exchange (ETDEWEB)
Jorjani, E.; Poorali, H.A.; Sam, A.; Chelgani, S.C.; Mesroghli, S.; Shayestehfar, M.R. [Islam Azad University, Tehran (Iran). Dept. of Mining Engineering
2009-10-15
In this paper, the combustible value (i.e. 100-Ash) and combustible recovery of coal flotation concentrate were predicted by regression and artificial neural network based on proximate and group macerals analysis. The regression method shows that the relationships between (a) ln(ash), volatile matter and moisture and (b) ln(ash), ln(liptinite), fusinite and vitrinite with combustible value can achieve correlation coefficients (R{sup 2}) of 0.8 and 0.79, respectively. In addition, the input sets of (c) ash, volatile matter and moisture and (d) ash, liptinite and fusinite can predict the combustible recovery with correlation coefficients of 0.84 and 0.63, respectively. A feed-forward artificial neural network with a 6-8-12-11-2-1 arrangement for the moisture, ash and volatile matter input set was capable of estimating both combustible value and combustible recovery with a correlation of 0.95. It was shown that the proposed neural network model could accurately reproduce all the effects of proximate and group macerals analysis on the coal flotation system.
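The regression half of such a model can be sketched with ordinary least squares on hypothetical proximate-analysis predictors. The data and coefficients below are synthetic placeholders, not the paper's measurements:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical proximate-analysis predictors (illustrative only):
# columns = ln(ash), volatile matter, moisture (all in %)
n = 60
X = np.column_stack([
    np.log(rng.uniform(5, 30, n)),   # ln(ash)
    rng.uniform(20, 40, n),          # volatile matter
    rng.uniform(1, 10, n),           # moisture
])
# Assumed linear relation for the combustible value, plus noise
combustible = 90 - 8 * X[:, 0] + 0.3 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(0, 1, n)

A = np.column_stack([np.ones(n), X])               # add intercept column
coef, *_ = np.linalg.lstsq(A, combustible, rcond=None)
pred = A @ coef
ss_res = ((combustible - pred) ** 2).sum()
ss_tot = ((combustible - combustible.mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot                           # coefficient of determination
```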
Directory of Open Access Journals (Sweden)
Y.-P. Lin
2010-06-01
Full Text Available The Chi-Chi Earthquake of September 1999 in Central Taiwan registered a moment magnitude (MW) of 7.6, causing widespread landslides. Subsequent typhoons bringing heavy rainfall triggered further landslides. This study investigates multi-temporal landslide images through spatial analysis between 1996 and 2005 in the Chenyulan Watershed, Taiwan. Spatial patterns at various landslide frequencies were detected using landscape metrics. The logistic regression results indicate that frequency of occurrence is an important factor in assessing landslide hazards. Low-occurrence landslides sprawl across the catchment, while the sustained (frequent) landslide areas cluster near the ridge as well as the stream course. From those results, we can infer that landslide area and mean landslide size correlate with the frequency of occurrence. Although negatively correlated with frequency for low-occurrence landslides, the mean size of each landslide is positively related to frequency for high-occurrence ones. Moreover, this study determines the spatial susceptibilities of landslides by performing logistic regression analysis. Results of this study demonstrate that factors such as elevation, slope, lithology, and vegetation cover are significant explanatory variables. In addition to the various frequencies, the relationships between driving factors and landslide susceptibility in the study area are quantified as well.
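Susceptibility mapping by logistic regression can be sketched with Newton-Raphson (IRLS) fitting on synthetic terrain predictors. The covariates and coefficients below are illustrative assumptions, not values from the study:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical normalized terrain predictors (illustrative only)
n = 500
elevation = rng.uniform(0, 1, n)
slope = rng.uniform(0, 1, n)
veg = rng.uniform(0, 1, n)            # vegetation cover
X = np.column_stack([np.ones(n), elevation, slope, veg])
true_beta = np.array([-2.0, 1.5, 3.0, -2.5])    # assumed effects
p = 1 / (1 + np.exp(-X @ true_beta))
y = rng.binomial(1, p)                # 1 = landslide occurred in the cell

beta = np.zeros(4)
for _ in range(25):                   # Newton-Raphson / IRLS iterations
    mu = 1 / (1 + np.exp(-X @ beta))
    W = mu * (1 - mu)                 # Bernoulli variance weights
    H = X.T @ (W[:, None] * X)
    beta += np.linalg.solve(H, X.T @ (y - mu))

acc = float(np.mean((1 / (1 + np.exp(-X @ beta)) > 0.5) == y))
```

The fitted coefficients then quantify each driving factor's contribution to susceptibility, as in the abstract.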
Scientific Electronic Library Online (English)
Dora, Ocampo; Raúl, Rivas.
2013-08-01
Full Text Available Knowledge of daily net radiation (Rn) makes it possible to quantify the energy used in the various processes occurring at the surface, such as evapotranspiration. This study applies a Multiple Linear Regression Model (MRLM) to estimate Rn in a subhumid-humid zone of Argentina. The model used meteorological data on global solar radiation, air temperature, relative humidity, net radiation (measured with a Kipp & Zonen net radiometer) and the inverse relative earth-sun distance (eccentricity factor). As a result, eight estimation equations for Rn were obtained. The MRLM models were evaluated using the Mean Bias Error (MBE) and Root Mean Square Error (RMSE) statistics. The results showed a good fit and low error at the daily scale, with the best models involving solar radiation, temperature, relative humidity and the inverse earth-sun distance, allowing Rn to be calculated with errors below 19 W·m-2.
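The two evaluation statistics used above are simple to compute; a small sketch with illustrative daily net-radiation values (not the study's measurements):

```python
import numpy as np

def mbe(observed, modeled):
    """Mean Bias Error: positive values mean the model overestimates."""
    return float(np.mean(modeled - observed))

def rmse(observed, modeled):
    """Root Mean Square Error."""
    return float(np.sqrt(np.mean((modeled - observed) ** 2)))

# Toy daily Rn values in W·m^-2 (illustrative only)
rn_obs = np.array([120.0, 95.0, 130.0, 110.0])
rn_mod = np.array([125.0, 90.0, 128.0, 115.0])
bias = mbe(rn_obs, rn_mod)
err = rmse(rn_obs, rn_mod)
```

MBE captures systematic over- or underestimation, while RMSE penalizes large individual errors.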
Comparative analysis and visualization of multiple collinear genomes
Wang Jeremy R; de Villena Fernando; McMillan Leonard
2012-01-01
Abstract Background Genome browsers are a common tool used by biologists to visualize genomic features including genes, polymorphisms, and many others. However, existing genome browsers and visualization tools are not well-suited to perform meaningful comparative analysis among a large number of genomes. With the increasing quantity and availability of genomic data, there is an increased burden to provide useful visualization and analysis tools for comparison of multiple collinear genomes suc...
Analysis of Multiple Manding Topographies during Functional Communication Training
Harding, Jay W.; Wacker, David P.; Berg, Wendy K.; Winborn-Kemmerer, Lisa; Lee, John F.; Ibrahimovic, Muska
2009-01-01
We evaluated the effects of reinforcing multiple manding topographies during functional communication training (FCT) to decrease problem behavior for three preschool-age children. During Phase 1, a functional analysis identified conditions that maintained problem behavior for each child. During Phase 2, the children's parents taught them to…
Generalized reduced rank regression
Hansen, Peter Reinhard
2002-01-01
I introduce a technique to estimate parameters in regressions with reduced rank parameters in a general setting. The framework can handle a general class of parameter restrictions and allows for specifications with heteroskedastic and autocorrelated regression errors. Applications of this technique include: estimation of structural equations, estimation of reduced rank matrices in cross-section, panel, and time-series analysis, including estimation of cointegration relations in time series an...
Matson, Johnny L.; Kozlowski, Alison M.
2010-01-01
Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…
Xu, Wenbo; Jing, Shaocai; Yu, Wenjuan; Wang, Zhaoxian; Zhang, Guoping; Huang, Jianxi
2013-11-01
In this study, the high-risk debris flow areas of Sichuan Province, Panzhihua and Liangshan Yi Autonomous Prefecture, were taken as the study areas. Using rainfall and environmental factors as predictors and based on different prior probability combinations of debris flows, the prediction of debris flows was compared in these areas with two statistical methods: logistic regression (LR) and Bayes discriminant analysis (BDA). The comprehensive analysis shows that (a) with a mid-range prior probability, the overall predictive accuracy of BDA is higher than that of LR; (b) with equal and extreme prior probabilities, the overall predictive accuracy of LR is higher than that of BDA; (c) regional predictive models of debris flows using rainfall factors only perform worse than those that also introduce environmental factors, and the predictive accuracies for occurrence and non-occurrence of debris flows changed in opposite directions as this information was supplemented.
Energy Technology Data Exchange (ETDEWEB)
Stevens, F. J.; Bobrovnik, S. A.; Biosciences Division; Palladin Inst. Biochemistry
2007-12-01
Physiological responses of the adaptive immune system are polyclonal in nature whether induced by a naturally occurring infection, by vaccination to prevent infection or, in the case of animals, by challenge with antigen to generate reagents of research or commercial significance. The composition of the polyclonal responses is distinct to each individual or animal and changes over time. Differences exist in the affinities of the constituents and their relative proportion of the responsive population. In addition, some of the antibodies bind to different sites on the antigen, whereas other pairs of antibodies are sterically restricted from concurrent interaction with the antigen. Even if generation of a monoclonal antibody is the ultimate goal of a project, the quality of the resulting reagent is ultimately related to the characteristics of the initial immune response. It is probably impossible to quantitatively parse the composition of a polyclonal response to antigen. However, molecular regression allows further parameterization of a polyclonal antiserum in the context of certain simplifying assumptions. The antiserum is described as consisting of two competing populations of high- and low-affinity and unknown relative proportions. This simple model allows the quantitative determination of representative affinities and proportions. These parameters may be of use in evaluating responses to vaccines, to evaluating continuity of antibody production whether in vaccine recipients or animals used for the production of antisera, or in optimizing selection of donors for the production of monoclonal antibodies.
Regression analysis of the structure function for reliability evaluation of continuous-state system
Energy Technology Data Exchange (ETDEWEB)
Gamiz, M.L., E-mail: mgamiz@ugr.e [Departamento de Estadistica e I.O., Facultad de Ciencias, Universidad de Granada, Granada 18071 (Spain); Martinez Miranda, M.D. [Departamento de Estadistica e I.O., Facultad de Ciencias, Universidad de Granada, Granada 18071 (Spain)
2010-02-15
Technical systems are designed to perform an intended task with an admissible range of efficiency. According to this idea, it is permissible that the system runs among different levels of performance, in addition to complete failure and perfect functioning. As a consequence, reliability theory has evolved from binary-state systems to the most general case of continuous-state systems, in which the state of the system changes over time through some interval on the real number line. In this context, obtaining an expression for the structure function becomes difficult, compared to the discrete case, with difficulty increasing as the number of components of the system increases. In this work, we propose a method to build a structure function for a continuum system by using multivariate nonparametric regression techniques, in which certain analytical restrictions on the variable of interest must be taken into account. Once the structure function is obtained, some reliability indices of the system are estimated. We illustrate our method via several numerical examples.
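A common multivariate nonparametric regression choice for such a surrogate structure function is the Nadaraya-Watson kernel estimator. The two-component continuous-state system below is a synthetic illustration, not one of the paper's numerical examples:

```python
import numpy as np

def nw_regression(X, y, Xq, h=0.15):
    """Nadaraya-Watson multivariate kernel regression with a Gaussian kernel."""
    d2 = ((Xq[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * h ** 2))
    return (w @ y) / w.sum(axis=1)

rng = np.random.default_rng(4)
# Hypothetical 2-component system: component states in [0, 1];
# observed system performance = min of the two states, plus measurement noise
Xc = rng.uniform(0, 1, (400, 2))
perf = np.minimum(Xc[:, 0], Xc[:, 1]) + 0.02 * rng.normal(size=400)

# Query the estimated structure function at three component-state vectors
Xq = np.array([[0.9, 0.9], [0.9, 0.2], [0.1, 0.1]])
est = nw_regression(Xc, perf, Xq)
```

The estimated structure function should be monotone in the component states, which the three query points illustrate.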
Regression analysis of the structure function for reliability evaluation of continuous-state system
International Nuclear Information System (INIS)
Technical systems are designed to perform an intended task with an admissible range of efficiency. According to this idea, it is permissible that the system runs among different levels of performance, in addition to complete failure and the perfect functioning one. As a consequence, reliability theory has evolved from binary-state systems to the most general case of continuous-state system, in which the state of the system changes over time through some interval on the real number line. In this context, obtaining an expression for the structure function becomes difficult, compared to the discrete case, with difficulty increasing as the number of components of the system increases. In this work, we propose a method to build a structure function for a continuum system by using multivariate nonparametric regression techniques, in which certain analytical restrictions on the variable of interest must be taken into account. Once the structure function is obtained, some reliability indices of the system are estimated. We illustrate our method via several numerical examples.
Analysis of air pollution data at a mixed source location using boosted regression trees
Carslaw, David C.; Taylor, Paul J.
This paper explores the use of boosted regression trees to draw inferences concerning the source characteristics at a location of high source complexity. Models are developed for hourly concentrations of nitrogen oxides (NOx) close to a large international airport. Model development is discussed and methods to quantify model uncertainties are developed. It is shown that good explanatory models can be developed and, further, that allowing for interactions between model variables significantly improves the model fits compared with non-interacting models. Methods are used to determine which variables exert most influence over predicted concentrations and to explore the NOx dependency for each. Model predictions are used to estimate aircraft take-off contributions to total concentrations of NOx and to determine how these predictions are affected by annual variations in meteorological conditions and runway use patterns. Furthermore, the results relating to the aircraft contributions to total NOx concentration are compared with those from a more detailed independent field campaign. Finally, we find empirical evidence that plumes from larger aircraft disperse more rapidly from the point of release compared with smaller aircraft. The reasons for this behaviour and the implications are discussed.
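The core of boosted regression trees, repeatedly fitting small trees to the current residuals and shrinking each tree's contribution, can be sketched with depth-one stumps on synthetic data (a toy stand-in for the hourly NOx models, not the paper's implementation):

```python
import numpy as np

def fit_stump(x, r):
    """Best single threshold split of one feature minimizing squared error."""
    best_sse, best = np.inf, None
    for t in np.unique(x)[:-1]:
        left, right = r[x <= t], r[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse, best = sse, (t, left.mean(), right.mean())
    return best

def boost(x, y, n_rounds=100, lr=0.1):
    """Gradient boosting with stumps for squared-error loss."""
    pred = np.full_like(y, y.mean(), dtype=float)
    stumps = []
    for _ in range(n_rounds):
        t, lm, rm = fit_stump(x, y - pred)        # fit stump to residuals
        pred += lr * np.where(x <= t, lm, rm)     # shrunken update
        stumps.append((t, lm, rm))
    return stumps, pred

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 300)
y = np.sin(x) + 0.1 * rng.normal(size=300)
stumps, pred = boost(x, y)
mse = float(np.mean((y - pred) ** 2))
```

Real implementations add deeper trees (which is what lets interactions between variables improve the fit, as the paper finds), subsampling, and held-out validation.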
Uncertainty analysis for determination of plutonium mass by neutron multiplicity
International Nuclear Information System (INIS)
This paper describes an uncertainty analysis carried out in association with the use of neutron multiplicity counting to collect data and assign a total plutonium mass. During 1997, the Los Alamos Safeguards Science and Technology Group carried out careful calorimetry and neutron multiplicity certification measurements on two 239Pu metal foils used as reference standards at the Idaho National Environmental Engineering Laboratory (INEEL). The foils were measured using a five-ring neutron multiplicity counter designed for neutron measurement control activities. This multiplicity counter is well characterized, and the detector parameters were reaffirmed before the measurements were made using several well-known Los Alamos standards. Then, the 240Pu effective mass of the foils was determined directly from the multiplicity analysis without a conventional calibration curve based on representative standards. Finally, the 240Pu effective mass fraction and the total plutonium mass were calculated using gamma ray isotopics. Errors from statistical data collection, background subtraction, cosmic ray interaction, dead time corrections, calibration constants, sample geometry, and sample position were carefully estimated and propagated. The authors describe these error sources, the final calculated relative error in the foil assay, and the comparison with very accurate calorimetry measurements.
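Propagating independent error sources into a total relative uncertainty is typically done in quadrature. The source magnitudes and assay mass below are illustrative assumptions, not the certification values:

```python
import numpy as np

# Independent 1-sigma relative uncertainties for the error sources listed
# above (illustrative values only)
sources = {
    "counting statistics": 0.010,
    "background subtraction": 0.004,
    "cosmic-ray events": 0.002,
    "dead-time correction": 0.005,
    "calibration constants": 0.008,
    "geometry and position": 0.006,
}

# Combine independent relative uncertainties in quadrature
total_rel = float(np.sqrt(sum(v ** 2 for v in sources.values())))

pu240_eff_g = 52.1                       # hypothetical 240Pu-effective mass, g
abs_unc = pu240_eff_g * total_rel        # absolute uncertainty of the assay
```

The quadrature total is dominated by the largest individual terms, which is why counting statistics and calibration typically drive the final assay error.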
Using the Chow Test to Analyze Regression Discontinuities
Directory of Open Access Journals (Sweden)
Howard H. Lee
2008-09-01
Full Text Available The Chow Test (Chow, 1960) is a method well known in econometrics. It was originally designed to analyze the same variables obtained in two different data sets to determine whether they were similar enough to be pooled together. The regression discontinuity design is a variation of the two-group pre-test/post-test design. The usual method of analysis for data collected using this design is multiple regression with one dummy-coded variable representing the cut-off value. This article discusses the use of the Chow Test on data obtained in a regression discontinuity study.
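The Chow statistic is a restricted-versus-unrestricted F test comparing one pooled regression against separate fits on the two data sets. A numpy sketch on synthetic two-regime data (the data-generating coefficients are arbitrary):

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an OLS fit."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return ((y - X @ b) ** 2).sum()

def chow_F(X1, y1, X2, y2):
    """Chow (1960) F statistic for equality of coefficients across two samples."""
    k = X1.shape[1]
    n1, n2 = len(y1), len(y2)
    rss_pooled = rss(np.vstack([X1, X2]), np.concatenate([y1, y2]))
    rss_sep = rss(X1, y1) + rss(X2, y2)
    return ((rss_pooled - rss_sep) / k) / (rss_sep / (n1 + n2 - 2 * k))

rng = np.random.default_rng(6)
x1, x2 = rng.uniform(0, 10, 50), rng.uniform(0, 10, 50)
X1 = np.column_stack([np.ones(50), x1])
X2 = np.column_stack([np.ones(50), x2])
y_same1 = 1 + 2 * x1 + rng.normal(0, 1, 50)
y_same2 = 1 + 2 * x2 + rng.normal(0, 1, 50)    # same regime as sample 1
y_diff2 = 4 + 0.5 * x2 + rng.normal(0, 1, 50)  # different regime (discontinuity)

F_same = chow_F(X1, y_same1, X2, y_same2)      # should be small
F_diff = chow_F(X1, y_same1, X2, y_diff2)      # should be large
```

The statistic is compared against an F(k, n1+n2-2k) reference distribution; a large value rejects poolability, i.e., indicates a discontinuity.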
Comparison of Some Estimation Methods in Linear Regression
İlkay Altındağ; Ümran M. Tekşen; Aşır Genç
2010-01-01
In this study, we review some methods that serve as alternatives to the classical least squares method used for simple and multiple linear regression analysis. In short, the linear regression model is written in matrix form as Y = Xβ + ε, where Y is the vector of the dependent variable, X is the design matrix of independent variables, β is the parameter vector, and ε is the vector of error terms; the least squares estimator of the linear regression is then given by β̂ = (XᵀX)⁻¹XᵀY...
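The least squares estimator β̂ = (XᵀX)⁻¹XᵀY can be checked directly in numpy on synthetic data (using solve() rather than forming the inverse explicitly, which is numerically preferable):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100
# Design matrix with an intercept column and three random regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
beta_true = np.array([2.0, 1.0, -0.5, 3.0])
Y = X @ beta_true + rng.normal(0, 0.1, n)       # small-noise observations

# beta_hat = (X'X)^{-1} X'Y, computed by solving the normal equations
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# The same estimate via the library least squares routine
beta_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

With low noise, both routes recover the generating coefficients to within sampling error.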
International Nuclear Information System (INIS)
The mixed dissociation constants of four drug acids - losartan, paracetamol, phenylephrine and quinine - at various ionic strengths I in the range 0.01-1.0 and at temperatures of 25 and 37 deg. C were determined using SPECFIT32 and SQUAD(84) regression analysis of the pH-spectrophotometric titration data. A proposed strategy of efficient experimentation in a dissociation constants determination, followed by a computational strategy for the chemical model with a dissociation constants determination, is presented on the protonation equilibria of losartan. Indices of precise methods predict the correct number of components, and even the presence of minor ones when the data quality is high and the instrumental error is known. Improved identification of the number of species uses the second or third derivative function for some indices, namely when the number of species in the mixture is higher than 3 and when, due to large variations in the indicator values even at logarithmic scale, the indicator curve does not reach an obvious point where the slope changes. The thermodynamic dissociation constant pKaT was estimated by nonlinear regression of {pKa, I} data at 25 and 37 deg. C: for losartan pKa,1T=3.63(1) and 3.57(3), pKa,2T=4.84(1) and 4.80(3), for paracetamol pKa,1T=9.78(1) and 9.65(1), for phenylephrine pKa,1T=9.17(1) and 8.95(1), pKa,2T=10.45(1) and 10.22(1), for quinine pKa,1T=4.25(1) and 4.12(1), pKa,2T=8.72(1) and 8.46(2). Goodness-of-fit tests for various regression diagnostics enabled the reliability of the parameter estimates to be found.
Energy Technology Data Exchange (ETDEWEB)
Meloun, Milan [Department of Analytical Chemistry, University of Pardubice, 53210 Pardubice (Czech Republic)]. E-mail: milan.meloun@upce.cz; Syrovy, Tomas [Department of Analytical Chemistry, University of Pardubice, 53210 Pardubice (Czech Republic)]. E-mail: tomas.syrovy@upce.cz; Vrana, Ales [IVAX Pharmaceuticals, s.r.o. 74770 Opava (Czech Republic)]. E-mail: ales_vrana@ivax-cr.com
2005-03-21
The mixed dissociation constants of four drug acids - losartan, paracetamol, phenylephrine and quinine - at various ionic strengths I in the range 0.01-1.0 and at temperatures of 25 and 37 deg. C were determined using SPECFIT32 and SQUAD(84) regression analysis of the pH-spectrophotometric titration data. A proposed strategy of efficient experimentation in a dissociation constants determination, followed by a computational strategy for the chemical model with a dissociation constants determination, is presented on the protonation equilibria of losartan. Indices of precise methods predict the correct number of components, and even the presence of minor ones when the data quality is high and the instrumental error is known. Improved identification of the number of species uses the second or third derivative function for some indices, namely when the number of species in the mixture is higher than 3 and when, due to large variations in the indicator values even at logarithmic scale, the indicator curve does not reach an obvious point where the slope changes. The thermodynamic dissociation constant pKaT was estimated by nonlinear regression of {pKa, I} data at 25 and 37 deg. C: for losartan pKa,1T=3.63(1) and 3.57(3), pKa,2T=4.84(1) and 4.80(3), for paracetamol pKa,1T=9.78(1) and 9.65(1), for phenylephrine pKa,1T=9.17(1) and 8.95(1), pKa,2T=10.45(1) and 10.22(1), for quinine pKa,1T=4.25(1) and 4.12(1), pKa,2T=8.72(1) and 8.46(2). Goodness-of-fit tests for various regression diagnostics enabled the reliability of the parameter estimates to be found.
Multiplex: Analysis of Multiple Social Networks with Algebra
DEFF Research Database (Denmark)
Algebraic procedures for the analysis of multiple social networks. Among other things, it is possible to create and manipulate multivariate network data in different formats, and effective methods are available to treat multiple networks with routines that combine algebraic systems, such as the partially ordered semigroup or the semiring structure, together with the relational bundles occurring in different types of multivariate network data sets. An algebraic approach to two-mode networks is also provided through Galois derivations between families of the pairs of subsets.
Comparative analysis and visualization of multiple collinear genomes
Directory of Open Access Journals (Sweden)
Wang Jeremy R
2012-03-01
Full Text Available Abstract Background Genome browsers are a common tool used by biologists to visualize genomic features including genes, polymorphisms, and many others. However, existing genome browsers and visualization tools are not well-suited to perform meaningful comparative analysis among a large number of genomes. With the increasing quantity and availability of genomic data, there is an increased burden to provide useful visualization and analysis tools for comparison of multiple collinear genomes such as the large panels of model organisms which are the basis for much of the current genetic research. Results We have developed a novel web-based tool for visualizing and analyzing multiple collinear genomes. Our tool illustrates genome-sequence similarity through a mosaic of intervals representing local phylogeny, subspecific origin, and haplotype identity. Comparative analysis is facilitated through reordering and clustering of tracks, which can vary throughout the genome. In addition, we provide local phylogenetic trees as an alternate visualization to assess local variations. Conclusions Unlike previous genome browsers and viewers, ours allows for simultaneous and comparative analysis. Our browser provides intuitive selection and interactive navigation about features of interest. Dynamic visualizations adjust to scale and data content making analysis at variable resolutions and of multiple data sets more informative. We demonstrate our genome browser for an extensive set of genomic data sets composed of almost 200 distinct mouse laboratory strains.
Directory of Open Access Journals (Sweden)
Majid Mohammadhosseini
2014-05-01
Full Text Available A reliable quantitative structure retention relationship (QSRR) study has been evaluated to predict the retention indices (RIs) of a broad spectrum of compounds, namely 118 non-linear, cyclic and heterocyclic terpenoids (both saturated and unsaturated), on an HP-5MS fused silica column. A principal component analysis showed that seven compounds lay outside the main cluster. After elimination of the outliers, the data set was divided into training and test sets of 80 and 28 compounds. The method was tested by applying the particle swarm optimization (PSO) method to find the most effective molecular descriptors, followed by multiple linear regression (MLR). The PSO-MLR model was further confirmed through "leave one out cross validation" (LOO-CV) and "leave group out cross validation" (LGO-CV), as well as external validations. The promising statistical figures of merit associated with the proposed model (R2train=0.936, Q2LOO=0.928, Q2LGO=0.921, F=376.4) confirm its high ability to predict RIs with negligible relative errors of prediction (REP train=4.8%, REP test=6.0%).
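The leave-one-out cross-validated Q2 reported above can be sketched for a plain MLR model. The molecular descriptors below are synthetic placeholders for the PSO-selected ones, and the coefficients are arbitrary:

```python
import numpy as np

def q2_loo(X, y):
    """Leave-one-out cross-validated Q^2 for a linear regression model."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        b, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        press += (y[i] - X[i] @ b) ** 2     # prediction error on held-out point
    return 1 - press / ((y - y.mean()) ** 2).sum()

rng = np.random.default_rng(8)
n = 40
# Hypothetical molecular descriptors (illustrative, not the paper's set)
D = rng.normal(size=(n, 3))
ri = 1200 + 80 * D[:, 0] - 45 * D[:, 1] + 20 * D[:, 2] + rng.normal(0, 10, n)
X = np.column_stack([np.ones(n), D])
q2 = float(q2_loo(X, ri))
```

Q2 close to the training R2 indicates the model is not overfitting; a large gap between the two is a warning sign.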
Blood gas tensions in adult asthma : a systematic review and meta-regression analysis
DEFF Research Database (Denmark)
Johansen, Troels; Johansen, Peter
2014-01-01
Abstract Objective: The last half-century has seen substantial changes in asthma treatment and care. We investigated whether arterial blood gas parameters in acute and non-acute asthma have changed historically. Methods: We performed a systematic search of the literature for studies reporting [Formula: see text], [Formula: see text] and forced expiratory volume in 1 s, percentage of predicted (FEV1%). For each of the blood gas parameters, meta-regression analyses examined its association with four background variables: the publication year, mean FEV1%, mean age and female fraction in the respective studies. Results: After screening, we included 43 articles comprising 61 datasets published between 1967 and 2013. In studies of habitual-state asthma, mean [Formula: see text] was positively associated with the publication year (p = 0.001) and negatively with mean age (p
A non-linear regression method for CT brain perfusion analysis
Bennink, E.; Oosterbroek, J.; Viergever, M. A.; Velthuis, B. K.; de Jong, H. W. A. M.
2015-03-01
CT perfusion (CTP) imaging allows for rapid diagnosis of ischemic stroke. Generation of perfusion maps from CTP data usually involves deconvolution algorithms providing estimates for the impulse response function in the tissue. We propose the use of a fast non-linear regression (NLR) method that we postulate has similar performance to the current academic state-of-the-art method (bSVD), but that has some important advantages, including the estimation of vascular permeability, improved robustness to tracer delay, and very few tuning parameters, all of which are important in stroke assessment. The aim of this study is to evaluate the fast NLR method against bSVD and a commercial clinical state-of-the-art method. The three methods were tested against a published digital perfusion phantom earlier used to illustrate the superiority of bSVD. In addition, the NLR and clinical methods were also tested against bSVD on 20 clinical scans. Pearson correlation coefficients were calculated for each of the tested methods. All three methods showed high correlation coefficients (>0.9) with the ground truth in the phantom. With respect to the clinical scans, the NLR perfusion maps showed higher correlation with bSVD than the perfusion maps from the clinical method. Furthermore, the perfusion maps showed that the fast NLR estimates are robust to tracer delay. In conclusion, the proposed fast NLR method provides a simple and flexible way of estimating perfusion parameters from CT perfusion scans, with high correlation coefficients. This suggests that it could be a better alternative to the current clinical and academic state-of-the-art methods.
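Non-linear regression of a perfusion-like curve can be sketched with a gamma-variate bolus model fitted by a coarse grid search with the amplitude solved in closed form. The model form, parameter values and grid are illustrative assumptions, not the NLR method's actual impulse-response model:

```python
import numpy as np

t = np.linspace(0, 40, 200)          # time axis, seconds (illustrative)

def gamma_variate(t, alpha, beta):
    """Gamma-variate shape commonly used for contrast bolus curves."""
    return t ** alpha * np.exp(-t / beta)

rng = np.random.default_rng(9)
true_curve = 3.0 * gamma_variate(t, 2.0, 3.0)
obs = true_curve + 0.5 * rng.normal(size=t.size)   # noisy "measured" curve

best = (np.inf, None)
for alpha_c in np.arange(1.0, 3.01, 0.1):
    for beta_c in np.arange(1.0, 5.01, 0.1):
        g = gamma_variate(t, alpha_c, beta_c)
        A_c = (g @ obs) / (g @ g)        # amplitude has a closed-form solution
        sse = ((obs - A_c * g) ** 2).sum()
        if sse < best[0]:
            best = (sse, (A_c, alpha_c, beta_c))
sse, (A, alpha, beta) = best
```

Real NLR implementations would use a proper non-linear optimizer rather than a grid, but the structure, a parametric curve fitted directly to the data, is the same.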
Geneletti, Sara; O'Keeffe, Aidan G; Sharples, Linda D; Richardson, Sylvia; Baio, Gianluca
2015-07-10
The regression discontinuity (RD) design is a quasi-experimental design that estimates the causal effects of a treatment by exploiting naturally occurring treatment rules. It can be applied in any context where a particular treatment or intervention is administered according to a pre-specified rule linked to a continuous variable. Such thresholds are common in primary care drug prescription, where the RD design can be used to estimate the causal effect of medication in the general population. Such results can then be contrasted to those obtained from randomised controlled trials (RCTs) and inform prescription policy and guidelines based on a more realistic and less expensive context. In this paper, we focus on statins, a class of cholesterol-lowering drugs; however, the methodology can be applied to many other drugs provided these are prescribed in accordance with pre-determined guidelines. Current guidelines in the UK state that statins should be prescribed to patients with 10-year cardiovascular disease risk scores in excess of 20%. If we consider patients whose risk scores are close to the 20% risk score threshold, we find that there is an element of random variation in both the risk score itself and its measurement. We can therefore consider the threshold as a randomising device that assigns statin prescription to individuals just above the threshold and withholds it from those just below. Thus, we are effectively replicating the conditions of an RCT in the area around the threshold, removing or at least mitigating confounding. We frame the RD design in the language of conditional independence, which clarifies the assumptions necessary to apply an RD design to data, and which makes the links with instrumental variables clear. We also have context-specific knowledge about the expected sizes of the effects of statin prescription and are thus able to incorporate this into Bayesian models by formulating informative priors on our causal parameters. © 2015 The Authors. 
Statistics in Medicine Published by John Wiley & Sons Ltd. PMID:25809691
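The basic RD estimate, local linear fits on either side of the threshold with the treatment effect read off as the difference of the fitted values at the cutoff, can be sketched on synthetic prescription data. The outcome model, effect size and bandwidth below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(10)
n = 2000
risk = rng.uniform(0.05, 0.35, n)      # 10-year CVD risk score
c = 0.20                               # prescription threshold (UK guideline)
treated = risk > c                     # deterministic assignment rule
# Hypothetical outcome: LDL cholesterol, lowered by 1.0 mmol/L when treated
ldl = 4.0 + 2.0 * risk - 1.0 * treated + 0.3 * rng.normal(size=n)

h = 0.05                               # bandwidth around the threshold

def fitted_at_cutoff(side):
    """Local linear fit within the bandwidth; intercept = value at the cutoff."""
    m = side & (np.abs(risk - c) < h)
    X = np.column_stack([np.ones(m.sum()), risk[m] - c])
    b, *_ = np.linalg.lstsq(X, ldl[m], rcond=None)
    return b[0]

effect = fitted_at_cutoff(risk > c) - fitted_at_cutoff(risk <= c)
```

The jump at the cutoff estimates the local treatment effect, mimicking an RCT among patients near the threshold.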
Directory of Open Access Journals (Sweden)
Vasileios A. Tzanakakis
2014-12-01
Full Text Available Partial Least Squares Regression (PLSR) can integrate a great number of variables and overcome collinearity problems, a fact that makes it suitable for intensive agronomic practices such as land application. In the present study a PLSR model was developed to predict important management goals, including biomass production and nutrient recovery (i.e., nitrogen and phosphorus), associated with treatment potential, environmental impacts, and economic benefits. Effluent loading and a considerable number of soil parameters commonly monitored in effluent-irrigated lands were considered as potential predictor variables during model development. All data were derived from a three-year field trial including plantations of four different plant species (Acacia cyanophylla, Eucalyptus camaldulensis, Populus nigra, and Arundo donax) irrigated with pre-treated domestic effluent. The PLSR method was very effective despite the small sample size and the wide nature of the data set (with many highly correlated inputs and several highly correlated responses). Through the PLSR method the number of initial predictor variables was reduced and only a few variables remained in the final PLSR model. The important input variables retained were: effluent loading, electrical conductivity (EC), available phosphorus (Olsen-P), Na+, Ca2+, Mg2+, K+, SAR, and NO3-N. Among these variables, effluent loading, EC, and nitrates had the greatest contribution to the final PLSR model. PLSR is highly compatible with intensive agronomic practices such as land application, in which a large number of highly collinear and noisy input variables is monitored to assess plant species performance and to detect impacts on the environment.
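Single-response PLSR (the NIPALS algorithm) can be sketched in a few lines of numpy. The collinear predictors below are synthetic, standing in for the highly correlated soil variables, not the field-trial data:

```python
import numpy as np

def pls1(X, y, n_comp=2):
    """PLS1 regression via NIPALS; returns (coef, x_mean, y_mean)."""
    xm, ym = X.mean(0), y.mean()
    Xr, yr = X - xm, y - ym
    W, P, Q = [], [], []
    for _ in range(n_comp):
        w = Xr.T @ yr                    # weight vector from X-y covariance
        w /= np.linalg.norm(w)
        t = Xr @ w                       # scores
        tt = t @ t
        p = Xr.T @ t / tt                # X loadings
        q = (yr @ t) / tt                # y loading
        Xr = Xr - np.outer(t, p)         # deflate X
        yr = yr - q * t                  # deflate y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    coef = W @ np.linalg.solve(P.T @ W, Q)
    return coef, xm, ym

rng = np.random.default_rng(11)
n = 50
z = rng.normal(size=(n, 2))              # two latent factors
# Six strongly collinear predictors built from the two latent factors
X = np.column_stack([z[:, 0], 0.9 * z[:, 0], 1.1 * z[:, 0],
                     z[:, 1], 0.8 * z[:, 1], 1.2 * z[:, 1]])
X += 0.01 * rng.normal(size=(n, 6))
y = 2.0 * z[:, 0] - 1.0 * z[:, 1] + 0.05 * rng.normal(size=n)

coef, xm, ym = pls1(X, y, n_comp=2)
yhat = (X - xm) @ coef + ym
r2 = float(1 - ((y - yhat) ** 2).sum() / ((y - y.mean()) ** 2).sum())
```

Two latent components suffice here because the six predictors carry only two independent directions of variation, which is exactly the collinear situation PLSR is designed for.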
Proportional hazards regression for the analysis of clustered survival data from case-cohort studies
Zhang, Hui; Schaubel, Douglas E; Kalbfleisch, John D.
2011-01-01
Case-cohort sampling is a commonly used and efficient method for studying large cohorts. Most existing methods of analysis for case-cohort data have concerned the analysis of univariate failure time data. However, clustered failure time data are commonly encountered in public health studies. For example, patients treated at the same center are unlikely to be independent. In this article, we consider methods based on estimating equations for case-cohort designs for clustered failure time data....
HARMONIC ANALYSIS OF SVPWM INVERTER USING MULTIPLE-PULSES METHOD
Directory of Open Access Journals (Sweden)
Mehmet YUMURTACI
2009-01-01
Full Text Available Space Vector Modulation (SVM) is a popular and important PWM technique for three-phase voltage source inverters in the control of induction motors. In this study, harmonic analysis of Space Vector PWM (SVPWM) is carried out using the multiple-pulses method. The multiple-pulses method calculates the Fourier coefficients of the individual positive and negative pulses of the output PWM waveform and adds them together using the principle of superposition to obtain the Fourier coefficients of the whole PWM output signal. Harmonic magnitudes can be calculated directly by this method without linearization, look-up tables or Bessel functions. In this study, the results obtained in the application of SVPWM for various values of the variable parameters are compared with the results obtained with the multiple-pulses method.
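The superposition idea can be illustrated directly: each rectangular pulse has closed-form Fourier coefficients, and the waveform's harmonics are simply the sums of the per-pulse coefficients. The pulse pattern below is an arbitrary PWM-like example, not an actual SVPWM switching pattern:

```python
import numpy as np

T = 1.0   # fundamental period

def pulse_coeffs(n, A, t1, t2):
    """Closed-form Fourier coefficients (a_n, b_n) of one rectangular pulse."""
    w = 2 * np.pi * n / T
    a = A / (np.pi * n) * (np.sin(w * t2) - np.sin(w * t1))
    b = A / (np.pi * n) * (np.cos(w * t1) - np.cos(w * t2))
    return a, b

# An arbitrary PWM-like pattern: (amplitude, start, end) within one period
pulses = [(1.0, 0.05, 0.20), (1.0, 0.30, 0.45),
          (-1.0, 0.55, 0.70), (-1.0, 0.80, 0.95)]

def harmonic(n):
    """Magnitude of harmonic n by superposing the per-pulse coefficients."""
    a = sum(pulse_coeffs(n, *p)[0] for p in pulses)
    b = sum(pulse_coeffs(n, *p)[1] for p in pulses)
    return float(np.hypot(a, b))

# Cross-check the fundamental against direct numerical integration
t = np.linspace(0.0, T, 20000, endpoint=False)
wave = np.zeros_like(t)
for A, t1, t2 in pulses:
    wave[(t >= t1) & (t < t2)] += A
a1 = 2.0 * float(np.mean(wave * np.cos(2 * np.pi * t / T)))
b1 = 2.0 * float(np.mean(wave * np.sin(2 * np.pi * t / T)))
```

The analytic superposition and the brute-force integral agree, which is the point of the method: no linearization, tables or Bessel functions are needed.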
Directory of Open Access Journals (Sweden)
Igor K. Kochanenko
2013-01-01
Full Text Available Procedures for constructing a regression curve by the criterion of least fractals, i.e., the greatest probability of the sums of powers of the least deviations of measured intensities from their model values, are substantiated. The exponent is defined as the fractal dimension of the time series. The difference between the results of the proposed method and the method of least squares is estimated quantitatively.
Genetic analysis of longevity in Dutch dairy cattle using random regression.
van Pelt, M L; Meuwissen, T H E; de Jong, G; Veerkamp, R F
2015-06-01
Longevity, productive life, or lifespan of dairy cattle is an important trait for dairy farmers, and it is defined as the time from first calving to the last test date for milk production. Methods for genetic evaluations need to account for censored data; that is, records from cows that are still alive. The aim of this study was to investigate whether these methods also need to take account of survival being genetically a different trait across the entire lifespan of a cow. The data set comprised 112,000 cows with a total of 3,964,449 observations for survival per month from first calving until 72 mo in productive life. A random regression model with second-order Legendre polynomials was fitted for the additive genetic effect. Alternative parameterizations were (1) different trait definitions for the length of time interval for survival after first calving (1, 3, 6, and 12 mo); (2) a linear or threshold model; and (3) differing orders of the Legendre polynomial. The partial derivatives of a profit function were used to transform variance components on the survival scale to those for lifespan. Survival rates were higher in early life than later in life (99 vs. 95%). When survival was defined over 12-mo intervals, survival curves were smooth compared with curves based on 1-, 3-, or 6-mo intervals. Heritabilities in each interval were very low, ranging from 0.002 to 0.031, but the heritability for lifespan over the entire period of 72 mo after first calving ranged from 0.115 to 0.149. Genetic correlations between time intervals ranged from 0.25 to 1.00. Genetic parameters and breeding values for the genetic effect were more sensitive to the trait definition than to whether a linear or threshold model was used or to the order of Legendre polynomial used.
Cumulative survival up to the first 6 mo predicted lifespan with an accuracy of only 0.79 to 0.85; that is, the reliability of a breeding value based on many daughters in the first 6 mo can be, at most, 0.62 to 0.72, and changes in breeding values are still expected as daughters get older. Therefore, an improved model for genetic evaluation should treat survival as different traits during the lifespan by splitting lifespan into time intervals of 6 mo or less, to avoid overestimated reliabilities and changes in breeding values as daughters get older. PMID:25892695
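A second-order Legendre polynomial basis of the kind fitted for the additive genetic effect can be evaluated as below; the 0-72 mo standardization follows the abstract, while the helper name is an illustrative assumption:

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(months, order=2, t_min=0.0, t_max=72.0):
    """Evaluate Legendre polynomials P_0..P_order at time points
    standardized to [-1, 1], the covariate basis commonly used for
    random regression on time since first calving."""
    x = 2.0 * (np.asarray(months, float) - t_min) / (t_max - t_min) - 1.0
    # column j holds P_j(x); legval with coefficient vector e_j gives P_j
    return np.stack(
        [legendre.legval(x, [0.0] * j + [1.0]) for j in range(order + 1)],
        axis=1,
    )

Z = legendre_basis([0, 18, 36, 54, 72], order=2)
print(Z.round(3))
```

Each row of `Z` multiplies an animal's three random regression coefficients to give its genetic effect on survival at that month, so one covariance matrix of the coefficients describes the whole trajectory.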
Directory of Open Access Journals (Sweden)
Mobley Lee R
2010-09-01
Full Text Available Abstract Background Colorectal cancer (CRC) is the second leading cause of cancer death in the United States, and endoscopic screening can both detect and prevent cancer, but utilization is suboptimal and varies across geographic regions. We use multilevel regression to examine the various predictors of individuals' decisions to utilize endoscopic CRC screening. Study subjects are a 100% population cohort of Medicare beneficiaries identified in 2001 and followed through 2005. The outcome variable is a binary indicator of any sigmoidoscopy or colonoscopy use over this period. We analyze each state separately and map the findings for all states together to reveal patterns in the observed heterogeneity across states. Results We estimate a fully adjusted model for each state, based on a comprehensive socio-ecological model. We focus the discussion on the independent contributions of each of three community contextual variables that are amenable to policy intervention. Prevalence of Medicare managed care in one's neighborhood was associated with lower probability of screening in 12 states and higher probability in 19 states. Prevalence of poor English language ability among elders in one's neighborhood was associated with lower probability of screening in 15 states and higher probability in 6 states. Prevalence of poverty in one's neighborhood was associated with lower probability of screening in 36 states and higher probability in 5 states. Conclusions There are considerable differences across states in the socio-ecological context of CRC screening by endoscopy, suggesting that the current decentralized configuration of state-specific comprehensive cancer control programs is well suited to respond to the observed heterogeneity. We find that interventions to mediate language barriers are more critically needed in some states than in others. 
Medicare managed care penetration, hypothesized to affect information about and diffusion of new endoscopic technologies, has a positive association in only a minority of states. This suggests that managed care plans' promotion of this cost-increasing technology has been rather limited. Area poverty has a negative impact in the vast majority of states, but is positive in five states, suggesting there are some effective cancer control policies in place targeting the poor with supplemental resources promoting CRC screening.
Archival Bone Marrow Samples: Suitable for Multiple Biomarker Analysis?
DEFF Research Database (Denmark)
Lund, Bendik; Najmi, A. Laeya
2015-01-01
Archival samples represent a significant potential for genetic studies, particularly in severe diseases with risk of lethal outcome, such as cancer. In this pilot study, we aimed to evaluate the usability of archival bone marrow smears and biopsies for DNA extraction and purification, whole genome amplification (WGA), multiple marker analysis including 10 short tandem repeats, and finally a comprehensive genotyping of 33,683 single nucleotide polymorphisms (SNPs) with multiplexed targeted next-generation sequencing. A total of 73 samples from 21 bone marrow smears and 13 bone marrow biopsies from 18 Danish and Norwegian childhood acute lymphoblastic leukemia patients were included and compared with corresponding blood samples. Samples were grouped according to sample age and whether WGA was performed. We found that measurements of DNA concentration after DNA extraction were dependent on the detection method, and that spectrophotometry overestimated the DNA amount compared with fluorometry. In the short tandem repeat analysis, the detection rate dropped slightly with longer fragments. After WGA, this drop was more pronounced. Samples stored for 0 to 3 years showed better results than samples stored for 4 to 10 years. Acceptable call rates for SNPs were detected for 7 of 42 archival samples. In conclusion, archival bone marrow samples are suitable for DNA extraction and multiple marker analysis, but WGA was less successful, especially when longer fragments were analyzed. Multiple SNP analysis seems feasible, but the method has to be further optimized.
Modular risk analysis for assessing multiple waste sites
International Nuclear Information System (INIS)
Human-health impacts, especially to the surrounding public, are extremely difficult to assess at installations that contain multiple waste sites and a variety of mixed-waste constituents (e.g., organic, inorganic, and radioactive). These assessments must address different constituents, multiple waste sites, multiple release patterns, different transport pathways (i.e., groundwater, surface water, air, and overland soil), different receptor types and locations, various times of interest, population distributions, land-use patterns, baseline assessments, a variety of exposure scenarios, etc. Although the process is complex, two of the most important difficulties to overcome are associated with (1) establishing an approach that allows for modifying the source term, transport, or exposure component as an individual module without having to re-evaluate the entire installation-wide assessment (i.e., all modules simultaneously), and (2) displaying and communicating the results in an understandable and usable manner to interested parties. An integrated, physics-based, compartmentalized approach, which is coupled to a Geographical Information System (GIS), captures the regional health impacts associated with multiple waste sites (e.g., hundreds to thousands of waste sites) at locations within and surrounding the installation. Utilizing a modular/GIS-based approach overcomes difficulties in (1) analyzing a wide variety of scenarios for multiple waste sites, and (2) communicating results from a complex human-health-impact analysis by capturing the essence of the assessment in a relatively elegant manner, so the meaning of the results can be quickly conveyed to all who review them.
Individual patient data meta-analysis of survival data using Poisson regression models
Crowther Michael J; Riley Richard D; Staessen Jan A; Wang Jiguang; Gueyffier Francois; Lambert Paul C
2012-01-01
Abstract Background An Individual Patient Data (IPD) meta-analysis is often considered the gold-standard for synthesising survival data from clinical trials. An IPD meta-analysis can be achieved by either a two-stage or a one-stage approach, depending on whether the trials are analysed separately or simultaneously. A range of one-stage hierarchical Cox models have been previously proposed, but these are known to be computationally intensive and are not currently available in all standard stat...
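The Poisson-regression route to survival analysis rests on splitting follow-up into intervals and modeling event counts with a log person-time offset; a minimal numpy sketch (synthetic exponential data, constant hazard, not the trial data) shows the core bookkeeping:

```python
import numpy as np

# Piecewise-exponential view of survival data: split each subject's
# follow-up into intervals and treat event counts as Poisson with a
# log person-time offset. With a single constant-hazard parameter the
# Poisson MLE reduces to total events / total person-time, matching
# the exponential survival model.
rng = np.random.default_rng(1)
true_rate = 0.2
times = rng.exponential(1 / true_rate, size=5000)
censor = 8.0                                # administrative censoring
observed = np.minimum(times, censor)
event = (times <= censor).astype(int)

cuts = np.array([0.0, 2.0, 4.0, 8.0])       # interval boundaries
d = np.zeros(3)                             # events per interval
pt = np.zeros(3)                            # person-time per interval
for j in range(3):
    lo, hi = cuts[j], cuts[j + 1]
    pt[j] = np.clip(observed - lo, 0, hi - lo).sum()
    d[j] = ((event == 1) & (observed > lo) & (observed <= hi)).sum()

rate_hat = d.sum() / pt.sum()               # constant-hazard MLE
rates_by_interval = d / pt                  # piecewise-constant hazards
print(round(rate_hat, 4), rates_by_interval.round(4))
```

In a full one-stage IPD meta-analysis the same split data would carry trial and treatment covariates, and the Poisson GLM replaces the computationally heavy hierarchical Cox fit.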
Study on reactor system identification with multivariate auto-regressive analysis
International Nuclear Information System (INIS)
The effects on MAR model parameters of sampling frequency, cut-off frequency, record length and correlation estimate techniques are studied using the stationary part of the benchmark data. The effectiveness of MAR for system identification, including signal transmission path analysis, is investigated. (Author)
Dalton, Starrette
The degree of nonorthogonality in a factorial design was systematically increased. Five methods of dealing with nonorthogonality were selected and applied: two were least squares solutions (Method 1 and Method 2); two were approximate solutions (the unweighted means analysis and the method of expected frequencies); and the fifth was the…
UNIPALS: SOFTWARE FOR PRINCIPAL COMPONENTS ANALYSIS AND PARTIAL LEAST SQUARES REGRESSION
Software for the analysis of multivariate chemical data by principal components and partial least squares methods is included on disk. The methods extract latent variables from the chemical data with the UNIversal PArtial Least Squares or UNIPALS algorithm. The software is written ...
Scientific Electronic Library Online (English)
Glauco Henrique, de Sousa Mendes; Gilberto, Miller Devós Ganga.
2013-11-01
Full Text Available Critical success factors in new product development (NPD) in Brazilian small and medium enterprises (SMEs) are identified and analyzed. Critical success factors are best practices that can be used to improve NPD management and performance in a company. However, these factors are traditionally identified through survey methods, and the collected data are then reduced through traditional multivariate analysis. The objective of this work is to develop a logistic regression model for predicting the success or failure of new product development. This model allows for an evaluation and prioritization of resource commitments. The results will be helpful for guiding management actions, as one way to improve NPD performance in those industries.
Directory of Open Access Journals (Sweden)
Glauco H.S. Mendes
2013-09-01
Full Text Available Critical success factors in new product development (NPD) in Brazilian small and medium enterprises (SMEs) are identified and analyzed. Critical success factors are best practices that can be used to improve NPD management and performance in a company. However, these factors are traditionally identified through survey methods, and the collected data are then reduced through traditional multivariate analysis. The objective of this work is to develop a logistic regression model for predicting the success or failure of new product development. This model allows for an evaluation and prioritization of resource commitments. The results will be helpful for guiding management actions, as one way to improve NPD performance in those industries.
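A logistic model for predicting NPD success or failure of the kind described can be sketched with plain gradient-ascent fitting; the factor names and synthetic data are illustrative assumptions, not the Brazilian survey data:

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Plain gradient-ascent logistic regression (no regularization):
    models P(success = 1) as sigmoid(b0 + X @ b)."""
    Xb = np.hstack([np.ones((len(X), 1)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        beta += lr * Xb.T @ (y - p) / len(y)   # log-likelihood gradient
    return beta

# Hypothetical NPD data: two critical-success-factor scores
# (say, market orientation and top-management support) and a 0/1 outcome.
rng = np.random.default_rng(2)
n = 400
scores = rng.normal(size=(n, 2))
logit = -0.5 + 1.5 * scores[:, 0] + 1.0 * scores[:, 1]
success = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

beta = fit_logistic(scores, success)
p_hat = 1 / (1 + np.exp(-(beta[0] + scores @ beta[1:])))
acc = ((p_hat > 0.5) == success).mean()
print(beta.round(2), round(acc, 3))
```

The fitted coefficients rank the factors by their contribution to the success odds, which is what makes the model usable for prioritizing resource commitments.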
Directory of Open Access Journals (Sweden)
Seyyed Ali Noorhosseini-Niyaki
2012-09-01
Full Text Available This study was carried out to identify technical-agronomic factors affected by fish farming in rice fields. The investigation was carried out by descriptive survey during July-August 2009. The studied cities, including Talesh, Rezvanshahr, and Masal, are located in the Tavalesh region near the Caspian Sea, north of Iran. The questionnaire validity and reliability were determined to enhance the dependability of the results. Data were collected from 184 respondents (61 adopters and 123 non-adopters) randomly sampled from selected villages and analyzed using logistic regression analysis. Results showed that there was a significant positive relationship (p<0.05) between biological control of pests in rice fields and fish farming in rice fields. Also, there was a significant negative relationship (p<0.10) between fish farming in rice fields and the variables of quantity of the pesticide Diazinon used in rice fields and number of plows in rice fields.
Directory of Open Access Journals (Sweden)
Koyin Chang
2013-09-01
Full Text Available To understand the impact of drinking and driving laws on drinking and driving fatality rates, this study explored the different effects these laws have on areas with varying severity rates for drinking and driving. Unlike previous studies, this study employed quantile regression analysis. Empirical results showed that policies based on local conditions must be used to effectively reduce drinking and driving fatality rates; that is, different measures should be adopted to target the specific conditions in various regions. For areas with low fatality rates (low quantiles), people’s habits and attitudes toward alcohol should be emphasized instead of transportation safety laws, because “preemptive regulations” are more effective. For areas with high fatality rates (high quantiles), “ex-post regulations” are more effective, and impact these areas approximately 0.01% to 0.05% more than they do areas with low fatality rates.
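Quantile regression minimizes the pinball (check) loss rather than squared error, which is what lets it target high-fatality (upper-quantile) areas separately from the mean; a small sketch with synthetic data shows the loss and why the tau-quantile minimizes it:

```python
import numpy as np

def pinball_loss(y, q_hat, tau):
    """Check loss minimized by quantile regression: residuals below the
    fitted value are weighted (1 - tau), those above are weighted tau."""
    r = y - q_hat
    return np.mean(np.maximum(tau * r, (tau - 1) * r))

# Skewed synthetic "fatality rate" sample (stand-in, not the study data).
rng = np.random.default_rng(3)
y = rng.lognormal(mean=0.0, sigma=0.8, size=10000)

tau = 0.9
q90 = np.quantile(y, tau)            # intercept-only quantile "fit"
loss_q = pinball_loss(y, q90, tau)
loss_mean = pinball_loss(y, y.mean(), tau)
print(round(q90, 3), round(loss_q, 4), round(loss_mean, 4))
```

Adding covariates (as in the study) simply replaces the constant `q90` with a linear predictor, fitted by minimizing the same loss, so each quantile gets its own coefficient vector.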
Multiple sclerosis medical image analysis and information management.
Liu, Lifeng; Meier, Dominik; Polgar-Turcsanyi, Mariann; Karkocha, Pawel; Bakshi, Rohit; Guttmann, Charles R G
2005-01-01
Magnetic resonance imaging (MRI) has become a central tool for patient management, as well as research, in multiple sclerosis (MS). Measurements of disease burden and activity derived from MRI through quantitative image analysis techniques are increasingly being used. There are many complexities and challenges in building computerized processing pipelines to ensure efficiency, reproducibility, and quality control for MRI scans from MS patients. Such paradigms require advanced image processing and analysis technologies, as well as integrated database management systems to ensure the most utility for clinical and research purposes. This article reviews pipelines available for quantitative clinical MRI research in MS, including image segmentation, registration, time-series analysis, performance validation, visualization techniques, and advanced medical imaging software packages. To address the complex demands of the sequential processes, the authors developed a workflow management system that uses a centralized database and distributed computing system for image processing and analysis. The implementation of their system includes a web-form-based Oracle database application for information management and event dispatching, and multiple modules for image processing and analysis. The seamless integration of processing pipelines with the database makes it more efficient for users to navigate complex, multistep analysis protocols, reduces the user's learning curve, reduces the time needed for combining and activating different computing modules, and allows for close monitoring for quality-control purposes. The authors' system can be extended to general applications in clinical trials and to routine processing for image-based clinical research. PMID:16385023
Energy Technology Data Exchange (ETDEWEB)
Zhang, Yan-Feng; Dai, Shu-Gui [College of Environmental Science and Engineering, Nankai University, Key Laboratory for Pollution Process and Environmental Criteria of Ministry of Education, Tianjin (China); Ma, Yi [College of Chemistry, Nankai University, Institute of Elemento-Organic Chemistry, Tianjin (China); Gao, Zhi-Xian [Institute of Hygiene and Environmental Medicine, Tianjin (China)
2010-07-15
Immunoassays have been regarded as a possible alternative or supplement for measuring polycyclic aromatic hydrocarbons (PAHs) in the environment. Since there are too many potential cross-reactants for PAH immunoassays, it is difficult to determine all the cross-reactivities (CRs) by experimental tests. The relationship between CR and the physical-chemical properties of PAHs and related compounds was investigated using the CR data from a commercial enzyme-linked immunosorbent assay (ELISA) kit test. Two quantitative structure-activity relationship (QSAR) techniques, regression analysis and comparative molecular field analysis (CoMFA), were applied for predicting the CR of PAHs in this ELISA kit. Parabolic regression indicates that the CRs are significantly correlated with the logarithm of the partition coefficient for the octanol-water system (log K{sub ow}) (r{sup 2}=0.643, n=23, P<0.0001), suggesting that hydrophobic interactions play an important role in the antigen-antibody binding and the cross-reactions in this ELISA test. The CoMFA model obtained shows that the CRs of the PAHs are correlated with the 3D structure of the molecules (r{sub cv}{sup 2}=0.663, r{sup 2}=0.873, F{sub 4,32}=55.086). The contributions of the steric and electrostatic fields to CR were 40.4 and 59.6%, respectively. Both of the QSAR models satisfactorily predict the CR in this PAH immunoassay kit, and help in understanding the mechanisms of antigen-antibody interaction. (orig.)
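Parabolic regression of cross-reactivity on log Kow, as used in the QSAR analysis, is an ordinary degree-2 polynomial fit; here is a sketch with synthetic stand-in data (not the ELISA measurements), including the r-squared computation:

```python
import numpy as np

# Synthetic stand-in: a parabolic relationship between log Kow and
# log cross-reactivity, with the curvature and noise chosen arbitrarily.
rng = np.random.default_rng(4)
log_kow = rng.uniform(3.0, 7.0, size=23)
log_cr = -0.8 * (log_kow - 5.5) ** 2 + 2.0 + 0.1 * rng.normal(size=23)

coeffs = np.polyfit(log_kow, log_cr, deg=2)   # [a, b, c] of a*x^2+b*x+c
fitted = np.polyval(coeffs, log_kow)
ss_res = ((log_cr - fitted) ** 2).sum()
ss_tot = ((log_cr - log_cr.mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot
print(coeffs.round(3), round(r2, 3))
```

A significant negative quadratic term of this kind produces the inverted-parabola shape consistent with hydrophobic binding that peaks at an intermediate log Kow.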
Directory of Open Access Journals (Sweden)
CUI Yanping
2014-10-01
Full Text Available Objective: To analyze the prognostic factors in acute-on-chronic liver failure (ACLF) patients with hepatic encephalopathy (HE) and to explore the risk factors for prognosis. Methods: A retrospective analysis was performed on 106 ACLF patients with HE who were hospitalized in our hospital from January 2010 to July 2013. The patients were divided into an improved group and a deteriorated group. The univariate indicators, including age, sex, laboratory indicators [total bilirubin (TBil), albumin (Alb), alanine aminotransferase (ALT), aspartate aminotransferase (AST), and prothrombin time activity (PTA)], the stage of HE, complications [persistent hyponatremia, digestive tract bleeding, hepatorenal syndrome (HRS), ascites, infection, and spontaneous bacterial peritonitis (SBP)], and plasma exchange were analyzed by chi-square test or t-test. Indicators with statistical significance were subsequently analyzed by binary logistic regression. Results: Univariate analysis showed that ALT (P=0.009), PTA (P=0.043), the stage of HE (P=0.000), and HRS (P=0.003) were significantly different between the two groups, whereas differences in age, sex, TBil, Alb, AST, persistent hyponatremia, digestive tract bleeding, ascites, infection, SBP, and plasma exchange were not statistically significant (P>0.05). Binary logistic regression demonstrated that PTA (b=-0.097, P=0.025, OR=0.908), HRS (b=2.279, P=0.007, OR=9.764), and the stage of HE (b=1.873, P=0.000, OR=6.510) were prognostic factors in ACLF patients with HE. Conclusion: The stage of HE, HRS, and PTA are independent influential factors for the prognosis of ACLF patients with HE. Reduced PTA, advanced HE stage, and the presence of HRS indicate worse prognosis.
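In a binary logistic regression, each coefficient b converts to an odds ratio via OR = exp(b), so the reported ORs can be checked directly from the coefficients:

```python
import math

# Odds ratios recomputed from the logistic regression coefficients
# reported in the abstract (OR = exp(b)).
for name, b in [("PTA", -0.097), ("HRS", 2.279), ("HE stage", 1.873)]:
    print(name, round(math.exp(b), 3))
```

The recomputed values (about 0.908, 9.77, and 6.51) match the reported ORs of 0.908, 9.764, and 6.510 to within rounding of the coefficients.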
Directory of Open Access Journals (Sweden)
R Noori
2009-03-01
Full Text Available Background: Municipal solid waste (MSW) is the natural result of human activities. MSW generation modeling is of prime importance in designing and programming a municipal solid waste management system. This study tests the short-term prediction of waste generation by artificial neural network (ANN) and principal component-regression analysis. Methods: Two forecasting techniques are presented in this paper for prediction of waste generation (WG). One of them, multivariate linear regression (MLR), is based on principal component analysis (PCA). The other technique is an ANN model. For the ANN, a feed-forward multi-layer perceptron was considered the best choice for this study. After removing the problem of multicollinearity of the independent variables by PCA, an appropriate model (PCA-MLR) was developed for predicting WG. Results: The correlation coefficient (R) and average absolute relative error (AARE) of the ANN model were 0.837 and 4.4%, respectively. In comparison with the PCA-MLR model (R=0.445, AARE=6.6%), the ANN model gives better results. Threshold statistics were computed for both models in the testing stage: the maximum absolute relative error (ARE) for 50% of predictions is 3.7% for the ANN model but 6.2% for the PCA-MLR model, and the maximum ARE for 90% of predictions is about 8.6% for the ANN model but 10.5% for the PCA-MLR model. Conclusion: The ANN model gives better results than the PCA-MLR model and is therefore selected for prediction of WG in Tehran.
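The PCA-MLR idea — regressing on principal-component scores so that multicollinear predictors do not destabilize the fit — can be sketched as follows; the data are a synthetic stand-in, not the Tehran waste series:

```python
import numpy as np

def pca_mlr(X, y, n_pc):
    """Principal-component regression: project standardized predictors
    onto their top principal components (removing multicollinearity),
    then fit ordinary least squares on the component scores."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    scores = Xs @ Vt[:n_pc].T                 # PC scores
    A = np.hstack([np.ones((len(y), 1)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, Vt[:n_pc]

# Synthetic collinear predictors driven by two underlying factors.
rng = np.random.default_rng(5)
base = rng.normal(size=(200, 2))
X = np.hstack([base, base + 0.01 * rng.normal(size=(200, 2))])
y = 3 * base[:, 0] - 2 * base[:, 1] + 0.1 * rng.normal(size=200)

coef, comps = pca_mlr(X, y, n_pc=2)
scores = ((X - X.mean(0)) / X.std(0)) @ comps.T
pred = coef[0] + scores @ coef[1:]
r = np.corrcoef(pred, y)[0, 1]
print(round(r, 3))
```

Because the component scores are orthogonal by construction, the least-squares step is well conditioned even though the original columns are nearly identical.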
Polynomial Regression on Riemannian Manifolds
Hinkle, Jacob; Fletcher, P Thomas; Joshi, Sarang
2012-01-01
In this paper we develop the theory of parametric polynomial regression in Riemannian manifolds and Lie groups. We show application of Riemannian polynomial regression to shape analysis in Kendall shape space. Results are presented, showing the power of polynomial regression on the classic rat skull growth data of Bookstein as well as the analysis of the shape changes associated with aging of the corpus callosum from the OASIS Alzheimer's study.
Directory of Open Access Journals (Sweden)
S. Saravanan
2012-07-01
Full Text Available Power system planning starts with electric load (demand) forecasting. Accurate electricity load forecasting is one of the most important challenges in managing the supply and demand of electricity, since electricity demand is volatile in nature; it cannot be stored and has to be consumed instantly. This study deals with electricity consumption in India, forecasting the future projection of demand for a period of 19 years, from 2012 to 2030. The eleven input variables used are amount of CO2 emission, population, per capita GDP, per capita gross national income, gross domestic savings, industry, consumer price index, wholesale price index, imports, exports, and per capita power consumption. A new methodology based on Artificial Neural Networks (ANNs) using principal components is also used. Data of 29 years were used for training and data of 10 years for testing the ANNs. Comparisons were made with multiple linear regression (based on the original data and the principal components) and with ANNs using the original data as input variables. The results show that the use of ANNs with principal components (PC) is more effective.
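The ANN-with-principal-components approach can be sketched with a PCA step feeding a small one-hidden-layer network trained by full-batch gradient descent; all data and dimensions here are synthetic stand-ins for the eleven indicators, not the Indian series:

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 300, 11                       # e.g., eleven economic indicators
base = rng.normal(size=(n, 3))       # hidden common factors
X = base @ rng.normal(size=(3, p)) + 0.05 * rng.normal(size=(n, p))
y = np.sin(base[:, 0]) + 0.5 * base[:, 1]   # nonlinear target

# Step 1: PCA on standardized inputs, keep 3 principal components.
Xs = (X - X.mean(0)) / X.std(0)
_, _, Vt = np.linalg.svd(Xs, full_matrices=False)
Z = Xs @ Vt[:3].T
Z = Z / Z.std(0)                     # rescale scores for stable training

# Step 2: one-hidden-layer regression network with tanh activation.
h, lr = 8, 0.05
W1 = 0.5 * rng.normal(size=(3, h)); b1 = np.zeros(h)
W2 = 0.5 * rng.normal(size=h);      b2 = 0.0
for _ in range(3000):
    H = np.tanh(Z @ W1 + b1)
    err = H @ W2 + b2 - y            # prediction error
    gW2 = H.T @ err / n; gb2 = err.mean()
    gH = np.outer(err, W2) * (1 - H ** 2)
    gW1 = Z.T @ gH / n; gb1 = gH.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

mse = ((np.tanh(Z @ W1 + b1) @ W2 + b2 - y) ** 2).mean()
print(round(mse, 4), round(float(np.var(y)), 4))
```

Reducing eleven collinear inputs to a few components shrinks the network's input layer and removes redundant directions, which is the practical benefit the abstract reports.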
Analysis of Multiple Manding Topographies during Functional Communication Training
Harding, Jay W.; David P. Wacker; Berg, Wendy K.; Winborn-Kemmerer, Lisa; Lee, John F; Ibrahimovic, Muska
2009-01-01
We evaluated the effects of reinforcing multiple manding topographies during functional communication training (FCT) to decrease problem behavior for three preschool-age children. During Phase 1, a functional analysis identified conditions that maintained problem behavior for each child. During Phase 2, the children’s parents taught them to request positive reinforcers (attention or toys) via vocal manding, manual signing, or touching a picture/word card with or without a microswitch recordin...
BER Analysis of Wideband Code Division Multiple Access
Veeranna.D; Sanjeev Kumar Sharma; Yashwant Singh,
2012-01-01
There has been a dramatic change in the field of telecommunication services. M-ary Quadrature Amplitude Modulation (QAM) and Quadrature Phase Shift Keying (QPSK) modulation schemes are considered in the Wideband Code Division Multiple Access (WCDMA) system. Here we use MATLAB for simulation and evaluation of BER (Bit Error Rate) and SNR (Signal to Noise Ratio). There is analysis of Quadrature Phase Shift Keying and 16-ary Quadrature Amplitude Modulation which are being used in WC...
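A BER-versus-theory check of the kind the abstract describes can be sketched without MATLAB; here is a numpy simulation for Gray-coded QPSK over AWGN (the Eb/N0 operating point is an arbitrary assumption):

```python
import math
import numpy as np

# QPSK over AWGN: simulated bit error rate versus the theoretical
# value BER = 0.5 * erfc(sqrt(Eb/N0)); Gray-coded QPSK has the same
# per-bit error rate as BPSK.
rng = np.random.default_rng(7)
ebn0_db = 4.0
ebn0 = 10 ** (ebn0_db / 10)
n_bits = 200_000

bits = rng.integers(0, 2, size=n_bits)
# Map bit pairs to QPSK symbols: I and Q each carry one bit (+/-1).
i = 1 - 2 * bits[0::2]
q = 1 - 2 * bits[1::2]
sym = (i + 1j * q) / math.sqrt(2)            # unit-energy symbols

# Es = 1 and 2 bits/symbol -> Eb = 1/2, so N0 = 1 / (2 * Eb/N0);
# complex AWGN with variance N0/2 per real dimension.
n0 = 1.0 / (2 * ebn0)
noise = math.sqrt(n0 / 2) * (rng.normal(size=i.size)
                             + 1j * rng.normal(size=i.size))
r = sym + noise

bits_hat = np.empty(n_bits, dtype=int)
bits_hat[0::2] = (r.real < 0).astype(int)    # sign decision per rail
bits_hat[1::2] = (r.imag < 0).astype(int)
ber_sim = (bits_hat != bits).mean()
ber_theory = 0.5 * math.erfc(math.sqrt(ebn0))
print(ber_sim, ber_theory)
```

Sweeping `ebn0_db` and plotting both curves reproduces the standard BER-vs-SNR waterfall; extending the mapping to 16-QAM changes only the symbol constellation and the decision regions.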
Analysis and design of multiple element antennas for urban communication
Pontes, Juan
2010-01-01
This work focuses on the analysis and design of multiple element antennas (MEA) and their interaction with the propagation channel. In particular, attention is given to urban channels and how its information throughput, i.e. capacity, can be improved. With this in mind, this work extends an existing network model of the communication system in order to reduce computation time, investigates the communicational limits of MEA systems and proposes a synthesis method for capacity maximization.
Shuai Wang; Lingling Zhao; Xiaohong Su; Peijun Ma
2014-01-01
Accurate prediction of the remaining useful life (RUL) of lithium-ion batteries is important for battery management systems. Traditional empirical data-driven approaches for RUL prediction usually require multidimensional physical characteristics including the current, voltage, usage duration, battery temperature, and ambient temperature. From a capacity fading analysis of lithium-ion batteries, it is found that the energy efficiency and battery working temperature are closely related to the...
Semiparametric regression analysis for time-to-event marked endpoints in cancer studies
Hu, Chen; Tsodikov, Alex
2013-01-01
In cancer studies the disease natural history process is often observed only at a fixed, random point of diagnosis (a survival time), leading to a current status observation (Sun (2006). The statistical analysis of interval-censored failure time data. Berlin: Springer.) representing a surrogate (a mark) (Jacobsen (2006). Point process theory and applications: marked point and piecewise deterministic processes. Basel: Birkhauser.) attached to the observed survival time. Examples include time t...
Is the German Retail Gas Market Competitive? A Spatial-temporal Analysis Using Quantile Regression
Kihm, Alex; Ritter, Nolan; Vance, Colin
2014-01-01
We explore whether non-competitive pricing prevails in Germany's retail gasoline market by examining the influence of the crude oil price on the retail gasoline price, focusing specifically on how this influence varies according to the brand and to the degree of competition in the vicinity of the station. Our analysis identifies several factors other than cost - including the absence of nearby competitors and regional market concentration - that play a significant role in mediating the influe...