WorldWideScience

Sample records for spatial regression analysis

  1. Neighborhood social capital and crime victimization: comparison of spatial regression analysis and hierarchical regression analysis.

    Science.gov (United States)

    Takagi, Daisuke; Ikeda, Ken'ichi; Kawachi, Ichiro

    2012-11-01

    Crime is an important determinant of public health outcomes, including quality of life, mental well-being, and health behavior. A body of research has documented the association between community social capital and crime victimization. The association between social capital and crime victimization has been examined at multiple levels of spatial aggregation, ranging from entire countries, to states, metropolitan areas, counties, and neighborhoods. In multilevel analysis, the spatial boundaries at level 2 are most often drawn from administrative boundaries (e.g., Census tracts in the U.S.). One problem with adopting administrative definitions of neighborhoods is that it ignores spatial spillover. We conducted a study of social capital and crime victimization in one ward of Tokyo city, using a spatial Durbin model with an inverse-distance weighting matrix that assigned each respondent a unique level of "exposure" to social capital based on all other residents' perceptions. The study is based on a postal questionnaire sent to residents of Arakawa Ward, Tokyo, aged 20-69 years. The response rate was 43.7%. We examined the contextual influence of generalized trust, perceptions of reciprocity, two types of social network variables, as well as two principal components of social capital (constructed from the above four variables). Our outcome measure was self-reported crime victimization in the last five years. In the spatial Durbin model, we found that neighborhood generalized trust, reciprocity, supportive networks and two principal components of social capital were each inversely associated with crime victimization. By contrast, a multilevel regression performed with the same data (using administrative neighborhood boundaries) found generally null associations between neighborhood social capital and crime. Spatial regression methods may be more appropriate for investigating the contextual influence of social capital in homogeneous cultural settings such as Japan.
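
    The inverse-distance "exposure" idea in this abstract can be illustrated with a minimal sketch. It assumes row-standardized weights of the form 1/d^power and distinct respondent locations; the function name, the `power` parameter, and the toy data are hypothetical and are not taken from the study.

```python
import math

def inverse_distance_exposure(coords, values, power=1.0):
    """Row-standardized inverse-distance average of everyone else's values.

    Assumes all coordinates are distinct (a zero distance would divide by zero).
    """
    n = len(coords)
    exposure = []
    for i in range(n):
        weights = []
        for j in range(n):
            if i == j:
                weights.append(0.0)  # a respondent does not contribute to own exposure
            else:
                weights.append(1.0 / math.dist(coords[i], coords[j]) ** power)
        total = sum(weights)
        exposure.append(sum(w * v for w, v in zip(weights, values)) / total)
    return exposure

# Three respondents on a line at x = 0, 1, 3 with hypothetical trust scores.
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
trust = [0.0, 10.0, 20.0]
print(inverse_distance_exposure(coords, trust))
```

    In a spatial Durbin specification, a vector like this would enter the regression as the spatially lagged covariate alongside each respondent's own value.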

  2. Gaussian Process Regression Model in Spatial Logistic Regression

    Science.gov (United States)

    Sofro, A.; Oktaviarina, A.

    2018-01-01

    Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.

  3. Crime Modeling using Spatial Regression Approach

    Science.gov (United States)

    Saleh Ahmar, Ansari; Adiatma; Kasim Aidid, M.

    2018-01-01

    Criminal acts in Indonesia increase in both variety and number every year: murder, rape, assault, vandalism, theft, fraud, fencing, and other offenses make people feel unsafe. The risk of society being exposed to crime is measured by the number of cases reported to the police; the more cases reported, the higher the level of crime in the region. This research models criminality in South Sulawesi, Indonesia, with the risk of society being exposed to crime as the dependent variable. The modeling uses an area-based approach with Spatial Autoregressive (SAR) and Spatial Error Model (SEM) methods. The independent variables are population density, the number of poor people, GDP per capita, unemployment, and the human development index (HDI). The spatial regression analysis shows no spatial dependence, in either lag or error form, in South Sulawesi.

  4. Spatial vulnerability assessments by regression kriging

    Science.gov (United States)

    Pásztor, László; Laborczi, Annamária; Takács, Katalin; Szatmári, Gábor

    2016-04-01

    information representing IEW or GRP forming environmental factors were taken into account to support the spatial inference of the locally experienced IEW frequency and measured GRP values respectively. An efficient spatial prediction methodology was applied to construct reliable maps, namely regression kriging (RK) using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Application of RK also provides the possibility of inherent accuracy assessment. The resulting maps are characterized by global and local measures of its accuracy. Additionally the method enables interval estimation for spatial extension of the areas of predefined risk categories. All of these outputs provide useful contribution to spatial planning, action planning and decision making. Acknowledgement: Our work was partly supported by the Hungarian National Scientific Research Foundation (OTKA, Grant No. K105167).
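
    The two-part structure of regression kriging described above (regression trend plus kriged residuals, summed) can be sketched as follows. This is an illustration under stated assumptions, not the authors' workflow: it uses simple kriging with an exponential covariance whose `sill`, `corr_range` and `nugget` are assumed known rather than fitted from a variogram, and all names are hypothetical.

```python
import numpy as np

def regression_kriging(coords, X, y, coords_new, X_new,
                       sill=1.0, corr_range=1.0, nugget=1e-8):
    """Predict at new sites as OLS trend + simple-kriged OLS residuals."""
    # Part 1: deterministic component from a multiple linear regression.
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta

    # Part 2: stochastic component, residuals interpolated by simple kriging
    # under an assumed exponential covariance model.
    def cov(a, b):
        d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
        return sill * np.exp(-d / corr_range)

    K = cov(coords, coords) + nugget * np.eye(len(y))
    k = cov(coords_new, coords)
    weights = np.linalg.solve(K, k.T)      # kriging weights, one column per new site
    resid_pred = weights.T @ resid

    # Final prediction: sum of the two component predictions.
    trend_new = np.column_stack([np.ones(len(X_new)), X_new]) @ beta
    return trend_new + resid_pred
```

    With a near-zero nugget the predictor reproduces the observations at sampled locations, which is a quick sanity check for any implementation of this kind.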

  5. Spatial regression analysis on 32 years of total column ozone data

    NARCIS (Netherlands)

    Knibbe, J.S.; van der A, J.R.; de Laat, A.T.J.

    2014-01-01

    Multiple-regression analyses have been performed on 32 years of total ozone column data that was spatially gridded with a 1 × 1.5° resolution. The total ozone data consist of the MSR (Multi Sensor Reanalysis; 1979-2008) and 2 years of assimilated SCIAMACHY (SCanning Imaging Absorption spectroMeter

  6. A logistic regression estimating function for spatial Gibbs point processes

    DEFF Research Database (Denmark)

    Baddeley, Adrian; Coeurjolly, Jean-François; Rubak, Ege

    We propose a computationally efficient logistic regression estimating function for spatial Gibbs point processes. The sample points for the logistic regression consist of the observed point pattern together with a random pattern of dummy points. The estimating function is closely related...

  7. Application of Spatial Regression Models to Income Poverty Ratios in Middle Delta Contiguous Counties in Egypt

    Directory of Open Access Journals (Sweden)

    Sohair F Higazi

    2013-02-01

    Regression analysis depends on several assumptions that have to be satisfied. A major assumption that is never satisfied when variables are from contiguous observations is the independence of error terms. Spatial analysis treats the violation of that assumption with two derived models that take the contiguity of observations into consideration. Data used are from Egypt's 2006 census, the latest, for 93 counties in seven adjacent Middle Delta governorates. The dependent variable is the percentage of individuals classified as poor (those who make less than $1 daily). Predictors are some demographic indicators. Exploratory Spatial Data Analysis (ESDA) is performed to examine the existence of spatial clustering and spatial autocorrelation between neighboring counties. The ESDA revealed spatial clusters and spatial correlation between locations. Three statistical models are applied to the data: the Ordinary Least Squares regression model (OLS), the Spatial Error Model (SEM) and the Spatial Lag Model (SLM). The Likelihood Ratio test and some information criteria are used to compare SLM and SEM to OLS. The SEM model proved to be better than the SLM model. Recommendations are drawn regarding the two spatial models used.
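
    For reference, the three models compared in this abstract are usually written as below. This is standard textbook notation rather than the paper's own: W is an assumed n-by-n spatial weights matrix built from county contiguity, \rho and \lambda are the spatial lag and spatial error parameters, and \varepsilon is an independent error term.

```latex
\begin{align}
\text{OLS:}\quad & y = X\beta + \varepsilon \\
\text{SLM:}\quad & y = \rho W y + X\beta + \varepsilon \\
\text{SEM:}\quad & y = X\beta + u, \qquad u = \lambda W u + \varepsilon
\end{align}
```

    Setting \rho = 0 in the SLM or \lambda = 0 in the SEM recovers OLS, which is why a likelihood ratio test can compare each spatial model against it.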

  8. Comparing spatial regression to random forests for large ...

    Science.gov (United States)

    Environmental data may be “large” due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates, whereas spatial regression, when using reduced rank methods, has a reputation for good predictive performance when using many records. In this study, we compare these two techniques using a data set containing the macroinvertebrate multimetric index (MMI) at 1859 stream sites with over 200 landscape covariates. Our primary goal is predicting MMI at over 1.1 million perennial stream reaches across the USA. For spatial regression modeling, we develop two new methods to accommodate large data: (1) a procedure that estimates optimal Box-Cox transformations to linearize covariate relationships; and (2) a computationally efficient covariate selection routine that takes into account spatial autocorrelation. We show that our new methods lead to cross-validated performance similar to random forests, but that there is an advantage for spatial regression when quantifying the uncertainty of the predictions. Simulations are used to clarify advantages for each method. This research investigates different approaches for modeling and mapping national stream condition. We use MMI data from the EPA's National Rivers and Streams Assessment and predictors from StreamCat (Hill et al., 2015). Previous studies have focused on modeling the MMI condition classes (i.e., good, fair, and po

  9. Coefficient shifts in geographical ecology: an empirical evaluation of spatial and non-spatial regression

    DEFF Research Database (Denmark)

    Bini, L. M.; Diniz-Filho, J. A. F.; Rangel, T. F. L. V. B.

    2009-01-01

    A major focus of geographical ecology and macroecology is to understand the causes of spatially structured ecological patterns. However, achieving this understanding can be complicated when using multiple regression, because the relative importance of explanatory variables, as measured by regress...

  10. Spatial stochastic regression modelling of urban land use

    International Nuclear Information System (INIS)

    Arshad, S H M; Jaafar, J; Abiden, M Z Z; Latif, Z A; Rasam, A R A

    2014-01-01

    Urbanization is very closely linked to industrialization, commercialization or overall economic growth and development. This brings innumerable benefits to the quantity and quality of the urban environment and lifestyle but on the other hand contributes to unbounded development, urban sprawl, overcrowding and a decreasing standard of living. Regulation and observation of urban development activities are crucial. An understanding of the urban systems that promote urban growth is also essential for policy making, formulating development strategies and preparing development plans. This study aims to compare two different stochastic regression modeling techniques for spatial structure models of urban growth in the same study area. Both techniques utilize the same datasets, and their results are analyzed. The work starts by producing an urban growth model using two stochastic regression modeling techniques, namely Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR). When the two techniques are compared, GWR appears to be the more suitable stochastic regression model: it gives a smaller AICc (corrected Akaike Information Criterion) value and its output is more spatially explainable.
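
    To make the OLS versus GWR contrast concrete, here is a minimal sketch of how GWR fits one weighted least-squares regression per location. It assumes a fixed-bandwidth Gaussian kernel; the function name and the `bandwidth` parameter are illustrative, and the study's actual kernel and bandwidth selection are not stated in the abstract.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Return an (n, p+1) array of local coefficients, one row per location."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])   # design matrix with intercept
    betas = np.empty((n, A.shape[1]))
    for i in range(n):
        # Gaussian kernel: nearby observations get weights close to 1.
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        Aw = A * w[:, None]
        # Solve the local normal equations (A' W A) beta = A' W y.
        betas[i] = np.linalg.solve(A.T @ Aw, Aw.T @ y)
    return betas
```

    When the true relationship is the same everywhere, every row of the result matches the global OLS coefficients; spatial variation across rows is what GWR adds over OLS.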

  11. Spatial Quantile Regression In Analysis Of Healthy Life Years In The European Union Countries

    Directory of Open Access Journals (Sweden)

    Trzpiot Grażyna

    2016-12-01

    The paper investigates the impact of the selected factors on the healthy life years of men and women in the EU countries. The multiple quantile spatial autoregression models are used in order to account for substantial differences in the healthy life years and life quality across the EU members. Quantile regression allows studying dependencies between variables in different quantiles of the response distribution. Moreover, this statistical tool is robust against violations of the classical regression assumption about the distribution of the error term. Parameters of the models were estimated using the instrumental variable method (Kim, Muller 2004), whereas the confidence intervals and p-values were bootstrapped.

  12. Accounting for spatial effects in land use regression for urban air pollution modeling.

    Science.gov (United States)

    Bertazzon, Stefania; Johnson, Markey; Eccles, Kristin; Kaplan, Gilaad G

    2015-01-01

    In order to accurately assess air pollution risks, health studies require spatially resolved pollution concentrations. Land-use regression (LUR) models estimate ambient concentrations at a fine spatial scale. However, spatial effects such as spatial non-stationarity and spatial autocorrelation can reduce the accuracy of LUR estimates by increasing regression errors and uncertainty; and statistical methods for resolving these effects--e.g., spatially autoregressive (SAR) and geographically weighted regression (GWR) models--may be difficult to apply simultaneously. We used an alternate approach to address spatial non-stationarity and spatial autocorrelation in LUR models for nitrogen dioxide. Traditional models were re-specified to include a variable capturing wind speed and direction, and re-fit as GWR models. Mean R² values for the resulting GWR-wind models (summer: 0.86, winter: 0.73) showed a 10-20% improvement over traditional LUR models. GWR-wind models effectively addressed both spatial effects and produced meaningful predictive models. These results suggest a useful method for improving spatially explicit models. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  13. Spatial Regression and Prediction of Water Quality in a Watershed with Complex Pollution Sources.

    Science.gov (United States)

    Yang, Xiaoying; Liu, Qun; Luo, Xingzhang; Zheng, Zheng

    2017-08-16

    Fast economic development, burgeoning population growth, and rapid urbanization have led to complex pollution sources contributing to water quality deterioration simultaneously in many developing countries including China. This paper explored the use of spatial regression to evaluate the impacts of watershed characteristics on ambient total nitrogen (TN) concentration in a heavily polluted watershed and make predictions across the region. Regression results have confirmed the substantial impact on TN concentration by a variety of point and non-point pollution sources. In addition, spatial regression has yielded better performance than ordinary regression in predicting TN concentrations. Due to its best performance in cross-validation, the river distance based spatial regression model was used to predict TN concentrations across the watershed. The prediction results have revealed a distinct pattern in the spatial distribution of TN concentrations and identified three critical sub-regions in priority for reducing TN loads. Our study results have indicated that spatial regression could potentially serve as an effective tool to facilitate water pollution control in watersheds under diverse physical and socio-economical conditions.

  14. Spatial variability of excess mortality during prolonged dust events in a high-density city: a time-stratified spatial regression approach.

    Science.gov (United States)

    Wong, Man Sing; Ho, Hung Chak; Yang, Lin; Shi, Wenzhong; Yang, Jinxin; Chan, Ta-Chien

    2017-07-24

    Dust events have long been recognized to be associated with a higher mortality risk. However, no study has investigated how prolonged dust events affect the spatial variability of mortality across districts in a downwind city. In this study, we applied a spatial regression approach to estimate the district-level mortality during two extreme dust events in Hong Kong. We compared spatial and non-spatial models to evaluate the ability of each regression to estimate mortality. We also compared prolonged dust events with non-dust events to determine the influences of community factors on mortality across the city. The density of a built environment (estimated by the sky view factor) had positive association with excess mortality in each district, while socioeconomic deprivation contributed by lower income and lower education induced higher mortality impact in each territory planning unit during a prolonged dust event. Based on the model comparison, spatial error modelling with the 1st order of queen contiguity consistently outperformed other models. The high-risk areas with higher increase in mortality were located in an urban high-density environment with higher socioeconomic deprivation. Our model design shows the ability to predict spatial variability of mortality risk during an extreme weather event that is not able to be estimated based on traditional time-series analysis or ecological studies. Our spatial protocol can be used for public health surveillance, sustainable planning and disaster preparation when relevant data are available.

  15. EEG/MEG Source Reconstruction with Spatial-Temporal Two-Way Regularized Regression

    KAUST Repository

    Tian, Tian Siva

    2013-07-11

    In this work, we propose a spatial-temporal two-way regularized regression method for reconstructing neural source signals from EEG/MEG time course measurements. The proposed method estimates the dipole locations and amplitudes simultaneously through minimizing a single penalized least squares criterion. The novelty of our methodology is the simultaneous consideration of three desirable properties of the reconstructed source signals, that is, spatial focality, spatial smoothness, and temporal smoothness. The desirable properties are achieved by using three separate penalty functions in the penalized regression framework. Specifically, we impose a roughness penalty in the temporal domain for temporal smoothness, and a sparsity-inducing penalty and a graph Laplacian penalty in the spatial domain for spatial focality and smoothness. We develop a computationally efficient multilevel block coordinate descent algorithm to implement the method. Using a simulation study with several settings of different spatial complexity and two real MEG examples, we show that the proposed method outperforms existing methods that use only a subset of the three penalty functions. © 2013 Springer Science+Business Media New York.

  16. Regression analysis by example

    CERN Document Server

    Chatterjee, Samprit

    2012-01-01

    Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association. Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

  17. An Introduction to Macro- Level Spatial Nonstationarity: a Geographically Weighted Regression Analysis of Diabetes and Poverty.

    Science.gov (United States)

    Siordia, Carlos; Saenz, Joseph; Tom, Sarah E

    2012-01-01

    Type II diabetes is a growing health problem in the United States. Understanding geographic variation in diabetes prevalence will inform where resources for management and prevention should be allocated. Investigations of the correlates of diabetes prevalence have largely ignored how spatial nonstationarity might play a role in the macro-level distribution of diabetes. This paper introduces the reader to the concept of spatial nonstationarity, that is, variance in statistical relationships as a function of geographical location. Since spatial nonstationarity means different predictors can have varying effects on model outcomes, we make use of a geographically weighted regression to calculate correlates of diabetes as a function of geographic location. By doing so, we demonstrate an exploratory example in which the diabetes-poverty macro-level statistical relationship varies as a function of location. In particular, we provide evidence that when predicting macro-level diabetes prevalence, poverty is not always positively associated with diabetes.

  18. Digital Hydrologic Networks Supporting Applications Related to Spatially Referenced Regression Modeling

    Science.gov (United States)

    Brakebill, J.W.; Wolock, D.M.; Terziotti, S.E.

    2011-01-01

    Digital hydrologic networks depicting surface-water pathways and their associated drainage catchments provide a key component to hydrologic analysis and modeling. Collectively, they form common spatial units that can be used to frame the descriptions of aquatic and watershed processes. In addition, they provide the ability to simulate and route the movement of water and associated constituents throughout the landscape. Digital hydrologic networks have evolved from derivatives of mapping products to detailed, interconnected, spatially referenced networks of water pathways, drainage areas, and stream and watershed characteristics. These properties are important because they enhance the ability to spatially evaluate factors that affect the sources and transport of water-quality constituents at various scales. SPAtially Referenced Regressions On Watershed attributes (SPARROW), a process-based/statistical model, relies on a digital hydrologic network in order to establish relations between quantities of monitored contaminant flux, contaminant sources, and the associated physical characteristics affecting contaminant transport. Digital hydrologic networks modified from the River Reach File (RF1) and National Hydrography Dataset (NHD) geospatial datasets provided frameworks for SPARROW in six regions of the conterminous United States. In addition, characteristics of the modified RF1 were used to update estimates of mean-annual streamflow. This produced more current flow estimates for use in SPARROW modeling. © 2011 American Water Resources Association. This article is a U.S. Government work and is in the public domain in the USA.

  19. EEG/MEG Source Reconstruction with Spatial-Temporal Two-Way Regularized Regression

    KAUST Repository

    Tian, Tian Siva; Huang, Jianhua Z.; Shen, Haipeng; Li, Zhimin

    2013-01-01

    In this work, we propose a spatial-temporal two-way regularized regression method for reconstructing neural source signals from EEG/MEG time course measurements. The proposed method estimates the dipole locations and amplitudes simultaneously

  20. Eigenvector Spatial Filtering Regression Modeling of Ground PM2.5 Concentrations Using Remotely Sensed Data

    Directory of Open Access Journals (Sweden)

    Jingyi Zhang

    2018-06-01

    This paper proposes a regression model using the Eigenvector Spatial Filtering (ESF) method to estimate ground PM2.5 concentrations. Covariates are derived from remotely sensed data including aerosol optical depth, normalized difference vegetation index, surface temperature, air pressure, relative humidity, height of planetary boundary layer and digital elevation model. In addition, cultural variables such as factory densities and road densities are also used in the model. With the Yangtze River Delta region as the study area, we constructed ESF-based Regression (ESFR) models at different time scales, using data for the period between December 2015 and November 2016. We found that the ESFR models effectively filtered spatial autocorrelation in the OLS residuals and resulted in increases in the goodness-of-fit metrics as well as reductions in residual standard errors and cross-validation errors, compared to the classic OLS models. The annual ESFR model explained 70% of the variability in PM2.5 concentrations, 16.7% more than the non-spatial OLS model. With the ESFR models, we performed detailed analyses on the spatial and temporal distributions of PM2.5 concentrations in the study area. The model predictions are lower than ground observations but match the general trend. The experiment shows that ESFR provides a promising approach to PM2.5 analysis and prediction.
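
    As a rough illustration of where the "spatial filter" in ESF comes from, the sketch below extracts candidate eigenvectors from the doubly centered connectivity matrix MCM, with M = I - 11'/n. This follows the common textbook construction and is an assumption here; the abstract does not say how the authors built their weights or selected eigenvectors, and the function name is hypothetical.

```python
import numpy as np

def esf_eigenvectors(W, n_vectors):
    """Return the leading eigenvalues and eigenvectors of M C M."""
    n = W.shape[0]
    C = 0.5 * (W + W.T)                      # symmetrize the connectivity matrix
    M = np.eye(n) - np.ones((n, n)) / n      # centering projector
    eigvals, eigvecs = np.linalg.eigh(M @ C @ M)
    order = np.argsort(eigvals)[::-1][:n_vectors]   # largest eigenvalues first
    return eigvals[order], eigvecs[:, order]
```

    Eigenvectors with large positive eigenvalues describe map patterns with strong positive spatial autocorrelation; adding a selected subset of them as extra OLS regressors is what absorbs the autocorrelation otherwise left in the residuals.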

  1. Eigenvector Spatial Filtering Regression Modeling of Ground PM2.5 Concentrations Using Remotely Sensed Data.

    Science.gov (United States)

    Zhang, Jingyi; Li, Bin; Chen, Yumin; Chen, Meijie; Fang, Tao; Liu, Yongfeng

    2018-06-11

    This paper proposes a regression model using the Eigenvector Spatial Filtering (ESF) method to estimate ground PM2.5 concentrations. Covariates are derived from remotely sensed data including aerosol optical depth, normalized difference vegetation index, surface temperature, air pressure, relative humidity, height of planetary boundary layer and digital elevation model. In addition, cultural variables such as factory densities and road densities are also used in the model. With the Yangtze River Delta region as the study area, we constructed ESF-based Regression (ESFR) models at different time scales, using data for the period between December 2015 and November 2016. We found that the ESFR models effectively filtered spatial autocorrelation in the OLS residuals and resulted in increases in the goodness-of-fit metrics as well as reductions in residual standard errors and cross-validation errors, compared to the classic OLS models. The annual ESFR model explained 70% of the variability in PM2.5 concentrations, 16.7% more than the non-spatial OLS model. With the ESFR models, we performed detailed analyses on the spatial and temporal distributions of PM2.5 concentrations in the study area. The model predictions are lower than ground observations but match the general trend. The experiment shows that ESFR provides a promising approach to PM2.5 analysis and prediction.

  2. Comparing spatial regression to random forests for large environmental data sets

    Science.gov (United States)

    Environmental data may be “large” due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates, whereas spatial regression, when using reduced rank methods, has a reputatio...

  3. Investigation of the marked and long-standing spatial inhomogeneity of the Hungarian suicide rate: a spatial regression approach.

    Science.gov (United States)

    Balint, Lajos; Dome, Peter; Daroczi, Gergely; Gonda, Xenia; Rihmer, Zoltan

    2014-02-01

    In the last century Hungary had astonishingly high suicide rates characterized by marked regional within-country inequalities, a spatial pattern which has been quite stable over time. We aimed to explain this phenomenon at the level of micro-regions (n=175) in the period between 2005 and 2011. Our dependent variable was the age- and gender-standardized mortality ratio (SMR) for suicide, while the explanatory variables were factors thought to influence suicide risk, such as measures of religious and political integration, travel-time accessibility of psychiatric services, alcohol consumption, unemployment and disability pensionery. When applying the ordinary least squares regression model, the residuals were found to be spatially autocorrelated, which indicates a violation of the assumption of independent error terms and, accordingly, the need to apply a spatial autoregressive (SAR) model. According to our calculations the SARlag model was a better way (versus the SARerr model) of addressing the problem of spatial autocorrelation; furthermore, its substantive meaning is more convenient. SMR was significantly associated with the "political integration" variable in a negative and with the "lack of religious integration" and "disability pensionery" variables in a positive manner. Associations were not significant for the remaining explanatory variables. Several important psychiatric variables were not available at the level of micro-regions, and we conducted our analysis on aggregate data. Our results may draw attention to the relevance and abiding validity of the classic Durkheimian suicide risk factors, such as lack of social integration, apropos of the spatial pattern of Hungarian suicides. © 2013 Published by Elsevier B.V.
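
    The residual check that motivates moving from OLS to a SAR model is commonly done with Moran's I. The sketch below computes the statistic for any vector of values (for example OLS residuals) and a weights matrix; the binary contiguity weights and the data are toy assumptions, and the abstract does not state which test the authors used.

```python
def morans_i(values, weights):
    """Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, z = deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)    # total of all weights
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den

# Four regions in a row; each is a neighbour of the adjacent ones.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
print(morans_i([1.0, 2.0, 3.0, 4.0], W))    # smooth trend: positive autocorrelation
print(morans_i([1.0, -1.0, 1.0, -1.0], W))  # alternating pattern: negative autocorrelation
```

    Values clearly above zero for OLS residuals suggest that neighbouring regions share unexplained variation, which is the situation the SARlag and SARerr models are designed to handle.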

  4. Principal component regression analysis with SPSS.

    Science.gov (United States)

    Liu, R X; Kuang, J; Gong, Q; Hou, X L

    2003-06-01

    The paper introduces the indices used to diagnose multicollinearity, the basic principle of principal component regression, and a method for determining the 'best' equation. It uses an example to describe how to do principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. Carrying out the analysis with SPSS makes it simpler and faster while keeping the results accurate.
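
    The same procedure can be sketched outside SPSS. The example below is an assumed, generic formulation of principal component regression (standardize, take leading eigenvectors of the correlation matrix, regress on the scores, map back to the original variables); it does not reproduce the paper's SPSS steps, and the function name is hypothetical.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression; returns (intercept, coefficients) on the original scale."""
    x_mean, x_std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - x_mean) / x_std                       # standardized predictors
    corr = (Z.T @ Z) / (len(y) - 1)                # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    V = eigvecs[:, order]                          # leading principal directions
    scores = Z @ V                                 # mutually uncorrelated component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
    beta = (V @ gamma) / x_std                     # back-transform to original variables
    intercept = y.mean() - x_mean @ beta
    return intercept, beta
```

    Keeping all components reproduces ordinary least squares; dropping the components with the smallest eigenvalues is what stabilizes the estimates when predictors are nearly collinear.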

  5. Power properties of invariant tests for spatial autocorrelation in linear regression

    NARCIS (Netherlands)

    Martellosio, F.

    2006-01-01

    Many popular tests for residual spatial autocorrelation in the context of the linear regression model belong to the class of invariant tests. This paper derives a number of exact properties of the power function of such tests. In particular, we extend the work of Krämer (2005, Journal of Statistical

  6. Spatially resolved regression analysis of pre-treatment FDG, FLT and Cu-ATSM PET from post-treatment FDG PET: an exploratory study

    Science.gov (United States)

    Bowen, Stephen R; Chappell, Richard J; Bentzen, Søren M; Deveau, Michael A; Forrest, Lisa J; Jeraj, Robert

    2012-01-01

    Purpose: To quantify associations between pre-radiotherapy and post-radiotherapy PET parameters via spatially resolved regression. Materials and methods: Ten canine sinonasal cancer patients underwent PET/CT scans of [18F]FDG (FDGpre), [18F]FLT (FLTpre), and [61Cu]Cu-ATSM (Cu-ATSMpre). Following radiotherapy regimens of 50 Gy in 10 fractions, veterinary patients underwent FDG PET/CT scans at three months (FDGpost). Regression of standardized uptake values in baseline FDGpre, FLTpre and Cu-ATSMpre tumour voxels to those in FDGpost images was performed for linear, log-linear, generalized-linear and mixed-fit linear models. Goodness-of-fit in regression coefficients was assessed by R². Hypothesis testing of coefficients over the patient population was performed. Results: Multivariate linear model fits of FDGpre to FDGpost were significantly positive over the population (FDGpost ~ 0.17 FDGpre, p=0.03), and classified slopes of RECIST non-responders and responders to be different (0.37 vs. 0.07, p=0.01). Generalized-linear model fits related FDGpre to FDGpost by a linear power law (FDGpost ~ FDGpre^0.93, p [...] regression analysis indicates that pre-treatment FDG PET uptake is most strongly associated with three-month post-treatment FDG PET uptake in this patient population, though associations are histopathology-dependent. PMID:22682748

  7. GIS-based spatial regression and prediction of water quality in river networks: A case study in Iowa

    Science.gov (United States)

    Yang, X.; Jin, W.

    2010-01-01

    Nonpoint source pollution is the leading cause of the U.S.'s water quality problems. One important component of nonpoint source pollution control is an understanding of what and how watershed-scale conditions influence ambient water quality. This paper investigated the use of spatial regression to evaluate the impacts of watershed characteristics on stream NO3NO2-N concentration in the Cedar River Watershed, Iowa. An Arc Hydro geodatabase was constructed to organize various datasets on the watershed. Spatial regression models were developed to evaluate the impacts of watershed characteristics on stream NO3NO2-N concentration and predict NO3NO2-N concentration at unmonitored locations. Unlike the traditional ordinary least squares (OLS) method, the spatial regression method incorporates the potential spatial correlation among the observations in its coefficient estimation. Study results show that NO3NO2-N observations in the Cedar River Watershed are spatially correlated, and by ignoring the spatial correlation, the OLS method tends to over-estimate the impacts of watershed characteristics on stream NO3NO2-N concentration. In conjunction with kriging, the spatial regression method not only makes better stream NO3NO2-N concentration predictions than the OLS method, but also gives estimates of the uncertainty of the predictions, which provides useful information for optimizing the design of stream monitoring networks. It is a promising tool for better managing and controlling nonpoint source pollution. © 2010 Elsevier Ltd.

  8. Spatial analysis of instream nitrogen loads and factors controlling nitrogen delivery to streams in the southeastern United States using spatially referenced regression on watershed attributes (SPARROW) and regional classification frameworks

    Science.gov (United States)

    Hoos, Anne B.; McMahon, Gerard

    2009-01-01

    Understanding how nitrogen transport across the landscape varies with landscape characteristics is important for developing sound nitrogen management policies. We used a spatially referenced regression analysis (SPARROW) to examine landscape characteristics influencing delivery of nitrogen from sources in a watershed to stream channels. Modelled landscape delivery ratio varies widely (by a factor of 4) among watersheds in the southeastern United States—higher in the western part (Tennessee, Alabama, and Mississippi) than in the eastern part, and the average value for the region is lower compared to other parts of the nation. When we model landscape delivery ratio as a continuous function of local-scale landscape characteristics, we estimate a spatial pattern that varies as a function of soil and climate characteristics but exhibits spatial structure in residuals (observed load minus predicted load). The spatial pattern of modelled landscape delivery ratio and the spatial pattern of residuals coincide spatially with Level III ecoregions and also with hydrologic landscape regions. Subsequent incorporation into the model of these frameworks as regional scale variables improves estimation of landscape delivery ratio, evidenced by reduced spatial bias in residuals, and suggests that cross-scale processes affect nitrogen attenuation on the landscape. The model-fitted coefficient values are logically consistent with the hypothesis that broad-scale classifications of hydrologic response help to explain differential rates of nitrogen attenuation, controlling for local-scale landscape characteristics. Negative model coefficients for hydrologic landscape regions where the primary flow path is shallow ground water suggest that a lower fraction of nitrogen mass will be delivered to streams; this relation is reversed for regions where the primary flow path is overland flow.

  9. Estimating the Impact of Urbanization on Air Quality in China Using Spatial Regression Models

    OpenAIRE

    Fang, Chuanglin; Liu, Haimeng; Li, Guangdong; Sun, Dongqi; Miao, Zhuang

    2015-01-01

    Urban air pollution is one of the most visible environmental problems to have accompanied China’s rapid urbanization. Based on emission inventory data from 2014, gathered from 289 cities, we used Global and Local Moran’s I to measure the spatial autocorrelation of Air Quality Index (AQI) values at the city level, and employed Ordinary Least Squares (OLS), Spatial Lag Model (SAR), and Geographically Weighted Regression (GWR) to quantitatively estimate the comprehensive impact and spatial variati...

  10. Estimating the Impact of Urbanization on Air Quality in China Using Spatial Regression Models

    Directory of Open Access Journals (Sweden)

    Chuanglin Fang

    2015-11-01

    Urban air pollution is one of the most visible environmental problems to have accompanied China’s rapid urbanization. Based on emission inventory data from 2014, gathered from 289 cities, we used Global and Local Moran’s I to measure the spatial autocorrelation of Air Quality Index (AQI) values at the city level, and employed Ordinary Least Squares (OLS), Spatial Lag Model (SAR), and Geographically Weighted Regression (GWR) to quantitatively estimate the comprehensive impact and spatial variations of China’s urbanization process on air quality. The results show that a significant spatial dependence and heterogeneity existed in AQI values. Regression models revealed urbanization has played an important negative role in determining air quality in Chinese cities. The population, urbanization rate, automobile density, and the proportion of secondary industry were all found to have had a significant influence over air quality. Per capita Gross Domestic Product (GDP) and the scale of urban land use, however, failed the significance test at the 10% level. The GWR model performed better than global models and the results of GWR modeling show that the relationship between urbanization and air quality was not constant in space. Further, the local parameter estimates suggest significant spatial variation in the impacts of various urbanization factors on air quality.
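The Global Moran's I mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook definition I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2; the function name `morans_i`, the toy values and the chain-shaped weights matrix are hypothetical and are not taken from the study.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` and an n x n spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    total = sum(d * d for d in dev)
    return (n / s0) * (cross / total)


# Hypothetical layout: four units on a line, each linked to its immediate neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1, 2, 3, 4], chain)        # gradual trend: positive autocorrelation
alternating = morans_i([1, -1, 1, -1], chain)  # checkerboard: negative autocorrelation
print(smooth, alternating)
```

Values near zero would suggest no spatial pattern; judging significance normally requires a permutation or normal-approximation test, which this sketch leaves out.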

  11. Estimating Loess Plateau Average Annual Precipitation with Multiple Linear Regression Kriging and Geographically Weighted Regression Kriging

    Directory of Open Access Journals (Sweden)

    Qiutong Jin

    2016-06-01

    Estimating the spatial distribution of precipitation is an important and challenging task in hydrology, climatology, ecology, and environmental science. In order to generate a highly accurate distribution map of average annual precipitation for the Loess Plateau in China, multiple linear regression Kriging (MLRK) and geographically weighted regression Kriging (GWRK) methods were employed using precipitation data from the period 1980–2010 from 435 meteorological stations. The predictors in regression Kriging were selected by stepwise regression analysis from many auxiliary environmental factors, such as elevation (DEM), normalized difference vegetation index (NDVI), solar radiation, slope, and aspect. All predictor distribution maps had a 500 m spatial resolution. Validation precipitation data from 130 hydrometeorological stations were used to assess the prediction accuracies of the MLRK and GWRK approaches. Results showed that both prediction maps with a 500 m spatial resolution interpolated by MLRK and GWRK had a high accuracy and captured detailed spatial distribution data; however, MLRK produced a lower prediction error and a higher variance explanation than GWRK, although the differences were small, in contrast to conclusions from similar studies.

  12. Forecasting urban water demand: A meta-regression analysis.

    Science.gov (United States)

    Sebri, Maamar

    2016-12-01

    Water managers and planners require accurate water demand forecasts over the short-, medium- and long-term for many purposes. These range from assessing water supply needs over spatial and temporal patterns to optimizing future investments and planning future allocations across competing sectors. This study surveys the empirical literature on the urban water demand forecasting using the meta-analytical approach. Specifically, using more than 600 estimates, a meta-regression analysis is conducted to identify explanations of cross-studies variation in accuracy of urban water demand forecasting. Our study finds that accuracy depends significantly on study characteristics, including demand periodicity, modeling method, forecasting horizon, model specification and sample size. The meta-regression results remain robust to different estimators employed as well as to a series of sensitivity checks performed. The importance of these findings lies in the conclusions and implications drawn out for regulators and policymakers and for academics alike. Copyright © 2016. Published by Elsevier Ltd.

  13. When homogeneity meets heterogeneity: the geographically weighted regression with spatial lag approach to prenatal care utilization

    Science.gov (United States)

    Shoff, Carla; Chen, Vivian Yi-Ju; Yang, Tse-Chuan

    2014-01-01

    Using geographically weighted regression (GWR), a recent study by Shoff and colleagues (2012) investigated the place-specific risk factors for prenatal care utilization in the US and found that most of the relationships between late or no prenatal care and its determinants are spatially heterogeneous. However, the GWR approach may be subject to the confounding effect of spatial homogeneity. The goal of this study is to address this concern by including both spatial homogeneity and heterogeneity in the analysis. Specifically, we employ an analytic framework where a spatially lagged (SL) effect of the dependent variable is incorporated into the GWR model, which is called GWR-SL. Using this innovative framework, we found evidence to argue that spatial homogeneity is neglected in the study by Shoff et al. (2012) and the results are changed after considering the spatially lagged effect of prenatal care utilization. The GWR-SL approach allows us to gain a place-specific understanding of prenatal care utilization in US counties. In addition, we compared the GWR-SL results with the results of conventional approaches (i.e., OLS and spatial lag models) and found that GWR-SL is the preferred modeling approach. The new findings help us to better estimate how the predictors are associated with prenatal care utilization across space, and determine whether and how the level of prenatal care utilization in neighboring counties matters. PMID:24893033
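The spatially lagged dependent variable referred to above is commonly computed as Wy, the weighted average of the outcome in neighbouring units. The sketch below shows only that step, under the assumption of a row-standardized weights matrix; the names `row_standardize` and `spatial_lag` and the three-county example are hypothetical, and the GWR-SL estimation itself is not reproduced.

```python
def row_standardize(weights):
    """Scale each row of a weights matrix so that it sums to 1 (rows of zeros stay zero)."""
    out = []
    for row in weights:
        total = sum(row)
        out.append([w / total if total else 0.0 for w in row])
    return out


def spatial_lag(weights, y):
    """Return Wy: for each unit, the weighted mean of y over its neighbours."""
    w = row_standardize(weights)
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]


# Hypothetical example: county 0 borders counties 1 and 2, which border only county 0.
adjacency = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
rates = [10, 20, 30]  # made-up outcome values
print(spatial_lag(adjacency, rates))
```

The resulting lag vector would then enter the regression as an additional covariate alongside the other predictors.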

  14. Polynomial regression analysis and significance test of the regression function

    International Nuclear Information System (INIS)

    Gao Zhengming; Zhao Juan; He Shengping

    2012-01-01

    In order to analyze the decay heating power of a certain radioactive isotope per kilogram with the polynomial regression method, the paper first demonstrated the broad usage of the polynomial function and deduced its parameters with the ordinary least squares estimate. Then a significance test method for the polynomial regression function is derived, considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and a significance test of the polynomial function are applied to the decay heating power of the isotope per kilogram, in accord with the authors' real work. (authors)
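As a rough illustration of the approach described, the sketch below fits a polynomial by ordinary least squares through the normal equations and computes the overall F statistic used in a significance test of the regression, treating the powers of x as the regressors of a multiple linear regression. The function names and the data are hypothetical; the record does not give the authors' data or their exact test procedure.

```python
def fit_polynomial(x, y, degree):
    """OLS coefficients [b0, b1, ..., b_degree] from the normal equations."""
    m = degree + 1
    # (X^T X) b = X^T y with X[i][k] = x_i ** k
    a = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    rhs = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(m)]
    # Gaussian elimination with partial pivoting
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution
    coef = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(a[r][c] * coef[c] for c in range(r + 1, m))
        coef[r] = (rhs[r] - tail) / a[r][r]
    return coef


def f_statistic(x, y, coef):
    """Overall F statistic: (SS_reg / p) / (SS_res / (n - p - 1))."""
    n, p = len(x), len(coef) - 1
    fitted = [sum(b * xi ** k for k, b in enumerate(coef)) for xi in x]
    mean_y = sum(y) / n
    ss_reg = sum((f - mean_y) ** 2 for f in fitted)
    ss_res = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    return (ss_reg / p) / (ss_res / (n - p - 1))


# Made-up data: y = 1 + 2x + 0.5x^2 plus a small alternating disturbance.
xs = [0, 1, 2, 3, 4, 5]
ys = [1 + 2 * v + 0.5 * v * v + (0.1 if i % 2 == 0 else -0.1)
      for i, v in enumerate(xs)]
coefficients = fit_polynomial(xs, ys, 2)
print(coefficients, f_statistic(xs, ys, coefficients))
```

The F value would be compared with the critical value of the F distribution with (p, n - p - 1) degrees of freedom, taken from tables or a statistics library; that lookup is omitted here.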

  15. Spatial Bayesian latent factor regression modeling of coordinate-based meta-analysis data.

    Science.gov (United States)

    Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D; Nichols, Thomas E

    2018-03-01

    Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the article are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to (i) identify areas of consistent activation; and (ii) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterized as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. © 2017, The International Biometric Society.

  16. Spatial Bayesian Latent Factor Regression Modeling of Coordinate-based Meta-analysis Data

    Science.gov (United States)

    Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D.; Nichols, Thomas E.

    2017-01-01

    Summary Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the paper are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to 1) identify areas of consistent activation; and 2) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterised as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. PMID:28498564

  17. The Spatial Association Between Federally Qualified Health Centers and County-Level Reported Sexually Transmitted Infections: A Spatial Regression Approach.

    Science.gov (United States)

    Owusu-Edusei, Kwame; Gift, Thomas L; Leichliter, Jami S; Romaguera, Raul A

    2018-02-01

    The number of categorical sexually transmitted disease (STD) clinics is declining in the United States. Federally qualified health centers (FQHCs) have the potential to supplement the needed sexually transmitted infection (STI) services. In this study, we describe the spatial distribution of FQHC sites and determine if reported county-level nonviral STI morbidity was associated with having FQHC(s) using spatial regression techniques. We extracted map data from the Health Resources and Services Administration data warehouse on FQHCs (ie, geocoded health care service delivery [HCSD] sites) and extracted county-level data on the reported rates of chlamydia, gonorrhea, and primary and secondary (P&S) syphilis (2008-2012) from surveillance data. A 3-equation seemingly unrelated regression estimation procedure (with a spatial regression specification that controlled for county-level multiyear (2008-2012) demographic and socioeconomic factors) was used to determine the association between reported county-level STI morbidity and HCSD sites. Counties with HCSD sites had higher STI, poverty, unemployment, and violent crime rates than counties with no HCSD sites (P < 0.05). The number of HCSD sites was associated (P < 0.01) with increases in the temporally smoothed rates of chlamydia, gonorrhea, and P&S syphilis, but there was no significant association between the number of HCSD sites per 100,000 population and reported STI rates. There is a positive association between STI morbidity and the number of HCSD sites; however, this association does not exist when adjusting by population size. Further work may determine the extent to which HCSD sites can meet unmet needs for safety net STI services.

  18. Applied regression analysis a research tool

    CERN Document Server

    Pantula, Sastry; Dickey, David

    1998-01-01

    Least squares estimation, when used appropriately, is a powerful research tool. A deeper understanding of the regression concepts is essential for achieving optimal benefits from a least squares analysis. This book builds on the fundamentals of statistical methods and provides appropriate concepts that will allow a scientist to use least squares as an effective research tool. Applied Regression Analysis is aimed at the scientist who wishes to gain a working knowledge of regression analysis. The basic purpose of this book is to develop an understanding of least squares and related statistical methods without becoming excessively mathematical. It is the outgrowth of more than 30 years of consulting experience with scientists and many years of teaching an applied regression course to graduate students. Applied Regression Analysis serves as an excellent text for a service course on regression for non-statisticians and as a reference for researchers. It also provides a bridge between a two-semester introduction to...

  19. Optimizing landslide susceptibility zonation: Effects of DEM spatial resolution and slope unit delineation on logistic regression models

    Science.gov (United States)

    Schlögel, R.; Marchesini, I.; Alvioli, M.; Reichenbach, P.; Rossi, M.; Malet, J.-P.

    2018-01-01

    We perform landslide susceptibility zonation with slope units using three digital elevation models (DEMs) of varying spatial resolution of the Ubaye Valley (South French Alps). In so doing, we applied a recently developed algorithm automating slope unit delineation, given a number of parameters, in order to optimize simultaneously the partitioning of the terrain and the performance of a logistic regression susceptibility model. The method allowed us to obtain optimal slope units for each available DEM spatial resolution. For each resolution, we studied the susceptibility model performance by analyzing in detail the relevance of the conditioning variables. The analysis is based on landslide morphology data, considering either the whole landslide or only the source area outline as inputs. The procedure allowed us to select the most useful information, in terms of DEM spatial resolution, thematic variables and landslide inventory, in order to obtain the most reliable slope unit-based landslide susceptibility assessment.

  20. Regression Analysis by Example. 5th Edition

    Science.gov (United States)

    Chatterjee, Samprit; Hadi, Ali S.

    2012-01-01

    Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…

  1. The spatial prediction of landslide susceptibility applying artificial neural network and logistic regression models: A case study of Inje, Korea

    Science.gov (United States)

    Saro, Lee; Woo, Jeon Seong; Kwan-Young, Oh; Moung-Jin, Lee

    2016-02-01

    The aim of this study is to predict landslide susceptibility through spatial analysis, applying a statistical methodology based on GIS. Logistic regression models along with an artificial neural network were applied and validated to analyze landslide susceptibility in Inje, Korea. Landslide occurrence areas in the study area were identified based on interpretations of optical remote sensing data (aerial photographs) followed by field surveys. A spatial database considering forest, geophysical, soil and topographic data was built for the study area using the Geographical Information System (GIS). These factors were analysed using artificial neural network (ANN) and logistic regression models to generate a landslide susceptibility map. The study validates the landslide susceptibility maps by comparing them with landslide occurrence areas. The locations of landslide occurrence were divided randomly into a training set (50%) and a test set (50%). The training set was used to generate the landslide susceptibility map with the artificial neural network and logistic regression models, and the test set was retained to validate the prediction map. The validation results revealed that the artificial neural network model (with an accuracy of 80.10%) was better at predicting landslides than the logistic regression model (with an accuracy of 77.05%). Of the weights used in the artificial neural network model, 'slope' yielded the highest weight value (1.330), and 'aspect' yielded the lowest value (1.000). This research applied two statistical analysis methods in a GIS and compared their results. Based on the findings, we were able to derive a more effective method for analyzing landslide susceptibility.

  2. The spatial prediction of landslide susceptibility applying artificial neural network and logistic regression models: A case study of Inje, Korea

    Directory of Open Access Journals (Sweden)

    Saro Lee

    2016-02-01

    The aim of this study is to predict landslide susceptibility through spatial analysis, applying a statistical methodology based on GIS. Logistic regression models along with an artificial neural network were applied and validated to analyze landslide susceptibility in Inje, Korea. Landslide occurrence areas in the study area were identified based on interpretations of optical remote sensing data (aerial photographs) followed by field surveys. A spatial database considering forest, geophysical, soil and topographic data was built for the study area using the Geographical Information System (GIS). These factors were analysed using artificial neural network (ANN) and logistic regression models to generate a landslide susceptibility map. The study validates the landslide susceptibility maps by comparing them with landslide occurrence areas. The locations of landslide occurrence were divided randomly into a training set (50%) and a test set (50%). The training set was used to generate the landslide susceptibility map with the artificial neural network and logistic regression models, and the test set was retained to validate the prediction map. The validation results revealed that the artificial neural network model (with an accuracy of 80.10%) was better at predicting landslides than the logistic regression model (with an accuracy of 77.05%). Of the weights used in the artificial neural network model, ‘slope’ yielded the highest weight value (1.330), and ‘aspect’ yielded the lowest value (1.000). This research applied two statistical analysis methods in a GIS and compared their results. Based on the findings, we were able to derive a more effective method for analyzing landslide susceptibility.
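The validation design in this abstract (a random 50/50 division of landslide locations, then an accuracy figure on the held-out half) can be sketched as follows. This is only an illustration of the bookkeeping: `split_half`, `accuracy`, the seed and the labels are hypothetical, and no susceptibility model is fitted here.

```python
import random


def split_half(items, seed=0):
    """Shuffle a copy of `items` reproducibly and return (training half, test half)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def accuracy(predicted, observed):
    """Fraction of locations whose predicted class matches the observed class."""
    hits = sum(1 for p, o in zip(predicted, observed) if p == o)
    return hits / len(observed)


# Hypothetical location identifiers standing in for mapped landslide sites.
train_ids, test_ids = split_half(range(10))
print(len(train_ids), len(test_ids))

# Made-up predicted and observed classes for four test locations (1 = landslide).
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

Fixing the seed keeps the split reproducible, so two models can be compared on exactly the same held-out locations.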

  3. Regression analysis with categorized regression calibrated exposure: some interesting findings

    Directory of Open Access Journals (Sweden)

    Hjartåker Anette

    2006-07-01

    Background: Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile) scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods: We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC). Results: In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion: Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a

  4. When Deriving the Spatial QRS-T Angle from the 12-lead ECG, which Transform is More Frank: Regression or Inverse Dower?

    Science.gov (United States)

    Schlegel, Todd T.; Cortez, Daniel

    2010-01-01

    Our primary objective was to ascertain which commonly used 12-to-Frank-lead transformation yields spatial QRS-T angle values closest to those obtained from simultaneously collected true Frank-lead recordings. Simultaneous 12-lead and Frank XYZ-lead recordings were analyzed for 100 post-myocardial infarction patients and 50 controls. Relative agreement, with true Frank-lead results, of 12-to-Frank-lead transformed results for the spatial QRS-T angle using Kors regression versus inverse Dower was assessed via ANOVA, Lin's concordance and Bland-Altman plots. Spatial QRS-T angles from the true Frank leads were not significantly different from those derived from the Kors regression-related transformation but were significantly smaller than those derived from the inverse Dower-related transformation (P < 0.001). Independent of method, spatial mean QRS-T angles were also always significantly larger than spatial maximum (peaks) QRS-T angles. Spatial QRS-T angles are best approximated by regression-related transforms. Spatial mean and spatial peaks QRS-T angles should also not be used interchangeably.
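The distinction between spatial mean and spatial peaks QRS-T angles can be made concrete with a small sketch. It assumes the usual geometric definition, the angle between a QRS vector and a T vector in XYZ lead space, with "mean" using the averaged vectors and "peaks" using the largest-magnitude vectors. The 12-to-Frank-lead transformation coefficients are not reproduced, and all names and sample values below are hypothetical.

```python
import math


def angle_deg(u, v):
    """Angle in degrees between two 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.degrees(math.acos(cosine))


def mean_vector(samples):
    """Component-wise average of (x, y, z) samples."""
    n = len(samples)
    return tuple(sum(s[k] for s in samples) / n for k in range(3))


def peak_vector(samples):
    """Sample with the largest magnitude."""
    return max(samples, key=lambda s: sum(c * c for c in s))


def spatial_mean_angle(qrs_samples, t_samples):
    return angle_deg(mean_vector(qrs_samples), mean_vector(t_samples))


def spatial_peaks_angle(qrs_samples, t_samples):
    return angle_deg(peak_vector(qrs_samples), peak_vector(t_samples))


# Made-up XYZ samples for one beat (arbitrary units).
qrs = [(1.0, 0.0, 0.0), (3.0, 1.0, 0.0)]
t_wave = [(0.0, 1.0, 0.0), (1.0, 2.0, 0.0)]
print(spatial_mean_angle(qrs, t_wave), spatial_peaks_angle(qrs, t_wave))
```

Because the two definitions pick different vectors from the same loops, they generally give different angles, which is consistent with the abstract's caution against using them interchangeably.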

  5. Gaussian process regression analysis for functional data

    CERN Document Server

    Shi, Jian Qing

    2011-01-01

    Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically the methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables. Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high dime

  6. Spatial measurement error and correction by spatial SIMEX in linear regression models when using predicted air pollution exposures.

    Science.gov (United States)

    Alexeeff, Stacey E; Carroll, Raymond J; Coull, Brent

    2016-04-01

    Spatial modeling of air pollution exposures is widespread in air pollution epidemiology research as a way to improve exposure assessment. However, there are key sources of exposure model uncertainty when air pollution is modeled, including estimation error and model misspecification. We examine the use of predicted air pollution levels in linear health effect models under a measurement error framework. For the prediction of air pollution exposures, we consider a universal Kriging framework, which may include land-use regression terms in the mean function and a spatial covariance structure for the residuals. We derive the bias induced by estimation error and by model misspecification in the exposure model, and we find that a misspecified exposure model can induce asymptotic bias in the effect estimate of air pollution on health. We propose a new spatial simulation extrapolation (SIMEX) procedure, and we demonstrate that the procedure has good performance in correcting this asymptotic bias. We illustrate spatial SIMEX in a study of air pollution and birthweight in Massachusetts. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  7. Regression and regression analysis time series prediction modeling on climate data of Quetta, Pakistan

    International Nuclear Information System (INIS)

    Jafri, Y.Z.; Kamal, L.

    2007-01-01

    Various statistical techniques were used on five-year data from 1998-2002 of average humidity, rainfall, and maximum and minimum temperatures, respectively. The relationships to regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters, on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit to our polynomial regression analysis time series (PRATS). The correlations to multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, respectively, which yielded homoscedasticity. We also employed Bartlett's test for homogeneity of variances on five-year data of rainfall and humidity, respectively, which showed that the variances in the rainfall data were not homogeneous while those in the humidity data were homogeneous. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)
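Of the tests listed in this abstract, Bartlett's test for homogeneity of variances is compact enough to sketch. The statistic below follows the standard textbook formula and is compared against a chi-square distribution with k - 1 degrees of freedom; the function name and the grouped numbers are hypothetical and are not the Quetta rainfall or humidity series.

```python
import math
from statistics import variance


def bartlett_statistic(groups):
    """Bartlett's test statistic for equality of variances across k groups."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = [variance(g) for g in groups]  # sample variances (n - 1 denominator)
    pooled = sum((n - 1) * s2 for n, s2 in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(s2) for n, s2 in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction


# Made-up groups, e.g. one per year.
similar_spread = [[1, 2, 3], [4, 5, 6]]
different_spread = [[1, 2, 3], [0, 10, 20]]
print(bartlett_statistic(similar_spread), bartlett_statistic(different_spread))
```

With two groups there is one degree of freedom, so a statistic above roughly 3.84 would reject equal variances at the 5% level. Bartlett's test is known to be sensitive to departures from normality, so a result like this is best read alongside the other diagnostics the abstract mentions.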

  8. Regression Analysis and the Sociological Imagination

    Science.gov (United States)

    De Maio, Fernando

    2014-01-01

    Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.

  9. Polylinear regression analysis in radiochemistry

    International Nuclear Information System (INIS)

    Kopyrin, A.A.; Terent'eva, T.N.; Khramov, N.N.

    1995-01-01

    A number of radiochemical problems have been formulated in the framework of polylinear regression analysis, which permits the use of conventional mathematical methods for their solution. The authors have considered features of the use of polylinear regression analysis for estimating the contributions of various sources to the atmospheric pollution, for studying irradiated nuclear fuel, for estimating concentrations from spectral data, for measuring neutron fields of a nuclear reactor, for estimating crystal lattice parameters from X-ray diffraction patterns, for interpreting data of X-ray fluorescence analysis, for estimating complex formation constants, and for analyzing results of radiometric measurements. The problem of estimating the target parameters can be incorrect at certain properties of the system under study. The authors showed the possibility of regularization by adding a fictitious set of data "obtained" from the orthogonal design. To estimate only a part of the parameters under consideration, the authors used incomplete rank models. In this case, it is necessary to take into account the possibility of confounding estimates. An algorithm for evaluating the degree of confounding is presented which is realized using standard software or regression analysis

  10. Linear Regression Analysis

    CERN Document Server

    Seber, George A F

    2012-01-01

    * Concise, mathematically clear, and comprehensive treatment of the subject.
    * Expanded coverage of diagnostics and methods of model fitting.
    * Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models.
    * More than 200 problems throughout the book plus outline solutions for the exercises.
    * This revision has been extensively class-tested.

  11. Spatial Assessment of Road Traffic Injuries in the Greater Toronto Area (GTA): Spatial Analysis Framework

    Directory of Open Access Journals (Sweden)

    Sina Tehranchi

    2017-03-01

    This research presents a Geographic Information Systems (GIS) and spatial analysis approach based on the global spatial autocorrelation of road traffic injuries for identifying spatial patterns. A locational spatial autocorrelation was also used for identifying traffic injury at spatial level. Data for this research study were acquired from the Canadian Institute for Health Information (CIHI) based on 2004 and 2011. Moran’s I statistics were used to examine spatial patterns of road traffic injuries in the Greater Toronto Area (GTA). The Getis-Ord Gi* statistic was then used to identify hot spots and cold spots within the study area. The results revealed that Peel and Durham have the highest collision rate for other motor vehicle with motor vehicle. Geographically weighted regression (GWR) was conducted to test the relationships between the dependent variable, number of road traffic injury incidents, and independent variables such as number of seniors, low education, unemployed, vulnerable groups, people smoking and drinking, urban density and average median income. The result of this model suggested that number of seniors and low education have a very strong correlation with the number of road traffic injury incidents.
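    For readers unfamiliar with the global autocorrelation statistic used in this record, here is a minimal plain-Python sketch of Moran's I. The helper names and the simple chain-adjacency weights are hypothetical choices for illustration and are not taken from the study.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total of all spatial weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


def chain_weights(n):
    """Binary weights for n areas in a row, each adjacent to its neighbours."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1.0
    return w


w = chain_weights(4)
smooth = morans_i([1.0, 2.0, 3.0, 4.0], w)         # similar neighbours
alternating = morans_i([1.0, -1.0, 1.0, -1.0], w)  # dissimilar neighbours
print(smooth, alternating)
```

    Positive values indicate that similar values cluster together in space, and negative values indicate that neighbouring areas tend to differ.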

  12. Preface to Berk's "Regression Analysis: A Constructive Critique"

    OpenAIRE

    de Leeuw, Jan

    2003-01-01

    It is a pleasure to write a preface for the book “Regression Analysis” of my fellow series editor Dick Berk. And it is a pleasure in particular because the book is about regression analysis, the most popular and the most fundamental technique in applied statistics. And because it is critical of the way regression analysis is used in the sciences, in particular in the social and behavioral sciences. Although the book can be read as an introduction to regression analysis, it can also be read as a...

  13. Multicollinearity and Regression Analysis

    Science.gov (United States)

    Daoud, Jamal I.

    2017-12-01

    In regression analysis a correlation between the response and the predictor(s) is expected, but correlation among the predictors is undesirable. The number of predictors included in the regression model depends on many factors, among them historical data, experience, etc. In the end, the selection of the most important predictors depends on the researcher's judgment. Multicollinearity is a phenomenon in which two or more predictors are correlated; if this happens, the standard errors of the coefficients will increase [8]. Increased standard errors mean that the coefficients for some or all independent variables may be found not to be significantly different from zero. In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on multicollinearity, its causes, and its consequences for the reliability of the regression model.
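    The inflation of standard errors described above is commonly quantified with the variance inflation factor (VIF). The sketch below, which is an illustration and not code from the paper, covers only the two-predictor case, where VIF = 1 / (1 - r^2) and r is the correlation between the predictors. Function names and data are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors with correlation 0.8.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
vif = vif_two_predictors(x1, x2)
se_inflation = math.sqrt(vif)  # factor by which each standard error grows
print(vif, se_inflation)
```

    With more than two predictors the same idea applies, but r^2 is replaced by the R^2 from regressing each predictor on all the others.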

  14. Influences of spatial and temporal variation on fish-habitat relationships defined by regression quantiles

    Science.gov (United States)

    Dunham, J.B.; Cade, B.S.; Terrell, J.W.

    2002-01-01

    We used regression quantiles to model potentially limiting relationships between the standing crop of cutthroat trout Oncorhynchus clarki and measures of stream channel morphology. Regression quantile models indicated that variation in fish density was inversely related to the width:depth ratio of streams but not to stream width or depth alone. The spatial and temporal stability of model predictions were examined across years and streams, respectively. Variation in fish density with width:depth ratio (10th-90th regression quantiles) modeled for streams sampled in 1993-1997 predicted the variation observed in 1998-1999, indicating similar habitat relationships across years. Both linear and nonlinear models described the limiting relationships well, the latter performing slightly better. Although estimated relationships were transferable in time, results were strongly dependent on the influence of spatial variation in fish density among streams. Density changes with width:depth ratio in a single stream were responsible for the significant (P 80th). This suggests that stream-scale factors other than width:depth ratio play a more direct role in determining population density. Much of the variation in densities of cutthroat trout among streams was attributed to the occurrence of nonnative brook trout Salvelinus fontinalis (a possible competitor) or connectivity to migratory habitats. Regression quantiles can be useful for estimating the effects of limiting factors when ecological responses are highly variable, but our results indicate that spatiotemporal variability in the data should be explicitly considered. In this study, data from individual streams and stream-specific characteristics (e.g., the occurrence of nonnative species and habitat connectivity) strongly affected our interpretation of the relationship between width:depth ratio and fish density.

  15. Hierarchical regression analysis in structural Equation Modeling

    NARCIS (Netherlands)

    de Jong, P.F.

    1999-01-01

    In a hierarchical or fixed-order regression analysis, the independent variables are entered into the regression equation in a prespecified order. Such an analysis is often performed when the extra amount of variance accounted for in a dependent variable by a specific independent variable is the main

  16. Multivariate Regression Analysis and Slaughter Livestock,

    Science.gov (United States)

    AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS, ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY

  17. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis.

    Science.gov (United States)

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.
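    The comparison indicators named in this abstract (sensitivity, specificity, and the area under the ROC curve) can be computed from predicted scores as sketched below. This is a generic illustration with made-up labels and scores; it does not reproduce the study's SPSS workflow or its data.

```python
def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold count as positive."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve as the share of correctly ordered pairs."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
sens, spec = sensitivity_specificity(labels, scores)
area = auc(labels, scores)
print(sens, spec, area)
```

    The pairwise form of the AUC is convenient for small examples because it works with scores from any classifier, including logistic, probit, and discriminant models.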

  18. Bayesian logistic regression analysis

    NARCIS (Netherlands)

    Van Erp, H.R.N.; Van Gelder, P.H.A.J.M.

    2012-01-01

    In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuisance parameters, the Jacobian transformation is an

  19. Regression analysis using dependent Polya trees.

    Science.gov (United States)

    Schörgendorfer, Angela; Branscum, Adam J

    2013-11-30

    Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.

  20. Spatial prediction of landslides using a hybrid machine learning approach based on Random Subspace and Classification and Regression Trees

    Science.gov (United States)

    Pham, Binh Thai; Prakash, Indra; Tien Bui, Dieu

    2018-02-01

    A hybrid machine learning approach of Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. This model is a combination of the RSS method, which is known as an efficient ensemble technique, and CART, which is a state-of-the-art classifier. The Luc Yen district of Yen Bai province, a prominent landslide-prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models, namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In the development of the model, ten important landslide affecting factors related to geomorphology, geology and geo-environment were considered, namely slope angles, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Performance of the RSSCART model (AUC = 0.841) is the best compared with other popular landslide models, namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that the RSSCART is a promising method for spatial landslide prediction.

  1. RAWS II: A MULTIPLE REGRESSION ANALYSIS PROGRAM,

    Science.gov (United States)

    This memorandum gives instructions for the use and operation of a revised version of RAWS, a multiple regression analysis program. The program...of preprocessed data, the directed retention of variable, listing of the matrix of the normal equations and its inverse, and the bypassing of the regression analysis to provide the input variable statistics only. (Author)

  2. [Application of Land-use Regression Models in Spatial-temporal Differentiation of Air Pollution].

    Science.gov (United States)

    Wu, Jian-sheng; Xie, Wu-dan; Li, Jia-cheng

    2016-02-15

    With the rapid development of urbanization, industrialization and motorization, air pollution has become one of the most serious environmental problems in our country, with negative impacts on public health and the ecological environment. The land-use regression (LUR) model is one of the common methods for simulating spatial-temporal differentiation of air pollution at city scale. It has been applied broadly in Europe and North America, but much less in China. Based on many studies at home and abroad, this study started with the main steps to develop an LUR model, including obtaining the monitoring data, generating variables, developing models, model validation and regression mapping. Then a conclusion was drawn on the progress of LUR models in spatial-temporal differentiation of air pollution. Furthermore, future research focuses and directions were outlined, including highlighting spatial-temporal differentiation, increasing the classes of model variables and improving the methods of model development. This paper aims to popularize the application of the LUR model in China, and to provide a methodological basis for human exposure, epidemiologic study and health risk assessment.

  3. The rubber plantation environment and Lassa fever epidemics in Liberia, 2008-2012: a spatial regression.

    Science.gov (United States)

    Olugasa, Babasola O; Dogba, John B; Ogunro, Bamidele; Odigie, Eugene A; Nykoi, Jomah; Ojo, Johnson F; Taiwo, Olalekan; Kamara, Abraham; Mulbah, Charles K; Fasunla, Ayotunde J

    2014-10-01

    As Lassa fever continues to be a public health challenge in West Africa, it is critical to produce good maps of its risk pattern for use in active surveillance and control intervention. We identified eight spatial features related to the rubber plantation environment and used them as explanatory variables for Lassa fever (LF) outbreaks on the Uniroyal Liberian Agricultural Company (LAC) rubber plantation environment in Grand Bassa County, Liberia. We computed classical and spatial lag regression models on all spatial features, including proximity of residential camp to rubber tree-edge, main road in the plantation, LAC hospital, rice farmland, household refuse dump, human population density, post-harvest storage density of rice and density of rodent deterrent on rice storage. We found significant (p=0.0024) spatial autocorrelation between LF cases and the spatial features we have considered. We concluded that the rubber plantation environment influenced Mastomys species' breeding and transmission of Lassa virus along spatial scale to humans. The risk factors identified in this study offered a baseline for more effective surveillance and control of LF in the post-civil conflict Liberia. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. Spatial-Temporal Variations of Turbidity and Ocean Current Velocity of the Ariake Sea Area, Kyushu, Japan Through Regression Analysis with Remote Sensing Satellite Data

    OpenAIRE

    Yuichi Sarusawa; Kohei Arai

    2013-01-01

    A regression-analysis-based method for turbidity and ocean current velocity estimation with remote sensing satellite data is proposed. Through regression analysis with MODIS data and measured data of turbidity and ocean current velocity, a regression equation that allows estimation of turbidity and ocean current velocity is obtained. With the regression equation as well as long-term MODIS data, turbidity and ocean current velocity trends in the Ariake Sea area are clarified. It is also confirmed tha...

  5. Common pitfalls in statistical analysis: Linear regression analysis

    Directory of Open Access Journals (Sweden)

    Rakesh Aggarwal

    2017-01-01

    In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis.

  6. Explorative spatial analysis of traffic accident statistics and road mortality among the provinces of Turkey.

    Science.gov (United States)

    Erdogan, Saffet

    2009-10-01

    The aim of the study is to describe the inter-province differences in traffic accidents and mortality on roads of Turkey. Two different risk indicators were used to evaluate the road safety performance of the provinces in Turkey. These indicators are the ratios between the number of persons killed in road traffic accidents (1) and the number of accidents (2) (numerators) and their exposure to traffic risk (denominator). Population and the number of registered motor vehicles in the provinces were used as denominators individually. Spatial analyses were performed on the mean annual rate of deaths and on the number of fatal accidents that were calculated for the period of 2001-2006. Empirical Bayes smoothing was used to remove background noise from the raw death and accident rates because of the sparsely populated provinces and small number of accident and death rates of provinces. Global and local spatial autocorrelation analyses were performed to show whether the provinces with high rates of deaths-accidents show clustering or are located closer by chance. The spatial distribution of provinces with high rates of deaths and accidents was nonrandom and detected as clustered with significance of Paccidents and deaths were located in the provinces that contain the roads connecting the Istanbul, Ankara, and Antalya provinces. Accident and death rates were also modeled with some independent variables such as number of motor vehicles, length of roads, and so forth using geographically weighted regression analysis with forward step-wise elimination. The level of statistical significance was taken as Paccidents according to denominators in the provinces. The geographically weighted regression analyses gave significantly better predictions for both accident rates and death rates than did ordinary least squares regressions, as indicated by adjusted R² values. Geographically weighted regression provided values of 0.89-0.99 adjusted R² for death and accident rates, compared with 0

  7. Mapping geogenic radon potential by regression kriging

    Energy Technology Data Exchange (ETDEWEB)

    Pásztor, László [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Szabó, Katalin Zsuzsanna, E-mail: sz_k_zs@yahoo.de [Department of Chemistry, Institute of Environmental Science, Szent István University, Páter Károly u. 1, Gödöllő 2100 (Hungary); Szatmári, Gábor; Laborczi, Annamária [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Horváth, Ákos [Department of Atomic Physics, Eötvös University, Pázmány Péter sétány 1/A, 1117 Budapest (Hungary)

    2016-02-15

    Radon (²²²Rn) gas is produced in the radioactive decay chain of uranium (²³⁸U), which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on the physical and meteorological parameters of the soil and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by geogenic radon potential (GRP). Identification of areas with high health risks requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. - Highlights: • A new method
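    The two-part structure of regression kriging described in this abstract (a regression trend plus kriged residuals, summed) can be sketched in plain Python as follows. This is a simplified illustration and not the authors' implementation: it assumes a single covariate instead of a multiple regression, ordinary kriging of the residuals, and an exponential covariance with arbitrary sill and range. All names and data are hypothetical.

```python
import math


def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_trend(z, y):
    """Least-squares intercept and slope of y on one covariate z."""
    n = len(z)
    mz, my = sum(z) / n, sum(y) / n
    slope = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y)) / sum(
        (zi - mz) ** 2 for zi in z
    )
    return my - slope * mz, slope


def cov(p, q, sill=1.0, corr_range=2.0):
    """Assumed exponential covariance between two 2-D points."""
    return sill * math.exp(-math.hypot(p[0] - q[0], p[1] - q[1]) / corr_range)


def krige_residual(coords, resid, target):
    """Ordinary kriging prediction of the residual at the target location."""
    n = len(coords)
    a = [[cov(coords[i], coords[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])  # unbiasedness constraint: weights sum to 1
    b = [cov(coords[i], target) for i in range(n)] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * r for w, r in zip(weights, resid))


def regression_kriging(coords, z, y, target, z_target):
    """Trend prediction plus kriged residual at the target location."""
    b0, b1 = fit_trend(z, y)
    resid = [yi - (b0 + b1 * zi) for zi, yi in zip(z, y)]
    return b0 + b1 * z_target + krige_residual(coords, resid, target)


# Hypothetical sample sites, covariate values, and measurements.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
z = [1.0, 2.0, 1.5, 3.0, 2.5]
y = [2.1, 3.9, 3.2, 6.3, 4.8]
prediction = regression_kriging(coords, z, y, (0.5, 0.5), 2.0)
print(prediction)
```

    Because no nugget effect is assumed, the sketch reproduces the measured value exactly at a sampled site. A real application would estimate the covariance (variogram) parameters from the residuals rather than fixing them.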

  8. Mapping geogenic radon potential by regression kriging

    International Nuclear Information System (INIS)

    Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos

    2016-01-01

    Radon (²²²Rn) gas is produced in the radioactive decay chain of uranium (²³⁸U), which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on the physical and meteorological parameters of the soil and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by geogenic radon potential (GRP). Identification of areas with high health risks requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. - Highlights: • A new method, regression

  9. Moderation analysis using a two-level regression model.

    Science.gov (United States)

    Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

    2014-10-01

    Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
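    To make the role of the product term in moderated multiple regression concrete, the sketch below uses the special case of a binary moderator. In that case the least-squares coefficient on the product term in y = b0 + b1*x + b2*m + b3*x*m equals the difference between the slopes fitted separately within each group. This is a plain least-squares illustration with made-up data; it is not the NML two-level estimator or the R package from the paper.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )


def interaction_coefficient(x, y, m):
    """Product-term coefficient b3 for a moderator m coded 0 or 1."""
    group0 = [(xi, yi) for xi, yi, mi in zip(x, y, m) if mi == 0]
    group1 = [(xi, yi) for xi, yi, mi in zip(x, y, m) if mi == 1]
    return slope(*zip(*group1)) - slope(*zip(*group0))


# Hypothetical data: slope 2 when m = 0 and slope 5 when m = 1.
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
m = [0, 0, 0, 1, 1, 1]
y = [3.0, 5.0, 7.0, 5.0, 10.0, 15.0]
b3 = interaction_coefficient(x, y, m)
print(b3)
```

    Read this way, the moderator shifts the regression coefficient itself, which is the idea the two-level model makes explicit by regressing coefficients on moderator variables.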

  10. Sub-pixel estimation of tree cover and bare surface densities using regression tree analysis

    Directory of Open Access Journals (Sweden)

    Carlos Augusto Zangrando Toneli

    2011-09-01

    Sub-pixel analysis is capable of generating continuous fields, which represent the spatial variability of certain thematic classes. The aim of this work was to develop numerical models to represent the variability of tree cover and bare surfaces within the study area. This research was conducted in the riparian buffer within a watershed of the São Francisco River in the North of Minas Gerais, Brazil. IKONOS and Landsat TM imagery were used with the GUIDE algorithm to construct the models. The results were two index images derived with regression trees for the entire study area, one representing tree cover and the other representing bare surface. The use of non-parametric and non-linear regression tree models presented satisfactory results to characterize wetland, deciduous and savanna patterns of forest formation.

  11. Perspectives on spatial data analysis

    CERN Document Server

    Rey, Sergio

    2010-01-01

    This book takes both a retrospective and prospective view of the field of spatial analysis by combining selected reprints of classic articles by Arthur Getis with current observations by leading experts in the field. Four main aspects are highlighted, dealing with spatial analysis, pattern analysis, local statistics as well as illustrative empirical applications. Researchers and students will gain an appreciation of Getis' methodological contributions to spatial analysis and the broad impact of the methods he has helped pioneer on an impressively broad array of disciplines including spatial epidemiology, demography, economics, and ecology. The volume is a compilation of high impact original contributions, as evidenced by citations, and the latest thinking on the field by leading scholars. This makes the book ideal for advanced seminars and courses in spatial analysis as well as a key resource for researchers seeking a comprehensive overview of recent advances and future directions in the field.

  12. Spatial-temporal analysis of wind power forecast errors for West-Coast Norway

    Energy Technology Data Exchange (ETDEWEB)

    Revheim, Paal Preede; Beyer, Hans Georg [Agder Univ. (UiA), Grimstad (Norway). Dept. of Engineering Sciences

    2012-07-01

    In this paper the spatial-temporal structure of forecast errors for wind power in West-Coast Norway is analyzed. Starting on the qualitative analysis of the forecast error reduction, with respect to single site data, for the lumped conditions of groups of sites the spatial and temporal correlations of the wind power forecast errors within and between the same groups are studied in detail. Based on this, time-series regression models to be used to analytically describe the error reduction are set up. The models give an expected reduction in forecast error between 48.4% and 49%. (orig.)

  13. Order-Constrained Reference Priors with Implications for Bayesian Isotonic Regression, Analysis of Covariance and Spatial Models

    Science.gov (United States)

    Gong, Maozhen

    Selecting an appropriate prior distribution is a fundamental issue in Bayesian Statistics. In this dissertation, under the framework provided by Berger and Bernardo, I derive the reference priors for several models which include: Analysis of Variance (ANOVA)/Analysis of Covariance (ANCOVA) models with a categorical variable under common ordering constraints, the conditionally autoregressive (CAR) models and the simultaneous autoregressive (SAR) models with a spatial autoregression parameter rho considered. The performances of reference priors for ANOVA/ANCOVA models are evaluated by simulation studies with comparisons to Jeffreys' prior and Least Squares Estimation (LSE). The priors are then illustrated in a Bayesian model of the "Risk of Type 2 Diabetes in New Mexico" data, where the relationship between the type 2 diabetes risk (through Hemoglobin A1c) and different smoking levels is investigated. In both simulation studies and real data set modeling, the reference priors that incorporate internal order information show good performances and can be used as default priors. The reference priors for the CAR and SAR models are also illustrated in the "1999 SAT State Average Verbal Scores" data with a comparison to a Uniform prior distribution. Due to the complexity of the reference priors for both CAR and SAR models, only a portion (12 states in the Midwest) of the original data set is considered. The reference priors can give a different marginal posterior distribution compared to a Uniform prior, which provides an alternative for prior specifications for areal data in Spatial statistics.

  14. Professional analysis in spatial planning

    Directory of Open Access Journals (Sweden)

    Andrej Černe

    2005-12-01

    Spatial analysis contributes to the accomplishment of the three basic aims of spatial planning: it is a basic element for setting spatial policies, concepts and strategies; it gives basic information to inhabitants, land owners, investors and planners; and it helps in carrying out spatial policies, strategies, plans, programmes and projects. Analyses in planning are generally devoted to: understanding current circumstances and emerging conditions within planning decisions; determining priorities of open questions and their solutions; and formulating general principles for further development.

  15. Two Paradoxes in Linear Regression Analysis

    Science.gov (United States)

    FENG, Ge; PENG, Jing; TU, Dongke; ZHENG, Julia Z.; FENG, Changyong

    2016-01-01

    Summary: Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection. PMID:28638214

  16. Logistic regression accuracy across different spatial and temporal scales for a wide-ranging species, the marbled murrelet

    Science.gov (United States)

    Carolyn B. Meyer; Sherri L. Miller; C. John Ralph

    2004-01-01

    The scale at which habitat variables are measured affects the accuracy of resource selection functions in predicting animal use of sites. We used logistic regression models for a wide-ranging species, the marbled murrelet (Brachyramphus marmoratus), in a large region in California to address how much changing the spatial or temporal scale of...

  17. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    Science.gov (United States)

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R² analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

  18. Identifying Flood-Related Infectious Diseases in Anhui Province, China: A Spatial and Temporal Analysis

    Science.gov (United States)

    Gao, Lu; Zhang, Ying; Ding, Guoyong; Liu, Qiyong; Jiang, Baofa

    2016-01-01

    The aim of this study was to explore infectious diseases related to the 2007 Huai River flood in Anhui Province, China. The study was based on the notified incidences of infectious diseases between June 29 and July 25 from 2004 to 2011. Daily incidences of notified diseases in 2007 were compared with the corresponding daily incidences during the same period in the other years (from 2004 to 2011, except 2007) by Poisson regression analysis. Spatial autocorrelation analysis was used to test the distribution pattern of the diseases. Spatial regression models were then performed to examine the association between the incidence of each disease and flood, considering lag effects and other confounders. After controlling the other meteorological and socioeconomic factors, malaria (odds ratio [OR] = 3.67, 95% confidence interval [CI] = 1.77–7.61), diarrhea (OR = 2.16, 95% CI = 1.24–3.78), and hepatitis A virus (HAV) infection (OR = 6.11, 95% CI = 1.04–35.84) were significantly related to the 2007 Huai River flood both from the spatial and temporal analyses. Special attention should be given to develop public health preparation and interventions with a focus on malaria, diarrhea, and HAV infection, in the study region. PMID:26903612

  19. Linear regression and sensitivity analysis in nuclear reactor design

    International Nuclear Information System (INIS)

    Kumar, Akansha; Tsvetkov, Pavel V.; McClarren, Ryan G.

    2015-01-01

    Highlights: • Presented a benchmark for the applicability of linear regression to complex systems. • Applied linear regression to a nuclear reactor power system. • Performed neutronics, thermal–hydraulics, and energy conversion using Brayton’s cycle for the design of a GCFBR. • Performed detailed sensitivity analysis to a set of parameters in a nuclear reactor power system. • Modeled and developed reactor design using MCNP, regression using R, and thermal–hydraulics in Java. - Abstract: The paper presents a general strategy applicable to sensitivity analysis (SA) and uncertainty quantification analysis (UA) of parameters related to a nuclear reactor design. This work also validates the use of linear regression (LR) for predictive analysis in a nuclear reactor design. The analysis helps to determine the parameters on which an LR model can be fit for predictive analysis. For those parameters, a regression surface is created based on trial data and predictions are made using this surface. A general strategy of SA to determine and identify the influential parameters that affect the operation of the reactor is described. Identification of design parameters and validation of the linearity assumption for the application of LR to reactor design, based on a set of tests, is performed. The testing methods used to determine the behavior of the parameters can be used as a general strategy for UA and SA of nuclear reactor models and thermal–hydraulics calculations. A design of a gas cooled fast breeder reactor (GCFBR), with thermal–hydraulics and energy transfer, has been used for the demonstration of this method. MCNP6 is used to simulate the GCFBR design and perform the necessary criticality calculations. Java is used to build and run input samples and to extract data from the output files of MCNP6, and R is used to perform regression analysis and other multivariate variance, and analysis of the collinearity of data

  20. Least Squares Adjustment: Linear and Nonlinear Weighted Regression Analysis

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg

    2007-01-01

    This note primarily describes the mathematics of least squares regression analysis as it is often used in geodesy including land surveying and satellite positioning applications. In these fields regression is often termed adjustment. The note also contains a couple of typical land surveying...... and satellite positioning application examples. In these application areas we are typically interested in the parameters in the model typically 2- or 3-D positions and not in predictive modelling which is often the main concern in other regression analysis applications. Adjustment is often used to obtain...... the clock error) and to obtain estimates of the uncertainty with which the position is determined. Regression analysis is used in many other fields of application both in the natural, the technical and the social sciences. Examples may be curve fitting, calibration, establishing relationships between...
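
    A minimal sketch of the kind of weighted least squares adjustment the note describes, reduced to a straight-line model. The function name `weighted_line_fit`, the sample observations and the weights are illustrative assumptions, not taken from the note. The weights are assumed to be inverse observation variances, in which case the inverse of the normal matrix gives the variances of the estimated parameters.

```python
def weighted_line_fit(x, y, w):
    """Fit y = a + b*x by minimising sum(w_i * (y_i - a - b*x_i)**2).

    Returns (a, b, var_a, var_b). The variances are the diagonal of the
    inverse normal matrix, valid when w_i = 1 / sigma_i**2 (assumption).
    """
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2          # determinant of the normal matrix
    a = (swxx * swy - swx * swxy) / det
    b = (sw * swxy - swx * swy) / det
    return a, b, swxx / det, sw / det

# Hypothetical observations lying exactly on y = 1 + 2x, with unequal weights.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
w = [1.0, 2.0, 1.0, 4.0]
a, b, var_a, var_b = weighted_line_fit(x, y, w)
```

    In adjustment problems the interest is in `a`, `b` and their variances rather than in predicting new `y` values, which matches the emphasis of the note.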

  1. Application of geographically-weighted regression analysis to assess risk factors for malaria hotspots in Keur Soce health and demographic surveillance site.

    Science.gov (United States)

    Ndiath, Mansour M; Cisse, Badara; Ndiaye, Jean Louis; Gomis, Jules F; Bathiery, Ousmane; Dia, Anta Tal; Gaye, Oumar; Faye, Babacar

    2015-11-18

    In Senegal, considerable efforts have been made to reduce malaria morbidity and mortality during the last decade. This resulted in a marked decrease of malaria cases. With the decline of malaria cases, transmission has become sparse in most Senegalese health districts. This study investigated malaria hotspots in Keur Soce sites by using geographically-weighted regression. Because of the occurrence of hotspots, spatial modelling of malaria cases could have a considerable effect in disease surveillance. This study explored and analysed the spatial relationships between malaria occurrence and socio-economic and environmental factors in small communities in Keur Soce, Senegal, using 6 months of passive surveillance. Geographically-weighted regression was used to explore the spatial variability of relationships between malaria incidence or persistence and the selected socio-economic and human predictors. A comparison between the ordinary least squares (OLS) and geographically-weighted regression models was also explored. A vector dataset (spatial) of the study area at village level and statistical data (non-spatial) on confirmed malaria cases, socio-economic status (bed net use), population data (size of the household) and environmental factors (temperature, rainfall) were used in this exploratory analysis. ArcMap 10.2 and Stata 11 were used to perform malaria hotspots analysis. From June to December, a total of 408 confirmed malaria cases were notified. The explanatory variables (household size, housing materials, sleeping rooms, sheep and distance to breeding site) returned significant t values of -0.25, 2.3, 4.39, 1.25 and 2.36, respectively. The OLS global model explained about 70% (adjusted R² = 0.70) of the variation in malaria occurrence with AIC = 756.23. The geographically-weighted regression of malaria hotspots resulted in coefficient intercept ranging from 1.89 to 6.22 with a median of 3.5.
Large positive values are distributed mainly in the southeast
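
    The contrast between a single global OLS fit and geographically-weighted regression in the record above can be sketched as follows. This is a simplified illustration with one predictor and a Gaussian distance kernel; the function name `gwr_local_fits`, the bandwidth and the toy coordinates are assumptions and have nothing to do with the Keur Soce data or with the ArcMap/Stata workflow used in the study.

```python
import math

def gwr_local_fits(coords, x, y, bandwidth):
    """Return one (intercept, slope) pair per location.

    Each pair comes from a weighted regression of y on x in which
    observations are weighted by a Gaussian kernel of their distance
    to the location being fitted.
    """
    fits = []
    for u0, v0 in coords:
        w = [math.exp(-0.5 * (math.hypot(u - u0, v - v0) / bandwidth) ** 2)
             for u, v in coords]
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swy = sum(wi * yi for wi, yi in zip(w, y))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
        det = sw * swxx - swx ** 2
        slope = (sw * swxy - swx * swy) / det
        intercept = (swxx * swy - swx * swxy) / det
        fits.append((intercept, slope))
    return fits

# Toy data with a spatially constant relationship y = 1 + 2x.
coords = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2)]
x = [1.0, 2.0, 4.0, 3.0, 5.0]
y = [1.0 + 2.0 * xi for xi in x]
fits = gwr_local_fits(coords, x, y, bandwidth=1.0)
```

    When the relationship does not vary over space, every local fit reproduces the global coefficients; a spread of local intercepts such as the one reported in the abstract is what indicates spatial variation.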

  2. Analysis of the impact of immigration on labour market using spatial models

    Science.gov (United States)

    Polonyankina, Tatiana

    2017-07-01

    This paper investigates the impact of immigration on employment and unemployment in a host country. The question to answer is: how does employment/unemployment in the host country change after an increase in the number of immigrants? The analysis takes into account only legal immigrants in a recession period. The model combines classical regression on cross-sectional data with spatial econometric models in which cross-sectional dependencies are captured by a spatial matrix. The intention is to use spatial models to analyse the sensitivity of the employment/unemployment rate to a change in the share of immigration in a region. The panel data used are based on the Labour Force Survey and on available macro data from Eurostat for three European countries (Germany, Austria and the Czech Republic), grouped into cells by NUTS regions in a recession period.

  3. Design and analysis of experiments classical and regression approaches with SAS

    CERN Document Server

    Onyiah, Leonard C

    2008-01-01

    Introductory Statistical Inference and Regression Analysis Elementary Statistical Inference Regression Analysis Experiments, the Completely Randomized Design (CRD)-Classical and Regression Approaches Experiments Experiments to Compare Treatments Some Basic Ideas Requirements of a Good Experiment One-Way Experimental Layout or the CRD: Design and Analysis Analysis of Experimental Data (Fixed Effects Model) Expected Values for the Sums of Squares The Analysis of Variance (ANOVA) Table Follow-Up Analysis to Check fo

  4. Sparse Regression by Projection and Sparse Discriminant Analysis

    KAUST Repository

    Qi, Xin

    2015-04-03

    © 2015, © American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America. Recent years have seen active developments of various penalized regression methods, such as LASSO and elastic net, to analyze high-dimensional data. In these approaches, the direction and length of the regression coefficients are determined simultaneously. Due to the introduction of penalties, the length of the estimates can be far from being optimal for accurate predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths and the tuning parameters are determined by a cross-validation procedure to achieve the largest prediction accuracy. We provide a theoretical result for simultaneous model selection consistency and parameter estimation consistency of our method in high dimension. This new framework is then generalized such that it can be applied to principal components analysis, partial least squares, and canonical correlation analysis. We also adapt this framework for discriminant analysis. Compared with the existing methods, where there is relatively little control of the dependency among the sparse components, our method can control the relationships among the components. We present efficient algorithms and related theory for solving the sparse regression by projection problem. Based on extensive simulations and real data analysis, we demonstrate that our method achieves good predictive performance and variable selection in the regression setting, and the ability to control relationships between the sparse components leads to more accurate classification. In supplementary materials available online, the details of the algorithms and theoretical proofs, and R codes for all simulation studies are provided.

  5. The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard

    and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences...... within a nonparametric panel data regression framework. The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric......This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...

  6. Simulation Experiments in Practice: Statistical Design and Regression Analysis

    OpenAIRE

    Kleijnen, J.P.C.

    2007-01-01

    In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic DOE and regression analysis assume a single simulation response that is normally and independen...

  7. General Nature of Multicollinearity in Multiple Regression Analysis.

    Science.gov (United States)

    Liu, Richard

    1981-01-01

    Discusses multiple regression, a very popular statistical technique in the field of education. One of the basic assumptions in regression analysis requires that independent variables in the equation should not be highly correlated. The problem of multicollinearity and some of the solutions to it are discussed. (Author)
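
    One common way to quantify the multicollinearity discussed in this record is the variance inflation factor (VIF). The sketch below covers only the two-predictor case, where the VIF reduces to 1 / (1 - r²) with r the correlation between the predictors; the function names and the sample values are illustrative assumptions. With more predictors, r² is replaced by the R² from regressing each predictor on all the others.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

def vif_two_predictors(x1, x2):
    """Variance inflation factor shared by both predictors in a 2-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical predictors with correlation 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_predictors(x1, x2)   # 1 / (1 - 0.64)
```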

  8. Spatial analysis and planning under imprecision

    CERN Document Server

    Leung, Y

    1988-01-01

    The book deals with complexity, imprecision, human valuation, and uncertainty in spatial analysis and planning, providing a systematic exposure of a new philosophical and theoretical foundation for spatial analysis and planning under imprecision. Regional concepts and regionalization, spatial preference-utility-choice structures, spatial optimization with single and multiple objectives, dynamic spatial systems and their controls are analyzed in sequence.The analytical framework is based on fuzzy set theory. Basic concepts of fuzzy set theory are first discussed. Many numerical examples and emp

  9. Brillouin Scattering Spectrum Analysis Based on Auto-Regressive Spectral Estimation

    Science.gov (United States)

    Huang, Mengyun; Li, Wei; Liu, Zhangyun; Cheng, Linghao; Guan, Bai-Ou

    2018-06-01

    Auto-regressive (AR) spectral estimation technology is proposed to analyze the Brillouin scattering spectrum in Brillouin optical time-domain reflectometry. The results show that the AR-based method can reliably estimate the Brillouin frequency shift with an accuracy much better than fast Fourier transform (FFT) based methods, provided the data length is not too short. It enables about a 3 times improvement over FFT at a moderate spatial resolution.
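
    To illustrate the general idea of AR spectral estimation, the sketch below fits an AR(2) model with the Yule-Walker equations and reads the dominant frequency off the angle of the model's complex poles. The abstract does not say which AR estimator or model order the authors used, so the Yule-Walker choice, the order, the function names and the synthetic test signal are all assumptions.

```python
import math

def autocov(x, lag):
    """Biased sample autocovariance at the given lag (mean removed)."""
    n = len(x)
    m = sum(x) / n
    return sum((x[i] - m) * (x[i + lag] - m) for i in range(n - lag)) / n

def ar2_pole_frequency(x):
    """Frequency (cycles per sample) of the complex pole pair of a
    Yule-Walker AR(2) fit: x[n] = a1*x[n-1] + a2*x[n-2] + e[n]."""
    r0, r1, r2 = autocov(x, 0), autocov(x, 1), autocov(x, 2)
    p1, p2 = r1 / r0, r2 / r0
    a2 = (p2 - p1 * p1) / (1.0 - p1 * p1)
    a1 = p1 * (1.0 - a2)
    if a1 * a1 + 4.0 * a2 >= 0.0:
        raise ValueError("AR(2) poles are real; no oscillatory component found")
    theta = math.acos(a1 / (2.0 * math.sqrt(-a2)))   # pole angle in radians
    return theta / (2.0 * math.pi)

# Synthetic sinusoid at 0.1 cycles per sample, 200 samples.
signal = [math.cos(2.0 * math.pi * 0.1 * n) for n in range(200)]
f_hat = ar2_pole_frequency(signal)
```

    The estimate is a continuous value rather than one of a fixed grid of FFT bins, which is one reason AR methods can locate a spectral peak more finely than a plain FFT of the same record.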

  10. Brillouin Scattering Spectrum Analysis Based on Auto-Regressive Spectral Estimation

    Science.gov (United States)

    Huang, Mengyun; Li, Wei; Liu, Zhangyun; Cheng, Linghao; Guan, Bai-Ou

    2018-03-01

    Auto-regressive (AR) spectral estimation technology is proposed to analyze the Brillouin scattering spectrum in Brillouin optical time-domain reflectometry. The results show that the AR-based method can reliably estimate the Brillouin frequency shift with an accuracy much better than fast Fourier transform (FFT) based methods, provided the data length is not too short. It enables about a 3 times improvement over FFT at a moderate spatial resolution.

  11. On logistic regression analysis of dichotomized responses.

    Science.gov (United States)

    Lu, Kaifeng

    2017-01-01

    We study the properties of treatment effect estimate in terms of odds ratio at the study end point from logistic regression model adjusting for the baseline value when the underlying continuous repeated measurements follow a multivariate normal distribution. Compared with the analysis that does not adjust for the baseline value, the adjusted analysis produces a larger treatment effect as well as a larger standard error. However, the increase in standard error is more than offset by the increase in treatment effect so that the adjusted analysis is more powerful than the unadjusted analysis for detecting the treatment effect. On the other hand, the true adjusted odds ratio implied by the normal distribution of the underlying continuous variable is a function of the baseline value and hence is unlikely to be able to be adequately represented by a single value of adjusted odds ratio from the logistic regression model. In contrast, the risk difference function derived from the logistic regression model provides a reasonable approximation to the true risk difference function implied by the normal distribution of the underlying continuous variable over the range of the baseline distribution. We show that different metrics of treatment effect have similar statistical power when evaluated at the baseline mean. Copyright © 2016 John Wiley & Sons, Ltd.

  12. Spatial correlation in Bayesian logistic regression with misclassification

    DEFF Research Database (Denmark)

    Bihrmann, Kristine; Toft, Nils; Nielsen, Søren Saxmose

    2014-01-01

    Standard logistic regression assumes that the outcome is measured perfectly. In practice, this is often not the case, which could lead to biased estimates if not accounted for. This study presents Bayesian logistic regression with adjustment for misclassification of the outcome applied to data...

  13. Spatial analysis statistics, visualization, and computational methods

    CERN Document Server

    Oyana, Tonny J

    2015-01-01

    An introductory text for the next generation of geospatial analysts and data scientists, Spatial Analysis: Statistics, Visualization, and Computational Methods focuses on the fundamentals of spatial analysis using traditional, contemporary, and computational methods. Outlining both non-spatial and spatial statistical concepts, the authors present practical applications of geospatial data tools, techniques, and strategies in geographic studies. They offer a problem-based learning (PBL) approach to spatial analysis, containing hands-on problem sets that can be worked out in MS Excel or ArcGIS, as well as detailed illustrations and numerous case studies. The book enables readers to: identify types and characterize non-spatial and spatial data; demonstrate their competence to explore, visualize, summarize, analyze, optimize, and clearly present statistical data and results; construct testable hypotheses that require inferential statistical analysis; process spatial data, extract explanatory variables, conduct statisti...

  14. Spatial analysis of Schistosomiasis in Hubei Province, China: a GIS-based analysis of Schistosomiasis from 2009 to 2013.

    Directory of Open Access Journals (Sweden)

    Yan-Yan Chen

    Schistosomiasis remains a major public health problem in China. The major endemic areas are located in the lake and marshland regions of southern China, particularly in areas along the middle and lower reaches of the Yangtze River. Spatial analytical techniques are often used in epidemiology to identify spatial clusters in disease regions. This study assesses the spatial distribution of schistosomiasis and explores high-risk regions in Hubei Province, China to provide guidance on schistosomiasis control in marshland regions. In this study, spatial autocorrelation methodologies, including global Moran's I and local Getis-Ord statistics, were utilized to describe and map spatial clusters and areas where human Schistosoma japonicum infection is prevalent at the county level in Hubei Province. In addition, a linear logistic regression model was used to determine the characteristics of spatial autocorrelation with time. The infection rates of S. japonicum decreased from 2009 to 2013. The global autocorrelation analysis results on the infection rate of S. japonicum for five years showed statistical significance (Moran's I > 0, P < 0.01), which suggested that spatial clusters were present in the distribution of S. japonicum infection from 2009 to 2013. Local autocorrelation analysis results showed that the number of highly aggregated areas ranged from eight to eleven within the five-year analysis period. The highly aggregated areas were mainly distributed in eight counties. The spatial distribution of human S. japonicum infections did not exhibit a temporal change at the county level in Hubei Province. The risk factors that influence human S. japonicum transmission may not have changed after achieving the national criterion of infection control. The findings indicated that spatial-temporal surveillance of S. japonicum transmission plays a significant role in schistosomiasis control.
Timely and integrated prevention should be continued, especially in the Yangtze
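
    The global Moran's I statistic used in the record above can be computed directly from its definition. The sketch below uses a hypothetical row of four areas with binary contiguity weights; the function name `morans_i`, the weights and the values are illustrative assumptions, not data from the study.

```python
def morans_i(values, weights):
    """Global Moran's I for area values and a square spatial weights matrix.

    I = (n / S0) * sum_ij w_ij * d_i * d_j / sum_i d_i**2,
    where d_i is the deviation from the mean and S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)

# Four areas in a row; each area neighbours the one(s) next to it.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
clustered = morans_i([1, 2, 3, 4], W)      # similar values adjacent: I > 0
alternating = morans_i([1, -1, 1, -1], W)  # dissimilar values adjacent: I < 0
```

    A positive value, as reported for the infection rates, means that neighbouring areas tend to have similar values; significance testing (for example by permutation) is a separate step not shown here.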

  15. Spatial and temporal epidemiological analysis in the Big Data era.

    Science.gov (United States)

    Pfeiffer, Dirk U; Stevens, Kim B

    2015-11-01

    Concurrent with global economic development in the last 50 years, the opportunities for the spread of existing diseases and emergence of new infectious pathogens, have increased substantially. The activities associated with the enormously intensified global connectivity have resulted in large amounts of data being generated, which in turn provides opportunities for generating knowledge that will allow more effective management of animal and human health risks. This so-called Big Data has, more recently, been accompanied by the Internet of Things which highlights the increasing presence of a wide range of sensors, interconnected via the Internet. Analysis of this data needs to exploit its complexity, accommodate variation in data quality and should take advantage of its spatial and temporal dimensions, where available. Apart from the development of hardware technologies and networking/communication infrastructure, it is necessary to develop appropriate data management tools that make this data accessible for analysis. This includes relational databases, geographical information systems and most recently, cloud-based data storage such as Hadoop distributed file systems. While the development in analytical methodologies has not quite caught up with the data deluge, important advances have been made in a number of areas, including spatial and temporal data analysis where the spectrum of analytical methods ranges from visualisation and exploratory analysis, to modelling. While there used to be a primary focus on statistical science in terms of methodological development for data analysis, the newly emerged discipline of data science is a reflection of the challenges presented by the need to integrate diverse data sources and exploit them using novel data- and knowledge-driven modelling methods while simultaneously recognising the value of quantitative as well as qualitative analytical approaches. 
Machine learning regression methods, which are more robust and can handle

  16. On two flexible methods of 2-dimensional regression analysis

    Czech Academy of Sciences Publication Activity Database

    Volf, Petr

    2012-01-01

    Vol. 18, No. 4 (2012), pp. 154-164. ISSN 1803-9782. Grant - others: GA ČR (CZ) GAP209/10/2045. Institutional support: RVO:67985556. Keywords: regression analysis; Gordon surface; prediction error; projection pursuit. Subject RIV: BB - Applied Statistics, Operational Research. http://library.utia.cas.cz/separaty/2013/SI/volf-on two flexible methods of 2-dimensional regression analysis.pdf

  17. Comparison of multinomial logistic regression and logistic regression: which is more efficient in allocating land use?

    Science.gov (United States)

    Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun

    2014-12-01

    Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.

  18. Disease Mapping and Regression with Count Data in the Presence of Overdispersion and Spatial Autocorrelation: A Bayesian Model Averaging Approach

    Science.gov (United States)

    Mohebbi, Mohammadreza; Wolfe, Rory; Forbes, Andrew

    2014-01-01

    This paper applies the generalised linear model for modelling geographical variation to esophageal cancer incidence data in the Caspian region of Iran. The data have a complex and hierarchical structure that makes them suitable for hierarchical analysis using Bayesian techniques, but with care required to deal with problems arising from counts of events observed in small geographical areas when overdispersion and residual spatial autocorrelation are present. These considerations lead to nine regression models derived from using three probability distributions for count data: Poisson, generalised Poisson and negative binomial, and three different autocorrelation structures. We employ the framework of Bayesian variable selection and a Gibbs sampling based technique to identify significant cancer risk factors. The framework deals with situations where the number of possible models based on different combinations of candidate explanatory variables is large enough such that calculation of posterior probabilities for all models is difficult or infeasible. The evidence from applying the modelling methodology suggests that modelling strategies based on the use of generalised Poisson and negative binomial with spatial autocorrelation work well and provide a robust basis for inference. PMID:24413702

  19. Recent developments in spatial analysis spatial statistics, behavioural modelling, and computational intelligence

    CERN Document Server

    Getis, Arthur

    1997-01-01

    In recent years, spatial analysis has become an increasingly active field, as evidenced by the establishment of educational and research programs at many universities. Its popularity is due mainly to new technologies and the development of spatial data infrastructures. This book illustrates some recent developments in spatial analysis, behavioural modelling, and computational intelligence. World-renowned spatial analysts explain and demonstrate their new and insightful models and methods. The applications are in areas of societal interest such as the spread of infectious diseases, migration behaviour, and retail and agricultural location strategies. In addition, there is emphasis on the uses of new technologies for the analysis of spatial data through the application of neural network concepts.

  20. [Spatial heterogeneity in body condition of small yellow croaker in Yellow Sea and East China Sea based on mixed-effects model and quantile regression analysis].

    Science.gov (United States)

    Liu, Zun-Lei; Yuan, Xing-Wei; Yan, Li-Ping; Yang, Lin-Lin; Cheng, Jia-Hua

    2013-09-01

    By using the 2008-2010 investigation data about the body condition of small yellow croaker in the offshore waters of southern Yellow Sea (SYS), open waters of northern East China Sea (NECS), and offshore waters of middle East China Sea (MECS), this paper analyzed the spatial heterogeneity of body length-body mass of juvenile and adult small yellow croakers by the statistical approaches of mean regression model and quantile regression model. The results showed that the residual standard errors from the analysis of covariance (ANCOVA) and the linear mixed-effects model were similar, and those from the simple linear regression were the highest. For the juvenile small yellow croakers, their mean body mass in SYS and NECS estimated by the mixed-effects mean regression model was higher than the overall average mass across the three regions, while the mean body mass in MECS was below the overall average. For the adult small yellow croakers, their mean body mass in NECS was higher than the overall average, while the mean body mass in SYS and MECS was below the overall average. The results from quantile regression indicated the substantial differences in the allometric relationships of juvenile small yellow croakers between SYS, NECS, and MECS, with the estimated mean exponent of the allometric relationship in SYS being 2.85, and the interquartile range being from 2.63 to 2.96, which indicated the heterogeneity of body form. The results from ANCOVA showed that the allometric body length-body mass relationships were significantly different between the 25th and 75th percentile exponent values (F=6.38, df=1737, P<0.01) and the 25th percentile and median exponent values (F=2.35, df=1737, P=0.039). The relationship was marginally different between the median and 75th percentile exponent values (F=2.21, df=1737, P=0.051). The estimated body length-body mass exponent of adult small yellow croakers in SYS was 3.01 (10th and 95th percentiles = 2.77 and 3.1, respectively). The

  1. Resting-state functional magnetic resonance imaging: the impact of regression analysis.

    Science.gov (United States)

    Yeh, Chia-Jung; Tseng, Yu-Sheng; Lin, Yi-Ru; Tsai, Shang-Yueh; Huang, Teng-Yi

    2015-01-01

    To investigate the impact of regression methods on resting-state functional magnetic resonance imaging (rsfMRI). During rsfMRI preprocessing, regression analysis is considered effective for reducing the interference of physiological noise on the signal time course. However, it is unclear whether the regression method benefits rsfMRI analysis. Twenty volunteers (10 men and 10 women; aged 23.4 ± 1.5 years) participated in the experiments. We used node analysis and functional connectivity mapping to assess the brain default mode network by using five combinations of regression methods. The results show that regressing the global mean plays a major role in the preprocessing steps. When a global regression method is applied, the values of functional connectivity are significantly lower (P ≤ .01) than those calculated without a global regression. This step increases inter-subject variation and produces anticorrelated brain areas. rsfMRI data processed using regression should be interpreted carefully. The significance of the anticorrelated brain areas produced by global signal removal is unclear. Copyright © 2014 by the American Society of Neuroimaging.
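
    A minimal sketch of global signal regression, the preprocessing step this record singles out. Each time course is regressed on the mean time course across all series and only the residual is kept. The function name `regress_out_global` and the three toy time courses are assumptions; real rsfMRI pipelines operate on many voxels and usually include further nuisance regressors.

```python
def regress_out_global(series):
    """Remove the global mean signal from each time course.

    series: list of equal-length time courses, one per voxel or region.
    Returns the residuals of an ordinary least squares fit (intercept plus
    slope) of each time course on the across-series mean time course.
    """
    n_series, n_t = len(series), len(series[0])
    g = [sum(s[t] for s in series) / n_series for t in range(n_t)]
    g_mean = sum(g) / n_t
    g_var = sum((gt - g_mean) ** 2 for gt in g)
    cleaned = []
    for s in series:
        s_mean = sum(s) / n_t
        beta = sum((gt - g_mean) * (st - s_mean)
                   for gt, st in zip(g, s)) / g_var
        alpha = s_mean - beta * g_mean
        cleaned.append([st - alpha - beta * gt for gt, st in zip(g, s)])
    return cleaned

# Three hypothetical time courses with four time points each.
series = [[1.0, 2.0, 4.0, 3.0],
          [2.0, 1.0, 3.0, 5.0],
          [0.0, 3.0, 2.0, 4.0]]
cleaned = regress_out_global(series)
```

    In this sketch the residuals sum to zero across series at every time point, so some pairs of cleaned time courses are pushed toward negative correlation. That algebraic constraint is one reason the anticorrelated areas mentioned in the abstract are hard to interpret.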

  2. Development of a User Interface for a Regression Analysis Software Tool

    Science.gov (United States)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    An easy-to-use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed such that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface's design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and graphical user interface are discussed in order to illustrate the user interface's overall design approach.

  3. Method for nonlinear exponential regression analysis

    Science.gov (United States)

    Junkin, B. G.

    1972-01-01

    Two computer programs developed according to two general types of exponential models for conducting nonlinear exponential regression analysis are described. Least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. Program is written in FORTRAN 5 for the Univac 1108 computer.
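    The linearization idea in this record (expand the model in a first-order Taylor series, then solve a linear least-squares problem for the parameter corrections) is what is now usually called Gauss-Newton iteration. The sketch below applies it to an assumed single-term model y = a*exp(b*x); the model form, function name, starting values and data are illustrative assumptions, not the original FORTRAN programs.

```python
import math


def fit_exponential(xs, ys, a, b, n_iter=20):
    """Gauss-Newton fit of y = a * exp(b * x), starting from the guess (a, b)."""
    for _ in range(n_iter):
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e              # residual at the current parameters
            ja, jb = e, a * x * e      # partial derivatives w.r.t. a and b
            s_aa += ja * ja
            s_ab += ja * jb
            s_bb += jb * jb
            g_a += ja * r
            g_b += jb * r
        # Solve the 2x2 normal equations (J^T J) delta = J^T r.
        det = s_aa * s_bb - s_ab * s_ab
        a += (s_bb * g_a - s_ab * g_b) / det
        b += (s_aa * g_b - s_ab * g_a) / det
    return a, b


# Hypothetical noise-free data generated from a = 2, b = 0.5.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_hat, b_hat = fit_exponential(xs, ys, a=1.5, b=0.4)
```

    Plain Gauss-Newton has no step control, so a poor starting guess can diverge; a log-linear fit of ln(y) on x is a common way to obtain starting values.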

  4. Regression of uveal malignant melanomas following cobalt-60 plaque. Correlates between acoustic spectrum analysis and tumor regression

    International Nuclear Information System (INIS)

    Coleman, D.J.; Lizzi, F.L.; Silverman, R.H.; Ellsworth, R.M.; Haik, B.G.; Abramson, D.H.; Smith, M.E.; Rondeau, M.J.

    1985-01-01

    Parameters derived from computer analysis of digital radio-frequency (rf) ultrasound scan data of untreated uveal malignant melanomas were examined for correlations with tumor regression following cobalt-60 plaque. Parameters included tumor height, normalized power spectrum and acoustic tissue type (ATT). Acoustic tissue type was based upon discriminant analysis of tumor power spectra, with spectra of tumors of known pathology serving as a model. Results showed ATT to be correlated with tumor regression during the first 18 months following treatment. Tumors with ATT associated with spindle cell malignant melanoma showed over twice the percentage reduction in height as those with ATT associated with mixed/epithelioid melanomas. Pre-treatment height was only weakly correlated with regression. Additionally, significant spectral changes were observed following treatment. Ultrasonic spectrum analysis thus provides a noninvasive tool for classification, prediction and monitoring of tumor response to cobalt-60 plaque

  5. A Skew-t space-varying regression model for the spectral analysis of resting state brain activity.

    Science.gov (United States)

    Ismail, Salimah; Sun, Wenqi; Nathoo, Farouk S; Babul, Arif; Moiseev, Alexader; Beg, Mirza Faisal; Virji-Babul, Naznin

    2013-08-01

    It is known that in many neurological disorders such as Down syndrome, main brain rhythms shift their frequencies slightly, and characterizing the spatial distribution of these shifts is of interest. This article reports on the development of a Skew-t mixed model for the spatial analysis of resting state brain activity in healthy controls and individuals with Down syndrome. Time series of oscillatory brain activity are recorded using magnetoencephalography, and spectral summaries are examined at multiple sensor locations across the scalp. We focus on the mean frequency of the power spectral density, and use space-varying regression to examine associations with age, gender and Down syndrome across several scalp regions. Spatial smoothing priors are incorporated based on a multivariate Markov random field, and the markedly non-Gaussian nature of the spectral response variable is accommodated by the use of a Skew-t distribution. A range of models representing different assumptions on the association structure and response distribution are examined, and we conduct model selection using the deviance information criterion. Our analysis suggests region-specific differences between healthy controls and individuals with Down syndrome, particularly in the left and right temporal regions, and produces smoothed maps indicating the scalp topography of the estimated differences.

  6. Detecting overdispersion in count data: A zero-inflated Poisson regression analysis

    Science.gov (United States)

    Afiqah Muhamad Jamil, Siti; Asrul Affendi Abdullah, M.; Kek, Sie Long; Nor, Maria Elena; Mohamed, Maryati; Ismail, Norradihah

    2017-09-01

    This study analyses count data on butterfly communities in Jasin, Melaka. For a count dependent variable, the Poisson regression model is known as the benchmark model for regression analysis. Building on previous literature that used Poisson regression analysis, this study uses zero-inflated Poisson (ZIP) regression analysis to analyse the count data of butterfly communities in Jasin, Melaka more precisely. When the data contain extra zeros, Poisson regression should be abandoned in favour of count data models that take the extra zeros into account explicitly; one of the most popular of these is the ZIP regression model. The butterfly community data, referred to as the number of subjects in this study, were collected in Jasin, Melaka and comprised 131 subject visits to Jasin, Melaka. The data set covers five families of butterfly, which represent the five variables (the types of subjects) involved in the analysis. The ZIP analysis used the SAS procedure for overdispersion in analysing zero values, and the main purpose of continuing the previous study is to compare which model performs better when zero values exist in the observed count data. The analysis used AIC, BIC and the Vuong test at the 5% significance level to achieve the objectives. The findings indicate the presence of overdispersion in analysing zero values. The ZIP regression model is better than the Poisson regression model when zero values exist.
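    To make the zero-inflation idea concrete, the sketch below writes out the standard ZIP probability mass function and its mean and variance. The mixing weight `pi`, the rate `lam` and the numbers used are illustrative assumptions, not estimates from the butterfly data.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing count k with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: an extra point mass pi at zero,
    mixed with a Poisson(lam) component of weight (1 - pi)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def zip_mean_var(pi, lam):
    """Closed-form mean and variance of the ZIP distribution."""
    mean = (1.0 - pi) * lam
    var = (1.0 - pi) * lam * (1.0 + pi * lam)
    return mean, var


# Hypothetical parameters: 30% structural zeros, Poisson rate 2.
mean, var = zip_mean_var(0.3, 2.0)
total_mass = sum(zip_pmf(k, 0.3, 2.0) for k in range(60))
```

    Whenever `pi` is positive the variance exceeds the mean, which is the overdispersion relative to a plain Poisson model that the record refers to.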

  7. Spatial Durbin model analysis macroeconomic loss due to natural disasters

    Science.gov (United States)

    Kusrini, D. E.; Mukhtasor

    2015-03-01

    The magnitude of the damage and losses caused by natural disasters is huge for Indonesia; this study therefore aimed to analyze the effects of natural disasters on the macroeconomic losses that occurred in 115 cities/districts across Java during 2012. Previous studies suggest that spatial dependence is present in this case, so the analysis is performed using a spatial regression approach, namely the Spatial Durbin Model (SDM). The significant predictor variable is population, and the predictor variable with a significant spatial weighting is the number of disaster occurrences, i.e., disasters in one region have an impact on neighboring regions. Moran's I computed with Queen contiguity weights also showed significant results, meaning that the incidence of disasters in one region will decrease the value of GDP in other regions.
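    For readers unfamiliar with the statistic, the sketch below computes global Moran's I from a vector of values and a spatial weights matrix. The four-region chain and its binary adjacency weights are a hypothetical stand-in for a real contiguity matrix such as the Queen contiguity weights mentioned in this record.

```python
def morans_i(values, weights):
    """Global Moran's I for values[i] with spatial weights[i][j]."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)          # sum of all weights
    num = sum(weights[i][j] * z[i] * z[j]
              for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * (num / den)


# Hypothetical example: four regions in a row, neighbors share a border.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
trend = morans_i([1.0, 2.0, 3.0, 4.0], w)          # similar neighbors
checker = morans_i([1.0, -1.0, 1.0, -1.0], w)      # alternating neighbors
```

    A smooth trend gives a positive value and an alternating pattern a negative one; judging significance would additionally need a permutation or normal-approximation test, which is not shown here.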

  8. An Analysis of Bank Service Satisfaction Based on Quantile Regression and Grey Relational Analysis

    Directory of Open Access Journals (Sweden)

    Wen-Tsao Pan

    2016-01-01

    Bank service satisfaction is vital to the success of a bank. In this paper, we propose to use the grey relational analysis to gauge the levels of service satisfaction of the banks. With the grey relational analysis, we compared the effects of different variables on service satisfaction. We gave ranks to the banks according to their levels of service satisfaction. We further used the quantile regression model to find the variables that affected the satisfaction of a customer at a specific quantile of satisfaction level. The result of the quantile regression analysis provided a bank manager with information to formulate policies to further promote satisfaction of the customers at different quantiles of satisfaction level. We also compared the prediction accuracies of the regression models at different quantiles. The experiment result showed that, among the seven quantile regression models, the median regression model has the best performance in terms of RMSE, RTIC, and CE performance measures.
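    Quantile regression replaces the squared-error loss with the asymmetric "pinball" (check) loss. The sketch below shows that loss for the simplest case, a constant prediction, where the minimizer is a sample quantile. The function names and the toy scores are hypothetical and unrelated to the bank data.

```python
def pinball_loss(y, pred, tau):
    """Sum of check losses for a constant prediction at quantile level tau."""
    return sum(tau * (yi - pred) if yi >= pred else (1.0 - tau) * (pred - yi)
               for yi in y)


def best_constant(y, tau):
    """Observed value that minimizes the pinball loss (a sample quantile)."""
    return min(sorted(y), key=lambda c: pinball_loss(y, c, tau))


# Hypothetical scores with one extreme value.
scores = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(scores, 0.5)
lower_fit = best_constant(scores, 0.25)
```

    The median fit is unaffected by the extreme value 100, which illustrates why models fitted at different quantiles can tell different stories about the same data.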

  9. A new methodology of spatial cross-correlation analysis.

    Science.gov (United States)

    Chen, Yanguang

    2015-01-01

    Spatial correlation modeling comprises both spatial autocorrelation and spatial cross-correlation processes. The spatial autocorrelation theory has been well-developed. It is necessary to advance the method of spatial cross-correlation analysis to supplement the autocorrelation analysis. This paper presents a set of models and analytical procedures for spatial cross-correlation analysis. By analogy with Moran's index newly expressed in a spatial quadratic form, a theoretical framework is derived for geographical cross-correlation modeling. First, two sets of spatial cross-correlation coefficients are defined, including a global spatial cross-correlation coefficient and local spatial cross-correlation coefficients. Second, a pair of scatterplots of spatial cross-correlation is proposed, and the plots can be used to visually reveal the causality behind spatial systems. Based on the global cross-correlation coefficient, Pearson's correlation coefficient can be decomposed into two parts: direct correlation (partial correlation) and indirect correlation (spatial cross-correlation). As an example, the methodology is applied to the relationships between China's urbanization and economic development to illustrate how to model spatial cross-correlation phenomena. This study is an introduction to developing the theory of spatial cross-correlation, and future geographical spatial analysis might benefit from these models and indexes.

  10. A spatial cluster analysis of tractor overturns in Kentucky from 1960 to 2002.

    Directory of Open Access Journals (Sweden)

    Daniel M Saman

    Agricultural tractor overturns without rollover protective structures are the leading cause of farm fatalities in the United States. To our knowledge, no studies have incorporated the spatial scan statistic in identifying high-risk areas for tractor overturns. The aim of this study was to determine whether tractor overturns cluster in certain parts of Kentucky and identify factors associated with tractor overturns. A spatial statistical analysis using Kulldorff's spatial scan statistic was performed to identify county clusters at greatest risk for tractor overturns. A regression analysis was then performed to identify factors associated with tractor overturns. The spatial analysis revealed a cluster of higher than expected tractor overturns in four counties in northern Kentucky (RR = 2.55) and 10 counties in eastern Kentucky (RR = 1.97). Higher rates of tractor overturns were associated with steeper average percent slope of pasture land by county (p = 0.0002) and a greater percent of total tractors with less than 40 horsepower by county (p<0.0001). This study reveals that geographic hotspots of tractor overturns exist in Kentucky and identifies factors associated with overturns. This study provides policymakers a guide to targeted county-level interventions (e.g., roll-over protective structures promotion interventions) with the intention of reducing tractor overturns in the highest risk counties in Kentucky.

  11. Spatial Analysis of International Tourist Movement to Langkawi for 2010 and 2011

    Directory of Open Access Journals (Sweden)

    Basiron Nur Fatma Zuhra

    2014-01-01

    Tourism is a temporary movement of people to a destination of their choice intended for leisure or recreation. This movement can be divided into interdestination and intradestination movement. This study is about interdestination movement, which is characterised by movement from tourist-generating regions to one or more destinations. This research aims to analyse the movement of international tourists who came to Langkawi in 2010 and 2011. The application of a geographic information system (GIS) aids in visualizing spatial data through mapping. Regression analysis using SPSS was also used to examine the relationship between distance and the amount of tourist flow. Maps of the distribution and flow of tourists were produced to facilitate the analysis. The results of the analyses show that tourist movement is driven by factors such as climate, economy and distance. Regression analysis showed that the relationship between distance and the amount of tourist flow was not strong. The maps produced can be used by LADA, specifically for planning marketing strategies and related activities.

  12. Research and analyze of physical health using multiple regression analysis

    Directory of Open Access Journals (Sweden)

    T. S. Kyi

    2014-01-01

    This paper presents research that seeks to create a mathematical model of "healthy people" using the method of regression analysis. The factors are the physical parameters of the person (such as heart rate, lung capacity, blood pressure, breath holding, weight-height coefficient, flexibility of the spine, muscles of the shoulder belt, abdominal muscles, squatting, etc.), and the response variable is an indicator of physical working capacity. Multiple regression analysis yielded useful multiple regression models that can predict the physical performance of boys aged fourteen to seventeen years. This paper presents the development of the regression model for sixteen-year-old boys and the analyzed results.

  13. An improved multiple linear regression and data analysis computer program package

    Science.gov (United States)

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

  14. A New Methodology of Spatial Cross-Correlation Analysis

    Science.gov (United States)

    Chen, Yanguang

    2015-01-01

    Spatial correlation modeling comprises both spatial autocorrelation and spatial cross-correlation processes. The spatial autocorrelation theory has been well-developed. It is necessary to advance the method of spatial cross-correlation analysis to supplement the autocorrelation analysis. This paper presents a set of models and analytical procedures for spatial cross-correlation analysis. By analogy with Moran’s index newly expressed in a spatial quadratic form, a theoretical framework is derived for geographical cross-correlation modeling. First, two sets of spatial cross-correlation coefficients are defined, including a global spatial cross-correlation coefficient and local spatial cross-correlation coefficients. Second, a pair of scatterplots of spatial cross-correlation is proposed, and the plots can be used to visually reveal the causality behind spatial systems. Based on the global cross-correlation coefficient, Pearson’s correlation coefficient can be decomposed into two parts: direct correlation (partial correlation) and indirect correlation (spatial cross-correlation). As an example, the methodology is applied to the relationships between China’s urbanization and economic development to illustrate how to model spatial cross-correlation phenomena. This study is an introduction to developing the theory of spatial cross-correlation, and future geographical spatial analysis might benefit from these models and indexes. PMID:25993120

  15. Use of geographically weighted logistic regression to quantify spatial variation in the environmental and sociodemographic drivers of leptospirosis in Fiji: a modelling study

    Directory of Open Access Journals (Sweden)

    Helen J Mayfield, PhD

    2018-05-01

    Summary: Background: Leptospirosis is a globally important zoonotic disease, with complex exposure pathways that depend on interactions between human beings, animals, and the environment. Major drivers of outbreaks include flooding, urbanisation, poverty, and agricultural intensification. The intensity of these drivers and their relative importance vary between geographical areas; however, non-spatial regression methods are incapable of capturing the spatial variations. This study aimed to explore the use of geographically weighted logistic regression (GWLR) to provide insights into the ecoepidemiology of human leptospirosis in Fiji. Methods: We obtained field data from a cross-sectional community survey done in 2013 in the three main islands of Fiji. A blood sample obtained from each participant (aged 1–90 years) was tested for anti-Leptospira antibodies and household locations were recorded using GPS receivers. We used GWLR to quantify the spatial variation in the relative importance of five environmental and sociodemographic covariates (cattle density, distance to river, poverty rate, residential setting [urban or rural], and maximum rainfall in the wettest month) on leptospirosis transmission in Fiji. We developed two models, one using GWLR and one with standard logistic regression; for each model, the dependent variable was the presence or absence of anti-Leptospira antibodies. GWLR results were compared with results obtained with standard logistic regression, and used to produce a predictive risk map and maps showing the spatial variation in odds ratios (OR) for each covariate. Findings: The dataset contained location information for 2046 participants from 1922 households representing 81 communities. The Akaike information criterion value of the GWLR model was 1935·2 compared with 1254·2 for the standard logistic regression model, indicating that the GWLR model was more efficient. Both models produced similar OR for the covariates, but

  16. Functional data analysis of generalized regression quantiles

    KAUST Repository

    Guo, Mengmeng; Zhou, Lan; Huang, Jianhua Z.; Härdle, Wolfgang Karl

    2013-01-01

    Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.
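    The "iterative least asymmetrically weighted squares" idea can be seen in its simplest form when estimating a single sample expectile: observations above the current estimate get weight tau, those below get 1 - tau, and the weighted mean is recomputed until it stops changing. The sketch below covers only this scalar case with made-up data; it is not the functional, spline-based estimator developed in the paper.

```python
def expectile(y, tau, n_iter=100, tol=1e-12):
    """Sample expectile at level tau via iteratively reweighted means."""
    e = sum(y) / len(y)                       # start from the ordinary mean
    for _ in range(n_iter):
        # Asymmetric weights: tau above the current estimate, 1 - tau below.
        w = [tau if yi > e else 1.0 - tau for yi in y]
        e_new = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        if abs(e_new - e) < tol:
            return e_new
        e = e_new
    return e


# Hypothetical sample with a heavy right tail.
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
center = expectile(sample, 0.5)     # tau = 0.5 reduces to the mean
upper = expectile(sample, 0.9)      # pulled toward the upper tail
```

    With tau = 0.5 all weights are equal and the result is the ordinary mean, which is the sense in which expectiles generalize the conditional mean.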

  17. Functional data analysis of generalized regression quantiles

    KAUST Repository

    Guo, Mengmeng

    2013-11-05

    Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.

  18. Application of spatial and non-spatial data analysis in determination of the factors that impact municipal solid waste generation rates in Turkey

    International Nuclear Information System (INIS)

    Keser, Saniye; Duzgun, Sebnem; Aksoy, Aysegul

    2012-01-01

    Highlights: ► Spatial autocorrelation exists in municipal solid waste generation rates for different provinces in Turkey. ► Traditional non-spatial regression models may not provide sufficient information for better solid waste management. ► Unemployment rate is a global variable that significantly impacts the waste generation rates in Turkey. ► Significances of global parameters may diminish at local scale for some provinces. ► GWR model can be used to create clusters of cities for solid waste management. - Abstract: In studies focusing on the factors that impact solid waste generation habits and rates, the potential spatial dependency in solid waste generation data is not considered in relating the waste generation rates to its determinants. In this study, spatial dependency is taken into account in determination of the significant socio-economic and climatic factors that may be of importance for the municipal solid waste (MSW) generation rates in different provinces of Turkey. Simultaneous spatial autoregression (SAR) and geographically weighted regression (GWR) models are used for the spatial data analyses. Similar to ordinary least squares regression (OLSR), regression coefficients are global in SAR model. In other words, the effect of a given independent variable on a dependent variable is valid for the whole country. Unlike OLSR or SAR, GWR reveals the local impact of a given factor (or independent variable) on the waste generation rates of different provinces. Results show that provinces within closer neighborhoods have similar MSW generation rates. On the other hand, this spatial autocorrelation is not very high for the exploratory variables considered in the study. OLSR and SAR models have similar regression coefficients. GWR is useful to indicate the local determinants of MSW generation rates. GWR model can be utilized to plan waste management activities at local scale including waste minimization, collection, treatment, and disposal. At global
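    The contrast drawn in this record between global coefficients (OLSR, SAR) and local coefficients (GWR) can be sketched with a kernel-weighted least-squares fit at each location. The Gaussian kernel, the bandwidth, the single covariate and the toy data below are all illustrative assumptions, not the specification used in the study.

```python
import math


def gwr_local_fit(coords, x, y, bandwidth):
    """Fit y = intercept + slope * x separately at every location,
    weighting observations by a Gaussian kernel of their distance."""
    fits = []
    for (u0, v0) in coords:
        w = [math.exp(-0.5 * ((u - u0) ** 2 + (v - v0) ** 2) / bandwidth ** 2)
             for (u, v) in coords]
        sw = sum(w)
        xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx
        fits.append((ym - slope * xm, slope))
    return fits


# Hypothetical transect of ten sites: the effect of x is 1 in the west
# (sites 0-4) and 3 in the east (sites 5-9).
coords = [(float(i), 0.0) for i in range(10)]
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0, 10.0, 9.0]
y = [xi * (1.0 if i < 5 else 3.0) for i, xi in enumerate(x)]
local = gwr_local_fit(coords, x, y, bandwidth=1.0)
west_slope, east_slope = local[0][1], local[9][1]
```

    A single global regression would report one compromise slope for the whole transect, whereas the local fits recover the two regimes at its ends.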

  19. The AIDS epidemic and economic input impact factors in Chongqing, China, from 2006 to 2012: a spatial-temporal analysis.

    Science.gov (United States)

    Zhang, Yanqi; Xiao, Qin; Zhou, Liang; Ma, Dihui; Liu, Ling; Lu, Rongrong; Yi, Dali; Yi, Dong

    2015-03-27

    To analyse the spatial-temporal clustering of the HIV/AIDS epidemic in Chongqing and to explore its association with the economic indices of AIDS prevention and treatment. Data on the HIV/AIDS epidemic and economic indices of AIDS prevention and treatment were obtained from the annual reports of the Chongqing Municipal Center for Disease Control for 2006-2012. Spatial clustering analysis, temporal-spatial clustering analysis, and spatial regression were used to conduct statistical analysis. The annual average new HIV infection rate, incidence rate for new AIDS cases, and rate of people living with HIV in Chongqing were 5.97, 2.42 and 28.12 per 100,000, respectively, for 2006-2012. The HIV/AIDS epidemic showed a non-random spatial distribution (Moran's I≥0.310; p<0.05). The epidemic hotspots were distributed in the 15 mid-western counties. The most likely clusters were primarily located in the central region and southwest of Chongqing and occurred in 2010-2012. The regression coefficients of the total amount of special funds allocated to AIDS and to the public awareness unit for the numbers of new HIV cases, new AIDS cases, and people living with HIV were 0.775, 0.976 and 0.816, and -0.188, -0.259 and -0.215 (p<0.002), respectively. The Chongqing HIV/AIDS epidemic showed temporal-spatial clustering and was mainly clustered in the mid-western and south-western counties, showing an upward trend over time. The amount of special funds dedicated to AIDS and to the public awareness unit showed positive and negative relationships with HIV/AIDS spatial clustering, respectively. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  20. Analysis of Relationship Between Personality and Favorite Places with Poisson Regression Analysis

    Directory of Open Access Journals (Sweden)

    Yoon Song Ha

    2018-01-01

    A relationship between human personality and preferred locations has been a long-standing conjecture in human mobility research. In this paper, we analyzed the relationship between personality and visited places with Poisson regression. Poisson regression can analyze the correlation between a countable dependent variable and independent variables. For this analysis, 33 volunteers provided their personality data, and data on 49 location categories are used. Raw location data are preprocessed and normalized into rates of visit, and outlier data are pruned. For the regression analysis, the independent variables are the personality data and the dependent variables are the preprocessed location data. Several meaningful results are found. For example, persons with a high tendency to visit university laboratories frequently have high conscientiousness and low openness. Other meaningful location categories are also presented in this paper.

  1. application of multilinear regression analysis in modeling of soil

    African Journals Online (AJOL)

    Windows User

    Accordingly [1, 3] in their work, they applied linear regression ... (MLRA) is a statistical technique that uses several explanatory ... order to check this, they adopted bivariate correlation analysis .... groups, namely A-1 through A-7, based on their relative expected ..... Multivariate Regression in Gorgan Province North of Iran” ...

  2. Multiple regression analysis of Jominy hardenability data for boron treated steels

    International Nuclear Information System (INIS)

    Komenda, J.; Sandstroem, R.; Tukiainen, M.

    1997-01-01

    The relations between the chemical composition and the hardenability of boron treated steels have been investigated using a multiple regression analysis method. A linear model of regression was chosen. The free boron content that is effective for the hardenability was calculated using a model proposed by Jansson. The regression analysis for 1261 steel heats provided equations that were statistically significant at the 95% level. All heats met the specification according to the Nordic countries producers classification. The variation in chemical composition explained typically 80 to 90% of the variation in the hardenability. In the regression analysis, elements which did not significantly contribute to the calculated hardness according to the F test were eliminated. Carbon, silicon, manganese, phosphorus and chromium were of importance at all Jominy distances, nickel, vanadium, boron and nitrogen at distances above 6 mm. After the regression analysis it was demonstrated that very few outliers were present in the data set, i.e. data points outside four times the standard deviation. The model has successfully been used in industrial practice replacing some of the necessary Jominy tests. (orig.)

  3. Understanding logistic regression analysis

    OpenAIRE

    Sperandei, Sandro

    2014-01-01

    Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using ex...
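    The link between a logistic regression coefficient and an odds ratio can be checked on a small grouped data set: with one binary explanatory variable, exp(coefficient) equals the cross-product odds ratio of the 2x2 table. The Newton-Raphson fit, the function name and the counts below are a hypothetical sketch, not the worked example from the article.

```python
import math


def fit_logistic_grouped(groups, n_iter=25):
    """Newton-Raphson fit of logit(p) = b0 + b1 * x on grouped data.

    groups: list of (x, n_events, n_total) tuples.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, events, total in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += events - total * p            # score for the intercept
            g1 += x * (events - total * p)      # score for the slope
            wgt = total * p * (1.0 - p)         # information weight
            h00 += wgt
            h01 += wgt * x
            h11 += wgt * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical table: 10/100 events when unexposed, 30/100 when exposed.
b0, b1 = fit_logistic_grouped([(0, 10, 100), (1, 30, 100)])
odds_ratio = math.exp(b1)
cross_product = (30 * 90) / (70 * 10)
```

    With several explanatory variables the same exponentiation gives an odds ratio adjusted for the other variables in the model, which is the use case the abstract describes.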

  4. Drought Patterns Forecasting using an Auto-Regressive Logistic Model

    Science.gov (United States)

    del Jesus, M.; Sheffield, J.; Méndez Incera, F. J.; Losada, I. J.; Espejo, A.

    2014-12-01

    Drought is characterized by a water deficit that may manifest across a large range of spatial and temporal scales. Drought may create important socio-economic consequences, many times of catastrophic dimensions. A quantifiable definition of drought is elusive because depending on its impacts, consequences and generation mechanism, different water deficit periods may be identified as a drought by virtue of some definitions but not by others. Droughts are linked to the water cycle and, although a climate change signal may not have emerged yet, they are also intimately linked to climate. In this work we develop an auto-regressive logistic model for drought prediction at different temporal scales that makes use of a spatially explicit framework. Our model allows covariates, continuous or categorical, to be included to improve the performance of the auto-regressive component. Our approach makes use of dimensionality reduction (principal component analysis) and classification techniques (K-Means and maximum dissimilarity) to simplify the representation of complex climatic patterns, such as sea surface temperature (SST) and sea level pressure (SLP), while including information on their spatial structure, i.e. considering their spatial patterns. This procedure allows us to include in the analysis multivariate representation of complex climatic phenomena, as the El Niño-Southern Oscillation. We also explore the impact of other climate-related variables such as sun spots. The model makes it possible to quantify the uncertainty of the forecasts and can be easily adapted to make predictions under future climatic scenarios. The framework herein presented may be extended to other applications such as flash flood analysis, or risk assessment of natural hazards.

  5. Background stratified Poisson regression analysis of cohort data.

    Science.gov (United States)

    Richardson, David B; Langholz, Bryan

    2012-03-01

    Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.
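
    The equivalence stated in the abstract can be illustrated with a short derivation. This is a sketch under assumed notation that is not taken from the record itself: cells j within background stratum s have observed counts y_{sj}, person-time t_{sj} and exposure d_{sj}; \alpha_s is the stratum-specific nuisance coefficient and \phi(d;\beta) is a relative rate function, for example \exp(\beta d) in the log-linear case.

```latex
% Unconditional model: one nuisance parameter \alpha_s per background stratum
y_{sj} \sim \mathrm{Poisson}\!\left(t_{sj}\, e^{\alpha_s}\, \phi(d_{sj};\beta)\right)

% Conditioning on the stratum totals y_{s+} = \sum_j y_{sj} gives a multinomial
% likelihood in which e^{\alpha_s} cancels from numerator and denominator:
L_c(\beta) \;\propto\; \prod_s \prod_j
  \left( \frac{t_{sj}\,\phi(d_{sj};\beta)}{\sum_k t_{sk}\,\phi(d_{sk};\beta)} \right)^{y_{sj}}

% Profiling instead of conditioning: for fixed \beta the unconditional MLE is
\hat{\alpha}_s(\beta) = \log \frac{y_{s+}}{\sum_k t_{sk}\,\phi(d_{sk};\beta)}
% and substituting it back yields the same dependence on \beta as L_c(\beta).
```

    Because the profile and conditional likelihoods share the same dependence on \beta, the two approaches give the same point estimate, which is consistent with the identity reported in the abstract, and the stratum coefficients never have to be estimated explicitly.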

  6. Semiparametric regression during 2003–2007

    KAUST Repository

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2009-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.

  7. Use of geographically weighted logistic regression to quantify spatial variation in the environmental and sociodemographic drivers of leptospirosis in Fiji: a modelling study.

    Science.gov (United States)

    Mayfield, Helen J; Lowry, John H; Watson, Conall H; Kama, Mike; Nilles, Eric J; Lau, Colleen L

    2018-05-01

    Leptospirosis is a globally important zoonotic disease, with complex exposure pathways that depend on interactions between human beings, animals, and the environment. Major drivers of outbreaks include flooding, urbanisation, poverty, and agricultural intensification. The intensity of these drivers and their relative importance vary between geographical areas; however, non-spatial regression methods are incapable of capturing the spatial variations. This study aimed to explore the use of geographically weighted logistic regression (GWLR) to provide insights into the ecoepidemiology of human leptospirosis in Fiji. We obtained field data from a cross-sectional community survey done in 2013 in the three main islands of Fiji. A blood sample obtained from each participant (aged 1-90 years) was tested for anti-Leptospira antibodies and household locations were recorded using GPS receivers. We used GWLR to quantify the spatial variation in the relative importance of five environmental and sociodemographic covariates (cattle density, distance to river, poverty rate, residential setting [urban or rural], and maximum rainfall in the wettest month) on leptospirosis transmission in Fiji. We developed two models, one using GWLR and one with standard logistic regression; for each model, the dependent variable was the presence or absence of anti-Leptospira antibodies. GWLR results were compared with results obtained with standard logistic regression, and used to produce a predictive risk map and maps showing the spatial variation in odds ratios (OR) for each covariate. The dataset contained location information for 2046 participants from 1922 households representing 81 communities. The Akaike information criterion value of the GWLR model was 1935·2 compared with 1254·2 for the standard logistic regression model, indicating that the GWLR model was more efficient. Both models produced similar OR for the covariates, but GWLR also detected spatial variation in the effect of each

  8. Poisson Regression Analysis of Illness and Injury Surveillance Data

    Energy Technology Data Exchange (ETDEWEB)

    Frome E.L., Watkins J.P., Ellis E.D.

    2012-12-12

    The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra
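
    The over-dispersion check described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes the common Pearson-based moment estimate of the dispersion (Pearson chi-square divided by residual degrees of freedom), and the counts and fitted means are made-up values, not DOE data.

```python
import math


def pearson_dispersion(observed, fitted, n_params):
    """Moment estimate of extra-Poisson variation: Pearson X^2 / (n - p)."""
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / (len(observed) - n_params)


def adjusted_standard_error(standard_error, dispersion):
    """Quasi-likelihood adjustment: scale a Poisson SE by sqrt(dispersion)."""
    return standard_error * math.sqrt(dispersion)


# Hypothetical counts and fitted Poisson means for six strata (illustrative only).
observed = [2, 9, 1, 12, 3, 15]
fitted = [4.0, 7.0, 4.0, 9.0, 5.0, 13.0]
phi = pearson_dispersion(observed, fitted, n_params=2)
print("estimated dispersion:", round(phi, 3))
```

    A value near 1 is what a Poisson model implies; values well above 1 suggest extra-Poisson variation, in which case standard errors from the Poisson fit would be scaled up by the square root of the estimate.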

  9. Geographically weighted negative binomial regression applied to zonal level safety performance models.

    Science.gov (United States)

    Gomes, Marcos José Timbó Lima; Cunto, Flávio; da Silva, Alan Ricardo

    2017-09-01

    Generalized Linear Models (GLM) with a negative binomial error distribution have been widely used to estimate safety at the level of transportation planning. The limited ability of this technique to take spatial effects into account can be overcome through the use of local models from spatial regression techniques, such as Geographically Weighted Poisson Regression (GWPR). Although GWPR deals with spatial dependency and heterogeneity and has already been used in some road safety studies at the planning level, it fails to account for the possible overdispersion that can be found in the observations on road-traffic crashes. Two approaches were adopted for the Geographically Weighted Negative Binomial Regression (GWNBR) model to allow discrete data to be modeled in a non-stationary form and to account for the overdispersion of the data: the first assumes a constant overdispersion for all the traffic zones and the second lets it vary for each spatial unit. This research conducts a comparative analysis between non-spatial global crash prediction models and spatial local GWPR and GWNBR at the level of traffic zones in Fortaleza, Brazil. A geographic database of 126 traffic zones was compiled from the available data on exposure, network characteristics, socioeconomic factors and land use. The models were calibrated by using the frequency of injury crashes as the dependent variable, and the results showed that GWPR and GWNBR performed better than GLM in terms of average residuals and likelihood, as well as reducing the spatial autocorrelation of the residuals, and that the GWNBR model was better able to capture the spatial heterogeneity of the crash frequency. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Development of planning level transportation safety tools using Geographically Weighted Poisson Regression.

    Science.gov (United States)

    Hadayeghi, Alireza; Shalaby, Amer S; Persaud, Bhagwant N

    2010-03-01

    A common technique used for the calibration of collision prediction models is the Generalized Linear Modeling (GLM) procedure with the assumption of Negative Binomial or Poisson error distribution. In this technique, fixed coefficients that represent the average relationship between the dependent variable and each explanatory variable are estimated. However, the stationary relationship assumed may hide some important spatial factors of the number of collisions at a particular traffic analysis zone. Consequently, the accuracy of such models for explaining the relationship between the dependent variable and the explanatory variables may be suspected since collision frequency is likely influenced by many spatially defined factors such as land use, demographic characteristics, and traffic volume patterns. The primary objective of this study is to investigate the spatial variations in the relationship between the number of zonal collisions and potential transportation planning predictors, using the Geographically Weighted Poisson Regression modeling technique. The secondary objective is to build on knowledge comparing the accuracy of Geographically Weighted Poisson Regression models to that of Generalized Linear Models. The results show that the Geographically Weighted Poisson Regression models are useful for capturing spatially dependent relationships and generally perform better than the conventional Generalized Linear Models. Copyright 2009 Elsevier Ltd. All rights reserved.
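
    The idea of locally varying estimates can be sketched with the simplest possible case. The example below is an assumption-laden illustration rather than the models used in the study: it uses a Gaussian distance kernel and an intercept-only local Poisson model, for which the kernel-weighted maximum likelihood rate has the closed form sum(w*y) / sum(w*E). The zone coordinates, counts, exposures and bandwidth are hypothetical.

```python
import math


def gaussian_weights(target, coords, bandwidth):
    """Gaussian kernel weight of every zone relative to the target location."""
    return [math.exp(-0.5 * (math.dist(target, c) / bandwidth) ** 2) for c in coords]


def local_poisson_rate(target, coords, counts, exposure, bandwidth):
    """Intercept-only geographically weighted Poisson rate at the target location.

    Maximising sum_i w_i * (y_i * log(rate * E_i) - rate * E_i) over the rate
    gives rate = sum(w_i * y_i) / sum(w_i * E_i).
    """
    w = gaussian_weights(target, coords, bandwidth)
    weighted_events = sum(wi * y for wi, y in zip(w, counts))
    weighted_exposure = sum(wi * e for wi, e in zip(w, exposure))
    return weighted_events / weighted_exposure


# Hypothetical zones: two low-collision zones close together and one distant hot spot.
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
counts = [2, 4, 30]
exposure = [10.0, 10.0, 10.0]

near_origin = local_poisson_rate((0.0, 0.0), coords, counts, exposure, bandwidth=1.0)
far_zone = local_poisson_rate((10.0, 0.0), coords, counts, exposure, bandwidth=1.0)
global_rate = sum(counts) / sum(exposure)  # what a single stationary model would report
print(near_origin, far_zone, global_rate)
```

    With a very large bandwidth every zone receives nearly equal weight and the local rate collapses to the global one, which is the stationary relationship that a conventional GLM assumes.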

  11. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    Science.gov (United States)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases in which logistic regression coefficients were applied in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases in which logistic regression coefficients were cross-applied to the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest prediction accuracy (90%), whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross

  12. Reevaluation of Stratospheric Ozone Trends From SAGE II Data Using a Simultaneous Temporal and Spatial Analysis

    Science.gov (United States)

    Damadeo, R. P.; Zawodny, J. M.; Thomason, L. W.

    2014-01-01

    This paper details a new method of regression for sparsely sampled data sets for use with time-series analysis, in particular the Stratospheric Aerosol and Gas Experiment (SAGE) II ozone data set. Non-uniform spatial, temporal, and diurnal sampling present in the data set result in biased values for the long-term trend if not accounted for. This new method is performed close to the native resolution of measurements and is a simultaneous temporal and spatial analysis that accounts for potential diurnal ozone variation. Results show biases, introduced by the way data is prepared for use with traditional methods, can be as high as 10%. Derived long-term changes show declines in ozone similar to other studies but very different trends in the presumed recovery period, with differences up to 2% per decade. The regression model allows for a variable turnaround time and reveals a hemispheric asymmetry in derived trends in the middle to upper stratosphere. Similar methodology is also applied to SAGE II aerosol optical depth data to create a new volcanic proxy that covers the SAGE II mission period. Ultimately this technique may be extensible towards the inclusion of multiple data sets without the need for homogenization.

  13. Better Autologistic Regression

    Directory of Open Access Journals (Sweden)

    Mark A. Wolters

    2017-11-01

    Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine) to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding (the two numbers used to represent the two possible states of the variables) might differ. Common coding choices are (zero, one) and (minus one, plus one). Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.

  14. A Quality Assessment Tool for Non-Specialist Users of Regression Analysis

    Science.gov (United States)

    Argyrous, George

    2015-01-01

    This paper illustrates the use of a quality assessment tool for regression analysis. It is designed for non-specialist "consumers" of evidence, such as policy makers. The tool provides a series of questions such consumers of evidence can ask to interrogate regression analysis, and is illustrated with reference to a recent study published…

  15. Progress in spatial analysis methods and applications

    CERN Document Server

    Páez, Antonio; Buliung, Ron N; Dall'erba, Sandy

    2010-01-01

    This book brings together developments in spatial analysis techniques, including spatial statistics, econometrics, and spatial visualization, and applications to fields such as regional studies, transportation and land use, population and health.

  16. Background stratified Poisson regression analysis of cohort data

    International Nuclear Information System (INIS)

    Richardson, David B.; Langholz, Bryan

    2012-01-01

    Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models. (orig.)

  17. Understanding logistic regression analysis.

    Science.gov (United States)

    Sperandei, Sandro

    2014-01-01

    Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
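
    The link between a logistic regression coefficient and the odds ratio can be checked numerically. The sketch below is an illustration under stated assumptions, not material from the article: the 2x2 data are hypothetical, `fit_logistic` is a hand-written Newton-Raphson fit for a single predictor, and the exponentiated slope is compared with the odds ratio computed directly from the table.

```python
import math


def fit_logistic(x, y, iters=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        # Score vector (g0, g1) and information matrix [[h00, h01], [h01, h11]].
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add the inverse information matrix times the score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: the event occurs in 30 of 100 exposed and 10 of 100 unexposed subjects.
x = [1] * 100 + [0] * 100
y = [1] * 30 + [0] * 70 + [1] * 10 + [0] * 90

b0, b1 = fit_logistic(x, y)
odds_ratio = math.exp(b1)            # odds ratio implied by the fitted slope
table_or = (30 * 90) / (70 * 10)     # odds ratio read directly from the 2x2 table
print(round(odds_ratio, 3), round(table_or, 3))
```

    With a single binary predictor the two numbers coincide; the value of the regression approach is that the same exponentiated-coefficient reading carries over when several explanatory variables are fitted together.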

  18. Hierarchical modeling and analysis for spatial data

    CERN Document Server

    Banerjee, Sudipto; Gelfand, Alan E

    2003-01-01

    Among the many uses of hierarchical modeling, their application to the statistical analysis of spatial and spatio-temporal data from areas such as epidemiology and environmental science has proven particularly fruitful. Yet to date, the few books that address the subject have been either too narrowly focused on specific aspects of spatial analysis, or written at a level often inaccessible to those lacking a strong background in mathematical statistics. Hierarchical Modeling and Analysis for Spatial Data is the first accessible, self-contained treatment of hierarchical methods, modeling, and dat

  19. Modelling and analysis of turbulent datasets using Auto Regressive Moving Average processes

    International Nuclear Information System (INIS)

    Faranda, Davide; Dubrulle, Bérengère; Daviaud, François; Pons, Flavio Maria Emanuele; Saint-Michel, Brice; Herbert, Éric; Cortet, Pierre-Philippe

    2014-01-01

    We introduce a novel way to extract information from turbulent datasets by applying an Auto Regressive Moving Average (ARMA) statistical analysis. Such analysis goes well beyond the analysis of the mean flow and of the fluctuations and links the behavior of the recorded time series to a discrete version of a stochastic differential equation which is able to describe the correlation structure in the dataset. We introduce a new index Υ that measures the difference between the resulting analysis and the Obukhov model of turbulence, the simplest stochastic model reproducing both the Richardson law and the Kolmogorov spectrum. We test the method on datasets measured in a von Kármán swirling flow experiment. We found that the ARMA analysis is well correlated with spatial structures of the flow, and can discriminate between two different flows with comparable mean velocities, obtained by changing the forcing. Moreover, we show that Υ is highest in regions where shear layer vortices are present, thereby establishing a link between deviations from the Kolmogorov model and coherent structures. These deviations are consistent with the ones observed by computing the Hurst exponents for the same time series. We show that some salient features of the analysis are preserved when considering global instead of local observables. Finally, we analyze flow configurations with multistability features where the ARMA technique is efficient in discriminating different stability branches of the system.
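
    For readers unfamiliar with ARMA processes, the following sketch simulates the simplest mixed case, ARMA(1,1), and measures its lag-1 autocorrelation. It illustrates only the generic model class x_t = phi*x_{t-1} + e_t + theta*e_{t-1}; the parameter values are arbitrary assumptions, and nothing here reproduces the index Υ or the turbulence data of the study.

```python
import random


def simulate_arma11(phi, theta, n, seed=0, burn_in=500):
    """Simulate x_t = phi*x_{t-1} + e_t + theta*e_{t-1} with standard normal noise."""
    rng = random.Random(seed)
    x_prev, e_prev = 0.0, 0.0
    series = []
    for t in range(n + burn_in):
        e = rng.gauss(0.0, 1.0)
        x = phi * x_prev + e + theta * e_prev
        if t >= burn_in:  # discard the start-up transient
            series.append(x)
        x_prev, e_prev = x, e
    return series


def lag1_autocorrelation(series):
    """Sample autocorrelation of the series at lag 1."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


series = simulate_arma11(phi=0.7, theta=0.3, n=20000)
r1 = lag1_autocorrelation(series)
print("lag-1 autocorrelation:", round(r1, 3))
```

    The autoregressive coefficient controls how slowly correlations decay and the moving-average coefficient shapes the short-lag behaviour, which is why fitted ARMA orders and coefficients can summarise the correlation structure of a recorded time series.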

  20. Prediction of spatial soil property information from ancillary sensor data using ordinary linear regression: Model derivations, residual assumptions and model validation tests

    Science.gov (United States)

    Geospatial measurements of ancillary sensor data, such as bulk soil electrical conductivity or remotely sensed imagery data, are commonly used to characterize spatial variation in soil or crop properties. Geostatistical techniques like kriging with external drift or regression kriging are often use...

  1. Retro-regression--another important multivariate regression improvement.

    Science.gov (United States)

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.

  2. Prediction of spatial patterns of collapsed pipes in loess-derived soils in a temperate humid climate using logistic regression

    Science.gov (United States)

    Verachtert, E.; Den Eeckhaut, M. Van; Poesen, J.; Govers, G.; Deckers, J.

    2011-07-01

    Soil piping (tunnel erosion) has been recognised as an important erosion process in collapsible loess-derived soils of temperate humid climates, which can cause collapse of the topsoil and formation of discontinuous gullies. Information about the spatial patterns of collapsed pipes and regional models describing these patterns is still limited. Therefore, this study aims at better understanding the factors controlling the spatial distribution and predicting pipe collapse. A dataset with parcels suffering from collapsed pipes (n = 560) and parcels without collapsed pipes was obtained through a regional survey in a 236 km² study area in the Flemish Ardennes (Belgium). Logistic regression was applied to find the best model describing the relationship between the presence/absence of a collapsed pipe and a set of independent explanatory variables (i.e. slope gradient, drainage area, distance-to-thalweg, curvature, aspect, soil type and lithology). Special attention was paid to the selection procedure of the grid cells without collapsed pipes. Apart from the first piping susceptibility map created by logistic regression modelling, a second map was made based on topographical thresholds of slope gradient and upslope drainage area. The logistic regression model allowed identification of the most important factors controlling pipe collapse. Pipes are much more likely to occur when a topographical threshold depending on both slope gradient and upslope area is exceeded in zones with a sufficient water supply (due to topographical convergence and/or the presence of a clay-rich lithology). On the other hand, the use of slope-area thresholds only results in reasonable predictions of piping susceptibility, with minimum information.

  3. Linear regression analysis: part 14 of a series on evaluation of scientific publications.

    Science.gov (United States)

    Schneider, Astrid; Hommel, Gerhard; Blettner, Maria

    2010-11-01

    Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
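
    As a concrete companion to the univariable case mentioned above, the sketch below fits a straight line by ordinary least squares and returns the residuals, which are the quantities usually inspected when judging whether the method has been applied appropriately. The data are hypothetical and the helper name `ols_fit` is an assumption made for illustration.

```python
def ols_fit(x, y):
    """Least-squares line y = intercept + slope*x; returns (slope, intercept, residuals)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, residuals


# Hypothetical measurements: a predictor and an outcome for six subjects.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
slope, intercept, residuals = ols_fit(x, y)
print(round(slope, 3), round(intercept, 3))
```

    The slope is read as the expected change in the outcome per unit change in the predictor; a pattern in the residuals (curvature, increasing spread) is one of the warning signs that the linear model may not be adequate.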

  4. Applying spatial regression to evaluate risk factors for microbiological contamination of urban groundwater sources in Juba, South Sudan

    Science.gov (United States)

    Engström, Emma; Mörtberg, Ulla; Karlström, Anders; Mangold, Mikael

    2017-06-01

    This study developed methodology for statistically assessing groundwater contamination mechanisms. It focused on microbial water pollution in low-income regions. Risk factors for faecal contamination of groundwater-fed drinking-water sources were evaluated in a case study in Juba, South Sudan. The study was based on counts of thermotolerant coliforms in water samples from 129 sources, collected by the humanitarian aid organisation Médecins Sans Frontières in 2010. The factors included hydrogeological settings, land use and socio-economic characteristics. The results showed that the residuals of a conventional probit regression model had a significant positive spatial autocorrelation (Moran's I = 3.05, I-stat = 9.28); therefore, a spatial model was developed that had better goodness-of-fit to the observations. The most significant factor in this model (p-value 0.005) was the distance from a water source to the nearest Tukul area, an area with informal settlements that lack sanitation services. It is thus recommended that future remediation and monitoring efforts in the city be concentrated in such low-income regions. The spatial model differed from the conventional approach: in contrast with the latter case, lowland topography was not significant at the 5% level, as the p-value was 0.074 in the spatial model and 0.040 in the traditional model. This study showed that statistical risk-factor assessments of groundwater contamination need to consider spatial interactions when the water sources are located close to each other. Future studies might further investigate the cut-off distance that reflects spatial autocorrelation. In particular, these results inform research on urban groundwater quality.
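
    The residual check described above rests on Moran's I. The sketch below computes the textbook statistic I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, where z are deviations from the mean and S0 is the sum of all weights. It is a generic illustration with a hypothetical four-site neighbour matrix; the weighting scheme and scaling used in the study are not stated in the abstract and are not reproduced here.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Hypothetical layout: four sites on a line, each adjacent to its immediate neighbours.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

trend_residuals = [1.0, 2.0, 3.0, 4.0]          # neighbours resemble each other
alternating_residuals = [1.0, -1.0, 1.0, -1.0]  # neighbours are opposite

print(morans_i(trend_residuals, adjacency))
print(morans_i(alternating_residuals, adjacency))
```

    Positive values indicate that nearby residuals tend to be similar, which is the situation in which a non-spatial regression understates uncertainty and a spatial model becomes worth fitting.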

  5. Institutions and deforestation in the Brazilian amazon: a geographic regression discontinuity analysis

    OpenAIRE

    Bogetvedt, Ingvild Engen; Hauge, Mari Johnsrud

    2017-01-01

    This study explores the impact of institutional quality at the municipal level on deforestation in the Legal Amazon. We add to this insufficiently understood topic by implementing a geographic regression discontinuity design. By taking advantage of high-resolution spatial data on deforestation combined with an objective measure of corruption used as a proxy for institutional quality, we analyse 138 Brazilian municipalities in the period of 2002-2004. Our empirical findings show...

  6. Management of Industrial Performance Indicators: Regression Analysis and Simulation

    Directory of Open Access Journals (Sweden)

    Walter Roberto Hernandez Vergara

    2017-11-01

    Stochastic methods can be used in problem solving and explanation of natural phenomena through the application of statistical procedures. The article aims to associate the regression analysis and systems simulation, in order to facilitate the practical understanding of data analysis. The algorithms were developed in Microsoft Office Excel software, using statistical techniques such as regression theory, ANOVA and Cholesky Factorization, which made it possible to create models of single and multiple systems with up to five independent variables. For the analysis of these models, the Monte Carlo simulation and analysis of industrial performance indicators were used, resulting in numerical indices that aim to improve the goals’ management for compliance indicators, by identifying systems’ instability, correlation and anomalies. The analytical models presented in the survey indicated satisfactory results with numerous possibilities for industrial and academic applications, as well as the potential for deployment in new analytical techniques.
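
    One common reason to pair Cholesky factorization with Monte Carlo simulation is to generate correlated random inputs. The sketch below shows that generic technique in Python rather than Excel; whether the article uses the factorization in exactly this way is an assumption, and the covariance matrix is a hypothetical example for two indicators.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive-definite matrix."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(a[i][i] - s)
            else:
                lower[i][j] = (a[i][j] - s) / lower[j][j]
    return lower


def correlated_normals(lower, rng):
    """One Monte Carlo draw: multiply independent standard normals by L."""
    z = [rng.gauss(0.0, 1.0) for _ in lower]
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(len(lower))]


# Hypothetical target: two standardised indicators with correlation 0.8.
cov = [[1.0, 0.8], [0.8, 1.0]]
L = cholesky(cov)
rng = random.Random(42)
draws = [correlated_normals(L, rng) for _ in range(20000)]
print(L)
```

    Each simulated pair can then be fed through a fitted regression equation, so that the spread of the simulated indicator reflects both the noise in the inputs and the correlation between them.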

  7. Least-Squares Linear Regression and Schrodinger's Cat: Perspectives on the Analysis of Regression Residuals.

    Science.gov (United States)

    Hecht, Jeffrey B.

    The analysis of regression residuals and detection of outliers are discussed, with emphasis on determining how deviant an individual data point must be to be considered an outlier and the impact that multiple suspected outlier data points have on the process of outlier determination and treatment. Only bivariate (one dependent and one independent)…

  8. Spatial analysis for the epidemiological study of cardiovascular diseases: A systematic literature search.

    Science.gov (United States)

    Mena, Carlos; Sepúlveda, Cesar; Fuentes, Eduardo; Ormazábal, Yony; Palomo, Iván

    2018-05-07

    Cardiovascular diseases (CVDs) are the primary cause of death and disability in the world, and the detection of populations at risk as well as localization of vulnerable areas is essential for adequate epidemiological management. Techniques developed for spatial analysis, among them geographical information systems and spatial statistics, such as cluster detection and spatial correlation, are useful for the study of the distribution of CVDs. These techniques, enabling recognition of events at different geographical levels of study (e.g., rural, deprived neighbourhoods, etc.), make it possible to relate CVDs to factors present in the immediate environment. The systematic literature search presented here shows that this group of diseases is clustered with regard to incidence, mortality and hospitalization as well as obesity, smoking, increased glycated haemoglobin levels, hypertension, physical activity and age. In addition, acquired variables such as income, residency (rural or urban) and education contribute to CVD clustering. Both local cluster detection and spatial regression techniques give statistical weight to the findings, providing valuable information that can influence response mechanisms in the health services by indicating locations in need of intervention and assignment of available resources.

  9. Predicting Dropouts of University Freshmen: A Logit Regression Analysis.

    Science.gov (United States)

    Lam, Y. L. Jack

    1984-01-01

    Stepwise discriminant analysis coupled with logit regression analysis of freshmen data from Brandon University (Manitoba) indicated that six tested variables drawn from research on university dropouts were useful in predicting attrition: student status, residence, financial sources, distance from home town, goal fulfillment, and satisfaction with…

  10. Simulation Experiments in Practice : Statistical Design and Regression Analysis

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    2007-01-01

    In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. Statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic

  11. Quality of life in breast cancer patients--a quantile regression analysis.

    Science.gov (United States)

    Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma

    2008-01-01

    Quality of life studies have an important role in health care, especially in chronic diseases, in clinical judgment and in the supply of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare the results with linear regression. A cross-sectional study was conducted on 119 breast cancer patients who were admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. A quantile regression was employed to assess the associated factors and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis and dyspnea was significant only for the first quartile. Emotional functioning and duration of disease also statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about predictors of breast cancer patients' quality of life.

  12. Mapping extreme rainfall in the Northwest Portugal region: statistical analysis and spatial modelling

    Science.gov (United States)

    Santos, Monica; Fragoso, Marcelo

    2010-05-01

    Extreme precipitation events are one of the causes of natural hazards, such as floods and landslides, which makes their investigation important, and this research aims to contribute to the study of extreme rainfall patterns in a Portuguese mountainous area. The study area is centred on the Arcos de Valdevez county, located in the northwest region of Portugal, the rainiest of the country, with more than 3000 mm of annual rainfall at the Peneda-Gerês mountain system. This work focuses on two main subjects related to the precipitation variability in the study area. First, a statistical analysis of several precipitation parameters is carried out, using daily data from 17 rain-gauges with a complete record for the 1960-1995 period. This approach aims to evaluate the main spatial contrasts regarding different aspects of the rainfall regime, described by ten parameters and indices of precipitation extremes (e.g. mean annual precipitation, the annual frequency of precipitation days, wet spell durations, maximum daily precipitation, maximum precipitation in 30 days, number of days with rainfall exceeding 100 mm and estimated maximum daily rainfall for a return period of 100 years). The results show that the highest precipitation amounts (from annual to daily scales) and the higher frequency of very abundant rainfall events occur in the Serra da Peneda and Gerês mountains, in contrast to the valleys of the Lima, Minho and Vez rivers, with lower precipitation amounts and less frequent heavy storms. The second purpose of this work is to find a method of mapping extreme rainfall in this mountainous region, investigating the complex influence of the relief (e.g. elevation, topography) on the precipitation patterns, as well as other geographical variables (e.g. distance from coast, latitude), applying tested geo-statistical techniques (Goovaerts, 2000; Diodato, 2005).
Models of linear regression were applied to evaluate the influence of different geographical variables (altitude

  13. Consequences of kriging and land use regression for PM2.5 predictions in epidemiologic analyses: insights into spatial variability using high-resolution satellite data.

    Science.gov (United States)

    Alexeeff, Stacey E; Schwartz, Joel; Kloog, Itai; Chudnovsky, Alexandra; Koutrakis, Petros; Coull, Brent A

    2015-01-01

    Many epidemiological studies use predicted air pollution exposures as surrogates for true air pollution levels. These predicted exposures contain exposure measurement error, yet simulation studies have typically found negligible bias in resulting health effect estimates. However, previous studies typically assumed a statistical spatial model for air pollution exposure, which may be oversimplified. We address this shortcoming by assuming a realistic, complex exposure surface derived from fine-scale (1 km × 1 km) remote-sensing satellite data. Using simulation, we evaluate the accuracy of epidemiological health effect estimates in linear and logistic regression when using spatial air pollution predictions from kriging and land use regression models. We examined chronic (long-term) and acute (short-term) exposure to air pollution. Results varied substantially across different scenarios. Exposure models with low out-of-sample R² yielded severe biases in the health effect estimates of some models, ranging from 60% upward bias to 70% downward bias. One land use regression exposure model with >0.9 out-of-sample R² yielded upward biases up to 13% for acute health effect estimates. Almost all models drastically underestimated the SEs. Land use regression models performed better in chronic effect simulations. These results can help researchers when interpreting health effect estimates in these types of studies.

  14. Visual grading characteristics and ordinal regression analysis during optimisation of CT head examinations.

    Science.gov (United States)

    Zarb, Francis; McEntee, Mark F; Rainford, Louise

    2015-06-01

    To evaluate visual grading characteristics (VGC) and ordinal regression analysis during head CT optimisation as a potential alternative to visual grading assessment (VGA), traditionally employed to score anatomical visualisation. Patient images (n = 66) were obtained using current and optimised imaging protocols from two CT suites: a 16-slice scanner at the national Maltese centre for trauma and a 64-slice scanner in a private centre. Local resident radiologists (n = 6) performed VGA followed by VGC and ordinal regression analysis. VGC alone indicated that optimised protocols had similar image quality as current protocols. Ordinal logistic regression analysis provided an in-depth evaluation, criterion by criterion allowing the selective implementation of the protocols. The local radiology review panel supported the implementation of optimised protocols for brain CT examinations (including trauma) in one centre, achieving radiation dose reductions ranging from 24 % to 36 %. In the second centre a 29 % reduction in radiation dose was achieved for follow-up cases. The combined use of VGC and ordinal logistic regression analysis led to clinical decisions being taken on the implementation of the optimised protocols. This improved method of image quality analysis provided the evidence to support imaging protocol optimisation, resulting in significant radiation dose savings. • There is need for scientifically based image quality evaluation during CT optimisation. • VGC and ordinal regression analysis in combination led to better informed clinical decisions. • VGC and ordinal regression analysis led to dose reductions without compromising diagnostic efficacy.

  15. Regression analysis of radiological parameters in nuclear power plants

    International Nuclear Information System (INIS)

    Bhargava, Pradeep; Verma, R.K.; Joshi, M.L.

    2003-01-01

    Indian Pressurized Heavy Water Reactors (PHWRs) have now attained maturity in their operations. Indian PHWR operation started in the year 1972. At present there are 12 operating PHWRs collectively producing nearly 2400 MWe. Sufficient radiological data are available for analysis to draw inferences which may be utilised for better understanding of the radiological parameters influencing the collective internal dose. Tritium is the main contributor to the occupational internal dose originating in PHWRs. An attempt has been made to establish the relationship between radiological parameters, which may be useful to draw inferences about the internal dose. Regression analysis has been done to find out the relationship, if it exists, among the following variables: A. Specific tritium activity of heavy water (Moderator and PHT) and tritium concentration in air at various work locations. B. Internal collective occupational dose and tritium release to environment through air route. C. Specific tritium activity of heavy water (Moderator and PHT) and collective internal occupational dose. For this purpose multivariate regression analysis has been carried out. D. Tritium concentration in air at various work locations and tritium release to environment through air route. For this purpose multivariate regression analysis has been carried out. This analysis reveals that collective internal dose correlates very well with the tritium activity release to the environment through air route, whereas no correlation has been found between specific tritium activity in the heavy water systems and collective internal occupational dose. A good correlation has been found in case D, and the F test reveals that it is not by chance. (author)

  16. Comparison of cranial sex determination by discriminant analysis and logistic regression.

    Science.gov (United States)

    Amores-Ampuero, Anabel; Alemán, Inmaculada

    2016-04-05

    Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).

  17. Relative accuracy of spatial predictive models for lynx Lynx canadensis derived using logistic regression-AIC, multiple criteria evaluation and Bayesian approaches

    Directory of Open Access Journals (Sweden)

    Shelley M. ALEXANDER

    2009-02-01

    We compared probability surfaces derived using one set of environmental variables in three Geographic Information Systems (GIS)-based approaches: logistic regression and Akaike’s Information Criterion (AIC), Multiple Criteria Evaluation (MCE), and Bayesian Analysis (specifically Dempster-Shafer theory). We used lynx Lynx canadensis as our focal species, and developed our environment relationship model using track data collected in Banff National Park, Alberta, Canada, during winters from 1997 to 2000. The accuracy of the three spatial models was compared using a contingency table method. We determined the percentage of cases in which both presence and absence points were correctly classified (overall accuracy), the failure to predict a species where it occurred (omission error) and the prediction of presence where there was absence (commission error). Our overall accuracy showed the logistic regression approach was the most accurate (74.51%). The multiple criteria evaluation was intermediate (39.22%), while the Dempster-Shafer (D-S) theory model was the poorest (29.90%). However, omission and commission error tell us a different story: logistic regression had the lowest commission error, while D-S theory produced the lowest omission error. Our results provide evidence that habitat modellers should evaluate all three error measures when ascribing confidence in their model. We suggest that for our study area at least, the logistic regression model is optimal. However, where sample size is small or the species is very rare, it may also be useful to explore and/or use a more ecologically cautious modelling approach (e.g. Dempster-Shafer) that would over-predict, protect more sites, and thereby minimize the risk of missing critical habitat in conservation plans [Current Zoology 55(1): 28 – 40, 2009].
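
    The three error measures described above can be read off a presence/absence contingency table. The sketch below is a minimal Python illustration, not the authors' procedure: the counts are invented, and the denominators (observed presences for omission, observed absences for commission) are one common convention that the abstract itself does not specify.

```python
def error_measures(tp, fn, fp, tn):
    """Accuracy measures from a 2x2 presence/absence contingency table.

    tp: presence predicted where the species occurred
    fn: absence predicted where the species occurred (omission)
    fp: presence predicted where the species was absent (commission)
    tn: absence predicted where the species was absent
    """
    total = tp + fn + fp + tn
    return {
        # share of all points, presences and absences, classified correctly
        "overall_accuracy": (tp + tn) / total,
        # assumed convention: missed presences as a share of observed presences
        "omission_error": fn / (tp + fn),
        # assumed convention: false presences as a share of observed absences
        "commission_error": fp / (fp + tn),
    }


# Hypothetical counts for a single model, for illustration only.
measures = error_measures(tp=30, fn=10, fp=5, tn=55)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

    Reporting all three values side by side makes the trade-off visible: a model can score well on overall accuracy while still missing a large share of the sites where the species actually occurs.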

  18. Regression Analysis: Instructional Resource for Cost/Managerial Accounting

    Science.gov (United States)

    Stout, David E.

    2015-01-01

    This paper describes a classroom-tested instructional resource, grounded in principles of active learning and constructivism, that embraces two primary objectives: "demystify" for accounting students technical material from statistics regarding ordinary least-squares (OLS) regression analysis--material that students may find obscure or…

  19. Spatial distribution and risk factors of influenza in Jiangsu province, China, based on geographical information system

    Directory of Open Access Journals (Sweden)

    Jia-Cheng Zhang

    2014-05-01

    Influenza poses a constant, heavy burden on society. Recent research has focused on ecological factors associated with influenza incidence and has also studied influenza with respect to its geographic spread at different scales. This research explores the temporal and spatial parameters of influenza and identifies factors influencing its transmission. A spatial autocorrelation analysis, a spatial-temporal cluster analysis and a spatial regression analysis of influenza rates, carried out in Jiangsu province from 2004 to 2011, found influenza rates to be spatially dependent in 2004, 2005, 2006 and 2008. South-western districts consistently revealed hotspots of high-incidence influenza. The regression analysis indicates that railways, rivers and lakes are important predictive environmental variables for influenza risk. A better understanding of the epidemic pattern and ecological factors associated with pandemic influenza should benefit public health officials with respect to prevention and control measures during future epidemics.

  20. Spatial prediction of Lactarius deliciosus and Lactarius salmonicolor mushroom distribution with logistic regression models in the Kızılcasu Planning Unit, Turkey.

    Science.gov (United States)

    Mumcu Kucuker, Derya; Baskent, Emin Zeki

    2015-01-01

    Integration of non-wood forest products (NWFPs) into forest management planning has become an increasingly important issue in forestry over the last decade. Among NWFPs, mushrooms are valued due to their medicinal, commercial, high nutritional and recreational importance. Commercial mushroom harvesting also provides important income to local dwellers and contributes to the economic value of regional forests. Sustainable management of these products at the regional scale requires information on their locations in diverse forest settings and the ability to predict and map their spatial distributions over the landscape. This study focuses on modeling the spatial distribution of commercially harvested Lactarius deliciosus and L. salmonicolor mushrooms in the Kızılcasu Forest Planning Unit, Turkey. The best models were developed based on topographic, climatic and stand characteristics, separately through logistic regression analysis using SPSS™. The best topographic model provided better classification success (69.3 %) than the best climatic (65.4 %) and stand (65 %) models. However, the overall best model, with 73 % overall classification success, used a mix of several variables. The best models were integrated into an Arc/Info GIS program to create spatial distribution maps of L. deliciosus and L. salmonicolor in the planning area. Our approach may be useful to predict the occurrence and distribution of other NWFPs and provide a valuable tool for designing silvicultural prescriptions and preparing multiple-use forest management plans.

  1. A spatial analysis of county-level variation in syphilis and gonorrhea in Guangdong Province, China.

    Directory of Open Access Journals (Sweden)

    Nicholas X Tan

    2011-05-01

    Sexually transmitted infections (STIs) have made a resurgence in many rapidly developing regions of southern China, but there is little understanding of the social changes that contribute to this spatial distribution of STIs. This study examines county-level socio-demographic characteristics associated with syphilis and gonorrhea in Guangdong Province. This study uses linear regression and spatial lag regression to determine county-level (n = 97) socio-demographic characteristics associated with a greater burden of syphilis, gonorrhea, and a combined syphilis/gonorrhea index. Data were obtained from the 2005 China Population Census and published public health data. A range of socio-demographic variables including gross domestic product, the Gender Empowerment Measure, standard of living, education level, migrant population and employment are examined. Reported syphilis and gonorrhea cases are disproportionately clustered in the Pearl River Delta, the central region of Guangdong Province. A higher fraction of employed men among the adult population, a higher fraction of divorced men among the adult population, and a higher standard of living (based on water availability and people per room) are significantly associated with higher STI cases across all three models. Gross domestic product and gender inequality measures are not significant predictors of reported STIs in these models. Although many ecological studies of STIs have found poverty to be associated with higher reported STIs, this analysis found a greater number of reported syphilis cases in counties with a higher standard of living. Spatially targeted syphilis screening measures in regions with a higher standard of living may facilitate successful control efforts. This analysis also reinforces the importance of changing male sexual behaviors as part of a comprehensive response to syphilis control in China.

  2. Robust Mediation Analysis Based on Median Regression

    Science.gov (United States)

    Yuan, Ying; MacKinnon, David P.

    2014-01-01

    Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. PMID:24079925

  3. Multiplication factor versus regression analysis in stature estimation from hand and foot dimensions.

    Science.gov (United States)

    Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha

    2012-05-01

    Estimation of stature is an important parameter in identification of human remains in forensic examinations. The present study aims to compare the reliability and accuracy of stature estimation and to demonstrate the variability between estimated stature and actual stature using multiplication factor and regression analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements (hand length, hand breadth, foot length and foot breadth), taken on the left side in each subject, were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for estimation of stature from hand and foot dimensions. Derived multiplication factors and regression formulae were applied to the hand and foot measurements in the study sample. The estimated stature from the multiplication factors and regression analysis was compared with the actual stature to find the error in estimated stature. The results indicate that the range of error in estimation of stature from the regression analysis method is less than that of the multiplication factor method, thus confirming that the regression analysis method is better than multiplication factor analysis in stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
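
    The comparison between the two estimation methods can be sketched in a few lines of Python. The measurements below are invented for illustration and are not the study's data; the multiplication factor is taken here as the mean ratio of stature to foot length, which is an assumption about how such factors are usually derived.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def sum_squared_error(actual, estimated):
    return sum((a - e) ** 2 for a, e in zip(actual, estimated))


# Hypothetical foot lengths and statures in cm (illustrative values only).
foot_length = [23.1, 24.0, 24.8, 25.5, 26.3, 27.0]
stature = [158.0, 162.5, 166.0, 170.5, 173.0, 178.5]

# Multiplication factor method: stature is estimated as factor * foot length.
factor = sum(s / f for s, f in zip(stature, foot_length)) / len(stature)
mf_estimates = [factor * f for f in foot_length]

# Regression method: stature is estimated as intercept + slope * foot length.
intercept, slope = fit_line(foot_length, stature)
reg_estimates = [intercept + slope * f for f in foot_length]

sse_mf = sum_squared_error(stature, mf_estimates)
sse_regression = sum_squared_error(stature, reg_estimates)
print(f"multiplication factor SSE: {sse_mf:.2f}")
print(f"regression SSE:            {sse_regression:.2f}")
```

    On the data used for fitting, the regression line can never have a larger squared error than the multiplication factor, because a line through the origin is a special case of the fitted line. The abstract compares the range of error instead, so this sketch illustrates the reasoning rather than reproducing the study's result.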

  4. A comparative study of multiple regression analysis and back ...

    Indian Academy of Sciences (India)

    Abhijit Sarkar

    artificial neural network (ANN) models to predict weld bead geometry and HAZ width in submerged arc welding ... Keywords. Submerged arc welding (SAW); multi-regression analysis (MRA); artificial neural network ..... Degree of freedom.

  5. Multilayer perceptron for robust nonlinear interval regression analysis using genetic algorithms.

    Science.gov (United States)

    Hu, Yi-Chung

    2014-01-01

    On the basis of fuzzy regression, computational models in intelligence such as neural networks have the capability to be applied to nonlinear interval regression analysis for dealing with uncertain and imprecise data. When training data are not contaminated by outliers, computational models perform well by including almost all given training data in the data interval. Nevertheless, since training data are often corrupted by outliers, robust learning algorithms employed to resist outliers for interval regression analysis have been an interesting area of research. Several approaches involving computational intelligence are effective for resisting outliers, but the required parameters for these approaches are related to whether the collected data contain outliers or not. Since it seems difficult to prespecify the degree of contamination, this paper uses a multilayer perceptron to construct the robust nonlinear interval regression model using the genetic algorithm. Outliers beyond or beneath the data interval will have only a slight effect on the determination of the data interval. Simulation results demonstrate that the proposed method performs well for contaminated datasets.

  6. Exploratory regression analysis: a tool for selecting models and determining predictor importance.

    Science.gov (United States)

    Braun, Michael T; Oswald, Frederick L

    2011-06-01

    Linear regression analysis is one of the most important tools in a researcher's toolbox for creating and testing predictive models. Although linear regression analysis indicates how strongly a set of predictor variables, taken together, will predict a relevant criterion (i.e., the multiple R), the analysis cannot indicate which predictors are the most important. Although there is no definitive or unambiguous method for establishing predictor variable importance, there are several accepted methods. This article reviews those methods for establishing predictor importance and provides a program (in Excel) for implementing them (available for direct download at http://dl.dropbox.com/u/2480715/ERA.xlsm?dl=1). The program investigates all 2^p - 1 submodels and produces several indices of predictor importance. This exploratory approach to linear regression, similar to other exploratory data analysis techniques, has the potential to yield both theoretical and practical benefits.
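
    The count of 2^p - 1 submodels comes from taking every non-empty subset of the p predictors. The following Python sketch enumerates those subsets and fits each by ordinary least squares; it is an independent illustration with made-up data, not the Excel program the abstract describes, and it reports only R² rather than the importance indices the article reviews.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(X, y, cols):
    """R^2 of an OLS fit of y on an intercept plus the predictor columns in cols."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def all_submodels(X, y):
    """Map every non-empty subset of predictor indices to its R^2."""
    p = len(X[0])
    return {
        cols: r_squared(X, y, cols)
        for size in range(1, p + 1)
        for cols in combinations(range(p), size)
    }


# Hypothetical data: six observations of p = 3 predictors.
X = [[1, 2, 5], [2, 1, 3], [3, 4, 8], [4, 3, 1], [5, 6, 4], [6, 5, 9]]
y = [3.1, 3.9, 7.2, 6.8, 11.1, 10.9]

results = all_submodels(X, y)
for cols, r2 in sorted(results.items(), key=lambda kv: -kv[1]):
    print(cols, round(r2, 4))
```

    Because the number of submodels doubles with every added predictor, exhaustive enumeration of this kind is practical only for a modest number of predictors.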

  7. Spatial analysis of myocardial infarction in Iran: National report from the Iranian myocardial infarction registry

    Directory of Open Access Journals (Sweden)

    Ali Ahmadi

    2015-01-01

    Background: Myocardial infarction (MI) is a leading cause of mortality and morbidity in Iran. No spatial analysis of MI has been conducted to date. The present study was conducted to determine the pattern of MI incidence and to identify the associated factors in Iran by province. Materials and Methods: This study has two parts. One part is prospective and hospital-based, and the other part is an ecological study. In this study, the data of 20,750 new MI cases registered in the Iranian Myocardial Infarction Registry in 2012 were used. For global and local spatial analysis, spatial autocorrelation, Moran′s I, Getis-Ord, and logistic regression models were used. Data were analyzed by Stata software and ArcGIS 9.3. Results: Based on the autocorrelation coefficient, a specific pattern was observed in the distribution of MI incidence in different provinces (Moran′s I: 0.75, P < 0.001). The spatial pattern of incidence was approximately the same in men and women. MI incidence was clustered in six provinces (North Khorasan, Yazd, Kerman, Semnan, Golestan, and Mazandaran). Among the factors associated with clustered MI in the six provinces were temperature, humidity, hypertension, smoking, and body mass index (BMI). Hypertension, smoking, and BMI contributed to clustering with odds ratios of 2.36, 1.31, and 1.31, respectively. Conclusion: Addressing the place-based pattern of incidence and clarifying its epidemiologic dimension, including spatial analysis, has not yet been implemented in Iran. Reporting MI incidence rates by place and formal borders is useful for planning and prioritization at different levels of the health system.
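
    Global Moran's I, the autocorrelation coefficient reported above, can be computed directly from a vector of regional values and a spatial weights matrix. The Python sketch below uses a toy layout of four regions in a row with binary adjacency weights; both the layout and the values are assumptions for illustration and have nothing to do with the Iranian registry data.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and spatial weights w_ij.

    I = (n / S0) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2,
    where m is the mean of x and S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


# Toy layout (assumed): four regions in a row, each adjacent to its neighbours.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1.0, 2.0, 3.0, 4.0], adjacency)       # similar values sit together
alternating = morans_i([1.0, 4.0, 1.0, 4.0], adjacency)  # neighbours always differ
print(smooth, alternating)
```

    Positive values indicate that neighbouring regions tend to have similar rates, negative values that neighbours tend to differ, and values near the expectation of -1/(n - 1) indicate no spatial pattern. A significance test such as the P value quoted in the abstract requires a permutation or normal approximation step that is not shown here.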

  8. Regression analysis for LED color detection of visual-MIMO system

    Science.gov (United States)

    Banik, Partha Pratim; Saha, Rappy; Kim, Ki-Doo

    2018-04-01

    Color detection from a light emitting diode (LED) array using a smartphone camera is very difficult in a visual multiple-input multiple-output (visual-MIMO) system. In this paper, we propose a method to determine the LED color using a smartphone camera by applying regression analysis. We employ a multivariate regression model to identify the LED color. After taking a picture of an LED array, we select the LED array region, and detect the LED using an image processing algorithm. We then apply the k-means clustering algorithm to determine the number of potential colors for feature extraction of each LED. Finally, we apply the multivariate regression model to predict the color of the transmitted LEDs. In this paper, we show our results for three types of environmental light conditions: room environmental light, low environmental light (560 lux), and strong environmental light (2450 lux). We compare the results of our proposed algorithm using training and test R-Square (%) values and the percentage of closeness of transmitted and predicted colors, and we also report the number of distorted test data points from the analysis of a distortion bar graph in CIE1931 color space.

  9. Evaluation of syngas production unit cost of bio-gasification facility using regression analysis techniques

    Energy Technology Data Exchange (ETDEWEB)

    Deng, Yangyang; Parajuli, Prem B.

    2011-08-10

    Evaluation of economic feasibility of a bio-gasification facility needs understanding of its unit cost under different production capacities. The objective of this study was to evaluate the unit cost of syngas production at capacities from 60 through 1800 Nm³/h using an economic model with three regression analysis techniques (simple regression, reciprocal regression, and log-log regression). The preliminary result of this study showed that the reciprocal regression analysis technique had the best fit curve between per unit cost and production capacity, with sum of error squares (SES) lower than 0.001 and coefficient of determination (R²) of 0.996. The regression analysis techniques determined the minimum unit cost of syngas production for micro-scale bio-gasification facilities of $0.052/Nm³, under the capacity of 2,880 Nm³/h. The results of this study suggest that to reduce cost, facilities should run at a high production capacity. In addition, the contribution of this technique could be the new categorical criterion to evaluate micro-scale bio-gasification facility from the perspective of economic analysis.
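
    The three functional forms named above can each be fitted with ordinary least squares after a simple transformation of the variables. The Python sketch below does this on synthetic data; the capacities, the cost curve and its coefficients are invented for illustration and are not taken from the study.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def sse(actual, predicted):
    """Sum of squared errors in the original cost units."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


# Synthetic data (assumed): cost falls towards a floor as capacity grows.
capacity = [60.0, 120.0, 300.0, 600.0, 1200.0, 1800.0]
unit_cost = [0.05 + 6.0 / c for c in capacity]

# Simple regression: cost = a + b * capacity
a1, b1 = fit_line(capacity, unit_cost)
simple_pred = [a1 + b1 * c for c in capacity]

# Reciprocal regression: cost = a + b / capacity
a2, b2 = fit_line([1.0 / c for c in capacity], unit_cost)
reciprocal_pred = [a2 + b2 / c for c in capacity]

# Log-log regression: ln(cost) = a + b * ln(capacity)
a3, b3 = fit_line([math.log(c) for c in capacity], [math.log(u) for u in unit_cost])
loglog_pred = [math.exp(a3) * c ** b3 for c in capacity]

fits = {
    "simple": sse(unit_cost, simple_pred),
    "reciprocal": sse(unit_cost, reciprocal_pred),
    "log-log": sse(unit_cost, loglog_pred),
}
best = min(fits, key=fits.get)
print(fits, best)
```

    In the reciprocal form the intercept plays the role of a cost floor that the unit cost approaches as capacity increases, which is consistent with the abstract's advice to run facilities at high capacity.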

  10. A primer for biomedical scientists on how to execute model II linear regression analysis.

    Science.gov (United States)

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
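
    For readers unfamiliar with ordinary least products (OLP) regression, also known as geometric mean or reduced major axis regression, the slope is the ratio of the standard deviations of y and x, carrying the sign of their correlation. The Python sketch below computes the OLP coefficients for made-up data and compares them with the Model I least-squares slope; it does not attempt the confidence intervals, which the abstract says need bootstrapping or dedicated software.

```python
import math


def sums_of_squares(x, y):
    """Return means and centred sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return mx, my, sxx, syy, sxy


def ols_fit(x, y):
    """Model I: least-squares regression of y on x (x assumed error-free)."""
    mx, my, sxx, _, sxy = sums_of_squares(x, y)
    slope = sxy / sxx
    return my - slope * mx, slope


def olp_fit(x, y):
    """Model II: ordinary least products regression (both variables subject to error)."""
    mx, my, sxx, syy, sxy = sums_of_squares(x, y)
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return my - slope * mx, slope


# Hypothetical paired measurements, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ols_intercept, ols_slope = ols_fit(x, y)
olp_intercept, olp_slope = olp_fit(x, y)
inverse_intercept, inverse_slope = olp_fit(y, x)  # swap the roles of x and y
print(ols_slope, olp_slope, inverse_slope)
```

    Unlike the least-squares line, the OLP line is symmetric: fitting x on y gives exactly the reciprocal slope, which is why it suits the case where neither variable can be treated as fixed by the experimenter.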

  11. External Tank Liquid Hydrogen (LH2) Prepress Regression Analysis Independent Review Technical Consultation Report

    Science.gov (United States)

    Parsons, Vickie S.

    2009-01-01

    The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center (NESC) on September 20, 2005. The NESC team performed an independent review of regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.

  12. Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data.

    Science.gov (United States)

    Ying, Gui-Shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard

    2017-04-01

    To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field in the elderly. When refractive error from both eyes was analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI -0.03 to 0.32D, p = 0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with a narrower 95% CI (0.01 to 0.28D, p = 0.03). Standard regression for visual field data from both eyes provided biased estimates of standard error (generally underestimated) and smaller p-values, while analysis of the worse eye provided larger p-values than mixed effects models and marginal models. In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid, but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and maximize power and precision.

  13. Spatial Econometric data analysis: moving beyond traditional models

    NARCIS (Netherlands)

    Florax, R.J.G.M.; Vlist, van der A.J.

    2003-01-01

    This article appraises recent advances in the spatial econometric literature. It serves as the introduction to a collection of new papers on spatial econometric data analysis brought together in this special issue, dealing specifically with new extensions to the spatial econometric modeling

  14. Non-stationary hydrologic frequency analysis using B-spline quantile regression

    Science.gov (United States)

    Nasri, B.; Bouezmarni, T.; St-Hilaire, A.; Ouarda, T. B. M. J.

    2017-11-01

    Hydrologic frequency analysis is commonly used by engineers and hydrologists to provide the basic information for the planning, design and management of hydraulic and water resources systems under the assumption of stationarity. However, with increasing evidence of climate change, the assumption of stationarity, which is a prerequisite for traditional frequency analysis, may no longer hold, and hence the results of conventional analysis would become questionable. In this study, we consider a framework for frequency analysis of extremes based on B-spline quantile regression, which allows modelling of data in the presence of non-stationarity and/or dependence on covariates, with linear and non-linear dependence. A Markov Chain Monte Carlo (MCMC) algorithm was used to estimate quantiles and their posterior distributions. A coefficient of determination and the Bayesian information criterion (BIC) for quantile regression are used to select the best model, i.e. for each quantile, we choose the degree and number of knots of the adequate B-spline quantile regression model. The method is applied to annual maximum and minimum streamflow records in Ontario, Canada. Climate indices are considered to describe the non-stationarity in the variable of interest and to estimate the quantiles in this case. The results show large differences between the non-stationary quantiles and their stationary equivalents for annual maximum and minimum discharge with high annual non-exceedance probabilities.
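
    For orientation, the check (pinball) loss that standard quantile regression minimizes can be sketched as follows. This is not the authors' B-spline or MCMC procedure; it only illustrates, with hypothetical data and illustrative function names, that the constant minimizing the check loss at level tau is an empirical tau-quantile.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau on under-prediction, (1 - tau) on over-prediction."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual

def total_loss(data, q, tau):
    """Total check loss of predicting the constant q for every observation."""
    return sum(pinball_loss(y - q, tau) for y in data)

def best_constant(data, tau):
    """Search the observed values for the one minimizing the total check loss."""
    return min(sorted(data), key=lambda q: total_loss(data, q, tau))

# Hypothetical annual maxima with one extreme value.
flows = [1.0, 2.0, 3.0, 4.0, 100.0]
print(best_constant(flows, 0.5))  # the median
print(best_constant(flows, 0.9))  # an upper quantile
```

    In a regression setting the constant q is replaced by a function of covariates (for example a B-spline expansion), and the same loss is minimized over its coefficients.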

  15. Spatial interpolation schemes of daily precipitation for hydrologic modeling

    Science.gov (United States)

    Hwang, Y.; Clark, M.R.; Rajagopalan, B.; Leavesley, G.

    2012-01-01

    Distributed hydrologic models typically require spatial estimates of precipitation interpolated from sparsely located observational points to the specific grid points. We compare and contrast the performance of regression-based statistical methods for the spatial estimation of precipitation in two hydrologically different basins and confirm that widely used regression-based estimation schemes fail to describe the realistic spatial variability of the daily precipitation field. The methods assessed are: (1) inverse distance weighted average; (2) multiple linear regression (MLR); (3) climatological MLR; and (4) locally weighted polynomial regression (LWP). In order to improve the performance of the interpolations, the authors propose a two-step regression technique for effective daily precipitation estimation. In this simple two-step estimation process, precipitation occurrence is first generated via a logistic regression model before estimating the amount of precipitation separately on wet days. This process generated the precipitation occurrence, amount, and spatial correlation effectively. A distributed hydrologic model (PRMS) was used for the impact analysis in daily time step simulation. Multiple simulations suggested noticeable differences between the input alternatives generated by three different interpolation schemes. Differences are shown in overall simulation error against the observations, degree of explained variability, and seasonal volumes. Simulated streamflows also showed different characteristics in mean, maximum, minimum, and peak flows. Given the same parameter optimization technique, LWP input showed the least streamflow error in the Alapaha basin and CMLR input showed the least error (still very close to LWP) in the Animas basin. All of the two-step interpolation inputs resulted in lower streamflow error compared to the directly interpolated inputs. © 2011 Springer-Verlag.
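
    The first method in the list above, the inverse distance weighted average, can be sketched in a few lines. The station coordinates, values and the distance exponent of 2 are hypothetical choices for illustration; the abstract does not state which exponent or neighbourhood the authors used.

```python
import math

def idw(stations, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    stations: list of ((x, y), value) pairs; target: (x, y).
    """
    numerator = 0.0
    denominator = 0.0
    for (x, y), value in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no stations supplied")
    return numerator / denominator

# Hypothetical gauges: coordinates in km, daily precipitation in mm.
gauges = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
print(idw(gauges, (1.0, 0.0)))  # midway between the two gauges
print(idw(gauges, (0.5, 0.0)))  # closer to the first gauge
```

    Nearer gauges dominate the estimate, so the interpolated field is smooth and bounded by the observed values, which is one reason such schemes can understate the spatial variability of daily precipitation.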

  16. Collaborative regression-based anatomical landmark detection

    International Nuclear Information System (INIS)

    Gao, Yaozong; Shen, Dinggang

    2015-01-01

    Anatomical landmark detection plays an important role in medical image analysis, e.g. for registration, segmentation and quantitative analysis. Among the various existing methods for landmark detection, regression-based methods have recently attracted much attention due to their robustness and efficiency. In these methods, landmarks are localised through voting from all image voxels, which is completely different from the classification-based methods that use voxel-wise classification to detect landmarks. Despite their robustness, the accuracy of regression-based landmark detection methods is often limited due to (1) the inclusion of uninformative image voxels in the voting procedure, and (2) the lack of effective ways to incorporate inter-landmark spatial dependency into the detection step. In this paper, we propose a collaborative landmark detection framework to address these limitations. The concept of collaboration is reflected in two aspects. (1) Multi-resolution collaboration. A multi-resolution strategy is proposed to hierarchically localise landmarks by gradually excluding uninformative votes from faraway voxels. Moreover, for informative voxels near the landmark, a spherical sampling strategy is also designed at the training stage to improve their prediction accuracy. (2) Inter-landmark collaboration. A confidence-based landmark detection strategy is proposed to improve the detection accuracy of ‘difficult-to-detect’ landmarks by using spatial guidance from ‘easy-to-detect’ landmarks. To evaluate our method, we conducted experiments extensively on three datasets for detecting prostate landmarks and head and neck landmarks in computed tomography images, and also dental landmarks in cone beam computed tomography images. The results show the effectiveness of our collaborative landmark detection framework in improving landmark detection accuracy, compared to other state-of-the-art methods. (paper)

  17. Spatial noise-aware temperature retrieval from infrared sounder data

    DEFF Research Database (Denmark)

    Malmgren-Hansen, David; Laparra, Valero; Nielsen, Allan Aasbjerg

    2017-01-01

    Principal Component Analysis (PCA) and Minimum Noise Fraction (MNF) for dimensionality reduction, and study the compactness and information content of the extracted features. Assessment of the results is done on a big dataset covering many spatial and temporal situations. PCA is widely used...... for these purposes but our analysis shows that one can gain significant improvements of the error rates when using MNF instead. In our analysis we also investigate the relationship between error rate improvements when including more spectral and spatial components in the regression model, aiming to uncover the trade...

  18. Alternative Methods of Regression

    CERN Document Server

    Birkes, David

    2011-01-01

    Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts "...an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models...highly recommend[ed]...for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s

  19. Spatial-temporal analysis of building surface temperatures in Hung Hom

    Science.gov (United States)

    Zeng, Ying; Shen, Yueqian

    2015-12-01

    This thesis presents a study on the spatial-temporal analysis of building surface temperatures in Hung Hom. Observations were collected from Aug 2013 to Oct 2013 at a 30-min interval, using iButton sensors (N=20) covering twelve locations in Hung Hom. Thermal images were also captured at PolyU from 05 Aug 2013 to 06 Aug 2013. A linear regression model of iButton and thermal records is established to calibrate the temperature data. A 3D modeling system is developed on the Visual Studio 2010 development platform, using the ArcEngine10.0 component, a Microsoft Access 2010 database and the C# programming language. The system supports data processing, spatial analysis, compound queries and 3D face temperature rendering. After statistical analyses, building face azimuths are found to have a statistically significant relationship with sun azimuths at peak time. Seasonal changes in building temperature also correspond to variations in sun angle and sun azimuth. Building materials are found to have a significant effect on building surface temperatures. Buildings with lower-albedo materials tend to have higher temperatures, and materials with larger thermal conductivity show significant diurnal variations. For the geographical locations, the peripheral faces of the campus have higher temperatures than the inner faces during daytime, and buildings located in the southeast are cooler than those in the west. Furthermore, human activity is found to have a strong relationship with building surface temperatures through weekday and weekend comparison.

  20. Temporal-Spatial Analysis of Traffic Congestion Based on Modified CTM

    Directory of Open Access Journals (Sweden)

    Chenglong Chu

    2015-01-01

    Full Text Available A modified cell transmission model (CTM) is proposed to depict the temporal-spatial evolution of traffic congestion on urban freeways. Specifically, drivers' adaptive behaviors and the corresponding influence on traffic flows are emphasized. Two piecewise linear regression models are proposed to describe the relationship between flow and density (occupancy). Several types of cellular connections are designed to depict urban rapid roads with on/off-ramps and junctions. Based on data collected on the Queen Elizabeth freeway in Ontario, Canada, we show that the new model describes the temporal-spatial evolution of traffic congestion with relatively higher accuracy.

  1. Gender, space, and the location changes of jobs and people: a spatial simultaneous equations analysis.

    Science.gov (United States)

    Hoogstra, Gerke J

    2012-01-01

    This article summarizes a spatial econometric analysis of local population and employment growth in the Netherlands, with specific reference to impacts of gender and space. The simultaneous equations model used distinguishes between population- and gender-specific employment groups, and includes autoregressive and cross-regressive spatial lags to detect relations both within and among these groups. Spatial weights matrices reflecting different bands of travel times are used to calculate the spatial lags and to gauge the spatial nature of these relations. The empirical results show that although population–employment interaction is more localized for women's employment, no gender difference exists in the direction of interaction. Employment growth for both men and women is more influenced by population growth than vice versa. The interaction within employment groups is even more important than population growth. Women's, and especially men's, local employment growth mostly benefits from the same employment growth in neighboring locations. Finally, interaction between these groups is practically absent, although men's employment growth may have a negative impact on women's employment growth within small geographic areas. In summary, the results confirm the crucial roles of gender and space, and offer important insights into possible relations within and among subgroups of jobs and people.

  2. On macroeconomic values investigation using fuzzy linear regression analysis

    Directory of Open Access Journals (Sweden)

    Richard Pospíšil

    2017-06-01

    Full Text Available The theoretical background for the abstract formalization of vague phenomena in complex systems is fuzzy set theory. In the paper, vague data are defined as specialized fuzzy sets (fuzzy numbers), and a fuzzy linear regression model is described as a fuzzy function with fuzzy numbers as vague parameters. To identify the fuzzy coefficients of the model, a genetic algorithm is used. The linear approximation of the vague function, together with its possibility area, is expressed analytically and graphically. A suitable application is performed in the tasks of time series fuzzy regression analysis. The time trend and seasonal cycles, including their possibility areas, are calculated and expressed. The examples are drawn from economics, namely the time development of unemployment, agricultural production and construction between 2009 and 2011 in the Czech Republic. The results are shown in the form of fuzzy regression models of the time series variables. For the period 2009-2011, the assumptions about the seasonal behaviour of the variables and the relationships between them were confirmed; in 2010, the system behaved in a fuzzier way and the relationships between the variables were vaguer, which has many causes, from differing elasticity of demand, through state interventions, to globalization and transnational impacts.

  3. Testing the transferability of regression equations derived from small sub-catchments to a large area in central Sweden

    Directory of Open Access Journals (Sweden)

    C. Xu

    2003-01-01

    Full Text Available There is an ever increasing need to apply hydrological models to catchments where streamflow data are unavailable or to large geographical regions where calibration is not feasible. Estimation of model parameters from spatial physical data is the key issue in the development and application of hydrological models at various scales. To investigate the suitability of transferring the regression equations relating model parameters to physical characteristics developed from small sub-catchments to a large region for estimating model parameters, a conceptual snow and water balance model was optimised on all the sub-catchments in the region. A multiple regression analysis related model parameters to physical data for the catchments and the regression equations derived from the small sub-catchments were used to calculate regional parameter values for the large basin using spatially aggregated physical data. For the model tested, the results support the suitability of transferring the regression equations to the larger region. Keywords: water balance modelling, large scale, multiple regression, regionalisation

  4. Singular spectrum analysis, Harmonic regression and El-Nino effect ...

    Indian Academy of Sciences (India)

    42

    Keywords: Total ozone; Singular Spectrum Analysis; Spatial interpolation; Multivariate ENSO .... needed for a whole gamut of activities that contribute to the ultimate synthesis ..... −0.0009x³ + 0.0581x² − 1.0123x + 7.3246, R² = 0.53…

  5. REGRESSION ANALYSIS OF SEA-SURFACE-TEMPERATURE PATTERNS FOR THE NORTH PACIFIC OCEAN.

    Science.gov (United States)

    SEA WATER, *SURFACE TEMPERATURE, *OCEANOGRAPHIC DATA, PACIFIC OCEAN, REGRESSION ANALYSIS, STATISTICAL ANALYSIS, UNDERWATER EQUIPMENT, DETECTION, UNDERWATER COMMUNICATIONS, DISTRIBUTION, THERMAL PROPERTIES, COMPUTERS.

  6. Analysis of an environmental exposure health questionnaire in a metropolitan minority population utilizing logistic regression and Support Vector Machines.

    Science.gov (United States)

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D; Hood, Darryl B; Skelton, Tyler

    2013-02-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was the respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminatory power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire.

  7. Regression analysis: understanding and building business and economic models using Excel

    CERN Document Server

    Wilson, J Holton

    2012-01-01

    The technique of regression analysis is used so often in business and economics today that an understanding of its use is necessary for almost everyone engaged in the field. This book will teach you the essential elements of building and understanding regression models in a business/economic context in an intuitive manner. The authors take a non-theoretical treatment that is accessible even if you have a limited statistical background. It is specifically designed to teach the correct use of regression, while advising you of its limitations and teaching about common pitfalls. This book describe

  8. Nonlinear regression analysis for evaluating tracer binding parameters using the programmable K1003 desk computer

    International Nuclear Information System (INIS)

    Sarrach, D.; Strohner, P.

    1986-01-01

    The Gauss-Newton algorithm has been used to evaluate tracer binding parameters of RIA by nonlinear regression analysis. The calculations were carried out on the K1003 desk computer. Equations for simple binding models and their derivatives are presented. The advantages of nonlinear regression analysis over linear regression are demonstrated.
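
    A minimal sketch of a Gauss-Newton iteration is given below, assuming a one-site saturation binding model, bound = Bmax * L / (Kd + L). The model choice, the synthetic noise-free data and the starting values are assumptions made for illustration; the abstract does not specify which binding equations were used.

```python
def gauss_newton(ligand, bound, bmax, kd, iterations=50, tol=1e-12):
    """Fit bound = bmax * L / (kd + L) by Gauss-Newton; returns (bmax, kd)."""
    for _ in range(iterations):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for L, y in zip(ligand, bound):
            denom = kd + L
            residual = y - bmax * L / denom
            j1 = L / denom                 # partial derivative w.r.t. bmax
            j2 = -bmax * L / denom ** 2    # partial derivative w.r.t. kd
            # Accumulate the normal equations (J^T J) delta = J^T r.
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * residual
            g2 += j2 * residual
        det = a11 * a22 - a12 * a12
        if det == 0.0:
            raise ValueError("singular normal equations")
        # Solve the 2x2 system with Cramer's rule.
        step_bmax = (g1 * a22 - g2 * a12) / det
        step_kd = (a11 * g2 - a12 * g1) / det
        bmax += step_bmax
        kd += step_kd
        if abs(step_bmax) + abs(step_kd) < tol:
            break
    return bmax, kd

# Synthetic data generated with Bmax = 10 and Kd = 2 (hypothetical units).
ligand = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
bound = [10.0 * L / (2.0 + L) for L in ligand]
print(gauss_newton(ligand, bound, bmax=8.0, kd=1.5))
```

    Plain Gauss-Newton has no step control, so with noisy data or poor starting values it may diverge; a damped variant is a common safeguard.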

  9. Regression analysis for the social sciences

    CERN Document Server

    Gordon, Rachel A

    2010-01-01

    The book provides graduate students in the social sciences with the basic skills that they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly available social science data or drawn from the literature; thorough integration of teaching statistical theory with teaching data processing and analysis; and teaching of both SAS and Stata "side-by-side", with chapter exercises in which students practice programming and interpretation on the same data set and course exercises in which students can choose their own research questions and data set.

  10. Geographical, temporal and racial disparities in late-stage prostate cancer incidence across Florida: A multiscale joinpoint regression analysis

    Directory of Open Access Journals (Sweden)

    Goovaerts Pierre

    2011-12-01

    Full Text Available Background: Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explain such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Methods: Time series (1981-2007) of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. Results: State-level percentage of late-stage diagnosis decreased 50% since 1981, a decline that accelerated in the 1990s when Prostate Specific Antigen (PSA) screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC) were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated

  11. Geographical, temporal and racial disparities in late-stage prostate cancer incidence across Florida: a multiscale joinpoint regression analysis.

    Science.gov (United States)

    Goovaerts, Pierre; Xiao, Hong

    2011-12-05

    Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explain such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Time series (1981-2007) of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. State-level percentage of late-stage diagnosis decreased 50% since 1981, a decline that accelerated in the 1990s when Prostate Specific Antigen (PSA) screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC) were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated that the spatial extent of racial disparities reached a

  12. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    Science.gov (United States)

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by a Data Acquisition and Evaluation System. The authors coupled the collected TBM drive data with available information on rock mass properties; the data were then cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were built for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable, and the data were screened with different computational approaches, allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and response variables, to search for the best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than the constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.

  13. Estimating leaf photosynthetic pigments information by stepwise multiple linear regression analysis and a leaf optical model

    Science.gov (United States)

    Liu, Pudong; Shi, Runhe; Wang, Hong; Bai, Kaixu; Gao, Wei

    2014-10-01

    Leaf pigments are key elements for plant photosynthesis and growth. Traditional manual sampling of these pigments is labor-intensive and costly, and it has difficulty capturing their temporal and spatial characteristics. The aim of this work is to estimate photosynthetic pigments at large scale by remote sensing. For this purpose, inverse models were proposed with the aid of stepwise multiple linear regression (SMLR) analysis. Furthermore, a leaf radiative transfer model (i.e. the PROSPECT model) was employed to simulate leaf reflectance at wavelengths from 400 to 780 nm at 1 nm intervals, and these values were then treated as the data from remote sensing observations. Meanwhile, simulated chlorophyll concentration (Cab), carotenoid concentration (Car) and their ratio (Cab/Car) were each taken as the target for a separate regression model. In this study, a total of 4000 samples were simulated via PROSPECT with different Cab, Car and leaf mesophyll structures; 70% of these samples were used for training and the remaining 30% for model validation. Reflectance (r) and its mathematical transformations (1/r and log(1/r)) were each employed to build regression models. Results showed fair agreement between pigments and simulated reflectance, with all adjusted coefficients of determination (R²) larger than 0.8 when 6 wavebands were selected to build the SMLR model. The largest values of R² for Cab, Car and Cab/Car are 0.8845, 0.876 and 0.8765, respectively. Meanwhile, mathematical transformations of reflectance showed little influence on regression accuracy. We concluded that it was feasible to estimate chlorophyll, carotenoids and their ratio based on a statistical model with leaf reflectance data.

  14. Composite marginal quantile regression analysis for longitudinal adolescent body mass index data.

    Science.gov (United States)

    Yang, Chi-Chuan; Chen, Yi-Hau; Chang, Hsing-Yi

    2017-09-20

    Childhood and adolescent overweight or obesity, which may be quantified through the body mass index (BMI), is strongly associated with adult obesity and other health problems. Motivated by the child and adolescent behaviors in long-term evolution (CABLE) study, we are interested in individual, family, and school factors associated with marginal quantiles of longitudinal adolescent BMI values. We propose a new method for composite marginal quantile regression analysis for longitudinal outcome data, which performs marginal quantile regressions at multiple quantile levels simultaneously. The proposed method extends the quantile regression coefficient modeling method introduced by Frumento and Bottai (Biometrics 2016; 72:74-84) to longitudinal data, accounting suitably for the correlation structure in longitudinal observations. A goodness-of-fit test for the proposed modeling is also developed. Simulation results show that the proposed method can be much more efficient than the analysis without taking correlation into account and the analysis performing separate quantile regressions at different quantile levels. The application to the longitudinal adolescent BMI data from the CABLE study demonstrates the practical utility of our proposal. Copyright © 2017 John Wiley & Sons, Ltd.

  15. Understanding child stunting in India: a comprehensive analysis of socio-economic, nutritional and environmental determinants using additive quantile regression.

    Directory of Open Access Journals (Sweden)

    Nora Fenske

    BACKGROUND: Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. OBJECTIVE: We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. DESIGN: Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. RESULTS: At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. CONCLUSIONS: Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.

  16. Understanding child stunting in India: a comprehensive analysis of socio-economic, nutritional and environmental determinants using additive quantile regression.

    Science.gov (United States)

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A

    2013-01-01

    Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.

  17. A spatially-explicit count data regression for modeling the density of forest cockchafer (Melolontha hippocastani) larvae in the Hessian Ried (Germany)

    Directory of Open Access Journals (Sweden)

    Matthias Schmidt

    2014-10-01

    Background: In this paper, a regression model for predicting the spatial distribution of forest cockchafer larvae in the Hessian Ried region (Germany) is presented. The forest cockchafer, a native biotic pest, is a major cause of damage in forests in this region particularly during the regeneration phase. The model developed in this study is based on a systematic sample inventory of forest cockchafer larvae by excavation across the Hessian Ried. These forest cockchafer larvae data were characterized by excess zeros and overdispersion. Methods: Using specific generalized additive regression models, different discrete distributions, including the Poisson, negative binomial and zero-inflated Poisson distributions, were compared. The methodology employed allowed the simultaneous estimation of non-linear model effects of causal covariates and, to account for spatial autocorrelation, of a 2-dimensional spatial trend function. In the validation of the models, both the Akaike information criterion (AIC) and more detailed graphical procedures based on randomized quantile residuals were used. Results: The negative binomial distribution was superior to the Poisson and the zero-inflated Poisson distributions, providing a near perfect fit to the data, which was proven in an extensive validation process. The causal predictors found to affect the density of larvae significantly were distance to water table and percentage of pure clay layer in the soil to a depth of 1 m. Model predictions showed that larva density increased with an increase in distance to the water table up to almost 4 m, after which it remained constant, and with a reduction in the percentage of pure clay layer. However this latter correlation was weak and requires further investigation. The 2-dimensional trend function indicated a strong spatial effect, and thus explained by far the highest proportion of variation in larva density. Conclusions: As such the model can be used to support forest

  18. Treating experimental data of inverse kinetic method by unitary linear regression analysis

    International Nuclear Information System (INIS)

    Zhao Yusen; Chen Xiaoliang

    2009-01-01

    The theory of treating experimental data of the inverse kinetic method by unitary linear regression analysis is described. Not only the reactivity but also the effective neutron source intensity can be calculated by this method. A computer code was written based on the inverse kinetic method and unitary linear regression analysis. The data of the zero power facility BFS-1 in Russia were processed and the results were compared. The results show that the reactivity and the effective neutron source intensity can be obtained correctly by treating experimental data of the inverse kinetic method using unitary linear regression analysis, and that the precision of the reactivity measurement is improved. The central element efficiency can be calculated from the reactivity. The results also show that the effect of an external neutron source on the reactivity measurement should be considered when the reactor power is low and the intensity of the external neutron source is strong. (authors)

  19. Regression analysis of informative current status data with the additive hazards model.

    Science.gov (United States)

    Zhao, Shishun; Hu, Tao; Ma, Ling; Wang, Peijie; Sun, Jianguo

    2015-04-01

    This paper discusses regression analysis of current status failure time data arising from the additive hazards model in the presence of informative censoring. Many methods have been developed for regression analysis of current status data under various regression models when the censoring is noninformative, and there is also a large literature on parametric analysis of informative current status data in the context of tumorigenicity experiments. In this paper, a semiparametric maximum likelihood estimation procedure is presented, in which a copula model is employed to describe the relationship between the failure time of interest and the censoring time. Furthermore, I-splines are used to approximate the nonparametric functions involved, and the asymptotic consistency and normality of the proposed estimators are established. A simulation study is conducted and indicates that the proposed approach works well for practical situations. An illustrative example is also provided.

  20. Estimation of Total Nitrogen and Phosphorus in New England Streams Using Spatially Referenced Regression Models

    Science.gov (United States)

    Moore, Richard Bridge; Johnston, Craig M.; Robinson, Keith W.; Deacon, Jeffrey R.

    2004-01-01

    The U.S. Geological Survey (USGS), in cooperation with the U.S. Environmental Protection Agency (USEPA) and the New England Interstate Water Pollution Control Commission (NEIWPCC), has developed a water-quality model, called SPARROW (Spatially Referenced Regressions on Watershed Attributes), to assist in regional total maximum daily load (TMDL) and nutrient-criteria activities in New England. SPARROW is a spatially detailed, statistical model that uses regression equations to relate total nitrogen and phosphorus (nutrient) stream loads to nutrient sources and watershed characteristics. The statistical relations in these equations are then used to predict nutrient loads in unmonitored streams. The New England SPARROW models are built using a hydrologic network of 42,000 stream reaches and associated watersheds. Watershed boundaries are defined for each stream reach in the network through the use of a digital elevation model and existing digitized watershed divides. Nutrient source data is from permitted wastewater discharge data from USEPA's Permit Compliance System (PCS), various land-use sources, and atmospheric deposition. Physical watershed characteristics include drainage area, land use, streamflow, time-of-travel, stream density, percent wetlands, slope of the land surface, and soil permeability. The New England SPARROW models for total nitrogen and total phosphorus have R-squared values of 0.95 and 0.94, with mean square errors of 0.16 and 0.23, respectively. Variables that were statistically significant in the total nitrogen model include permitted municipal-wastewater discharges, atmospheric deposition, agricultural area, and developed land area. Total nitrogen stream-loss rates were significant only in streams with average annual flows less than or equal to 2.83 cubic meters per second. In streams larger than this, there is nondetectable in-stream loss of annual total nitrogen in New England. Variables that were statistically significant in the total

  1. Credit Scoring Problem Based on Regression Analysis

    OpenAIRE

    Khassawneh, Bashar Suhil Jad Allah

    2014-01-01

    ABSTRACT: This thesis provides an explanatory introduction to the regression models of data mining and contains basic definitions of key terms in the linear, multiple and logistic regression models. Meanwhile, the aim of this study is to illustrate fitting models for the credit scoring problem using simple linear, multiple linear and logistic regression models and also to analyze the found model functions by statistical tools. Keywords: Data mining, linear regression, logistic regression....

  2. MULGRES: a computer program for stepwise multiple regression analysis

    Science.gov (United States)

    A. Jeff Martin

    1971-01-01

    MULGRES is a computer program source deck that is designed for multiple regression analysis employing the technique of stepwise deletion in the search for most significant variables. The features of the program, along with inputs and outputs, are briefly described, with a note on machine compatibility.

  3. Real-time regression analysis with deep convolutional neural networks

    OpenAIRE

    Huerta, E. A.; George, Daniel; Zhao, Zhizhen; Allen, Gabrielle

    2018-01-01

    We discuss the development of novel deep learning algorithms to enable real-time regression analysis for time series data. We showcase the application of this new method with a timely case study, and then discuss the applicability of this approach to tackle similar challenges across science domains.

  4. [Comparison of application of Cochran-Armitage trend test and linear regression analysis for rate trend analysis in epidemiology study].

    Science.gov (United States)

    Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H

    2017-05-10

    We described the time trend of the acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 with the Cochran-Armitage trend (CAT) test and linear regression analysis, and compared the results. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value < linear regression P value). The statistical power of the CAT test decreased, while the result of linear regression analysis remained the same, when the population size was reduced by 100 times and the AMI incidence rate remained unchanged. The two statistical methods each have advantages and disadvantages. It is necessary to choose the statistical method according to the fitting degree of the data, or to analyze the results of the two methods comprehensively.
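
    The contrast reported above, a trend test that depends on population size versus a regression fitted to the rates alone, can be illustrated with a minimal sketch. The yearly case counts and populations below are hypothetical (they are not the Tianjin data), the scores are simply the year index, and the function names are illustrative. The statistic is the standard Cochran-Armitage form with a normal approximation for the P value.

```python
import math

def cochran_armitage_z(cases, totals, scores):
    """Cochran-Armitage trend statistic Z for case counts in ordered groups."""
    n_all = sum(totals)
    p_bar = sum(cases) / n_all  # pooled rate under the no-trend hypothesis
    # Score-weighted sum of observed minus expected cases.
    t_stat = sum(s * (r - n * p_bar) for s, r, n in zip(scores, cases, totals))
    s1 = sum(n * s for n, s in zip(totals, scores))
    s2 = sum(n * s * s for n, s in zip(totals, scores))
    var_t = p_bar * (1.0 - p_bar) * (s2 - s1 * s1 / n_all)
    return t_stat / math.sqrt(var_t)

def two_sided_p(z):
    """Two-sided P value from the standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def rate_slope(cases, totals, scores):
    """Ordinary least-squares slope of the yearly rate on the score."""
    rates = [r / n for r, n in zip(cases, totals)]
    ms = sum(scores) / len(scores)
    mr = sum(rates) / len(rates)
    sxy = sum((s - ms) * (x - mr) for s, x in zip(scores, rates))
    sxx = sum((s - ms) ** 2 for s in scores)
    return sxy / sxx

# Hypothetical data: five years, constant population, slowly rising counts.
scores = [0, 1, 2, 3, 4]
totals = [100000, 100000, 100000, 100000, 100000]
cases = [500, 520, 545, 560, 590]

# Same rates with the population reduced 100-fold (fractional counts are
# used only to keep the rates exactly unchanged in this illustration).
small_totals = [n / 100 for n in totals]
small_cases = [r / 100 for r in cases]

z_big = cochran_armitage_z(cases, totals, scores)
z_small = cochran_armitage_z(small_cases, small_totals, scores)
print("Z, full population:   ", z_big, "P =", two_sided_p(z_big))
print("Z, 1/100 population:  ", z_small, "P =", two_sided_p(z_small))
print("rate slope (both):    ", rate_slope(cases, totals, scores))
```

    Because both the numerator and the variance of the statistic scale linearly with the population, Z shrinks by a factor of 10 when the population is divided by 100, whereas the slope fitted to the rates is unaffected.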

  5. The evolution of GDP in USA using cyclic regression analysis

    OpenAIRE

    Catalin Angelo IOAN; Gina IOAN

    2013-01-01

    Based on the four major types of economic cycles (Kondratieff, Juglar, Kitchin, Kuznets), the paper aims to determine their actual length (for the U.S. economy) using cyclic regressions based on Fourier analysis.

  6. Quantile regression for the statistical analysis of immunological data with many non-detects.

    Science.gov (United States)

    Eilers, Paul H C; Röder, Esther; Savelkoul, Huub F J; van Wijk, Roy Gerth

    2012-07-07

    Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an implementation to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.
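
    The property that makes percentiles tolerant of non-detects can be shown with a small sketch. This is not the authors' quantile regression implementation; it only illustrates, on hypothetical measurements with an assumed detection limit of 1.0, that an upper percentile does not depend on the value substituted for non-detects, while the mean and (here, with more than half non-detects) the median do.

```python
import math

def quantile(values, q):
    """Empirical q-quantile: smallest value with at least a fraction q
    of the observations at or below it."""
    ordered = sorted(values)
    k = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[k]

def mean(values):
    return sum(values) / len(values)

# Hypothetical sample: 6 of 10 measurements are below the detection limit.
DETECTION_LIMIT = 1.0          # assumed value, for illustration only
detected = [1.2, 1.8, 2.5, 4.0]
n_nondetects = 6

def with_substitution(fill):
    """Replace every non-detect by the same fill value below the limit."""
    return [fill] * n_nondetects + detected

for fill in (0.0, DETECTION_LIMIT / 2, DETECTION_LIMIT):
    sample = with_substitution(fill)
    print("fill =", fill,
          "| mean =", round(mean(sample), 3),
          "| median =", quantile(sample, 0.5),
          "| 75th percentile =", quantile(sample, 0.75))
```

    In this sketch the 75th percentile falls among the detected values, so it is the same for every substitution, which is why a quantile level above the non-detect fraction is the natural modelling target.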

  7. Using Spatial Structure Analysis of Hyperspectral Imaging Data and Fourier Transformed Infrared Analysis to Determine Bioactivity of Surface Pesticide Treatment

    Directory of Open Access Journals (Sweden)

    Christian Nansen

    2010-03-01

    Many food products are subjected to quality control analyses for detection of surface residue/contaminants, and there is a trend of requiring more and more documentation and reporting by farmers regarding their use of pesticides. Recent outbreaks of food borne illnesses have been a major contributor to this trend. With a growing need for food safety measures and “smart applications” of insecticides, it is important to develop methods for rapid and accurate assessments of surface residues on food and feed items. As a model system, we investigated detection of a miticide applied to maize leaves and its miticidal bioactivity over time, and we compared two types of reflectance data: Fourier transformed infrared (FTIR) data and hyperspectral imaging (HI) data. The miticide (bifenazate) was applied at a commercial field rate to maize leaves in the field, with or without application of a surfactant, and with or without application of a simulated “rain event”. In addition, we collected FTIR and HI data from untreated control leaves (total of five treatments). Maize leaf data were collected at seven time intervals from 0 to 48 hours after application. FTIR data were analyzed using conventional analysis of variance of miticide-specific vibration peaks. Two unique FTIR vibration peaks were associated with miticide application (1,700 cm−1 and 763 cm−1). The integrated intensities of these two peaks, miticide application, surfactant, rain event, and time between miticide application and rain event were used as explanatory variables in a linear multi-regression fit to spider mite mortality. The same linear multi-regression approach was applied to variogram parameters derived from HI data in five selected spectral bands (664, 683, 706, 740, and 747 nm). For each spectral band, we conducted a spatial structure analysis, and the three standard variogram parameters (“sill”, “range”, and “nugget”) were examined as possible “indicators” of miticide

  8. Spatial analysis of the electrical energy demand in Greece

    International Nuclear Information System (INIS)

    Tyralis, Hristos; Mamassis, Nikos; Photis, Yorgos N.

    2017-01-01

    The Electrical Energy Demand (EED) of the agricultural, commercial and industrial sectors in Greece, as well as its use for domestic activities, public and municipal authorities and street lighting, is analysed spatially using Geographical Information System and spatial statistical methods. The analysis is performed on data which span from 2008 to 2012 and have annual temporal resolution and spatial resolution down to the NUTS (Nomenclature of Territorial Units for Statistics) level 3. The aim is to identify spatial patterns of the EED and its transformations such as the ratios of the EED to socioeconomic variables, i.e. the population, the total area, the population density and the Gross Domestic Product (GDP). Based on the analysis, Greece is divided into five regions, each one with a different development model, i.e. Attica and Thessaloniki, which are two heavily populated major poles; Thessaly and Central Greece, which form a connected geographical region with important agricultural and industrial sectors; the islands and some coastal areas, which are characterized by an important commercial sector; and the remaining Greek areas. The spatial patterns can provide additional information for policy decisions about electrical energy management and a better representation of the regional socioeconomic conditions. - Highlights: • We visualize spatially the Electrical Energy Demand (EED) in Greece. • We apply spatial analysis methods to the EED data. • Spatial patterns of the EED are identified. • Greece is classified in five distinct groups, based on the analysis. • The results can be used for optimal planning of the electric system.

  9. Optimal choice of basis functions in the linear regression analysis

    International Nuclear Information System (INIS)

    Khotinskij, A.M.

    1988-01-01

    Problem of optimal choice of basis functions in the linear regression analysis is investigated. Step algorithm with estimation of its efficiency, which holds true at finite number of measurements, is suggested. Conditions, providing the probability of correct choice close to 1 are formulated. Application of the step algorithm to analysis of decay curves is substantiated. 8 refs

  10. Modeling Fire Occurrence at the City Scale: A Comparison between Geographically Weighted Regression and Global Linear Regression.

    Science.gov (United States)

    Song, Chao; Kwan, Mei-Po; Zhu, Jiping

    2017-04-08

    An increasing number of fires are occurring with the rapid development of cities, resulting in increased risk for human beings and the environment. This study compares geographically weighted regression-based models, including geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), which integrates spatial and temporal effects, with global linear regression models (LM) for modeling fire risk at the city scale. The results show that the road density and the spatial distribution of enterprises have the strongest influences on fire risk, which implies that we should focus on areas where roads and enterprises are densely clustered. In addition, locations with a large number of enterprises have fewer fire ignition records, probably because of strict management and prevention measures. A changing number of significant variables across space indicates that heterogeneity mainly exists in the northern and eastern rural and suburban areas of Hefei city, where human-related facilities or road construction are only clustered in the city sub-centers. GTWR can capture small changes in the spatiotemporal heterogeneity of the variables while GWR and LM cannot. An approach that integrates space and time enables us to better understand the dynamic changes in fire risk. Thus governments can use the results to manage fire safety at the city scale.

  11. Landslide Hazard Mapping in Rwanda Using Logistic Regression

    Science.gov (United States)

    Piller, A.; Anderson, E.; Ballard, H.

    2015-12-01

    Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, figures are much more grave, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life, to landslides is a very real hazard. Landslides in Rwanda have an impact at the economic, social, and environmental level. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 test variables (i.e. slope, soil type, land cover, etc.) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected from the logistic regression analysis.

  12. Regression analysis for the social sciences

    CERN Document Server

    Gordon, Rachel A

    2015-01-01

    Provides graduate students in the social sciences with the basic skills they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly-available social science data or drawn from the literature; thorough integration of teaching statistical theory with teaching data processing and analysis; and teaching of Stata and use of chapter exercises in which students practice programming and interpretation on the same data set. A separate set of exercises allows students to select a data set to apply the concepts learned in each chapter to a research question of interest to them, all updated for this edition.

  13. Spatiotemporal Dynamics and Spatial Determinants of Urban Growth in Suzhou, China

    Directory of Open Access Journals (Sweden)

    Ling Zhang

    2017-03-01

    This paper analyzes the spatiotemporal dynamics of urban growth and models its spatial determinants in China through a case study of Suzhou, a rapidly industrializing and globalizing city. We conducted spatial analysis on land use data derived from multi-temporal remote sensing images of Suzhou from 1986 to 2008. Three urban growth types, namely infilling, edge-expansion, and leapfrog, were identified. We used landscape metrics to quantify the temporal trend of urban growth in Suzhou. During these 22 years, Suzhou’s urbanization changed from bottom-up rural urbanization to city-based top-down urban expansion. The underlying mechanism changed from TVE (town village enterprise) driven rural industrialization to FDI (foreign direct investment) driven development zone fever. Furthermore, we employed both global and local logistic regressions to model the probability of urban land conversion against a set of spatial variables. The global logistic regression model found the significance of proximity, neighborhood conditions, and socioeconomic factors. The logistic geographically weighted regression (GWR) model improved the global regression model with better model goodness-of-fit and higher prediction accuracy. More importantly, the local parameter estimates of variables enabled us to examine spatial variations of the influences of variables on urban growth in Suzhou.

  14. Spatial analysis of weed patterns

    NARCIS (Netherlands)

    Heijting, S.

    2007-01-01

    Keywords: Spatial analysis, weed patterns, Mead’s test, space-time correlograms, 2-D correlograms, dispersal, Generalized Linear Models, heterogeneity, soil, Taylor’s power law. Weeds in agriculture occur in patches. This thesis is a contribution to the characterization of this patchiness, to its

  15. A method for nonlinear exponential regression analysis

    Science.gov (United States)

    Junkin, B. G.

    1971-01-01

    A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
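
    The steps described above (initial estimates from a linear fit, a Taylor-series linearization, a correction applied to the nominal estimate, and repetition until a criterion is met) correspond to what is commonly called a Gauss-Newton iteration. The sketch below is a minimal illustration under stated assumptions, not the report's program: it assumes a single-term decay model y = a * exp(b * t), takes the starting values from a straight-line fit to ln(y), and uses hypothetical data and function names.

```python
import math

def loglinear_start(t, y):
    """Initial (nominal) estimates from a straight-line fit of ln(y) on t.
    Requires all y > 0."""
    ly = [math.log(v) for v in y]
    n = len(t)
    mt, ml = sum(t) / n, sum(ly) / n
    b = (sum((ti - mt) * (li - ml) for ti, li in zip(t, ly))
         / sum((ti - mt) ** 2 for ti in t))
    a = math.exp(ml - b * mt)
    return a, b

def fit_exponential(t, y, tol=1e-10, max_iter=50):
    """Least-squares fit of y = a * exp(b * t) by repeated linearization."""
    a, b = loglinear_start(t, y)
    for _ in range(max_iter):
        # Accumulate the 2x2 normal equations (J^T J) d = J^T r, where J holds
        # the first-order Taylor terms dy/da = exp(b t) and dy/db = a t exp(b t).
        j11 = j12 = j22 = g1 = g2 = 0.0
        for ti, yi in zip(t, y):
            e = math.exp(b * ti)
            r = yi - a * e
            da, db = e, a * ti * e
            j11 += da * da
            j12 += da * db
            j22 += db * db
            g1 += da * r
            g2 += db * r
        det = j11 * j22 - j12 * j12
        step_a = (j22 * g1 - j12 * g2) / det
        step_b = (j11 * g2 - j12 * g1) / det
        a += step_a          # apply the correction to the nominal estimate
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b

# Hypothetical decay data: a = 5, b = -0.3, with a small alternating error.
t_obs = [float(i) for i in range(11)]
y_obs = [5.0 * math.exp(-0.3 * ti) + 0.01 * (-1) ** i
         for i, ti in enumerate(t_obs)]
a_hat, b_hat = fit_exponential(t_obs, y_obs)
print("a =", a_hat, "b =", b_hat)
```

    The log-linear start matters in practice because the iteration is only locally convergent; a poor starting point can make the linearized correction diverge.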

  16. Analysis of Functional Data with Focus on Multinomial Regression and Multilevel Data

    DEFF Research Database (Denmark)

    Mousavi, Seyed Nourollah

    Functional data analysis (FDA) is a fast-growing area in statistical research with an increasingly diverse range of applications in economics, medicine, agriculture, chemometrics, etc. Functional regression is an area of FDA which has received the most attention, both in aspects of application and methodological development. Our main...

  17. Regression analysis of a chemical reaction fouling model

    International Nuclear Information System (INIS)

    Vasak, F.; Epstein, N.

    1996-01-01

    A previously reported mathematical model for the initial chemical reaction fouling of a heated tube is critically examined in the light of the experimental data for which it was developed. A regression analysis of the model with respect to that data shows that the reference point upon which the two adjustable parameters of the model were originally based was well chosen, albeit fortuitously. (author). 3 refs., 2 tabs., 2 figs

  18. [A SAS macro program for batch processing of univariate Cox regression analysis for large databases].

    Science.gov (United States)

    Yang, Rendong; Xiong, Jie; Peng, Yangqin; Peng, Xiaoning; Zeng, Xiaomin

    2015-02-01

    To implement batch processing of univariate Cox regression analysis for large databases with a SAS macro program. We wrote a SAS macro program in SAS 9.2 that can filter, integrate, and export P values to Excel. The program was used to screen for survival-correlated RNA molecules in ovarian cancer. The SAS macro program completed the batch processing of univariate Cox regression analysis and the selection and export of the results. The SAS macro program has potential applications in reducing the workload of statistical analysis and providing a basis for batch processing of univariate Cox regression analysis.

  19. Sensitivity analysis and optimization of system dynamics models : Regression analysis and statistical design of experiments

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    1995-01-01

    This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for

  20. Using Structured Additive Regression Models to Estimate Risk Factors of Malaria: Analysis of 2010 Malawi Malaria Indicator Survey Data

    Science.gov (United States)

    Chirombo, James; Lowe, Rachel; Kazembe, Lawrence

    2014-01-01

    Background After years of implementing Roll Back Malaria (RBM) interventions, the changing landscape of malaria in terms of risk factors and spatial pattern has not been fully investigated. This paper uses the 2010 malaria indicator survey data to investigate if known malaria risk factors remain relevant after many years of interventions. Methods We adopted a structured additive logistic regression model that allowed for spatial correlation, to more realistically estimate malaria risk factors. Our model included child and household level covariates, as well as climatic and environmental factors. Continuous variables were modelled by assuming second order random walk priors, while spatial correlation was specified as a Markov random field prior, with fixed effects assigned diffuse priors. Inference was fully Bayesian resulting in an under five malaria risk map for Malawi. Results Malaria risk increased with increasing age of the child. With respect to socio-economic factors, the greater the household wealth, the lower the malaria prevalence. A general decline in malaria risk was observed as altitude increased. Minimum temperatures and average total rainfall in the three months preceding the survey did not show a strong association with disease risk. Conclusions The structured additive regression model offered a flexible extension to standard regression models by enabling simultaneous modelling of possible nonlinear effects of continuous covariates, spatial correlation and heterogeneity, while estimating usual fixed effects of categorical and continuous observed variables. Our results confirmed that malaria epidemiology is a complex interaction of biotic and abiotic factors, both at the individual, household and community level and that risk factors are still relevant many years after extensive implementation of RBM activities. PMID:24991915

  1. Application of multilinear regression analysis in modeling of soil ...

    African Journals Online (AJOL)

    The application of Multi-Linear Regression Analysis (MLRA) model for predicting soil properties in Calabar South offers a technical guide and solution in foundation designs problems in the area. Forty-five soil samples were collected from fifteen different boreholes at a different depth and 270 tests were carried out for CBR, ...

  2. Boosted beta regression.

    Directory of Open Access Journals (Sweden)

    Matthias Schmid

    Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.

  3. Bias due to two-stage residual-outcome regression analysis in genetic association studies.

    Science.gov (United States)

    Demissie, Serkalem; Cupples, L Adrienne

    2011-11-01

    Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual- or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, first, a residual-outcome is calculated from a regression of the outcome variable on covariates and then the relationship between the adjusted-outcome and the SNP is evaluated by a simple linear regression of the adjusted-outcome on the SNP. In this article, we examine the performance of this two-stage analysis as compared with multiple linear regression (MLR) analysis. Our findings show that when a SNP and a covariate are correlated, the two-stage approach results in biased genotypic effect and loss of power. Bias is always toward the null and increases with the squared-correlation between the SNP and the covariate (r²). For example, for r² = 0, 0.1, and 0.5, two-stage analysis results in, respectively, 0, 10, and 50% attenuation in the SNP effect. As expected, MLR was always unbiased. Since individual SNPs often show little or no correlation with covariates, a two-stage analysis is expected to perform as well as MLR in many genetic studies; however, it produces considerably different results from MLR and may lead to incorrect conclusions when independent variables are highly correlated. While a useful alternative to MLR under r² = 0, the two-stage approach has serious limitations. Its use as a simple substitute for MLR should be avoided. © 2011 Wiley Periodicals, Inc.
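
    The attenuation described in this abstract can be checked with a small simulation. The sketch below is not the authors' code: it assumes a standard-normal stand-in for the SNP (real genotypes are coded 0/1/2), unit effect sizes, and hypothetical function names. Under these assumptions the two-stage slope should land near the true effect multiplied by one minus the squared SNP-covariate correlation, while the multiple regression slope stays near the true effect.

```python
import numpy as np


def two_stage_slope(y, snp, cov):
    """Residual-outcome analysis: regress y on the covariate, then residuals on the SNP."""
    # Stage 1: remove the covariate effect from the outcome.
    X1 = np.column_stack([np.ones_like(cov), cov])
    resid = y - X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]
    # Stage 2: simple linear regression of the adjusted outcome on the SNP.
    X2 = np.column_stack([np.ones_like(snp), snp])
    return np.linalg.lstsq(X2, resid, rcond=None)[0][1]


def mlr_slope(y, snp, cov):
    """Multiple linear regression: SNP and covariate fitted jointly."""
    X = np.column_stack([np.ones_like(snp), snp, cov])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]


def simulate(r2, beta=1.0, gamma=1.0, n=200_000, seed=0):
    """Simulate data whose SNP-covariate squared correlation is r2 (assumed setup)."""
    rng = np.random.default_rng(seed)
    snp = rng.standard_normal(n)  # continuous stand-in for a genotype
    cov = np.sqrt(r2) * snp + np.sqrt(1.0 - r2) * rng.standard_normal(n)
    y = beta * snp + gamma * cov + rng.standard_normal(n)
    return two_stage_slope(y, snp, cov), mlr_slope(y, snp, cov)


for r2 in (0.0, 0.1, 0.5):
    ts, mlr = simulate(r2)
    print(f"r2={r2:.1f}  two-stage={ts:.3f}  MLR={mlr:.3f}")
```

    The factor follows from the stage-1 slope absorbing part of the SNP effect whenever the two predictors are correlated, which is why the bias disappears when the correlation is zero.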

  4. Crash rates analysis in China using a spatial panel model

    Directory of Open Access Journals (Sweden)

    Wonmongo Lacina Soro

    2017-10-01

    Full Text Available The consideration of spatial externalities in traffic safety analysis is of paramount importance for the success of road safety policies. Yet nearly all spatial dependence studies on crash rates are performed within the framework of single-equation spatial cross-sectional studies. The present study extends the spatial cross-sectional scheme to a spatial fixed-effects panel model estimated using the maximum likelihood method. The spatial units are the 31 administrative regions of mainland China over the period 2004–2013. The presence of neighborhood effects is evidenced through the Moran's I statistic. Consistent with previous studies, the analysis reveals that omitting the spatial effects in traffic safety analysis is likely to bias the estimation results. The spatial and error lags are all positive and statistically significant, suggesting similar crash rate patterns in neighboring regions. Some other explanatory variables, such as freight traffic, the length of paved roads and the population aged 65 and above, are related to higher rates, while the opposite trend is observed for the Gross Regional Product, the urban unemployment rate and passenger traffic.
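
    Moran's I, the statistic used above to detect neighborhood effects, is short enough to compute by hand. The following is a minimal sketch using the textbook global Moran's I formula; the four-region weight matrix and the values are hypothetical and have nothing to do with the Chinese crash data.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z the centered values."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()   # deviations from the mean
    s0 = w.sum()       # sum of all spatial weights
    return (len(x) / s0) * (z @ w @ z) / (z @ z)


# Hypothetical layout: four regions in a row, each bordering its immediate neighbors.
w = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

print(morans_i([1, 2, 3, 4], w))    # smooth gradient: positive autocorrelation
print(morans_i([1, -1, 1, -1], w))  # alternating values: negative autocorrelation
```

    Positive values indicate that neighboring units carry similar values, which is the pattern the abstract reports for crash rates.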

  5. Spatial analysis of NDVI readings with different sampling density

    Science.gov (United States)

    Advanced remote sensing technologies provide researchers with an innovative way of collecting spatial data for use in precision agriculture. Sensor information and spatial analysis together allow for a complete understanding of the spatial complexity of a field and its crop. The objective of the study was...

  6. Examining Spatial Variation in the Effects of Japanese Red Pine (Pinus densiflora) on Burn Severity Using Geographically Weighted Regression

    Directory of Open Access Journals (Sweden)

    Hyun-Joo Lee

    2017-05-01

    Full Text Available Burn severity has profound impacts on the response of post-fire forest ecosystems to fire events. Numerous previous studies have reported that burn severity is determined by variables such as meteorological conditions, pre-fire forest structure, and fuel characteristics. An underlying assumption of these studies was the constant effects of environmental variables on burn severity over space, and these analyses therefore did not consider the spatial dimension. This study examined spatial variation in the effects of Japanese red pine (Pinus densiflora) on burn severity. Specifically, this study investigated the presence of spatially varying relationships between Japanese red pine and burn severity due to changes in slope and elevation. We estimated conventional ordinary least squares (OLS) and geographically weighted regression (GWR) models and compared them using three criteria: the coefficients of determination (R2), Akaike information criterion for small samples (AICc), and Moran’s I-value. The GWR model performed considerably better than the OLS model in explaining variation in burn severity. The results provided strong evidence that the effect of Japanese red pine on burn severity was not constant but varied spatially. Elevation was a significant factor in the variation in the effects of Japanese red pine on burn severity. The influence of red pine on burn severity was considerably higher in low-elevation areas but became less important than the other variables in high-elevation areas. The results of this study can be applied to location-specific strategies for forest managers and can be adopted to improve fire simulation models to more realistically mimic the nature of fire behavior.
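
    The difference between a single OLS fit and GWR is that GWR refits the regression at every location, down-weighting distant observations. A minimal sketch follows; the Gaussian kernel, the fixed bandwidth, the function name and the synthetic data are all assumptions made for illustration and do not reproduce the study's software or data.

```python
import numpy as np


def gwr_coefficients(coords, X, y, bandwidth):
    """Fit one weighted least-squares regression per location (Gaussian kernel assumed).

    Returns an array of shape (n_locations, n_predictors + 1); column 0 is the intercept.
    """
    coords = np.asarray(coords, dtype=float)
    X = np.column_stack([np.ones(len(y)), X])  # add intercept column
    y = np.asarray(y, dtype=float)
    betas = np.empty((len(y), X.shape[1]))
    for i, c in enumerate(coords):
        d = np.linalg.norm(coords - c, axis=1)     # distance to every observation
        w = np.exp(-0.5 * (d / bandwidth) ** 2)    # nearby points get larger weights
        sw = np.sqrt(w)
        # Weighted least squares via rescaling rows by sqrt(weight).
        betas[i] = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return betas


# Synthetic example: the effect of x on y grows from west (1) to east (3).
rng = np.random.default_rng(0)
coords = rng.uniform(size=(200, 2))
x = rng.standard_normal(200)
y = (1.0 + 2.0 * coords[:, 0]) * x + 0.1 * rng.standard_normal(200)

betas = gwr_coefficients(coords, x, y, bandwidth=0.2)
west = betas[coords[:, 0] < 0.2, 1].mean()
east = betas[coords[:, 0] > 0.8, 1].mean()
print(f"mean local slope, west: {west:.2f}  east: {east:.2f}")
```

    A global OLS fit would report one averaged slope for these data, hiding the spatial trend that the local coefficients expose.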

  7. Spatial Analysis Methods of Road Traffic Collisions

    DEFF Research Database (Denmark)

    Loo, Becky P. Y.; Anderson, Tessa Kate

    Spatial Analysis Methods of Road Traffic Collisions centers on the geographical nature of road crashes, and uses spatial methods to provide a greater understanding of the patterns and processes that cause them. Written by internationally known experts in the field of transport geography, the book...... outlines the key issues in identifying hazardous road locations (HRLs), considers current approaches used for reducing and preventing road traffic collisions, and outlines a strategy for improved road safety. The book covers spatial accuracy, validation, and other statistical issues, as well as link...

  8. Spatial analysis of soybean canopy response to soybean cyst nematodes (Heterodera glycines) in eastern Arkansas: An approach to future precision agriculture technology application

    Science.gov (United States)

    Kulkarni, Subodh

    2008-10-01

    Heterodera glycines Ichinohe, commonly known as soybean cyst nematode (SCN), is a serious widespread pathogen of soybean in the US. The present research primarily investigated the feasibility of detecting SCN infestation in the field using aerial images and ground level spectrometric sensing. Non-spatial and spatial linear regression analyses were performed to correlate SCN population densities with Normalized Difference Vegetation Index (NDVI) and Green NDVI (GNDVI) derived from soybean canopy spectra. Field data were obtained from two fields, Field A and Field B, under different nematode control strategies in 2003 and 2004. Analysis of aerial image data from July 18, 2004 from Field A showed a significant relationship between SCN population at planting and the GNDVI (R2=0.17 at p=0.0006). Linear regression analysis revealed that SCN had a small effect on yield (R2=0.14 at p=0.0001, RMSEP=1052.42 kg ha-1) and GNDVI (R2=0.17 at p=0.0006, RMSEP=0.087) derived from the aerial imagery on a single date. However, the spatial regression analysis based on a spherical semivariogram showed that the RMSEP was 0.037 for the GNDVI on July 18, 2004 and 427.32 kg ha-1 for yield on October 14, 2003, indicating better model performance. For July 18, 2004 data from Field B, a relationship between NDVI and the cyst counts at planting was significant (R2=0.5 at p=0.0468). Non-spatial analyses of the ground level spectrometric data for the first field showed that NDVI and GNDVI were correlated with cyst counts at planting (R2=0.34 and 0.27 at p=0.0015 and 0.0127, respectively), and GNDVI was correlated with egg counts at planting (R2=0.27 at p=0.0118). Both NDVI and GNDVI were correlated with egg counts at flowering (R2=0.34 and 0.27 at p=0.0013 and 0.0018, respectively). However, a paired t-test to validate the above relationships showed that predicted values of NDVI and GNDVI were significantly different. The statistical evidence suggested that variability in vegetation indices was caused

  9. An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy

    DEFF Research Database (Denmark)

    Merlo, Juan; Wagner, Philippe; Ghith, Nermin

    2016-01-01

    BACKGROUND AND AIM: Many multilevel logistic regression analyses of "neighbourhood and health" focus on interpreting measures of associations (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that disting...

  10. Advanced statistics: linear regression, part I: simple linear regression.

    Science.gov (United States)

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
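
    The method of least squares mentioned above has a closed form for a single predictor: the slope is the sum of cross-deviations divided by the sum of squared deviations of the predictor, and the line passes through the point of means. A minimal sketch with a made-up five-point dataset (not taken from the article):

```python
import numpy as np


def simple_linear_regression(x, y):
    """Least-squares intercept and slope for one predictor and one outcome."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    slope = (dx @ (y - y.mean())) / (dx @ dx)   # Sxy / Sxx
    intercept = y.mean() - slope * x.mean()     # line passes through the means
    return intercept, slope


# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = simple_linear_regression(x, y)
print(f"intercept={b0:.3f}  slope={b1:.3f}")
```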

  11. Ordinary least square regression, orthogonal regression, geometric mean regression and their applications in aerosol science

    International Nuclear Information System (INIS)

    Leng Ling; Zhang Tianyi; Kleinman, Lawrence; Zhu Wei

    2007-01-01

    Regression analysis, especially the ordinary least squares method which assumes that errors are confined to the dependent variable, has seen a fair share of its applications in aerosol science. The ordinary least squares approach, however, could be problematic due to the fact that atmospheric data often does not lend itself to calling one variable independent and the other dependent. Errors often exist for both measurements. In this work, we examine two regression approaches available to accommodate this situation. They are orthogonal regression and geometric mean regression. Comparisons are made theoretically as well as numerically through an aerosol study examining whether the ratio of organic aerosol to CO would change with age
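
    The three slopes compared in this abstract can be written in terms of the same sums of squares. The sketch below uses the standard textbook formulas and is not the authors' implementation; the orthogonal slope assumes equal error variances in both variables, the data are made up, and the function name is hypothetical.

```python
import numpy as np


def regression_slopes(x, y):
    """Slopes from OLS (errors in y only), orthogonal, and geometric mean regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    sxx, syy, sxy = dx @ dx, dy @ dy, dx @ dy   # assumes sxy != 0
    ols = sxy / sxx
    # Orthogonal regression minimizes perpendicular distances (equal error variances assumed).
    orthogonal = (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    # Geometric mean regression: geometric mean of the y-on-x slope and the inverse x-on-y slope.
    geometric_mean = np.sign(sxy) * np.sqrt(syy / sxx)
    return {"ols": ols, "orthogonal": orthogonal, "geometric_mean": geometric_mean}


slopes = regression_slopes([0, 1, 2, 3], [0, 2, 1, 3])
for name, value in slopes.items():
    print(f"{name}: {value:.3f}")
```

    The OLS slope equals the geometric mean slope multiplied by the absolute correlation coefficient, so it is the flattest of the three whenever the points do not lie exactly on a line.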

  12. Evaluation of Logistic Regression and Multivariate Adaptive Regression Spline Models for Groundwater Potential Mapping Using R and GIS

    Directory of Open Access Journals (Sweden)

    Soyoung Park

    2017-07-01

    Full Text Available This study mapped and analyzed groundwater potential using two different models, logistic regression (LR) and multivariate adaptive regression splines (MARS), and compared the results. A spatial database was constructed for groundwater well data and groundwater influence factors. Groundwater well data with a high potential yield of ≥70 m3/d were extracted, and 859 locations (70%) were used for model training, whereas the other 365 locations (30%) were used for model validation. We analyzed 16 groundwater influence factors including altitude, slope degree, slope aspect, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport index, distance from drainage, drainage density, lithology, distance from fault, fault density, distance from lineament, lineament density, and land cover. Groundwater potential maps (GPMs) were constructed using LR and MARS models and tested using a receiver operating characteristics curve. Based on this analysis, the area under the curve (AUC) for the success rate curve of GPMs created using the MARS and LR models was 0.867 and 0.838, and the AUC for the prediction rate curve was 0.836 and 0.801, respectively. This implies that the MARS model is useful and effective for groundwater potential analysis in the study area.
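
    The AUC values quoted above summarize how well a map ranks known well locations above other locations. As a rough sketch, the AUC can be computed as the fraction of (positive, negative) pairs in which the positive location receives the higher score, with ties counted as one half. The labels and scores below are hypothetical; in the study's terms, applying such a function to the training locations would correspond to the success rate and applying it to the validation locations to the prediction rate.

```python
import numpy as np


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count one half)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]   # every positive score against every negative score
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size


# Hypothetical model outputs for four locations (1 = productive well present).
print(auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))
print(auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```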

  13. Research progress and hotspot analysis of spatial interpolation

    Science.gov (United States)

    Jia, Li-juan; Zheng, Xin-qi; Miao, Jin-li

    2018-02-01

    In this paper, the literature on spatial interpolation published between 1982 and 2017 and indexed in the Web of Science core database is used as the data source, and a visualization analysis is carried out using the co-country network, co-category network, co-citation network, and keyword co-occurrence network. It is found that spatial interpolation has gone through three stages: slow development, steady development and rapid development. There are cross effects among 11 clustering groups, and research mainly converges on the theory of spatial interpolation, its practical application and case studies, and its accuracy and efficiency. Finding the optimal spatial interpolation is the frontier and hot spot of the research. Spatial interpolation research has formed a theoretical basis and a research system framework, is strongly interdisciplinary, and is widely used in various fields.

  14. Spatial Factor Analysis for Aerosol Optical Depth in Metropolises in China with Regard to Spatial Heterogeneity

    Directory of Open Access Journals (Sweden)

    Hui Shi

    2018-04-01

    Full Text Available A substantial number of studies have analyzed how driving factors impact aerosols, but they have paid little attention to the spatial heterogeneity of aerosols and of the factors that impact them. The spatial distributions of the aerosol optical depth (AOD) retrieved from Moderate Resolution Imaging Spectroradiometer (MODIS) data at 550-nm and 3-km resolution for three highly developed metropolises, the Beijing-Tianjin-Hebei (BTH) region, the Yangtze River Delta (YRD), and the Pearl River Delta (PRD), in China during 2015 were analyzed. Different degrees of spatial heterogeneity of the AOD were found, which were indexed by Moran’s I index giving values of 0.940, 0.715, and 0.680 in BTH, YRD, and PRD, respectively. For the spatial heterogeneity, geographically weighted regression (GWR) was employed to carry out a spatial factor analysis, where terrain, climate condition, urban development, and vegetation coverage were taken as the potential driving factors. The results of the GWR imply varying relationships between the AOD and the factors. The results were generally consistent with existing studies, but the results suggest the following: (1) Elevation increase would more likely lead to a strong negative impact on aerosols (with most of the coefficients ranging from −1.5~0 in the BTH, −1.5~0 in the YRD, and −1~0 in the PRD) in the places with greater elevations where the R-squared values are always larger than 0.5. However, the variation of elevations cannot explain the variation of aerosols in the places with relatively low elevations (with R-squared values approximately 0.1, ranging from 0 to 0.3, and approximately 0.1 in the BTH, YRD, and PRD), such as urban areas in the BTH and YRD. (2) The density of the built-up areas made a strong and positive impact on aerosols in the urban areas of the BTH (R-squared larger than 0.5), while the R-squared dropped to 0.1 in the places far away from the urban areas. (3) The vegetation coverage led to a stronger

  15. A simple linear regression method for quantitative trait loci linkage analysis with censored observations.

    Science.gov (United States)

    Anderson, Carl A; McRae, Allan F; Visscher, Peter M

    2006-07-01

    Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.

  16. The use of cognitive ability measures as explanatory variables in regression analysis.

    Science.gov (United States)

    Junker, Brian; Schofield, Lynne Steuerle; Taylor, Lowell J

    2012-12-01

    Cognitive ability measures are often taken as explanatory variables in regression analysis, e.g., as a factor affecting a market outcome such as an individual's wage, or a decision such as an individual's education acquisition. Cognitive ability is a latent construct; its true value is unobserved. Nonetheless, researchers often assume that a test score, constructed via standard psychometric practice from individuals' responses to test items, can be safely used in regression analysis. We examine problems that can arise, and suggest that an alternative approach, a "mixed effects structural equations" (MESE) model, may be more appropriate in many circumstances.

  17. Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.

    Science.gov (United States)

    Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

    2016-01-01

    Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.

  18. Asymptotic analysis of spatial discretizations in implicit Monte Carlo

    International Nuclear Information System (INIS)

    Densmore, Jeffery D.

    2009-01-01

    We perform an asymptotic analysis of spatial discretizations in Implicit Monte Carlo (IMC). We consider two asymptotic scalings: one that represents a time step that resolves the mean-free time, and one that corresponds to a fixed, optically large time step. We show that only the latter scaling results in a valid spatial discretization of the proper diffusion equation, and thus we conclude that IMC only yields accurate solutions when using optically large spatial cells if time steps are also optically large. We demonstrate the validity of our analysis with a set of numerical examples.

  19. Temporal trends in sperm count: a systematic review and meta-regression analysis.

    Science.gov (United States)

    Levine, Hagai; Jørgensen, Niels; Martino-Andrade, Anderson; Mendiola, Jaime; Weksler-Derri, Dan; Mindlis, Irina; Pinotti, Rachel; Swan, Shanna H

    2017-11-01

    Reported declines in sperm counts remain controversial today and recent trends are unknown. A definitive meta-analysis is critical given the predictive value of sperm count for fertility, morbidity and mortality. To provide a systematic review and meta-regression analysis of recent trends in sperm counts as measured by sperm concentration (SC) and total sperm count (TSC), and their modification by fertility and geographic group. PubMed/MEDLINE and EMBASE were searched for English language studies of human SC published in 1981-2013. Following a predefined protocol 7518 abstracts were screened and 2510 full articles reporting primary data on SC were reviewed. A total of 244 estimates of SC and TSC from 185 studies of 42 935 men who provided semen samples in 1973-2011 were extracted for meta-regression analysis, as well as information on years of sample collection and covariates [fertility group ('Unselected by fertility' versus 'Fertile'), geographic group ('Western', including North America, Europe, Australia and New Zealand versus 'Other', including South America, Asia and Africa), age, ejaculation abstinence time, semen collection method, method of measuring SC and semen volume, exclusion criteria and indicators of completeness of covariate data]. The slopes of SC and TSC were estimated as functions of sample collection year using both simple linear regression and weighted meta-regression models and the latter were adjusted for pre-determined covariates and modification by fertility and geographic group. Assumptions were examined using multiple sensitivity analyses and nonlinear models. SC declined significantly between 1973 and 2011 (slope in unadjusted simple regression models -0.70 million/ml/year; 95% CI: -0.72 to -0.69; P ... regression analysis reports a significant decline in sperm counts (as measured by SC and TSC) between 1973 and 2011, driven by a 50-60% decline among men unselected by fertility from North America, Europe, Australia and New Zealand. 
Because

  20. Regional Convergence of Income: Spatial Analysis

    Directory of Open Access Journals (Sweden)

    Vera Ivanovna Ivanova

    2014-12-01

    Full Text Available Russia has a huge territory and strong interregional heterogeneity, so we can assume that geographical factors have a significant impact on the pace of economic growth in Russian regions. Therefore the article is focused on the following issues: 1) correlation between comparative advantages of geographical location and differences in growth rates; 2) impact of more developed regions on their neighbors; and 3) correlation between economic growth of regions and their spatial interaction. The article is devoted to the empirical analysis of regional per capita incomes from 1996 to 2012 and explores the dynamics of the spatial autocorrelation of the regional development indicator. It is shown that there is a problem in measuring the intensity of spatial dependence: the value of Moran’s index varies greatly depending on the choice of the distance matrix. In addition, with the help of spatial econometrics the author tests the following hypotheses: 1) there is convergence between regions for a specified period; 2) the process of beta convergence is explained by the spatial arrangement of regions; and 3) there is a positive impact of market size on regional growth. The author empirically confirmed all three hypotheses.
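
    Beta convergence is commonly tested by regressing each region's average growth rate on the logarithm of its initial income; a negative slope means that poorer regions grow faster. The sketch below shows only this basic non-spatial form, as an assumption about the general approach: the spatial terms used in the article are not reproduced, and the incomes and function name are hypothetical.

```python
import numpy as np


def beta_convergence(initial_income, final_income, years):
    """Slope of average annual log growth regressed on log initial income."""
    y0 = np.asarray(initial_income, dtype=float)
    y_end = np.asarray(final_income, dtype=float)
    growth = np.log(y_end / y0) / years                    # average annual growth rate
    X = np.column_stack([np.ones_like(y0), np.log(y0)])
    intercept, beta = np.linalg.lstsq(X, growth, rcond=None)[0]
    return beta


# Hypothetical per capita incomes of four regions at the start and end of a 16-year period.
start = [100, 200, 400, 800]
catching_up = [300, 400, 600, 1000]    # poorer regions grow faster
pulling_apart = [110, 260, 640, 1600]  # richer regions grow faster

print(beta_convergence(start, catching_up, years=16))
print(beta_convergence(start, pulling_apart, years=16))
```

    A spatial version would add a spatially lagged term built from a weight matrix, which is where the choice of distance matrix discussed in the abstract enters.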

  1. Spatial-temporal event detection in climate parameter imagery.

    Energy Technology Data Exchange (ETDEWEB)

    McKenna, Sean Andrew; Gutierrez, Karen A.

    2011-10-01

    Previously developed techniques that comprise statistical parametric mapping, with applications focused on human brain imaging, are examined and tested here for new applications in anomaly detection within remotely-sensed imagery. Two approaches to analysis are developed: online, regression-based anomaly detection and conditional differences. These approaches are applied to two example spatial-temporal data sets: data simulated with a Gaussian field deformation approach and weekly NDVI images derived from global satellite coverage. Results indicate that anomalies can be identified in spatial-temporal data with the regression-based approach. Additionally, La Niña and El Niño climatic conditions are used as different stimuli applied to the earth and this comparison shows that El Niño conditions lead to significant decreases in NDVI in both the Amazon Basin and in Southern India.

  2. Introduction to regression graphics

    CERN Document Server

    Cook, R Dennis

    2009-01-01

    Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, called R-code, using the Xlisp-Stat language; it is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava

  3. Analysis of γ spectra in airborne radioactivity measurements using multiple linear regressions

    International Nuclear Information System (INIS)

    Bao Min; Shi Quanlin; Zhang Jiamei

    2004-01-01

    This paper describes the calculation of the net peak counts of the nuclide 137Cs at 662 keV in γ spectra from airborne radioactivity measurements using multiple linear regression. A mathematical model is built by analyzing every factor that contributes to the Cs peak counts in the spectra, and a multiple linear regression function is established. The calculation uses stepwise regression, and insignificant factors are eliminated by an F test. The regression results and their uncertainty are calculated by least squares estimation, from which the net Cs peak counts and their uncertainty are obtained. The analysis results for an experimental spectrum are presented. The influence of energy shift and energy resolution on the results is discussed. Compared with the spectrum stripping method, the multiple linear regression method does not need stripping ratios, its result depends only on the counts in the Cs peak, and the calculation uncertainty is reduced. (authors)

  4. Regression Analysis and Calibration Recommendations for the Characterization of Balance Temperature Effects

    Science.gov (United States)

    Ulbrich, N.; Volden, T.

    2018-01-01

    Analysis and use of temperature-dependent wind tunnel strain-gage balance calibration data are discussed in the paper. First, three different methods are presented and compared that may be used to process temperature-dependent strain-gage balance data. The first method uses an extended set of independent variables in order to process the data and predict balance loads. The second method applies an extended load iteration equation during the analysis of balance calibration data. The third method uses temperature-dependent sensitivities for the data analysis. Physical interpretations of the most important temperature-dependent regression model terms are provided that relate temperature compensation imperfections and the temperature-dependent nature of the gage factor to sets of regression model terms. Finally, balance calibration recommendations are listed so that temperature-dependent calibration data can be obtained and successfully processed using the reviewed analysis methods.

  5. Clinical evaluation of a novel population-based regression analysis for detecting glaucomatous visual field progression.

    Science.gov (United States)

    Kovalska, M P; Bürki, E; Schoetzau, A; Orguel, S F; Orguel, S; Grieshaber, M C

    2011-04-01

    The distinction of real progression from test variability in visual field (VF) series may be based on clinical judgment, on trend analysis based on follow-up of test parameters over time, or on identification of a significant change related to the mean of baseline exams (event analysis). The aim of this study was to compare a new population-based method (Octopus field analysis, OFA) with classic regression analyses and clinical judgment for detecting glaucomatous VF changes. 240 VF series of 240 patients with at least 9 consecutive examinations available were included into this study. They were independently classified by two experienced investigators. The results of such a classification served as a reference for comparison for the following statistical tests: (a) t-test global, (b) r-test global, (c) regression analysis of 10 VF clusters and (d) point-wise linear regression analysis. 32.5 % of the VF series were classified as progressive by the investigators. The sensitivity and specificity were 89.7 % and 92.0 % for r-test, and 73.1 % and 93.8 % for the t-test, respectively. In the point-wise linear regression analysis, the specificity was comparable (89.5 % versus 92 %), but the sensitivity was clearly lower than in the r-test (22.4 % versus 89.7 %) at a significance level of p = 0.01. A regression analysis for the 10 VF clusters showed a markedly higher sensitivity for the r-test (37.7 %) than the t-test (14.1 %) at a similar specificity (88.3 % versus 93.8 %) for a significant trend (p = 0.005). In regard to the cluster distribution, the paracentral clusters and the superior nasal hemifield progressed most frequently. The population-based regression analysis seems to be superior to the trend analysis in detecting VF progression in glaucoma, and may eliminate the drawbacks of the event analysis. Further, it may assist the clinician in the evaluation of VF series and may allow better visualization of the correlation between function and structure owing to VF

  6. Selective principal component regression analysis of fluorescence hyperspectral image to assess aflatoxin contamination in corn

    Science.gov (United States)

    Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...

  7. Determining Balıkesir’s Energy Potential Using a Regression Analysis Computer Program

    Directory of Open Access Journals (Sweden)

    Bedri Yüksel

    2014-01-01

    Full Text Available Solar power and wind energy are used concurrently during specific periods, while at other times only the more efficient is used, and hybrid systems make this possible. When establishing a hybrid system, the extent to which these two energy sources support each other needs to be taken into account. This paper is a study of the effects of wind speed, insolation levels, and the meteorological parameters of temperature and humidity on the energy potential in Balıkesir, in the Marmara region of Turkey. The relationship between the parameters was studied using a multiple linear regression method. Using a designed-for-purpose computer program, two different regression equations were derived, with wind speed being the dependent variable in the first and insolation levels in the second. The regression equations yielded accurate results. The computer program allowed for the rapid calculation of different acceptance rates. The results of the statistical analysis proved the reliability of the equations. An estimate of identified meteorological parameters and unknown parameters could be produced with a specified precision by using the regression analysis method. The regression equations also worked for the evaluation of energy potential.

  8. Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data

    Science.gov (United States)

    Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.

    2014-01-01

    In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified. Unplanned adjusted analyses should be considered secondary. Results suggest that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438

  9. Extensions of Morse-Smale Regression with Application to Actuarial Science

    OpenAIRE

    Farrelly, Colleen M.

    2017-01-01

    The problem of subgroups is ubiquitous in scientific research (e.g., disease heterogeneity, spatial distributions in ecology...), and piecewise regression is one way to deal with this phenomenon. Morse-Smale regression offers a way to partition the regression function based on level sets of a defined function and that function's basins of attraction. This topologically-based piecewise regression algorithm has shown promise in its initial applications, but the current implementation in the liter...

  10. UTOOLS: microcomputer software for spatial analysis and landscape visualization.

    Science.gov (United States)

    Alan A. Ager; Robert J. McGaughey

    1997-01-01

    UTOOLS is a collection of programs designed to integrate various spatial data in a way that allows versatile spatial analysis and visualization. The programs were designed for watershed-scale assessments in which a wide array of resource data must be integrated, analyzed, and interpreted. UTOOLS software combines raster, attribute, and vector data into "spatial...

  11. Sparse modeling of spatial environmental variables associated with asthma.

    Science.gov (United States)

    Chang, Timothy S; Gangnon, Ronald E; David Page, C; Buckingham, William R; Tandias, Aman; Cowan, Kelly J; Tomasallo, Carrie D; Arndt, Brian G; Hanrahan, Lawrence P; Guilbert, Theresa W

    2015-02-01

    Geographically distributed environmental factors influence the burden of diseases such as asthma. Our objective was to identify sparse environmental variables associated with asthma diagnosis gathered from a large electronic health record (EHR) dataset while controlling for spatial variation. An EHR dataset from the University of Wisconsin's Family Medicine, Internal Medicine and Pediatrics Departments was obtained for 199,220 patients aged 5-50 years over a three-year period. Each patient's home address was geocoded to one of 3456 geographic census block groups. Over one thousand block group variables were obtained from a commercial database. We developed a Sparse Spatial Environmental Analysis (SASEA). Using this method, the environmental variables were first dimensionally reduced with sparse principal component analysis. Logistic thin plate regression spline modeling was then used to identify block group variables associated with asthma from sparse principal components. The addresses of patients from the EHR dataset were distributed throughout the majority of Wisconsin's geography. Logistic thin plate regression spline modeling captured spatial variation of asthma. Four sparse principal components identified via model selection consisted of food at home, dog ownership, household size, and disposable income variables. In rural areas, dog ownership and renter occupied housing units from significant sparse principal components were associated with asthma. Our main contribution is the incorporation of sparsity in spatial modeling. SASEA sequentially added sparse principal components to logistic thin plate regression spline modeling. This method allowed association of geographically distributed environmental factors with asthma using EHR and environmental datasets. SASEA can be applied to other diseases with environmental risk factors. Copyright © 2014 Elsevier Inc. All rights reserved.

  12. Declining Bias and Gender Wage Discrimination? A Meta-Regression Analysis

    Science.gov (United States)

    Jarrell, Stephen B.; Stanley, T. D.

    2004-01-01

    The meta-regression analysis reveals that there is a strong tendency for discrimination estimates to fall and that wage discrimination exists against women. The biasing effect of researchers' gender and of not correcting for selection bias has weakened, and changes in the labor market have made it less important.

  13. Statistical methods in regression and calibration analysis of chromosome aberration data

    International Nuclear Information System (INIS)

    Merkle, W.

    1983-01-01

    The method of iteratively reweighted least squares for the regression analysis of Poisson distributed chromosome aberration data is reviewed in the context of other fit procedures used in the cytogenetic literature. As an application of the resulting regression curves, methods for calculating confidence intervals on dose from aberration yield are described and compared, and, for the linear quadratic model, a confidence interval is given. Emphasis is placed on the rational interpretation and the limitations of various methods from a statistical point of view. (orig./MG)
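The fitting procedure named in this abstract can be illustrated with a minimal sketch. It assumes an identity-link linear quadratic yield curve, yield = c + alpha*D + beta*D^2, with Poisson variance equal to the mean, so each observation is weighted by 1/mean. The function names, the parameter names and the example counts are hypothetical and are not taken from the paper.

```python
def solve(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def irls_poisson_identity(doses, yields, n_iter=25):
    """Fit yield = c + alpha*D + beta*D^2 by iteratively reweighted least squares.

    Assumption: Poisson counts with an identity link, so Var(y) = mean and
    the working weight of each observation is 1 / fitted mean.
    """
    X = [[1.0, d, d * d] for d in doses]
    w = [1.0] * len(doses)  # first pass is ordinary least squares
    coef = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        # Weighted normal equations: (X' W X) coef = X' W y
        xtwx = [[sum(wi * xi[j] * xi[k] for wi, xi in zip(w, X))
                 for k in range(3)] for j in range(3)]
        xtwy = [sum(wi * xi[j] * yi for wi, xi, yi in zip(w, X, yields))
                for j in range(3)]
        coef = solve(xtwx, xtwy)
        # Update weights from the fitted means (floored to stay positive)
        mu = [max(sum(c * x for c, x in zip(coef, xi)), 1e-8) for xi in X]
        w = [1.0 / m for m in mu]
    return coef


# Hypothetical aberration counts at six dose levels (illustration only)
doses = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]
counts = [1, 4, 9, 26, 56, 92]
c, alpha, beta = irls_poisson_identity(doses, counts)
print(f"c = {c:.3f}, alpha = {alpha:.3f}, beta = {beta:.3f}")
```

The reweighting step is what distinguishes this from ordinary least squares: high-dose points have larger counts and therefore larger variance, so they receive less weight per unit of squared error.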

  14. Spatial Modeling of Deforestation in FMU of Poigar, North Sulawesi

    OpenAIRE

    Ahmad, Afandi; Saleh, Muhammad Buce; Rusolono, Teddy

    2016-01-01

    Forest is a part of the ecosystem that provides environmental services. Deforestation may decrease forest function in an ecosystem. This study aims to build a spatial model of deforestation in a forest management unit (FMU) of Poigar. Deforestation analysis was carried out by analyzing the change from forest cover to non-forest cover with a post-classification comparison technique. Driving forces of deforestation were identified by spatial modeling using binary logistic regression models (LRM). Result of...

  15. Length bias correction in gene ontology enrichment analysis using logistic regression.

    Science.gov (United States)

    Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H

    2012-01-01

    When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
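One way to write down the kind of model this abstract describes is sketched below. The symbols are assumptions introduced here for illustration and are not taken from the paper: G_i indicates whether gene i belongs to the Gene Ontology category, s_i is a measure of the significance of its differential expression test, and L_i is its transcript length (possibly transformed).

```latex
\operatorname{logit}\,\Pr(G_i = 1) \;=\; \beta_0 + \beta_1 s_i + \beta_2 L_i
```

Under this sketch, testing whether \beta_1 = 0 asks whether category membership is associated with differential expression conditional on transcript length, which is how including the covariate removes length as a confounder.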

  16. Geospatial analysis platform: Supporting strategic spatial analysis and planning

    CSIR Research Space (South Africa)

    Naude, A

    2008-11-01

    Whilst there have been rapid advances in satellite imagery and related fine resolution mapping and web-based interfaces (e.g. Google Earth), the development of capabilities for strategic spatial analysis and planning support has lagged behind...

  17. Analysis of designed experiments by stabilised PLS Regression and jack-knifing

    DEFF Research Database (Denmark)

    Martens, Harald; Høy, M.; Westad, F.

    2001-01-01

    Pragmatical, visually oriented methods for assessing and optimising bi-linear regression models are described, and applied to PLS Regression (PLSR) analysis of multi-response data from controlled experiments. The paper outlines some ways to stabilise the PLSR method to extend its range...... the reliability of the linear and bi-linear model parameter estimates. The paper illustrates how the obtained PLSR "significance" probabilities are similar to those from conventional factorial ANOVA, but the PLSR is shown to give important additional overview plots of the main relevant structures in the multi....... An Introduction, Wiley, Chichester, UK, 2001]....

  18. Replica analysis of overfitting in regression models for time-to-event data

    Science.gov (United States)

    Coolen, A. C. C.; Barrett, J. E.; Paga, P.; Perez-Vicente, C. J.

    2017-09-01

    Overfitting, which happens when the number of parameters in a model is too large compared to the number of data points available for determining these parameters, is a serious and growing problem in survival analysis. While modern medicine presents us with data of unprecedented dimensionality, these data cannot yet be used effectively for clinical outcome prediction. Standard error measures in maximum likelihood regression, such as p-values and z-scores, are blind to overfitting, and even for Cox’s proportional hazards model (the main tool of medical statisticians), one finds in literature only rules of thumb on the number of samples required to avoid overfitting. In this paper we present a mathematical theory of overfitting in regression models for time-to-event data, which aims to increase our quantitative understanding of the problem and provide practical tools with which to correct regression outcomes for the impact of overfitting. It is based on the replica method, a statistical mechanical technique for the analysis of heterogeneous many-variable systems that has been used successfully for several decades in physics, biology, and computer science, but not yet in medical statistics. We develop the theory initially for arbitrary regression models for time-to-event data, and verify its predictions in detail for the popular Cox model.

  19. Predictors of postoperative outcomes of cubital tunnel syndrome treatments using multiple logistic regression analysis.

    Science.gov (United States)

    Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki

    2017-05-01

    This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria which is an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.

  20. Quantitative electron microscope autoradiography: application of multiple linear regression analysis

    International Nuclear Information System (INIS)

    Markov, D.V.

    1986-01-01

    A new method for the analysis of high resolution EM autoradiographs is described. It identifies labelled cell organelle profiles in sections on a strictly statistical basis and provides accurate estimates for their radioactivity without the need to make any assumptions about their size, shape and spatial arrangement. (author)

  1. Rings and sector: intrasite spatial analysis of stone age sites

    NARCIS (Netherlands)

    Stapert, Durk

    1992-01-01

    This thesis deals with intrasite spatial analysis: the analysis of spatial patterns on site level. My main concern has been to develop a simple method for analysing Stone Age sites of a special type: those characterised by the presence of a hearth closely associated in space with an artefact

  2. Multiple linear regression analysis

    Science.gov (United States)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  3. Stereological analysis of spatial structures

    DEFF Research Database (Denmark)

    Hansen, Linda Vadgård

    The thesis deals with stereological analysis of spatial structures. One area of focus has been to improve the precision of well-known stereological estimators by including information that is available via automatic image analysis. Furthermore, the thesis presents a stochastic model for star-shaped three-dimensional objects using the radial function. It appears that the model is highly flexible in the sense that it can be used to describe an object with arbitrary irregular surface. Results on the distribution of well-known local stereological volume estimators are provided....

  4. Logistic regression for southern pine beetle outbreaks with spatial and temporal autocorrelation

    Science.gov (United States)

    M. L. Gumpertz; C.-T. Wu; John M. Pye

    2000-01-01

    Regional outbreaks of southern pine beetle (Dendroctonus frontalis Zimm.) show marked spatial and temporal patterns. While these patterns are of interest in themselves, we focus on statistical methods for estimating the effects of underlying environmental factors in the presence of spatial and temporal autocorrelation. The most comprehensive available information on...

  5. Exploring factors associated with traumatic dental injuries in preschool children: a Poisson regression analysis.

    Science.gov (United States)

    Feldens, Carlos Alberto; Kramer, Paulo Floriani; Ferreira, Simone Helena; Spiguel, Mônica Hermann; Marquezan, Marcela

    2010-04-01

    This cross-sectional study aimed to investigate the factors associated with dental trauma in preschool children using Poisson regression analysis with robust variance. The study population comprised 888 children aged 3 to 5 years attending public nurseries in Canoas, southern Brazil. Questionnaires assessing information related to the independent variables (age, gender, race, mother's educational level and family income) were completed by the parents. Clinical examinations were carried out by five trained examiners in order to assess traumatic dental injuries (TDI) according to Andreasen's classification. One of the five examiners was calibrated to assess orthodontic characteristics (open bite and overjet). Multivariable Poisson regression analysis with robust variance was used to determine the factors associated with dental trauma as well as the strengths of association. Traditional logistic regression was also performed in order to compare the estimates obtained by both methods of statistical analysis. 36.4% (323/888) of the children suffered dental trauma and there was no difference in prevalence rates from 3 to 5 years of age. Poisson regression analysis showed that the probability of the outcome was almost 30% higher for children whose mothers had more than 8 years of education (Prevalence Ratio = 1.28; 95% CI = 1.03-1.60) and 63% higher for children with an overjet greater than 2 mm (Prevalence Ratio = 1.63; 95% CI = 1.31-2.03). Odds ratios clearly overestimated the size of the effect when compared with prevalence ratios. These findings indicate the need for preventive orientation regarding TDI, in order to educate parents and caregivers about supervising infants, particularly those with increased overjet and whose mothers have a higher level of education. Poisson regression with robust variance represents a better alternative than logistic regression to estimate the risk of dental trauma in preschool children.
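The remark that odds ratios overestimate the effect relative to prevalence ratios can be illustrated with a small calculation on a 2x2 table. The counts below are hypothetical and chosen only so that the outcome is common in one case and rare in the other; they are not the study's data.

```python
def prevalence_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of outcome prevalence in the exposed group to that in the unexposed group."""
    p1 = exposed_cases / exposed_total
    p0 = unexposed_cases / unexposed_total
    return p1 / p0


def odds_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of outcome odds in the exposed group to that in the unexposed group."""
    p1 = exposed_cases / exposed_total
    p0 = unexposed_cases / unexposed_total
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


# Hypothetical common outcome: 50% vs 30% prevalence
common = (50, 100, 30, 100)
# Hypothetical rare outcome with the same prevalence ratio: 0.5% vs 0.3%
rare = (5, 1000, 3, 1000)

for label, table in (("common", common), ("rare", rare)):
    pr = prevalence_ratio(*table)
    orr = odds_ratio(*table)
    print(f"{label}: PR = {pr:.3f}, OR = {orr:.3f}")
```

With a rare outcome the two measures nearly coincide; when the outcome is common, as with the 36.4% prevalence reported here, the odds ratio moves further from 1 than the prevalence ratio does.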

  6. Application of range-test in multiple linear regression analysis in ...

    African Journals Online (AJOL)

    Application of range-test in multiple linear regression analysis in the presence of outliers is studied in this paper. First, the plot of the explanatory variables (i.e. Administration, Social/Commercial, Economic services and Transfer) on the dependent variable (i.e. GDP) was done to identify the statistical trend over the years.

  7. High Incidence of Breast Cancer in Light-Polluted Areas with Spatial Effects in Korea.

    Science.gov (United States)

    Kim, Yun Jeong; Park, Man Sik; Lee, Eunil; Choi, Jae Wook

    2016-01-01

    We have reported a high prevalence of breast cancer in light-polluted areas in Korea. However, it is necessary to analyze the spatial effects of light-polluted areas on breast cancer because light pollution levels are correlated with region proximity to central urbanized areas in studied cities. In this study, we applied a spatial regression method (an intrinsic conditional autoregressive [iCAR] model) to analyze the relationship between the incidence of breast cancer and artificial light at night (ALAN) levels in 25 regions including central city, urbanized, and rural areas. By Poisson regression analysis, there was a significant correlation between ALAN, alcohol consumption rates, and the incidence of breast cancer. We also found significant spatial effects between ALAN and the incidence of breast cancer, with a decrease in the deviance information criterion (DIC) from 374.3 to 348.6 and an increase in R² from 0.574 to 0.667. Therefore, spatial analysis (an iCAR model) is more appropriate for assessing ALAN effects on breast cancer. To our knowledge, this study is the first to show spatial effects of light pollution on breast cancer, despite the limitations of an ecological study. We suggest that a decrease in ALAN could reduce breast cancer more than expected because of spatial effects.

  8. Land-use regression with long-term satellite-based greenness index and culture-specific sources to model PM2.5 spatial-temporal variability.

    Science.gov (United States)

    Wu, Chih-Da; Chen, Yu-Cheng; Pan, Wen-Chi; Zeng, Yu-Ting; Chen, Mu-Jean; Guo, Yue Leon; Lung, Shih-Chun Candice

    2017-05-01

    This study utilized a long-term satellite-based vegetation index, and considered culture-specific emission sources (temples and Chinese restaurants) with Land-use Regression (LUR) modelling to estimate the spatial-temporal variability of PM2.5 using data from Taipei metropolis, which exhibits typical Asian city characteristics. Annual average PM2.5 concentrations from 2006 to 2012 of 17 air quality monitoring stations established by the Environmental Protection Administration of Taiwan were used for model development. PM2.5 measurements from 2013 were used for external data verification. Monthly Normalized Difference Vegetation Index (NDVI) images coupled with buffer analysis were used to assess the spatial-temporal variations of greenness surrounding the monitoring sites. The distribution of temples and Chinese restaurants were included to represent the emission contributions from incense and joss money burning, and gas cooking, respectively. Spearman correlation coefficient and stepwise regression were used for LUR model development, and 10-fold cross-validation and external data verification were applied to verify the model reliability. The results showed a strong negative correlation (r: -0.71 to -0.77) between NDVI and PM2.5 while temples (r: 0.52 to 0.66) and Chinese restaurants (r: 0.31 to 0.44) were positively correlated to PM2.5 concentrations. With an adjusted model R² of 0.89, a cross-validated adjusted R² of 0.90, and an externally validated R² of 0.83, the high explanatory power of the resultant model was confirmed. Moreover, the averaged NDVI within a 1750 m circular buffer (p < 0.01), the number of Chinese restaurants within a 1750 m buffer (p < 0.01), and the number of temples within a 750 m buffer (p = 0.06) were selected as important predictors during the stepwise selection procedures. According to the partial R², NDVI explained 66% of PM2.5 variation and was the dominant variable in the developed model. We suggest future studies

  9. Multiple regression analysis of anthropometric measurements influencing the cephalic index of male Japanese university students.

    Science.gov (United States)

    Hossain, Md Golam; Saw, Aik; Alam, Rashidul; Ohtsuki, Fumio; Kamarul, Tunku

    2013-09-01

    Cephalic index (CI), the ratio of head breadth to head length, is widely used to categorise human populations. The aim of this study was to assess the impact of anthropometric measurements on the CI of male Japanese university students. This study included 1,215 male university students from Tokyo and Kyoto, selected using convenience sampling. Multiple regression analysis was used to determine the effect of anthropometric measurements on CI. The variance inflation factor (VIF) showed no evidence of a multicollinearity problem among independent variables. The coefficients of the regression line demonstrated a significant positive relationship between CI and minimum frontal breadth (p regression analysis showed a greater likelihood for minimum frontal breadth (p regression analysis revealed bizygomatic breadth, head circumference, minimum frontal breadth, head height and morphological facial height to be the best predictor craniofacial measurements with respect to CI. The results suggest that most of the variables considered in this study appear to influence the CI of adult male Japanese students.
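The variance inflation factor check mentioned in this abstract can be sketched for the simplest case. With exactly two predictors, the VIF of either one reduces to 1 / (1 - r^2), where r is their Pearson correlation; the general case regresses each predictor on all the others. The function names and measurement values below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical measurements (cm) for five subjects, illustration only
head_circumference = [55.1, 56.4, 57.0, 58.2, 59.5]
frontal_breadth = [10.9, 10.2, 11.4, 10.6, 11.1]
print(f"VIF = {vif_two_predictors(head_circumference, frontal_breadth):.2f}")
```

A VIF near 1 indicates that a predictor shares little variance with the others; common rules of thumb treat values above roughly 5 or 10 as a sign of problematic multicollinearity.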

  10. Variations in cardiovascular disease under-diagnosis in England: national cross-sectional spatial analysis

    Directory of Open Access Journals (Sweden)

    Walford Hannah

    2011-03-01

    Background: There is under-diagnosis of cardiovascular disease (CVD) in the English population, despite financial incentives to encourage general practices to register new cases. We compared the modelled (expected) and diagnosed (observed) prevalence of three cardiovascular conditions (coronary heart disease (CHD), hypertension and stroke) at local level, their geographical variation, and population and healthcare predictors which might influence diagnosis. Methods: Cross-sectional observational study in all English local authorities (351) and general practices (8,372) comparing model-based expected prevalence with diagnosed prevalence on practice disease registers. Spatial analyses were used to identify geographic clusters and variation in regression relationships. Results: A total of 9,682,176 patients were on practice CHD, stroke and transient ischaemic attack, and hypertension registers. There was wide spatial variation in observed:expected prevalence ratios for all three diseases, with less than five per cent of expected cases diagnosed in some areas. London and the surrounding area showed statistically significant discrepancies in observed:expected prevalence ratios, with observed prevalence much lower than the epidemiological models predicted. The addition of general practitioner supply as a variable yielded stronger regression results for all three conditions. Conclusions: Despite almost universal access to free primary healthcare, there may be significant and highly variable under-diagnosis of CVD across England, which can be partially explained by persistent inequity in GP supply. Disease management studies should consider the possible impact of under-diagnosis on population health outcomes. Compared to classical regression modelling, spatial analytic techniques can provide additional information on risk factors for under-diagnosis, and can suggest where healthcare resources may be most needed.

  11. Understanding poisson regression.

    Science.gov (United States)

    Hayat, Matthew J; Higgins, Melinda

    2014-04-01

    Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.
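The overdispersion issue raised in this abstract can be illustrated with a crude check. Under a Poisson distribution the variance equals the mean, so a sample variance-to-mean ratio well above 1 suggests overdispersion. This sketch ignores covariates, which a real regression diagnostic would account for, and the counts are hypothetical.

```python
def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson-like counts."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


# Hypothetical comorbidity counts for twelve patients (illustration only)
comorbidities = [0, 0, 1, 0, 2, 7, 0, 1, 9, 0, 3, 1]
ratio = dispersion_index(comorbidities)
print(f"variance/mean = {ratio:.2f}")
if ratio > 1.5:  # arbitrary illustrative threshold, not a formal test
    print("Counts look overdispersed; consider a negative binomial model.")
```

When the ratio is far above 1, standard Poisson regression tends to understate standard errors, which is why the abstract points to an overdispersion parameter or negative binomial regression as alternatives.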

  12. Analysis of the influence of quantile regression model on mainland tourists' service satisfaction performance.

    Science.gov (United States)

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models.
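The difference between a least-squares fit and a quantile regression fit comes down to the loss being minimised. A minimal sketch, assuming the standard check (pinball) loss and the simplest possible model, a constant with no predictors: the constant that minimises the loss at level tau is the sample tau-quantile. The satisfaction scores and function names are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau on positive residuals, (1 - tau) on negative ones."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant(ys, tau):
    """Constant minimising total pinball loss; a minimiser always exists among the data points."""
    return min(sorted(ys), key=lambda c: sum(pinball_loss(y - c, tau) for y in ys))


# Hypothetical overall-satisfaction scores (illustration only)
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
for tau in (0.25, 0.5, 0.75):
    print(f"tau = {tau}: fitted constant = {best_constant(scores, tau)}")
```

Replacing the constant with a linear function of the satisfaction items gives quantile regression proper, which is how the same predictors can have different estimated effects at Q0.25 than at the median or upper quantiles.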

  13. Analysis of the Influence of Quantile Regression Model on Mainland Tourists' Service Satisfaction Performance

    Science.gov (United States)

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916

  14. Analysis of the Influence of Quantile Regression Model on Mainland Tourists’ Service Satisfaction Performance

    Directory of Open Access Journals (Sweden)

    Wen-Cheng Wang

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models.

  15. Econometric analysis of realised covariation: high frequency covariance, regression and correlation in financial economics

    OpenAIRE

    Ole E. Barndorff-Nielsen; Neil Shephard

    2002-01-01

    This paper analyses multivariate high frequency financial data using realised covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis and covariance. It will be based on a fixed interval of time (e.g. a day or week), allowing the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions and covariances change through time. In particular w...
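The realised quantities named in this abstract can be sketched for two assets over a single fixed interval such as one day. The sketch assumes the usual definitions: realised covariation is the sum of products of high-frequency returns, and the realised regression and correlation are ratios of such sums. The return series and function names are hypothetical.

```python
from math import sqrt


def realised_covariance(returns_x, returns_y):
    """Sum of products of intraday returns over the fixed interval."""
    return sum(x * y for x, y in zip(returns_x, returns_y))


def realised_regression(returns_x, returns_y):
    """Realised regression of y on x: realised covariance over realised variance of x."""
    return realised_covariance(returns_x, returns_y) / realised_covariance(returns_x, returns_x)


def realised_correlation(returns_x, returns_y):
    """Realised covariance scaled by the two realised volatilities."""
    cov = realised_covariance(returns_x, returns_y)
    var_x = realised_covariance(returns_x, returns_x)
    var_y = realised_covariance(returns_y, returns_y)
    return cov / sqrt(var_x * var_y)


# Hypothetical intraday returns for two assets within one day (illustration only)
asset_x = [0.0010, -0.0020, 0.0015, -0.0005, 0.0008]
asset_y = [0.0012, -0.0015, 0.0020, -0.0010, 0.0003]
print(f"realised beta = {realised_regression(asset_x, asset_y):.3f}")
print(f"realised correlation = {realised_correlation(asset_x, asset_y):.3f}")
```

Computing these quantities separately for each day yields a time series of betas and correlations, which is one way to study how they change through time.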

  16. Spatial Estimation of Losses Attributable to Meteorological Disasters in a Specific Area (105.0°E–115.0°E, 25°N–35°N) Using Bayesian Maximum Entropy and Partial Least Squares Regression

    Directory of Open Access Journals (Sweden)

    F. S. Zhang

    2016-01-01

    Full Text Available The spatial mapping of losses attributable to such disasters is now well established as a means of describing the spatial patterns of disaster risk, and it has been shown to be suitable for many types of major meteorological disasters. However, few studies have developed a regression model to estimate the effects of the spatial distribution of meteorological factors on losses associated with meteorological disasters. In this study, the proposed approach is capable of the following: (a) estimating the spatial distributions of seven meteorological factors using Bayesian maximum entropy, (b) identifying which of the four mapping methods used in this research performs best based on cross validation, and (c) establishing a fitted model between the PLS components and disaster loss information using partial least squares regression within a specific research area. The results showed the following: (a) the best mapping results were produced by multivariate Bayesian maximum entropy with probabilistic soft data; (b) the regression model using three PLS components, extracted from seven meteorological factors by the PLS method, was the most predictive according to the PRESS/SS test; (c) northern Hunan Province sustained the most damage, and southeastern Gansu Province and western Guizhou Province sustained the least.

  17. Prediction of radiation levels in residences: A methodological comparison of CART [Classification and Regression Tree Analysis] and conventional regression

    International Nuclear Information System (INIS)

    Janssen, I.; Stebbings, J.H.

    1990-01-01

    In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and ∼200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs
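
    The claim above that the tree-structured approach is independent of the scale of the independent variables can be checked with a small sketch. It searches for the single best regression-tree split on a skewed predictor and repeats the search after a log transformation. The data and the `best_split` helper are hypothetical and only illustrate the idea; they are not the CART software used in the study.

```python
import math


def best_split(x, y):
    """Find the threshold split on x that minimises the summed squared error
    of y within the two child nodes. Returns the indices sent to the left child."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best_sse, best_left = float("inf"), None
    for k in range(1, len(order)):
        if x[order[k]] == x[order[k - 1]]:
            continue  # no threshold can separate tied values
        left, right = order[:k], order[k:]
        sse = 0.0
        for group in (left, right):
            mean = sum(y[i] for i in group) / len(group)
            sse += sum((y[i] - mean) ** 2 for i in group)
        if sse < best_sse:
            best_sse, best_left = sse, frozenset(left)
    return best_left


# Hypothetical skewed predictor spanning several orders of magnitude.
x = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0, 1000.0, 3000.0]
y = [0.9, 1.1, 1.0, 1.2, 5.1, 4.9, 5.2, 5.0]

raw_partition = best_split(x, y)
log_partition = best_split([math.log10(v) for v in x], y)
```

    Because a split depends only on the ordering of the predictor values, any monotone transformation leaves the chosen partition unchanged, whereas a linear regression fit would change.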

  18. Do the risk factors for type 2 diabetes mellitus vary by location? A spatial analysis of health insurance claims in Northeastern Germany using kernel density estimation and geographically weighted regression

    Directory of Open Access Journals (Sweden)

    Boris Kauhl

    2016-11-01

    Full Text Available Abstract Background The provision of general practitioners (GPs) in Germany still relies mainly on the ratio of inhabitants to GPs at relatively large scales and barely accounts for an increased prevalence of chronic diseases among the elderly and socially underprivileged populations. Type 2 Diabetes Mellitus (T2DM) is one of the major cost-intensive diseases with high rates of potentially preventable complications. Provision of healthcare and access to preventive measures is necessary to reduce the burden of T2DM. However, current studies on the spatial variation of T2DM in Germany are mostly based on survey data, which not only underestimate the true prevalence of T2DM, but are also only available on large spatial scales. The aim of this study is therefore to analyse the spatial distribution of T2DM at fine geographic scales and to assess location-specific risk factors based on data of the AOK health insurance. Methods To display the spatial heterogeneity of T2DM, a bivariate, adaptive kernel density estimation (KDE) was applied. The spatial scan statistic (SaTScan) was used to detect areas of high risk. Global and local spatial regression models were then constructed to analyze socio-demographic risk factors of T2DM. Results T2DM is especially concentrated in rural areas surrounding Berlin. The risk factors for T2DM consist of proportions of 65–79 year olds, 80 + year olds, unemployment rate among the 55–65 year olds, proportion of employees covered by mandatory social security insurance, mean income tax, and proportion of non-married couples. However, the strength of the association between T2DM and the examined socio-demographic variables displayed strong regional variations. Conclusion The prevalence of T2DM varies at the very local level. Analyzing point data on T2DM of northeastern Germany’s largest health insurance provider thus allows very detailed, location-specific knowledge about increased medical needs. Risk factors

  19. A rotor optimization using regression analysis

    Science.gov (United States)

    Giansante, N.

    1984-01-01

    The design and development of helicopter rotors is subject to the many design variables and their interactions that affect rotor operation. Until recently, selection of rotor design variables to achieve specified rotor operational qualities has been a costly, time-consuming, repetitive task. For the past several years, Kaman Aerospace Corporation has successfully applied multiple linear regression analysis, coupled with optimization and sensitivity procedures, in the analytical design of rotor systems. It is concluded that approximating equations can be developed rapidly for a multiplicity of objective and constraint functions and optimizations can be performed in a rapid and cost-effective manner; the number and/or range of design variables can be increased by expanding the data base and developing approximating functions to reflect the expanded design space; the order of the approximating equations can be expanded easily to improve correlation between analyzer results and the approximating equations; gradients of the approximating equations can be calculated easily and these gradients are smooth functions, reducing the risk of numerical problems in the optimization; the use of approximating functions allows the problem to be started easily and rapidly from various initial designs to enhance the probability of finding a global optimum; and the approximating equations are independent of the analysis or optimization codes used.

  20. Regression and local control rates after radiotherapy for jugulotympanic paragangliomas: Systematic review and meta-analysis

    International Nuclear Information System (INIS)

    Hulsteijn, Leonie T. van; Corssmit, Eleonora P.M.; Coremans, Ida E.M.; Smit, Johannes W.A.; Jansen, Jeroen C.; Dekkers, Olaf M.

    2013-01-01

    The primary treatment goal of radiotherapy for paragangliomas of the head and neck region (HNPGLs) is local control of the tumor, i.e. stabilization of tumor volume. Interestingly, regression of tumor volume has also been reported. Up to the present, no meta-analysis has been performed giving an overview of regression rates after radiotherapy in HNPGLs. The main objective was to perform a systematic review and meta-analysis to assess regression of tumor volume in HNPGL-patients after radiotherapy. A second outcome was local tumor control. Design of the study is systematic review and meta-analysis. PubMed, EMBASE, Web of Science, COCHRANE and Academic Search Premier and references of key articles were searched in March 2012 to identify potentially relevant studies. Considering the indolent course of HNPGLs, only studies with ⩾12 months follow-up were eligible. Main outcomes were the pooled proportions of regression and local control after radiotherapy as initial, combined (i.e. directly post-operatively or post-embolization) or salvage treatment (i.e. after initial treatment has failed) for HNPGLs. A meta-analysis was performed with an exact likelihood approach using a logistic regression with a random effect at the study level. Pooled proportions with 95% confidence intervals (CI) were reported. Fifteen studies were included, concerning a total of 283 jugulotympanic HNPGLs in 276 patients. Pooled regression proportions for initial, combined and salvage treatment were respectively 21%, 33% and 52% in radiosurgery studies and 4%, 0% and 64% in external beam radiotherapy studies. Pooled local control proportions for radiotherapy as initial, combined and salvage treatment ranged from 79% to 100%. Radiotherapy for jugulotympanic paragangliomas results in excellent local tumor control and therefore is a valuable treatment for these types of tumors. The effects of radiotherapy on regression of tumor volume remain ambiguous, although the data suggest that regression can

  1. Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression

    Energy Technology Data Exchange (ETDEWEB)

    Verdoolaege, G., E-mail: geert.verdoolaege@ugent.be [Department of Applied Physics, Ghent University, B-9000 Ghent (Belgium); Laboratory for Plasma Physics, Royal Military Academy, B-1000 Brussels (Belgium)]; Shabbir, A. [Department of Applied Physics, Ghent University, B-9000 Ghent (Belgium); Max Planck Institute for Plasma Physics, Boltzmannstr. 2, 85748 Garching (Germany)]; Hornung, G. [Department of Applied Physics, Ghent University, B-9000 Ghent (Belgium)]

    2016-11-15

    Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression that is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of application of GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standard least squares.

  2. Meta-regression analysis of commensal and pathogenic Escherichia coli survival in soil and water.

    Science.gov (United States)

    Franz, Eelco; Schijven, Jack; de Roda Husman, Ana Maria; Blaak, Hetty

    2014-06-17

    The extent to which pathogenic and commensal E. coli (respectively PEC and CEC) can survive, and which factors predominantly determine the rate of decline, are crucial issues from a public health point of view. The goal of this study was to provide a quantitative summary of the variability in E. coli survival in soil and water over a broad range of individual studies and to identify the most important sources of variability. To that end, a meta-regression analysis on available literature data was conducted. The considerable variation in reported decline rates indicated that the persistence of E. coli is not easily predictable. The meta-analysis demonstrated that for soil and water, the type of experiment (laboratory or field), the matrix subtype (type of water and soil), and temperature were the main factors included in the regression analysis. A higher average decline rate in soil of PEC compared with CEC was observed. The regression models explained at best 57% of the variation in decline rate in soil and 41% of the variation in decline rate in water. This indicates that additional factors, not included in the current meta-regression analysis, are of importance but rarely reported. More complete reporting of experimental conditions may allow future inference on the global effects of these variables on the decline rate of E. coli.

  3. Fourier transform infrared spectroscopic imaging and multivariate regression for prediction of proteoglycan content of articular cartilage.

    Directory of Open Access Journals (Sweden)

    Lassi Rieppo

    Full Text Available Fourier Transform Infrared (FT-IR) spectroscopic imaging has been earlier applied for the spatial estimation of the collagen and the proteoglycan (PG) contents of articular cartilage (AC). However, earlier studies have been limited to the use of univariate analysis techniques. Current analysis methods lack the needed specificity for collagen and PGs. The aim of the present study was to evaluate the suitability of partial least squares regression (PLSR) and principal component regression (PCR) methods for the analysis of the PG content of AC. Multivariate regression models were compared with earlier used univariate methods and tested with a sample material consisting of healthy and enzymatically degraded steer AC. Chondroitinase ABC enzyme was used to increase the variation in PG content levels as compared to intact AC. Digital densitometric measurements of Safranin O-stained sections provided the reference for PG content. The results showed that multivariate regression models predict PG content of AC significantly better than earlier used absorbance spectrum (i.e. the area of carbohydrate region with or without amide I normalization) or second derivative spectrum univariate parameters. Increased molecular specificity favours the use of multivariate regression models, but they require more knowledge of chemometric analysis and extended laboratory resources for gathering reference data for establishing the models. When true molecular specificity is required, the multivariate models should be used.

  4. Research on the spatial analysis method of seismic hazard for island

    International Nuclear Information System (INIS)

    Jia, Jing; Jiang, Jitong; Zheng, Qiuhong; Gao, Huiying

    2017-01-01

    Seismic hazard analysis (SHA) is a key component of earthquake disaster prevention for island engineering. At the micro level its results provide parameters for seismic design, and at the macro level it is requisite work for the earthquake and comprehensive disaster prevention planning within island conservation planning, in the exploitation and construction of both inhabited and uninhabited islands. The existing seismic hazard analysis methods are compared, and their applicability and limitations for islands are analysed. Then a specialized spatial analysis method of seismic hazard for island (SAMSHI) is given to support further work on earthquake disaster prevention planning, based on spatial analysis tools in GIS and a fuzzy comprehensive evaluation model. The basic spatial database of SAMSHI includes fault data, historical earthquake records, geological data and Bouguer gravity anomaly data, which are the data sources for the 11 indices of the fuzzy comprehensive evaluation model; these indices are calculated by the spatial analysis model constructed in ArcGIS’s Model Builder platform. (paper)

  5. Research on the spatial analysis method of seismic hazard for island

    Science.gov (United States)

    Jia, Jing; Jiang, Jitong; Zheng, Qiuhong; Gao, Huiying

    2017-05-01

    Seismic hazard analysis (SHA) is a key component of earthquake disaster prevention for island engineering. At the micro level its results provide parameters for seismic design, and at the macro level it is requisite work for the earthquake and comprehensive disaster prevention planning within island conservation planning, in the exploitation and construction of both inhabited and uninhabited islands. The existing seismic hazard analysis methods are compared, and their applicability and limitations for islands are analysed. Then a specialized spatial analysis method of seismic hazard for island (SAMSHI) is given to support further work on earthquake disaster prevention planning, based on spatial analysis tools in GIS and a fuzzy comprehensive evaluation model. The basic spatial database of SAMSHI includes fault data, historical earthquake records, geological data and Bouguer gravity anomaly data, which are the data sources for the 11 indices of the fuzzy comprehensive evaluation model; these indices are calculated by the spatial analysis model constructed in ArcGIS’s Model Builder platform.

  6. GIS-based spatial statistical analysis of risk areas for liver flukes in Surin Province of Thailand.

    Science.gov (United States)

    Rujirakul, Ratana; Ueng-arporn, Naporn; Kaewpitoon, Soraya; Loyd, Ryan J; Kaewthani, Sarochinee; Kaewpitoon, Natthawut

    2015-01-01

    It is urgently necessary to be aware of the distribution and risk areas of liver fluke, Opisthorchis viverrini, for proper allocation of prevention and control measures. This study aimed to investigate the human behavior, and environmental factors influencing the distribution in Surin Province of Thailand, and to build a model using stepwise multiple regression analysis with a geographic information system (GIS) on environment and climate data. The relationship between the human behavior, attitudes (R Square=0.878, and, Adjust R Square=0.849. By GIS analysis, we found Si Narong, Sangkha, Phanom Dong Rak, Mueang Surin, Non Narai, Samrong Thap, Chumphon Buri, and Rattanaburi to have the highest distributions in Surin province. In conclusion, the combination of GIS and statistical analysis can help simulate the spatial distribution and risk areas of liver fluke, and thus may be an important tool for future planning of prevention and control measures.

  7. Vector regression introduced

    Directory of Open Access Journals (Sweden)

    Mok Tik

    2014-06-01

    Full Text Available This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable) is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables) and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables) also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to represent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
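
    The idea of encoding vectors as complex numbers and estimating vector-valued coefficients by least squares can be sketched for the simplest case of one explanatory vector variable. The function name and data below are hypothetical, and the sketch solves the normal equations directly in complex arithmetic rather than through the real-valued isomorphic model described in the abstract; it is meant only as an illustration of the principle.

```python
def fit_complex_line(w, z):
    """Least-squares fit of z ~ a + b*w with complex a and b.
    Each complex number encodes a 2-D vector as east + 1j*north."""
    n = len(w)
    w_mean = sum(w) / n
    z_mean = sum(z) / n
    # Complex analogue of Sxx and Sxy; note the conjugate on the predictor.
    sxx = sum(abs(wi - w_mean) ** 2 for wi in w)
    sxy = sum((wi - w_mean).conjugate() * (zi - z_mean) for wi, zi in zip(w, z))
    b = sxy / sxx
    a = z_mean - b * w_mean
    return a, b


# Hypothetical explanatory vectors and a noise-free response built from
# a known offset and a known vector coefficient (illustration only).
true_a, true_b = 1 + 2j, 0.5 - 0.3j
w = [1 + 1j, 2 - 1j, -1 + 0.5j, 0 + 2j]
z = [true_a + true_b * wi for wi in w]

a_hat, b_hat = fit_complex_line(w, z)
```

    The estimated coefficient `b_hat` is itself a complex number, so it rotates and scales the explanatory vector, which is what distinguishes this setup from a model with scalar coefficients.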

  8. Spatial modeling of rat bites and prediction of rat infestation in Peshawar valley using binomial kriging with logistic regression.

    Science.gov (United States)

    Ali, Asad; Zaidi, Farrah; Fatima, Syeda Hira; Adnan, Muhammad; Ullah, Saleem

    2018-03-24

    In this study, we propose to develop a geostatistical computational framework to model the distribution of rat bite infestation of epidemic proportion in Peshawar valley, Pakistan. Two species Rattus norvegicus and Rattus rattus are suspected to spread the infestation. The framework combines strengths of maximum entropy algorithm and binomial kriging with logistic regression to spatially model the distribution of infestation and to determine the individual role of environmental predictors in modeling the distribution trends. Our results demonstrate the significance of a number of social and environmental factors in rat infestations such as (I) high human population density; (II) greater dispersal ability of rodents due to the availability of better connectivity routes such as roads, and (III) temperature and precipitation influencing rodent fecundity and life cycle.

  9. Robust spinal cord resting-state fMRI using independent component analysis-based nuisance regression noise reduction.

    Science.gov (United States)

    Hu, Yong; Jin, Richu; Li, Guangsheng; Luk, Keith Dk; Wu, Ed X

    2018-04-16

    Physiological noise reduction plays a critical role in spinal cord (SC) resting-state fMRI (rsfMRI). To reduce physiological noise and increase the robustness of SC rsfMRI by using an independent component analysis (ICA)-based nuisance regression (ICANR) method. Retrospective. Ten healthy subjects (female/male = 4/6, age = 27 ± 3 years, range 24-34 years). 3T/gradient-echo echo planar imaging (EPI). We used three alternative methods (no regression [Nil], conventional region of interest [ROI]-based noise reduction method without ICA [ROI-based], and correction of structured noise using spatial independent component analysis [CORSICA]) to compare with the performance of ICANR. Reduction of the influence of physiological noise on the SC and the reproducibility of rsfMRI analysis after noise reduction were examined. The correlation coefficient (CC) was calculated to assess the influence of physiological noise. Reproducibility was calculated by intraclass correlation (ICC). Results from different methods were compared by one-way analysis of variance (ANOVA) with post-hoc analysis. No significant differences in cerebrospinal fluid (CSF) pulsation influence or tissue motion influence were found (P = 0.223 in CSF, P = 0.2461 in tissue motion) between the ROI-based (CSF: 0.122 ± 0.020; tissue motion: 0.112 ± 0.015) and Nil (CSF: 0.134 ± 0.026; tissue motion: 0.124 ± 0.019) methods. CORSICA showed a significantly stronger influence of CSF pulsation and tissue motion (CSF: 0.166 ± 0.045, P = 0.048; tissue motion: 0.160 ± 0.032, P = 0.048) than Nil. ICANR showed a significantly weaker influence of CSF pulsation and tissue motion (CSF: 0.076 ± 0.007, P = 0.0003; tissue motion: 0.081 ± 0.014, P = 0.0182) than Nil. The ICC values in the Nil, ROI-based, CORSICA, and ICANR were 0.669, 0.645, 0.561, and 0.766, respectively. ICANR more effectively reduced physiological noise from both tissue motion and CSF pulsation than the three alternative methods. ICANR increases the robustness of SC rsf

  10. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    Science.gov (United States)

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R² = 0.58 vs. 0.55) or a cross-validation procedure (R² = 0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  11. Spatially explicit spectral analysis of point clouds and geospatial data

    Science.gov (United States)

    Buscombe, Daniel D.

    2015-01-01

    The increasing use of spatially explicit analyses of high-resolution spatially distributed data (imagery and point clouds) for the purposes of characterising spatial heterogeneity in geophysical phenomena necessitates the development of custom analytical and computational tools. In recent years, such analyses have become the basis of, for example, automated texture characterisation and segmentation, roughness and grain size calculation, and feature detection and classification, from a variety of data types. In this work, much use has been made of statistical descriptors of localised spatial variations in amplitude variance (roughness); however, the horizontal scale (wavelength) and spacing of roughness elements is rarely considered. This is despite the fact that the ratio of characteristic vertical to horizontal scales is not constant and can yield important information about physical scaling relationships. Spectral analysis is a hitherto under-utilised but powerful means to acquire statistical information about relevant amplitude and wavelength scales, simultaneously and with computational efficiency. Further, quantifying spatially distributed data in the frequency domain lends itself to the development of stochastic models for probing the underlying mechanisms which govern the spatial distribution of geological and geophysical phenomena. The software package PySESA (Python program for Spatially Explicit Spectral Analysis) has been developed for generic analyses of spatially distributed data in both the spatial and frequency domains. Developed predominantly in Python, it accesses libraries written in Cython and C++ for efficiency. It is open source and modular, therefore readily incorporated into, and combined with, other data analysis tools and frameworks with particular utility for supporting research in the fields of geomorphology, geophysics, hydrography, photogrammetry and remote sensing. The analytical and computational structure of the toolbox is

  12. Spatially explicit spectral analysis of point clouds and geospatial data

    Science.gov (United States)

    Buscombe, Daniel

    2016-01-01

    The increasing use of spatially explicit analyses of high-resolution spatially distributed data (imagery and point clouds) for the purposes of characterising spatial heterogeneity in geophysical phenomena necessitates the development of custom analytical and computational tools. In recent years, such analyses have become the basis of, for example, automated texture characterisation and segmentation, roughness and grain size calculation, and feature detection and classification, from a variety of data types. In this work, much use has been made of statistical descriptors of localised spatial variations in amplitude variance (roughness); however, the horizontal scale (wavelength) and spacing of roughness elements is rarely considered. This is despite the fact that the ratio of characteristic vertical to horizontal scales is not constant and can yield important information about physical scaling relationships. Spectral analysis is a hitherto under-utilised but powerful means to acquire statistical information about relevant amplitude and wavelength scales, simultaneously and with computational efficiency. Further, quantifying spatially distributed data in the frequency domain lends itself to the development of stochastic models for probing the underlying mechanisms which govern the spatial distribution of geological and geophysical phenomena. The software package PySESA (Python program for Spatially Explicit Spectral Analysis) has been developed for generic analyses of spatially distributed data in both the spatial and frequency domains. Developed predominantly in Python, it accesses libraries written in Cython and C++ for efficiency. It is open source and modular, therefore readily incorporated into, and combined with, other data analysis tools and frameworks with particular utility for supporting research in the fields of geomorphology, geophysics, hydrography, photogrammetry and remote sensing. The analytical and computational structure of the toolbox is described
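
    The point that spectral analysis yields amplitude and wavelength scales simultaneously can be illustrated with a short, self-contained sketch. It applies a direct discrete Fourier transform to a synthetic one-dimensional elevation profile. This is not the PySESA API; the function name, the profile and the sampling interval are all hypothetical.

```python
import cmath
import math


def amplitude_spectrum(profile, spacing):
    """Return (wavelengths, amplitudes) for the positive frequencies of an
    evenly spaced 1-D profile, using a direct DFT on the mean-removed data.
    The Nyquist term is skipped so the factor 2/n applies to every entry."""
    n = len(profile)
    mean = sum(profile) / n
    centred = [v - mean for v in profile]
    wavelengths, amplitudes = [], []
    for k in range(1, n // 2):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * i / n)
            for i, v in enumerate(centred)
        )
        wavelengths.append(n * spacing / k)
        amplitudes.append(2 * abs(coeff) / n)
    return wavelengths, amplitudes


# Hypothetical profile: ripples of 0.05 m amplitude and 0.4 m wavelength,
# sampled every 0.05 m over 64 points (exactly 8 cycles).
spacing = 0.05
profile = [0.05 * math.sin(2 * math.pi * i * spacing / 0.4) for i in range(64)]

wavelengths, amplitudes = amplitude_spectrum(profile, spacing)
peak = amplitudes.index(max(amplitudes))
dominant_wavelength = wavelengths[peak]
dominant_amplitude = amplitudes[peak]
```

    A variance-only roughness statistic would recover the vertical scale of this profile but not the 0.4 m horizontal scale, which the spectrum returns directly.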

  13. Application of Fourier analysis to multispectral/spatial recognition

    Science.gov (United States)

    Hornung, R. J.; Smith, J. A.

    1973-01-01

    One approach for investigating spectral response from materials is to consider spatial features of the response. This might be accomplished by considering the Fourier spectrum of the spatial response. The Fourier Transform may be used in a one-dimensional to multidimensional analysis of more than one channel of data. The two-dimensional transform represents the Fraunhofer diffraction pattern of the image in optics and has certain invariant features. Physically the diffraction pattern contains spatial features which are possibly unique to a given configuration or classification type. Different sampling strategies may be used to either enhance geometrical differences or extract additional features.

  14. [Analysis of Cr in soil by LIBS based on conical spatial confinement of plasma].

    Science.gov (United States)

    Lin, Yong-Zeng; Yao, Ming-Yin; Chen, Tian-Bing; Li, Wen-Bing; Zheng, Mei-Lan; Xu, Xue-Hong; Tu, Jian-Ping; Liu, Mu-Hua

    2013-11-01

    The present study aims to improve the sensitivity of detection and reduce the limit of detection in detecting heavy metals in soil by laser induced breakdown spectroscopy (LIBS). The Cr element of national standard soil was taken as the research object. In the experiment, a conical cavity with a small-diameter end of 20 mm and a large-diameter end of 45 mm was installed below the focusing lens near the sample, mainly to confine the signal transmitted by the plasma and to some extent to confine the plasma itself in the LIBS setup. In detecting Cr I 425.44 nm, the best delay time obtained from the experiment is 1.3 μs, and the relative standard deviation is below 10%. Compared with the setup without spatial confinement, the spectral intensity of Cr in the soil sample was enhanced by more than 7%. A calibration curve was established in the Cr concentration range from 60 to 400 μg·g⁻¹. Under the condition of spatial confinement, the linear regression coefficient and the limit of detection were 0.99771 and 18.85 μg·g⁻¹ respectively, whereas they were 0.99122 and 36.99 μg·g⁻¹ without spatial confinement. This shows that conical spatial confinement can improve the sensitivity of detection and enhance the spectral intensity, and that it is a useful auxiliary technique for detecting Cr in soil by laser induced breakdown spectroscopy.
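
    A calibration curve and limit of detection of the kind reported above can be sketched as follows. The abstract does not say how its limit of detection was calculated; the sketch assumes the common 3σ/slope convention, and the intensities, the blank standard deviation and the function names are all hypothetical values chosen for illustration.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares line through (x, y).
    Returns (slope, intercept, correlation coefficient)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / math.sqrt(sxx * syy)


def limit_of_detection(blank_sd, slope):
    """Assumed convention: three blank standard deviations over the slope."""
    return 3 * blank_sd / slope


# Hypothetical Cr concentrations (ug/g) and line intensities (arbitrary units).
concentration = [60, 100, 200, 300, 400]
intensity = [2 * c + 50 for c in concentration]

slope, intercept, r = fit_line(concentration, intensity)
lod = limit_of_detection(blank_sd=12.0, slope=slope)
```

    Under this convention, a steeper calibration slope or a quieter background both lower the limit of detection, which is consistent with the direction of the improvement the abstract attributes to spatial confinement.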

  15. Repeated Results Analysis for Middleware Regression Benchmarking

    Czech Academy of Sciences Publication Activity Database

    Bulej, Lubomír; Kalibera, T.; Tůma, P.

    2005-01-01

    Vol. 60 (2005), pp. 345-358. ISSN 0166-5316. R&D Projects: GA ČR GA102/03/0672. Institutional research plan: CEZ:AV0Z10300504. Keywords: middleware benchmarking * regression benchmarking * regression testing. Subject RIV: JD - Computer Applications, Robotics. Impact factor: 0.756, year: 2005

  16. Spatial audio quality perception (part 2)

    DEFF Research Database (Denmark)

    Conetta, R.; Brookes, T.; Rumsey, F.

    2015-01-01

    location, envelopment, coverage angle, ensemble width, and spaciousness. They can also impact timbre, and changes to timbre can then influence spatial perception. Previously obtained data was used to build a regression model of perceived spatial audio quality in terms of spatial and timbral metrics...

  17. Bayesian Analysis for Penalized Spline Regression Using WinBUGS

    Directory of Open Access Journals (Sweden)

    Ciprian M. Crainiceanu

    2005-09-01

    Full Text Available. Penalized splines can be viewed as BLUPs in a mixed model framework, which allows the use of mixed model software for smoothing. Thus, software originally developed for Bayesian analysis of mixed models can be used for penalized spline regression. Bayesian inference for nonparametric models enjoys the flexibility of nonparametric models and the exact inference provided by the Bayesian inferential machinery. This paper provides a simple, yet comprehensive, set of programs for the implementation of nonparametric Bayesian analysis in WinBUGS. Good mixing properties of the MCMC chains are obtained by using low-rank thin-plate splines, while simulation times per iteration are reduced by employing WinBUGS-specific computational tricks.

  18. Forecasting municipal solid waste generation using prognostic tools and regression analysis.

    Science.gov (United States)

    Ghinea, Cristina; Drăgoi, Elena Niculina; Comăniţă, Elena-Diana; Gavrilescu, Marius; Câmpean, Teofil; Curteanu, Silvia; Gavrilescu, Maria

    2016-11-01

    For adequate planning of waste management systems, accurate forecasting of waste generation is an essential step, since various factors can affect waste trends. Predictive and prognostic models are useful tools that provide reliable support for decision-making processes. In this paper, indicators such as number of residents, population age, urban life expectancy, and total municipal solid waste were used as input variables in prognostic models to predict the amount of solid waste fractions. We applied the Waste Prognostic Tool, regression analysis, and time series analysis to forecast municipal solid waste generation and composition, taking Iasi, Romania, as a case study. Regression equations were determined for six solid waste fractions (paper, plastic, metal, glass, biodegradable and other waste). Accuracy measures were calculated, and the results showed that the S-curve trend model is the most suitable for municipal solid waste (MSW) prediction. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Copula Regression Analysis of Simultaneously Recorded Frontal Eye Field and Inferotemporal Spiking Activity during Object-Based Working Memory

    Science.gov (United States)

    Hu, Meng; Clark, Kelsey L.; Gong, Xiajing; Noudoost, Behrad; Li, Mingyao; Moore, Tirin

    2015-01-01

    Inferotemporal (IT) neurons are known to exhibit persistent, stimulus-selective activity during the delay period of object-based working memory tasks. Frontal eye field (FEF) neurons show robust, spatially selective delay period activity during memory-guided saccade tasks. We present a copula regression paradigm to examine neural interaction of these two types of signals between areas IT and FEF of the monkey during a working memory task. This paradigm is based on copula models that can account for both marginal distribution over spiking activity of individual neurons within each area and joint distribution over ensemble activity of neurons between areas. Considering the popular GLMs as marginal models, we developed a general and flexible likelihood framework that uses the copula to integrate separate GLMs into a joint regression analysis. Such joint analysis essentially leads to a multivariate analog of the marginal GLM theory and hence efficient model estimation. In addition, we show that Granger causality between spike trains can be readily assessed via the likelihood ratio statistic. The performance of this method is validated by extensive simulations, and compared favorably to the widely used GLMs. When applied to spiking activity of simultaneously recorded FEF and IT neurons during working memory task, we observed significant Granger causality influence from FEF to IT, but not in the opposite direction, suggesting the role of the FEF in the selection and retention of visual information during working memory. The copula model has the potential to provide unique neurophysiological insights about network properties of the brain. PMID:26063909

  20. On Bayesian shared component disease mapping and ecological regression with errors in covariates.

    Science.gov (United States)

    MacNab, Ying C

    2010-05-20

    Recent literature on Bayesian disease mapping presents shared component models (SCMs) for joint spatial modeling of two or more diseases with common risk factors. In this study, Bayesian hierarchical formulations of shared component disease mapping and ecological models are explored and developed in the context of ecological regression, taking into consideration errors in covariates. A review of multivariate disease mapping models (MultiVMs) such as the multivariate conditional autoregressive models that are also part of the more recent Bayesian disease mapping literature is presented. Some insights into the connections and distinctions between the SCM and MultiVM procedures are communicated. Important issues surrounding (appropriate) formulation of shared- and disease-specific components, consideration/choice of spatial or non-spatial random effects priors, and identification of model parameters in SCMs are explored and discussed in the context of spatial and ecological analysis of small area multivariate disease or health outcome rates and associated ecological risk factors. The methods are illustrated through an in-depth analysis of four-variate road traffic accident injury (RTAI) data: gender-specific fatal and non-fatal RTAI rates in 84 local health areas in British Columbia (Canada). Fully Bayesian inference via Markov chain Monte Carlo simulations is presented. Copyright 2010 John Wiley & Sons, Ltd.

  1. Development of Compressive Failure Strength for Composite Laminate Using Regression Analysis Method

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Myoung Keon [Agency for Defense Development, Daejeon (Korea, Republic of); Lee, Jeong Won; Yoon, Dong Hyun; Kim, Jae Hoon [Chungnam Nat’l Univ., Daejeon (Korea, Republic of)

    2016-10-15

    This paper provides the compressive failure strength of a composite laminate, developed using regression analysis. The composite material in this document is a carbon/epoxy unidirectional (UD) tape prepreg (Cycom G40-800/5276-1) cured at 350°F (177°C). The operating temperature range is –60°F to +200°F (–55°C to +95°C). A total of 56 compression tests were conducted on specimens from eight (8) distinct laminates that were laid up with standard angle layers (0°, +45°, –45° and 90°). The ASTM-D-6484 standard was used as the test method. The regression analysis was performed with the laminate ultimate fracture strength as the response variable and two ply orientations (0° and ±45°) as the regressor variables.

  2. Development of Compressive Failure Strength for Composite Laminate Using Regression Analysis Method

    International Nuclear Information System (INIS)

    Lee, Myoung Keon; Lee, Jeong Won; Yoon, Dong Hyun; Kim, Jae Hoon

    2016-01-01

    This paper provides the compressive failure strength of a composite laminate, developed using regression analysis. The composite material in this document is a carbon/epoxy unidirectional (UD) tape prepreg (Cycom G40-800/5276-1) cured at 350°F (177°C). The operating temperature range is –60°F to +200°F (–55°C to +95°C). A total of 56 compression tests were conducted on specimens from eight (8) distinct laminates that were laid up with standard angle layers (0°, +45°, –45° and 90°). The ASTM-D-6484 standard was used as the test method. The regression analysis was performed with the laminate ultimate fracture strength as the response variable and two ply orientations (0° and ±45°) as the regressor variables.

  3. Standards for Standardized Logistic Regression Coefficients

    Science.gov (United States)

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  4. Detecting the Land-Cover Changes Induced by Large-Physical Disturbances Using Landscape Metrics, Spatial Sampling, Simulation and Spatial Analysis

    Directory of Open Access Journals (Sweden)

    Hone-Jay Chu

    2009-08-01

    Full Text Available. The objectives of the study are to integrate conditional Latin Hypercube Sampling (cLHS), sequential Gaussian simulation (SGS) and spatial analysis of remotely sensed images to monitor the effects of large chronological disturbances on the spatial characteristics of landscape changes, including spatial heterogeneity and variability. The multiple NDVI images demonstrate that spatial patterns of disturbed landscapes were successfully delineated by spatial analysis tools such as the variogram, Moran's I and landscape metrics in the study area. The hybrid method delineates the spatial patterns and spatial variability of landscapes caused by these large disturbances. The cLHS approach is applied to select samples from Normalized Difference Vegetation Index (NDVI) images derived from SPOT HRV images of the Chenyulan watershed of Taiwan, and then SGS with sufficient samples is used to generate maps of NDVI images. Finally, the simulated NDVI maps are verified using indexes such as the correlation coefficient and mean absolute error (MAE). The statistics and spatial structures of the multiple NDVI images behave very robustly, which supports the use of the index for quantifying landscape spatial patterns and land cover change. In addition, the results, transferred by Open Geospatial techniques, can be accessed from web-based and end-user applications for watershed management.

  5. The Czech Pirate Party in the 2010 and 2013 Parliamentary Elections and the 2014 European Parliament Elections: Spatial Analysis of Voter Support

    Directory of Open Access Journals (Sweden)

    Maškarinec Pavel

    2017-01-01

    Full Text Available. The paper presents a spatial analysis of voter support for the Czech Pirate Party (Pirates) in the 2010 and 2013 parliamentary elections and the 2014 European Parliament elections. The main methods applied for classifying electoral results were spatial autocorrelation and spatial regression. The analysis shows that territorial support for the Pirates largely mirrors the areas of high support for right-wing parties and, at the same time, areas with high development potential. In terms of spatial characteristics, support for the Pirates was low in Moravia and higher in the Sudetenland. In addition to spatial regimes, inter-regional support for the Pirates was also influenced by non-spatial characteristics, although their influence was relatively weak. The units that provided a favorable environment for voting for the Pirates were characterized in particular by greater urbanization and a greater number of entrepreneurs, while a lack of jobs and an older age structure, i.e. the features that define peripheral areas in the socio-economic or socio-ecological sense, negatively affected the Pirates' gains. College-educated inhabitants had an ambiguous influence: in the 2010 and 2013 parliamentary elections they decreased the Pirates' gains, whereas in the 2014 European Parliament elections the direction of the relationship turned positive.

  6. Sex differences in visual-spatial working memory: A meta-analysis.

    Science.gov (United States)

    Voyer, Daniel; Voyer, Susan D; Saint-Aubin, Jean

    2017-04-01

    Visual-spatial working memory measures are widely used in clinical and experimental settings. Furthermore, it has been argued that the male advantage in spatial abilities can be explained by a sex difference in visual-spatial working memory. Therefore, sex differences in visual-spatial working memory have important implications for research, theory, and practice, but they have yet to be quantified. The present meta-analysis quantified the magnitude of sex differences in visual-spatial working memory and examined variables that might moderate them. The analysis used a set of 180 effect sizes from healthy males and females drawn from 98 samples ranging in mean age from 3 to 86 years. Multilevel meta-analysis was used on the overall data set to account for non-independent effect sizes. The data also were analyzed in separate task subgroups by means of multilevel and mixed-effects models. Results showed a small but significant male advantage (mean d = 0.155, 95% confidence interval = 0.087-0.223). All the tasks produced a male advantage, except for memory for location, where a female advantage emerged. Age of the participants was a significant moderator, indicating that sex differences in visual-spatial working memory appeared first in the 13-17 years age group. Removing memory for location tasks from the sample affected the pattern of significant moderators. The present results indicate a male advantage in visual-spatial working memory, although age and specific task modulate the magnitude and direction of the effects. Implications for clinical applications, cognitive model building, and experimental research are discussed.

  7. CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions

    Science.gov (United States)

    Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.

  8. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis.

    Science.gov (United States)

    Lo, Benjamin W Y; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A H

    2016-01-01

    Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56-2.45, P tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH.

  9. Comparison of height-diameter models based on geographically weighted regressions and linear mixed modelling applied to large scale forest inventory data

    Energy Technology Data Exchange (ETDEWEB)

    Quirós Segovia, M.; Condés Ruiz, S.; Drápela, K.

    2016-07-01

    Aim of the study: The main objective of this study was to test Geographically Weighted Regression (GWR) for developing height-diameter curves for forests on a large scale and to compare it with Linear Mixed Models (LMM). Area of study: Monospecific stands of Pinus halepensis Mill. located in the region of Murcia (Southeast Spain). Materials and Methods: The dataset consisted of 230 sample plots (2582 trees) from the Third Spanish National Forest Inventory (SNFI) randomly split into training data (152 plots) and validation data (78 plots). Two different methodologies were used for modelling local (Petterson) and generalized height-diameter relationships (Cañadas I): GWR, with different bandwidths, and linear mixed models. Finally, the quality of the estimated models was compared through statistical analysis. Main results: In general, both LMM and GWR provide better prediction capability when applied to a generalized height-diameter function than when applied to a local one, with R² values increasing from around 0.6 to 0.7 in the model validation. Bias and RMSE were also lower for the generalized function. However, error analysis showed that there were no large differences between these two methodologies, evidencing that GWR provides results which are as good as the more frequently used LMM methodology, at least when no additional measurements are available for calibrating. Research highlights: GWR is a type of spatial analysis for exploring spatially heterogeneous processes. GWR can model spatial variation in tree height-diameter relationship and its regression quality is comparable to LMM. The advantage of GWR over LMM is the possibility to determine the spatial location of every parameter without additional measurements. Abbreviations: GWR (Geographically Weighted Regression); LMM (Linear Mixed Model); SNFI (Spanish National Forest Inventory). (Author)
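
    The core idea of GWR can be sketched as a separate weighted least-squares fit at each location, with observations down-weighted by their distance from that location. This is an illustration only, not the authors' height-diameter models: the Gaussian kernel, the fixed bandwidth, the single predictor, and all coordinates and values below are assumptions, and `local_fit` is a hypothetical helper name.

```python
import math


def local_fit(target, coords, x, y, bandwidth):
    """Weighted least-squares line y = a + b * x centred on `target`.

    Each observation is weighted by a Gaussian kernel of its distance to
    `target`, so nearby observations dominate the local coefficients.
    Returns (intercept, slope).
    """
    weights = []
    for cx, cy in coords:
        distance = math.hypot(cx - target[0], cy - target[1])
        weights.append(math.exp(-0.5 * (distance / bandwidth) ** 2))

    total = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mean_x) * (yi - mean_y)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: two distant clusters with different relationships.
coords = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [1, 2, 3, 3, 6, 9]  # slope 1 in the west cluster, slope 3 in the east

west = local_fit((1, 0), coords, x, y, bandwidth=5.0)
east = local_fit((101, 0), coords, x, y, bandwidth=5.0)
```

    A single global regression would average the two relationships, whereas the local fits recover a slope near 1 in the west and near 3 in the east. The bandwidth controls how quickly the influence of distant observations fades, which is why the study compares several bandwidths.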

  10. Spatial analysis of hemorrhagic fever with renal syndrome in China

    Directory of Open Access Journals (Sweden)

    Yang Hong

    2006-04-01

    Full Text Available. Abstract. Background: Hemorrhagic fever with renal syndrome (HFRS) is endemic in many provinces of mainland China, with high incidence, although integrated intervention measures including rodent control, environment management and vaccination have been implemented for over ten years. In this study, we conducted a geographic information system (GIS)-based spatial analysis of the distribution of HFRS cases for the whole country, with the objective of informing priority areas for public health planning and resource allocation. Methods: Annualized average incidence at the county level was calculated using HFRS cases reported during 1994–1998 in mainland China. GIS-based spatial analyses were conducted to detect spatial autocorrelation and clusters of HFRS incidence at the county level throughout the country. Results: The spatial distribution of HFRS cases in mainland China from 1994 to 1998 was mapped at the county level in terms of crude incidence, excess hazard and spatially smoothed incidence. The spatial distribution of HFRS cases was nonrandom and clustered, with a Moran's I = 0.5044 (p = 0.001). Spatial cluster analyses suggested that 26 and 39 areas were at increased risks of HFRS (p …) Conclusion: The application of GIS, together with spatial statistical techniques, provides a means to quantify explicit HFRS risks and to further identify environmental factors responsible for the increasing disease risks. We demonstrate a new perspective of integrating such spatial analysis tools into the epidemiologic study and risk assessment of HFRS.
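
    Global Moran's I, the autocorrelation statistic reported above, can be sketched as follows. The four-area chain and its binary adjacency weights are hypothetical and unrelated to the county data; the record does not describe its weights matrix or how the p-value was obtained, so neither is reproduced here.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a spatial weights matrix.

    I = (n / S0) * sum_ij w_ij * z_i * z_j / sum_i z_i**2,
    where z_i are deviations from the mean and S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical layout: four areas in a row, each adjacent to its neighbours.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]

clustered = morans_i([1, 2, 3, 4], chain)    # similar values sit together
alternating = morans_i([1, 4, 1, 4], chain)  # neighbours are dissimilar
```

    Here the smoothly increasing values give I = 1/3 and the alternating values give I = -1. Positive values, like the 0.5044 reported in the abstract, indicate that neighbouring areas tend to have similar incidence.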

  11. Multi-spatial analysis of aeolian dune-field patterns

    Science.gov (United States)

    Ewing, Ryan C.; McDonald, George D.; Hayes, Alex G.

    2015-07-01

    Aeolian dune-fields are composed of different spatial scales of bedform patterns that respond to changes in environmental boundary conditions over a wide range of time scales. This study examines how variations in spatial scales of dune and ripple patterns found within dune fields are used in environmental reconstructions on Earth, Mars and Titan. Within a single bedform type, different spatial scales of bedforms emerge as a pattern evolves from an initial state into a well-organized pattern, such as with the transition from protodunes to dunes. Additionally, different types of bedforms, such as ripples, coarse-grained ripples and dunes, coexist at different spatial scales within a dune-field. Analysis of dune-field patterns at the intersection of different scales and types of bedforms at different stages of development provides a more comprehensive record of sediment supply and wind regime than analysis of a single scale and type of bedform. Interpretations of environmental conditions from any scale of bedform, however, are limited to environmental signals associated with the response time of that bedform. Large-scale dune-field patterns integrate signals over long-term climate cycles and reveal little about short-term variations in wind or sediment supply. Wind ripples respond instantly to changing conditions, but reveal little about longer-term variations in wind or sediment supply. Recognizing the response time scales across different spatial scales of bedforms maximizes environmental interpretations from dune-field patterns.

  12. Incorporating twitter-based human activity information in spatial analysis of crashes in urban areas.

    Science.gov (United States)

    Bao, Jie; Liu, Pan; Yu, Hao; Xu, Chengcheng

    2017-09-01

    The primary objective of this study was to investigate how to incorporate human activity information in spatial analysis of crashes in urban areas using Twitter check-in data. This study used the data collected from the City of Los Angeles in the United States to illustrate the procedure. The following five types of data were collected: crash data, human activity data, traditional traffic exposure variables, road network attributes and social-demographic data. A Python web crawler was developed to collect the venue type information from the Twitter check-in data automatically. The human activities were classified into seven categories by the obtained venue types. The collected data were aggregated into 896 Traffic Analysis Zones (TAZ). Geographically weighted regression (GWR) models were developed to establish a relationship between the crash counts reported in a TAZ and various contributing factors. Comparative analyses were conducted to compare the performance of GWR models which considered traditional traffic exposure variables only, Twitter-based human activity variables only, and both traditional traffic exposure and Twitter-based human activity variables. The model specification results suggested that human activity variables significantly affected the crash counts in a TAZ. The results of comparative analyses suggested that the models which considered both traditional traffic exposure and human activity variables had the best goodness-of-fit in terms of the highest R² and lowest AICc values. The finding seems to confirm the benefits of incorporating human activity information in spatial analysis of crashes using Twitter check-in data. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. Inferring gene expression dynamics via functional regression analysis

    Directory of Open Access Journals (Sweden)

    Leng Xiaoyan

    2008-01-01

    Full Text Available. Abstract. Background: Temporal gene expression profiles characterize the time-dynamics of expression of specific genes and are increasingly collected in current gene expression experiments. In the analysis of experiments where gene expression is obtained over the life cycle, it is of interest to relate temporal patterns of gene expression associated with different developmental stages to each other to study patterns of long-term developmental gene regulation. We use tools from functional data analysis to study dynamic changes by relating temporal gene expression profiles of different developmental stages to each other. Results: We demonstrate that functional regression methodology can pinpoint relationships that exist between temporal gene expression profiles for different life cycle phases and incorporates dimension reduction as needed for these high-dimensional data. By applying these tools, gene expression profiles for pupa and adult phases are found to be strongly related to the profiles of the same genes obtained during the embryo phase. Moreover, one can distinguish between gene groups that exhibit relationships with positive and others with negative associations between later life and embryonal expression profiles. Specifically, we find a positive relationship in expression for muscle development related genes, and a negative relationship for strictly maternal genes for Drosophila, using temporal gene expression profiles. Conclusion: Our findings point to specific reactivation patterns of gene expression during the Drosophila life cycle which differ in characteristic ways between various gene groups. Functional regression emerges as a useful tool for relating gene expression patterns from different developmental stages, and avoids the problems with large numbers of parameters and multiple testing that affect alternative approaches.

  14. A comparative study on generating simulated Landsat NDVI images using data fusion and regression method-the case of the Korean Peninsula.

    Science.gov (United States)

    Lee, Mi Hee; Lee, Soo Bong; Eo, Yang Dam; Kim, Sun Woong; Woo, Jung-Hun; Han, Soo Hee

    2017-07-01

    Landsat optical images have enough spatial and spectral resolution to analyze vegetation growth characteristics. However, clouds and water vapor often degrade image quality, which limits the availability of usable images for time series measurement of vegetation vitality. To overcome this shortcoming, simulated images are used as an alternative. In this study, the weighted average method, the spatial and temporal adaptive reflectance fusion model (STARFM) method, and the multilinear regression analysis method were tested for producing simulated Landsat normalized difference vegetation index (NDVI) images of the Korean Peninsula. The test results showed that the weighted average method produced the images most similar to the actual images, provided that images were available within 1 month before and after the target date. The STARFM method gives good results when the input image date is close to the target date. Careful regional and seasonal consideration is required in selecting input images. During the summer season, clouds make it very difficult to obtain images close enough to the target date. Multilinear regression analysis gives meaningful results even when the input image date is not so close to the target date. Average R² values for the weighted average method, STARFM, and multilinear regression analysis were 0.741, 0.70, and 0.61, respectively.

  15. Survival analysis II: Cox regression

    NARCIS (Netherlands)

    Stel, Vianda S.; Dekker, Friedo W.; Tripepi, Giovanni; Zoccali, Carmine; Jager, Kitty J.

    2011-01-01

    In contrast to the Kaplan-Meier method, Cox proportional hazards regression can provide an effect estimate by quantifying the difference in survival between patient groups and can adjust for confounding effects of other variables. The purpose of this article is to explain the basic concepts of the

  16. Use of generalized ordered logistic regression for the analysis of multidrug resistance data.

    Science.gov (United States)

    Agga, Getahun E; Scott, H Morgan

    2015-10-01

    Statistical analysis of antimicrobial resistance data largely focuses on individual antimicrobial's binary outcome (susceptible or resistant). However, bacteria are becoming increasingly multidrug resistant (MDR). Statistical analysis of MDR data is mostly descriptive often with tabular or graphical presentations. Here we report the applicability of generalized ordinal logistic regression model for the analysis of MDR data. A total of 1,152 Escherichia coli, isolated from the feces of weaned pigs experimentally supplemented with chlortetracycline (CTC) and copper, were tested for susceptibilities against 15 antimicrobials and were binary classified into resistant or susceptible. The 15 antimicrobial agents tested were grouped into eight different antimicrobial classes. We defined MDR as the number of antimicrobial classes to which E. coli isolates were resistant ranging from 0 to 8. Proportionality of the odds assumption of the ordinal logistic regression model was violated only for the effect of treatment period (pre-treatment, during-treatment and post-treatment); but not for the effect of CTC or copper supplementation. Subsequently, a partially constrained generalized ordinal logistic model was built that allows for the effect of treatment period to vary while constraining the effects of treatment (CTC and copper supplementation) to be constant across the levels of MDR classes. Copper (Proportional Odds Ratio [Prop OR]=1.03; 95% CI=0.73-1.47) and CTC (Prop OR=1.1; 95% CI=0.78-1.56) supplementation were not significantly associated with the level of MDR adjusted for the effect of treatment period. MDR generally declined over the trial period. In conclusion, generalized ordered logistic regression can be used for the analysis of ordinal data such as MDR data when the proportionality assumptions for ordered logistic regression are violated. Published by Elsevier B.V.

  17. Regression analysis of growth responses to water depth in three wetland plant species

    DEFF Research Database (Denmark)

    Sorrell, Brian K; Tanner, Chris C; Brix, Hans

    2012-01-01

    depths from 0 – 0.5 m. Morphological and growth responses to depth were followed for 54 days before harvest, and then analysed by repeated measures analysis of covariance, and non-linear and quantile regression analysis (QRA), to compare flooding tolerances. Principal results Growth responses to depth...

  18. A SOCIOLOGICAL ANALYSIS OF THE CHILDBEARING COEFFICIENT IN THE ALTAI REGION BASED ON METHOD OF FUZZY LINEAR REGRESSION

    Directory of Open Access Journals (Sweden)

    Sergei Vladimirovich Varaksin

    2017-06-01

    Full Text Available. Purpose. To construct a mathematical model of the dynamics of childbearing change in the Altai region in 2000–2016 and to analyze the dynamics of changes in birth rates for multiple age categories of women of childbearing age. Methodology. An auxiliary element of the analysis is the construction of linear mathematical models of the dynamics of childbearing using the fuzzy linear regression method, which is based on fuzzy numbers. Fuzzy linear regression is considered as an alternative to standard statistical linear regression for short time series with an unknown distribution law. The parameters of the fuzzy linear and standard statistical regressions for the childbearing time series were determined using an algorithm written in the built-in MatLab language. The method of fuzzy linear regression has not yet been used in sociological research. Results. Conclusions are drawn about socio-demographic changes in society, the high efficiency of the demographic policy of the leadership of the region and the country, and the applicability of the fuzzy linear regression method to sociological analysis.

  19. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    Science.gov (United States)

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…

  20. THE PROGNOSIS OF RUSSIAN DEFENSE INDUSTRY DEVELOPMENT IMPLEMENTED THROUGH REGRESSION ANALYSIS

    Directory of Open Access Journals (Sweden)

    L.M. Kapustina

    2007-03-01

    Full Text Available The article presents the results of an investigation of the major internal and external factors that influence the development of the defense industry, as well as the results of a regression analysis that quantifies the contribution of each factor to the growth rate of the Russian defense industry. On the basis of the calculated regression dependences, the authors produced a medium-term prognosis for the defense industry. Optimistic and inertial versions of the defense product growth rate for the period up to 2009 are based on scenario conditions for the Russian economy worked out by the Ministry of economy and development. In conclusion, the authors point out which factors and conditions have the largest impact on the successful and stable operation of the Russian defense industry.

  1. Global sensitivity analysis for models with spatially dependent outputs

    International Nuclear Information System (INIS)

    Iooss, B.; Marrel, A.; Jullien, M.; Laurent, B.

    2011-01-01

    The global sensitivity analysis of a complex numerical model often calls for the estimation of variance-based importance measures, named Sobol' indices. Meta-model-based techniques have been developed in order to replace the CPU time-expensive computer code with an inexpensive mathematical function, which predicts the computer code output. The common meta-model-based sensitivity analysis methods are well suited for computer codes with scalar outputs. However, in the environmental domain, as in many areas of application, the numerical model outputs are often spatial maps, which may also vary with time. In this paper, we introduce an innovative method to obtain a spatial map of Sobol' indices with a minimal number of numerical model computations. It is based upon the functional decomposition of the spatial output onto a wavelet basis and the meta-modeling of the wavelet coefficients by the Gaussian process. An analytical example is presented to clarify the various steps of our methodology. This technique is then applied to a real hydrogeological case: for each model input variable, a spatial map of Sobol' indices is thus obtained. (authors)
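
    To make the notion of a first-order Sobol' index concrete, here is a minimal sketch using a plain Monte Carlo "pick-freeze" estimator on a toy function. It is not the wavelet and Gaussian-process meta-model approach of the record above; the function `toy_model`, the sample size and the seed are all assumptions chosen for illustration.

```python
import random


def toy_model(x1, x2):
    """Hypothetical additive test function (not taken from the paper)."""
    return x1 + 2.0 * x2


def first_order_sobol(f, n_inputs, n_samples=20000, seed=0):
    """Estimate first-order Sobol' indices for independent U(0, 1) inputs."""
    rng = random.Random(seed)
    # Two independent sample matrices A and B.
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]

    y_a = [f(*row) for row in a]
    mean_a = sum(y_a) / n_samples
    var_a = sum((y - mean_a) ** 2 for y in y_a) / n_samples

    indices = []
    for i in range(n_inputs):
        # "Freeze" input i at its value from A, redraw every other input from B.
        y_mix = [
            f(*[ra[j] if j == i else rb[j] for j in range(n_inputs)])
            for ra, rb in zip(a, b)
        ]
        mean_mix = sum(y_mix) / n_samples
        # Cov(Y_A, Y_mix) estimates Var(E[Y | X_i]).
        cov = sum(p * q for p, q in zip(y_a, y_mix)) / n_samples - mean_a * mean_mix
        indices.append(cov / var_a)
    return indices


s1, s2 = first_order_sobol(toy_model, n_inputs=2)
print(f"S1 ~ {s1:.2f}, S2 ~ {s2:.2f}")
```

    For this toy function the exact indices are 0.2 and 0.8, since the two terms contribute variances of 1/12 and 4/12. A spatial map of indices, as in the record, would repeat such an estimate for each output location or each wavelet coefficient.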

  2. Spatial analysis of agri-environmental policy uptake and expenditure in Scotland.

    Science.gov (United States)

    Yang, Anastasia L; Rounsevell, Mark D A; Wilson, Ronald M; Haggett, Claire

    2014-01-15

    Agri-environment is one of the most widely supported rural development policy measures in Scotland in terms of number of participants and expenditure. It comprises 69 management options and sub-options that are delivered primarily through the competitive 'Rural Priorities scheme'. Understanding the spatial determinants of uptake and expenditure would assist policy-makers in guiding future policy targeting efforts for the rural environment. This study is unique in examining the spatial dependency and determinants of Scotland's agri-environmental measures and categorised options uptake and payments at the parish level. Spatial econometrics is applied to test the influence of 40 explanatory variables on farming characteristics, land capability, designated sites, accessibility and population. Results identified spatial dependency for each of the dependent variables, which supported the use of spatially-explicit models. The goodness of fit of the spatial models was better than for the aspatial regression models. There was also notable improvement in the models for participation compared with the models for expenditure. Furthermore a range of expected explanatory variables were found to be significant and varied according to the dependent variable used. The majority of models for both payment and uptake showed a significant positive relationship with SSSI (Sites of Special Scientific Interest), which are designated sites prioritised in Scottish policy. These results indicate that environmental targeting efforts by the government for AEP uptake in designated sites can be effective. However habitats outside of SSSI, termed here the 'wider countryside' may not be sufficiently competitive to receive funding in the current policy system. Copyright © 2013 Elsevier Ltd. All rights reserved.

  3. Applied linear regression

    CERN Document Server

    Weisberg, Sanford

    2013-01-01

    Praise for the Third Edition "...this is an excellent book which could easily be used as a course text..." -International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus

  4. A Comparison of Advanced Regression Algorithms for Quantifying Urban Land Cover

    Directory of Open Access Journals (Sweden)

    Akpona Okujeni

    2014-07-01

    Full Text Available Quantitative methods for mapping sub-pixel land cover fractions are gaining increasing attention, particularly with regard to upcoming hyperspectral satellite missions. We evaluated five advanced regression algorithms combined with synthetically mixed training data for quantifying urban land cover from HyMap data at 3.6 and 9 m spatial resolution. Methods included support vector regression (SVR), kernel ridge regression (KRR), artificial neural networks (NN), random forest regression (RFR) and partial least squares regression (PLSR). Our experiments demonstrate that both kernel methods SVR and KRR yield high accuracies for mapping complex urban surface types, i.e., rooftops, pavements, grass- and tree-covered areas. SVR and KRR models proved to be stable with regard to the spatial and spectral differences between both images and effectively utilized the higher complexity of the synthetic training mixtures for improving estimates for coarser resolution data. Observed deficiencies mainly relate to known problems arising from spectral similarities or shadowing. The remaining regressors either revealed erratic (NN) or limited (RFR and PLSR) performances when comprehensively mapping urban land cover. Our findings suggest that the combination of kernel-based regression methods, such as SVR and KRR, with synthetically mixed training data is well suited for quantifying urban land cover from imaging spectrometer data at multiple scales.

  5. Multicollinearity in spatial genetics: separating the wheat from the chaff using commonality analyses.

    Science.gov (United States)

    Prunier, J G; Colyn, M; Legendre, X; Nimon, K F; Flamand, M C

    2015-01-01

    Direct gradient analyses in spatial genetics provide unique opportunities to describe the inherent complexity of genetic variation in wildlife species and are the object of many methodological developments. However, multicollinearity among explanatory variables is a systemic issue in multivariate regression analyses and is likely to cause serious difficulties in properly interpreting results of direct gradient analyses, with the risk of erroneous conclusions, misdirected research and inefficient or counterproductive conservation measures. Using simulated data sets along with linear and logistic regressions on distance matrices, we illustrate how commonality analysis (CA), a detailed variance-partitioning procedure that was recently introduced in the field of ecology, can be used to deal with nonindependence among spatial predictors. By decomposing model fit indices into unique and common (or shared) variance components, CA allows identifying the location and magnitude of multicollinearity, revealing spurious correlations and thus thoroughly improving the interpretation of multivariate regressions. Despite a few inherent limitations, especially in the case of resistance model optimization, this review highlights the great potential of CA to account for complex multicollinearity patterns in spatial genetics and identifies future applications and lines of research. We strongly urge spatial geneticists to systematically investigate commonalities when performing direct gradient analyses. © 2014 John Wiley & Sons Ltd.

  6. Spatial patterns of arrests, police assault and addiction treatment center locations in Tijuana, Mexico.

    Science.gov (United States)

    Werb, Dan; Strathdee, Steffanie A; Vera, Alicia; Arredondo, Jaime; Beletsky, Leo; Gonzalez-Zuniga, Patricia; Gaines, Tommi

    2016-07-01

    In the context of a public health-oriented drug policy reform in Mexico, we assessed the spatial distribution of police encounters among people who inject drugs (PWID) in Tijuana, determined the association between these encounters and the location of addiction treatment centers and explored the association between police encounters and treatment access. Geographically weighted regression (GWR) and logistic regression analysis using prospective spatial data from a community-recruited cohort of PWID in Tijuana and official geographical arrest data from the Tijuana Municipal Police Department. Tijuana, Mexico. A total of 608 participants (median age 37; 28.4% female) in the prospective Proyecto El Cuete cohort study recruited between January and December 2011. We compared the mean distance of police encounters and a randomly distributed set of events to treatment centers. GWR was undertaken to model the spatial relationship between police interactions and treatment centers. Logistic regression analysis was used to investigate factors associated with reporting police interactions. During the study period, 27.5% of police encounters occurred within 500 m of treatment centers. The GWR model suggested spatial correlation between encounters and treatment centers (global R² = 0.53). Reporting a need for addiction treatment was associated with reporting arrest and police assault [adjusted odds ratio = 2.74, 95% confidence interval (CI) = 1.25-6.02, P = 0.012]. A geospatial analysis suggests that, in Mexico, people who inject drugs are at greater risk of being a victim of police violence if they consider themselves in need of addiction treatment, and their interactions with police appear to be more frequent around treatment centers. © 2016 Society for the Study of Addiction.
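
    The core idea of GWR, fitting a separate distance-weighted regression at each location, can be sketched in a few lines. The example below is an illustration under stated assumptions, not the study's analysis: the Gaussian kernel, the bandwidth, the single predictor and the synthetic data are all hypothetical.

```python
import math


def gwr_local_fit(coords, x, y, target, bandwidth):
    """Weighted simple regression of y on x, centred on `target`.

    Observations are down-weighted with a Gaussian kernel of their
    distance to `target`. Returns (intercept, slope) at that location.
    """
    weights = [
        math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2) for c in coords
    ]
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Synthetic transect: the x-y relationship is y = x in the west (easting < 5)
# and y = 3x in the east, so the local slope should change along the transect.
coords = [(float(i), 0.0) for i in range(10)]
x = [float(i % 3 + 1) for i in range(10)]
y = [xi * (1.0 if c[0] < 5 else 3.0) for xi, c in zip(x, coords)]

west = gwr_local_fit(coords, x, y, target=(0.0, 0.0), bandwidth=1.0)
east = gwr_local_fit(coords, x, y, target=(9.0, 0.0), bandwidth=1.0)
print("west slope:", round(west[1], 3), "east slope:", round(east[1], 3))
```

    A single global regression on these data would report one compromise slope; the local fits recover a slope close to 1 in the west and close to 3 in the east, which is the kind of spatial non-stationarity GWR is designed to expose.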

  7. Noninvasive spectral imaging of skin chromophores based on multiple regression analysis aided by Monte Carlo simulation

    Science.gov (United States)

    Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa

    2011-08-01

    In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580 and 600nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation for diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments with human skin of the human hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.
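
    The regression step described above (absorbance as the response, chromophore extinction coefficients as predictors) can be sketched as an ordinary least-squares fit. Everything numeric here is hypothetical: the extinction values are made-up placeholders rather than tabulated spectra, the absorbance is synthetic, and the saturation ratio at the end is an assumed illustrative formula, since the record does not state the exact expression or the Monte Carlo conversion vectors.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares(columns, response):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    k = len(columns)
    xtx = [
        [sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
        for i in range(k)
    ]
    xty = [sum(p * r for p, r in zip(columns[i], response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Six wavelengths (500 to 600 nm); HYPOTHETICAL extinction coefficients.
ones = [1.0] * 6
melanin = [1.00, 0.92, 0.85, 0.79, 0.73, 0.68]
oxy_hb = [0.40, 0.45, 1.00, 0.60, 0.95, 0.10]
deoxy_hb = [0.45, 0.60, 0.85, 1.00, 0.70, 0.30]

# Synthetic absorbance spectrum built from known coefficients.
true_coeffs = [0.1, 0.5, 0.3, 0.1]
absorbance = [
    sum(c * v for c, v in zip(true_coeffs, row))
    for row in zip(ones, melanin, oxy_hb, deoxy_hb)
]

a0, a_mel, a_oxy, a_deoxy = least_squares([ones, melanin, oxy_hb, deoxy_hb], absorbance)

# ASSUMED illustrative formula: saturation as the oxygenated share of the
# two hemoglobin coefficients.
saturation = a_oxy / (a_oxy + a_deoxy)
print(round(a_mel, 3), round(a_oxy, 3), round(a_deoxy, 3), round(saturation, 3))
```

    In an imaging setting this fit would be repeated for every pixel, producing one coefficient map per chromophore.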

  8. Multiple Regression Analysis of Unconfined Compression Strength of Mine Tailings Matrices

    Directory of Open Access Journals (Sweden)

    Mahmood Ali A.

    2017-01-01

    Full Text Available As part of a novel approach of sustainable development of mine tailings, experimental and numerical analysis is carried out on newly formulated tailings matrices. Several physical characteristic tests are carried out including the unconfined compression strength test to ascertain the integrity of these matrices when subjected to loading. The current paper attempts a multiple regression analysis of the unconfined compressive strength test results of these matrices to investigate the most pertinent factors affecting their strength. Results of this analysis showed that the suggested equation is reasonably applicable to the range of binder combinations used.

  9. 3D spatially-adaptive canonical correlation analysis: Local and global methods.

    Science.gov (United States)

    Yang, Zhengshi; Zhuang, Xiaowei; Sreenivasan, Karthik; Mishra, Virendra; Curran, Tim; Byrd, Richard; Nandy, Rajesh; Cordes, Dietmar

    2018-04-01

    Local spatially-adaptive canonical correlation analysis (local CCA) with spatial constraints has been introduced to fMRI multivariate analysis for improved modeling of activation patterns. However, current algorithms require complicated spatial constraints that have only been applied to 2D local neighborhoods because the computational time would be exponentially increased if the same method is applied to 3D spatial neighborhoods. In this study, an efficient and accurate line search sequential quadratic programming (SQP) algorithm has been developed to efficiently solve the 3D local CCA problem with spatial constraints. In addition, a spatially-adaptive kernel CCA (KCCA) method is proposed to increase accuracy of fMRI activation maps. With oriented 3D spatial filters anisotropic shapes can be estimated during the KCCA analysis of fMRI time courses. These filters are orientation-adaptive leading to rotational invariance to better match arbitrary oriented fMRI activation patterns, resulting in improved sensitivity of activation detection while significantly reducing spatial blurring artifacts. The kernel method in its basic form does not require any spatial constraints and analyzes the whole-brain fMRI time series to construct an activation map. Finally, we have developed a penalized kernel CCA model that involves spatial low-pass filter constraints to increase the specificity of the method. The kernel CCA methods are compared with the standard univariate method and with two different local CCA methods that were solved by the SQP algorithm. Results show that SQP is the most efficient algorithm to solve the local constrained CCA problem, and the proposed kernel CCA methods outperformed univariate and local CCA methods in detecting activations for both simulated and real fMRI episodic memory data. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. The Regression Analysis of Individual Financial Performance: Evidence from Croatia

    OpenAIRE

    Bahovec, Vlasta; Barbić, Dajana; Palić, Irena

    2017-01-01

    Background: A large body of empirical literature indicates that gender and financial literacy are significant determinants of individual financial performance. Objectives: The purpose of this paper is to recognize the impact of the variable financial literacy and the variable gender on the variation of the financial performance using the regression analysis. Methods/Approach: The survey was conducted using the systematically chosen random sample of Croatian financial consumers. The cross sect...

  11. Do hospitals respond to rivals' quality and efficiency? A spatial panel econometric analysis.

    Science.gov (United States)

    Longo, Francesco; Siciliani, Luigi; Gravelle, Hugh; Santos, Rita

    2017-09-01

    We investigate whether hospitals in the English National Health Service change their quality or efficiency in response to changes in quality or efficiency of neighbouring hospitals. We first provide a theoretical model that predicts that a hospital will not respond to changes in the efficiency of its rivals but may change its quality or efficiency in response to changes in the quality of rivals, though the direction of the response is ambiguous. We use data on eight quality measures (including mortality, emergency readmissions, patient reported outcome, and patient satisfaction) and six efficiency measures (including bed occupancy, cancelled operations, and costs) for public hospitals between 2010/11 and 2013/14 to estimate both spatial cross-sectional and spatial fixed- and random-effects panel data models. We find that although quality and efficiency measures are unconditionally spatially correlated, the spatial regression models suggest that a hospital's quality or efficiency does not respond to its rivals' quality or efficiency, except for a hospital's overall mortality that is positively associated with that of its rivals. The results are robust to allowing for spatially correlated covariates and errors and to instrumenting rivals' quality and efficiency. Copyright © 2017 John Wiley & Sons, Ltd.

  12. Developing and testing a global-scale regression model to quantify mean annual streamflow

    Science.gov (United States)

    Barbarossa, Valerio; Huijbregts, Mark A. J.; Hendriks, A. Jan; Beusen, Arthur H. W.; Clavreul, Julie; King, Henry; Schipper, Aafke M.

    2017-01-01

    Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF based on a dataset unprecedented in size, using observations of discharge and catchment characteristics from 1885 catchments worldwide, measuring between 2 and 10^6 km². In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area and catchment averaged mean annual precipitation and air temperature, slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error (RMSE) values were lower (0.29-0.38 compared to 0.49-0.57) and the modified index of agreement (d) was higher (0.80-0.83 compared to 0.72-0.75). Our regression model can be applied globally to estimate MAF at any point of the river network, thus providing a feasible alternative to spatially explicit process-based global hydrological models.
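
    The two performance measures quoted above can be computed as follows. This is a minimal sketch assuming the common absolute-difference form of the modified index of agreement; the record does not say whether values were transformed (for example, to a log scale) before scoring, and the sample numbers are invented.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between paired observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def modified_index_of_agreement(observed, predicted):
    """Modified index of agreement d (absolute-difference form).

    d = 1 - sum|O - P| / sum(|P - mean(O)| + |O - mean(O)|)
    d = 1 means perfect agreement; lower values mean poorer agreement.
    """
    o_bar = sum(observed) / len(observed)
    numerator = sum(abs(o - p) for o, p in zip(observed, predicted))
    denominator = sum(
        abs(p - o_bar) + abs(o - o_bar) for o, p in zip(observed, predicted)
    )
    return 1.0 - numerator / denominator


# Invented example values, for illustration only.
obs = [1.0, 2.0, 3.0]
pred = [1.1, 1.9, 3.2]
print(rmse(obs, pred), modified_index_of_agreement(obs, pred))
```

    A predictor that always returns the observed mean scores d = 0 under this definition, which gives a useful reference point when reading values such as 0.80 or 0.72.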

  13. A systematic review and meta-regression analysis of mivacurium for tracheal intubation

    NARCIS (Netherlands)

    Vanlinthout, L.E.H.; Mesfin, S.H.; Hens, N.; Vanacker, B.F.; Robertson, E.N.; Booij, L.H.D.J.

    2014-01-01

    We systematically reviewed factors associated with intubation conditions in randomised controlled trials of mivacurium, using random-effects meta-regression analysis. We included 29 studies of 1050 healthy participants. Four factors explained 72.9% of the variation in the probability of excellent

  14. No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.

    Science.gov (United States)

    van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B

    2016-11-24

    Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.

  15. Spatial compression algorithm for the analysis of very large multivariate images

    Science.gov (United States)

    Keenan, Michael R [Albuquerque, NM

    2008-07-15

    A method for spatially compressing data sets enables the efficient analysis of very large multivariate images. The spatial compression algorithms use a wavelet transformation to map an image into a compressed image containing a smaller number of pixels that retain the original image's information content. Image analysis can then be performed on a compressed data matrix consisting of a reduced number of significant wavelet coefficients. Furthermore, a block algorithm can be used for performing common operations more efficiently. The spatial compression algorithms can be combined with spectral compression algorithms to provide further computational efficiencies.
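
    As a rough illustration of wavelet-based spatial compression, the sketch below applies one level of an orthonormal 2-D Haar transform to a single-channel image and keeps only the approximation band, which has a quarter of the pixels. The choice of the Haar wavelet, the single level and the tiny test image are assumptions for illustration; the record does not specify which wavelet or thresholding rule the method uses.

```python
def haar_approximation(image):
    """One level of a 2-D orthonormal Haar transform, approximation band only.

    Each 2x2 block (a, b; c, d) maps to (a + b + c + d) / 2, so an image of
    size R x C becomes (R / 2) x (C / 2).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 2.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def energy(image):
    """Sum of squared pixel values."""
    return sum(v * v for row in image for v in row)


# Hypothetical 4x4 image that is constant within each 2x2 block.
image = [
    [1.0, 1.0, 3.0, 3.0],
    [1.0, 1.0, 3.0, 3.0],
    [5.0, 5.0, 7.0, 7.0],
    [5.0, 5.0, 7.0, 7.0],
]
compressed = haar_approximation(image)
print(compressed, energy(image), energy(compressed))
```

    Because this test image has no variation inside any 2x2 block, all detail coefficients are zero and the approximation band carries the full signal energy. For a multivariate image the same spatial transform would be applied to each spectral channel.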

  16. Estimate the contribution of incubation parameters influence egg hatchability using multiple linear regression analysis.

    Science.gov (United States)

    Khalil, Mohamed H; Shebl, Mostafa K; Kosba, Mohamed A; El-Sabrout, Karim; Zaki, Nesma

    2016-08-01

    This research was conducted to determine the parameters that most affect the hatchability of eggs from indigenous and improved local chickens. Five parameters were studied (fertility, early and late embryonic mortalities, shape index, egg weight, and egg weight loss) on four strains, namely Fayoumi, Alexandria, Matrouh, and Montazah. Multiple linear regression was performed on the studied parameters to determine the one with the greatest influence on hatchability. The results showed significant differences in commercial and scientific hatchability among strains. The Alexandria strain had the highest commercial hatchability (80.70%). Highly significant differences in hatching chick weight among the studied strains were also observed. Using multiple linear regression analysis, fertility made the greatest percent contribution (71.31%) to hatchability, and the lowest percent contributions were made by shape index and egg weight loss. A prediction of hatchability using multiple regression analysis could be a good tool to improve hatchability percentage in chickens.

  17. Examining the Association of Economic Development with Intercity Multimodal Transport Demand in China: A Focus on Spatial Autoregressive Analysis

    Directory of Open Access Journals (Sweden)

    Jinbao Zhao

    2018-02-01

    Full Text Available Transportation is generally perceived as a catalyst for economic development. This has been highlighted in previous studies. However, less attention has been paid to examining the relationship between the economy and transport demand using spatial cross-sectional data, especially for countries with significant regional economic imbalance, like China. In this article, we assess the economic influence of intercity multimodal transport demand at the prefecture level in China. Spatial autoregressive models are used to examine the impact of transport demand on the economy through a detailed analysis of transport modes (land, air, and water) and regions (eastern, central, and western). By contrasting results from the spatial lag model and the spatial error model with those from ordinary least squares, this study finds that the estimation results become more accurate when spatial autocorrelation is controlled for, especially at the national level. The analysis identifies that, except for water passenger traffic, all other intercity transport demands contribute significantly to a city’s economic development level as measured by gross domestic product. In particular, air transport demands are distributed more evenly and are estimated with the highest beta coefficients at both national and regional levels. In addition, the beta coefficients for land, air and water transportation are estimated with different magnitudes and significances at the national and regional levels. This study contributes to the ongoing discussion on the relationship between intercity multimodal transport demand and economic development level. Findings from this paper provide planners with valid and efficient strategies to better develop the economy by leveraging the special “⊣” cluster pattern of economic development and the benefits of air transportation.
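
    Before choosing between ordinary least squares and a spatial lag or spatial error model, analysts commonly check for spatial autocorrelation with Moran's I. The sketch below computes the statistic from a binary contiguity matrix; it illustrates the diagnostic in general, not this study's procedure, and the four-region chain and its values are invented.

```python
def morans_i(values, weights):
    """Moran's I for `values` given a square spatial weights matrix.

    I = (n / S0) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2,
    where m is the mean of the values and S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical chain of four regions; neighbours share an edge.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1.0, 2.0, 3.0, 4.0], w)      # similar values adjoin
alternating = morans_i([1.0, -1.0, 1.0, -1.0], w)  # dissimilar values adjoin
print(clustered, alternating)
```

    Positive values indicate that neighbouring regions resemble each other; when the same pattern appears in regression residuals, least-squares standard errors are unreliable, which is the usual motivation for the spatial models named above.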

  18. Logistic Regression: Concept and Application

    Science.gov (United States)

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  19. Spatial and Inter-temporal Sources of Poverty, Inequality and Gender Disparities in Cameroon: a Regression-Based Decomposition Analysis

    OpenAIRE

    Boniface Ngah Epo; Francis Menjo Baye; Nadine Teme Angele Manga

    2011-01-01

    This study applies the regression-based inequality decomposition technique to explain poverty and inequality trends in Cameroon. We also identify gender related factors which explain income disparities and discrimination based on the 2001 and 2007 Cameroon household consumption surveys. The results show that education, health, employment in the formal sector, age cohorts, household size, gender, ownership of farmland and urban versus rural residence explain household economic wellbeing; dispa...

  20. Deciphering factors controlling groundwater arsenic spatial variability in Bangladesh

    Science.gov (United States)

    Tan, Z.; Yang, Q.; Zheng, C.; Zheng, Y.

    2017-12-01

    Elevated concentrations of geogenic arsenic in groundwater have been found in many countries to exceed 10 μg/L, the WHO's guideline value for drinking water. A common yet unexplained characteristic of groundwater arsenic spatial distribution is the extensive variability at various spatial scales. This study investigates factors influencing the spatial variability of groundwater arsenic in Bangladesh to improve the accuracy of models predicting arsenic exceedance rate spatially. A novel boosted regression tree method is used to establish a weak-learning ensemble model, which is compared to a linear model using a conventional stepwise logistic regression method. The boosted regression tree models offer the advantage of parametric interaction when big datasets are analyzed in comparison to the logistic regression. The point data set (n=3,538) of groundwater hydrochemistry with 19 parameters was obtained by the British Geological Survey in 2001. The spatial data sets of geological parameters (n=13) were from the Consortium for Spatial Information, Technical University of Denmark, University of East Anglia and the FAO, while the soil parameters (n=42) were from the Harmonized World Soil Database. The aforementioned parameters were regressed to categorical groundwater arsenic concentrations below or above three thresholds: 5 μg/L, 10 μg/L and 50 μg/L to identify respective controlling factors. Boosted regression tree method outperformed logistic regression methods in all three threshold levels in terms of accuracy, specificity and sensitivity, resulting in an improvement of spatial distribution map of probability of groundwater arsenic exceeding all three thresholds when compared to disjunctive-kriging interpolated spatial arsenic map using the same groundwater arsenic dataset. Boosted regression tree models also show that the most important controlling factors of groundwater arsenic distribution include groundwater iron content and well depth for all three

  1. A multiple regression analysis for accurate background subtraction in 99Tcm-DTPA renography

    International Nuclear Information System (INIS)

    Middleton, G.W.; Thomson, W.H.; Davies, I.H.; Morgan, A.

    1989-01-01

    A technique for accurate background subtraction in 99Tcm-DTPA renography is described. The technique is based on a multiple regression analysis of the renal curves and separate heart and soft tissue curves which together represent background activity. It is compared, in over 100 renograms, with a previously described linear regression technique. Results show that the method provides accurate background subtraction, even in very poorly functioning kidneys, thus enabling relative renal filtration and excretion to be accurately estimated. (author)

  2. Structured Additive Quantile Regression for Assessing the Determinants of Childhood Anemia in Rwanda

    Directory of Open Access Journals (Sweden)

    Faustin Habyarimana

    2017-06-01

    Full Text Available Childhood anemia is among the most significant health problems faced by public health departments in developing countries. This study aims at assessing the determinants and possible spatial effects associated with childhood anemia in Rwanda. The 2014/2015 Rwanda Demographic and Health Survey (RDHS) data was used. The analysis was done using the structured spatial additive quantile regression model. The findings of this study revealed that the child’s age; the duration of breastfeeding; gender of the child; the nutritional status of the child (whether underweight and/or wasting); whether the child had a fever; had a cough in the two weeks prior to the survey or not; whether the child received vitamin A supplementation in the six weeks before the survey or not; the household wealth index; literacy of the mother; mother’s anemia status; mother’s age at the birth are all significant factors associated with childhood anemia in Rwanda. Furthermore, significant structured spatial location effects on childhood anemia were found.

  3. Structured Additive Quantile Regression for Assessing the Determinants of Childhood Anemia in Rwanda.

    Science.gov (United States)

    Habyarimana, Faustin; Zewotir, Temesgen; Ramroop, Shaun

    2017-06-17

    Childhood anemia is among the most significant health problems faced by public health departments in developing countries. This study aims at assessing the determinants and possible spatial effects associated with childhood anemia in Rwanda. The 2014/2015 Rwanda Demographic and Health Survey (RDHS) data was used. The analysis was done using the structured spatial additive quantile regression model. The findings of this study revealed that the child's age; the duration of breastfeeding; gender of the child; the nutritional status of the child (whether underweight and/or wasting); whether the child had a fever; had a cough in the two weeks prior to the survey or not; whether the child received vitamin A supplementation in the six weeks before the survey or not; the household wealth index; literacy of the mother; mother's anemia status; mother's age at the birth are all significant factors associated with childhood anemia in Rwanda. Furthermore, significant structured spatial location effects on childhood anemia were found.

  4. Development of an empirical model of turbine efficiency using the Taylor expansion and regression analysis

    International Nuclear Information System (INIS)

    Fang, Xiande; Xu, Yu

    2011-01-01

    The empirical model of turbine efficiency is necessary for the control- and/or diagnosis-oriented simulation and useful for the simulation and analysis of dynamic performances of the turbine equipment and systems, such as air cycle refrigeration systems, power plants, turbine engines, and turbochargers. Existing empirical models of turbine efficiency are insufficient because there is no suitable form available for air cycle refrigeration turbines. This work performs a critical review of empirical models (called mean value models in some literature) of turbine efficiency and develops an empirical model in the desired form for air cycle refrigeration, the dominant cooling approach in aircraft environmental control systems. The Taylor series and regression analysis are used to build the model, with the Taylor series being used to expand functions with the polytropic exponent and the regression analysis to finalize the model. The measured data of a turbocharger turbine and two air cycle refrigeration turbines are used for the regression analysis. The proposed model is compact and able to present the turbine efficiency map. Its predictions agree with the measured data very well, with the corrected coefficient of determination R_c^2 ≥ 0.96 and the mean absolute percentage deviation = 1.19% for the three turbines. -- Highlights: → Performed a critical review of empirical models of turbine efficiency. → Developed an empirical model in the desired form for air cycle refrigeration, using the Taylor expansion and regression analysis. → Verified the method for developing the empirical model. → Verified the model.

  5. Econometric analysis of realized covariation: high frequency based covariance, regression, and correlation in financial economics

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Shephard, N.

    2004-01-01

    This paper analyses multivariate high frequency financial data using realized covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis, and covariance. It will be based on a fixed interval of time (e.g., a day or week), allowing...... the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions, and covariances change through time. In particular we provide confidence intervals for each of these quantities....

  6. Health care: necessity or luxury good? A meta-regression analysis

    OpenAIRE

    Iordache, Ioana Raluca

    2014-01-01

    When estimating the influence income per capita exerts on health care expenditure, the research in the field offers mixed results. Studies employ different data, estimation techniques and models, which brings about the question whether these differences in research design play any part in explaining the heterogeneity of reported outcomes. By employing meta-regression analysis, the present paper analyzes 220 estimates of health spending income elasticity collected from 54 studies and finds tha...

  7. Local regression type methods applied to the study of geophysics and high frequency financial data

    Science.gov (United States)

    Mariani, M. C.; Basu, K.

    2014-09-01

    In this work we applied locally weighted scatterplot smoothing techniques (Lowess/Loess) to Geophysical and high frequency financial data. We first analyze and apply this technique to the California earthquake geological data. A spatial analysis was performed to show that the estimation of the earthquake magnitude at a fixed location is very accurate up to the relative error of 0.01%. We also applied the same method to a high frequency data set arising in the financial sector and obtained similar satisfactory results. The application of this approach to the two different data sets demonstrates that the overall method is accurate and efficient, and the Lowess approach is much more desirable than the Loess method. The previous works studied the time series analysis; in this paper our local regression models perform a spatial analysis for the geophysics data providing different information. For the high frequency data, our models estimate the curve of best fit where data are dependent on time.

  8. Prediction of hourly PM2.5 using a space-time support vector regression model

    Science.gov (United States)

    Yang, Wentao; Deng, Min; Xu, Feng; Wang, Hang

    2018-05-01

    Real-time air quality prediction has been an active field of research in atmospheric environmental science. The existing methods of machine learning are widely used to predict pollutant concentrations because of their enhanced ability to handle complex non-linear relationships. However, because pollutant concentration data, as typical geospatial data, also exhibit spatial heterogeneity and spatial dependence, they may violate the assumptions of independent and identically distributed random variables in most of the machine learning methods. As a result, a space-time support vector regression model is proposed to predict hourly PM2.5 concentrations. First, to address spatial heterogeneity, spatial clustering is executed to divide the study area into several homogeneous or quasi-homogeneous subareas. To handle spatial dependence, a Gauss vector weight function is then developed to determine spatial autocorrelation variables as part of the input features. Finally, a local support vector regression model with spatial autocorrelation variables is established for each subarea. Experimental data on PM2.5 concentrations in Beijing are used to verify whether the results of the proposed model are superior to those of other methods.
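    The idea of a spatial autocorrelation input feature can be illustrated with a Gaussian distance-weighted average of neighboring observations. This is only a minimal sketch: the function name, the isotropic Gaussian kernel, and the bandwidth parameter are assumptions made for illustration and are not the weight function defined in the paper.

```python
import math


def gaussian_spatial_lag(target, neighbors, bandwidth):
    """Gaussian distance-weighted mean of neighboring values.

    target:    (x, y) coordinates of the location to describe
    neighbors: list of ((x, y), value) pairs observed nearby
    bandwidth: kernel width, in the same units as the coordinates
               (hypothetical tuning parameter)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in neighbors:
        squared_distance = (x - target[0]) ** 2 + (y - target[1]) ** 2
        weight = math.exp(-squared_distance / (2.0 * bandwidth ** 2))
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("all neighbors are too far away for this bandwidth")
    return weighted_sum / weight_total
```

    A value computed this way could be appended to the other predictors of each station before fitting a local regression model; nearer stations contribute more than distant ones.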

  9. Distance Based Root Cause Analysis and Change Impact Analysis of Performance Regressions

    Directory of Open Access Journals (Sweden)

    Junzan Zhou

    2015-01-01

    Performance regression testing is applied to uncover both performance and functional problems of software releases. A performance problem revealed by performance testing can be high response time, low throughput, or even being out of service. A mature performance testing process helps systematically detect software performance problems. However, it is difficult to identify the root cause and evaluate the potential change impact. In this paper, we present an approach leveraging server-side logs for identifying root causes of performance problems. First, server-side logs are used to recover the call tree of each business transaction. We define a novel distance-based metric computed from call trees for root cause analysis and apply an inverted index from methods to business transactions for change impact analysis. Empirical studies show that our approach can effectively and efficiently help developers diagnose the root cause of performance problems.
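    An inverted index from methods to business transactions can be sketched as follows. The call-tree representation (nested `(method, children)` tuples), the function names, and the sample transactions are all hypothetical; the abstract does not describe the paper's data structures.

```python
from collections import defaultdict


def build_inverted_index(call_trees):
    """Map each method name to the transactions whose call tree contains it.

    call_trees: dict of transaction name -> (method, [child subtrees])
    """
    index = defaultdict(set)
    for transaction, root in call_trees.items():
        stack = [root]
        while stack:  # iterative depth-first walk of the call tree
            method, children = stack.pop()
            index[method].add(transaction)
            stack.extend(children)
    return index


def impacted_transactions(index, changed_methods):
    """Return every transaction that calls at least one changed method."""
    impacted = set()
    for method in changed_methods:
        impacted |= index.get(method, set())
    return impacted


# Illustrative call trees, not taken from the paper.
call_trees = {
    "checkout": ("handle_checkout",
                 [("load_cart", []), ("charge_card", [("log_event", [])])]),
    "browse": ("handle_browse",
               [("load_catalog", []), ("log_event", [])]),
}
index = build_inverted_index(call_trees)
```

    Looking up a changed method in `index` then gives the set of transactions that may need re-testing, without re-walking every call tree.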

  10. Spatial patterns of March and September streamflow trends in Pacific Northwest Streams, 1958-2008

    Science.gov (United States)

    Chang, Heejun; Jung, Il-Won; Steele, Madeline; Gannett, Marshall

    2012-01-01

    Summer streamflow is a vital water resource for municipal and domestic water supplies, irrigation, salmonid habitat, recreation, and water-related ecosystem services in the Pacific Northwest (PNW) in the United States. This study detects significant negative trends in September absolute streamflow in a majority of 68 stream-gauging stations located on unregulated streams in the PNW from 1958 to 2008. The proportion of March streamflow to annual streamflow increases in most stations over 1,000 m elevation, with a baseflow index of less than 50, while absolute March streamflow does not increase in most stations. The declining trends of September absolute streamflow are strongly associated with seven-day low flow, January–March maximum temperature trends, and the size of the basin (19–7,260 km²), while the increasing trends of the fraction of March streamflow are associated with elevation, April 1 snow water equivalent, March precipitation, center timing of streamflow, and October–December minimum temperature trends. Compared with ordinary least squares (OLS) estimated regression models, spatial error regression and geographically weighted regression (GWR) models effectively remove spatial autocorrelation in residuals. The GWR model results show spatial gradients of local R² values with consistently higher local R² values in the northern Cascades. This finding illustrates that different hydrologic landscape factors, such as geology and seasonal distribution of precipitation, also influence streamflow trends in the PNW. In addition, our spatial analysis model results show that considering various geographic factors helps clarify the dynamics of streamflow trends over a large geographical area, supporting a spatial analysis approach over aspatial OLS-estimated regression models for predicting streamflow trends. Results indicate that transitional rain–snow surface water-dominated basins are likely to have reduced summer streamflow under warming scenarios.
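    Spatial autocorrelation in regression residuals is commonly summarized with Moran's I. The sketch below computes the standard global statistic for a user-supplied weight matrix; whether the study used this exact statistic or weighting scheme is an assumption, and the function name is illustrative.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square spatial weight matrix.

    values:  list of n numbers (for example, regression residuals)
    weights: n x n list of lists; weights[i][j] > 0 when i and j are neighbors
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    cross_products = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    variance_term = sum(d * d for d in deviations)
    return (n / weight_sum) * (cross_products / variance_term)
```

    Values near zero suggest little residual spatial structure, positive values indicate that similar residuals cluster together, and negative values indicate that neighboring residuals tend to alternate.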

  11. Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis.

    Science.gov (United States)

    Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K

    2017-01-01

    The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology, the sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM by two integrative genomics analysis examples, a breast cancer dataset, and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer.

  12. Identifying Generalizable Image Segmentation Parameters for Urban Land Cover Mapping through Meta-Analysis and Regression Tree Modeling

    Directory of Open Access Journals (Sweden)

    Brian A. Johnson

    2018-01-01

    The advent of very high resolution (VHR) satellite imagery and the development of Geographic Object-Based Image Analysis (GEOBIA) have led to many new opportunities for fine-scale land cover mapping, especially in urban areas. Image segmentation is an important step in the GEOBIA framework, so great time/effort is often spent to ensure that computer-generated image segments closely match real-world objects of interest. In the remote sensing community, segmentation is frequently performed using the multiresolution segmentation (MRS) algorithm, which is tuned through three user-defined parameters (the scale, shape/color, and compactness/smoothness parameters). The scale parameter (SP) is the most important parameter and governs the average size of generated image segments. Existing automatic methods to determine suitable SPs for segmentation are scene-specific and often computationally intensive, so an approach to estimating appropriate SPs that is generalizable (i.e., not scene-specific) could speed up the GEOBIA workflow considerably. In this study, we attempted to identify generalizable SPs for five common urban land cover types (buildings, vegetation, roads, bare soil, and water) through meta-analysis and nonlinear regression tree (RT) modeling. First, we performed a literature search of recent studies that employed GEOBIA for urban land cover mapping and extracted the MRS parameters used, the image properties (i.e., spatial and radiometric resolutions), and the land cover classes mapped. Using the data extracted from the literature, we constructed RT models for each land cover class to predict suitable SP values based on the image spatial resolution, image radiometric resolution, shape/color parameter, and compactness/smoothness parameter. Based on a visual and quantitative analysis of results, we found that for all land cover classes except water, relatively accurate SPs could be identified using our RT modeling results.
The main advantage of our

  13. Advanced statistics: linear regression, part II: multiple linear regression.

    Science.gov (United States)

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
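    As a small illustration of multiple linear regression and of why multicollinearity matters, the sketch below fits coefficients by solving the normal equations with Gauss-Jordan elimination. It is a teaching sketch under simple assumptions (no missing data, few predictors), not the article's own material; production work would normally rely on an established statistics library.

```python
def fit_ols(X, y):
    """Least-squares coefficients, intercept first, via the normal equations.

    X: list of rows, each a list of predictor values
    y: list of outcomes, one per row
    Raises ValueError when the predictors are perfectly collinear.
    """
    rows = [[1.0] + [float(v) for v in row] for row in X]  # add intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting: use the largest remaining entry in this column
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]
```

    When one predictor is an exact multiple of another, the normal equations have no unique solution and the function raises an error; near-collinear predictors still produce coefficients, but those coefficients become unstable, which is the practical concern behind multicollinearity.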

  14. Climatic factors associated with amyotrophic lateral sclerosis: a spatial analysis from Taiwan.

    Science.gov (United States)

    Tsai, Ching-Piao; Tzu-Chi Lee, Charles

    2013-11-01

    Few studies have assessed the spatial association of amyotrophic lateral sclerosis (ALS) incidence in the world. The aim of this study was to identify the association of climatic factors and ALS incidence in Taiwan. A total of 1,434 subjects with the primary diagnosis of ALS between years 1997 and 2008 were identified in the national health insurance research database. The diagnosis was also verified by the national health insurance programme, which had issued them "serious disabling disease (SDD) certificates". Local indicators of spatial association were employed to investigate spatial clustering of age-standardised incidence ratios in the townships of the study area. Spatial regression was utilised to reveal any association of annual average climatic factors and ALS incidence for the 12-year study period. The climatic factors included the annual average time of sunlight exposure, average temperature, maximum temperature, minimum temperature, atmospheric pressure, rainfall, relative humidity and wind speed with spatial autocorrelation controlled. Significant correlations were only found for exposure to sunlight and rainfall, and they were similar in both genders. The annual average of the former was found to be negatively correlated with ALS, while the latter was positively correlated with ALS incidence. While accepting that ALS is most probably multifactorial, it was concluded that sunlight deprivation and/or rainfall are associated to some degree with ALS incidence in Taiwan.

  15. Advanced spatial metrics analysis in cellular automata land use and cover change modeling

    International Nuclear Information System (INIS)

    Zamyatin, Alexander; Cabral, Pedro

    2011-01-01

    This paper proposes an approach for a more effective definition of cellular automata transition rules for landscape change modeling using an advanced spatial metrics analysis. This approach considers a four-stage methodology based on: (i) the search for the appropriate spatial metrics with minimal correlations; (ii) the selection of the appropriate neighborhood size; (iii) the selection of the appropriate technique for spatial metrics application; and (iv) the analysis of the contribution level of each spatial metric for joint use. The case study uses an initial set of 7 spatial metrics of which 4 are selected for modeling. Results show a better model performance when compared to modeling without any spatial metrics or with the initial set of 7 metrics.

  16. HIV-, HCV-, and co-infections and associated risk factors among drug users in southwestern China: a township-level ecological study incorporating spatial regression.

    Directory of Open Access Journals (Sweden)

    Yi-Biao Zhou

    BACKGROUND: The human immunodeficiency virus (HIV) and hepatitis C virus (HCV) are major public health problems. Many studies have been performed to investigate the association between demographic and behavioral factors and HIV or HCV infection. However, some of the results of these studies have been in conflict. METHODOLOGY/PRINCIPAL FINDINGS: The data of all entrants in the 11 national methadone clinics in the Yi Autonomous Prefecture from March 2004 to December 2012 were collected from the national database. Several spatial regression models were used to analyze specific community characteristics associated with the prevalence of HIV and HCV infection at the township level. The study enrolled 6,417 adult patients. The prevalence of HIV infection, HCV infection and co-infection was 25.4%, 30.9%, and 11.0%, respectively. Prevalence exhibited stark geographical variations in the area studied. The four regression models showed Yi ethnicity to be associated with both the prevalence of HIV and of HIV/HCV co-infection. The male drug users in some northwestern counties had greater odds of being infected with HIV than female drug users, but the opposite was observed in some eastern counties. The 'being in drug rehabilitation' variable was found to be positively associated with prevalence of HCV infection in some southern townships; however, it was found to be negatively associated with it in some northern townships. CONCLUSIONS/SIGNIFICANCE: The spatial modeling creates better representations of data such that public health interventions must focus on areas with high frequency of HIV/HCV to prevent further transmission of both HIV and HCV.

  17. No rationale for 1 variable per 10 events criterion for binary logistic regression analysis

    Directory of Open Access Journals (Sweden)

    Maarten van Smeden

    2016-11-01

    Background: Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion, only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. Methods: The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth’s correction, are compared. Results: The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect (‘separation’). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth’s correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. Conclusions: The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
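    The two quantities discussed here, events per variable and separation, can be made concrete with a short sketch. It assumes EPV is computed as the size of the smaller outcome group divided by the number of candidate predictors, and it checks complete separation for a single continuous covariate only; both function names are illustrative and neither reproduces the paper's simulation code.

```python
def events_per_variable(outcomes, n_predictors):
    """EPV: size of the smaller outcome group per candidate predictor.

    outcomes: list of 0/1 outcome values
    """
    events = sum(outcomes)
    non_events = len(outcomes) - events
    return min(events, non_events) / n_predictors


def is_completely_separated(x, outcomes):
    """True when a single covariate x perfectly splits the two outcome groups."""
    x0 = [xi for xi, yi in zip(x, outcomes) if yi == 0]
    x1 = [xi for xi, yi in zip(x, outcomes) if yi == 1]
    if not x0 or not x1:
        return False  # only one outcome class present, nothing to separate
    return max(x0) < min(x1) or max(x1) < min(x0)
```

    Under complete separation the maximum likelihood estimate of the logit coefficient does not exist as a finite value, which is why a handful of separated data sets can dominate a simulation summary.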

  18. Sparse Regression by Projection and Sparse Discriminant Analysis

    KAUST Repository

    Qi, Xin; Luo, Ruiyan; Carroll, Raymond J.; Zhao, Hongyu

    2015-01-01

    predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths

  19. Spatial and Angular Moment Analysis of Continuous and Discretized Transport Problems

    International Nuclear Information System (INIS)

    Brantley, Patrick S.; Larsen, Edward W.

    2000-01-01

    A new theoretical tool for analyzing continuous and discretized transport equations is presented. This technique is based on a spatial and angular moment analysis of the analytic transport equation, which yields exact expressions for the 'center of mass' and 'squared radius of gyration' of the particle distribution. Essentially the same moment analysis is applied to discretized particle transport problems to determine numerical expressions for the center of mass and squared radius of gyration. Because this technique makes no assumption about the optical thickness of the spatial cells or about the amount of absorption in the system, it is applicable to problems that cannot be analyzed by a truncation analysis or an asymptotic diffusion limit analysis. The spatial differencing schemes examined (weighted-diamond, lumped linear discontinuous, and multiple balance) yield a numerically consistent expression for computing the squared radius of gyration plus an error term that depends on the mesh spacing, quadrature constants, and material properties of the system. The numerical results presented suggest that the relative accuracy of spatial differencing schemes for different types of problems can be assessed by comparing the magnitudes of these error terms.

  20. Statistical methods and regression analysis of stratospheric ozone and meteorological variables in Isfahan

    Science.gov (United States)

    Hassanzadeh, S.; Hosseinibalam, F.; Omidvari, M.

    2008-04-01

    Data of seven meteorological variables (relative humidity, wet temperature, dry temperature, maximum temperature, minimum temperature, ground temperature and sun radiation time) and ozone values have been used for statistical analysis. Meteorological variables and ozone values were analyzed using both multiple linear regression and principal component methods. Data for the period 1999-2004 are analyzed jointly using both methods. For all periods, temperature dependent variables were highly correlated, but were all negatively correlated with relative humidity. Multiple regression analysis was used to fit the meteorological variables using the meteorological variables as predictors. A variable selection method based on high loading of varimax rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model of the meteorological variables. In 1999, 2001 and 2002 one of the meteorological variables was weakly influenced predominantly by the ozone concentrations. However, the model did not predict that the meteorological variables for the year 2000 were not influenced predominantly by the ozone concentrations that point to variation in sun radiation. This could be due to other factors that were not explicitly considered in this study.

  1. Multivariate regression analysis for determining short-term values of radon and its decay products from filter measurements

    International Nuclear Information System (INIS)

    Kraut, W.; Schwarz, W.; Wilhelm, A.

    1994-01-01

    A multivariate regression analysis is applied to decay measurements of α- or β-filter activity. Activity concentrations for Po-218, Pb-214 and Bi-214, or for the Rn-222 equilibrium equivalent concentration, are obtained explicitly. The regression analysis properly takes into account the variances of the measured count rates and their influence on the resulting activity concentrations. (orig.) [de

  2. Spatial recurrence analysis: A sensitive and fast detection tool in digital mammography

    International Nuclear Information System (INIS)

    Prado, T. L.; Galuzio, P. P.; Lopes, S. R.; Viana, R. L.

    2014-01-01

    Efficient diagnostics of breast cancer requires fast digital mammographic image processing. Many breast lesions, both benign and malignant, are barely visible to the untrained eye and require accurate and reliable methods of image processing. We propose a new method of digital mammographic image analysis that meets both needs. It uses the concept of spatial recurrence as the basis of a spatial recurrence quantification analysis, which is the spatial extension of the well-known time recurrence analysis. The recurrence-based quantifiers are able to reveal breast lesions as well as the best standard image processing methods available, but with better control over the spurious fragments in the image.

  3. An Econometric Analysis of Modulated Realised Covariance, Regression and Correlation in Noisy Diffusion Models

    DEFF Research Database (Denmark)

    Kinnebrock, Silja; Podolskij, Mark

    This paper introduces a new estimator to measure the ex-post covariation between high-frequency financial time series under market microstructure noise. We provide an asymptotic limit theory (including feasible central limit theorems) for standard methods such as regression, correlation analysis...... process can be relaxed and how our method can be applied to non-synchronous observations. We also present an empirical study of how high-frequency correlations, regressions and covariances change through time....

  4. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    Science.gov (United States)

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected

  5. The contribution of spatial analysis to understanding HIV/TB mortality in children: a structural equation modelling approach

    Directory of Open Access Journals (Sweden)

    Eustasius Musenge

    2013-01-01

    Background: South Africa accounts for more than a sixth of the global population of people infected with HIV and TB, ranking her highest in HIV/TB co-infection worldwide. Remote areas often bear the greatest burden of morbidity and mortality, yet there are spatial differences within rural settings. Objectives: The primary aim was to investigate HIV/TB mortality determinants and their spatial distribution in the rural Agincourt sub-district for children aged 1–5 years in 2004. Our secondary aim was to model how the associated factors were interrelated as either underlying or proximate factors of child mortality using pathway analysis based on a Mosley-Chen conceptual framework. Methods: We conducted a secondary data analysis based on cross-sectional data collected in 2004 from the Agincourt sub-district in rural northeast South Africa. Child HIV/TB death was the outcome measure derived from physician-assessed verbal autopsy. Modelling used multiple logit regression models with and without spatial household random effects. Structural equation models were used in modelling the complex relationships between multiple exposures and the outcome (child HIV/TB mortality) as relayed on a conceptual framework. Results: Fifty-four of 6,692 children aged 1–5 years died of HIV/TB, from a total of 5,084 households. Maternal death had the greatest effect on child HIV/TB mortality (adjusted odds ratio=4.00; 95% confidence interval=1.01–15.80). A protective effect was found in households with better socio-economic status and when the child was older. Spatial models disclosed that the areas which experienced the greatest child HIV/TB mortality were those without any health facility. Conclusion: Low socio-economic status and maternal deaths impacted indirectly and directly on child mortality, respectively. These factors are major concerns locally and should be used in formulating interventions to reduce child mortality. Spatial prediction maps can guide policy

  6. Regression analysis of mixed recurrent-event and panel-count data.

    Science.gov (United States)

    Zhu, Liang; Tong, Xinwei; Sun, Jianguo; Chen, Manhua; Srivastava, Deo Kumar; Leisenring, Wendy; Robison, Leslie L

    2014-07-01

    In event history studies concerning recurrent events, two types of data have been extensively discussed. One is recurrent-event data (Cook and Lawless, 2007. The Analysis of Recurrent Event Data. New York: Springer), and the other is panel-count data (Zhao and others, 2010. Nonparametric inference based on panel-count data. Test 20: , 1-42). In the former case, all study subjects are monitored continuously; thus, complete information is available for the underlying recurrent-event processes of interest. In the latter case, study subjects are monitored periodically; thus, only incomplete information is available for the processes of interest. In reality, however, a third type of data could occur in which some study subjects are monitored continuously, but others are monitored periodically. When this occurs, we have mixed recurrent-event and panel-count data. This paper discusses regression analysis of such mixed data and presents two estimation procedures for the problem. One is a maximum likelihood estimation procedure, and the other is an estimating equation procedure. The asymptotic properties of both resulting estimators of regression parameters are established. Also, the methods are applied to a set of mixed recurrent-event and panel-count data that arose from a Childhood Cancer Survivor Study and motivated this investigation. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  7. [Application of negative binomial regression and modified Poisson regression in the research of risk factors for injury frequency].

    Science.gov (United States)

    Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan

    2011-11-01

    To explore the application of negative binomial regression and modified Poisson regression in analyzing the factors that influence injury frequency and the risk factors that increase it. A total of 2917 primary and secondary school students were selected from Hefei by cluster random sampling and surveyed by questionnaire. The count data on injury events were fitted with modified Poisson regression and negative binomial regression models. The risk factors that increase the frequency of unintentional injury among juvenile students were explored, so as to probe the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model showed over-dispersion (P Poisson regression and negative binomial regression model, was fitted better. respectively. Both showed that male gender, younger age, father working outside of the hometown, the guardian's education level being above junior high school, and smoking might be associated with higher injury frequencies. For injury-event frequency data that tend to cluster, both the modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better, and this model could give a more accurate interpretation of relevant factors affecting the frequency of injury.
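
    The over-dispersion mentioned in this abstract can be screened for before choosing between a Poisson-type and a negative binomial model. The following is a minimal sketch, not the authors' procedure: it computes the Pearson dispersion statistic for an intercept-only Poisson model, where values well above 1 suggest over-dispersion. The function name and the injury counts are hypothetical illustrations.

```python
import numpy as np

def pearson_dispersion(counts):
    """Pearson chi-square / degrees of freedom for an intercept-only Poisson model."""
    y = np.asarray(counts, dtype=float)
    mu = y.mean()  # fitted Poisson mean under the intercept-only model
    return float(np.sum((y - mu) ** 2 / mu) / (y.size - 1))

# Hypothetical injury counts per student (illustrative data only).
injuries = np.array([0, 0, 0, 1, 0, 2, 0, 0, 5, 0, 1, 0, 7, 0, 0, 3])
dispersion = pearson_dispersion(injuries)
print(f"dispersion = {dispersion:.2f}")  # values well above 1 hint at over-dispersion
```

    For an intercept-only model this statistic reduces to the sample variance divided by the sample mean, which is why a quick variance-to-mean comparison is a common first check.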

  8. Spatial analysis of the etiology of amyotrophic lateral sclerosis among 1991 Gulf War veterans.

    Science.gov (United States)

    Miranda, Marie Lynn; Alicia Overstreet Galeano, M; Tassone, Eric; Allen, Kelli D; Horner, Ronnie D

    2008-11-01

    Veterans of the 1991 Gulf War have an increased risk of amyotrophic lateral sclerosis (ALS), but the etiology is unknown. This study sought to identify geographic areas with elevated risk for the later development of ALS among military personnel who served in the first Gulf War. A unified geographic information system (GIS) was constructed to allow analysis of secondary data on troop movements in the 1991 Gulf War theatre in the Persian Gulf region including Iraq, northern Saudi Arabia, and Kuwait. We fit Bayesian Poisson regression models to adjust for potential risk factors, including one relatively discrete environmental exposure, and to identify areas associated with elevated risk of ALS. We found that service in particular locations of the Gulf was associated with an elevated risk for later developing ALS, both before and after adjustment for branch of service and potential of exposure to chemical warfare agents in and around Khamisiyah, Iraq. Specific geographic locations of troop units within the 1991 Gulf War theatre are associated with an increased risk for the subsequent development of ALS among members of those units. The identified spatial locations represent the logical starting points in the search for potential etiologic factors of ALS among Gulf War veterans. Of note, for locations where the relative odds of subsequently developing ALS are among the highest, specific risk factors, whether environmental or occupationally related, have not been identified. The results of spatial models can be used to subsequently look for risk factors that follow the spatial pattern of elevated risk.

  9. Forecasting Model for IPTV Service in Korea Using Bootstrap Ridge Regression Analysis

    Science.gov (United States)

    Lee, Byoung Chul; Kee, Seho; Kim, Jae Bum; Kim, Yun Bae

    The telecom firms in Korea are taking a new step to prepare for the next generation of convergence services, IPTV. In this paper we describe our analysis of an effective method for forecasting demand for IPTV broadcasting. We ran the forecast under three types of scenarios based on aspects of the potential IPTV market and compared the results. The forecasting method used in this paper is the multi-generation substitution model with bootstrap ridge regression analysis.
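
    The bootstrap ridge regression named in this abstract can be illustrated generically. The sketch below is an assumption-laden illustration and does not implement the multi-generation substitution model: it fits a closed-form ridge estimate on resampled rows to obtain a spread of coefficient estimates. The function names, penalty value, and synthetic data are all hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate (X'X + lam*I)^-1 X'y, without an intercept."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def bootstrap_ridge(X, y, lam, n_boot=200, seed=0):
    """Refit ridge on rows resampled with replacement; one coefficient vector per draw."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # bootstrap sample of row indices
        draws[b] = ridge_fit(X[idx], y[idx], lam)
    return draws

# Synthetic data (illustrative only).
rng = np.random.default_rng(1)
X = rng.standard_normal((60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(60)

draws = bootstrap_ridge(X, y, lam=1.0)
coef_mean = draws.mean(axis=0)
coef_low, coef_high = np.percentile(draws, [2.5, 97.5], axis=0)  # percentile interval
```

    The percentile interval gives a rough sense of how stable each coefficient is across resamples; the penalty `lam` would normally be tuned rather than fixed.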

  10. Time dependent analysis of Xenon spatial oscillations in small power reactors

    International Nuclear Information System (INIS)

    Decco, Claudia Cristina Ghirardello

    1997-01-01

    This work presents a time-dependent analysis of xenon spatial oscillations, studying the influence of the power density distribution, type of reactivity perturbation, power level and core size, using one-dimensional and three-dimensional analysis with the MID2 and CITATION codes, respectively. It is concluded that small pressurized water reactors with height smaller than 1.5 m are stable and do not have xenon spatial oscillations. (author)

  11. A spatial epidemiological analysis of self-rated mental health in the slums of Dhaka

    Directory of Open Access Journals (Sweden)

    Müller Daniel

    2011-05-01

    Abstract Background The deprived physical environments present in slums are well-known to have adverse health effects on their residents. However, little is known about the health effects of the social environments in slums. Moreover, neighbourhood quantitative spatial analyses of the mental health status of slum residents are still rare. The aim of this paper is to study self-rated mental health data in several slums of Dhaka, Bangladesh, by accounting for neighbourhood social and physical associations using spatial statistics. We hypothesised that mental health would show a significant spatial pattern in different population groups, and that the spatial patterns would relate to spatially-correlated health-determining factors (HDF). Methods We applied a spatial epidemiological approach, including non-spatial ANOVA/ANCOVA, as well as global and local univariate and bivariate Moran's I statistics. The WHO-5 Well-being Index was used as a measure of self-rated mental health. Results We found that poor mental health (WHO-5 scores Conclusions Spatial patterns of mental health were detected and could be partly explained by spatially correlated HDF. We thereby showed that the socio-physical neighbourhood was significantly associated with health status, i.e., mental health at one location was spatially dependent on the mental health and HDF prevalent at neighbouring locations. Furthermore, the spatial patterns point to severe health disparities both within and between the slums. In addition to examining health outcomes, the methodology used here is also applicable to residuals of regression models, such as helping to avoid violating the assumption of data independence that underlies many statistical approaches. We assume that similar spatial structures can be found in other studies focussing on neighbourhood effects on health, and therefore argue for a more widespread incorporation of spatial statistics in epidemiological studies.
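
    The global univariate Moran's I used in this study has a compact closed form. The sketch below is a generic illustration rather than the authors' code: it computes I = (n / S0) * (z'Wz) / (z'z), where z holds the mean-centred values, W is a spatial weights matrix, and S0 is the sum of all weights. The four-location chain and its values are hypothetical.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for values observed at n locations with an n-by-n weights matrix."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()  # deviations from the mean
    return float(x.size / w.sum() * (z @ w @ z) / (z @ z))

# Hypothetical example: four locations in a row, each neighbouring the next.
chain = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]])

i_trend = morans_i([1, 2, 3, 4], chain)          # similar neighbours: positive I
i_alternating = morans_i([1, -1, 1, -1], chain)  # dissimilar neighbours: negative I
```

    The same function can be applied to regression residuals, which is the use the abstract suggests for checking the independence assumption.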

  12. An Additive-Multiplicative Cox-Aalen Regression Model

    DEFF Research Database (Denmark)

    Scheike, Thomas H.; Zhang, Mei-Jie

    2002-01-01

    Aalen model; additive risk model; counting processes; Cox regression; survival analysis; time-varying effects

  13. Marital status integration and suicide: A meta-analysis and meta-regression.

    Science.gov (United States)

    Kyung-Sook, Woo; SangSoo, Shin; Sangjin, Shin; Young-Jeon, Shin

    2018-01-01

    Marital status is an index of the phenomenon of social integration within social structures and has long been identified as an important predictor of suicide. However, previous meta-analyses have focused only on a particular marital status, or have not sufficiently explored moderators. A meta-analysis of observational studies was conducted to explore the relationships between marital status and suicide and to understand the important moderating factors in this association. Electronic databases were searched to identify studies conducted between January 1, 2000 and June 30, 2016. We performed a meta-analysis, subgroup analysis, and meta-regression of 170 suicide risk estimates from 36 publications. Using a random-effects model with adjustment for covariates, the study found that the suicide risk for non-married versus married was OR = 1.92 (95% CI: 1.75-2.12). The suicide risk was higher for non-married individuals aged analysis by gender, non-married men exhibited a greater risk of suicide than their married counterparts in all sub-analyses, but women aged 65 years or older showed no significant association between marital status and suicide. The suicide risk in divorced individuals was higher than for non-married individuals in both men and women. The meta-regression showed that gender, age, and sample size affected between-study variation. The results of the study indicated that non-married individuals have an aggregate higher suicide risk than married ones. In addition, gender and age were confirmed as important moderating factors in the relationship between marital status and suicide. Copyright © 2017 Elsevier Ltd. All rights reserved.
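
    A random-effects pooled odds ratio of the kind reported here is commonly obtained by pooling log odds ratios. The sketch below assumes the DerSimonian-Laird estimator, which the abstract does not name, and omits the covariate adjustment; the study-level odds ratios and variances are hypothetical and are not taken from the 36 publications.

```python
import numpy as np

def dersimonian_laird(log_effects, variances):
    """Random-effects pooling with the DerSimonian-Laird between-study variance."""
    y = np.asarray(log_effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                    # fixed-effect (inverse-variance) weights
    fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed) ** 2)               # Cochran's Q heterogeneity statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (y.size - 1)) / c)        # between-study variance, floored at 0
    w_star = 1.0 / (v + tau2)                      # random-effects weights
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return float(pooled), float(se), float(tau2)

# Hypothetical study-level odds ratios and variances of their logs.
log_or = np.log([1.6, 2.1, 1.9, 2.4])
var_log_or = np.array([0.04, 0.09, 0.02, 0.12])

pooled, se, tau2 = dersimonian_laird(log_or, var_log_or)
pooled_or = np.exp(pooled)
ci_low, ci_high = np.exp(pooled - 1.96 * se), np.exp(pooled + 1.96 * se)
```

    Pooling on the log scale and exponentiating at the end keeps the confidence interval asymmetric around the odds ratio, matching the way the abstract reports its interval.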

  14. Spectro-spatial analysis of wave packet propagation in nonlinear acoustic metamaterials

    Science.gov (United States)

    Zhou, W. J.; Li, X. P.; Wang, Y. S.; Chen, W. Q.; Huang, G. L.

    2018-01-01

    The objective of this work is to analyze wave packet propagation in weakly nonlinear acoustic metamaterials and reveal the interior nonlinear wave mechanism through spectro-spatial analysis. The spectro-spatial analysis is based on full-scale transient analysis of the finite system, by which dispersion curves are generated from the transmitted waves and also verified by the perturbation method (the L-P method). We found that the spectro-spatial analysis can provide detailed information about the solitary wave in short-wavelength region which cannot be captured by the L-P method. It is also found that the optical wave modes in the nonlinear metamaterial are sensitive to the parameters of the nonlinear constitutive relation. Specifically, a significant frequency shift phenomenon is found in the middle-wavelength region of the optical wave branch, which makes this frequency region behave like a band gap for transient waves. This special frequency shift is then used to design a direction-biased waveguide device, and its efficiency is shown by numerical simulations.

  15. Assessment of tuberculosis spatial hotspot areas in Antananarivo, Madagascar, by combining spatial analysis and genotyping.

    Science.gov (United States)

    Ratovonirina, Noël Harijaona; Rakotosamimanana, Niaina; Razafimahatratra, Solohery Lalaina; Raherison, Mamy Serge; Refrégier, Guislaine; Sola, Christophe; Rakotomanana, Fanjasoa; Rasolofo Razanamparany, Voahangy

    2017-08-14

    Tuberculosis (TB) remains a public health problem in Madagascar. A crucial element of TB control is the development of an easy and rapid method for the orientation of TB control strategies in the country. Our main objective was to develop a TB spatial hotspot identification method by combining spatial analysis and TB genotyping method in Antananarivo. Sputa of new pulmonary TB cases from 20 TB diagnosis and treatment centers (DTCs) in Antananarivo were collected from August 2013 to May 2014 for culture. Mycobacterium tuberculosis complex (MTBC) clinical isolates were typed by spoligotyping on a Luminex® 200 platform. All TB patients were respectively localized according to their neighborhood residence and the spatial distribution of all pulmonary TB patients and patients with genotypic clustered isolates were scanned respectively by the Kulldorff spatial scanning method for identification of significant spatial clustering. Areas exhibiting spatial clustering of patients with genotypic clustered isolates were considered as hotspot TB areas for transmission. Overall, 467 new cases were included in the study, and 394 spoligotypes were obtained (84.4%). New TB cases were distributed in 133 of the 192 Fokontany (administrative neighborhoods) of Antananarivo (1 to 15 clinical patients per Fokontany) and patients with genotypic clustered isolates were distributed in 127 of the 192 Fokontany (1 to 13 per Fokontany). A single spatial focal point of epidemics was detected when ignoring genotypic data (p = 0.039). One Fokontany of this focal point and three additional ones were detected to be spatially clustered when taking genotypes into account (p Madagascar and will allow better TB control strategies by public health authorities.

  16. Applying Different Independent Component Analysis Algorithms and Support Vector Regression for IT Chain Store Sales Forecasting

    Science.gov (United States)

    Dai, Wensheng

    2014-01-01

    Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating feature extraction method and prediction tool, such as support vector regression (SVR), is a useful method for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to deal with various forecasting problems. But, up to now, only the basic ICA method (i.e., temporal ICA model) was applied to sale forecasting problem. In this paper, we utilize three different ICA methods including spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA) to extract features from the sales data and compare their performance in sales forecasting of IT chain store. Experimental results from a real sales data show that the sales forecasting scheme by integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data and the extracted features can improve the prediction performance of SVR for sales forecasting. PMID:25165740

  17. Applying different independent component analysis algorithms and support vector regression for IT chain store sales forecasting.

    Science.gov (United States)

    Dai, Wensheng; Wu, Jui-Yu; Lu, Chi-Jie

    2014-01-01

    Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating feature extraction method and prediction tool, such as support vector regression (SVR), is a useful method for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to deal with various forecasting problems. But, up to now, only the basic ICA method (i.e., temporal ICA model) was applied to sale forecasting problem. In this paper, we utilize three different ICA methods including spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA) to extract features from the sales data and compare their performance in sales forecasting of IT chain store. Experimental results from a real sales data show that the sales forecasting scheme by integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data and the extracted features can improve the prediction performance of SVR for sales forecasting.

  18. Applying Different Independent Component Analysis Algorithms and Support Vector Regression for IT Chain Store Sales Forecasting

    Directory of Open Access Journals (Sweden)

    Wensheng Dai

    2014-01-01

    Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating feature extraction method and prediction tool, such as support vector regression (SVR), is a useful method for constructing an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique and has been widely applied to deal with various forecasting problems. But, up to now, only the basic ICA method (i.e., temporal ICA model) was applied to sale forecasting problem. In this paper, we utilize three different ICA methods including spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA) to extract features from the sales data and compare their performance in sales forecasting of IT chain store. Experimental results from a real sales data show that the sales forecasting scheme by integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data and the extracted features can improve the prediction performance of SVR for sales forecasting.

  19. Driven Factors Analysis of China’s Irrigation Water Use Efficiency by Stepwise Regression and Principal Component Analysis

    Directory of Open Access Journals (Sweden)

    Renfu Jia

    2016-01-01

    This paper introduces an integrated approach to find out the major factors influencing efficiency of irrigation water use in China. It combines multiple stepwise regression (MSR) and principal component analysis (PCA) to obtain more realistic results. In real world case studies, classical linear regression model often involves too many explanatory variables and the linear correlation issue among variables cannot be eliminated. Linearly correlated variables will cause the invalidity of the factor analysis results. To overcome this issue and reduce the number of the variables, PCA technique has been used combining with MSR. As such, the irrigation water use status in China was analyzed to find out the five major factors that have significant impacts on irrigation water use efficiency. To illustrate the performance of the proposed approach, the calculation based on real data was conducted and the results were shown in this paper.
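
    The idea of using PCA to remove linear correlation among explanatory variables before regression can be sketched as follows. This is a simplified illustration under stated assumptions: it regresses the response on the leading principal-component scores and does not include the stepwise selection step the paper combines with PCA. The function names and the synthetic, nearly collinear predictors are hypothetical.

```python
import numpy as np

def pca_scores(X, n_components):
    """Scores of the standardized predictors on the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, vt = np.linalg.svd(Z, full_matrices=False)  # rows of vt are the loadings
    return Z @ vt[:n_components].T

def fit_with_intercept(F, y):
    """Least-squares fit of y on the columns of F plus an intercept."""
    A = np.column_stack([np.ones(len(y)), F])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, A @ coef

# Synthetic predictors; the third is nearly a linear combination of the first two.
rng = np.random.default_rng(0)
x1 = rng.standard_normal(100)
x2 = rng.standard_normal(100)
x3 = x1 + x2 + 0.05 * rng.standard_normal(100)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 - 1.0 * x2 + 0.1 * rng.standard_normal(100)

scores = pca_scores(X, n_components=2)  # two uncorrelated components replace three predictors
coef_pcr, fitted_pcr = fit_with_intercept(scores, y)
```

    Because the component scores are mutually uncorrelated, the regression on them avoids the collinearity problem the abstract describes, at the cost of coefficients that refer to components rather than to the original variables.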

  20. A regression analysis of the effect of energy use in agriculture

    International Nuclear Information System (INIS)

    Karkacier, Osman; Gokalp Goktolga, Z.; Cicek, Adnan

    2006-01-01

    This study investigates the impacts of energy use on the productivity of Turkey's agriculture. It reports the results of a regression analysis of the relationship between energy use and agricultural productivity. The study is based on the analysis of yearbook data for the period 1971-2003. Agricultural productivity was specified as a function of its energy consumption (TOE) and gross additions of fixed assets during the year. Least squares (LS) estimation was employed to estimate the equation parameters. The data for this study come from the State Institute of Statistics (SIS) and the Ministry of Energy of Turkey.

  1. Assessment of the spatial scaling behaviour of floods in the United Kingdom

    Science.gov (United States)

    Formetta, Giuseppe; Stewart, Elizabeth; Bell, Victoria

    2017-04-01

    Floods are among the most dangerous natural hazards, causing loss of life and significant damage to private and public property. Regional flood-frequency analysis (FFA) methods are essential tools to assess the flood hazard and plan interventions for its mitigation. FFA methods are often based on the well-known index flood method that assumes the invariance of the coefficient of variation of floods with drainage area. This assumption is equivalent to the simple scaling or self-similarity assumption for peak floods, i.e. their spatial structure remains similar in a particular, relatively simple, way to itself over a range of scales. Spatial scaling of floods has been evaluated at national scale for different countries such as Canada, USA, and Australia. To our knowledge, such a study has not been conducted for the United Kingdom even though the standard FFA method there is based on the index flood assumption. In this work we present an integrated approach to assess the spatial scaling behaviour of floods in the United Kingdom using three different methods: product moments (PM), probability weighted moments (PWM), and quantile analysis (QA). We analyse both instantaneous and daily annual observed maximum floods and perform our analysis both across the entire country and in its sub-climatic regions as defined in the Flood Studies Report (NERC, 1975). To evaluate the relationship between the k-th moments or quantiles and the drainage area we used both regression with area alone and multiple regression considering other explanatory variables to account for the geomorphology, amount of rainfall, and soil type of the catchments. The latter multiple regression approach was only recently demonstrated to be more robust than the traditional regression with area alone, which can lead to biased estimates of scaling exponents and misinterpretation of spatial scaling behaviour. 
We tested our framework on almost 600 rural catchments in UK considered as entire region and
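
    The "regression with area alone" described in this abstract amounts to fitting a power law between a flood statistic and drainage area. The sketch below is a generic illustration, not the authors' framework: it estimates the scaling exponent as the slope of a straight-line fit in log-log space. The function name, catchment areas, and the synthetic flood statistic are hypothetical.

```python
import numpy as np

def scaling_exponent(area_km2, flood_stat):
    """Fit flood_stat = coeff * area**theta by least squares on the logarithms."""
    slope, intercept = np.polyfit(np.log(area_km2), np.log(flood_stat), 1)
    return float(slope), float(np.exp(intercept))

# Synthetic catchments that follow an exact power law (illustrative only).
area = np.array([10.0, 50.0, 120.0, 400.0, 900.0])
mean_annual_flood = 1.5 * area ** 0.8

theta, coeff = scaling_exponent(area, mean_annual_flood)
```

    Under simple scaling, repeating this fit for moments of increasing order should give exponents that grow linearly with the order; departures from that pattern are one sign that simple scaling does not hold.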

  2. Abundant Topological Outliers in Social Media Data and Their Effect on Spatial Analysis.

    Science.gov (United States)

    Westerholt, Rene; Steiger, Enrico; Resch, Bernd; Zipf, Alexander

    2016-01-01

    Twitter and related social media feeds have become valuable data sources to many fields of research. Numerous researchers have thereby used social media posts for spatial analysis, since many of them contain explicit geographic locations. However, despite its widespread use within applied research, a thorough understanding of the underlying spatial characteristics of these data is still lacking. In this paper, we investigate how topological outliers influence the outcomes of spatial analyses of social media data. These outliers appear when different users contribute heterogeneous information about different phenomena simultaneously from similar locations. As a consequence, various messages representing different spatial phenomena are captured closely to each other, and are at risk to be falsely related in a spatial analysis. Our results reveal indications for corresponding spurious effects when analyzing Twitter data. Further, we show how the outliers distort the range of outcomes of spatial analysis methods. This has significant influence on the power of spatial inferential techniques, and, more generally, on the validity and interpretability of spatial analysis results. We further investigate how the issues caused by topological outliers are composed in detail. We unveil that multiple disturbing effects are acting simultaneously and that these are related to the geographic scales of the involved overlapping patterns. Our results show that at some scale configurations, the disturbances added through overlap are more severe than at others. Further, their behavior turns into a volatile and almost chaotic fluctuation when the scales of the involved patterns become too different. Overall, our results highlight the critical importance of thoroughly considering the specific characteristics of social media data when analyzing them spatially.

  3. A Poisson-lognormal conditional-autoregressive model for multivariate spatial analysis of pedestrian crash counts across neighborhoods.

    Science.gov (United States)

    Wang, Yiyi; Kockelman, Kara M

    2013-11-01

    This work examines the relationship between 3-year pedestrian crash counts across Census tracts in Austin, Texas, and various land use, network, and demographic attributes, such as land use balance, residents' access to commercial land uses, sidewalk density, lane-mile densities (by roadway class), and population and employment densities (by type). The model specification allows for region-specific heterogeneity, correlation across response types, and spatial autocorrelation via a Poisson-based multivariate conditional auto-regressive (CAR) framework and is estimated using Bayesian Markov chain Monte Carlo methods. Least-squares regression estimates of walk-miles traveled per zone serve as the exposure measure. Here, the Poisson-lognormal multivariate CAR model outperforms an aspatial Poisson-lognormal multivariate model and a spatial model (without cross-severity correlation), both in terms of fit and inference. Positive spatial autocorrelation emerges across neighborhoods, as expected (due to latent heterogeneity or missing variables that trend in space, resulting in spatial clustering of crash counts). In comparison, the positive aspatial, bivariate cross correlation of severe (fatal or incapacitating) and non-severe crash rates reflects latent covariates that have impacts across severity levels but are more local in nature (such as lighting conditions and local sight obstructions), along with spatially lagged cross correlation. Results also suggest greater mixing of residences and commercial land uses is associated with higher pedestrian crash risk across different severity levels, ceteris paribus, presumably since such access produces more potential conflicts between pedestrian and vehicle movements. Interestingly, network densities show variable effects, and sidewalk provision is associated with lower severe-crash rates. Copyright © 2013 Elsevier Ltd. All rights reserved.

  4. SPATIAL ANALYSIS AND DECISION ASSISTANCE (SADA) TRAINING COURSE

    Science.gov (United States)

    Spatial Analysis and Decision Assistance (SADA) is a Windows freeware program that incorporates tools from environmental assessment into an effective problem-solving environment. SADA was developed by the Institute for Environmental Modeling at the University of Tennessee and inc...

  5. Spatial data analysis for exploration of regional scale geothermal resources

    Science.gov (United States)

    Moghaddam, Majid Kiavarz; Noorollahi, Younes; Samadzadegan, Farhad; Sharifi, Mohammad Ali; Itoi, Ryuichi

    2013-10-01

    Defining a comprehensive conceptual model of the resources sought is one of the most important steps in geothermal potential mapping. In this study, Fry analysis as a spatial distribution method and 5% well existence, distance distribution, weights of evidence (WofE), and evidential belief function (EBFs) methods as spatial association methods were applied comparatively to known geothermal occurrences, and to publicly-available regional-scale geoscience data in Akita and Iwate provinces within the Tohoku volcanic arc, in northern Japan. Fry analysis and rose diagrams revealed similar directional patterns of geothermal wells and volcanoes, NNW-, NNE-, NE-trending faults, hotsprings and fumaroles. Among the spatial association methods, WofE defined a conceptual model correspondent with the real world situations, approved with the aid of expert opinion. The results of the spatial association analyses quantitatively indicated that the known geothermal occurrences are strongly spatially-associated with geological features such as volcanoes, craters, NNW-, NNE-, NE-direction faults and geochemical features such as hotsprings, hydrothermal alteration zones and fumaroles. Geophysical data contains temperature gradients over 100 °C/km and heat flow over 100 mW/m2. In general, geochemical and geophysical data were better evidence layers than geological data for exploring geothermal resources. The spatial analyses of the case study area suggested that quantitative knowledge from hydrothermal geothermal resources was significantly useful for further exploration and for geothermal potential mapping in the case study region. The results can also be extended to the regions with nearly similar characteristics.

  6. Spatial regression methods to evaluate beekeeping production in the state of Rio de Janeiro Métodos de regressão espacial para avaliação da produção apícola do estado do Rio de Janeiro

    Directory of Open Access Journals (Sweden)

    W.S. Tassinari

    2013-04-01

    Brazilian beekeeping has developed from the africanization of honeybees, and its high performance has made Brazil one of the world's largest honey producers. The Southeastern region has an expressive position in this market (45%), but the state of Rio de Janeiro is the smallest producer, despite presenting large areas of wild vegetation for honey production. In order to analyze honey productivity in the state of Rio de Janeiro, this research used classic and spatial regression approaches. The data used in this study comprised the responses regarding beekeeping from 1418 beekeepers distributed throughout 72 counties of this state. The best statistical fit was a semiparametric spatial model. The proposed model could be used to estimate the annual honey yield per hive in regions and to detect the production factors most related to beekeeping. Honey productivity was associated with the number of hives, wild swarm collection and losses in the apiaries. This paper highlights that the beekeeping sector needs support and help to elucidate the problems plaguing beekeepers, and that the inclusion of spatial effects in regression models is a useful tool for geographical data.

  7. Prediction of hearing outcomes by multiple regression analysis in patients with idiopathic sudden sensorineural hearing loss.

    Science.gov (United States)

    Suzuki, Hideaki; Tabata, Takahisa; Koizumi, Hiroki; Hohchi, Nobusuke; Takeuchi, Shoko; Kitamura, Takuro; Fujino, Yoshihisa; Ohbuchi, Toyoaki

    2014-12-01

    This study aimed to create a multiple regression model for predicting hearing outcomes of idiopathic sudden sensorineural hearing loss (ISSNHL). The participants were 205 consecutive patients (205 ears) with ISSNHL (hearing level ≥ 40 dB, interval between onset and treatment ≤ 30 days). They received systemic steroid administration combined with intratympanic steroid injection. Data were examined by simple and multiple regression analyses. Three hearing indices (percentage hearing improvement, hearing gain, and posttreatment hearing level [HLpost]) and 7 prognostic factors (age, days from onset to treatment, initial hearing level, initial hearing level at low frequencies, initial hearing level at high frequencies, presence of vertigo, and contralateral hearing level) were included in the multiple regression analysis as dependent and explanatory variables, respectively. In the simple regression analysis, the percentage hearing improvement, hearing gain, and HLpost showed significant correlation with 2, 5, and 6 of the 7 prognostic factors, respectively. The multiple correlation coefficients were 0.396, 0.503, and 0.714 for the percentage hearing improvement, hearing gain, and HLpost, respectively. Predicted values of HLpost calculated by the multiple regression equation were reliable with 70% probability with a 40-dB-width prediction interval. Prediction of HLpost by the multiple regression model may be useful to estimate the hearing prognosis of ISSNHL. © The Author(s) 2014.

  8. Regression of environmental noise in LIGO data

    International Nuclear Information System (INIS)

    Tiwari, V; Klimenko, S; Mitselmakher, G; Necula, V; Drago, M; Prodi, G; Frolov, V; Yakushin, I; Re, V; Salemi, F; Vedovato, G

    2015-01-01

    We address the problem of noise regression in the output of gravitational-wave (GW) interferometers, using data from the physical environmental monitors (PEM). The objective of the regression analysis is to predict environmental noise in the GW channel from the PEM measurements. One of the most promising regression methods is based on the construction of Wiener–Kolmogorov (WK) filters. Using this method, seismic noise cancellation in the LIGO GW channel has already been performed. In the presented approach the WK method has been extended by incorporating banks of Wiener filters in the time–frequency domain, multi-channel analysis and regulation schemes, which greatly enhance the versatility of the regression analysis. We also present the first results on regression of the bi-coherent noise in the LIGO data.
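
    The regression idea in the abstract above can be illustrated, very roughly, with a single-channel time-domain Wiener filter fitted by least squares. This is only a sketch under stated assumptions: the data are synthetic, the names (`fit_fir_wiener`, `witness`, `true_coupling`) and the filter length are hypothetical, and it does not reproduce the authors' time–frequency, multi-channel method.

```python
import numpy as np

def fit_fir_wiener(witness, target, n_taps):
    """Least-squares FIR filter h such that (h * witness)[t] approximates target[t]."""
    # Row for time t holds witness[t], witness[t-1], ..., witness[t-n_taps+1].
    rows = [witness[n_taps - 1 - k : len(witness) - k] for k in range(n_taps)]
    X = np.stack(rows, axis=1)
    y = target[n_taps - 1 :]
    h, *_ = np.linalg.lstsq(X, y, rcond=None)
    return h, X, y

rng = np.random.default_rng(0)
n = 5000
witness = rng.normal(size=n)                 # hypothetical environmental monitor channel
true_coupling = np.array([0.8, -0.4, 0.2])   # assumed linear coupling into the target channel
noise_in_target = np.convolve(witness, true_coupling)[:n]
signal = 0.1 * rng.normal(size=n)            # stand-in for everything uncorrelated with the witness
target = signal + noise_in_target

h, X, y = fit_fir_wiener(witness, target, n_taps=3)
cleaned = y - X @ h                          # target with the predicted noise subtracted
```

    In this toy setting the fitted taps `h` should land close to `true_coupling`, and the variance of `cleaned` should drop towards that of `signal`.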

  9. [Multiple linear regression analysis of X-ray measurement and WOMAC scores of knee osteoarthritis].

    Science.gov (United States)

    Ma, Yu-Feng; Wang, Qing-Fu; Chen, Zhao-Jun; Du, Chun-Lin; Li, Jun-Hai; Huang, Hu; Shi, Zong-Ting; Yin, Yue-Shan; Zhang, Lei; A-Di, Li-Jiang; Dong, Shi-Yu; Wu, Ji

    2012-05-01

    To perform multiple linear regression analysis of X-ray measurements and WOMAC scores of knee osteoarthritis, and to analyze their relationship with clinical and biomechanical concepts. From March 2011 to July 2011, 140 patients (250 knees) were reviewed, including 132 left knees and 118 right knees; patients ranged in age from 40 to 71 years, with an average of 54.68 years. The MB-RULER measurement software was applied to measure the femoral angle, tibial angle, femorotibial angle and joint gap angle from antero-posterior (AP) and lateral X-rays. The WOMAC scores were also collected. Multiple regression equations were then applied for the linear regression analysis of the correlation between the X-ray measurements and WOMAC scores. There was statistical significance in the regression equation of AP X-ray values and WOMAC scores (P<0.05), but not in the regression equation of lateral X-ray values and WOMAC scores (P>0.05). 1) X-ray measurement of the knee joint can reflect the WOMAC scores to a certain extent. 2) It is necessary to measure the X-ray mechanical axis of the knee, which is important for diagnosis and treatment of osteoarthritis. 3) The correlation between tibial angle and joint gap angle on antero-posterior X-ray and WOMAC scores is significant, and can be used to assess the functional recovery of patients before and after treatment.

  10. Neck-focused panic attacks among Cambodian refugees; a logistic and linear regression analysis.

    Science.gov (United States)

    Hinton, Devon E; Chhean, Dara; Pich, Vuth; Um, Khin; Fama, Jeanne M; Pollack, Mark H

    2006-01-01

    Consecutive Cambodian refugees attending a psychiatric clinic were assessed for the presence and severity of current--i.e., at least one episode in the last month--neck-focused panic. Among the whole sample (N=130), in a logistic regression analysis, the Anxiety Sensitivity Index (ASI; odds ratio=3.70) and the Clinician-Administered PTSD Scale (CAPS; odds ratio=2.61) significantly predicted the presence of current neck panic (NP). Among the neck panic patients (N=60), in the linear regression analysis, NP severity was significantly predicted by NP-associated flashbacks (beta=.42), NP-associated catastrophic cognitions (beta=.22), and CAPS score (beta=.28). Further analysis revealed the effect of the CAPS score to be significantly mediated (Sobel test [Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182]) by both NP-associated flashbacks and catastrophic cognitions. In the care of traumatized Cambodian refugees, NP severity, as well as NP-associated flashbacks and catastrophic cognitions, should be specifically assessed and treated.
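
    The Sobel test mentioned in the abstract above has a simple closed form: the indirect effect a*b is divided by its approximate standard error. The sketch below shows that arithmetic only; the path estimates and standard errors are hypothetical placeholders, not values from the study.

```python
import math

def sobel_z(a, se_a, b, se_b):
    """Sobel z statistic for the indirect (mediated) effect a*b."""
    return (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)

def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical path estimates: a = predictor -> mediator, b = mediator -> outcome.
z = sobel_z(a=0.50, se_a=0.10, b=0.40, se_b=0.10)
p = two_sided_p(z)
```

    With these placeholder inputs, z is about 3.12, so the mediated path would be judged significant at conventional levels.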

  11. Spatial Data Analysis: Recommendations for Educational Infrastructure in Sindh

    Directory of Open Access Journals (Sweden)

    Abdul Aziz Ansari

    2017-06-01

    Analysing education infrastructure has become a crucial activity in providing quality teaching and resources to students. Identifying the facilities required to improve current schools and to plan future ones is an important analytical component. This is best achieved through a Geographical Information System (GIS) analysis of the spatial distribution of schools. In this work, we carry out GIS analytics on the rural and urban school distributions in Sindh, Pakistan. Using a reliable dataset collected by an international survey team, GIS analysis is done with respect to: 1) school locations, 2) school facilities (water, sanitation, classrooms, etc.) and 3) students' results. We carry out the analysis at district level and present several spatial results. Correlational analysis of highly influential factors that may affect educational performance generates recommendations for planning and development in weak areas, providing useful insights into effective utilization of resources and new locations for future schools. The time series analysis predicts future results, which may be verified through careful observation and data collection.

  12. Finding determinants of audit delay by pooled OLS regression analysis

    OpenAIRE

    Vuko, Tina; Čular, Marko

    2014-01-01

    The aim of this paper is to investigate determinants of audit delay. Audit delay is measured as the length of time (i.e. the number of calendar days) from the fiscal year-end to the audit report date. It is important to understand factors that influence audit delay since it directly affects the timeliness of financial reporting. The research is conducted on a sample of Croatian listed companies, covering the period of four years (from 2008 to 2011). We use pooled OLS regression analysis, mode...

  13. An improved geographically weighted regression model for PM2.5 concentration estimation in large areas

    Science.gov (United States)

    Zhai, Liang; Li, Shuang; Zou, Bin; Sang, Huiyong; Fang, Xin; Xu, Shan

    2018-05-01

    Considering the spatially non-stationary contributions of environmental variables to PM2.5 variations, the geographically weighted regression (GWR) modeling method has been widely used to estimate PM2.5 concentrations. However, most of the GWR models in reported studies so far were established on predictors screened through preliminary correlation analysis, and this process might omit factors that actually drive PM2.5 variations. This study therefore developed a best subsets regression (BSR) enhanced principal component analysis-GWR (PCA-GWR) modeling approach to estimate PM2.5 concentration by considering the contributions of all potential variables simultaneously. A performance comparison experiment between PCA-GWR and regular GWR was conducted in the Beijing-Tianjin-Hebei (BTH) region over a one-year period. Results indicated that PCA-GWR modeling outperforms regular GWR modeling, with clearly higher adjusted R2 in both model fitting and cross-validation and lower RMSE. Meanwhile, the distribution map of PM2.5 concentration from PCA-GWR modeling also depicts more spatial variation detail than the one from regular GWR modeling. It can be concluded that BSR enhanced PCA-GWR modeling could be a reliable way to estimate air pollution concentrations effectively in the future by including the contributions of all potential predictor variables to PM2.5 variations.
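
    For readers unfamiliar with GWR, the following minimal sketch shows the basic mechanism the abstract above builds on: one weighted least-squares fit per location, with weights that decay with distance. It is plain GWR with a fixed Gaussian kernel on synthetic data, not the BSR enhanced PCA-GWR of the paper; the function name, bandwidth and data-generating process are all assumptions.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Fit one weighted least-squares regression per location (Gaussian kernel)."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])        # add an intercept column
    betas = np.empty((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (d / bandwidth) ** 2)  # nearby observations get more weight
        sw = np.sqrt(w)                          # scaling rows by sqrt(w) gives weighted LS
        betas[i], *_ = np.linalg.lstsq(Xd * sw[:, None], y * sw, rcond=None)
    return betas

rng = np.random.default_rng(1)
n = 400
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x = rng.normal(size=n)
local_slope = 1.0 + 2.0 * coords[:, 0]           # assumed slope that drifts from west to east
y = 0.5 + local_slope * x + 0.05 * rng.normal(size=n)

betas = gwr_coefficients(coords, x, y, bandwidth=0.15)
```

    Column 1 of `betas` holds the local slope estimates; mapping them against `coords` would show the west-to-east drift that a single global regression averages away.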

  14. A Seemingly Unrelated Poisson Regression Model

    OpenAIRE

    King, Gary

    1989-01-01

    This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.

  15. Dose-Dependent Effects of Statins for Patients with Aneurysmal Subarachnoid Hemorrhage: Meta-Regression Analysis.

    Science.gov (United States)

    To, Minh-Son; Prakash, Shivesh; Poonnoose, Santosh I; Bihari, Shailesh

    2018-05-01

    The study uses meta-regression analysis to quantify the dose-dependent effects of statin pharmacotherapy on vasospasm, delayed ischemic neurologic deficits (DIND), and mortality in aneurysmal subarachnoid hemorrhage. Prospective, retrospective observational studies, and randomized controlled trials (RCTs) were retrieved by a systematic database search. Summary estimates were expressed as absolute risk (AR) for a given statin dose or control (placebo). Meta-regression using inverse variance weighting and robust variance estimation was performed to assess the effect of statin dose on transformed AR in a random effects model. Dose-dependence of predicted AR with 95% confidence interval (CI) was recovered by using Miller's Freeman-Tukey inverse. The database search and study selection criteria yielded 18 studies (2594 patients) for analysis. These included 12 RCTs, 4 retrospective observational studies, and 2 prospective observational studies. Twelve studies investigated simvastatin, whereas the remaining studies investigated atorvastatin, pravastatin, or pitavastatin, with simvastatin-equivalent doses ranging from 20 to 80 mg. Meta-regression revealed dose-dependent reductions in Freeman-Tukey-transformed AR of vasospasm (slope coefficient -0.00404, 95% CI -0.00720 to -0.00087; P = 0.0321), DIND (slope coefficient -0.00316, 95% CI -0.00586 to -0.00047; P = 0.0392), and mortality (slope coefficient -0.00345, 95% CI -0.00623 to -0.00067; P = 0.0352). The present meta-regression provides weak evidence for dose-dependent reductions in vasospasm, DIND and mortality associated with acute statin use after aneurysmal subarachnoid hemorrhage. However, the analysis was limited by substantial heterogeneity among individual studies. Greater dosing strategies are a potential consideration for future RCTs. Copyright © 2018 Elsevier Inc. All rights reserved.
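
    The abstract above works on Freeman-Tukey-transformed risks and maps predictions back with Miller's inverse. The sketch below shows the two formulas as commonly stated (the sum form of the double-arcsine transform and Miller's 1978 back-transform); the event count and sample size are hypothetical, and in a meta-analysis the `n` passed to the inverse is usually a summary sample size such as the harmonic mean, which is an assumption here rather than something the abstract states.

```python
import math

def freeman_tukey(x, n):
    """Double-arcsine transform of x events out of n (sum form)."""
    return math.asin(math.sqrt(x / (n + 1))) + math.asin(math.sqrt((x + 1) / (n + 1)))

def miller_inverse(t, n):
    """Miller's back-transform from the double-arcsine scale to a proportion."""
    s = math.sin(t)
    inner = s + (s - 1.0 / s) / n
    return 0.5 * (1.0 - math.copysign(1.0, math.cos(t)) * math.sqrt(1.0 - inner**2))

# Hypothetical study arm: 10 events among 100 patients.
t = freeman_tukey(10, 100)
p = miller_inverse(t, 100)
```

    Round-tripping a single proportion like this should return a value very close to the raw risk of 0.10, which is a quick sanity check before applying the inverse to model predictions.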

  16. Quantile regression theory and applications

    CERN Document Server

    Davino, Cristina; Vistocco, Domenico

    2013-01-01

    A guide to the implementation and interpretation of Quantile Regression models. This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensive description of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and

  17. Statistical analysis of long term spatial and temporal trends of ...

    Indian Academy of Sciences (India)

    Statistical analysis of long term spatial and temporal trends of temperature ... CGCM3; HadCM3; modified Mann–Kendall test; statistical analysis; Sutlej basin. ... Water Resources Systems Division, National Institute of Hydrology, Roorkee 247 ...

  18. The Role of Visual-Spatial Abilities in Dyslexia: Age Differences in Children's Reading?

    Science.gov (United States)

    Giovagnoli, Giulia; Vicari, Stefano; Tomassetti, Serena; Menghini, Deny

    2016-01-01

    Reading is a highly complex process in which integrative neurocognitive functions are required. Visual-spatial abilities play a pivotal role because of the multi-faceted visual sensory processing involved in reading. Several studies show that children with developmental dyslexia (DD) fail to develop effective visual strategies and that some reading difficulties are linked to visual-spatial deficits. However, the relationship between visual-spatial skills and reading abilities is still a controversial issue. Crucially, the role that age plays has not been investigated in depth in this population, and it is still not clear if visual-spatial abilities differ across educational stages in DD. The aim of the present study was to investigate visual-spatial abilities in children with DD and in age-matched normal readers (NR) according to different educational stages: in children attending primary school and in children and adolescents attending secondary school. Moreover, in order to verify whether visual-spatial measures could predict reading performance, a regression analysis has been performed in younger and older children. The results showed that younger children with DD performed significantly worse than NR in a mental rotation task, a more-local visual-spatial task, a more-global visual-perceptual task and a visual-motor integration task. However, older children with DD showed deficits in the more-global visual-perceptual task, in a mental rotation task and in a visual attention task. In younger children, the regression analysis documented that reading abilities are predicted by the visual-motor integration task, while in older children only the more-global visual-perceptual task predicted reading performances. Present findings showed that visual-spatial deficits in children with DD were age-dependent and that visual-spatial abilities engaged in reading varied across different educational stages. 
In order to better understand their potential role in affecting reading

  19. Subpixel Snow Cover Mapping from MODIS Data by Nonparametric Regression Splines

    Science.gov (United States)

    Akyurek, Z.; Kuter, S.; Weber, G. W.

    2016-12-01

    The spatial extent of snow cover is often considered one of the key parameters in climatological, hydrological and ecological modeling due to its energy storage, high reflectance in the visible and NIR regions of the electromagnetic spectrum, significant heat capacity and insulating properties. A significant challenge in snow mapping by remote sensing (RS) is the trade-off between the temporal and spatial resolution of satellite imagery. To tackle this issue, machine learning-based methods for subpixel snow mapping from low or moderate resolution images, such as Artificial Neural Networks (ANNs), have been proposed. Multivariate Adaptive Regression Splines (MARS) is a nonparametric regression tool that can build flexible models for high-dimensional and complex nonlinear data. Although MARS is not often employed in RS, it has various successful implementations, such as estimation of vertical total electron content in the ionosphere, atmospheric correction and classification of satellite images. This study is the first attempt in RS to evaluate the applicability of MARS for subpixel snow cover mapping from MODIS data. A total of 16 MODIS-Landsat ETM+ image pairs taken over the European Alps between March 2000 and April 2003 were used in the study. MODIS top-of-atmosphere reflectance, NDSI, NDVI and land cover classes were used as predictor variables. Cloud-covered, cloud shadow, water and bad-quality pixels were excluded from further analysis by a spatial mask. MARS models were trained and validated using reference fractional snow cover (FSC) maps generated from higher spatial resolution Landsat ETM+ binary snow cover maps. A multilayer feed-forward ANN with one hidden layer trained with backpropagation was also developed. The MARS and ANN models were compared on independent test areas. 
The MARS model performed better than the ANN model with an average RMSE of 0.1288 over the independent test areas; whereas the average RMSE of the ANN model

  20. Correlation analysis of lung cancer and urban spatial factor: based on survey in Shanghai.

    Science.gov (United States)

    Wang, Lan; Zhao, Xiaojing; Xu, Wangyue; Tang, Jian; Jiang, Xiji

    2016-09-01

    The density of particulate matter (PM) in mega-cities in China such as Beijing and Shanghai has exceeded basic standards for health in recent years. Human exposure to PM has been identified as a traceable and controllable factor among the many complicated risk factors for lung cancer. While improving air quality needs tremendous effort and time, some change in PM density might be achieved by adjusting the built environment. It has also been shown that the urban built environment is directly relevant to respiratory disease. Studies have separately explored the indoor and outdoor factors in respiratory diseases. More comprehensive spatial factors need to be analyzed to understand the cumulative effect of the built environment on the respiratory system. This interdisciplinary study examines the impact of both indoor (including age of housing, interval after decoration, indoor humidity, etc.) and outdoor spatial factors (including density, parking, green spaces, etc.) on lung cancer. A survey of lung cancer patients and a control group was conducted in 2014 and 2015. A total of 472 interviewees were randomly selected from a pool of local residents who had resided in Shanghai for more than 5 years. Data collected include their socio-demographic factors, lifestyle factors, and external and internal residential area factors. Regression models were established based on the collected data to analyze the associations between lung cancer and urban spatial factors. The regression models show that lung cancer is significantly associated with a number of spatial factors. 
Significant outdoor spatial factors include external traffic volume (P=0.003), main plant type (P=0.035 for trees) of internal green space, internal water body (P=0.027) and land use of surrounding blocks (P=0.005 for residential areas of 7-9 floors, P=0.000 for residential areas of 4-6 floors, P=0.006 for business/commercial areas over 10 floors, P=0.005 for business/commercial areas of

  1. Image Chunking: Defining Spatial Building Blocks for Scene Analysis.

    Science.gov (United States)

    1987-04-01

    Technical Report 980. Image Chunking: Defining Spatial Building Blocks for Scene Analysis. James V. Mahoney, Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge.

  2. Logistic Regression and Path Analysis Method to Analyze Factors influencing Students’ Achievement

    Science.gov (United States)

    Noeryanti, N.; Suryowati, K.; Setyawan, Y.; Aulia, R. R.

    2018-04-01

    Students' academic achievement cannot be separated from the influence of two kinds of factors, namely internal and external factors. The internal factors consist of intelligence (X1), health (X2), interest (X3), and motivation (X4). The external factors consist of family environment (X5), school environment (X6), and society environment (X7). The subjects of this research are eighth-grade students of the 2016/2017 school year at SMPN 1 Jiwan Madiun, selected by simple random sampling. Primary data were obtained by distributing questionnaires. The method used in this study is binary logistic regression analysis, which aims to identify the internal and external factors that affect students' achievement and the direction of their effects. Path analysis was used to determine the factors that influence students' achievement directly, indirectly or in total. Based on the results of the binary logistic regression, the variables that affect students' achievement are interest and motivation. Based on the results of the path analysis, the factors that have a direct impact on students' achievement are students' interest (59%) and students' motivation (27%), while the factors that have an indirect influence on students' achievement are family environment (97%) and school environment (37).

  3. Spatial analysis methods and land-use planning models for rural areas

    Directory of Open Access Journals (Sweden)

    Patrizia Tassinari

    2009-10-01

    The work presents a brief report of the main results of a study carried out by the Spatial Engineering Division of the Department of Agricultural Economics and Engineering of the University of Bologna, within a broader PRIN 2005 research project concerning landscape and economic analysis, planning and programming. In particular, the study focuses on the design of spatial analysis methods aimed at building knowledge frameworks of the various natural and anthropic resources of rural areas. The goal is to increase the level of spatial and information detail of common databases, thus allowing higher accuracy and effectiveness of the analyses needed to achieve the goals of new-generation spatial and agricultural planning. Specific in-depth analyses made it possible to define techniques that limit the increase in survey costs. Moreover, the work reports the main results regarding a multicriteria model for the analysis of the countryside defined by the research. This model aims to assess the various agricultural, environmental and landscape features, vocations, expressions and attitudes, and to support the definition and implementation of specific and targeted planning and programming policies.

  4. Beyond the mean estimate: a quantile regression analysis of inequalities in educational outcomes using INVALSI survey data

    Directory of Open Access Journals (Sweden)

    Antonella Costanzo

    2017-09-01

    The number of studies addressing issues of inequality in educational outcomes using cognitive achievement tests and variables from large-scale assessment data has increased. Here the value of using a quantile regression approach is compared with a classical regression analysis approach to study the relationships between educational outcomes and likely predictor variables. Italian primary school data from INVALSI large-scale assessments were analyzed using both quantile and standard regression approaches. Mathematics and reading scores were regressed on students' characteristics and geographical variables selected for their theoretical and policy relevance. The results demonstrated that, in Italy, the role of gender and immigrant status varied across the entire conditional distribution of students’ performance. Analogous results emerged pertaining to the difference in students’ performance across Italian geographic areas. These findings suggest that quantile regression analysis is a useful tool to explore the determinants and mechanisms of inequality in educational outcomes. A proper interpretation of quantile estimates may enable teachers to identify effective learning activities and help policymakers to develop tailored programs that increase equity in education.
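
    The contrast with mean regression drawn in the abstract above rests on the check (pinball) loss: minimising it at level tau targets the tau-th quantile rather than the mean. The sketch below demonstrates this for the simplest case of a constant prediction on synthetic, skewed data; the data and the grid search are purely illustrative and unrelated to the INVALSI analysis.

```python
import numpy as np

def pinball_loss(y, q_hat, tau):
    """Average check (pinball) loss of a constant prediction q_hat at quantile level tau."""
    r = y - q_hat
    return np.mean(np.where(r >= 0, tau * r, (tau - 1.0) * r))

rng = np.random.default_rng(2)
y = rng.exponential(scale=1.0, size=2000)        # skewed synthetic scores, purely illustrative
grid = np.linspace(0.0, 5.0, 501)                # candidate constant predictions

# For each quantile level, pick the grid value with the smallest pinball loss.
best = {tau: grid[np.argmin([pinball_loss(y, g, tau) for g in grid])]
        for tau in (0.1, 0.5, 0.9)}
```

    Each entry of `best` should sit within grid resolution of the corresponding sample quantile, for example `np.quantile(y, 0.9)` for tau = 0.9. A full quantile regression replaces the constant with a linear predictor and minimises the same loss.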

  5. Weighted functional linear regression models for gene-based association analysis.

    Science.gov (United States)

    Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I

    2018-01-01

    Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18 × 10^-6), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.

  6. Data analysis and approximate models model choice, location-scale, analysis of variance, nonparametric regression and image analysis

    CERN Document Server

    Davies, Patrick Laurie

    2014-01-01

    Introduction Introduction Approximate Models Notation Two Modes of Statistical Analysis Towards One Mode of Analysis Approximation, Randomness, Chaos, Determinism Approximation A Concept of Approximation Approximation Approximating a Data Set by a Model Approximation Regions Functionals and Equivariance Regularization and Optimality Metrics and Discrepancies Strong and Weak Topologies On Being (almost) Honest Simulations and Tables Degree of Approximation and p-values Scales Stability of Analysis The Choice of En(α, P) Independence Procedures, Approximation and Vagueness Discrete Models The Empirical Density Metrics and Discrepancies The Total Variation Metric The Kullback-Leibler and Chi-Squared Discrepancies The Po(λ) Model The b(k, p) and nb(k, p) Models The Flying Bomb Data The Student Study Times Data Outliers Outliers, Data Analysis and Models Breakdown Points and Equivariance Identifying Outliers and Breakdown Outliers in Multivariate Data Outliers in Linear Regression Outliers in Structured Data The Location...

  7. [Inequities in health: socio-demographic and spatial analysis of breast cancer in women from Córdoba, Argentina].

    Science.gov (United States)

    Tumas, Natalia; Pou, Sonia Alejandra; Díaz, María Del Pilar

    To identify sociodemographic determinants associated with the spatial distribution of breast cancer incidence in the province of Córdoba, Argentina, in order to reveal underlying social inequities. An ecological study was conducted in Córdoba (26 counties as geographical units of analysis). The spatial autocorrelation of the crude and standardised incidence rates of breast cancer, and of the sociodemographic indicators of urbanization, fertility and population ageing, was estimated using Moran's index. These variables were entered into a Geographic Information System for mapping. Poisson multilevel regression models were fitted, with the breast cancer incidence rates as the response variable, selected sociodemographic indicators as covariables, and the percentage of households with unmet basic needs as an adjustment variable. In Córdoba, Argentina, a non-random pattern in the spatial distribution of breast cancer incidence rates and in certain sociodemographic indicators was found. The mean increase in annual urban population was inversely associated with breast cancer, whereas the proportion of households with unmet basic needs was directly associated with this cancer. Our results define social inequity scenarios that partially explain the geographical differentials in the breast cancer burden in Córdoba, Argentina. Women residing in socioeconomically disadvantaged households and in less urbanized areas merit special attention in future studies and in breast cancer public health activities. Copyright © 2017 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
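
    Moran's index, used in the abstract above to test for non-random spatial patterns, can be computed directly from a vector of area values and a spatial weight matrix. The sketch below uses a hypothetical 4x4 grid of areas with shared-edge (rook) adjacency and two artificial patterns; it does not use the Córdoba data or its county adjacency.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for values at n sites and an n-by-n spatial weight matrix."""
    z = values - values.mean()
    n = len(values)
    return (n / weights.sum()) * (z @ weights @ z) / (z @ z)

# Hypothetical 4x4 grid of areas with rook (shared-edge) adjacency.
side = 4
n = side * side
W = np.zeros((n, n))
for r in range(side):
    for c in range(side):
        i = r * side + c
        if c + 1 < side:                         # neighbour to the right
            W[i, i + 1] = W[i + 1, i] = 1.0
        if r + 1 < side:                         # neighbour below
            W[i, i + side] = W[i + side, i] = 1.0

gradient = np.repeat(np.arange(side, dtype=float), side)   # rates rising row by row
checker = np.array([(r + c) % 2 for r in range(side) for c in range(side)], dtype=float)

i_gradient = morans_i(gradient, W)
i_checker = morans_i(checker, W)
```

    The smooth gradient gives a clearly positive index (2/3 on this grid), while the checkerboard, where every neighbour differs, gives -1. Values near zero would indicate no spatial structure.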

  8. Downscaling soil moisture over East Asia through multi-sensor data fusion and optimization of regression trees

    Science.gov (United States)

    Park, Seonyoung; Im, Jungho; Park, Sumin; Rhee, Jinyoung

    2017-04-01

    Soil moisture is one of the most important keys for understanding regional and global climate systems. It is directly related to agricultural as well as hydrological processes because it strongly influences vegetation growth and determines water supply in the agroecosystem. Accurate monitoring of the spatiotemporal pattern of soil moisture is therefore important. Soil moisture has generally been provided through in situ measurements at stations. Although in situ field surveys provide accurate soil moisture with high temporal resolution, they are costly and do not provide the spatial distribution of soil moisture over large areas. Approaches based on microwave satellites (e.g., the Advanced Microwave Scanning Radiometer on the Earth Observing System (AMSR2), the Advanced Scatterometer (ASCAT), and Soil Moisture Active Passive (SMAP)) and numerical models such as the Global Land Data Assimilation System (GLDAS) and the Modern-Era Retrospective Analysis for Research and Applications (MERRA) provide spatiotemporally continuous soil moisture products at global scale. However, since those global soil moisture products have coarse spatial resolution (25-40 km), their applications for agriculture and water resources at local and regional scales are very limited. Thus, soil moisture downscaling is needed to overcome the limited spatial resolution of soil moisture products. In this study, GLDAS soil moisture data were downscaled to 1 km spatial resolution through the integration of AMSR2 and ASCAT soil moisture data, the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM), and Moderate Resolution Imaging Spectroradiometer (MODIS) data (Land Surface Temperature, Normalized Difference Vegetation Index, and land cover) using modified regression trees over East Asia from 2013 to 2015. Modified regression trees were implemented using Cubist, a commercial software tool based on machine learning. An

  9. Regression analysis of mixed panel count data with dependent terminal events.

    Science.gov (United States)

    Yu, Guanglei; Zhu, Liang; Li, Yang; Sun, Jianguo; Robison, Leslie L

    2017-05-10

    Event history studies are commonly conducted in many fields, and a great deal of literature has been established for the analysis of the two types of data commonly arising from these studies: recurrent event data and panel count data. The former arises if all study subjects are followed continuously, while the latter means that each study subject is observed only at discrete time points. In reality, a third type of data, a mixture of the two types of the data earlier, may occur and furthermore, as with the first two types of the data, there may exist a dependent terminal event, which may preclude the occurrences of recurrent events of interest. This paper discusses regression analysis of mixed recurrent event and panel count data in the presence of a terminal event and an estimating equation-based approach is proposed for estimation of regression parameters of interest. In addition, the asymptotic properties of the proposed estimator are established, and a simulation study conducted to assess the finite-sample performance of the proposed method suggests that it works well in practical situations. Finally, the methodology is applied to a childhood cancer study that motivated this study. Copyright © 2017 John Wiley & Sons, Ltd.

  10. Capacity analysis of spectrum sharing spatial multiplexing MIMO systems

    KAUST Repository

    Yang, Liang

    2014-12-01

    This paper considers a spectrum sharing (SS) multiple-input multiple-output (MIMO) system operating in a Rayleigh fading environment. First the capacity of a single-user SS spatial multiplexing system is investigated in two scenarios that assume different receivers. To explicitly show the capacity scaling law of SS MIMO systems, some approximate capacity expressions for the two scenarios are derived. Next, we extend our analysis to a multiple user system with zero-forcing receivers (ZF) under spatially-independent scheduling and analyze the sum-rate. Furthermore, we provide an asymptotic sum-rate analysis to investigate the effects of different parameters on the multiuser diversity gain. Our results show that the secondary system with a smaller number of transmit antennas Nt and a larger number of receive antennas Nr can achieve higher capacity at lower interference temperature Q, but at high Q the capacity follows the scaling law of the conventional MIMO systems. However, for a ZF SS spatial multiplexing system, the secondary system with small Nt and large Nr can achieve the highest capacity throughout the entire region of Q. For a ZF SS spatial multiplexing system with scheduling, the asymptotic sum-rate scales like Ntlog2(Q(KNtNp-1)/Nt), where Np denotes the number of antennas of the primary receiver and K represents the number of secondary transmitters.

  11. Mortality and Case Fatality Due to Visceral Leishmaniasis in Brazil: A Nationwide Analysis of Epidemiology, Trends and Spatial Patterns

    Science.gov (United States)

    Martins-Melo, Francisco Rogerlândio; Lima, Mauricélia da Silveira; Ramos, Alberto Novaes; Alencar, Carlos Henrique; Heukelbach, Jorg

    2014-01-01

    Background Visceral leishmaniasis (VL) is a significant public health problem in Brazil and several regions of the world. This study investigated the magnitude, temporal trends and spatial distribution of mortality related to VL in Brazil. Methods We performed a study based on secondary data obtained from the Brazilian Mortality Information System. We included all deaths in Brazil from 2000 to 2011, in which VL was recorded as cause of death. We present epidemiological characteristics, trend analysis of mortality and case fatality rates by joinpoint regression models, and spatial analysis using municipalities as geographical units of analysis. Results In the study period, 12,491,280 deaths were recorded in Brazil. VL was mentioned in 3,322 (0.03%) deaths. Average annual age-adjusted mortality rate was 0.15 deaths per 100,000 inhabitants and case fatality rate 8.1%. Highest mortality rates were observed in males (0.19 deaths/100,000 inhabitants), Brazil over the period, with different patterns between regions: increasing mortality rates in the North (Annual Percent Change – APC: 9.4%; 95% confidence interval – CI: 5.3 to 13.6), and Southeast (APC: 8.1%; 95% CI: 2.6 to 13.9); and increasing case fatality rates in the Northeast (APC: 4.0%; 95% CI: 0.8 to 7.4). Spatial analysis identified a major cluster of high mortality encompassing a wide geographic range in North and Northeast Brazil. Conclusions Despite ongoing control strategies, mortality related to VL in Brazil is increasing. Mortality and case fatality vary considerably between regions, and surveillance and control measures should be prioritized in high-risk clusters. Early diagnosis and treatment are fundamental strategies for reducing case fatality of VL in Brazil. PMID:24699517
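
    The Annual Percent Change (APC) values quoted in this abstract come from joinpoint regression, which fits log-linear trend segments to the rates. As a minimal sketch of the arithmetic for a single segment only (no joinpoint search, no confidence interval), the slope b of ln(rate) on year converts to APC = (exp(b) - 1) x 100. The function name and the yearly rates below are hypothetical and are not taken from the study.

```python
import math


def annual_percent_change(years, rates):
    """Fit ln(rate) = a + b * year by ordinary least squares.

    Returns the APC in percent, (exp(b) - 1) * 100. This covers one
    log-linear segment only; it does not search for joinpoints.
    """
    n = len(years)
    log_rates = [math.log(r) for r in rates]
    mean_x = sum(years) / n
    mean_y = sum(log_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, log_rates))
    slope = sxy / sxx
    return (math.exp(slope) - 1.0) * 100.0


# Hypothetical series: a rate that grows by exactly 5% per year.
years = list(range(2000, 2012))
rates = [0.10 * 1.05 ** (y - 2000) for y in years]
apc = annual_percent_change(years, rates)
```

    Because the hypothetical series is exactly log-linear, the recovered APC matches the 5% growth used to build it; real mortality series would scatter around the fitted line.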

  12. Spatial variability in levels of benzene, formaldehyde, and total benzene, toluene, ethylbenzene and xylenes in New York City: a land-use regression study.

    Science.gov (United States)

    Kheirbek, Iyad; Johnson, Sarah; Ross, Zev; Pezeshki, Grant; Ito, Kazuhiko; Eisl, Holger; Matte, Thomas

    2012-07-31

    Hazardous air pollutant exposures are common in urban areas contributing to increased risk of cancer and other adverse health outcomes. While recent analyses indicate that New York City residents experience significantly higher cancer risks attributable to hazardous air pollutant exposures than the United States as a whole, limited data exist to assess intra-urban variability in air toxics exposures. To assess intra-urban spatial variability in exposures to common hazardous air pollutants, street-level air sampling for volatile organic compounds and aldehydes was conducted at 70 sites throughout New York City during the spring of 2011. Land-use regression models were developed using a subset of 59 sites and validated against the remaining 11 sites to describe the relationship between concentrations of benzene, total BTEX (benzene, toluene, ethylbenzene, xylenes) and formaldehyde to indicators of local sources, adjusting for temporal variation. Total BTEX levels exhibited the most spatial variability, followed by benzene and formaldehyde (coefficient of variation of temporally adjusted measurements of 0.57, 0.35, 0.22, respectively). Total roadway length within 100 m, traffic signal density within 400 m of monitoring sites, and an indicator of temporal variation explained 65% of the total variability in benzene while 70% of the total variability in BTEX was accounted for by traffic signal density within 450 m, density of permitted solvent-use industries within 500 m, and an indicator of temporal variation. Measures of temporal variation, traffic signal density within 400 m, road length within 100 m, and interior building area within 100 m (indicator of heating fuel combustion) predicted 83% of the total variability of formaldehyde. The models built with the modeling subset were found to predict concentrations well, predicting 62% to 68% of monitored values at validation sites. 
    Traffic and point source emissions cause substantial variation in street-level exposures

  13. CUSUM-Logistic Regression analysis for the rapid detection of errors in clinical laboratory test results.

    Science.gov (United States)

    Sampson, Maureen L; Gounden, Verena; van Deventer, Hendrik E; Remaley, Alan T

    2016-02-01

    The main drawback of the periodic analysis of quality control (QC) material is that test performance is not monitored in time periods between QC analyses, potentially leading to the reporting of faulty test results. The objective of this study was to develop a patient based QC procedure for the more timely detection of test errors. Results from a Chem-14 panel measured on the Beckman LX20 analyzer were used to develop the model. Each test result was predicted from the other 13 members of the panel by multiple regression, which resulted in correlation coefficients between the predicted and measured result of >0.7 for 8 of the 14 tests. A logistic regression model, which utilized the measured test result, the predicted test result, the day of the week and time of day, was then developed for predicting test errors. The output of the logistic regression was tallied by a daily CUSUM approach and used to predict test errors, with a fixed specificity of 90%. The mean average run length (ARL) before error detection by CUSUM-Logistic Regression (CSLR) was 20 with a mean sensitivity of 97%, which was considerably shorter than the mean ARL of 53 (sensitivity 87.5%) for a simple prediction model that only used the measured result for error detection. A CUSUM-Logistic Regression analysis of patient laboratory data can be an effective approach for the rapid and sensitive detection of clinical laboratory errors. Published by Elsevier Inc.
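
    The tallying step described above can be pictured with a one-sided CUSUM over the per-result error probabilities produced by a logistic model. The sketch below is an assumption about the general shape of such a scheme, not the authors' procedure: the reference value, the alarm threshold, and the probabilities are all hypothetical.

```python
def cusum_alarm_index(probabilities, reference=0.5, threshold=1.0):
    """One-sided CUSUM over predicted error probabilities.

    The running sum grows when a probability exceeds `reference`,
    is clamped at zero otherwise, and triggers an alarm once it
    exceeds `threshold`. Returns the index of the first alarm, or
    None if no alarm is raised. Both tuning values are hypothetical.
    """
    running = 0.0
    for index, p in enumerate(probabilities):
        running = max(0.0, running + (p - reference))
        if running > threshold:
            return index
    return None


# Hypothetical logistic outputs: five in-control results, then a shift.
in_control = [0.25, 0.125, 0.25, 0.375, 0.125]
shifted = in_control + [0.75] * 10

no_alarm = cusum_alarm_index(in_control)
first_alarm = cusum_alarm_index(shifted)
```

    The number of results processed before the alarm corresponds to the run length that the abstract averages into the ARL; a lower threshold shortens it at the cost of more false alarms.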

  14. Violence in public transportation: an approach based on spatial analysis.

    Science.gov (United States)

    Sousa, Daiane Castro Bittencourt de; Pitombo, Cira Souza; Rocha, Samille Santos; Salgueiro, Ana Rita; Delgado, Juan Pedro Moreno

    2017-12-11

    To carry out a spatial analysis of the occurrence of acts of violence (specifically robberies) in public transportation, identifying the regions of greater incidence, using geostatistics, and possible causes with the aid of a multicriteria analysis in the Geographic Information System. The unit of analysis is the traffic analysis zone of the survey named Origem-Destino, carried out in Salvador, state of Bahia, in 2013. The robberies recorded by the Department of Public Security of Bahia in 2013 were located and made compatible with the limits of the traffic analysis zones and, later, associated with the respective centroids. After determining the regions with the highest probability of robbery, we carried out a geographic analysis of the possible causes in the region with the highest robbery potential, considering the factors analyzed using a multicriteria analysis in a Geographic Information System environment. The execution of the two steps of this study allowed us to identify areas corresponding to the greater probability of occurrence of robberies in public transportation. In addition, the three most vulnerable road sections (Estrada da Liberdade, Rua Pero Vaz, and Avenida General San Martin) were identified in these areas. In these sections, the factors that contribute most to the potential for robbery in buses are: F1 - proximity to places that facilitate escape, F3 - great movement of persons, and F2 - absence of policing, respectively. Indicator Kriging (geostatistical estimation) can be used to construct a spatial probability surface, which can be a useful tool for the implementation of public policies. The multicriteria analysis in the Geographic Information System environment allowed us to understand the spatial factors related to the phenomenon under analysis.

  15. A GIS-based disaggregate spatial watershed analysis using RADAR data

    International Nuclear Information System (INIS)

    Al-Hamdan, M.

    2002-01-01

    Hydrology is the study of water in all its forms, origins, and destinations on the earth. This paper develops a novel modeling technique using a geographic information system (GIS) to facilitate watershed hydrological routing using RADAR data. The RADAR rainfall data, segmented to 4 km by 4 km blocks, divides the watershed into several sub basins which are modeled independently. A case study for the GIS-based disaggregate spatial watershed analysis using RADAR data is provided for South Fork Cowikee Creek near Batesville, Alabama. All the data necessary to complete the analysis is maintained in the ArcView GIS software. This paper concludes that the GIS-Based disaggregate spatial watershed analysis using RADAR data is a viable method to calculate hydrological routing for large watersheds. (author)

  16. Steganalysis using logistic regression

    Science.gov (United States)

    Lubenko, Ivans; Ker, Andrew D.

    2011-02-01

    We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-the-art 686-dimensional SPAM feature set, in three image sets.

  17. Field evaluation of spatial repellency of metofluthrin-impregnated latticework plastic strips against Aedes aegypti (L.) and analysis of environmental factors affecting its efficacy in My Tho City, Tien Giang, Vietnam.

    Science.gov (United States)

    Kawada, Hitoshi; Iwasaki, Tomonori; LE Loan, Luu; Tien, Tran Khanh; Mai, Nguyen Thi Nhu; Shono, Yoshinori; Katayama, Yasuyuki; Takagi, Masahiro

    2006-12-01

    Spatial repellency of metofluthrin-impregnated polyethylene latticework plastic strips against Aedes aegypti mosquitoes was evaluated. Analysis of environmental factors affecting the efficacy of these strips, such as room temperature, humidity, and house structure, was performed in a residential area in My Tho City, Tien Giang Province, Vietnam. Treatment with the strips at the rate of 1 strip per 2.6-5.52 m² (approximately 600 mg per 2.6-5.52 m²) reduced the collection of Ae. aegypti resting inside the houses for at least eight weeks. Multiple regression analysis indicated that both increase in the average room temperature and decrease in the area of openings in the rooms that were treated with the strips positively affected the spatial repellency of metofluthrin.

  18. Analysis of sparse data in logistic regression in medical research: A newer approach

    Directory of Open Access Journals (Sweden)

    S Devika

    2016-01-01

    Background and Objective: In the analysis of dichotomous type response variable, logistic regression is usually used. However, the performance of logistic regression in the presence of sparse data is questionable. In such a situation, a common problem is the presence of high odds ratios (ORs) with very wide 95% confidence interval (CI) (OR: >999.999, 95% CI: 999.999). In this paper, we addressed this issue by using penalized logistic regression (PLR) method. Materials and Methods: Data from case-control study on hyponatremia and hiccups conducted in Christian Medical College, Vellore, Tamil Nadu, India was used. The outcome variable was the presence/absence of hiccups and the main exposure variable was the status of hyponatremia. Simulation dataset was created with different sample sizes and with a different number of covariates. Results: A total of 23 cases and 50 controls were used for the analysis of ordinary and PLR methods. The main exposure variable hyponatremia was present in nine (39.13%) of the cases and in four (8.0%) of the controls. Of the 23 hiccup cases, all were males and among the controls, 46 (92.0%) were males. Thus, the complete separation between gender and the disease group led to an infinite OR with 95% CI (OR: >999.999, 95% CI: 999.999) whereas there was a finite and consistent regression coefficient for gender (OR: 5.35; 95% CI: 0.42, 816.48) using PLR. After adjusting for all the confounding variables, hyponatremia entailed 7.9 (95% CI: 2.06, 38.86) times higher risk for the development of hiccups as was found using PLR whereas there was an overestimation of risk OR: 10.76 (95% CI: 2.17, 53.41) using the conventional method. Simulation experiment shows that the estimated coverage probability of this method is near the nominal level of 95% even for small sample sizes and for a large number of covariates. 
Conclusions: PLR is almost equal to the ordinary logistic regression when the sample size is large and is superior in small cell
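
    The separation problem in this abstract can be reproduced from the counts it reports: all 23 cases were male, and 46 of the 50 controls were male, so the female-case cell is zero and the ordinary odds ratio is undefined (reported as >999.999). The sketch below shows that, and then applies a simple 0.5 continuity correction to every cell. The correction is a swapped-in illustration of why shrinkage gives a finite estimate; it is not the penalized logistic regression used in the study, and its result should not be expected to match the PLR estimate of 5.35.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls, correction=0.0):
    """Odds ratio from a 2x2 table, with an optional constant added
    to every cell. Returns math.inf when a denominator cell is zero."""
    a = exposed_cases + correction
    b = unexposed_cases + correction
    c = exposed_controls + correction
    d = unexposed_controls + correction
    if b == 0 or c == 0:
        return math.inf
    return (a * d) / (b * c)


# Counts from the abstract: exposure = male gender.
# Cases: 23 male, 0 female. Controls: 46 male, 4 female.
raw_or = odds_ratio(23, 0, 46, 4)
corrected_or = odds_ratio(23, 0, 46, 4, correction=0.5)
```

    With the correction the estimate becomes (23.5 x 4.5) / (0.5 x 46.5), roughly 4.55, which is finite but still rests on a single empty cell, so a wide interval like the one the abstract reports for PLR is to be expected.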

  19. Multiple Linear Regression Analysis of Factors Affecting Real Property Price Index From Case Study Research In Istanbul/Turkey

    Science.gov (United States)

    Denli, H. H.; Koc, Z.

    2015-12-01

    Estimating the value of real properties according to standards is difficult to apply consistently across time and location. Regression analysis constructs mathematical models which describe or explain relationships that may exist between variables. The problem of identifying price differences of properties to obtain a price index can be converted into a regression problem, and standard techniques of regression analysis can be used to estimate the index. Applied to real estate valuation, with properties presented in the real market with their current characteristics and quantifiers, the method helps to find the effective factors or variables in the formation of the value. In this study, prices of housing for sale in Zeytinburnu, a district in Istanbul, are associated with their characteristics to find a price index, based on information received from a real estate web page. The variables used for the analysis are age, size in m², number of floors in the building, floor number of the estate and number of rooms. The price of the estate is the dependent variable, whereas the rest are independent variables. Prices of 60 properties have been used for the analysis. Locations with the same price value have been found and plotted on the map, and equivalence curves have been drawn identifying the same-valued zones as lines.

  20. Prevalence of treponema species detected in endodontic infections: systematic review and meta-regression analysis.

    Science.gov (United States)

    Leite, Fábio R M; Nascimento, Gustavo G; Demarco, Flávio F; Gomes, Brenda P F A; Pucci, Cesar R; Martinho, Frederico C

    2015-05-01

    This systematic review and meta-regression analysis aimed to calculate a combined prevalence estimate and evaluate the prevalence of different Treponema species in primary and secondary endodontic infections, including symptomatic and asymptomatic cases. The MEDLINE/PubMed, Embase, Scielo, Web of Knowledge, and Scopus databases were searched without starting date restriction up to and including March 2014. Only reports in English were included. The selected literature was reviewed by 2 authors and classified as suitable or not to be included in this review. Lists were compared, and, in case of disagreements, decisions were made after a discussion based on inclusion and exclusion criteria. A pooled prevalence of Treponema species in endodontic infections was estimated. Additionally, a meta-regression analysis was performed. Among the 265 articles identified in the initial search, only 51 were included in the final analysis. The studies were classified into 2 different groups according to the type of endodontic infection and whether it was an exclusively primary/secondary study (n = 36) or a primary/secondary comparison (n = 15). The pooled prevalence of Treponema species was 41.5% (95% confidence interval, 35.9-47.0). In the multivariate model of meta-regression analysis, primary endodontic infections (P apical abscess, symptomatic apical periodontitis (P < .001), and concomitant presence of 2 or more species (P = .028) explained the heterogeneity regarding the prevalence rates of Treponema species. Our findings suggest that Treponema species are important pathogens involved in endodontic infections, particularly in cases of primary and acute infections. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.

  1. The Bayesian group lasso for confounded spatial data

    Science.gov (United States)

    Hefley, Trevor J.; Hooten, Mevin B.; Hanks, Ephraim M.; Russell, Robin E.; Walsh, Daniel P.

    2017-01-01

    Generalized linear mixed models for spatial processes are widely used in applied statistics. In many applications of the spatial generalized linear mixed model (SGLMM), the goal is to obtain inference about regression coefficients while achieving optimal predictive ability. When implementing the SGLMM, multicollinearity among covariates and the spatial random effects can make computation challenging and influence inference. We present a Bayesian group lasso prior with a single tuning parameter that can be chosen to optimize predictive ability of the SGLMM and jointly regularize the regression coefficients and spatial random effect. We implement the group lasso SGLMM using efficient Markov chain Monte Carlo (MCMC) algorithms and demonstrate how multicollinearity among covariates and the spatial random effect can be monitored as a derived quantity. To test our method, we compared several parameterizations of the SGLMM using simulated data and two examples from plant ecology and disease ecology. In all examples, problematic levels of multicollinearity occurred and influenced sampling efficiency and inference. We found that the group lasso prior resulted in roughly twice the effective sample size for MCMC samples of regression coefficients and can have higher and less variable predictive accuracy based on out-of-sample data when compared to the standard SGLMM.

  2. Evaluation of logistic regression models and effect of covariates for case-control study in RNA-Seq analysis.

    Science.gov (United States)

    Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L

    2017-02-06

    Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.

  3. Determinants of developing widened spatial QRS-T angle in HIV-infected individuals

    DEFF Research Database (Denmark)

    Dawood, Farah Z; Roediger, Mollie P; Grandits, Greg

    2014-01-01

    BACKGROUND: A widened electrocardiographic spatial QRS-T angle has been shown to be predictive of cardiovascular disease in HIV-infected individuals. However, determinants and risk factors of developing widened QRS-T angle over time in this population remain unknown. METHODS AND RESULTS: Spatial...... QRS-T angle was automatically measured from standard electrocardiogram of 1444 HIV-infected individuals without baseline widened spatial QRS-T angle from the Strategies for Management of Antiretroviral Therapy [SMART], a clinical trial comparing two antiretroviral treatment strategies [Drug...... Conservation (DC) vs. Viral Suppression (VS)]. Conditional logistic regression analysis was used to examine the association between baseline characteristics and incident widened spatial QRS-T angle (a new angle>93° in males and>74° in females). During 2544 person-years of follow-up, 199 participants developed...

  4. What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

    Science.gov (United States)

    Thomas, Emily H.; Galambos, Nora

    2004-01-01

    To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

  5. Analysis Of Influence Of Spatial Planning On Performance Of Regional Development At Waropen District. Papua Indonesia

    Directory of Open Access Journals (Sweden)

    Suwandi

    2015-08-01

    The various problems in regional spatial planning in Waropen District, Papua, show that the Spatial Planning (RTRW) of Waropen District, Papua, drafted in 2010, has not contributed positively to the settlement of spatial planning problems. This is most likely caused by inconsistency in the spatial planning. This study observed the consistency of spatial planning as well as its relation to regional development performance. The method used to observe the consistency of the preparation of the guided Spatial Planning (RTRW) is the analysis of a comparative table followed by analysis of verbal logic. To determine whether the preparation of the Spatial Planning (RTRW) had paid attention to synergy with the surrounding regions (Inter-Regional Context), a map overlay was conducted, followed by analysis of verbal logic. To determine the performance of regional development, a Principal Components Analysis (PCA) was done. The analysis results showed that inconsistencies in the spatial planning had caused a variety of problems that resulted in decreased performance of regional development. The main problems that should receive more attention are infrastructure development growth, economic growth, the transportation aspect, and new properties.

  6. Ca analysis: an Excel based program for the analysis of intracellular calcium transients including multiple, simultaneous regression analysis.

    Science.gov (United States)

    Greensmith, David J

    2014-01-01

    Here I present an Excel based program for the analysis of intracellular Ca transients recorded using fluorescent indicators. The program can perform all the necessary steps which convert recorded raw voltage changes into meaningful physiological information. The program performs two fundamental processes. (1) It can prepare the raw signal by several methods. (2) It can then be used to analyze the prepared data to provide information such as absolute intracellular Ca levels. Also, the rates of change of Ca can be measured using multiple, simultaneous regression analysis. I demonstrate that this program performs equally well as commercially available software, but has numerous advantages, namely creating a simplified, self-contained analysis workflow. Copyright © 2013 The Author. Published by Elsevier Ireland Ltd. All rights reserved.

  7. Spatial Analysis of Accident Spots Using Weighted Severity Index ...

    African Journals Online (AJOL)

    ADOWIE PERE

    Spatial Analysis of Accident Spots Using Weighted Severity Index (WSI) and ... pedestrians avoiding the use of pedestrian bridges/aid even when they are available. ..... not minding an unforeseen obstruction, miscalculations and wrong break.

  8. Robust Regression and its Application in Financial Data Analysis

    OpenAIRE

    Mansoor Momeni; Mahmoud Dehghan Nayeri; Ali Faal Ghayoumi; Hoda Ghorbani

    2010-01-01

    This research aims to describe the application of robust regression and its advantages over the least squares regression method in analyzing financial data. To do this, the relationship between earnings per share, book value of equity per share and share price as the price model, and earnings per share, annual change of earnings per share and return of stock as the return model, is discussed using both robust and least squares regressions, and finally the outcomes are compared. Comparing the results from th...
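
    The abstract does not say which robust estimator was used, so the following is only a sketch of the general contrast it describes. It compares ordinary least squares with the Theil-Sen estimator (median of pairwise slopes), chosen here because it is short to write; it is an assumption and may differ from the authors' method. The data are hypothetical: a clean line y = 2x + 1 with one gross outlier.

```python
from itertools import combinations
from statistics import median


def ols_fit(xs, ys):
    """Ordinary least squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def theil_sen_fit(xs, ys):
    """Theil-Sen estimator: median of all pairwise slopes, then the
    median residual intercept. Resistant to a minority of outliers."""
    points = list(zip(xs, ys))
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(points, 2)
              if x2 != x1]
    slope = median(slopes)
    intercept = median([y - slope * x for x, y in points])
    return slope, intercept


# Hypothetical data: y = 2x + 1, with the last observation corrupted.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [1, 3, 5, 7, 9, 11, 43]

ols_slope, ols_intercept = ols_fit(xs, ys)
robust_slope, robust_intercept = theil_sen_fit(xs, ys)
```

    On this toy series the single outlier pulls the least squares slope well above 2, while the median-based fit stays on the underlying line, which is the kind of advantage the abstract attributes to robust regression on financial data.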

  9. A book review of Spatial data analysis in ecology and agriculture using R

    Science.gov (United States)

    Spatial Data Analysis in Ecology and Agriculture Using R is a valuable resource to assist agricultural and ecological researchers with spatial data analyses using the R statistical software (www.r-project.org). Special emphasis is on spatial data sets; however, the text also provides ample guidance ...

  10. Spatial-temporal data model and fractal analysis of transportation network in GIS environment

    Science.gov (United States)

    Feng, Yongjiu; Tong, Xiaohua; Li, Yangdong

    2008-10-01

    How to organize transportation data characterized by multi-time, multi-scale, multi-resolution and multi-source is one of the fundamental problems of GIS-T development. A spatial-temporal data model for GIS-T is proposed based on the Spatial-temporal-Object Model. Transportation network data is systematically managed using dynamic segmentation technologies. A spatial-temporal database is then built to integrally store multi-time geographical data for transportation. Based on the spatial-temporal database, the spatial analysis functions of GIS-T are substantively extended. A fractal module is developed to improve the analysis of the intensity, density, structure and connectivity of the transportation network based on the validation and evaluation of topologic relations. Integrating fractal analysis with GIS-T strengthens the functions of spatial analysis and enriches the approaches of data mining and knowledge discovery for transportation networks. Finally, the feasibility of the model and methods is tested through the Guangdong Geographical Information Platform for Highway Project.

  11. Support vector methods for survival analysis: a comparison between ranking and regression approaches.

    Science.gov (United States)

    Van Belle, Vanya; Pelckmans, Kristiaan; Van Huffel, Sabine; Suykens, Johan A K

    2011-10-01

    To compare and evaluate ranking, regression and combined machine learning approaches for the analysis of survival data. The literature describes two approaches based on support vector machines to deal with censored observations. In the first approach the key idea is to rephrase the task as a ranking problem via the concordance index, a problem which can be solved efficiently in a context of structural risk minimization and convex optimization techniques. In a second approach, one uses a regression approach, dealing with censoring by means of inequality constraints. The goal of this paper is then twofold: (i) introducing a new model combining the ranking and regression strategy, which retains the link with existing survival models such as the proportional hazards model via transformation models; and (ii) comparison of the three techniques on 6 clinical and 3 high-dimensional datasets and discussing the relevance of these techniques over classical approaches for survival data. We compare svm-based survival models based on ranking constraints, based on regression constraints and models based on both ranking and regression constraints. The performance of the models is compared by means of three different measures: (i) the concordance index, measuring the model's discriminating ability; (ii) the logrank test statistic, indicating whether patients with a prognostic index lower than the median prognostic index have a significantly different survival than patients with a prognostic index higher than the median; and (iii) the hazard ratio after normalization to restrict the prognostic index between 0 and 1. Our results indicate a significantly better performance for models including regression constraints above models only based on ranking constraints. This work gives empirical evidence that svm-based models using regression constraints perform significantly better than svm-based models based on ranking constraints. Our experiments show a comparable performance for methods
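
    The concordance index named above as the first performance measure can be sketched for right-censored data as follows. A pair of subjects is treated as comparable only when the one with the shorter follow-up time had an observed event; the pair is concordant when that subject also has the higher score. The sketch assumes that a higher score means higher risk (the abstract's prognostic index may use the opposite orientation), and the four subjects are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Concordance index for right-censored survival data.

    times: follow-up times; events: 1 if the event was observed,
    0 if censored; risk_scores: higher means higher predicted risk
    (an assumption of this sketch). Ties in score count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Subject i must fail, observed, strictly before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort: the second subject is censored at time 4.
times = [2, 4, 6, 8]
events = [1, 0, 1, 1]
risk_scores = [0.9, 0.3, 0.1, 0.6]
c_index = concordance_index(times, events, risk_scores)
```

    Here four pairs are comparable and three are ordered correctly, so the index is 0.75; a value of 0.5 would correspond to random ordering and 1.0 to perfect discrimination.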

  12. Comparative analysis of time efficiency and spatial resolution between different EIT reconstruction algorithms

    International Nuclear Information System (INIS)

    Kacarska, Marija; Loskovska, Suzana

    2002-01-01

    In this paper comparative analysis between different EIT algorithms is presented. Analysis is made for spatial and temporal resolution of obtained images by several different algorithms. Discussions consider spatial resolution dependent on data acquisition method, too. Obtained results show that conventional applied-current EIT is more powerful compared to induced-current EIT. (Author)

  13. Hemoglobin concentration determination based on near infrared spatially resolved transmission spectra

    Science.gov (United States)

    Zhang, Linna; Li, Gang; Lin, Ling

    2016-10-01

    The spatially resolved diffuse reflectance spectroscopy method has been shown to be more effective than the single-point spectroscopy method in experiments predicting the concentration of diluted Intralipid solutions. However, a diluted Intralipid solution is simple and cannot be taken as representative of turbid liquids. Blood is a natural and meaningful turbid liquid that is more complicated. Hemoglobin is the major constituent of whole blood, and hemoglobin concentration is commonly used in clinical medicine to diagnose many diseases. In this paper, near infrared spatially resolved transmission spectra (NIRSRTS) and Partial Least Square Regression (PLSR) were used to predict the hemoglobin concentration of human blood. The results showed that the prediction ability of the proposed method for hemoglobin concentration is better than that of the single-point transmission spectroscopy method. This paper demonstrated the feasibility of the spatially resolved diffuse reflectance spectroscopy method for practical liquid composition analysis. This research provides a new approach to practical turbid liquid composition analysis.

  14. Quantitative analysis of spatial variability of geotechnical parameters

    Science.gov (United States)

    Fang, Xing

    2018-04-01

    Geotechnical parameters are the basic parameters of geotechnical engineering design, while the geotechnical parameters have strong regional characteristics. At the same time, the spatial variability of geotechnical parameters has been recognized. It is gradually introduced into the reliability analysis of geotechnical engineering. Based on the statistical theory of geostatistical spatial information, the spatial variability of geotechnical parameters is quantitatively analyzed. At the same time, the evaluation of geotechnical parameters and the correlation coefficient between geotechnical parameters are calculated. A residential district of Tianjin Survey Institute was selected as the research object. There are 68 boreholes in this area and 9 layers of mechanical stratification. The parameters are water content, natural gravity, void ratio, liquid limit, plasticity index, liquidity index, compressibility coefficient, compressive modulus, internal friction angle, cohesion and SP index. According to the principle of statistical correlation, the correlation coefficient of geotechnical parameters is calculated. According to the correlation coefficient, the law of geotechnical parameters is obtained.

  15. REMOTE SENSING-BASED DETECTION AND SPATIAL PATTERN ANALYSIS FOR GEO-ECOLOGICAL NICHE MODELING OF TILLANDSIA SPP. IN THE ATACAMA, CHILE

    Directory of Open Access Journals (Sweden)

    N. Wolf

    2016-06-01

    In the coastal Atacama Desert in Northern Chile plant growth is constrained to so-called ‘fog oases’ dominated by monospecific stands of the genus Tillandsia. Adapted to the hyperarid environmental conditions, these plants specialize on the foliar uptake of fog as main water and nutrient source. It is this characteristic that leads to distinctive macro- and micro-scale distribution patterns, reflecting complex geo-ecological gradients, mainly affected by the spatiotemporal occurrence of coastal fog respectively the South Pacific Stratocumulus clouds reaching inlands. The current work employs remote sensing, machine learning and spatial pattern/GIS analysis techniques to acquire detailed information on the presence and state of Tillandsia spp. in the Tarapacá region as a base to better understand the bioclimatic and topographic constraints determining the distribution patterns of Tillandsia spp. Spatial and spectral predictors extracted from WorldView-3 satellite data are used to map present Tillandsia vegetation in the Tarapacá region. Regression models on Vegetation Cover Fraction (VCF) are generated combining satellite-based as well as topographic variables and using aggregated high spatial resolution information on vegetation cover derived from UAV flight campaigns as a reference. The results are a first step towards mapping and modelling the topographic as well as bioclimatic factors explaining the spatial distribution patterns of Tillandsia fog oases in the Atacama, Chile.

  16. Remote Sensing-Based Detection and Spatial Pattern Analysis for Geo-Ecological Niche Modeling of Tillandsia SPP. In the Atacama, Chile

    Science.gov (United States)

    Wolf, N.; Siegmund, A.; del Río, C.; Osses, P.; García, J. L.

    2016-06-01

    In the coastal Atacama Desert in Northern Chile plant growth is constrained to so-called ‘fog oases’ dominated by monospecific stands of the genus Tillandsia. Adapted to the hyperarid environmental conditions, these plants specialize on the foliar uptake of fog as main water and nutrient source. It is this characteristic that leads to distinctive macro- and micro-scale distribution patterns, reflecting complex geo-ecological gradients, mainly affected by the spatiotemporal occurrence of coastal fog respectively the South Pacific Stratocumulus clouds reaching inlands. The current work employs remote sensing, machine learning and spatial pattern/GIS analysis techniques to acquire detailed information on the presence and state of Tillandsia spp. in the Tarapacá region as a base to better understand the bioclimatic and topographic constraints determining the distribution patterns of Tillandsia spp. Spatial and spectral predictors extracted from WorldView-3 satellite data are used to map present Tillandsia vegetation in the Tarapacá region. Regression models on Vegetation Cover Fraction (VCF) are generated combining satellite-based as well as topographic variables and using aggregated high spatial resolution information on vegetation cover derived from UAV flight campaigns as a reference. The results are a first step towards mapping and modelling the topographic as well as bioclimatic factors explaining the spatial distribution patterns of Tillandsia fog oases in the Atacama, Chile.

  17. Backyard housing in Gauteng: An analysis of spatial dynamics

    African Journals Online (AJOL)

    Backyard housing in Gauteng: An analysis of spatial dynamics. Yasmin Shapurjee ... Drawing on quantitative geo-demographic data from GeoTerraImage (GTI). (2010), Knowledge .... a fundamental role in absorbing demand for low-income ...

  18. Regression Analysis for Multivariate Dependent Count Data Using Convolved Gaussian Processes

    OpenAIRE

    Sofro, A'yunin; Shi, Jian Qing; Cao, Chunzheng

    2017-01-01

    Research on Poisson regression analysis for dependent data has developed rapidly in the last decade. One of the difficult problems in the multivariate case is how to construct a cross-correlation structure while ensuring that the covariance matrix is positive definite. To address the issue, we propose to use a convolved Gaussian process (CGP) in this paper. The approach provides a semi-parametric model and offers a natural framework for modeling common mean structure and covarianc...

  19. The Role of Auxiliary Variables in Deterministic and Deterministic-Stochastic Spatial Models of Air Temperature in Poland

    Science.gov (United States)

    Szymanowski, Mariusz; Kryza, Maciej

    2017-02-01

    Our study examines the role of auxiliary variables in the process of spatial modelling and mapping of climatological elements, with air temperature in Poland used as an example. The multivariable algorithms are the most frequently applied for spatialization of air temperature, and their results in many studies are proved to be better in comparison to those obtained by various one-dimensional techniques. In most of the previous studies, two main strategies were used to perform multidimensional spatial interpolation of air temperature. First, it was accepted that all variables significantly correlated with air temperature should be incorporated into the model. Second, it was assumed that the more spatial variation of air temperature was deterministically explained, the better was the quality of spatial interpolation. The main goal of the paper was to examine both above-mentioned assumptions. The analysis was performed using data from 250 meteorological stations and for 69 air temperature cases aggregated on different levels: from daily means to 10-year annual mean. Two cases were considered for detailed analysis. The set of potential auxiliary variables covered 11 environmental predictors of air temperature. Another purpose of the study was to compare the results of interpolation given by various multivariable methods using the same set of explanatory variables. Two regression models: multiple linear (MLR) and geographically weighted (GWR) method, as well as their extensions to the regression-kriging form, MLRK and GWRK, respectively, were examined. Stepwise regression was used to select variables for the individual models and the cross-validation method was used to validate the results with a special attention paid to statistically significant improvement of the model using the mean absolute error (MAE) criterion. The main results of this study led to rejection of both assumptions considered. Usually, including more than two or three of the most significantly

  20. Micro-epidemiology and spatial heterogeneity of P. vivax parasitaemia in riverine communities of the Peruvian Amazon: A multilevel analysis.

    Science.gov (United States)

    Carrasco-Escobar, Gabriel; Gamboa, Dionicia; Castro, Marcia C; Bangdiwala, Shrikant I; Rodriguez, Hugo; Contreras-Mancilla, Juan; Alava, Freddy; Speybroeck, Niko; Lescano, Andres G; Vinetz, Joseph M; Rosas-Aguirre, Angel; Llanos-Cuentas, Alejandro

    2017-08-14

    Malaria has steadily increased in the Peruvian Amazon over the last five years. This study aimed to determine the parasite prevalence and micro-geographical heterogeneity of Plasmodium vivax parasitaemia in communities of the Peruvian Amazon. Four cross-sectional active case detection surveys were conducted between May and July 2015 in four riverine communities in Mazan district. Analysis of 2785 samples of 820 individuals nested within 154 households for Plasmodium parasitaemia was carried out using light microscopy and qPCR. The spatio-temporal distribution of Plasmodium parasitaemia, dominated by P. vivax, was shown to cluster at both household and community levels. Of enrolled individuals, 47% had at least one P. vivax parasitaemia and 10% P. falciparum, by qPCR, both of which were predominantly sub-microscopic and asymptomatic. Spatial analysis detected significant clustering in three communities. Our findings showed that communities at small-to-moderate spatial scales differed in P. vivax parasite prevalence, and multilevel Poisson regression models showed that such differences were influenced by factors such as age, education, and location of households within high-risk clusters, as well as factors linked to a local micro-geographic context, such as travel and occupation. Complex transmission patterns were found to be related to human mobility among communities in the same micro-basin.

  1. Spatial variation of natural radiation and childhood leukaemia incidence in Great Britain

    International Nuclear Information System (INIS)

    Richardson, Sylvia; Monfort, Christine; Green, Martyn; Muirhead, Colin; Draper, Gerald

    1995-01-01

    This paper describes an analysis of the geographical variation of childhood leukaemia incidence in Great Britain over a 15 year period in relation to natural radiation (gamma and radon). Data at the level of the 459 district level local authorities in England, Wales and regional districts in Scotland are analysed in two complementary ways: first, by Poisson regressions with the inclusion of environmental covariates and a smooth spatial structure; secondly, by a hierarchical Bayesian model in which extra-Poisson variability is modelled explicitly in terms of spatial and non-spatial components. From this analysis, we deduce a strong indication that a main part of the variability is accounted for by a local neighbourhood 'clustering' structure. This structure is furthermore relatively stable over the 15 year period for the lymphocytic leukaemias which make up the majority of observed cases. We found no evidence of a positive association of childhood leukaemia incidence with outdoor or indoor gamma radiation levels. There is no consistent evidence of any association with radon levels. Indeed, in the Poisson regressions, a significant positive association was only observed for one 5-year period, a result which is not compatible with a stable environmental effect. Moreover, this positive association became clearly non-significant when over-dispersion relative to the Poisson distribution was taken into account. (author)
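
    The over-dispersion relative to the Poisson distribution discussed in this abstract can be checked with a simple diagnostic. The sketch below uses the Pearson dispersion statistic, a common generic check; it is an assumption for illustration and not the hierarchical Bayesian model used by the authors. The function name and the counts are hypothetical.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    For a well-specified Poisson model the value is close to 1;
    values well above 1 suggest extra-Poisson variability.
    """
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    # Under a Poisson model the variance equals the fitted mean,
    # so each squared residual is scaled by the fitted mean.
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / dof


# Made-up district counts with a constant fitted mean of 3 cases.
counts = [0, 4, 2, 6]
fitted_means = [3.0, 3.0, 3.0, 3.0]
print(pearson_dispersion(counts, fitted_means, n_params=1))
```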

  2. Robust best linear estimation for regression analysis using surrogate and instrumental variables.

    Science.gov (United States)

    Wang, C Y

    2012-04-01

    We investigate methods for regression analysis when covariates are measured with errors. In a subset of the whole cohort, a surrogate variable is available for the true unobserved exposure variable. The surrogate variable satisfies the classical measurement error model, but it may not have repeated measurements. In addition to the surrogate variables that are available among the subjects in the calibration sample, we assume that there is an instrumental variable (IV) that is available for all study subjects. An IV is correlated with the unobserved true exposure variable and hence can be useful in the estimation of the regression coefficients. We propose a robust best linear estimator that uses all the available data, which is the most efficient among a class of consistent estimators. The proposed estimator is shown to be consistent and asymptotically normal under very weak distributional assumptions. For Poisson or linear regression, the proposed estimator is consistent even if the measurement error from the surrogate or IV is heteroscedastic. Finite-sample performance of the proposed estimator is examined and compared with other estimators via intensive simulation studies. The proposed method and other methods are applied to a bladder cancer case-control study.

  3. The effect of occlusion on the semantics of projective spatial terms: a case study in grounding language in perception.

    Science.gov (United States)

    Kelleher, John D; Ross, Robert J; Sloan, Colm; Mac Namee, Brian

    2011-02-01

    Although data-driven spatial template models provide a practical and cognitively motivated mechanism for characterizing spatial term meaning, the influence of perceptual rather than solely geometric and functional properties has yet to be systematically investigated. In the light of this, in this paper, we investigate the effects of the perceptual phenomenon of object occlusion on the semantics of projective terms. We did this by conducting a study to test whether object occlusion had a noticeable effect on the acceptance values assigned to projective terms with respect to a 2.5-dimensional visual stimulus. Based on the data collected, a regression model was constructed and presented. Subsequent analysis showed that the regression model that included the occlusion factor outperformed an adaptation of Regier & Carlson's well-regarded AVS model for that same spatial configuration.

  4. Sensitivity of Microstructural Factors Influencing the Impact Toughness of Hypoeutectoid Steels with Ferrite-Pearlite Structure using Multiple Regression Analysis

    International Nuclear Information System (INIS)

    Lee, Seung-Yong; Lee, Sang-In; Hwang, Byoung-chul

    2016-01-01

    In this study, the effect of microstructural factors on the impact toughness of hypoeutectoid steels with ferrite-pearlite structure was quantitatively investigated using multiple regression analysis. Microstructural analysis results showed that the pearlite fraction increased with increasing austenitizing temperature and decreasing transformation temperature which substantially decreased the pearlite interlamellar spacing and cementite thickness depending on carbon content. The impact toughness of hypoeutectoid steels usually increased as interlamellar spacing or cementite thickness decreased, although the impact toughness was largely associated with pearlite fraction. Based on these results, multiple regression analysis was performed to understand the individual effect of pearlite fraction, interlamellar spacing, and cementite thickness on the impact toughness. The regression analysis results revealed that pearlite fraction significantly affected impact toughness at room temperature, while cementite thickness did at low temperature.

  5. Predicting Insolvency : A comparison between discriminant analysis and logistic regression using principal components

    OpenAIRE

    Geroukis, Asterios; Brorson, Erik

    2014-01-01

    In this study, we compare the two statistical techniques logistic regression and discriminant analysis to see how well they classify companies based on clusters – made from the solvency ratio – using principal components as independent variables. The principal components are made with different financial ratios. We use cluster analysis to find groups with low, medium and high solvency ratio of 1200 different companies found on the NASDAQ stock market and use this as an a priori definition of ...

  6. A REVIEW ON THE USE OF REGRESSION ANALYSIS IN STUDIES OF AUDIT QUALITY

    Directory of Open Access Journals (Sweden)

    Agung Dodit Muliawan

    2015-07-01

    This study aimed to review how regression analysis has been used in studies of an abstract phenomenon such as audit quality, an important concept in auditing practice (Schroeder et al., 1986) that is nevertheless not well defined. The articles reviewed were research articles that include audit quality as a research variable, either as a dependent or an independent variable. The articles were purposefully selected to represent a balanced combination of audit-specific and more general accounting journals and of Anglo Saxon and Anglo American journals. The articles were published between 1983 and 2011 in A/A class journals based on ERA 2010’s classifications. The study found that most of the articles reviewed used multiple regression analysis, treated audit quality as the dependent variable and measured it by using a proxy. This study also highlights the size of the data samples used and the lack of discussion about the assumptions of the statistical analysis used in most of the articles reviewed. This study concluded that the effectiveness and validity of multiple regression depend not only on its application by the researchers but also on how the researchers communicate their findings to the audience. KEYWORDS Audit quality, regression analysis ABSTRACT This study aims to review how regression analysis is used for an abstract phenomenon such as audit quality, a concept that is important in audit practice (Schroeder et al., 1986) but is not yet clearly defined. The articles reviewed in this study are research articles that include audit quality as a research variable, either as an independent or a dependent variable. The articles were selected by purposive sampling to obtain balanced representation between audit-specific and general accounting journals, and to represent Anglo Saxon and Anglo American journals. The articles reviewed were published in the period 1983-2011 by journals that

  7. Logistic regression applied to natural hazards: rare event logistic regression with replications

    OpenAIRE

    Guns, M.; Vanacker, Veerle

    2012-01-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logisti...

  8. Differentiating regressed melanoma from regressed lichenoid keratosis.

    Science.gov (United States)

    Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A

    2017-04-01

    Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  9. Urban Land Expansion and Spatial Dynamics in Globalizing Shanghai

    Directory of Open Access Journals (Sweden)

    Han Li

    2014-12-01

    Urban land expansion in China has attracted considerable scholarly attention. However, more work is needed to apply spatial modeling to understanding the mechanisms of urban growth from both institutional and physical perspectives. This paper analyzes urban expansion in Shanghai and its development zones (DZs). We find that, as nodes of global-local interface, the DZs are the most significant components of urban growth in Shanghai, and major spatial patterns of urban expansion in Shanghai are infilling and edge expansion. We apply logistic regression, geographically weighted logistic regression (GWLR) and spatial regime regression to investigate the determinants of urban land expansion including physical conditions, state policy and land development. Regressions reveal that, though the market has been an important driving force in urban growth, the state has played a predominant role through the implementation of urban planning and the establishment of DZs to fully capitalize on globalization. We also find that differences in urban growth dynamics exist between the areas inside and outside of the DZs. Finally, this paper discusses policies to promote sustainable development in Shanghai.

  10. Approximating prediction uncertainty for random forest regression models

    Science.gov (United States)

    John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne

    2016-01-01

    Machine learning approaches such as random forest have increased for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches there is no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as...

  11. Socio-Economic Predictors and Distribution of Tuberculosis Incidence in Beijing, China: A Study Using a Combination of Spatial Statistics and GIS Technology.

    Science.gov (United States)

    Mahara, Gehendra; Yang, Kun; Chen, Sipeng; Wang, Wei; Guo, Xiuhua

    2018-03-21

    Evidence shows that multiple factors, such as socio-economic status and access to health care facilities, affect tuberculosis (TB) incidence. However, there is limited literature available with respect to the correlation between socio-economic/health facility factors and tuberculosis incidence. This study aimed to explore the relationship between TB incidence and socio-economic/health service predictors in the study settings. A retrospective spatial regression analysis was carried out based on new sputum smear-positive pulmonary TB cases in Beijing districts. Global Moran's I analysis was adopted to detect spatial dependency, after which spatial regression models (a spatial lag model and a spatial error model) and an ordinary least squares model were applied to examine the correlation between TB incidence and predictors. A high incidence of TB was seen in densely populated districts in Beijing, e.g., Haidian, Mentougou, and Xicheng. After comparing the R², log-likelihood, and Akaike information criterion (AIC) values among the three models, the spatial error model (R² = 0.413; Log Likelihood = -591; AIC = 1199.76) was identified as the best-fitting spatial regression model. The study showed that the number of beds in health institutes (p < 0.001) and per capita gross domestic product (GDP) (p = 0.025) had a positive effect on TB incidence, whereas population density (p < 0.001) and migrated population (p < 0.001) had an adverse impact on TB incidence in the study settings. High TB incidence districts were detected in urban and densely populated districts in Beijing. Our findings suggested that socio-economic predictors influence TB incidence. These findings may help to guide TB control programs and promote targeted intervention.
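
    The global Moran's I statistic used in this abstract to detect spatial dependency can be sketched as follows. The example assumes a small hypothetical set of four areas arranged in a line with binary adjacency weights; the weights and values are made up and do not come from the Beijing data.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix.

    weights[i][j] is the spatial weight between areas i and j
    (zero on the diagonal). Positive results indicate that similar
    values cluster in space; negative results indicate alternation.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Cross-products of deviations for every weighted pair of areas.
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    sum_squares = sum(d * d for d in deviations)
    return (n / total_weight) * (cross / sum_squares)


# Four hypothetical areas in a row; each touches its immediate neighbours.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], adjacency))    # smooth gradient
print(morans_i([1, -1, 1, -1], adjacency))  # alternating pattern
```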

  12. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    Science.gov (United States)

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  13. Analysis of Regional Unemployment in Russia and Germany: Spatial-Econometric Approach

    Directory of Open Access Journals (Sweden)

    Elena Vyacheslavovna Semerikova

    2015-06-01

    The study was supported by the Government of the Russian Federation, grant No.11.G34.31.0059. This paper analyzes regional unemployment in Russia and Germany in 2005-2010 and addresses issues of choosing the right specification of spatial-econometric models. The analysis, based on data for 75 Russian and 370 German regions, showed that for Germany the choice of the spatial weighting matrix has a more significant influence on the parameter estimates than for Russia. Presumably this is due to stronger linkages between regional labor markets in Germany compared to Russia. The authors also proposed an algorithm for choosing between spatial matrices and demonstrated the application of this algorithm on simulated Russian data. The authors found that (1) the deviation of the results from the true ones increases when the spatial dependence between regions is higher and (2) the matrix of inverse distances is preferable to the boundary one for the analysis of regional unemployment in Russia (because of the lower value of the mean squared error). The authors are also planning to apply the proposed algorithm to simulated data for Germany. These results allow spatial dependence to be accounted for more correctly when modeling regional unemployment, which is very important for making proper regional policy.
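
    The matrix of inverse distances mentioned in this abstract can be illustrated with a minimal sketch. It assumes planar coordinates for hypothetical region centroids and applies row standardization, a common convention that the abstract does not confirm; the function name and the coordinates are illustrative only.

```python
import math


def inverse_distance_weights(centroids):
    """Row-standardized inverse-distance spatial weight matrix.

    centroids -- list of (x, y) coordinates, assumed distinct.
    Each row sums to 1 and the diagonal is zero, so a spatial lag is a
    distance-weighted average over all other regions.
    """
    n = len(centroids)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                row.append(0.0)  # a region is not its own neighbour
            else:
                dx = centroids[i][0] - centroids[j][0]
                dy = centroids[i][1] - centroids[j][1]
                row.append(1.0 / math.hypot(dx, dy))
        row_sum = sum(row)
        matrix.append([w / row_sum for w in row])
    return matrix


# Three hypothetical regions on a line at x = 0, 1 and 3.
for row in inverse_distance_weights([(0, 0), (1, 0), (3, 0)]):
    print([round(w, 3) for w in row])
```

    A boundary (contiguity) matrix would instead assign a nonzero weight only to regions sharing a border, which is the alternative the abstract compares against.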

  14. Extreme Precipitation Estimation with Typhoon Morakot Using Frequency and Spatial Analysis

    Directory of Open Access Journals (Sweden)

    Hone-Jay Chu

    2011-01-01

    Typhoon Morakot lashed Taiwan and produced copious amounts of precipitation in 2009. From the point of view of hydrological statistics, the impact of the precipitation from Typhoon Morakot can be analyzed and discussed using frequency analysis. The frequency curve, which is fitted mathematically to historical observed data, can be used to estimate the probability of exceedance for runoff events of a certain magnitude. The study integrates frequency analysis and spatial analysis to assess the effect of the Typhoon Morakot event on rainfall frequency in the Gaoping River basin of southern Taiwan. First, extreme rainfall data were collected at sixteen stations for durations of 1, 3, 6, 12, and 24 hours, and then an appropriate probability distribution was selected to analyze the impact of the extreme hydrological event. Spatial rainfall patterns for a return period of 200-yr with 24-hr duration, with and without Typhoon Morakot, are estimated. Results show that, for long durations, the rainfall amount from the frequency analysis differs significantly with and without the event. Furthermore, spatial analysis shows that extreme rainfall for a return period of 200-yr is highly dependent on topography and is smaller in the southwest than in the east. The results not only demonstrate the distinct effect of Typhoon Morakot on frequency analysis, but could also provide a reference for future planning of hydrological engineering.
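
    The link between a return period and an exceedance probability in this abstract can be sketched with a short example. The abstract does not name the probability distribution that was selected, so the sketch assumes a Gumbel distribution fitted by the method of moments purely for illustration; the rainfall values and function names are made up.

```python
import math

EULER_GAMMA = 0.5772156649015329


def fit_gumbel_moments(annual_maxima):
    """Method-of-moments estimates of Gumbel location and scale."""
    n = len(annual_maxima)
    mean = sum(annual_maxima) / n
    variance = sum((x - mean) ** 2 for x in annual_maxima) / (n - 1)
    scale = math.sqrt(6.0 * variance) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def gumbel_cdf(x, location, scale):
    """Probability that the annual maximum does not exceed x."""
    return math.exp(-math.exp(-(x - location) / scale))


def gumbel_return_level(return_period_years, location, scale):
    """Rainfall depth exceeded on average once per return period."""
    # A T-year event has annual exceedance probability 1/T.
    non_exceedance = 1.0 - 1.0 / return_period_years
    return location - scale * math.log(-math.log(non_exceedance))


# Made-up 24-hour annual maximum rainfall depths (mm) at one station.
sample = [310.0, 420.0, 380.0, 505.0, 290.0, 610.0]
location, scale = fit_gumbel_moments(sample)
print(gumbel_return_level(200, location, scale))
```

    Repeating the fit with and without an extreme observation shows how a single event can shift the estimated 200-yr depth.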

  15. Identification of cotton properties to improve yarn count quality by using regression analysis

    International Nuclear Information System (INIS)

    Amin, M.; Ullah, M.; Akbar, A.

    2014-01-01

    Identification of raw material characteristics contributing to yarn count variation was studied by using statistical techniques. Regression analysis is used to meet the objective. Stepwise regression is used for model selection, and the coefficient of determination and mean squared error (MSE) criteria are used to identify the cotton properties that contribute to yarn count. Statistical assumptions of normality, autocorrelation and multicollinearity are evaluated by using the probability plot, Durbin Watson test and variance inflation factor (VIF), and then model fitting is carried out. It is found that invisible (INV), nepness (Nep), grayness (RD), cotton trash (TR) and uniformity index (VI) are the main contributing cotton properties for yarn count variation. The results are also verified by Pareto chart. (author)
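
    Two of the diagnostics named in this abstract, the Durbin Watson statistic and the variance inflation factor, can be sketched in a few lines. The VIF function below is restricted to the two-predictor case, where the auxiliary R² equals the squared correlation between the predictors; this simplification, the function names and the numbers are assumptions for illustration.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for first-order autocorrelation.

    Values near 2 suggest no autocorrelation, values toward 0 positive
    autocorrelation, and values toward 4 negative autocorrelation.
    """
    squared_diffs = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    return squared_diffs / sum(e * e for e in residuals)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor regression."""
    n = len(x1)
    mean1 = sum(x1) / n
    mean2 = sum(x2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(x1, x2))
    var1 = sum((a - mean1) ** 2 for a in x1)
    var2 = sum((b - mean2) ** 2 for b in x2)
    r_squared = cov * cov / (var1 * var2)
    # VIF = 1 / (1 - R^2) of one predictor regressed on the other.
    return 1.0 / (1.0 - r_squared)


print(durbin_watson([1.0, -1.0, 1.0, -1.0]))         # alternating residuals
print(vif_two_predictors([1, 2, 3, 4], [1, 3, 2, 4]))
```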

  16. SCGICAR: Spatial concatenation based group ICA with reference for fMRI data analysis.

    Science.gov (United States)

    Shi, Yuhu; Zeng, Weiming; Wang, Nizhuan

    2017-09-01

    With the rapid development of big data, the functional magnetic resonance imaging (fMRI) data analysis of multi-subject is becoming more and more important. As a kind of blind source separation technique, group independent component analysis (GICA) has been widely applied for the multi-subject fMRI data analysis. However, spatial concatenated GICA is rarely used compared with temporal concatenated GICA due to its disadvantages. In this paper, in order to overcome these issues and to consider that the ability of GICA for fMRI data analysis can be improved by adding a priori information, we propose a novel spatial concatenation based GICA with reference (SCGICAR) method to take advantage of the priori information extracted from the group subjects, and then the multi-objective optimization strategy is used to implement this method. Finally, the post-processing means of principal component analysis and anti-reconstruction are used to obtain group spatial component and individual temporal component in the group, respectively. The experimental results show that the proposed SCGICAR method has a better performance on both single-subject and multi-subject fMRI data analysis compared with classical methods. It not only can detect more accurate spatial and temporal component for each subject of the group, but also can obtain a better group component on both temporal and spatial domains. These results demonstrate that the proposed SCGICAR method has its own advantages in comparison with classical methods, and it can better reflect the commonness of subjects in the group. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. A tandem regression-outlier analysis of a ligand cellular system for key structural modifications around ligand binding.

    Science.gov (United States)

    Lin, Ying-Ting

    2013-04-30

    A tandem technique of hard equipment is often used for the chemical analysis of a single cell to first isolate and then detect the wanted identities. The first part is the separation of wanted chemicals from the bulk of a cell; the second part is the actual detection of the important identities. To identify the key structural modifications around ligand binding, the present study aims to develop a counterpart of tandem technique for cheminformatics. A statistical regression and its outliers act as a computational technique for separation. A PPARγ (peroxisome proliferator-activated receptor gamma) agonist cellular system was subjected to such an investigation. Results show that this tandem regression-outlier analysis, or the prioritization of the context equations tagged with features of the outliers, is an effective regression technique of cheminformatics to detect key structural modifications, as well as their tendency of impact to ligand binding. The key structural modifications around ligand binding are effectively extracted or characterized out of cellular reactions. This is because molecular binding is the paramount factor in such ligand cellular system and key structural modifications around ligand binding are expected to create outliers. Therefore, such outliers can be captured by this tandem regression-outlier analysis.

  18. Regression analysis: An evaluation of the influences behind the pricing of beer

    OpenAIRE

    Eriksson, Sara; Häggmark, Jonas

    2017-01-01

    This bachelor thesis in applied mathematics is an analysis of which factors affect the pricing of beer at the Swedish market. A multiple linear regression model is created with the statistical programming language R through a study of the influences for several explanatory variables. For example these variables include country of origin, beer style, volume sold and a Bayesian weighted mean rating from RateBeer, a popular website for beer enthusiasts. The main goal of the project is to find si...

  19. Few crystal balls are crystal clear : eyeballing regression

    International Nuclear Information System (INIS)

    Wittebrood, R.T.

    1998-01-01

    The theory of regression and statistical analysis as it applies to reservoir analysis was discussed. It was argued that regression lines are not always the final truth. It was suggested that regression lines and eyeballed lines are often equally accurate. The many conditions that must be fulfilled to calculate a proper regression were discussed. Mentioned among these conditions were the distribution of the data, hidden variables, knowledge of how the data was obtained, the need for causal correlation of the variables, and knowledge of the manner in which the regression results are going to be used. 1 tab., 13 figs

  20. [A spatially explicit analysis of traffic accidents involving pedestrians and cyclists in Berlin].

    Science.gov (United States)

    Lakes, Tobia

    2017-12-01

    In many German cities and counties, sustainable mobility concepts that strengthen pedestrian and cyclist traffic are promoted. From the perspectives of urban development, traffic planning and public healthcare, a spatially differentiated analysis of traffic accident data is decisive. 1) The identification of spatial and temporal patterns of the distribution of accidents involving cyclists and pedestrians, 2) the identification of hotspots and exploration of possible underlying causes and 3) the critical discussion of benefits and challenges of the results and the derivation of conclusions. Spatio-temporal distributions of data from accident statistics in Berlin involving pedestrians and cyclists from 2011 to 2015 were analysed with geographic information systems (GIS). While the total number of accidents remains relatively stable for pedestrian and cyclist accidents, the spatial distribution analysis shows, however, that there are significant spatial clusters (hotspots) of traffic accidents with a strong concentration in the inner city area. In a critical discussion, the benefits of geographic concepts are identified, such as spatially explicit health data (in this case traffic accident data), the importance of the integration of other data sources for the evaluation of the health impact of areas (traffic accident statistics of the police), and the possibilities and limitations of spatial-temporal data analysis (spatial point-density analyses) for the derivation of decision-supported recommendations and for the evaluation of policy measures of health prevention and of health-relevant urban development.

  1. Geo-Nested Analysis: Mixed-Methods Research with Spatially Dependent Data

    NARCIS (Netherlands)

    Harbers, I.; Ingram, M.C.

    Mixed-methods designs, especially those where cases selected for small-N analysis (SNA) are nested within a large-N analysis (LNA), have become increasingly popular. Yet, since the LNA in this approach assumes that units are independently distributed, such designs are unable to account for spatial

  2. Spatial data analysis and integration for regional-scale geothermal potential mapping, West Java, Indonesia

    Energy Technology Data Exchange (ETDEWEB)

    Carranza, Emmanuel John M.; Barritt, Sally D. [Department of Earth Systems Analysis, International Institute for Geo-information Science and Earth Observation (ITC), Enschede (Netherlands); Wibowo, Hendro; Sumintadireja, Prihadi [Laboratory of Volcanology and Geothermal, Geology Department, Institute of Technology Bandung (ITB), Bandung (Indonesia)

    2008-06-15

    Conceptual modeling and predictive mapping of potential for geothermal resources at the regional-scale in West Java are supported by analysis of the spatial distribution of geothermal prospects and thermal springs, and their spatial associations with geologic features derived from publicly available regional-scale spatial data sets. Fry analysis shows that geothermal occurrences have regional-scale spatial distributions that are related to Quaternary volcanic centers and shallow earthquake epicenters. Spatial frequency distribution analysis shows that geothermal occurrences have strong positive spatial associations with Quaternary volcanic centers, Quaternary volcanic rocks, quasi-gravity lows, and NE-, NNW-, WNW-trending faults. These geological features, with their strong positive spatial associations with geothermal occurrences, constitute spatial recognition criteria of regional-scale geothermal potential in a study area. Application of data-driven evidential belief functions in GIS-based predictive mapping of regional-scale geothermal potential resulted in delineation of high potential zones occupying 25% of West Java, which is a substantial reduction of the search area for further exploration of geothermal resources. The predicted high potential zones delineate about 53-58% of the training geothermal areas and 94% of the validated geothermal occurrences. The results of this study demonstrate the value of regional-scale geothermal potential mapping in: (a) data-poor situations, such as West Java, and (b) regions with geotectonic environments similar to the study area. (author)

  3. Análise espacial dos determinantes socioeconômicos dos homicídios no Estado de Pernambuco Spatial analysis of socioeconomic determinants of homicide in Brazil

    Directory of Open Access Journals (Sweden)

    Maria Luiza C de Lima

    2005-04-01

    income of the head of the family, poverty index, rate of illiteracy, and demographic density. The following techniques were used in the analysis: a spatial autocorrelation test determined by the Moran index, multiple linear regression, a spatial regression model (CAR) and a generalized additive model for the detection of spatial trend (LOESS). RESULTS: The illiteracy and the poverty index explained 24.6% of the total variability of the homicide rates and there was an inverse relationship. Moran's I statistics indicated spatial autocorrelation between municipalities. The multiple linear regression model best fitted for the purposes of this study was the Conditional Auto Regressive (CAR) model. The latter confirmed the association between the poverty index, illiteracy and homicide rates. CONCLUSIONS: The inverse association observed between socio-economic indicators and homicides may be expressing a process that propitiates improvement in living conditions and that is linked predominantly to conditions that generate violence, such as drug trafficking.

  4. WHEN THE DISTURBANCES ARE SPATIALLY CORRELATED

    African Journals Online (AJOL)

    correlation, spatial error process. INTRODUCTION. Consider the linear regression model for spatial correlation $y = X\beta + u$, $u = C\varepsilon$, (1) where $y$ is a $T \times 1$ observable random vector, $X$ is a $T \times k$ matrix of known constants with full column rank $k$, $\beta$ is a $k \times 1$ vector of unknown parameters, $\varepsilon$ is a $T \times 1$ random vector with expectation zero ...
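
    For a model of the form y = X beta + u with u = C eps, one standard way to estimate beta is to pre-multiply by the inverse of C and apply ordinary least squares to the transformed data. The record is truncated, so the sketch below rests on assumptions that are not in the excerpt: C is known and invertible, and eps has covariance proportional to the identity. The matrix C used here and the function name are hypothetical.

```python
import numpy as np

def gls_known_c(y, X, C):
    """GLS estimate of beta in y = X beta + C eps, assuming C is known and invertible."""
    y_star = np.linalg.solve(C, y)   # C^{-1} y
    X_star = np.linalg.solve(C, X)   # C^{-1} X
    beta_hat, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    return beta_hat

rng = np.random.default_rng(0)
T, k = 50, 2
X = np.column_stack([np.ones(T), rng.normal(size=T)])
beta_true = np.array([1.0, 2.0])

# Hypothetical C: each disturbance also loads on the innovation of the preceding unit.
C = np.eye(T) + 0.5 * np.eye(T, k=-1)

eps = rng.normal(size=T)
y = X @ beta_true + C @ eps
beta_hat = gls_known_c(y, X, C)
```

    In practice C usually depends on unknown parameters that must be estimated first, which is the feasible GLS case and is not covered by this sketch.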

  5. Role of regression analysis and variation of rheological data in calculation of pressure drop for sludge pipelines.

    Science.gov (United States)

    Farno, E; Coventry, K; Slatter, P; Eshtiaghi, N

    2018-06-15

    Sludge pumps in wastewater treatment plants are often oversized due to uncertainty in the calculation of pressure drop. This issue costs industry millions of dollars to purchase and operate the oversized pumps. Besides costs, higher electricity consumption is associated with extra CO2 emissions, which create large environmental impacts. Calculation of pressure drop via current pipe flow theory requires model estimation of flow curve data, which depends on regression analysis and also varies with the natural variation of rheological data. This study investigates the impact of variation of rheological data and regression analysis on the variation of pressure drop calculated via current pipe flow theories. Results compare the variation of calculated pressure drop between different models and regression methods and indicate the suitability of each method. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Non-standard spatial statistics and spatial econometrics

    CERN Document Server

    Griffith, Daniel A

    2011-01-01

    Spatial statistics and spatial econometrics are recent sprouts of the tree "spatial analysis with measurement". Still, several general themes have emerged. Exploring selected fields of possible interest is tantalizing, and this is what the authors aim here.

  7. Geographic information systems, remote sensing, and spatial analysis activities in Texas, 2002-07

    Science.gov (United States)

    Pearson, D.K.; Gary, R.H.; Wilson, Z.D.

    2007-01-01

    Geographic information system (GIS) technology has become an important tool for scientific investigation, resource management, and environmental planning. A GIS is a computer-aided system capable of collecting, storing, analyzing, and displaying spatially referenced digital data. GIS technology is particularly useful when analyzing a wide variety of spatial data such as with remote sensing and spatial analysis. Remote sensing involves collecting remotely sensed data, such as satellite imagery, aerial photography, or radar images, and analyzing the data to gather information or investigate trends about the environment or the Earth's surface. Spatial analysis combines remotely sensed, thematic, statistical, quantitative, and geographical data through overlay, modeling, and other analytical techniques to investigate specific research questions. It is the combination of data formats and analysis techniques that has made GIS an essential tool in scientific investigations. This document presents information about the technical capabilities and project activities of the U.S. Geological Survey (USGS) Texas Water Science Center (TWSC) GIS Workgroup from 2002 through 2007.

  8. Spatial analysis and characteristics of pig farming in Thailand.

    Science.gov (United States)

    Thanapongtharm, Weerapong; Linard, Catherine; Chinson, Pornpiroon; Kasemsuwan, Suwicha; Visser, Marjolein; Gaughan, Andrea E; Epprech, Michael; Robinson, Timothy P; Gilbert, Marius

    2016-10-06

    In Thailand, pig production intensified significantly during the last decade, with many economic, epidemiological and environmental implications. Strategies toward more sustainable future developments are currently investigated, and these could be informed by a detailed assessment of the main trends in the pig sector, and on how different production systems are geographically distributed. This study had two main objectives. First, we aimed to describe the main trends and geographic patterns of pig production systems in Thailand in terms of pig type (native, breeding, and fattening pigs), farm scales (smallholder and large-scale farming systems) and type of farming systems (farrow-to-finish, nursery, and finishing systems) based on a very detailed 2010 census. Second, we aimed to study the statistical spatial association between these different types of pig farming distribution and a set of spatial variables describing access to feed and markets. Over the last decades, pig population gradually increased, with a continuously increasing number of pigs per holder, suggesting a continuing intensification of the sector. The different pig-production systems showed very contrasted geographical distributions. The spatial distribution of large-scale pig farms corresponds with that of commercial pig breeds, and spatial analysis conducted using Random Forest distribution models indicated that these were concentrated in lowland urban or peri-urban areas, close to means of transportation, facilitating supply to major markets such as provincial capitals and the Bangkok Metropolitan region. Conversely the smallholders were distributed throughout the country, with higher densities located in highland, remote, and rural areas, where they supply local rural markets. 
    A limitation of the study was that pig farming systems were defined from the number of animals per farm, resulting in their possible misclassification, but this should have a limited impact on the main patterns revealed

  9. COLOR IMAGE RETRIEVAL BASED ON FEATURE FUSION THROUGH MULTIPLE LINEAR REGRESSION ANALYSIS

    Directory of Open Access Journals (Sweden)

    K. Seetharaman

    2015-08-01

    This paper proposes a novel technique based on feature fusion using multiple linear regression analysis, and the least-squares estimation method is employed to estimate the parameters. The given input query image is segmented into various regions according to the structure of the image. The color and texture features are extracted from each region of the query image, and the features are fused together using the multiple linear regression model. The estimated parameters of the model, which is built on the features, form a vector called a feature vector. The Canberra distance measure is adopted to compare the feature vectors of the query and target images. The F-measure is applied to evaluate the performance of the proposed technique. The obtained results show that the proposed technique is comparable to the other existing techniques.
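
    The two measures named in the abstract have simple closed forms, sketched below. The function names are hypothetical, and treating a 0/0 term in the Canberra distance as zero is a convention assumed here rather than something stated in the abstract.

```python
def canberra_distance(u, v):
    """Canberra distance: sum of |u_i - v_i| / (|u_i| + |v_i|), skipping 0/0 terms."""
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total

def f_measure(precision, recall):
    """Harmonic mean of precision and recall (returns 0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

    A retrieval system built this way would rank target images by increasing Canberra distance between their feature vectors and that of the query.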

  10. An investigation on thermal patterns in Iran based on spatial autocorrelation

    Science.gov (United States)

    Fallah Ghalhari, Gholamabbas; Dadashi Roudbari, Abbasali

    2018-02-01

    The present study aimed at investigating temporal-spatial patterns and monthly patterns of temperature in Iran using new spatial statistical methods such as cluster and outlier analysis, and hotspot analysis. To do so, the monthly average temperatures of 122 synoptic stations were assessed. Statistical analysis showed that January, with 120.75%, had the most fluctuation among the studied months. The Global Moran's Index revealed that yearly changes of temperature in Iran followed a strong spatially clustered pattern. Findings showed that the strongest thermal cluster pattern in Iran, 0.975388, occurred in May. Cluster and outlier analyses showed that thermal homogeneity in Iran decreases in cold months, while it increases in warm months. This is due to the radiation angle and synoptic systems, which strongly influence thermal order in Iran. Elevation, however, plays the most notable role, as shown by the geographically weighted regression model. Iran's thermal analysis through hotspot analysis showed that hot thermal patterns (very hot, hot, and semi-hot) were dominant in the South, covering an area of 33.5% (about 552,145.3 km2). Regions such as mountain foot and low lands lack any significant spatial autocorrelation, 25.2% covering about 415,345.1 km2. The last is the cold thermal area (very cold, cold, and semi-cold) with about 25.2% covering about 552,145.3 km2 of the whole area of Iran.
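
    The Global Moran's Index used above can be computed directly from a vector of values and a spatial weights matrix. The sketch below is a minimal illustration with a hypothetical four-station chain in which only adjacent stations are neighbours; it is not the weighting scheme of the study, which the abstract does not describe.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` and a square spatial weights matrix `weights`."""
    n = len(values)
    mean_value = sum(values) / n
    dev = [v - mean_value for v in values]
    s0 = sum(sum(row) for row in weights)                 # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))      # weighted cross-products
    total = sum(d * d for d in dev)                       # sum of squared deviations
    return (n / s0) * (cross / total)

# Hypothetical chain of four stations: each station neighbours the next one.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1.0, 1.0, 5.0, 5.0], W)     # similar values sit together
alternating = morans_i([1.0, 5.0, 1.0, 5.0], W)   # neighbours are dissimilar
```

    Positive values indicate that similar temperatures cluster in space and negative values indicate that neighbours tend to differ.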

  11. Statistical Downscaling Output GCM Modeling with Continuum Regression and Pre-Processing PCA Approach

    Directory of Open Access Journals (Sweden)

    Sutikno Sutikno

    2010-08-01

    One of the climate models used to predict climatic conditions is the Global Circulation Model (GCM). A GCM is a computer-based model that consists of different equations. It uses numerical and deterministic equations that follow the rules of physics. The GCM is a main tool for predicting climate and weather, and it is also used as a primary information source for reviewing the effects of climate change. The Statistical Downscaling (SD) technique is used to bridge the large-scale GCM with a small scale (the study area). GCM data are spatial and temporal, so spatial correlation between data at different grid points in a single domain is likely to occur. The resulting multicollinearity problems call for pre-processing of the predictor data X. Continuum Regression (CR) with Principal Component Analysis (PCA) pre-processing is an alternative for SD modelling. CR is a method developed by Stone and Brooks (1990). It is a generalization of the Ordinary Least Squares (OLS), Principal Component Regression (PCR) and Partial Least Squares (PLS) methods, used to overcome multicollinearity problems. Data processing for the stations in Ambon, Pontianak, Losarang, Indramayu and Yuntinyuat shows that, in the 8x8 and 12x12 domains, the CR method produces better RMSEP and predicted R2 values than PCR and PLS.
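
    As an illustration of how PCA pre-processing addresses multicollinearity, the sketch below implements principal component regression (PCR), one of the special cases that continuum regression generalizes. It is not continuum regression itself, and the function name and synthetic data are assumptions made for the example.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression: regress y on the leading PCA scores of X."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean                                   # centre the predictors
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                           # loadings, shape (p, n_components)
    scores = Xc @ V                                   # mutually uncorrelated components
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    coef = V @ gamma                                  # back to the original predictors
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic, noise-free data, invented for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 3.0
coef, intercept = pcr_fit(X, y, 3)
```

    Keeping all components reproduces the OLS fit; dropping the trailing components trades a little bias for stability when the grid-point predictors are strongly correlated.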

  12. Linear regression in astronomy. II

    Science.gov (United States)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
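
    The first class of models listed above, an unweighted regression line with bootstrap resampling, can be sketched as follows. This is a generic pairs bootstrap for the standard error of the OLS slope, assumed here for illustration; it is not the specific procedure or code of the paper, and the function names are hypothetical.

```python
import random
from statistics import mean, stdev

def ols_slope(x, y):
    """Slope of the unweighted least-squares line of y on x."""
    xm, ym = mean(x), mean(y)
    sxx = sum((xi - xm) ** 2 for xi in x)
    sxy = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y))
    return sxy / sxx

def bootstrap_slope_se(x, y, n_boot=500, seed=0):
    """Standard error of the slope from resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:      # degenerate resample: slope is undefined
            continue
        ys = [y[i] for i in idx]
        slopes.append(ols_slope(xs, ys))
    return stdev(slopes)
```

    A jackknife version would instead drop one pair at a time and recompute the slope on the remaining points.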

  13. PATH ANALYSIS WITH LOGISTIC REGRESSION MODELS : EFFECT ANALYSIS OF FULLY RECURSIVE CAUSAL SYSTEMS OF CATEGORICAL VARIABLES

    OpenAIRE

    Nobuoki, Eshima; Minoru, Tabata; Geng, Zhi; Department of Medical Information Analysis, Faculty of Medicine, Oita Medical University; Department of Applied Mathematics, Faculty of Engineering, Kobe University; Department of Probability and Statistics, Peking University

    2001-01-01

    This paper discusses path analysis of categorical variables with logistic regression models. The total, direct and indirect effects in fully recursive causal systems are considered by using model parameters. These effects can be explained in terms of log odds ratios, uncertainty differences, and an inner product of explanatory variables and a response variable. A study on food choice of alligators is reanalysed as a numerical example to illustrate the present approach.

  14. Análise de fatores e regressão bissegmentada em estudos de estratificação ambiental e adaptabilidade em milho Factor analysis and bissegmented regression for studies about environmental stratification and maize adaptability

    Directory of Open Access Journals (Sweden)

    Deoclécio Domingos Garbuglio

    2007-02-01

    The objective of this work was to verify possible divergences among the results obtained in adaptability evaluations of 27 maize genotypes (Zea mays L.) and in the stratification of 22 environments in Paraná State, Brazil, through techniques based on factor analysis and bissegmented regression. The environmental stratifications were made through the traditional methodology and by factor analysis, allied to the percentage of the simple portion of the GxE interaction (PS%). Adaptability analyses were carried out through bissegmented regression and factor analysis. By the bissegmented regression analysis, the studied genotypes presented high productive performance; however, no genotype considered ideal was found. The adaptability of the genotypes, analyzed through graphs, presented different responses when compared to bissegmented regression. Factor analysis was efficient in the processes of environmental stratification and adaptability of the maize genotypes.

  15. Spatially Resolved Analysis of Bragg Selectivity

    Directory of Open Access Journals (Sweden)

    Tina Sabel

    2015-11-01

    This paper targets an inherent control of optical shrinkage in photosensitive polymers, contributing by means of spatially resolved analysis of volume holographic phase gratings. Point by point scanning of the local material response to the Gaussian intensity distribution of the recording beams is accomplished. Derived information on the local grating period and grating slant is evaluated by mapping of optical shrinkage in the lateral plane as well as through the depth of the layer. The influence of recording intensity, exposure duration and the material viscosity on the Bragg selectivity is investigated.

  16. High resolution or optimum resolution? Spatial analysis of the Federmesser site at Andernach, Germany

    NARCIS (Netherlands)

    Stapert, D; Street, M

    1997-01-01

    This paper discusses spatial analysis at site level. It is suggested that spatial analysis has to proceed in several levels, from global to more detailed questions, and that optimum resolution should be established when applying any quantitative methods in this field. As an example, the ring and

  17. Determinants of orphan drugs prices in France: a regression analysis.

    Science.gov (United States)

    Korchagina, Daria; Millier, Aurelie; Vataire, Anne-Lise; Aballea, Samuel; Falissard, Bruno; Toumi, Mondher

    2017-04-21

    The introduction of the orphan drug legislation led to an increase in the number of available orphan drugs, but access to them is often limited due to their high price. Social preferences regarding funding orphan drugs as well as the criteria taken into consideration while setting the price remain unclear. The study aimed at identifying the determinants of orphan drug prices in France using a regression analysis. All drugs with a valid orphan designation at the moment of launch for which the price was available in France were included in the analysis. The selection of covariates was based on a literature review and included drug characteristics (Anatomical Therapeutic Chemical (ATC) class, treatment line, age of target population), disease characteristics (severity, prevalence, availability of alternative therapeutic options), health technology assessment (HTA) details (actual benefit (AB) and improvement in actual benefit (IAB) scores, delay between the HTA and commercialisation), and study characteristics (type of study, comparator, type of endpoint). The main data sources were European public assessment reports, HTA reports, summaries of opinion on orphan designation of the European Medicines Agency, and the French insurance database of drugs and tariffs. A generalized regression model was developed to test the association between the annual treatment cost and selected covariates. A total of 68 drugs were included. The mean annual treatment cost was €96,518. In the univariate analysis, the ATC class (p = 0.01), availability of alternative treatment options (p = 0.02) and the prevalence (p = 0.02) showed a significant correlation with the annual cost. The multivariate analysis demonstrated significant association between the annual cost and availability of alternative treatment options, ATC class, IAB score, type of comparator in the pivotal clinical trial, as well as commercialisation date and delay between the HTA and commercialisation. The

  18. Logistic regression applied to natural hazards: rare event logistic regression with replications

    Science.gov (United States)

    Guns, M.; Vanacker, V.

    2012-06-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.

  19. Exploratory Temporal and Spatial Analysis of Myocardial Infarction Hospitalizations in Calgary, Canada

    Directory of Open Access Journals (Sweden)

    Xiaoxiao Liu

    2017-12-01

    Spatial and temporal analyses are critical to understand the pattern of myocardial infarction (MI) hospitalizations over space and time, and to identify their underlying determinants. In this paper, we analyze MI hospitalizations in Calgary from 2004 to 2013, stratified by age and gender. First, a seasonal trend decomposition analyzes the seasonality; then a linear regression models the trend component. Moran’s I and hot spot analyses explore the spatial pattern. Though exploratory, results show that most age and gender groups feature a statistically significant decline over the 10 years, consistent with previous studies in Canada. Decline rates vary across ages and genders, with the slowest decline observed for younger males. Each gender exhibits a seasonal pattern with peaks in both winter and summer. Spatially, MI hot spots are identified in older communities, and in socioeconomically and environmentally disadvantaged communities. In the older communities, higher MI rates appear to be more highly associated with demographics. Conversely, worse air quality appears to be locally associated with higher MI incidence in younger age groups. The study helps identify areas of concern, where MI hot spots are identified for younger age groups, suggesting the need for localized public health policies to target local risk factors.

  20. Stepwise versus Hierarchical Regression: Pros and Cons

    Science.gov (United States)

    Lewis, Mitzi

    2007-01-01

    Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…

  1. Bayesian Nonparametric Regression Analysis of Data with Random Effects Covariates from Longitudinal Measurements

    KAUST Repository

    Ryu, Duchwan

    2010-09-28

    We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.

  2. Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales

    Czech Academy of Sciences Publication Activity Database

    Krištoufek, Ladislav

    2015-01-01

    Vol. 91, No. 1 (2015), 022802-1-022802-5. ISSN 1539-3755. R&D Projects: GA ČR(CZ) GP14-11402P. Grant - others: GA ČR(CZ) GAP402/11/0948. Program: GA. Institutional support: RVO:67985556. Keywords: Detrended cross-correlation analysis * Regression * Scales. Subject RIV: AH - Economics. Impact factor: 2.288, year: 2014. http://library.utia.cas.cz/separaty/2015/E/kristoufek-0452315.pdf

  3. MULTIPLE LINEAR REGRESSION ANALYSIS FOR PREDICTION OF BOILER LOSSES AND BOILER EFFICIENCY

    OpenAIRE

    Chayalakshmi C.L

    2018-01-01

    Calculation of boiler efficiency is essential if its parameters need to be controlled for either maintaining or enhancing its efficiency. But determination of boiler efficiency using the conventional method is time consuming and very expensive. Hence, it is not recommended to find boiler efficiency frequently. The work presented in this paper deals with establishing the statistical mo...

  4. Statistical learning method in regression analysis of simulated positron spectral data

    International Nuclear Information System (INIS)

    Avdic, S. Dz.

    2005-01-01

    Positron lifetime spectroscopy is a non-destructive tool for detection of radiation induced defects in nuclear reactor materials. This work concerns the applicability of the support vector machines method for the input data compression in the neural network analysis of positron lifetime spectra. It has been demonstrated that the SVM technique can be successfully applied to regression analysis of positron spectra. A substantial data compression of about 50% and 8% of the whole training set with two and three spectral components respectively has been achieved including a high accuracy of the spectra approximation. However, some parameters in the SVM approach such as the insensitivity zone ε and the penalty parameter C have to be chosen carefully to obtain a good performance. (author)
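
    To show why the two tuning parameters mentioned in this record matter, the standard objective of a linear support vector regression can be sketched as follows. This is a generic textbook formulation under stated assumptions, not the model of the paper: the function names, the one-feature data set and the values of the insensitivity zone (`eps`) and of the penalty parameter (`C`) are hypothetical.

```python
def epsilon_insensitive_loss(residual, eps):
    """Errors smaller than eps cost nothing; larger ones grow linearly."""
    return max(0.0, abs(residual) - eps)


def svr_objective(w, b, X, y, eps, C):
    """0.5*||w||^2 + C * sum of epsilon-insensitive losses (linear model)."""
    penalty = 0.5 * sum(wi * wi for wi in w)
    loss = 0.0
    for row, target in zip(X, y):
        prediction = sum(wi * xi for wi, xi in zip(w, row)) + b
        loss += epsilon_insensitive_loss(target - prediction, eps)
    return penalty + C * loss


# Hypothetical one-feature data and model parameters.
X = [[0.0], [1.0], [2.0]]
y = [1.05, 3.0, 5.5]
objective = svr_objective(w=[2.0], b=1.0, X=X, y=y, eps=0.1, C=10.0)
```

    A wider insensitivity zone leaves more training points with zero loss, so fewer of them end up as support vectors, which is one way to read the data compression reported in the abstract. A larger `C` penalises the remaining errors more heavily.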

  5. Spatial regression test for ensuring temperature data quality in southern Spain

    Science.gov (United States)

    Estévez, J.; Gavilán, P.; García-Marín, A. P.

    2018-01-01

    Quality assurance of meteorological data is crucial for ensuring the reliability of applications and models that use such data as input variables, especially in the field of environmental sciences. Spatial validation of meteorological data is based on the application of quality control procedures using data from neighbouring stations to assess the validity of data from a candidate station (the station of interest). These kinds of tests, which are referred to in the literature as spatial consistency tests, take data from neighbouring stations in order to estimate the corresponding measurement at the candidate station. These estimations can be made by weighting values according to the distance between the stations or to the coefficient of correlation, among other methods. The test applied in this study relies on statistical decision-making and uses a weighting based on the standard error of the estimate. This paper summarizes the results of the application of this test to maximum, minimum and mean temperature data from the Agroclimatic Information Network of Andalusia (southern Spain). This quality control procedure includes a decision based on a factor f, the fraction of potential outliers for each station across the region. Using GIS techniques, the geographic distribution of the errors detected has also been analysed. Finally, the performance of the test was assessed by evaluating its effectiveness in detecting known errors.
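
    The general idea of a spatial consistency test weighted by the standard error of the estimate can be sketched as below. The abstract gives no formulas, so everything here is an assumption made for illustration: each neighbour predicts the candidate through its own simple regression, predictions are combined with inverse-variance weights, and the observation is flagged when it falls outside a band of f pooled standard errors. The station histories, the function names and the value f = 3 are hypothetical, and the published test may differ in its details.

```python
import math


def fit_line(x, y):
    """OLS fit y = a + b*x; returns (a, b, s), s = standard error of estimate."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, math.sqrt(sse / (n - 2))


def spatial_regression_check(candidate_hist, neighbour_hists,
                             neighbour_today, observed, f=3.0):
    """Return (estimate, pooled standard error, passed) for one observation."""
    estimates, inv_var = [], []
    for hist, today in zip(neighbour_hists, neighbour_today):
        a, b, s = fit_line(hist, candidate_hist)
        s = max(s, 1e-9)  # guard against a perfect fit
        estimates.append(a + b * today)
        inv_var.append(1.0 / s ** 2)
    total = sum(inv_var)
    # Neighbours that predict the candidate well get more weight.
    estimate = sum(e * w for e, w in zip(estimates, inv_var)) / total
    pooled_s = math.sqrt(len(inv_var) / total)
    return estimate, pooled_s, abs(observed - estimate) <= f * pooled_s


# Hypothetical daily maximum temperatures (degrees C) over a short window.
candidate = [10.1, 12.0, 13.9, 16.2, 17.8, 20.1]
neighbours = [
    [9.0, 11.0, 13.0, 15.0, 17.0, 19.0],
    [11.5, 13.0, 15.5, 17.0, 19.5, 21.0],
]
today = [14.0, 16.0]

ok = spatial_regression_check(candidate, neighbours, today, observed=15.2)
bad = spatial_regression_check(candidate, neighbours, today, observed=25.0)
```

    In this sketch a larger f makes the test more tolerant, so fewer observations are flagged as potential outliers.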

  6. Physiologic noise regression, motion regression, and TOAST dynamic field correction in complex-valued fMRI time series.

    Science.gov (United States)

    Hahn, Andrew D; Rowe, Daniel B

    2012-02-01

    As more evidence is presented suggesting that the phase, as well as the magnitude, of functional MRI (fMRI) time series may contain important information and that there are theoretical drawbacks to modeling functional response in the magnitude alone, removing noise in the phase is becoming more important. Previous studies have shown that retrospective correction of noise from physiologic sources can remove significant phase variance and that dynamic main magnetic field correction and regression of estimated motion parameters also remove significant phase fluctuations. In this work, we investigate the performance of physiologic noise regression in a framework along with correction for dynamic main field fluctuations and motion regression. Our findings suggest that including physiologic regressors provides some benefit in terms of reduction in phase noise power, but it is small compared to the benefit of dynamic field corrections and use of estimated motion parameters as nuisance regressors. Additionally, we show that the use of all three techniques reduces phase variance substantially, removes undesirable spatial phase correlations and improves detection of the functional response in magnitude and phase. Copyright © 2011 Elsevier Inc. All rights reserved.
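
    Regressing nuisance signals such as estimated motion parameters out of a time series amounts to keeping the residuals of an ordinary least squares fit. The sketch below shows that step for a single real-valued series, using Gram-Schmidt orthogonalisation so that it needs no external libraries. It is a simplified illustration, not the complex-valued model of the paper, and the series, regressors and function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def regress_out(signal, nuisance):
    """Residuals of `signal` after OLS on an intercept plus nuisance columns."""
    n = len(signal)
    basis = []
    for col in [[1.0] * n] + [list(c) for c in nuisance]:
        # Orthogonalise each regressor against those already in the basis.
        for q in basis:
            coef = dot(col, q) / dot(q, q)
            col = [c - coef * qi for c, qi in zip(col, q)]
        if dot(col, col) > 1e-12:  # skip regressors that are collinear
            basis.append(col)
    resid = list(signal)
    for q in basis:
        coef = dot(resid, q) / dot(q, q)
        resid = [r - coef * qi for r, qi in zip(resid, q)]
    return resid


# Hypothetical motion estimates and an on/off task pattern.
motion_x = [0.0, 0.1, 0.3, 0.2, 0.5, 0.4, 0.6, 0.8]
motion_y = [0.2, 0.1, 0.0, 0.3, 0.1, 0.4, 0.2, 0.5]
task = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
series = [1.0 + 2.0 * mx - 1.0 * my + 0.5 * t
          for mx, my, t in zip(motion_x, motion_y, task)]

cleaned = regress_out(series, [motion_x, motion_y])
```

    The cleaned series is uncorrelated with the nuisance regressors by construction. Any part of the task response that is correlated with motion is removed as well, which is why such regressors are usually fitted jointly with the task model in practice.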

  7. Comparative analysis of elements and models of implementation in local-level spatial plans in Serbia

    Directory of Open Access Journals (Sweden)

    Stefanović Nebojša

    2017-01-01

    Implementation of local-level spatial plans is of paramount importance to the development of the local community. This paper aims to demonstrate the importance of and offer further directions for research into the implementation of spatial plans by presenting the results of a study on models of implementation. The paper describes the basic theoretical postulates of a model for implementing spatial plans. A comparative analysis of the application of elements and models of implementation of plans in practice was conducted based on the spatial plans for the local municipalities of Arilje, Lazarevac and Sremska Mitrovica. The analysis includes four models of implementation: the strategy and policy of spatial development; spatial protection; the implementation of planning solutions of a technical nature; and the implementation of rules of use, arrangement and construction of spaces. The main results of the analysis are presented and used to give recommendations for improving the elements and models of implementation. Final deliberations show that models of implementation are generally used in practice and combined in spatial plans. Based on the analysis of how models of implementation are applied in practice, a general conclusion concerning the complex character of the local level of planning is presented and elaborated. [Project of the Serbian Ministry of Education, Science and Technological Development, Grant no. TR 36035: Spatial, Environmental, Energy and Social Aspects of Developing Settlements and Climate Change - Mutual Impacts and Grant no. III 47014: The Role and Implementation of the National Spatial Plan and Regional Development Documents in Renewal of Strategic Research, Thinking and Governance in Serbia]

  8. Use of generalized regression models for the analysis of stress-rupture data

    International Nuclear Information System (INIS)

    Booker, M.K.

    1978-01-01

    The design of components for operation in an elevated-temperature environment often requires a detailed consideration of the creep and creep-rupture properties of the construction materials involved. Techniques for the analysis and extrapolation of creep data have been widely discussed. The paper presents a generalized regression approach to the analysis of such data. This approach has been applied to multiple heat data sets for types 304 and 316 austenitic stainless steel, ferritic 2 1/4 Cr-1 Mo steel, and the high-nickel austenitic alloy 800H. Analyses of data for single heats of several materials are also presented. All results appear good. The techniques presented represent a simple yet flexible and powerful means for the analysis and extrapolation of creep and creep-rupture data.

  9. Space, race, and poverty: Spatial inequalities in walkable neighborhood amenities?

    Directory of Open Access Journals (Sweden)

    John Whalen

    2012-05-01

    BACKGROUND Multiple and varied benefits have been suggested for increased neighborhood walkability. However, spatial inequalities in neighborhood walkability likely exist and may be attributable, in part, to residential segregation. OBJECTIVE Utilizing a spatial demographic perspective, we evaluated potential spatial inequalities in walkable neighborhood amenities across census tracts in Boston, MA (US). METHODS The independent variables included minority racial/ethnic population percentages and percent of families in poverty. Walkable neighborhood amenities were assessed with a composite measure. Spatial autocorrelation in key study variables was first calculated with the Global Moran's I statistic. Then, Spearman correlations between neighborhood socio-demographic characteristics and walkable neighborhood amenities were calculated, as well as Spearman correlations accounting for spatial autocorrelation. As a final step, we fit ordinary least squares (OLS) regression and, when appropriate, spatial autoregressive models. RESULTS Significant positive spatial autocorrelation was found in neighborhood socio-demographic characteristics (e.g. census tract percent Black), but not in walkable neighborhood amenities or in the OLS regression residuals. Spearman correlations between neighborhood socio-demographic characteristics and walkable neighborhood amenities were not statistically significant, nor were neighborhood socio-demographic characteristics significantly associated with walkable neighborhood amenities in OLS regression models. CONCLUSIONS Our results suggest that there is residential segregation in Boston and that spatial inequalities do not necessarily show up using a composite measure. COMMENTS Future research in other geographic areas (including international contexts) and using different definitions of neighborhoods (including small-area definitions) should evaluate if spatial inequalities are found using composite measures, but also should

  10. Spatial variability in levels of benzene, formaldehyde, and total benzene, toluene, ethylbenzene and xylenes in New York City: a land-use regression study

    Directory of Open Access Journals (Sweden)

    Kheirbek Iyad

    2012-07-01

    Background Hazardous air pollutant exposures are common in urban areas, contributing to increased risk of cancer and other adverse health outcomes. While recent analyses indicate that New York City residents experience significantly higher cancer risks attributable to hazardous air pollutant exposures than the United States as a whole, limited data exist to assess intra-urban variability in air toxics exposures. Methods To assess intra-urban spatial variability in exposures to common hazardous air pollutants, street-level air sampling for volatile organic compounds and aldehydes was conducted at 70 sites throughout New York City during the spring of 2011. Land-use regression models were developed using a subset of 59 sites and validated against the remaining 11 sites to describe the relationship between concentrations of benzene, total BTEX (benzene, toluene, ethylbenzene, xylenes) and formaldehyde and indicators of local sources, adjusting for temporal variation. Results Total BTEX levels exhibited the most spatial variability, followed by benzene and formaldehyde (coefficient of variation of temporally adjusted measurements of 0.57, 0.35, 0.22, respectively). Total roadway length within 100 m, traffic signal density within 400 m of monitoring sites, and an indicator of temporal variation explained 65% of the total variability in benzene, while 70% of the total variability in BTEX was accounted for by traffic signal density within 450 m, density of permitted solvent-use industries within 500 m, and an indicator of temporal variation. Measures of temporal variation, traffic signal density within 400 m, road length within 100 m, and interior building area within 100 m (indicator of heating fuel combustion) predicted 83% of the total variability of formaldehyde. The models built with the modeling subset were found to predict concentrations well, predicting 62% to 68% of monitored values at validation sites. Conclusions Traffic and

  11. Regression: The Apple Does Not Fall Far From the Tree.

    Science.gov (United States)

    Vetter, Thomas R; Schober, Patrick

    2018-05-15

    Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
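
    Two ideas from this tutorial, simple linear regression and regression toward the mean, can be tied together in a short sketch. It is an illustration with made-up numbers, not material from the article: the five data points and the function name are hypothetical.

```python
import math


def simple_linear_regression(x, y):
    """Return (intercept, slope, r) for the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return intercept, slope, r


# Hypothetical paired measurements with equal spread in x and y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
intercept, slope, r = simple_linear_regression(x, y)

# Prediction for the most extreme x value (2 units above the mean of 3).
predicted_at_5 = intercept + slope * 5.0
```

    Because x and y have the same standard deviation in this example, the slope equals the correlation r. Whenever |r| is below 1, the predicted value lies closer to the mean of y (in standard deviation units) than the predictor lies to the mean of x, which is the phenomenon of regression toward the mean.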

  12. Regression Analysis

    CERN Document Server

    Freund, Rudolf J; Sa, Ping

    2006-01-01

    The book provides complete coverage of the classical methods of statistical analysis. It is designed to give students an understanding of the purpose of statistical analyses, to allow the student to determine, at least to some degree, the correct type of statistical analyses to be performed in a given situation, and to have some appreciation of what constitutes good experimental design.

  13. Systematic review, meta-analysis, and meta-regression: Successful second-line treatment for Helicobacter pylori.

    Science.gov (United States)

    Muñoz, Neus; Sánchez-Delgado, Jordi; Baylina, Mireia; Puig, Ignasi; López-Góngora, Sheila; Suarez, David; Calvet, Xavier

    2018-06-01

    Multiple Helicobacter pylori second-line schedules have been described as potentially useful. It remains unclear, however, which are the best combinations, and which features of second-line treatments are related to better cure rates. The aim of this study was to determine which second-line treatments achieved excellent (>90%) cure rates by performing a systematic review and, when possible, a meta-analysis. A meta-regression was planned to determine the characteristics of treatments achieving excellent cure rates. A systematic review for studies evaluating second-line Helicobacter pylori treatment was carried out in multiple databases. A formal meta-analysis was performed when an adequate number of comparative studies was found, using RevMan5.3. A meta-regression for evaluating factors predicting cure rates >90% was performed using Stata Statistical Software. The systematic review identified 115 eligible studies, including 203 evaluable treatment arms. The results were extremely heterogeneous, with 61 treatment arms (30%) achieving optimal (>90%) cure rates. The meta-analysis favored quadruple therapies over triple (83.2% vs 76.1%; OR 0.59, 95% CI 0.38-0.93; P = .02) and 14-day quadruple treatments over 7-day treatments (91.2% vs 81.5%; OR 0.42, 95% CI 0.24-0.73; P = .002), although the differences were significant only in the per-protocol analysis. The meta-regression did not find any particular characteristics of the studies to be associated with excellent cure rates. Second-line Helicobacter pylori treatments achieving >90% cure rates are extremely heterogeneous. Quadruple therapy and 14-day treatments seem better than triple therapies and 7-day ones. No single characteristic of the treatments was related to excellent cure rates. Future approaches suitable for infectious diseases, thus considering antibiotic resistances, are needed to design rescue treatments that consistently achieve excellent cure rates. © 2018 John Wiley & Sons Ltd.

  14. The Spatial Distribution of Hepatitis C Virus Infections and Associated Determinants--An Application of a Geographically Weighted Poisson Regression for Evidence-Based Screening Interventions in Hotspots.

    Science.gov (United States)

    Kauhl, Boris; Heil, Jeanne; Hoebe, Christian J P A; Schweikart, Jürgen; Krafft, Thomas; Dukers-Muijrers, Nicole H T M

    2015-01-01

    Hepatitis C Virus (HCV) infections are a major cause of liver disease. A large proportion of these infections remain hidden to care due to their mostly asymptomatic nature. Population-based screening and screening targeted on behavioural risk groups have not proven to be effective in revealing these hidden infections. Therefore, more practically applicable approaches to target screenings are necessary. Geographic Information Systems (GIS) and spatial epidemiological methods may provide a more feasible basis for screening interventions through the identification of hotspots as well as demographic and socio-economic determinants. Analysed data included all HCV tests (n = 23,800) performed in the southern area of the Netherlands between 2002 and 2008. HCV positivity was defined as a positive immunoblot or polymerase chain reaction test. Population data were matched to the geocoded HCV test data. The spatial scan statistic was applied to detect areas with elevated HCV risk. We applied global regression models to determine associations between population-based determinants and HCV risk. Geographically weighted Poisson regression models were then constructed to determine local differences of the association between HCV risk and population-based determinants. HCV prevalence varied geographically and clustered in urban areas. The main population at risk were middle-aged males, non-western immigrants and divorced persons. Socio-economic determinants consisted of one-person households, persons with low income and mean property value. However, the association between HCV risk and demographic as well as socio-economic determinants displayed strong regional and intra-urban differences. The detection of local hotspots in our study may serve as a basis for prioritization of areas for future targeted interventions. Demographic and socio-economic determinants associated with HCV risk show regional differences underlining that a one-size-fits-all approach even within small geographic

  15. Multiple regression and beyond an introduction to multiple regression and structural equation modeling

    CERN Document Server

    Keith, Timothy Z

    2014-01-01

    Multiple Regression and Beyond offers a conceptually oriented introduction to multiple regression (MR) analysis and structural equation modeling (SEM), along with analyses that flow naturally from those methods. By focusing on the concepts and purposes of MR and related methods, rather than the derivation and calculation of formulae, this book introduces material to students more clearly, and in a less threatening way. In addition to illuminating content necessary for coursework, the accessibility of this approach means students are more likely to be able to conduct research using MR or SEM--and more likely to use the methods wisely. Covers both MR and SEM, while explaining their relevance to one another Also includes path analysis, confirmatory factor analysis, and latent growth modeling Figures and tables throughout provide examples and illustrate key concepts and techniques For additional resources, please visit: http://tzkeith.com/.

  16. Regression Association Analysis of Yield-Related Traits with RAPD Molecular Markers in Pistachio (Pistacia vera L.)

    Directory of Open Access Journals (Sweden)

    Saeid Mirzaei

    2017-10-01

    Introduction: The pistachio (Pistacia vera), a member of the cashew family, is a small tree originating from Central Asia and the Middle East. The tree produces seeds that are widely consumed as food. Pistacia vera is often confused with other species in the genus Pistacia that are also known as pistachio. These other species can be distinguished by their geographic distributions and their seeds, which are much smaller and have a soft shell. Continual advances in crop improvement through plant breeding are driven by the available genetic diversity. Therefore, the recognition and measurement of such diversity is crucial to breeding programs. In the past 20 years, the major effort in plant breeding has changed from quantitative to molecular genetics, with emphasis on quantitative trait loci (QTL) identification and marker assisted selection (MAS). The germplasm-regression-combined association studies not only allow mapping of genes/QTLs with a higher level of confidence, but also allow detection of genes/QTLs which would otherwise escape detection in linkage-based QTL studies based on the planned populations. The development of the marker-based technology offers a fast, reliable, and easy way to perform multiple regression analysis and constitutes an alternative approach to breeding in diverse species of plants. The availability of many markers and morphological traits can support regression analysis between these markers and morphological traits. Materials and Methods: In this study, 20 genotypes of pistachio were studied and yield-related traits were measured. Young well-expanded leaves were collected for DNA extraction and total genomic DNA was extracted. Genotyping was performed using 15 RAPD primers and PCR amplification products were visualized by gel electrophoresis. The reproducible RAPD fragments were scored on the basis of present (1) or absent (0) bands and a binary matrix was constructed using each molecular marker. Association analysis between

  17. Visuo-Spatial Performance in Autism: A Meta-Analysis

    Science.gov (United States)

    Muth, Anne; Hönekopp, Johannes; Falter, Christine M.

    2014-01-01

    Visuo-spatial skills are believed to be enhanced in autism spectrum disorders (ASDs). This meta-analysis tests the current state of evidence for Figure Disembedding, Block Design, Mental Rotation and Navon tasks in ASD and neurotypicals. Block Design (d = 0.32) and Figure Disembedding (d = 0.26) showed superior performance for ASD with large…

  18. Development of spatial data guidelines and standards: spatial data set documentation to support hydrologic analysis in the U.S. Geological Survey

    Science.gov (United States)

    Fulton, James L.

    1992-01-01

    Spatial data analysis has become an integral component in many surface and sub-surface hydrologic investigations within the U.S. Geological Survey (USGS). Currently, one of the largest costs in applying spatial data analysis is the cost of developing the needed spatial data. Therefore, guidelines and standards are required for the development of spatial data in order to allow for data sharing and reuse; this eliminates costly redevelopment. In order to attain this goal, the USGS is expanding efforts to identify guidelines and standards for the development of spatial data for hydrologic analysis. Because of the variety of project and database needs, the USGS has concentrated on developing standards for documenting spatial sets to aid in the assessment of data set quality and compatibility of different data sets. An interim data set documentation standard (1990) has been developed that provides a mechanism for associating a wide variety of information with a data set, including data about source material, data automation and editing procedures used, projection parameters, data statistics, descriptions of features and feature attributes, information on organizational contacts, lists of operations performed on the data, and free-form comments and notes about the data, made at various times in the evolution of the data set. The interim data set documentation standard has been automated using a commercial geographic information system (GIS) and data set documentation software developed by the USGS. Where possible, USGS developed software is used to enter data into the data set documentation file automatically. The GIS software closely associates a data set with its data set documentation file; the documentation file is retained with the data set whenever it is modified, copied, or transferred to another computer system.
    The Water Resources Division of the USGS is continuing to develop spatial data and data processing standards, with emphasis on standards needed to support

  19. Quantifying Sources and Fluxes of Aquatic Carbon in U.S. Streams and Reservoirs Using Spatially Referenced Regression Models

    Science.gov (United States)

    Boyer, E. W.; Smith, R. A.; Alexander, R. B.; Schwarz, G. E.

    2004-12-01

    Organic carbon (OC) is a critical water quality characteristic in riverine systems that is an important component of the aquatic carbon cycle and energy balance. Examples of processes controlled by OC interactions are complexation of trace metals; enhancement of the solubility of hydrophobic organic contaminants; formation of trihalomethanes in drinking water; and absorption of visible and UV radiation. Organic carbon also can have indirect effects on water quality by influencing internal processes of aquatic ecosystems (e.g. photosynthesis and autotrophic and heterotrophic activity). The importance of organic matter dynamics on water quality has been recognized, but challenges remain in quantitatively addressing OC processes over broad spatial scales in a hydrological context. In this study, we apply spatially referenced watershed models (SPARROW) to statistically estimate long-term mean-annual rates of dissolved and total organic carbon export in streams and reservoirs across the conterminous United States. We make use of a GIS framework for the analysis, describing sources, transport, and transformations of organic matter from spatial databases providing characterizations of climate, land use, primary productivity, topography, soils, and geology. This approach is useful because it illustrates spatial patterns of organic carbon fluxes in streamflow, highlighting hot spots (e.g., organic-rich environments in the southeastern coastal plain). Further, our simulations provide estimates of the relative contributions to streams from allochthonous and autochthonous sources. We quantify surface water fluxes of OC with estimates of uncertainty in relation to the overall US carbon budget; our simulations highlight that aquatic sources and sinks of OC may be a more significant component of regional carbon cycling than was previously thought.
    Further, we are using our simulations to explore the potential role of climate and other changes in the terrestrial environment on

  20. Intermediate and advanced topics in multilevel logistic regression analysis.

    Science.gov (United States)

    Austin, Peter C; Merlo, Juan

    2017-09-10

    Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio, which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
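
    Two of the measures named in this record can be computed directly from the cluster-level variance of a random-intercept logistic model. The abstract gives no formulas, so the sketch below uses the definitions commonly found in the multilevel literature, stated here as an assumption: the median odds ratio is exp(sqrt(2 * variance) * z), where z is the 75th percentile of the standard normal distribution, and the latent-variable variance partition coefficient is variance / (variance + pi^2 / 3). The function names and the variance of 1.0 are hypothetical.

```python
import math
from statistics import NormalDist


def median_odds_ratio(cluster_variance):
    """Median odds ratio for a random-intercept logistic model."""
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1)
    return math.exp(math.sqrt(2.0 * cluster_variance) * z75)


def variance_partition_coefficient(cluster_variance):
    """Latent-variable VPC; pi^2/3 is the variance of the standard logistic."""
    return cluster_variance / (cluster_variance + math.pi ** 2 / 3.0)


# Hypothetical between-hospital variance on the log-odds scale.
sigma2 = 1.0
mor = median_odds_ratio(sigma2)
vpc = variance_partition_coefficient(sigma2)
```

    A median odds ratio of 1 and a variance partition coefficient of 0 both correspond to no between-cluster variation. Larger values of either indicate a stronger general contextual effect.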