WorldWideScience

Sample records for hierarchical regression analysis

  1. Neighborhood social capital and crime victimization: comparison of spatial regression analysis and hierarchical regression analysis.

    Science.gov (United States)

    Takagi, Daisuke; Ikeda, Ken'ichi; Kawachi, Ichiro

    2012-11-01

    Crime is an important determinant of public health outcomes, including quality of life, mental well-being, and health behavior. A body of research has documented the association between community social capital and crime victimization. The association between social capital and crime victimization has been examined at multiple levels of spatial aggregation, ranging from entire countries to states, metropolitan areas, counties, and neighborhoods. In multilevel analysis, the spatial boundaries at level 2 are most often drawn from administrative boundaries (e.g., Census tracts in the U.S.). One problem with adopting administrative definitions of neighborhoods is that it ignores spatial spillover. We conducted a study of social capital and crime victimization in one ward of Tokyo, using a spatial Durbin model with an inverse-distance weighting matrix that assigned each respondent a unique level of "exposure" to social capital based on all other residents' perceptions. The study is based on a postal questionnaire sent to residents of Arakawa Ward, Tokyo, aged 20-69 years. The response rate was 43.7%. We examined the contextual influence of generalized trust, perceptions of reciprocity, two types of social network variables, as well as two principal components of social capital (constructed from the above four variables). Our outcome measure was self-reported crime victimization in the last five years. In the spatial Durbin model, we found that neighborhood generalized trust, reciprocity, supportive networks and two principal components of social capital were each inversely associated with crime victimization. By contrast, a multilevel regression performed with the same data (using administrative neighborhood boundaries) found generally null associations between neighborhood social capital and crime. Spatial regression methods may be more appropriate for investigating the contextual influence of social capital in homogeneous cultural settings such as Japan.
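The inverse-distance "exposure" weighting described above can be sketched in a few lines. This is a generic illustration, not the authors' code; the coordinates and trust scores are made up. Each respondent's contextual social capital is a distance-weighted average of all other respondents' perceptions, with the weight matrix row-normalized as in a standard spatial-lag setup.

```python
import numpy as np

def exposure(coords, scores):
    """Inverse-distance-weighted average of everyone else's scores:
    a simple version of the 'exposure' term in a spatial Durbin setup."""
    coords = np.asarray(coords, float)
    # Pairwise Euclidean distances between all respondents.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    W = np.zeros_like(d)
    off = ~np.eye(len(coords), dtype=bool)      # exclude self-pairs
    W[off] = 1.0 / d[off]                       # closer -> heavier weight
    W /= W.sum(axis=1, keepdims=True)           # row-normalize
    return W @ np.asarray(scores, float)

# Three respondents on a line; the middle one sits between two extremes.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
trust = [0.0, 0.5, 1.0]
expo = exposure(coords, trust)
```

The middle respondent weights both neighbors equally, so their exposure is the plain average of the two; the end respondents weight their nearer neighbor twice as heavily as the far one.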

  2. Analysis of genomic signatures in prokaryotes using multinomial regression and hierarchical clustering

    DEFF Research Database (Denmark)

    Ussery, David; Bohlin, Jon; Skjerve, Eystein

    2009-01-01

    Recently there has been an explosion in the availability of bacterial genomic sequences, now making possible an analysis of genomic signatures across more than 800 different bacterial chromosomes, from a wide variety of environments. Using genomic signatures, we pair-wise compared 867...... different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable...... AT content. Small improvements to the regression model, although significant, were also obtained by factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. The statistics obtained using hierarchical clustering...

  3. The Infinite Hierarchical Factor Regression Model

    CERN Document Server

    Rai, Piyush

    2009-01-01

    We propose a nonparametric Bayesian factor regression model that accounts for uncertainty in the number of factors, and the relationship between factors. To accomplish this, we propose a sparse variant of the Indian Buffet Process and couple this with a hierarchical model over factors, based on Kingman's coalescent. We apply this model to two problems (factor analysis and factor regression) in gene-expression data analysis.

  4. Hierarchical linear regression models for conditional quantiles

    Institute of Scientific and Technical Information of China (English)

    TIAN Maozai; CHEN Gemai

    2006-01-01

    Quantile regression has several useful features and therefore is gradually developing into a comprehensive approach to the statistical analysis of linear and nonlinear response models, but it cannot deal effectively with data that have a hierarchical structure. In practice, the existence of such data hierarchies is neither accidental nor ignorable; it is a common phenomenon. Ignoring this hierarchical data structure risks overlooking the importance of group effects, and may also render invalid many of the traditional statistical analysis techniques used for studying data relationships. On the other hand, hierarchical models take a hierarchical data structure into account and have many applications in statistics, ranging from overdispersion to constructing min-max estimators. However, hierarchical models are essentially mean regression, and therefore cannot be used to characterize the entire conditional distribution of a dependent variable given high-dimensional covariates. Furthermore, the estimated coefficient vector (marginal effects) is sensitive to outlier observations on the dependent variable. In this article, a new approach is developed, based on Gauss-Seidel iteration, that takes full advantage of both quantile regression and hierarchical models. On the theoretical front, we also consider the asymptotic properties of the new method, obtaining simple conditions for n^(1/2)-convergence and asymptotic normality. We also illustrate the use of the technique with real educational data, which are hierarchical, and explain the results.
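The building block of quantile regression is the pinball (check) loss: minimizing it over a constant recovers the sample quantile, which the authors extend to hierarchical linear models. A minimal numpy sketch of that core idea (illustrative data, not from the paper) also shows the robustness to outliers mentioned in the abstract:

```python
import numpy as np

def pinball_loss(residuals, q):
    """Pinball (check) loss: q*r for r >= 0, (q-1)*r for r < 0."""
    r = np.asarray(residuals, dtype=float)
    return np.where(r >= 0, q * r, (q - 1.0) * r)

def fit_constant_quantile(y, q, grid):
    """Pick the constant c on `grid` minimizing total pinball loss.
    The minimizer approximates the q-th sample quantile of y."""
    losses = [pinball_loss(y - c, q).sum() for c in grid]
    return grid[int(np.argmin(losses))]

y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])    # note the outlier
grid = np.linspace(0.0, 110.0, 1101)          # candidate constants, step 0.1
c_med = fit_constant_quantile(y, 0.5, grid)   # close to median(y) = 3.0
```

The median fit lands near 3.0 despite the outlier, whereas the least-squares fit (the mean, 22.0) is dragged far from the bulk of the data; a full quantile regression replaces the constant with a linear predictor.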

  5. Bayesian hierarchical regression analysis of variations in sea surface temperature change over the past million years

    Science.gov (United States)

    Snyder, Carolyn W.

    2016-09-01

    Statistical challenges often preclude comparisons among different sea surface temperature (SST) reconstructions over the past million years. Inadequate consideration of uncertainty can result in misinterpretation, overconfidence, and biased conclusions. Here I apply Bayesian hierarchical regressions to analyze local SST responsiveness to climate changes for 54 SST reconstructions from across the globe over the past million years. I develop methods to account for multiple sources of uncertainty, including the quantification of uncertainty introduced from absolute dating into interrecord comparisons. The estimates of local SST responsiveness explain 64% (62% to 77%, 95% interval) of the total variation within each SST reconstruction with a single number. There is remarkable agreement between SST proxy methods, with the exception of Mg/Ca proxy methods estimating muted responses at high latitudes. The Indian Ocean exhibits a muted response in comparison to other oceans. I find a stable estimate of the proposed "universal curve" of change in local SST responsiveness to climate changes as a function of sin2(latitude) over the past 400,000 years: SST change at 45°N/S is larger than the average tropical response by a factor of 1.9 (1.5 to 2.6, 95% interval) and explains 50% (35% to 58%, 95% interval) of the total variation between each SST reconstruction. These uncertainty and statistical methods are well suited for application across paleoclimate and environmental data series intercomparisons.
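The partial-pooling idea at the heart of Bayesian hierarchical regression can be sketched with the conjugate normal-normal model. This is a generic illustration of the shrinkage mechanism, not Snyder's actual model: each record-level estimate is pulled toward the population mean, with more pooling when the record is noisy relative to the between-record spread.

```python
import numpy as np

def partial_pool(group_means, n_per_group, sigma2, tau2):
    """Posterior means under a normal-normal hierarchical model:
    each group mean is shrunk toward the grand mean. The data weight
    w compares between-group spread tau2 to sampling noise sigma2/n."""
    gm = np.asarray(group_means, dtype=float)
    n = np.asarray(n_per_group, dtype=float)
    grand = gm.mean()
    w = tau2 / (tau2 + sigma2 / n)       # in (0, 1)
    return w * gm + (1.0 - w) * grand

means = np.array([0.0, 5.0, 10.0])       # illustrative group estimates
post = partial_pool(means, n_per_group=[5, 5, 5], sigma2=4.0, tau2=1.0)
```

Every posterior mean lies strictly between its group mean and the grand mean, which is what stabilizes comparisons across noisy reconstructions.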

  6. Investigating the effects of climate variations on bacillary dysentery incidence in northeast China using ridge regression and hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Guo Junqiao

    2008-09-01

    Background: The effects of climate variations on bacillary dysentery incidence have attracted growing concern. However, multi-collinearity among meteorological factors affects the accuracy of their correlation with bacillary dysentery incidence. Methods: As a remedy, a modified method combining ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. Results: All weather indicators (temperatures, precipitation, evaporation, and relative humidity) showed a positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987-1996, relative humidity, temperatures, and air pressure affected the transmission of bacillary dysentery. During this period, the meteorological factors were divided into three categories: relative humidity and precipitation formed one class, temperature indexes and evaporation another, and air pressure the third. Conclusion: Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from giving more consideration to local climate variations.
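Ridge regression is the standard remedy for the multi-collinearity the authors describe: adding a penalty lam to the diagonal of X'X stabilizes the estimate when predictors are nearly copies of one another. A minimal numpy sketch with synthetic "meteorological" predictors (not the study's data):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Two nearly collinear predictors (illustrative, e.g. two temperature
# indexes that move almost in lockstep).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.01 * rng.normal(size=200)      # almost a copy of x1
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=200)

b_ols = ridge_fit(X, y, 0.0)               # unstable under collinearity
b_ridge = ridge_fit(X, y, 1.0)             # shrunken, stabilized
```

The individual OLS coefficients are erratic because only their sum is well identified by the data; ridge shrinks the coefficient vector while the well-identified sum stays near its true value of 2.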

  7. Entrepreneurial intention modeling using hierarchical multiple regression

    Directory of Open Access Journals (Sweden)

    Marina Jeger

    2014-12-01

    The goal of this study is to identify the contribution of effectuation dimensions to the predictive power of the entrepreneurial intention model over and above that which can be accounted for by other predictors selected and confirmed in previous studies. As is often the case in social and behavioral studies, some variables are likely to be highly correlated with each other. Therefore, the relative amount of variance in the criterion variable explained by each of the predictors depends on several factors such as the order of variable entry and sample specifics. The results show the modest predictive power of two dimensions of effectuation prior to the introduction of the theory of planned behavior elements. The article highlights the main advantages of applying hierarchical regression in social sciences as well as in the specific context of entrepreneurial intention formation, and addresses some of the potential pitfalls that this type of analysis entails.

  8. Predictive Ability of Pender's Health Promotion Model for Physical Activity and Exercise in People with Spinal Cord Injuries: A Hierarchical Regression Analysis

    Science.gov (United States)

    Keegan, John P.; Chan, Fong; Ditchman, Nicole; Chiu, Chung-Yi

    2012-01-01

    The main objective of this study was to validate Pender's Health Promotion Model (HPM) as a motivational model for exercise/physical activity self-management for people with spinal cord injuries (SCIs). A quantitative descriptive research design using hierarchical regression analysis (HRA) was employed. A total of 126 individuals with SCI were recruited…

  9. Regression Analysis

    CERN Document Server

    Freund, Rudolf J; Sa, Ping

    2006-01-01

    The book provides complete coverage of the classical methods of statistical analysis. It is designed to give students an understanding of the purpose of statistical analyses, to allow them to determine, at least to some degree, the correct type of statistical analysis to be performed in a given situation, and to give them some appreciation of what constitutes good experimental design.

  10. Hierarchical Neural Regression Models for Customer Churn Prediction

    Directory of Open Access Journals (Sweden)

    Golshan Mohammadi

    2013-01-01

    As customers are the main assets of each industry, customer churn prediction is becoming a major task for companies to remain competitive. In the literature, the better applicability and efficiency of hierarchical data mining techniques has been reported. This paper considers three hierarchical models built by combining four different data mining techniques for churn prediction: backpropagation artificial neural networks (ANN), self-organizing maps (SOM), alpha-cut fuzzy c-means (α-FCM), and the Cox proportional hazards regression model. The hierarchical models are ANN + ANN + Cox, SOM + ANN + Cox, and α-FCM + ANN + Cox. In particular, the first component of each model clusters the data into churner and nonchurner groups and also filters out unrepresentative data or outliers. The clustered data are then used by the second technique to assign customers to churner and nonchurner groups. Finally, the correctly classified data are used to create the Cox proportional hazards model. To evaluate the performance of the hierarchical models, an Iranian mobile dataset is considered. The experimental results show that the hierarchical models outperform the single Cox regression baseline model in terms of prediction accuracy, Type I and II errors, RMSE, and MAD metrics. In addition, the α-FCM + ANN + Cox model performs significantly better than the two other hierarchical models.

  11. Regression analysis by example

    National Research Council Canada - National Science Library

    Chatterjee, Samprit; Hadi, Ali S

    2012-01-01

    .... The emphasis continues to be on exploratory data analysis rather than statistical theory. The coverage offers in-depth treatment of regression diagnostics, transformation, multicollinearity, logistic regression, and robust regression...

  12. Analyzing Multilevel Data: Comparing Findings from Hierarchical Linear Modeling and Ordinary Least Squares Regression

    Science.gov (United States)

    Rocconi, Louis M.

    2013-01-01

    This study examined the differing conclusions one may reach depending upon the type of analysis chosen: hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities in three ways: (1) an OLS regression with the student…
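The divergence between pooled OLS and a multilevel analysis is easy to demonstrate numerically. In this hypothetical two-group example (not the study's data), the outcome rises with x at slope 2 inside each group, but the groups have different intercepts and sit at different x ranges, so pooling all observations and ignoring the grouping nearly erases the slope:

```python
import numpy as np

def ols_slope(x, y):
    """Simple-regression slope: cov(x, y) / var(x)."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    return ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()

# Two "institutions" with different intercepts (illustrative data):
# within each group, y rises with x at slope 2.
x_a, y_a = np.array([0., 1., 2.]), np.array([0., 2., 4.])
x_b, y_b = np.array([10., 11., 12.]), np.array([0., 2., 4.])

pooled = ols_slope(np.concatenate([x_a, x_b]), np.concatenate([y_a, y_b]))
within = 0.5 * (ols_slope(x_a, y_a) + ols_slope(x_b, y_b))
```

The within-group slope is exactly 2, while the pooled slope is close to zero; hierarchical linear models recover the within-group relationship by modeling group-level intercepts explicitly.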

  13. Regression analysis by example

    CERN Document Server

    Chatterjee, Samprit

    2012-01-01

    Praise for the Fourth Edition: ""This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable."" -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

  14. Hierarchical Multiple Regression in Counseling Research: Common Problems and Possible Remedies.

    Science.gov (United States)

    Petrocelli, John V.

    2003-01-01

    A brief content analysis was conducted on the use of hierarchical regression in counseling research published in the "Journal of Counseling Psychology" and the "Journal of Counseling & Development" during the years 1997-2001. Common problems are cited and possible remedies are described. (Contains 43 references and 3 tables.) (Author)

  15. Linear Regression Analysis

    CERN Document Server

    Seber, George A F

    2012-01-01

    Concise, mathematically clear, and comprehensive treatment of the subject.
    * Expanded coverage of diagnostics and methods of model fitting.
    * Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models.
    * More than 200 problems throughout the book plus outline solutions for the exercises.
    * This revision has been extensively class-tested.

  16. Price promotions on healthier compared with less healthy foods: a hierarchical regression analysis of the impact on sales and social patterning of responses to promotions in Great Britain.

    Science.gov (United States)

    Nakamura, Ryota; Suhrcke, Marc; Jebb, Susan A; Pechey, Rachel; Almiron-Roig, Eva; Marteau, Theresa M

    2015-04-01

    There is a growing concern, but limited evidence, that price promotions contribute to a poor diet and the social patterning of diet-related disease. We examined the following questions: 1) Are less-healthy foods more likely to be promoted than healthier foods? 2) Are consumers more responsive to promotions on less-healthy products? 3) Are there socioeconomic differences in food purchases in response to price promotions? With the use of hierarchical regression, we analyzed data on purchases of 11,323 products within 135 food and beverage categories from 26,986 households in Great Britain during 2010. Major supermarkets operated the same price promotions in all branches. The number of stores that offered price promotions on each product for each week was used to measure the frequency of price promotions. We assessed the healthiness of each product by using a nutrient profiling (NP) model. A total of 6788 products (60%) were in healthier categories and 4535 products (40%) were in less-healthy categories. There was no significant gap in the frequency of promotion by the healthiness of products, either within or between categories. However, after we controlled for the reference price, price discount rate, and brand-specific effects, the sales uplift arising from price promotions was larger in less-healthy than in healthier categories; a 1-SD increase in the category mean NP score, implying the category becomes less healthy, was associated with an additional 7.7-percentage-point increase in sales (from 27.3% to 35.0%). The sales uplift from promotions was larger for higher-socioeconomic-status (SES) groups than for lower ones (34.6% for the high-SES group, 28.1% for the middle-SES group, and 23.1% for the low-SES group). Finally, there was no significant SES gap in the absolute volume of purchases of less-healthy foods made on promotion. Attempts to limit promotions on less-healthy foods could improve the population diet but would be unlikely to reduce health

  17. Coordinate Descent Based Hierarchical Interactive Lasso Penalized Logistic Regression and Its Application to Classification Problems

    Directory of Open Access Journals (Sweden)

    Jin-Jia Wang

    2014-01-01

    We present hierarchical interactive lasso-penalized logistic regression using a coordinate descent algorithm, based on hierarchy theory and variable interactions. We define the interaction model based on geometric algebra and hierarchical constraint conditions and then use the coordinate descent algorithm to solve for the coefficients of the hierarchical interactive lasso model. We provide the results of experiments based on UCI datasets, the Madelon dataset from NIPS 2003, and daily activities of the elderly. The experimental results show that the variable interactions and hierarchy contribute significantly to the classification. The hierarchical interactive lasso combines the advantages of the lasso and the interactive lasso.
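The computational core of any lasso coordinate descent method is the soft-thresholding operator applied one coefficient at a time. The sketch below shows that core for the plain linear lasso (the logistic case in the paper wraps IRLS weights around the same update); it is a generic illustration, not the paper's hierarchical interactive variant:

```python
import numpy as np

def soft_threshold(z, t):
    """S(z, t) = sign(z) * max(|z| - t, 0): the lasso shrinkage operator."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/2)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove every predictor's fit except x_j's.
            r_j = y - X @ b + X[:, j] * b[j]
            b[j] = soft_threshold(X[:, j] @ r_j, lam) / col_sq[j]
    return b

# With an orthonormal design, the lasso solution equals the
# soft-thresholded OLS estimate, a known closed-form identity.
X = np.eye(4)
y = np.array([3.0, -2.0, 0.5, -0.2])
b = lasso_cd(X, y, lam=1.0)
```

Small coefficients are driven exactly to zero, which is the sparsity property the hierarchical constraints in the paper then organize across main effects and interactions.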

  18. Multiple linear regression analysis

    Science.gov (United States)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
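The stepwise procedure the program uses can be sketched as greedy forward selection: at each step, add the predictor that most reduces the residual sum of squares. This numpy sketch (illustrative, not the FORTRAN IV program) omits the confidence-level stopping rule and simply selects a fixed number of variables:

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of an OLS fit on the selected columns."""
    if not cols:
        return ((y - y.mean()) ** 2).sum()
    Xs = X[:, cols]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    r = y - Xs @ beta
    return (r ** 2).sum()

def forward_stepwise(X, y, k):
    """Greedily add the predictor giving the largest RSS reduction."""
    selected = []
    for _ in range(k):
        rest = [j for j in range(X.shape[1]) if j not in selected]
        best = min(rest, key=lambda j: rss(X, y, selected + [j]))
        selected.append(best)
    return selected

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
# Only columns 1 and 4 carry signal; column 1 is the stronger one.
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=100)
order = forward_stepwise(X, y, k=2)
```

A full implementation would keep adding (and testing for removal of) variables until no candidate passes the user-specified significance threshold, so the final regression contains only statistically significant coefficients.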

  19. Hierarchical Matching and Regression with Application to Photometric Redshift Estimation

    Science.gov (United States)

    Murtagh, Fionn

    2017-06-01

    This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data is to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects such as elliptical, spiral, active, and merging galaxies at a wide range of redshifts. Our aim is matching and similarity-based analytics that takes account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree where the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach, and to demonstrate computational benefits, we address the well-known photometric redshift or `photo-z' problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.
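Nearest-neighbour regression, the baseline that the paper's cluster-wise approach generalizes, predicts a target as the average over the most similar training points. A minimal one-dimensional sketch with made-up "photometry to redshift" pairs (purely illustrative, not SDSS data):

```python
import numpy as np

def knn_regress(x_train, y_train, x_new, k):
    """Predict by averaging the targets of the k nearest training points."""
    d = np.abs(np.asarray(x_train, float) - x_new)
    nearest = np.argsort(d)[:k]
    return float(np.mean(np.asarray(y_train, float)[nearest]))

# Toy photometric-measure -> redshift pairs (illustrative numbers only),
# forming two well-separated groups like distinct object classes.
mags = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
zs   = np.array([0.1, 0.2, 0.3, 1.0, 1.1, 1.2])

z_low  = knn_regress(mags, zs, x_new=2.1, k=3)    # averages the low-z group
z_high = knn_regress(mags, zs, x_new=11.2, k=3)   # averages the high-z group
```

Because the neighbours of each query all fall in one group, each prediction is the mean of that group alone; the hierarchy in the paper makes this grouping explicit, so matching respects branch structure rather than raw proximity.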

  20. Hierarchical matrices algorithms and analysis

    CERN Document Server

    Hackbusch, Wolfgang

    2015-01-01

    This self-contained monograph presents matrix algorithms and their analysis. The new technique enables not only the solution of linear systems but also the approximation of matrix functions, e.g., the matrix exponential. Other applications include the solution of matrix equations, e.g., the Lyapunov or Riccati equation. The required mathematical background can be found in the appendix. The numerical treatment of fully populated large-scale matrices is usually rather costly. However, the technique of hierarchical matrices makes it possible to store matrices and to perform matrix operations approximately with almost linear cost and a controllable degree of approximation error. For important classes of matrices, the computational cost increases only logarithmically with the approximation error. The operations provided include the matrix inversion and LU decomposition. Since large-scale linear algebra problems are standard in scientific computing, the subject of hierarchical matrices is of interest to scientists ...
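The mechanism behind hierarchical matrices' near-linear cost is that admissible off-diagonal blocks are stored in low-rank form. A small numpy sketch (illustrative, not from the monograph) of one such block, using a smooth kernel evaluated on well-separated point sets:

```python
import numpy as np

def low_rank_block(A, k):
    """Best rank-k approximation of a matrix block via truncated SVD.
    Hierarchical matrices store admissible off-diagonal blocks this way,
    cutting storage for an m x n block from O(mn) to O(k(m+n))."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

# The kernel 1/(x - y) is numerically low-rank when the x and y point
# sets are well separated (here, [0,1] vs [3,4]).
x = np.linspace(0.0, 1.0, 50)
y = np.linspace(3.0, 4.0, 50)
A = 1.0 / (x[:, None] - y[None, :])

A5 = low_rank_block(A, 5)
rel_err = np.linalg.norm(A - A5) / np.linalg.norm(A)
```

For this separated geometry the singular values decay exponentially, so rank 5 already reproduces the 50 x 50 block to high accuracy; the controllable approximation error mentioned in the blurb is exactly this truncation rank trade-off.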

  1. Mediation Analysis Using the Hierarchical Multiple Regression Technique: A Study of the Mediating Roles of World-Class Performance in Operations

    Directory of Open Access Journals (Sweden)

    Wakhid S. Ciptono

    2010-05-01

    mediating roles of the contextual factors of world-class performance in operations (i.e., world-class company practices or WCC, operational excellence practices or OE, company non-financial performance or CNFP, and company financial performance) would enable the company to facilitate the sustainability of the TQM implementation model. This empirical study aims to assess how TQM, a holistic management philosophy initially developed by W. Edwards Deming that integrates improvement strategy, management practices, and organizational performance, is specifically implemented in the oil and gas companies operating in Indonesia. Relevant literature on TQM, world-class performance in operations (world-class company and operational performance), company performance (financial and non-financial), the amendments of the Law of the Republic of Indonesia concerning the oil and gas industry, and related research on how the oil and gas industry in Indonesia develops sustainable competitive advantage and sustainable development programs is reviewed in detail in our study. The findings from the data analysis provide evidence that there is a strong positive relationship between the critical factors of quality management practices and company financial performance, mediated by the three mediating variables: world-class company practices, operational excellence practices, and company non-financial performance.

  2. Hierarchical analysis of acceptable use policies

    Directory of Open Access Journals (Sweden)

    P. A. Laughton

    2008-01-01

    Acceptable use policies (AUPs) are vital tools for organizations to protect themselves and their employees from misuse of the computer facilities provided. A well-structured, thorough AUP is essential for any organization. It is impossible for an AUP to deal with every possible clause and remain readable; for this reason, some sections of an AUP carry more weight than others, denoting importance. The methodology used to develop the hierarchical analysis is a literature review, in which various sources were consulted. This hierarchical approach to AUP analysis attempts to highlight the important sections and clauses dealt with in an AUP. The emphasis of the hierarchical analysis is to prioritize the objectives of an AUP.

  3. Hierarchical modeling and analysis for spatial data

    CERN Document Server

    Banerjee, Sudipto; Gelfand, Alan E

    2003-01-01

    Among the many uses of hierarchical modeling, its application to the statistical analysis of spatial and spatio-temporal data from areas such as epidemiology and environmental science has proven particularly fruitful. Yet to date, the few books that address the subject have been either too narrowly focused on specific aspects of spatial analysis, or written at a level often inaccessible to those lacking a strong background in mathematical statistics. Hierarchical Modeling and Analysis for Spatial Data is the first accessible, self-contained treatment of hierarchical methods, modeling, and data analysis

  4. An Automatic Hierarchical Delay Analysis Tool

    Institute of Scientific and Technical Information of China (English)

    Farid Mheir-El-Saadi; Bozena Kaminska

    1994-01-01

    The performance analysis of VLSI integrated circuits (ICs) with flat tools is slow and sometimes impossible to complete. Some hierarchical tools have been developed to speed up the analysis of these large ICs. However, these hierarchical tools suffer from poor interaction with the CAD database and poorly automated operations. We introduce a general hierarchical framework for performance analysis to solve these problems. Circuit analysis is automatic under the proposed framework. Information that has been automatically abstracted in the hierarchy is kept in database properties along with the topological information. A limited software implementation of the framework, PREDICT, has also been developed to analyze delay performance. Experimental results show that hierarchical analysis CPU time and memory requirements are low if heuristics are used during the abstraction process.

  5. Hierarchical Analysis of the Omega Ontology

    Energy Technology Data Exchange (ETDEWEB)

    Joslyn, Cliff A.; Paulson, Patrick R.

    2009-12-01

    Initial delivery for mathematical analysis of the Omega Ontology. We provide an analysis of the hierarchical structure of a version of the Omega Ontology currently in use within the US Government. After providing an initial statistical analysis of the distribution of all link types in the ontology, we then provide a detailed order theoretical analysis of each of the four main hierarchical links present. This order theoretical analysis includes the distribution of components and their properties, their parent/child and multiple inheritance structure, and the distribution of their vertical ranks.

  6. Evidence for a non-universal Kennicutt-Schmidt relationship using hierarchical Bayesian linear regression

    CERN Document Server

    Shetty, Rahul; Bigiel, Frank

    2012-01-01

    We develop a Bayesian linear regression method which rigorously treats measurement uncertainties, and accounts for hierarchical data structure for investigating the relationship between the star formation rate and gas surface density. The method simultaneously estimates the intercept, slope, and scatter about the regression line of each individual subject (e.g. a galaxy) and the population (e.g. an ensemble of galaxies). Using synthetic datasets, we demonstrate that the Bayesian method accurately recovers the parameters of both the individuals and the population, especially when compared to commonly employed least squares methods, such as the bisector. We apply the Bayesian method to estimate the Kennicutt-Schmidt (KS) parameters of a sample of spiral galaxies compiled by Bigiel et al. (2008). We find significant variation in the KS parameters, indicating that no single KS relationship holds for all galaxies. This suggests that the relationship between molecular gas and star formation differs between galaxies...

  7. Principal Covariates Clusterwise Regression (PCCR): Accounting for Multicollinearity and Population Heterogeneity in Hierarchically Organized Data.

    Science.gov (United States)

    Wilderjans, Tom Frans; Vande Gaer, Eva; Kiers, Henk A L; Van Mechelen, Iven; Ceulemans, Eva

    2017-03-01

    In the behavioral sciences, many research questions pertain to a regression problem in that one wants to predict a criterion on the basis of a number of predictors. Although in many cases ordinary least squares regression will suffice, sometimes the prediction problem is more challenging, for three reasons. First, multiple highly collinear predictors can be available, making it difficult to grasp their mutual relations as well as their relations to the criterion. In that case, it may be very useful to reduce the predictors to a few summary variables, on which one regresses the criterion and which at the same time yield insight into the predictor structure. Second, the population under study may consist of a few unknown subgroups that are characterized by different regression models. Third, the obtained data are often hierarchically structured, with, for instance, observations being nested into persons or participants within groups or countries. Although some methods have been developed that partially meet these challenges (i.e., principal covariates regression (PCovR), clusterwise regression (CR), and structural equation models), none of these methods adequately deals with all of them simultaneously. To fill this gap, we propose the principal covariates clusterwise regression (PCCR) method, which combines the key ideas behind PCovR (de Jong & Kiers in Chemom Intell Lab Syst 14(1-3):155-164, 1992) and CR (Späth in Computing 22(4):367-373, 1979). The PCCR method is validated by means of a simulation study and by applying it to cross-cultural data regarding satisfaction with life.

  8. Common pitfalls in statistical analysis: Logistic regression.

    Science.gov (United States)

    Ranganathan, Priya; Pramesh, C S; Aggarwal, Rakesh

    2017-01-01

    Logistic regression analysis is a statistical technique to evaluate the relationship between various predictor variables (either categorical or continuous) and an outcome which is binary (dichotomous). In this article, we discuss logistic regression analysis and the limitations of this technique.
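The standard fitting procedure for logistic regression is Newton-Raphson (equivalently, iteratively reweighted least squares), which most statistical packages run under the hood. A self-contained numpy sketch on made-up, deliberately non-separable data (complete separation, one of the pitfalls such articles warn about, would send the estimates to infinity):

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson (IRLS), with intercept."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X1 @ beta))     # predicted probabilities
        W = p * (1.0 - p)                        # IRLS weights
        grad = X1.T @ (y - p)
        hess = X1.T @ (X1 * W[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Binary outcome vs one continuous predictor (illustrative data only;
# the 0s and 1s overlap so the MLE stays finite).
x = np.array([0., 1., 2., 3., 4., 5., 6., 7.])
y = np.array([0., 0., 1., 0., 1., 0., 1., 1.])

beta = fit_logistic(x[:, None], y)
odds_ratio = np.exp(beta[1])   # multiplicative change in odds per unit x
```

The exponentiated slope is the odds ratio usually reported alongside logistic regression results; here the outcome becomes more likely as x grows, so it exceeds 1.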

  9. A Logistic Regression Model with a Hierarchical Random Error Term for Analyzing the Utilization of Public Transport

    Directory of Open Access Journals (Sweden)

    Chong Wei

    2015-01-01

    Logistic regression models have been widely used in previous studies to analyze public transport utilization. These studies have shown travel time to be an indispensable variable for such analysis and usually consider it to be a deterministic variable. This formulation does not allow us to capture travelers’ perception error regarding travel time, and recent studies have indicated that this error can have a significant effect on modal choice behavior. In this study, we propose a logistic regression model with a hierarchical random error term. The proposed model adds a new random error term for the travel time variable. This term structure enables us to investigate travelers’ perception error regarding travel time from a given choice behavior dataset. We also propose an extended model that allows constraining the sign of this error in the model. We develop two Gibbs samplers to estimate the basic hierarchical model and the extended model. The performance of the proposed models is examined using a well-known dataset.

  10. Principal component regression analysis with SPSS.

    Science.gov (United States)

    Liu, R X; Kuang, J; Gong, Q; Hou, X L

    2003-06-01

    The paper introduces the indices used in multicollinearity diagnostics, the basic principle of principal component regression, and a method for determining the 'best' equation. A worked example shows how to carry out principal component regression with SPSS 10.0, covering all calculation steps of the principal component regression and the corresponding SPSS 10.0 procedures: linear regression, factor analysis, descriptives, compute variable, and bivariate correlations. Principal component regression can be used to overcome the disturbance of multicollinearity, and performing it with SPSS yields a simplified, faster, and accurate statistical analysis.
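Independent of the SPSS menus the paper walks through, the underlying computation can be sketched in NumPy. The collinear data and the choice of one retained component below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three highly collinear predictors: x2 and x3 are noisy copies of x1.
n = 200
x1 = rng.normal(size=n)
X = np.column_stack([x1,
                     x1 + 0.05 * rng.normal(size=n),
                     x1 + 0.05 * rng.normal(size=n)])
y = 2.0 * x1 + rng.normal(scale=0.1, size=n)

# 1. Standardize the predictors.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Principal components from the correlation matrix.
eigval, eigvec = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(eigval)[::-1]
eigval, eigvec = eigval[order], eigvec[:, order]

# 3. Keep the dominant component(s) and regress y on the scores.
k = 1                              # one component captures the shared signal here
T = Z @ eigvec[:, :k]              # component scores
gamma, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), T]), y, rcond=None)

# 4. Map back to coefficients on the standardized predictors.
beta_std = eigvec[:, :k] @ gamma[1:]
```

Because the regression is done on (near-)orthogonal component scores rather than the collinear raw predictors, the coefficient estimates are stable, which is the point of the multicollinearity remedy the abstract describes.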

  11. Regression Analysis by Example. 5th Edition

    Science.gov (United States)

    Chatterjee, Samprit; Hadi, Ali S.

    2012-01-01

    Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…

  12. Analysis hierarchical model for discrete event systems

    Science.gov (United States)

    Ciortea, E. M.

    2015-11-01

    This paper presents a hierarchical model, based on discrete-event networks, for robotic systems. In the hierarchical approach, a Petri net is analysed both at the highest conceptual level and at the lowest level of local control, and extended Petri nets are used for modelling and control of complex robotic systems. Such a system is structured, controlled, and analysed here using the Visual Object Net ++ package, which is relatively simple and easy to use, and the results are shown as representations that are easy to interpret. The hierarchical structure of the robotic system is implemented and analysed on computers using specialized programs. Implementing the hierarchical discrete-event model as a real-time operating system on a computer network connected via a serial bus is possible, with each computer dedicated to the local Petri model of one subsystem of the global robotic system. Since Petri models are simple enough to run on general-purpose computers, the analysis, modelling, and control of complex manufacturing systems can be achieved using Petri nets; discrete-event systems are a pragmatic tool for modelling industrial systems. To highlight auxiliary times, the Petri model of the transport stream is divided into hierarchical levels whose sections are analysed successively. Simulating the proposed robotic system with timed Petri nets offers the opportunity to view its timing behaviour: by measuring the transport and transmission times of goods on the spot, graphics are obtained showing the average time for the transport activity for given sets of finished products.

  13. Price promotions on healthier compared with less healthy foods: a hierarchical regression analysis of the impact on sales and social patterning of responses to promotions in Great Britain

    Science.gov (United States)

    Nakamura, Ryota; Suhrcke, Marc; Jebb, Susan A; Pechey, Rachel; Almiron-Roig, Eva; Marteau, Theresa M

    2015-01-01

    Background: There is growing concern, but limited evidence, that price promotions contribute to a poor diet and the social patterning of diet-related disease. Objective: We examined the following questions: 1) Are less-healthy foods more likely to be promoted than healthier foods? 2) Are consumers more responsive to promotions on less-healthy products? 3) Are there socioeconomic differences in food purchases in response to price promotions? Design: With the use of hierarchical regression, we analyzed data on purchases of 11,323 products within 135 food and beverage categories from 26,986 households in Great Britain during 2010. Major supermarkets operated the same price promotions in all branches. The number of stores that offered price promotions on each product in each week was used to measure the frequency of price promotions. We assessed the healthiness of each product by using a nutrient profiling (NP) model. Results: A total of 6788 products (60%) were in healthier categories and 4535 products (40%) were in less-healthy categories. There was no significant gap in the frequency of promotion by the healthiness of products, either within or between categories. However, after we controlled for the reference price, price discount rate, and brand-specific effects, the sales uplift arising from price promotions was larger in less-healthy than in healthier categories; a 1-SD increase in the category mean NP score, implying the category becomes less healthy, was associated with an additional 7.7-percentage-point increase in sales (from 27.3% to 35.0%). The sales uplift from promotions was also larger for higher-socioeconomic-status (SES) groups than for lower ones (34.6% for the high-SES group, 28.1% for the middle-SES group, and 23.1% for the low-SES group). Finally, there was no significant SES gap in the absolute volume of purchases of less-healthy foods made on promotion. Conclusion: Attempts to limit promotions on less-healthy foods could improve the

  14. Empirical Study on the Driving Factors of China's Internet Insurance Based on Hierarchical Regression Analysis

    Institute of Scientific and Technical Information of China (English)

    汤英汉

    2015-01-01

    By analyzing the features and status quo of China's internet insurance development, this paper finds that the main reason for the weak growth of the insurance industry is the increasingly prominent conflict between the growing demand for insurance, driven by a rapidly changing market environment, and relatively backward insurance management approaches. Internet insurance compensates for the shortcomings of traditional insurance and has become a new growth driver for the industry. Using the hierarchical regression method, this paper analyzes online insurance premiums and related data from 2003 to 2013. The results show that the driving factors of internet insurance are mainly tax, population, and the internet, while factors internal to the insurance industry itself have no significant effect. The study also indicates that internet insurance is not a replacement for, or a threat to, the traditional insurance business, but rather the discovery of new insurance demand: internet insurance can satisfy people's insurance needs at multiple levels. Finally, the author argues that, as a new form of insurance business, the development of internet insurance promotes a change in the thinking and ideas of the insurance industry as a whole; internet technology has pushed the industry forward, especially in areas such as insurance channels and product and service innovation. Internet insurance thus injects fresh blood into China's insurance industry.

  15. Functional linear regression via canonical analysis

    CERN Document Server

    He, Guozhong; Wang, Jane-Ling; Yang, Wenjing; 10.3150/09-BEJ228

    2011-01-01

    We study regression models for the situation where both dependent and independent variables are square-integrable stochastic processes. Questions concerning the definition and existence of the corresponding functional linear regression models and some basic properties are explored for this situation. We derive a representation of the regression parameter function in terms of the canonical components of the processes involved. This representation establishes a connection between functional regression and functional canonical analysis and suggests alternative approaches for the implementation of functional linear regression analysis. A specific procedure for the estimation of the regression parameter function using canonical expansions is proposed and compared with an established functional principal component regression approach. As an example of an application, we present an analysis of mortality data for cohorts of medflies, obtained in experimental studies of aging and longevity.

  16. Using Regression Mixture Analysis in Educational Research

    Directory of Open Access Journals (Sweden)

    Cody S. Ding

    2006-11-01

    Conventional regression analysis is typically used in educational research. Usually such an analysis implicitly assumes that a common set of regression parameter estimates captures the population characteristics represented in the sample. In some situations, however, this implicit assumption may not be realistic, and the sample may contain several subpopulations, such as high math achievers and low math achievers. In these cases, conventional regression models may provide biased estimates, since the parameter estimates are constrained to be the same across subpopulations. This paper advocates the application of regression mixture models, also known as latent class regression analysis, in educational research. Regression mixture analysis is more flexible than conventional regression analysis in that latent classes in the data can be identified and regression parameter estimates can vary within each latent class. An illustration of regression mixture analysis is provided based on a dataset of authentic data. The strengths and limitations of regression mixture models are discussed in the context of educational research.
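As a sketch of the latent-class idea (not the authors' code), the following EM algorithm fits a two-component mixture of linear regressions to synthetic data containing two subpopulations with opposite slopes, which a single conventional regression would average away.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two latent classes with opposite slopes, analogous to two achiever groups.
n = 150
x = rng.uniform(-3, 3, size=2 * n)
y = np.concatenate([2.0 * x[:n], -2.0 * x[n:]]) + rng.normal(scale=0.3, size=2 * n)
X = np.column_stack([np.ones_like(x), x])

# EM for a 2-component mixture of linear regressions.
betas = np.array([[0.0, 1.0], [0.0, -1.0]])   # initial (intercept, slope) per class
sigma, weights = 1.0, np.array([0.5, 0.5])
for _ in range(50):
    # E-step: responsibility of each class for each observation.
    dens = np.stack([
        weights[k] * np.exp(-0.5 * ((y - X @ betas[k]) / sigma) ** 2)
        for k in range(2)
    ])
    resp = dens / dens.sum(axis=0)
    # M-step: weighted least squares per class, then update weights and sigma.
    for k in range(2):
        W = resp[k]
        betas[k] = np.linalg.solve(X.T * W @ X, X.T @ (W * y))
    weights = resp.mean(axis=1)
    resid = y - np.einsum('ij,kj->ki', X, betas)
    sigma = np.sqrt((resp * resid ** 2).sum() / (2 * n))
```

With well-separated classes, the two recovered slopes approach +2 and -2, whereas a single pooled regression slope would sit near 0 and describe neither subpopulation, which is exactly the bias the abstract warns about.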

  17. Applied regression analysis a research tool

    CERN Document Server

    Pantula, Sastry; Dickey, David

    1998-01-01

    Least squares estimation, when used appropriately, is a powerful research tool. A deeper understanding of the regression concepts is essential for achieving optimal benefits from a least squares analysis. This book builds on the fundamentals of statistical methods and provides appropriate concepts that will allow a scientist to use least squares as an effective research tool. Applied Regression Analysis is aimed at the scientist who wishes to gain a working knowledge of regression analysis. The basic purpose of this book is to develop an understanding of least squares and related statistical methods without becoming excessively mathematical. It is the outgrowth of more than 30 years of consulting experience with scientists and many years of teaching an applied regression course to graduate students. Applied Regression Analysis serves as an excellent text for a service course on regression for non-statisticians and as a reference for researchers. It also provides a bridge between a two-semester introduction to...

  18. Regression Analysis and the Sociological Imagination

    Science.gov (United States)

    De Maio, Fernando

    2014-01-01

    Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.

  20. Hierarchical Vector Auto-Regressive Models and Their Applications to Multi-subject Effective Connectivity

    Directory of Open Access Journals (Sweden)

    Cristina eGorrostieta

    2013-11-01

    Vector auto-regressive (VAR) models typically form the basis for constructing directed graphical models for investigating connectivity in a brain network with brain regions of interest (ROIs) as nodes. There are limitations in the standard VAR models. The number of parameters increases quadratically with the number of ROIs and linearly with the order of the model, and this large parameter space can pose serious estimation problems. Moreover, when applied to imaging data, the standard VAR model does not account for variability in the connectivity structure across subjects. In this paper, we develop a novel generalization of the VAR model that overcomes these limitations. To deal with the high dimensionality of the parameter space, we propose a Bayesian hierarchical framework for the VAR model that accounts for both temporal correlation within a subject and between-subject variation. Our approach uses prior distributions that give rise to estimates corresponding to a penalized least squares criterion with the elastic net penalty. We apply the proposed model to investigate differences in effective connectivity during a hand grasp experiment between healthy controls and patients with residual motor deficit following a stroke.
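The least-squares estimation step that the hierarchical model generalizes can be illustrated for a standard VAR(1). The two-node network and its coefficient matrix below are invented for illustration; entry A[i, j] is the influence of node j's past value on node i's present value.

```python
import numpy as np

rng = np.random.default_rng(2)

# A stable 2-node VAR(1): x_t = A @ x_{t-1} + noise.
# Node 0 is driven by node 1 (A[0, 1] = 0.2); node 1 receives no input from node 0.
A = np.array([[0.5, 0.2],
              [0.0, 0.4]])

T = 3000
x = np.zeros((T, 2))
for t in range(1, T):
    x[t] = A @ x[t - 1] + rng.normal(scale=0.5, size=2)

# Least-squares estimate: regress x_t on x_{t-1}.
# Row form x_t' = x_{t-1}' A', so lstsq returns A-transpose.
Z, Y = x[:-1], x[1:]
A_hat = np.linalg.lstsq(Z, Y, rcond=None)[0].T
```

With 2 ROIs this is trivial, but the same design matrix grows quadratically with the number of nodes, which is the dimensionality problem that motivates the paper's Bayesian shrinkage approach.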

  1. Type Ia Supernova Colors and Ejecta Velocities: Hierarchical Bayesian Regression with Non-Gaussian Distributions

    CERN Document Server

    Mandel, Kaisey S; Kirshner, Robert P

    2014-01-01

    We investigate the correlations between the peak intrinsic colors of Type Ia supernovae (SN Ia) and their expansion velocities at maximum light, measured from the Si II 6355 A spectral feature. We construct a new hierarchical Bayesian regression model and Gibbs sampler to estimate the dependence of the intrinsic colors of a SN Ia on its ejecta velocity, while accounting for the random effects of intrinsic scatter, measurement error, and reddening by host galaxy dust. The method is applied to the apparent color data from BVRI light curves and Si II velocity data for 79 nearby SN Ia. Comparison of the apparent color distributions of high velocity (HV) and normal velocity (NV) supernovae reveals significant discrepancies in B-V and B-R, but not other colors. Hence, they are likely due to intrinsic color differences originating in the B-band, rather than dust reddening. The mean intrinsic B-V and B-R color differences between HV and NV groups are 0.06 +/- 0.02 and 0.09 +/- 0.02 mag, respectively. Under a linear m...

  2. Hierarchical Parallelization of Gene Differential Association Analysis

    Directory of Open Access Journals (Sweden)

    Dwarkadas Sandhya

    2011-09-01

    Background: Microarray gene differential expression analysis is a widely used technique that deals with high-dimensional data and is computationally intensive for permutation-based procedures. Microarray gene differential association analysis is even more computationally demanding and must take advantage of multicore computing technology, which is the driving force behind increasing compute power in recent years. In this paper, we present a two-layer hierarchical parallel implementation of gene differential association analysis. It takes advantage of both fine- and coarse-grain parallelism (with granularity defined by the frequency of communication) in order to effectively leverage the non-uniform nature of parallel processing available in the cutting-edge systems of today. Results: Our results show that this hierarchical strategy matches data-sharing behavior to the properties of the underlying hardware, thereby reducing the memory and bandwidth needs of the application. The resulting improved efficiency reduces computation time and allows the gene differential association analysis code to scale its execution with the number of processors. The code and biological data used in this study are downloadable from http://www.urmc.rochester.edu/biostat/people/faculty/hu.cfm. Conclusions: The performance sweet spot occurs when using a number of threads per MPI process that allows the working sets of the corresponding MPI processes running on the multicore to fit within the machine cache. Hence, we suggest that practitioners follow this principle in selecting the appropriate number of MPI processes and threads within each MPI process for their cluster configurations. We believe that the principles of this hierarchical approach to parallelization can be utilized in the parallelization of other computationally demanding kernels.

  3. Constructing storyboards based on hierarchical clustering analysis

    Science.gov (United States)

    Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu

    2005-07-01

    There are growing needs for quick preview of video contents for the purpose of improving accessibility of video archives as well as reducing network traffics. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoid a repetition of computationally-intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.
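A minimal sketch of the keyframe-selection idea follows. It is simplified to scalar "features" instead of the wavelet-coefficient vectors the paper uses, and it is not the authors' implementation: bottom-up single-linkage clustering merges similar frames, and one representative per cluster forms the storyboard.

```python
# Toy frame features: three visually distinct "shots" (values chosen for clarity;
# a real system would use feature vectors derived from wavelet coefficients).
frames = [0.0, 0.1, 0.2, 5.0, 5.1, 9.8, 10.0, 10.1]

def agglomerate(points, k):
    """Bottom-up single-linkage clustering until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    def dist(a, b):
        return min(abs(points[i] - points[j]) for i in a for j in b)
    while len(clusters) > k:
        # Merge the closest pair of clusters (j > i, so popping j is safe).
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[i] += clusters.pop(j)
    return clusters

def keyframes(points, k):
    """One representative frame index per cluster = a k-frame storyboard."""
    reps = []
    for c in agglomerate(points, k):
        c = sorted(c)
        reps.append(c[len(c) // 2])   # middle frame of the cluster
    return sorted(reps)
```

Asking for `keyframes(frames, 3)` picks one frame from each of the three shots; varying `k` changes the storyboard length without re-extracting features, which is the reuse the abstract emphasizes.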

  4. TYPE Ia SUPERNOVA COLORS AND EJECTA VELOCITIES: HIERARCHICAL BAYESIAN REGRESSION WITH NON-GAUSSIAN DISTRIBUTIONS

    Energy Technology Data Exchange (ETDEWEB)

    Mandel, Kaisey S.; Kirshner, Robert P. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Foley, Ryan J., E-mail: kmandel@cfa.harvard.edu [Astronomy Department, University of Illinois at Urbana-Champaign, 1002 West Green Street, Urbana, IL 61801 (United States)

    2014-12-20

    We investigate the statistical dependence of the peak intrinsic colors of Type Ia supernovae (SNe Ia) on their expansion velocities at maximum light, measured from the Si II λ6355 spectral feature. We construct a new hierarchical Bayesian regression model, accounting for the random effects of intrinsic scatter, measurement error, and reddening by host galaxy dust, and implement a Gibbs sampler and deviance information criteria to estimate the correlation. The method is applied to the apparent colors from BVRI light curves and Si II velocity data for 79 nearby SNe Ia. The apparent color distributions of high-velocity (HV) and normal velocity (NV) supernovae exhibit significant discrepancies for B – V and B – R, but not other colors. Hence, they are likely due to intrinsic color differences originating in the B band, rather than dust reddening. The mean intrinsic B – V and B – R color differences between HV and NV groups are 0.06 ± 0.02 and 0.09 ± 0.02 mag, respectively. A linear model finds significant slopes of –0.021 ± 0.006 and –0.030 ± 0.009 mag (10³ km s⁻¹)⁻¹ for intrinsic B – V and B – R colors versus velocity, respectively. Because the ejecta velocity distribution is skewed toward high velocities, these effects imply non-Gaussian intrinsic color distributions with skewness up to +0.3. Accounting for the intrinsic-color–velocity correlation results in corrections to A_V extinction estimates as large as –0.12 mag for HV SNe Ia and +0.06 mag for NV events. Velocity measurements from SN Ia spectra have the potential to diminish systematic errors from the confounding of intrinsic colors and dust reddening affecting supernova distances.

  5. Hierarchical analysis of the quiet Sun magnetism

    CERN Document Server

    Ramos, A Asensio

    2014-01-01

    Standard statistical analysis of the magnetic properties of the quiet Sun relies on simple histograms of quantities inferred from maximum-likelihood estimations. Because of the inherent degeneracies, either intrinsic or induced by the noise, this approach is not optimal and can lead to highly biased results. We carry out a meta-analysis of the magnetism of the quiet Sun from Hinode observations using a hierarchical probabilistic method. This model allows us to infer the statistical properties of the magnetic field vector over the observed field of view, consistently taking into account the uncertainties in each pixel due to noise and degeneracies. Our results indicate that the magnetic fields are very weak, below 275 G with 95% credibility, with a slight preference for horizontal fields, although the distribution is not far from quasi-isotropic.

  6. Bayesian hierarchical piecewise regression models: a tool to detect trajectory divergence between groups in long-term observational studies.

    Science.gov (United States)

    Buscot, Marie-Jeanne; Wotherspoon, Simon S; Magnussen, Costan G; Juonala, Markus; Sabin, Matthew A; Burgner, David P; Lehtimäki, Terho; Viikari, Jorma S A; Hutri-Kähönen, Nina; Raitakari, Olli T; Thomson, Russell J

    2017-06-06

    Bayesian hierarchical piecewise regression (BHPR) modeling has not previously been formulated to detect and characterise the mechanism of trajectory divergence between groups of participants that have longitudinal responses with distinct developmental phases. These models are useful when participants in a prospective cohort study are grouped according to a distal dichotomous health outcome. Indeed, a refined understanding of how deleterious risk factor profiles develop across the life-course may help inform early-life interventions. Previous techniques to determine between-group differences in risk factors at each age may result in biased estimates of the age at divergence. We demonstrate the use of BHPR to generate a point estimate and credible interval for the age at which trajectories diverge between groups for continuous outcome measures that exhibit non-linear within-person response profiles over time. We illustrate our approach by modeling the divergence in childhood-to-adulthood body mass index (BMI) trajectories between two groups of adults with/without type 2 diabetes mellitus (T2DM) in the Cardiovascular Risk in Young Finns Study (YFS). Using the proposed BHPR approach, we estimated that the BMI profiles of participants with T2DM diverged from those of healthy participants at age 16 years for males (95% credible interval (CI): 13.5-18 years) and 21 years for females (95% CI: 19.5-23 years). These data suggest that a critical window for weight management intervention in preventing T2DM might exist before the age when BMI growth rate is naturally expected to decrease. Simulation showed that, when using pairwise comparison of least-square means from categorical mixed models, smaller sample sizes tended to conclude a later age of divergence. In contrast, the point estimate of the divergence time is not biased by sample size when using the proposed BHPR method. BHPR is a powerful analytic tool to model long-term non
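A simplified, non-Bayesian sketch of the piecewise idea: a two-segment "broken stick" regression whose breakpoint is chosen by grid search over candidate ages. The trajectory below is synthetic and only loosely BMI-like; the hierarchical, credible-interval machinery of BHPR is not reproduced here.

```python
import numpy as np

def fit_piecewise(x, y, candidates):
    """Two-segment continuous linear fit: pick the breakpoint minimizing RSS.

    Model: y = b0 + b1*x + b2*max(x - c, 0), so the slope changes by b2 at c.
    """
    best = None
    for c in candidates:
        X = np.column_stack([np.ones_like(x), x, np.maximum(x - c, 0.0)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(((y - X @ beta) ** 2).sum())
        if best is None or rss < best[0]:
            best = (rss, c, beta)
    return best[1], best[2]

rng = np.random.default_rng(3)
x = np.linspace(5, 30, 200)                       # e.g. age in years
# Synthetic trajectory: growth rate drops at age 18.
y = 16 + 0.8 * x - 0.6 * np.maximum(x - 18, 0) + rng.normal(scale=0.3, size=x.size)
c_hat, beta = fit_piecewise(x, y, candidates=np.arange(10, 26, 0.5))
```

Fitting this model separately to two groups and comparing the fitted curves is the frequentist analogue of the divergence question; the BHPR approach instead places the comparison inside one hierarchical model and yields a credible interval for the divergence age.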

  7. Heteroscedastic regression analysis method for mixed data

    Institute of Scientific and Technical Information of China (English)

    FU Hui-min; YUE Xiao-rui

    2011-01-01

    The heteroscedastic regression model was established and the heteroscedastic regression analysis method was presented for mixed data composed of complete data, type-I censored data and type-II censored data from the location-scale distribution. The best unbiased estimations of regression coefficients, as well as the confidence limits of the location parameter and scale parameter, were given. Furthermore, the point estimations and confidence limits of percentiles were obtained. Thus, the traditional multiple regression analysis method, which is only suitable for complete data from the normal distribution, can be extended to the cases of heteroscedastic mixed data and the location-scale distribution. So the presented method has a broad range of promising applications.

  8. Relative risk regression analysis of epidemiologic data.

    Science.gov (United States)

    Prentice, R L

    1985-11-01

    Relative risk regression methods are described. These methods provide a unified approach to a range of data analysis problems in environmental risk assessment and in the study of disease risk factors more generally. Relative risk regression methods are most readily viewed as an outgrowth of Cox's regression and life model. They can also be viewed as a regression generalization of more classical epidemiologic procedures, such as that due to Mantel and Haenszel. In the context of an epidemiologic cohort study, relative risk regression methods extend conventional survival data methods and binary response (e.g., logistic) regression models by taking explicit account of the time to disease occurrence while allowing arbitrary baseline disease rates, general censorship, and time-varying risk factors. This latter feature is particularly relevant to many environmental risk assessment problems wherein one wishes to relate disease rates at a particular point in time to aspects of a preceding risk factor history. Relative risk regression methods also adapt readily to time-matched case-control studies and to certain less standard designs. The uses of relative risk regression methods are illustrated and the state of development of these procedures is discussed. It is argued that asymptotic partial likelihood estimation techniques are now well developed in the important special case in which the disease rates of interest have interpretations as counting process intensity functions. Estimation of relative risks processes corresponding to disease rates falling outside this class has, however, received limited attention. The general area of relative risk regression model criticism has, as yet, not been thoroughly studied, though a number of statistical groups are studying such features as tests of fit, residuals, diagnostics and graphical procedures. Most such studies have been restricted to exponential form relative risks as have simulation studies of relative risk estimation

  9. Regression analysis using dependent Polya trees.

    Science.gov (United States)

    Schörgendorfer, Angela; Branscum, Adam J

    2013-11-30

    Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.

  10. Bayesian hierarchical model used to analyze regression between fish body size and scale size: application to rare fish species Zingel asper

    Directory of Open Access Journals (Sweden)

    Fontez B.

    2014-04-01

    Back-calculation makes it possible to increase the data available on fish growth, and the accuracy of back-calculation models is of paramount importance for growth analysis. Frequentist and Bayesian hierarchical approaches were used for the regression between fish body size and scale size for the rare fish species Zingel asper. The Bayesian approach permits more reliable estimation of back-calculated size, taking into account biological information and cohort variability, and greatly improves the estimation of back-calculated length when sampling is uneven and/or small.
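For contrast with the hierarchical Bayesian model, one classical deterministic back-calculation rule, the Fraser-Lee formula, fits in a few lines. The intercept value and the measurements below are illustrative assumptions, not data from the Zingel asper study.

```python
def fraser_lee(body_at_capture, scale_at_capture, scale_at_age, c=2.0):
    """Back-calculate body length at an earlier age from a scale-ring radius.

    c is the intercept of the body-length-on-scale-radius regression
    (an assumed illustrative value here, in the same units as body length).
    Fraser-Lee: L_age = c + (L_capture - c) * S_age / S_capture.
    """
    return c + (body_at_capture - c) * scale_at_age / scale_at_capture

# A fish 20 cm long at capture with scale radius 4.0 mm; the ring laid
# down at age 1 has radius 1.0 mm.
length_age1 = fraser_lee(20.0, 4.0, 1.0)
```

The hierarchical approach in the article effectively replaces the fixed intercept `c` with cohort-level parameters estimated jointly across fish, which is what buys the robustness to small, uneven samples.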

  11. Robust Mediation Analysis Based on Median Regression

    Science.gov (United States)

    Yuan, Ying; MacKinnon, David P.

    2014-01-01

    Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. PMID:24079925

  12. Functional data analysis of generalized regression quantiles

    KAUST Repository

    Guo, Mengmeng

    2013-11-05

    Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.
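One member of the generalized-quantile family mentioned above, the expectile, can be estimated by asymmetric least squares. The following sketch (synthetic data, no principal-component pooling across curves) shows the basic iteratively reweighted fit for a single linear expectile.

```python
import numpy as np

def expectile_reg(X, y, tau=0.9, iters=100):
    """Linear expectile regression via iteratively reweighted least squares.

    Minimizes sum w_i * (y_i - X_i @ beta)^2 with w_i = tau for positive
    residuals and (1 - tau) for negative ones (asymmetric least squares).
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(iters):
        r = y - X @ beta
        w = np.where(r > 0, tau, 1 - tau)
        beta = np.linalg.solve(X.T * w @ X, X.T @ (w * y))
    return beta

rng = np.random.default_rng(4)
n = 500
x = rng.uniform(0, 1, n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), x])
b_lo = expectile_reg(X, y, tau=0.1)   # lower-tail expectile line
b_hi = expectile_reg(X, y, tau=0.9)   # upper-tail expectile line
```

Estimating a whole family of such curves independently is noisy; the paper's contribution is to pool them through shared principal component functions, so the tails borrow strength from the rest of the family.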

  13. Hierarchical design of a polymeric nanovehicle for efficient tumor regression and imaging

    Science.gov (United States)

    An, Jinxia; Guo, Qianqian; Zhang, Peng; Sinclair, Andrew; Zhao, Yu; Zhang, Xinge; Wu, Kan; Sun, Fang; Hung, Hsiang-Chieh; Li, Chaoxing; Jiang, Shaoyi

    2016-04-01

    Effective delivery of therapeutics to disease sites significantly contributes to drug efficacy, toxicity and clearance. Here we designed a hierarchical polymeric nanoparticle structure for anti-cancer chemotherapy delivery by utilizing state-of-the-art polymer chemistry and co-assembly techniques. This novel structural design combines the most desired merits for drug delivery in a single particle, including a long in vivo circulation time, inhibited non-specific cell uptake, enhanced tumor cell internalization, pH-controlled drug release and simultaneous imaging. This co-assembled nanoparticle showed exceptional stability in complex biological media. Benefiting from the synergistic effects of zwitterionic and multivalent galactose polymers, drug-loaded nanoparticles were selectively internalized by cancer cells rather than normal tissue cells. In addition, the pH-responsive core retained its cargo within the polymeric coating through hydrophobic interaction and released it under slightly acidic conditions. In vivo pharmacokinetic studies in mice showed minimal uptake of nanoparticles by the mononuclear phagocyte system and excellent blood circulation half-lives of 14.4 h. As a result, tumor growth was completely inhibited and no damage was observed for normal organ tissues. This newly developed drug nanovehicle has great potential in cancer therapy, and the hierarchical design principle should provide valuable information for the development of the next generation of drug delivery systems.

  14. Hierarchical Approaches to the Analysis of Genetic Diversity in ...

    African Journals Online (AJOL)

    2015-04-14

    Apr 14, 2015 ... Keywords: Genetic diversity, Hierarchical approach, Plant, Clustering, Descriptive ... utilization) or by clustering (based on a phenetic analysis of individual ...... Improvement of Food Crop Preservatives for the next Millennium.

  15. Fractal Analysis Based on Hierarchical Scaling in Complex Systems

    CERN Document Server

    Chen, Yanguang

    2016-01-01

    A fractal is in essence a hierarchy with cascade structure, which can be described with a set of exponential functions. From these exponential functions, a set of power laws indicative of scaling can be derived. Hierarchical structure and spatial networks prove to be associated with one another. This paper is devoted to exploring the theory of fractal analysis of complex systems by means of hierarchical scaling. Two research methods are utilized in this study: logical analysis and empirical analysis. The main results are as follows. First, a fractal system such as the Cantor set is described from the hierarchical angle of view; based on hierarchical structure, three approaches are proposed to estimate fractal dimension. Second, the hierarchical scaling can be generalized to describe multifractals, fractal complementary sets, and self-similar curves such as the logarithmic spiral. Third, complex systems such as urban systems are demonstrated to be self-similar hierarchies. The human settlements i...
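The middle-thirds Cantor set mentioned above makes a compact numerical illustration of dimension estimation from hierarchical scaling: at hierarchy level m there are N_m = 2^m pieces, each of linear size r_m = 3^(-m), and the fractal (similarity) dimension is the slope of ln N against ln(1/r). A minimal sketch (the levels are generated analytically, no data from the paper is used):

```python
import numpy as np

# At hierarchy level m, the Cantor set consists of N_m = 2**m pieces,
# each of linear size r_m = 3**(-m).  The similarity dimension is the
# slope of ln(N_m) versus ln(1/r_m) across hierarchy levels.
levels = np.arange(1, 9)
N = 2.0 ** levels
r = 3.0 ** (-levels)

# least-squares slope of the log-log hierarchical scaling relation
slope, intercept = np.polyfit(np.log(1.0 / r), np.log(N), 1)
print(f"estimated dimension: {slope:.6f}")   # ln 2 / ln 3 ≈ 0.6309
```

Because the synthetic levels lie exactly on the power law, the regression recovers ln 2 / ln 3 to machine precision; with empirical hierarchies (e.g., city-size cascades) the same regression gives a noisy estimate.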

  16. Credit Scoring Problem Based on Regression Analysis

    OpenAIRE

    Khassawneh, Bashar Suhil Jad Allah

    2014-01-01

    ABSTRACT: This thesis provides an explanatory introduction to the regression models of data mining and contains basic definitions of key terms in the linear, multiple and logistic regression models. Meanwhile, the aim of this study is to illustrate fitting models for the credit scoring problem using simple linear, multiple linear and logistic regression models, and to analyze the fitted model functions with statistical tools. Keywords: Data mining, linear regression, logistic regression....
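As a rough sketch of the logistic-regression side of such a credit-scoring model, the following fits a logit by Newton-Raphson on synthetic applicant data; the features, coefficients and sample size are invented for illustration and are not from the thesis:

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson (IRLS); X includes an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)                       # IRLS weights
        H = X.T @ (X * W[:, None])              # Fisher information
        beta += np.linalg.solve(H, X.T @ (y - p))
    return beta

rng = np.random.default_rng(0)
n = 1000
# hypothetical applicant features: standardised income and debt ratio
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
true_beta = np.array([-0.5, 2.0, -1.5])         # invented: income helps, debt hurts
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta_hat = fit_logistic(X, y)
```

The recovered coefficients can then be read as log-odds contributions of each feature to the probability of a good credit outcome.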

  17. Remaining Phosphorus Estimate Through Multiple Regression Analysis

    Institute of Scientific and Technical Information of China (English)

    M. E. ALVES; A. LAVORENTI

    2006-01-01

    The remaining phosphorus (Prem), the P concentration that remains in solution after shaking soil with 0.01 mol L-1 CaCl2 containing 60 μg mL-1 P, is a very useful index for studies related to the chemistry of variable charge soils. Although the Prem determination is a simple procedure, the possibility of estimating accurate values of this index from easily and/or routinely determined soil properties can be very useful for practical purposes. The present research evaluated the Prem estimation through multiple regression analysis in which routinely determined soil chemical data, soil clay content and soil pH measured in 1 mol L-1 NaF (pHNaF) figured as Prem predictor variables. The Prem can be estimated with acceptable accuracy using the above-mentioned approach, and pHNaF not only substitutes for clay content as a predictor variable but also confers more accuracy to the Prem estimates.
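The estimation approach described (routine soil properties as Prem predictors) amounts to an ordinary multiple regression. A sketch with synthetic soil data; the data-generating coefficients below are invented for illustration and are not the paper's fitted equation:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# hypothetical predictors: clay content (%), pH in 1 mol/L NaF, organic C (g/kg)
clay = rng.uniform(5, 70, n)
pH_NaF = rng.uniform(7.5, 11.5, n)
org_C = rng.uniform(2, 40, n)

# assumed linear data-generating model for the Prem index, illustration only
Prem = 60 - 0.45 * clay - 2.0 * (pH_NaF - 7.5) + 0.1 * org_C + rng.normal(0, 1.5, n)

# multiple regression of Prem on the routinely determined properties
X = np.column_stack([np.ones(n), clay, pH_NaF, org_C])
coef, *_ = np.linalg.lstsq(X, Prem, rcond=None)

pred = X @ coef
r2 = 1 - np.sum((Prem - pred) ** 2) / np.sum((Prem - Prem.mean()) ** 2)
```

With real soil data the same fit would be inspected for residual structure before the prediction equation is adopted.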

  18. REGRESSION ANALYSIS OF PRODUCTIVITY USING MIXED EFFECT MODEL

    Directory of Open Access Journals (Sweden)

    Siana Halim

    2007-01-01

    Full Text Available Production plants of a company are located in several areas that spread across Middle and East Java. As the production process employs mostly manpower, we suspected that each location has different characteristics affecting the productivity. Thus, the production data may have a spatial and hierarchical structure. To fit a linear regression using ordinary techniques, we are required to make some assumptions about the nature of the residuals, i.e., that they are independent and identically normally distributed. However, these assumptions are rarely fulfilled, especially for data that have a spatial and hierarchical structure. We worked out the problem using a mixed effect model. This paper discusses the construction of a model of productivity and several characteristics in the production line, taking location as a random effect. A simple model of high utility that satisfies the necessary regression assumptions was built using the free statistical software R, version 2.6.1.

  19. Hierarchical manifold learning for regional image analysis.

    Science.gov (United States)

    Bhatia, Kanwal K; Rao, Anil; Price, Anthony N; Wolz, Robin; Hajnal, Joseph V; Rueckert, Daniel

    2014-02-01

    We present a novel method of hierarchical manifold learning which aims to automatically discover regional properties of image datasets. While traditional manifold learning methods have become widely used for dimensionality reduction in medical imaging, they suffer from only being able to consider whole images as single data points. We extend conventional techniques by additionally examining local variations, in order to produce spatially-varying manifold embeddings that characterize a given dataset. This involves constructing manifolds in a hierarchy of image patches of increasing granularity, while ensuring consistency between hierarchy levels. We demonstrate the utility of our method in two very different settings: 1) to learn the regional correlations in motion within a sequence of time-resolved MR images of the thoracic cavity; 2) to find discriminative regions of 3-D brain MR images associated with neurodegenerative disease.

  20. Common pitfalls in statistical analysis: Linear regression analysis.

    Science.gov (United States)

    Aggarwal, Rakesh; Ranganathan, Priya

    2017-01-01

    In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis.

  1. Common pitfalls in statistical analysis: Linear regression analysis

    Directory of Open Access Journals (Sweden)

    Rakesh Aggarwal

    2017-01-01

    Full Text Available In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis.

  2. Sliced Inverse Regression for Time Series Analysis

    Science.gov (United States)

    Chen, Li-Sue

    1995-11-01

    In this thesis, general nonlinear models for time series data are considered. A basic form is x_t = f(β_1'X_{t-1}, β_2'X_{t-1}, ..., β_k'X_{t-1}, ε_t), where x_t is an observed time series, X_{t-1} is the vector of the first d time lags, (x_{t-1}, x_{t-2}, ..., x_{t-d}), f is an unknown function, the β_i's are unknown vectors, and the ε_t's are independent and identically distributed. Special cases include AR and TAR models. We investigate the feasibility of applying SIR/PHD (Li 1990, 1991) (the sliced inverse regression and principal Hessian directions methods) in estimating the β_i's. PCA (principal component analysis) is brought in to check one critical condition for SIR/PHD. Through simulation and a study of three well-known data sets (Canadian lynx, the U.S. unemployment rate and sunspot numbers), we demonstrate how SIR/PHD can effectively retrieve the interesting low-dimensional structures for time series data.
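The SIR step itself (slice the sorted response, average the whitened predictors within each slice, and take the leading eigenvector of the between-slice covariance) is short enough to sketch in numpy. The single-index model and all numbers below are invented for illustration:

```python
import numpy as np

def sir_direction(X, y, n_slices=10):
    """Estimate the leading effective-dimension-reduction direction by SIR."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    Sigma_inv = np.linalg.inv(np.cov(Xc, rowvar=False))
    L = np.linalg.cholesky(Sigma_inv)      # whitening: Z = Xc @ L has identity covariance
    Z = Xc @ L
    # slice observations by the sorted response, accumulate slice-mean covariance
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    _, vecs = np.linalg.eigh(M)
    beta = L @ vecs[:, -1]                 # map the top eigenvector back to x-scale
    return beta / np.linalg.norm(beta)

rng = np.random.default_rng(1)
n, p = 3000, 5
X = rng.normal(size=(n, p))
b = np.array([1.0, 2.0, 0.0, 0.0, 0.0]); b /= np.linalg.norm(b)
y = (X @ b) ** 3 + 0.2 * rng.normal(size=n)   # nonlinear single-index model
b_hat = sir_direction(X, y)
```

The estimated direction is only identified up to sign, so agreement is judged by the absolute cosine between b_hat and the true index vector.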

  3. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    Science.gov (United States)

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

  5. Hierarchical Dependence in Meta-Analysis

    Science.gov (United States)

    Stevens, John R.; Taylor, Alan M.

    2009-01-01

    Meta-analysis is a frequent tool among education and behavioral researchers to combine results from multiple experiments to arrive at a clear understanding of some effect of interest. One of the traditional assumptions in a meta-analysis is the independence of the effect sizes from the studies under consideration. This article presents a…

  6. Fabrication and analysis of gecko-inspired hierarchical polymer nanosetae.

    Science.gov (United States)

    Ho, Audrey Yoke Yee; Yeo, Lip Pin; Lam, Yee Cheong; Rodríguez, Isabel

    2011-03-22

    A gecko's superb ability to adhere to surfaces is widely credited to the large attachment area of the hierarchical and fibrillar structure on its feet. The combination of these two features provides the necessary compliance for the gecko toe-pad to effectively engage a high percentage of the spatulae at each step on any kind of surface topography. With the use of a multi-tiered porous anodic alumina template and capillary-force-assisted nanoimprinting, we have successfully fabricated a gecko-inspired hierarchical topography of branched nanopillars on a stiff polymer. We also demonstrated that the hierarchical topography improved the shear adhesion force over a topography of linear structures by 150%. A systematic analysis to understand the phenomenon was performed. It was determined that the effective stiffness of the hierarchical branched structure was lower than that of the linear structure. The reduction in effective stiffness favored a more efficient bending of the branched topography and a better compliance to a test surface, hence resulting in a higher area of residual deformation. As the area of residual deformation increased, so did the shear adhesion force. The branched pillar topography also showed a marked increase in hydrophobicity, which is an essential property in the practical application of these structures for good self-cleaning in dry adhesion conditions.

  7. Stability Analysis for Regularized Least Squares Regression

    OpenAIRE

    Rudin, Cynthia

    2005-01-01

    We discuss stability for a class of learning algorithms with respect to noisy labels. The algorithms we consider are for regression, and they involve the minimization of regularized risk functionals, such as L(f) := 1/N sum_i (f(x_i)-y_i)^2+ lambda ||f||_H^2. We shall call the algorithm `stable' if, when y_i is a noisy version of f*(x_i) for some function f* in H, the output of the algorithm converges to f* as the regularization term and noise simultaneously vanish. We consider two flavors of...
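The stability notion described (the output converging to f* as the regularisation term and the label noise vanish together) is easy to observe numerically. This sketch minimises the stated risk functional over a small polynomial hypothesis space; the target f* and the noise/regularisation schedule are invented for illustration:

```python
import numpy as np

def ridge_poly(x, y, degree=3, lam=1e-3):
    """Minimise (1/N) sum_i (f(x_i) - y_i)^2 + lam * ||w||^2 over degree-3 polynomials."""
    A = np.vander(x, degree + 1)
    n = len(x)
    return np.linalg.solve(A.T @ A / n + lam * np.eye(degree + 1), A.T @ y / n)

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(-1, 1, size=n)
w_true = np.array([2.0, -1.0, 0.5, 0.3])     # invented target f* (cubic coefficients)
f_star = np.vander(x, 4) @ w_true

errors = []
for sigma, lam in [(0.5, 1e-1), (0.05, 1e-2), (0.005, 1e-4)]:
    y = f_star + sigma * rng.normal(size=n)   # y_i is a noisy version of f*(x_i)
    w_hat = ridge_poly(x, y, lam=lam)
    errors.append(np.linalg.norm(w_hat - w_true))
# errors shrink as the noise and the regularisation vanish together
```

Here the RKHS norm of the abstract is replaced by a plain Euclidean penalty on the polynomial coefficients, which is enough to exhibit the same convergence behaviour.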

  8. Performance Analysis of Hierarchical Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    K.Ranjini

    2011-07-01

    Full Text Available Clustering is the classification of objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait, often proximity according to some defined distance measure. Data clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. This paper explains the implementation of agglomerative and divisive clustering algorithms applied to various types of data. The details of the victims of the 2004 Tsunami in Thailand were taken as the test data. Visual programming is used for the implementation, and the running times of the algorithms using different linkages (agglomerative) on different types of data are taken for analysis.
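A toy version of the agglomerative procedure the paper benchmarks (repeatedly merge the closest pair of clusters under a chosen linkage) can be written in a few lines. This naive O(n^3) sketch handles 1-D data only and uses invented values rather than the Tsunami data set:

```python
import numpy as np

def agglomerative(points, n_clusters, linkage="single"):
    """Naive bottom-up clustering of 1-D data; for illustration, not efficiency."""
    pts = np.asarray(points, dtype=float)
    clusters = [[i] for i in range(len(pts))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = np.abs(pts[clusters[a]][:, None] - pts[clusters[b]][None, :])
                dist = d.min() if linkage == "single" else d.max()  # single vs complete
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))   # merge the closest pair
    return [sorted(c) for c in clusters]

data = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
print(agglomerative(data, 3))   # → [[0, 1, 2], [3, 4], [5]]
```

Swapping `d.min()` for `d.max()` switches between single and complete linkage, which is exactly the comparison dimension the paper times.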

  9. Hierarchical multivariate covariance analysis of metabolic connectivity.

    Science.gov (United States)

    Carbonell, Felix; Charil, Arnaud; Zijdenbos, Alex P; Evans, Alan C; Bedell, Barry J

    2014-12-01

    Conventional brain connectivity analysis is typically based on the assessment of interregional correlations. Given that correlation coefficients are derived from both covariance and variance, group differences in covariance may be obscured by differences in the variance terms. To facilitate a comprehensive assessment of connectivity, we propose a unified statistical framework that interrogates the individual terms of the correlation coefficient. We have evaluated the utility of this method for metabolic connectivity analysis using [18F]2-fluoro-2-deoxyglucose (FDG) positron emission tomography (PET) data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. As an illustrative example of the utility of this approach, we examined metabolic connectivity in angular gyrus and precuneus seed regions of mild cognitive impairment (MCI) subjects with low and high β-amyloid burdens. This new multivariate method allowed us to identify alterations in the metabolic connectome, which would not have been detected using classic seed-based correlation analysis. Ultimately, this novel approach should be extensible to brain network analysis and broadly applicable to other imaging modalities, such as functional magnetic resonance imaging (MRI).

  10. Analysis and Optimisation of Hierarchically Scheduled Multiprocessor Embedded Systems

    DEFF Research Database (Denmark)

    Pop, Traian; Pop, Paul; Eles, Petru;

    2008-01-01

    , they are organised in a hierarchy. In this paper, we first develop a holistic scheduling and schedulability analysis that determines the timing properties of a hierarchically scheduled system. Second, we address design problems that are characteristic to such hierarchically scheduled systems: assignment......We present an approach to the analysis and optimisation of heterogeneous multiprocessor embedded systems. The systems are heterogeneous not only in terms of hardware components, but also in terms of communication protocols and scheduling policies. When several scheduling policies share a resource...... of scheduling policies to tasks, mapping of tasks to hardware components, and the scheduling of the activities. We also present several algorithms for solving these problems. Our heuristics are able to find schedulable implementations under limited resources, achieving an efficient utilisation of the system...

  11. Hierarchical Cluster Analysis – Various Approaches to Data Preparation

    Directory of Open Access Journals (Sweden)

    Z. Pacáková

    2013-09-01

    Full Text Available The article deals with two approaches to data preparation for avoiding multicollinearity. The aim of the article is to find similarities among the e-communication levels of EU states using hierarchical cluster analysis. The original set of fourteen indicators was first reduced on the basis of correlation analysis; in cases of high correlation, the indicator with higher variability was retained for further analysis. Secondly, the data were transformed using principal component analysis, whose principal components are only weakly correlated. For further analysis, five principal components explaining about 92% of the variance were selected. Hierarchical cluster analysis was performed both on the reduced data set and on the principal component scores. Both times three clusters were assumed, following the Pseudo t-Squared and Pseudo F statistics, but the final clusters were not identical. An important characteristic for comparing the two results was the proportion of variance accounted for by the clusters, which was about ten percent higher for the principal component scores (57.8% compared to 47%). It can therefore be stated that when principal component scores with a high enough explained proportion (about 92% in our analysis) are used as input variables for cluster analysis, the loss of information is lower compared to data reduction on the basis of correlation analysis.
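The second data-preparation route described (replace correlated indicators with principal component scores before clustering) can be sketched with numpy's SVD. The indicator matrix below is synthetic, built to be collinear in the spirit of the fourteen e-communication indicators; the counts are invented:

```python
import numpy as np

rng = np.random.default_rng(7)
# synthetic stand-in for 14 collinear indicators measured on 27 states
raw = rng.normal(size=(27, 4)) @ rng.normal(size=(4, 14))   # rank-4 signal
raw += 0.01 * rng.normal(size=raw.shape)                     # small measurement noise

Z = (raw - raw.mean(axis=0)) / raw.std(axis=0)   # standardise before PCA
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / np.sum(s**2)                   # variance share per component
scores = Z @ Vt.T                                 # principal component scores

# keep the fewest components covering at least 92% of the variance
k = np.searchsorted(np.cumsum(explained), 0.92) + 1
```

The `scores` columns are mutually uncorrelated by construction, which is precisely what removes the multicollinearity before the clustering step.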

  12. Hierarchical models and the analysis of bird survey information

    Science.gov (United States)

    Sauer, J.R.; Link, W.A.

    2003-01-01

    Management of birds often requires analysis of collections of estimates. We describe a hierarchical modeling approach to the analysis of these data, in which parameters associated with the individual species estimates are treated as random variables, and probability statements are made about the species parameters conditioned on the data. A Markov-Chain Monte Carlo (MCMC) procedure is used to fit the hierarchical model. This approach is computer intensive, and is based upon simulation. MCMC allows for estimation both of parameters and of derived statistics. To illustrate the application of this method, we use the case in which we are interested in attributes of a collection of estimates of population change. Using data for 28 species of grassland-breeding birds from the North American Breeding Bird Survey, we estimate the number of species with increasing populations, provide precision-adjusted rankings of species trends, and describe a measure of population stability as the probability that the trend for a species is within a certain interval. Hierarchical models can be applied to a variety of bird survey applications, and we are investigating their use in estimation of population change from survey data.
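The paper fits its hierarchical model by MCMC; as a crude stand-in, a moment-based empirical Bayes version of the same normal-normal model shows the key behaviour, namely species-level trend estimates shrunk toward the overall mean in proportion to their survey error. All numbers below are synthetic, not Breeding Bird Survey data:

```python
import numpy as np

rng = np.random.default_rng(3)
n_species = 28
true_trend = rng.normal(-0.5, 1.0, n_species)        # hypothetical %/year trends
se = rng.uniform(0.2, 1.0, n_species)                # survey standard errors
est = true_trend + se * rng.normal(size=n_species)   # observed trend estimates

# normal-normal hierarchy: trend_i ~ N(mu, tau^2), est_i | trend_i ~ N(trend_i, se_i^2)
mu = np.average(est, weights=1.0 / se**2)
tau2 = max(np.var(est) - np.mean(se**2), 1e-6)       # method-of-moments between-species variance
shrink = tau2 / (tau2 + se**2)
post_mean = mu + shrink * (est - mu)                 # precision-adjusted trends
post_sd = np.sqrt(shrink * se**2)

# derived statistic, as in the paper's spirit: how many species are declining?
n_declining = int(np.sum(post_mean < 0))
```

Noisy estimates (large se) are shrunk hardest, which is what makes the resulting species rankings "precision-adjusted"; the full MCMC treatment additionally propagates uncertainty in mu and tau.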

  13. The importance of trait emotional intelligence and feelings in the prediction of perceived and biological stress in adolescents: hierarchical regressions and fsQCA models.

    Science.gov (United States)

    Villanueva, Lidón; Montoya-Castilla, Inmaculada; Prado-Gascó, Vicente

    2017-07-01

    The purpose of this study is to analyze the combined effects of trait emotional intelligence (EI) and feelings on healthy adolescents' stress. Identifying the extent to which adolescent stress varies with trait emotional differences and the feelings of adolescents is of considerable interest in the development of intervention programs for fostering youth well-being. To attain this goal, self-reported questionnaires (perceived stress, trait EI, and positive/negative feelings) and biological measures of stress (hair cortisol concentrations, HCC) were collected from 170 adolescents (12-14 years old). Two different methodologies were conducted, which included hierarchical regression models and a fuzzy-set qualitative comparative analysis (fsQCA). The results support trait EI as a protective factor against stress in healthy adolescents and suggest that feelings reinforce this relation. However, the debate continues regarding the possibility of optimal levels of trait EI for effective and adaptive emotional management, particularly in the emotional attention and clarity dimensions and for female adolescents.

  14. Least-Squares Linear Regression and Schrodinger's Cat: Perspectives on the Analysis of Regression Residuals.

    Science.gov (United States)

    Hecht, Jeffrey B.

    The analysis of regression residuals and detection of outliers are discussed, with emphasis on determining how deviant an individual data point must be to be considered an outlier and the impact that multiple suspected outlier data points have on the process of outlier determination and treatment. Only bivariate (one dependent and one independent)…

  15. Modeling place field activity with hierarchical slow feature analysis

    Directory of Open Access Journals (Sweden)

    Fabian eSchoenfeld

    2015-05-01

    Full Text Available In this paper we present six experimental studies from the literature on hippocampal place cells and replicate their main results in a computational framework based on the principle of slowness. Each of the chosen studies first allows rodents to develop stable place field activity and then examines a distinct property of the established spatial encoding, namely adaptation to cue relocation and removal; directional firing activity in the linear track and open field; and the results of morphing and stretching the overall environment. To replicate these studies we employ a hierarchical Slow Feature Analysis (SFA) network. SFA is an unsupervised learning algorithm extracting slowly varying information from a given stream of data, and hierarchical application of SFA allows high-dimensional input such as visual images to be processed efficiently and in a biologically plausible fashion. Training data for the network are produced in ratlab, a free basic graphics engine designed to quickly set up a wide range of 3D environments mimicking real-life experimental studies, simulate a foraging rodent while recording its visual input, and train and sample from a hierarchical SFA network.

  16. Regression Commonality Analysis: A Technique for Quantitative Theory Building

    Science.gov (United States)

    Nimon, Kim; Reio, Thomas G., Jr.

    2011-01-01

    When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…
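For two predictors, the commonality analysis the article advocates decomposes the full-model R² into two unique components and one common component, all computed from the R² values of the nested models. A sketch on synthetic data with deliberately correlated predictors (all numbers invented):

```python
import numpy as np

def r2(X, y):
    """R-squared of an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

rng = np.random.default_rng(5)
n = 400
shared = rng.normal(size=n)
x1 = shared + rng.normal(size=n)   # two predictors sharing a common component
x2 = shared + rng.normal(size=n)
y = x1 + x2 + rng.normal(size=n)

R_full = r2(np.column_stack([x1, x2]), y)
R1, R2 = r2(x1[:, None], y), r2(x2[:, None], y)

unique1 = R_full - R2              # variance x1 explains beyond x2
unique2 = R_full - R1              # variance x2 explains beyond x1
common = R1 + R2 - R_full          # variance the two predictors share
```

Unlike beta weights, the `unique`/`common` partition makes explicit how much explained variance the correlated predictors jointly account for; with more predictors the decomposition has 2^p - 1 commonality terms.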

  17. General Nature of Multicollinearity in Multiple Regression Analysis.

    Science.gov (United States)

    Liu, Richard

    1981-01-01

    Discusses multiple regression, a very popular statistical technique in the field of education. One of the basic assumptions in regression analysis requires that independent variables in the equation should not be highly correlated. The problem of multicollinearity and some of the solutions to it are discussed. (Author)
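The usual diagnostic for the multicollinearity problem discussed is the variance inflation factor, VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing predictor j on all the others. A minimal sketch on invented data with one nearly duplicated predictor:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (no intercept column in X)."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(X)), others])
        resid = X[:, j] - A @ np.linalg.lstsq(A, X[:, j], rcond=None)[0]
        r2_j = 1 - (resid @ resid) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        out.append(1.0 / (1.0 - r2_j))
    return np.array(out)

rng = np.random.default_rng(2)
n = 300
a = rng.normal(size=n)
# columns 0 and 1 are almost identical; column 2 is independent
X = np.column_stack([a, a + 0.05 * rng.normal(size=n), rng.normal(size=n)])
```

A common rule of thumb flags VIF above 5-10 as problematic; here the two near-duplicate columns blow up while the independent column stays near 1.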

  18. Hierarchical Visual Analysis and Steering Framework for Astrophysical Simulations

    Institute of Scientific and Technical Information of China (English)

    肖健; 张加万; 原野; 周鑫; 纪丽; 孙济洲

    2015-01-01

    A framework for accelerating modern long-running astrophysical simulations is presented, based on a hierarchical architecture in which computational steering of the high-resolution run is performed under the guidance of knowledge obtained from gradually refined ensemble analyses. Several visualization schemes for facilitating ensemble management, error analysis, parameter grouping and tuning are also integrated, owing to the pluggable modular design. The proposed approach is prototyped on the Flash code, and it can be extended by introducing user-defined visualizations for specific requirements. Two real-world simulations, i.e., a stellar wind and a supernova remnant, are carried out to verify the proposed approach.

  19. A general framework for the use of logistic regression models in meta-analysis.

    Science.gov (United States)

    Simmonds, Mark C; Higgins, Julian Pt

    2016-12-01

    Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy.

  20. Predicting Nigeria budget allocation using regression analysis: A ...

    African Journals Online (AJOL)

    Predicting Nigeria budget allocation using regression analysis: A data mining approach. ... Budget is used by the Government as a guiding tool for planning and management of its resources to aid in ...

  1. An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy

    DEFF Research Database (Denmark)

    Merlo, Juan; Wagner, Philippe; Ghith, Nermin

    2016-01-01

    BACKGROUND AND AIM: Many multilevel logistic regression analyses of "neighbourhood and health" focus on interpreting measures of associations (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that disting...

  2. 3D Regression Heat Map Analysis of Population Study Data.

    Science.gov (United States)

    Klemm, Paul; Lawonn, Kai; Glaßer, Sylvia; Niemann, Uli; Hegenscheid, Katrin; Völzke, Henry; Preim, Bernhard

    2016-01-01

    Epidemiological studies comprise heterogeneous data about a subject group to define disease-specific risk factors. These data contain information (features) about a subject's lifestyle, medical status as well as medical image data. Statistical regression analysis is used to evaluate these features and to identify feature combinations indicating a disease (the target feature). We propose an analysis approach of epidemiological data sets by incorporating all features in an exhaustive regression-based analysis. This approach combines all independent features w.r.t. a target feature. It provides a visualization that reveals insights into the data by highlighting relationships. The 3D Regression Heat Map, a novel 3D visual encoding, acts as an overview of the whole data set. It shows all combinations of two to three independent features with a specific target disease. Slicing through the 3D Regression Heat Map allows for the detailed analysis of the underlying relationships. Expert knowledge about disease-specific hypotheses can be included into the analysis by adjusting the regression model formulas. Furthermore, the influences of features can be assessed using a difference view comparing different calculation results. We applied our 3D Regression Heat Map method to a hepatic steatosis data set to reproduce results from a data mining-driven analysis. A qualitative analysis was conducted on a breast density data set. We were able to derive new hypotheses about relations between breast density and breast lesions with breast cancer. With the 3D Regression Heat Map, we present a visual overview of epidemiological data that allows for the first time an interactive regression-based analysis of large feature sets with respect to a disease.
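The exhaustive regression underlying such a heat map (fit a model for every 2- and 3-feature combination against the target and record a goodness-of-fit score per cell) can be sketched as follows; the feature names, data and target relation are invented:

```python
import itertools
import numpy as np

rng = np.random.default_rng(11)
n, p = 500, 6
X = rng.normal(size=(n, p))
names = [f"f{j}" for j in range(p)]                    # hypothetical study features
y = 2 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(size=n)   # target driven by f0 and f3

def fit_r2(cols):
    """R-squared of an OLS fit of y on the selected feature columns."""
    A = np.column_stack([np.ones(n), X[:, cols]])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# the "heat map" grid: one R^2 cell for every 2- and 3-feature combination
results = {combo: fit_r2(list(combo))
           for k in (2, 3)
           for combo in itertools.combinations(range(p), k)}
best = max(results, key=results.get)
```

Visualising `results` over the combination grid is the 2-D analogue of the paper's 3-D encoding; slicing it by one fixed feature mirrors slicing through the 3D Regression Heat Map.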

  3. Least Squares Adjustment: Linear and Nonlinear Weighted Regression Analysis

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg

    2007-01-01

    This note primarily describes the mathematics of least squares regression analysis as it is often used in geodesy, including land surveying and satellite positioning applications. In these fields regression is often termed adjustment. The note also contains a couple of typical land surveying and satellite positioning application examples. In these application areas we are typically interested in the parameters of the model, typically 2- or 3-D positions, and not in predictive modelling, which is often the main concern in other regression analysis applications. Adjustment is often used to obtain...
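The linear weighted least-squares "adjustment" described has the closed form x = (AᵀWA)⁻¹AᵀWb, with parameter covariance proportional to (AᵀWA)⁻¹. A sketch on an invented observation model standing in for a surveying-style problem (the design matrix and precisions are synthetic):

```python
import numpy as np

def wls(A, b, w):
    """Weighted least-squares adjustment: minimise (b - A x)' W (b - A x)."""
    W = np.diag(w)
    N = A.T @ W @ A                       # normal-equation matrix
    x = np.linalg.solve(N, A.T @ W @ b)   # adjusted parameters
    cov_x = np.linalg.inv(N)              # parameter covariance, up to the variance factor
    return x, cov_x

rng = np.random.default_rng(9)
truth = np.array([3.0, -2.0])                       # hypothetical 2-D position
A = rng.normal(size=(50, 2))                        # linearised observation equations
sigma = rng.uniform(0.01, 0.1, size=50)             # unequal observation precision
b = A @ truth + sigma * rng.normal(size=50)

x_hat, cov = wls(A, b, w=1.0 / sigma**2)
```

Weighting each observation by its inverse variance is what distinguishes adjustment from plain OLS: precise observations dominate the solution, and `cov` reports how well each parameter is determined.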

  4. Category theoretic analysis of hierarchical protein materials and social networks.

    Directory of Open Access Journals (Sweden)

    David I Spivak

    Full Text Available Materials in biology span all the scales from Angstroms to meters and typically consist of complex hierarchical assemblies of simple building blocks. Here we describe an application of category theory to describe structural and resulting functional properties of biological protein materials by developing so-called ologs. An olog is like a "concept web" or "semantic network" except that it follows a rigorous mathematical formulation based on category theory. This key difference ensures that an olog is unambiguous, highly adaptable to evolution and change, and suitable for sharing concepts with other olog. We consider simple cases of beta-helical and amyloid-like protein filaments subjected to axial extension and develop an olog representation of their structural and resulting mechanical properties. We also construct a representation of a social network in which people send text-messages to their nearest neighbors and act as a team to perform a task. We show that the olog for the protein and the olog for the social network feature identical category-theoretic representations, and we proceed to precisely explicate the analogy or isomorphism between them. The examples presented here demonstrate that the intrinsic nature of a complex system, which in particular includes a precise relationship between structure and function at different hierarchical levels, can be effectively represented by an olog. This, in turn, allows for comparative studies between disparate materials or fields of application, and results in novel approaches to derive functionality in the design of de novo hierarchical systems. We discuss opportunities and challenges associated with the description of complex biological materials by using ologs as a powerful tool for analysis and design in the context of materiomics, and we present the potential impact of this approach for engineering, life sciences, and medicine.

  5. Regional flood frequency analysis using spatial proximity and basin characteristics: Quantile regression vs. parameter regression technique

    Science.gov (United States)

    Ahn, Kuk-Hyun; Palmer, Richard

    2016-09-01

    Despite wide use of regression-based regional flood frequency analysis (RFFA) methods, the majority are based on either ordinary least squares (OLS) or generalized least squares (GLS). This paper proposes 'spatial proximity' based RFFA methods using the spatial lagged model (SLM) and spatial error model (SEM). The proposed methods are represented by two frameworks: the quantile regression technique (QRT) and the parameter regression technique (PRT). The QRT develops prediction equations for flood quantiles at average recurrence intervals (ARIs) of 2, 5, 10, 20, and 100 years, whereas the PRT provides predictions of the three parameters of the selected distribution. The proposed methods are tested using data incorporating 30 basin characteristics from 237 basins in the Northeastern United States. Results show that the generalized extreme value (GEV) distribution properly represents flood frequencies at the study gages. Also, basin area, stream network, and precipitation seasonality are found to be the most effective explanatory variables in prediction modeling by the QRT and PRT. 'Spatial proximity' based RFFA methods provide reliable flood quantile estimates compared to simpler methods. Compared to the QRT, the PRT may be recommended due to its accuracy and computational simplicity. The results presented in this paper may serve as one possible guidepost for hydrologists interested in flood analysis at ungaged sites.
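The quantile regression technique (QRT) described fits a prediction equation for each flood quantile from basin characteristics, typically as a power law estimated in log space. A one-predictor sketch on synthetic basins; the power-law relation, constants and noise level are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 237
area = rng.lognormal(mean=4.0, sigma=1.0, size=n)        # synthetic basin areas
# assumed relation Q100 = c * Area^b with multiplicative lognormal error
Q100 = 5.0 * area**0.8 * rng.lognormal(0.0, 0.2, size=n)

# QRT-style prediction equation, fitted by OLS in log space
A = np.column_stack([np.ones(n), np.log(area)])
coef, *_ = np.linalg.lstsq(A, np.log(Q100), rcond=None)
ln_c, b_hat = coef   # back-transform: Q100 ≈ exp(ln_c) * Area^b_hat
```

The paper's methods replace this OLS step with spatially structured models (SLM/SEM) and add further basin characteristics, but the log-space power-law form of the prediction equation is the same.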

  5. Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models

    Directory of Open Access Journals (Sweden)

    Omholt Stig W

    2011-06-01

    Full Text Available Abstract Background Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback.
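
    A minimal sketch of the local-regression idea behind HC-PLSR: cluster the input space, then fit a separate linear model per cluster. Plain k-means stands in for fuzzy C-means, OLS stands in for PLSR, and the data are synthetic, so this illustrates the principle rather than the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Nonlinear, non-monotone input-output map: a global linear fit must fail
x = rng.uniform(-3, 3, 400)
y = x ** 2 + rng.normal(0, 0.1, 400)

def kmeans_1d(x, k=2, iters=50):
    """Plain Lloyd k-means on a 1-D input (stand-in for fuzzy C-means)."""
    centers = np.quantile(x, np.linspace(0.1, 0.9, k))
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        centers = np.array([x[labels == j].mean() for j in range(k)])
    return labels

def ols_fit_predict(x, y):
    A = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ beta

# Global linear fit vs. per-cluster linear fits
global_mse = np.mean((y - ols_fit_predict(x, y)) ** 2)
labels = kmeans_1d(x, k=2)
local_pred = np.empty_like(y)
for j in (0, 1):
    m = labels == j
    local_pred[m] = ols_fit_predict(x[m], y[m])
local_mse = np.mean((y - local_pred) ** 2)
print(global_mse, local_mse)   # the local fits should be far more accurate
```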

  7. Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.

    Science.gov (United States)

    Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K

    2013-03-01

    Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.
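
    The alignment component of HACA can be illustrated with classic dynamic time warping, a simpler relative of the generalized dynamic time alignment kernel named above; the series below are synthetic motion-like signals, not motion capture data.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D series.

    A simplified stand-in for the generalized dynamic time alignment
    kernel used by HACA: both score series under elastic time shifts.
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

t = np.linspace(0, 2 * np.pi, 60)
motif = np.sin(t)
shifted = np.sin(t - 0.4)          # same motif, slightly time-shifted
other = np.cos(3 * t)              # a different movement pattern

print(dtw_distance(motif, motif))  # 0.0
print(dtw_distance(motif, shifted), dtw_distance(motif, other))
```

Segments whose pairwise alignment cost is low would land in the same cluster; HACA additionally optimises where the segment boundaries fall.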

  8. A Novel Divisive Hierarchical Clustering Algorithm for Geospatial Analysis

    Directory of Open Access Journals (Sweden)

    Shaoning Li

    2017-01-01

    Full Text Available In the fields of geographic information systems (GIS) and remote sensing (RS), the clustering algorithm has been widely used for image segmentation, pattern recognition, and cartographic generalization. Although clustering analysis plays a key role in geospatial modelling, traditional clustering methods are limited in computational complexity, noise resistance, and robustness. Furthermore, traditional methods are more focused on the adjacent spatial context, which makes it hard for the clustering methods to be applied to multi-density discrete objects. In this paper, a new method, cell-dividing hierarchical clustering (CDHC), is proposed based on convex hull retraction. The main steps are as follows. First, a convex hull structure is constructed to describe the global spatial context of geospatial objects. Then, the retracting structure of each borderline is established in sequence by setting the initial parameter. The objects are split into two clusters (i.e., “sub-clusters”) if the retracting structure intersects with the borderlines. Finally, clusters are repeatedly split and the initial parameter is updated until the termination condition is satisfied. The experimental results show that CDHC separates the multi-density objects from noise sufficiently and also reduces complexity compared to the traditional agglomerative hierarchical clustering algorithm.
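
    The first step of CDHC is building the convex hull of the objects. A minimal Andrew monotone-chain hull (a standard algorithm, not necessarily the paper's implementation) can sketch that step on a toy point set:

```python
def convex_hull(points):
    """Andrew's monotone-chain convex hull; returns hull vertices in order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:                      # build lower hull left-to-right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):            # build upper hull right-to-left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

# A square of border points plus interior points: only corners lie on the hull
cloud = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3), (3, 1)]
print(convex_hull(cloud))   # the four corners
```

CDHC would then retract this hull inward and split the point set where the retracting structure crosses the borderlines.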

  9. Research and analysis of physical health using multiple regression analysis

    Directory of Open Access Journals (Sweden)

    T. S. Kyi

    2014-01-01

    Full Text Available This paper presents research aimed at creating a mathematical model of "healthy people" using the method of regression analysis. The factors are the physical parameters of the person (such as heart rate, lung capacity, blood pressure, breath holding, weight-height coefficient, flexibility of the spine, muscles of the shoulder belt, abdominal muscles, squatting, etc.), and the response variable is an indicator of physical working capacity. Multiple regression analysis yielded useful models that can predict the physical working capacity of boys aged fourteen to seventeen. The paper develops the regression model for sixteen-year-old boys and analyzes the results.

  10. Regression Model Optimization for the Analysis of Experimental Data

    Science.gov (United States)

    Ulbrich, N.

    2009-01-01

    A candidate math model search algorithm was developed at Ames Research Center that determines a recommended math model for the multivariate regression analysis of experimental data. The search algorithm is applicable to classical regression analysis problems as well as wind tunnel strain gage balance calibration analysis applications. The algorithm compares the predictive capability of different regression models using the standard deviation of the PRESS residuals of the responses as a search metric. This search metric is minimized during the search. Singular value decomposition is used during the search to reject math models that lead to a singular solution of the regression analysis problem. Two threshold dependent constraints are also applied. The first constraint rejects math models with insignificant terms. The second constraint rejects math models with near-linear dependencies between terms. The math term hierarchy rule may also be applied as an optional constraint during or after the candidate math model search. The final term selection of the recommended math model depends on the regressor and response values of the data set, the user's function class combination choice, the user's constraint selections, and the result of the search metric minimization. A frequently used regression analysis example from the literature is used to illustrate the application of the search algorithm to experimental data.
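
    The search metric can be sketched as follows: PRESS residuals are obtainable in closed form from a single fit via e_i / (1 - h_ii), where h_ii are the hat-matrix diagonals, and their standard deviation is the quantity the search minimizes. The data below are synthetic; only numpy is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(0, 0.3, 30)

# Closed-form PRESS residuals: e_i / (1 - h_ii), h_ii = hat-matrix diagonal
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta
H = X @ np.linalg.inv(X.T @ X) @ X.T
press = e / (1.0 - np.diag(H))
search_metric = press.std(ddof=1)   # compared across candidate math models

# Cross-check: explicit leave-one-out refits give identical residuals
press_loo = np.empty_like(y)
for i in range(len(y)):
    keep = np.arange(len(y)) != i
    b_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    press_loo[i] = y[i] - X[i] @ b_i
print(np.allclose(press, press_loo))   # True
```

The candidate model with the smallest such metric is the one the algorithm recommends, subject to the singularity and threshold constraints described above.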

  11. Simulation Experiments in Practice : Statistical Design and Regression Analysis

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    2007-01-01

    In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. Statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic t…

  12. Evaluation Applications of Regression Analysis with Time-Series Data.

    Science.gov (United States)

    Veney, James E.

    1993-01-01

    The application of time series analysis is described, focusing on the use of regression analysis for analyzing time series in a way that may make it more readily available to an evaluation practice audience. Practical guidelines are suggested for decision makers in government, health, and social welfare agencies. (SLD)

  13. The Analysis of the Regression-Discontinuity Design in R

    Science.gov (United States)

    Thoemmes, Felix; Liao, Wang; Jin, Ze

    2017-01-01

    This article describes the analysis of regression-discontinuity designs (RDDs) using the R packages rdd, rdrobust, and rddtools. We discuss similarities and differences between these packages and provide directions on how to use them effectively. We use real data from the Carolina Abecedarian Project to show how an analysis of an RDD can be…

  14. Joint regression analysis and AMMI model applied to oat improvement

    Science.gov (United States)

    Oliveira, A.; Oliveira, T. A.; Mejza, S.

    2012-09-01

    In our work we present an application of some biometrical methods useful in genotype stability evaluation, namely the AMMI model, Joint Regression Analysis (JRA) and multiple comparison tests. A genotype stability analysis of oat (Avena sativa L.) grain yield was carried out using data from the Portuguese Plant Breeding Board on a sample of 22 genotypes grown in six locations during the years 2002, 2003 and 2004. In Ferreira et al. (2006) the authors state the relevance of the regression models and of the Additive Main Effects and Multiplicative Interactions (AMMI) model to study and to estimate phenotypic stability effects. As computational techniques we use the Zigzag algorithm to estimate the regression coefficients and the agricolae package available in R software for the AMMI model analysis.
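
    The AMMI decomposition itself is compact: additive main effects from the two-way means, plus a truncated SVD of the interaction residuals. The sketch below (numpy, not the agricolae package used in the paper) runs on a random, hypothetical genotype-by-environment yield matrix of the same shape as the trial (22 genotypes, 6 locations):

```python
import numpy as np

rng = np.random.default_rng(2)
G, E = 22, 6                       # genotypes x environments (as in the trial)
yield_ = rng.normal(5.0, 1.0, (G, E))

# Additive main effects: grand mean + genotype and environment deviations
grand = yield_.mean()
g_eff = yield_.mean(axis=1) - grand
e_eff = yield_.mean(axis=0) - grand
additive = grand + g_eff[:, None] + e_eff[None, :]

# Multiplicative interaction terms: SVD of the residual (GxE) matrix
resid = yield_ - additive
U, s, Vt = np.linalg.svd(resid, full_matrices=False)
k = 2                              # number of AMMI axes retained (AMMI2)
ammi = additive + (U[:, :k] * s[:k]) @ Vt[:k]

explained = s[:k] ** 2 / (s ** 2).sum()
print(explained.sum())             # share of interaction captured by AMMI2
```

Stability assessments then inspect each genotype's scores on the retained interaction axes: genotypes with scores near zero interact little with environments.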

  15. Hierarchical resource analysis for land use planning through remote sensing

    Science.gov (United States)

    Byrnes, B. H.; Frazee, C. J.; Cox, T. L.

    1976-01-01

    A hierarchical resource analysis was applied to remote sensing data to provide maps at Planning Levels I and III (Anderson et al., U.S. Geological Survey Circular 671) for Meade County, S. Dak. Level I land use and general soil maps were prepared by visual interpretation of imagery from a false color composite of Landsat MSS bands 4, 5, and 7 and single bands (5 and 7). A modified Level III land use map was prepared for the Black Hills area from RB-57 photography enlarged to a scale of 1:24,000. Level III land use data were used together with computer-generated interpretive soil maps to analyze relationships between developed and developing areas and soil criteria.

  16. Analysis of stability of community structure across multiple hierarchical levels

    CERN Document Server

    Li, Hui-Jia

    2015-01-01

    The analysis of stability of community structure is an important problem for scientists from many fields. Here, we propose a new framework to reveal hidden properties of community structure by quantitatively analyzing the dynamics of the Potts model. Specifically, we model the Potts procedure of community structure detection by a Markov process, which has a clear mathematical explanation. Critical topological information regarding the multivariate spin configuration can also be inferred from the spectral significance of the Markov process. We test our framework on some example networks and find that it does not suffer from the resolution limit problem. The results show that the proposed model is able to uncover hierarchical structure at different scales effectively and efficiently.

  17. Ratio Versus Regression Analysis: Some Empirical Evidence in Brazil

    Directory of Open Access Journals (Sweden)

    Newton Carneiro Affonso da Costa Jr.

    2004-06-01

    Full Text Available This work compares the traditional methodology for ratio analysis, applied to a sample of Brazilian firms, with the alternative of regression analysis on both cross-industry and intra-industry samples. The structural validity of the traditional methodology was tested through a model that represents its analogous regression format. The data are from 156 Brazilian public companies in nine industrial sectors for the year 1997. The results provide weak empirical support for the traditional ratio methodology, as the validity of this methodology was found to differ between ratios.

  18. Time series analysis using semiparametric regression on oil palm production

    Science.gov (United States)

    Yundari, Pasaribu, U. S.; Mukhaiyar, U.

    2016-04-01

    This paper presents a semiparametric kernel regression method that is flexible and computationally simple, especially in estimating density and regression functions. The kernel function is continuous and produces a smooth estimate. The classical kernel density estimator is constructed by completely nonparametric analysis and works reasonably well for all forms of function. Here, we discuss parameter estimation in time series analysis. First, we assume the parameters exist; we then combine this with nonparametric estimation, which gives the semiparametric approach. The optimum bandwidth is selected by approximately minimising the Mean Integrated Squared Error (MISE).
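
    A minimal sketch of the nonparametric component: a Nadaraya-Watson regression with a Gaussian kernel, with the bandwidth chosen by leave-one-out cross-validation as a practical stand-in for MISE minimisation. The series is synthetic, not oil palm production data.

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 2 * np.pi, 150))
y = np.sin(x) + rng.normal(0, 0.2, 150)

def nw_estimate(x0, x, y, h):
    """Nadaraya-Watson kernel regression with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
    return (w * y).sum(axis=1) / w.sum(axis=1)

def loo_score(x, y, h):
    """Leave-one-out CV error, a practical proxy for minimising MISE."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    np.fill_diagonal(w, 0.0)       # drop each point from its own estimate
    yhat = (w * y).sum(axis=1) / w.sum(axis=1)
    return np.mean((y - yhat) ** 2)

bandwidths = np.linspace(0.05, 1.5, 30)
h_opt = bandwidths[np.argmin([loo_score(x, y, h) for h in bandwidths])]
fit = nw_estimate(x, x, y, h_opt)
print(h_opt, np.mean((fit - np.sin(x)) ** 2))
```

Too small a bandwidth chases the noise; too large a bandwidth oversmooths the signal; the cross-validated choice balances the two.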

  19. Regressão múltipla stepwise e hierárquica em Psicologia Organizacional: aplicações, problemas e soluções Stepwise and hierarchical multiple regression in organizational psychology: Applications, problems and solutions

    Directory of Open Access Journals (Sweden)

    Gardênia Abbad

    2002-01-01

    Full Text Available This article discusses applications of stepwise and hierarchical multiple regression analyses to research in organizational psychology. Strategies for identifying Type I and II errors, and solutions to potential problems that may arise from such errors, are proposed. In addition, phenomena such as suppression, complementarity, and redundancy are reviewed. The article presents examples of research where these phenomena occurred, and the manner in which they were explained by researchers. Some applications of multiple regression analyses to studies involving between-variable interactions are presented, along with tests used to analyze the presence of linearity among variables. Finally, some suggestions are provided for dealing with limitations implicit in multiple regression analyses (stepwise and hierarchical).
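
    The core mechanics of hierarchical (blockwise) regression, entering predictor blocks in a theory-driven order and inspecting the R² increment, can be sketched with synthetic variables; the variable names and effect sizes below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
x1 = rng.normal(size=n)            # block 1: e.g. a control variable
x2 = rng.normal(size=n)            # block 2: the predictor of interest
y = 0.5 * x1 + 0.8 * x2 + rng.normal(0, 1.0, n)

def r_squared(X, y):
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

r2_step1 = r_squared(x1[:, None], y)                # block 1 only
r2_step2 = r_squared(np.column_stack([x1, x2]), y)  # blocks 1 + 2
delta_r2 = r2_step2 - r2_step1                      # increment tested at step 2
print(r2_step1, r2_step2, delta_r2)
```

An F test on delta_r2 decides whether the second block adds explanatory power beyond the first; suppression shows up when a predictor's contribution grows after another variable enters.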

  20. Analysis of Sting Balance Calibration Data Using Optimized Regression Models

    Science.gov (United States)

    Ulbrich, N.; Bader, Jon B.

    2010-01-01

    Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm's two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm's term selection process.

  1. Sparse Regression by Projection and Sparse Discriminant Analysis

    KAUST Repository

    Qi, Xin

    2015-04-03

    Recent years have seen active developments of various penalized regression methods, such as LASSO and elastic net, to analyze high-dimensional data. In these approaches, the direction and length of the regression coefficients are determined simultaneously. Due to the introduction of penalties, the length of the estimates can be far from being optimal for accurate predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths and the tuning parameters are determined by a cross-validation procedure to achieve the largest prediction accuracy. We provide a theoretical result for simultaneous model selection consistency and parameter estimation consistency of our method in high dimension. This new framework is then generalized such that it can be applied to principal components analysis, partial least squares, and canonical correlation analysis. We also adapt this framework for discriminant analysis. Compared with the existing methods, where there is relatively little control of the dependency among the sparse components, our method can control the relationships among the components. We present efficient algorithms and related theory for solving the sparse regression by projection problem. Based on extensive simulations and real data analysis, we demonstrate that our method achieves good predictive performance and variable selection in the regression setting, and the ability to control relationships between the sparse components leads to more accurate classification. In supplementary materials available online, the details of the algorithms and theoretical proofs, and R codes for all simulation studies are provided.

  2. Regression analysis for solving diagnosis problem of children's health

    Science.gov (United States)

    Cherkashina, Yu A.; Gerget, O. M.

    2016-04-01

    The paper includes results of scientific researches. These researches are devoted to the application of statistical techniques, namely, regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, parameters of blood tests, the gestational age, vascular-endothelial growth factor) measured at 3-5 days of children's life. In this paper a detailed description of the studied medical data is given. A binary logistic regression procedure is discussed in the paper. Basic results of the research are presented. A classification table of predicted values and factual observed values is shown, the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, the general regression equation is written based on them. Based on the results of logistic regression, ROC analysis was performed, sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow carrying out diagnostics of health of children providing a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have a high practical importance in the professional activity of the author.
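
    A hedged sketch of the pipeline described above: binary logistic regression fitted by Newton-Raphson (equivalently, IRLS) and an ROC AUC computed via the rank-sum identity. The data are synthetic, not the neonatal measurements used in the study.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_beta = np.array([-0.5, 2.0, -1.5])
p_true = 1 / (1 + np.exp(-X @ true_beta))
y = (rng.uniform(size=n) < p_true).astype(float)   # 1 = event, 0 = no event

# Binary logistic regression fitted by Newton-Raphson (IRLS)
beta = np.zeros(3)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    grad = X.T @ (y - p)                   # score vector
    hess = X.T @ (X * (p * (1 - p))[:, None])  # observed information
    beta += np.linalg.solve(hess, grad)

score = X @ beta                           # linear predictor per subject

def roc_auc(y, score):
    """AUC via the rank-sum (Mann-Whitney) identity (no ties assumed)."""
    order = np.argsort(score)
    ranks = np.empty(len(score))
    ranks[order] = np.arange(1, len(score) + 1)
    n1 = y.sum()
    n0 = len(y) - n1
    return (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n0 * n1)

print(beta, roc_auc(y, score))
```

Sensitivity and specificity at any chosen probability cutoff trace out the ROC curve that the AUC summarises.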

  3. Visual category recognition using Spectral Regression and Kernel Discriminant Analysis

    NARCIS (Netherlands)

    Tahir, M.A.; Kittler, J.; Mikolajczyk, K.; Yan, F.; van de Sande, K.E.A.; Gevers, T.

    2009-01-01

    Visual category recognition (VCR) is one of the most important tasks in image and video indexing. Spectral methods have recently emerged as a powerful tool for dimensionality reduction and manifold learning. Recently, Spectral Regression combined with Kernel Discriminant Analysis (SR-KDA) has been s…

  4. Controlling the Type I Error Rate in Stepwise Regression Analysis.

    Science.gov (United States)

    Pohlmann, John T.

    Three procedures used to control Type I error rate in stepwise regression analysis are forward selection, backward elimination, and true stepwise. In the forward selection method, a model of the dependent variable is formed by choosing the single best predictor; then the second predictor which makes the strongest contribution to the prediction of…

  5. Kinematic gait patterns in healthy runners: A hierarchical cluster analysis.

    Science.gov (United States)

    Phinyomark, Angkoon; Osis, Sean; Hettinga, Blayne A; Ferber, Reed

    2015-11-01

    Previous studies have demonstrated distinct clusters of gait patterns in both healthy and pathological groups, suggesting that different movement strategies may be represented. However, these studies have used discrete time point variables and usually focused on only one specific joint and plane of motion. Therefore, the first purpose of this study was to determine if running gait patterns for healthy subjects could be classified into homogeneous subgroups using three-dimensional kinematic data from the ankle, knee, and hip joints. The second purpose was to identify differences in joint kinematics between these groups. The third purpose was to investigate the practical implications of clustering healthy subjects by comparing these kinematics with runners experiencing patellofemoral pain (PFP). A principal component analysis (PCA) was used to reduce the dimensionality of the entire gait waveform data and then a hierarchical cluster analysis (HCA) determined group sets of similar gait patterns and homogeneous clusters. The results show two distinct running gait patterns were found with the main between-group differences occurring in frontal and sagittal plane knee angles (P < 0.05), suggesting distinct gait strategies. These results suggest care must be taken when selecting samples of subjects in order to investigate the pathomechanics of injured runners.
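
    The PCA-then-HCA pipeline can be sketched on synthetic waveforms standing in for kinematic gait curves: PCA scores from an SVD, followed by Ward agglomerative clustering cut at two clusters. scipy's hierarchy module is assumed, and the two "strategies" below are artificial.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(6)
t = np.linspace(0, 1, 101)          # one normalised gait cycle

# Two hypothetical kinematic strategies: 20 noisy waveforms per group
group_a = np.sin(2 * np.pi * t)[None, :] + rng.normal(0, 0.1, (20, 101))
group_b = (np.sin(2 * np.pi * t) + 0.8 * np.cos(4 * np.pi * t))[None, :] \
          + rng.normal(0, 0.1, (20, 101))
waves = np.vstack([group_a, group_b])

# PCA via SVD of the centred waveform matrix: 101 samples -> 3 scores
Xc = waves - waves.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:3].T

# Agglomerative (Ward) clustering on the PCA scores, cut at two clusters
labels = fcluster(linkage(scores, method="ward"), t=2, criterion="maxclust")
print(labels)
```

Each recovered cluster would then be inspected for where its member curves differ, analogous to the between-group knee-angle differences reported above.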

  6. The reflection of hierarchical cluster analysis of co-occurrence matrices in SPSS

    NARCIS (Netherlands)

    Zhou, Q.; Leng, F.; Leydesdorff, L.

    2015-01-01

    Purpose: To discuss the problems arising from hierarchical cluster analysis of co-occurrence matrices in SPSS, and the corresponding solutions. Design/methodology/approach: We design different methods of using the SPSS hierarchical clustering module for co-occurrence matrices in order to compare the

  8. MULTIPLE REGRESSION ANALYSIS OF MAIN ECONOMIC INDICATORS IN TOURISM

    Directory of Open Access Journals (Sweden)

    Erika KULCSÁR

    2009-12-01

    Full Text Available This paper analyses the relationship between GDP as the dependent variable in the hotels and restaurants sector and the following independent variables: overnight stays in establishments of touristic reception, arrivals in establishments of touristic reception, and investments in the hotels and restaurants sector, over the period 1995-2007. With multiple regression analysis I found that investments and tourist arrivals are significant predictors of the GDP dependent variable. Based on these results, I identified those components of the marketing mix which, in my opinion, require investment and could contribute to the positive development of tourist arrivals in establishments of touristic reception.

  10. Spatial Hierarchical Bayesian Analysis of the Historical Extreme Streamflow

    Science.gov (United States)

    Najafi, M. R.; Moradkhani, H.

    2012-04-01

    Analysis of the climate change impact on extreme hydro-climatic events is crucial for future hydrologic/hydraulic designs and water resources decision making. The purpose of this study is to investigate the changes of the extreme value distribution parameters with respect to time to reflect upon the impact of climate change. We develop a statistical model using the observed streamflow data of the Columbia River Basin in USA to estimate the changes of high flows as a function of time as well as other variables. Generalized Pareto Distribution (GPD) is used to model the upper 95% flows during December through March for 31 gauge stations. In the process layer of the model the covariates including time, latitude, longitude, elevation and basin area are considered to assess the sensitivity of the model to each variable. Markov Chain Monte Carlo (MCMC) method is used to estimate the parameters. The Spatial Hierarchical Bayesian technique models the GPD parameters spatially and borrows strength from other locations by pooling data together, while providing an explicit estimation of the uncertainties in all stages of modeling.
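
    The data layer of such a model rests on fitting a GPD to threshold excesses. A non-Bayesian sketch with synthetic flows follows (scipy assumed; a full spatial hierarchical MCMC over 31 stations is well beyond a snippet):

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(7)

# Hypothetical daily winter flows; model exceedances over the 95th percentile
flows = rng.gamma(shape=2.0, scale=50.0, size=8000)
u = np.quantile(flows, 0.95)                  # threshold for the upper 5%
excess = flows[flows > u] - u

# Fit the GPD to threshold excesses (location fixed at 0, as is standard)
xi, loc, sigma = genpareto.fit(excess, floc=0)

# T-observation return level: threshold + quantile of the fitted excesses
def return_level(T, rate=len(excess) / len(flows)):
    return u + genpareto.ppf(1 - 1 / (T * rate), xi, scale=sigma)

print(xi, sigma, return_level(10000))
```

In the hierarchical Bayesian version, xi and sigma would vary smoothly with the covariates (time, latitude, elevation, basin area), with MCMC pooling information across stations and propagating parameter uncertainty.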

  11. Principal regression analysis and the index leverage effect

    Science.gov (United States)

    Reigneron, Pierre-Alain; Allez, Romain; Bouchaud, Jean-Philippe

    2011-09-01

    We revisit the index leverage effect, which can be decomposed into a volatility effect and a correlation effect. We investigate the latter using a matrix regression analysis, which we call ‘Principal Regression Analysis’ (PRA and for which we provide some analytical (using Random Matrix Theory) and numerical benchmarks. We find that downward index trends increase the average correlation between stocks (as measured by the most negative eigenvalue of the conditional correlation matrix), and make the market mode more uniform. Upward trends, on the other hand, also increase the average correlation between stocks but rotate the corresponding market mode away from uniformity. There are two time scales associated with these effects, a short one on the order of a month (20 trading days), and a longer time scale on the order of a year. We also find indications of a leverage effect for sectorial correlations as well, which reveals itself in the second and third mode of the PRA.

  12. Poisson Regression Analysis of Illness and Injury Surveillance Data

    Energy Technology Data Exchange (ETDEWEB)

    Frome E.L., Watkins J.P., Ellis E.D.

    2012-12-12

    The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra-Poisson variation.
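
    A minimal numpy-only sketch of the core fit: a log-linear Poisson regression with person-time at risk entering as an offset, followed by a Pearson chi-square check for over-dispersion. Data and covariates are simulated, not DOE surveillance records.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 500
age = rng.uniform(-1, 1, n)                 # standardised age (illustrative)
female = rng.integers(0, 2, n).astype(float)
pt = rng.uniform(0.5, 2.0, n)               # person-time at risk
X = np.column_stack([np.ones(n), age, female])
true_beta = np.array([-2.0, 0.6, -0.3])
y = rng.poisson(pt * np.exp(X @ true_beta)) # observed event counts

# Log-linear Poisson regression with log person-time as an offset (IRLS)
offset = np.log(pt)
beta = np.zeros(3)
for _ in range(50):
    eta = X @ beta + offset
    m = np.exp(eta)                          # fitted event counts
    z = eta - offset + (y - m) / m           # working response
    W = m                                    # IRLS weights
    beta = np.linalg.solve(X.T @ (X * W[:, None]), X.T @ (W * z))

# Over-dispersion check: Pearson chi-square per degree of freedom
m = np.exp(X @ beta + offset)
pearson = ((y - m) ** 2 / m).sum() / (n - X.shape[1])
print(beta, pearson)                         # pearson near 1 if Poisson fits
```

A Pearson ratio well above 1 would signal the over-dispersion the abstract's score test is designed to detect, prompting a quasi-likelihood adjustment.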

  13. A regressed phase analysis for coupled joint systems.

    Science.gov (United States)

    Wininger, Michael

    2011-01-01

    This study aims to address shortcomings of the relative phase analysis, a widely used method for assessment of coupling among joints of the lower limb. Goniometric data from 15 individuals with spastic diplegic cerebral palsy were recorded from the hip and knee joints during ambulation on a flat surface, and from a single healthy individual with no known motor impairment, over at least 10 gait cycles. The minimum relative phase (MRP) revealed substantial disparity in the timing and severity of the instance of maximum coupling, depending on which reference frame was selected: MRP(knee-hip) differed from MRP(hip-knee) by 16.1±14% of gait cycle and 50.6±77% difference in scale. Additionally, several relative phase portraits contained discontinuities which may contribute to error in phase feature extraction. These vagaries can be attributed to the predication of relative phase analysis on a transformation into the velocity-position phase plane, and the extraction of phase angle by the discontinuous arc-tangent operator. Here, an alternative phase analysis is proposed, wherein kinematic data is transformed into a profile of joint coupling across the entire gait cycle. By comparing joint velocities directly via a standard linear regression in the velocity-velocity phase plane, this regressed phase analysis provides several key advantages over relative phase analysis including continuity, commutativity between reference frames, and generalizability to many-joint systems.
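
    The proposal reduces to sliding-window linear regressions in the velocity-velocity plane, yielding a continuous coupling profile rather than a single phase angle. The sketch below uses synthetic, tightly coupled joint trajectories; the window length and the factor-of-two coupling are illustrative choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(9)

# Hypothetical joint angles over one normalised gait cycle (degrees)
t = np.linspace(0, 1, 101)
hip = 30 * np.sin(2 * np.pi * t)
knee = 2 * hip + rng.normal(0, 0.05, 101)    # tightly coupled to the hip

# Regressed phase works on joint angular velocities
vhip = np.gradient(hip, t)
vknee = np.gradient(knee, t)

def regressed_phase(v_ref, v_target, win=15):
    """Sliding-window regression slope of one joint velocity on another,
    giving a continuous coupling profile across the whole gait cycle."""
    half = win // 2
    slopes = np.full(len(v_ref), np.nan)
    for i in range(half, len(v_ref) - half):
        a = v_ref[i - half:i + half + 1]
        b = v_target[i - half:i + half + 1]
        slopes[i] = np.polyfit(a, b, 1)[0]
    return slopes

profile = regressed_phase(vhip, vknee)
print(np.nanmedian(profile))   # ~2: the knee moves twice as fast as the hip
```

Because the profile is a regression slope rather than an arc-tangent of a phase difference, it stays continuous across the cycle, which is the discontinuity problem with relative phase that the study highlights.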

  14. Forecasting urban water demand: A meta-regression analysis.

    Science.gov (United States)

    Sebri, Maamar

    2016-12-01

    Water managers and planners require accurate water demand forecasts over the short-, medium- and long-term for many purposes. These range from assessing water supply needs over spatial and temporal patterns to optimizing future investments and planning future allocations across competing sectors. This study surveys the empirical literature on the urban water demand forecasting using the meta-analytical approach. Specifically, using more than 600 estimates, a meta-regression analysis is conducted to identify explanations of cross-studies variation in accuracy of urban water demand forecasting. Our study finds that accuracy depends significantly on study characteristics, including demand periodicity, modeling method, forecasting horizon, model specification and sample size. The meta-regression results remain robust to different estimators employed as well as to a series of sensitivity checks performed. The importance of these findings lies in the conclusions and implications drawn out for regulators and policymakers and for academics alike. Copyright © 2016. Published by Elsevier Ltd.

  15. Registration Cost Performance Analysis of a Hierarchical Mobile Internet Protocol Network

    Institute of Scientific and Technical Information of China (English)

    XU Kai; JI Hong; YUE Guang-xin

    2004-01-01

    On the basis of introducing principles for hierarchical mobile Internet protocol networks, the registration cost performance of this network model is analyzed in detail. The paper also establishes the functional relationship among the registration cost, the number of hierarchical levels, and the maximum number of handovers for gateway foreign agent regional registration. Finally, the registration cost of the hierarchical mobile Internet protocol network is compared with that of the traditional mobile Internet protocol. Theoretical analysis and computer simulation results show that both the number of hierarchical levels and the maximum number of handovers strongly affect the registration cost; when suitable values of these are chosen, the hierarchical network significantly improves registration performance compared with the traditional mobile IP.

  16. Epistasis analysis for quantitative traits by functional regression model.

    Science.gov (United States)

    Zhang, Futao; Boerwinkle, Eric; Xiong, Momiao

    2014-06-01

    The critical barrier in interaction analysis for rare variants is that most traditional statistical methods for testing interactions were originally designed for testing interactions between common variants and are difficult to apply to rare variants because of their prohibitive computational time and poor statistical power. The great challenges for successful detection of interactions with next-generation sequencing (NGS) data are (1) lack of methods for interaction analysis with rare variants, (2) severe multiple testing, and (3) time-consuming computations. To meet these challenges, we shift the paradigm of interaction analysis between two loci to interaction analysis between two sets of loci or genomic regions and collectively test interactions between all possible pairs of SNPs within two genomic regions. In other words, we take a genome region as the basic unit of interaction analysis and use high-dimensional data reduction and functional data analysis techniques to develop a novel functional regression model that collectively tests interactions between all possible pairs of single nucleotide polymorphisms (SNPs) within two genome regions. By intensive simulations, we demonstrate that the functional regression models for interaction analysis of a quantitative trait have the correct type I error rates and much greater power to detect interactions than current pairwise interaction analysis. The proposed method was applied to exome sequence data from the NHLBI's Exome Sequencing Project (ESP) and the CHARGE-S study. We discovered 27 pairs of genes showing significant interactions after applying the Bonferroni correction (P-values < 4.58 × 10⁻¹⁰) in the ESP, and 11 were replicated in the CHARGE-S study.
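    The region-level interaction test can be sketched roughly as follows, under our own simplifying assumptions: principal-component scores stand in for the paper's functional-data expansion, and an ordinary F-test on all pairwise products of scores stands in for the functional regression test. The data, dimensions, and function names are illustrative.

```python
import numpy as np

def pc_scores(G, k):
    """Top-k principal component scores of a centred genotype matrix."""
    Gc = G - G.mean(axis=0)
    _, _, Vt = np.linalg.svd(Gc, full_matrices=False)
    return Gc @ Vt[:k].T

def interaction_F(y, A, B, k=5):
    """F-statistic collectively testing all pairwise interactions between
    the PC scores of two genomic regions (full vs. main-effects model)."""
    n = len(y)
    pa, pb = pc_scores(A, k), pc_scores(B, k)
    inter = np.column_stack([pa[:, i] * pb[:, j]
                             for i in range(k) for j in range(k)])
    X_red = np.column_stack([np.ones(n), pa, pb])
    X_full = np.column_stack([X_red, inter])
    sse = lambda M: np.sum((y - M @ np.linalg.lstsq(M, y, rcond=None)[0]) ** 2)
    q, df = inter.shape[1], n - X_full.shape[1]
    return ((sse(X_red) - sse(X_full)) / q) / (sse(X_full) / df)

rng = np.random.default_rng(8)
n = 500
A = rng.binomial(2, 0.3, size=(n, 5)).astype(float)   # region 1 genotypes
B = rng.binomial(2, 0.3, size=(n, 5)).astype(float)   # region 2 genotypes
y = 0.5 * A[:, 0] + 0.5 * B[:, 0] + 1.0 * A[:, 0] * B[:, 0] + rng.normal(size=n)
F = interaction_F(y, A, B)
```

    With a genuine interaction between the two regions, F is large; under the null it would fluctuate around 1 with (k², n − 2k − k² − 1) degrees of freedom.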

  17. Residential behavioural energy savings : a meta-regression analysis

    Energy Technology Data Exchange (ETDEWEB)

    Tiedemann, K.H. [BC Hydro, Burnaby, BC (Canada)

    2009-07-01

    Increasing attention is being given to opportunities for residential energy behavioural savings, as developed countries attempt to reduce energy use and greenhouse gas emissions. Several utility companies have undertaken pilot programs aimed at understanding which interventions are most effective in reducing residential energy consumption through behavioural change. This paper presented the first meta-regression analysis of residential energy behavioural savings. This study focused on interventions which affected household energy-related behaviours and, as a result, affected household energy use. The paper described rational choice theory, the theory of planned behaviour, and the integration of rational choice theory and the adjusted expectancy values theory in a simple framework. The paper also discussed the review of various social, psychological and economics journals and databases. The results of the studies were presented. A basic concept in meta-regression analysis is the effect size, which is defined as the program effect divided by the standard error of the program effect. A lengthy review of the literature found twenty-eight treatments from ten experiments for which an effect size could be calculated. The experiments involved classifying treatments according to whether the interventions were information, goal setting, feedback, rewards, or combinations of these interventions. The impact of these alternative interventions on the effect size was then modelled using White's robust regression. Five regression models were compared on the basis of the Akaike information criterion. It was found that model 5, which used all of the regressors, was the preferred model. It was concluded that the theory of planned behaviour is more appropriate in the context of analysis of behavioural change and energy use. 21 refs., 4 tabs.
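    The robust regression mentioned above can be illustrated with a minimal numpy sketch: OLS coefficients with White (HC0) heteroskedasticity-robust standard errors, which is one standard form of White's estimator. The toy "feedback" data and function name are our own, not the paper's data set.

```python
import numpy as np

def ols_white_se(X, y):
    """OLS coefficients with White (HC0) heteroskedasticity-robust
    standard errors: (X'X)^-1 X' diag(e^2) X (X'X)^-1."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    meat = X.T @ (e[:, None] ** 2 * X)
    V = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(V))

# Toy meta-regression: effect sizes regressed on an intervention dummy,
# with error variance that differs between intervention groups.
rng = np.random.default_rng(11)
n = 200
feedback = rng.integers(0, 2, n).astype(float)   # 1 if feedback intervention
X = np.column_stack([np.ones(n), feedback])
y = 0.2 + 0.3 * feedback + rng.normal(scale=0.1 + 0.4 * feedback)
beta, se = ols_white_se(X, y)
```

    The robust standard error for the intervention coefficient correctly reflects the larger variance in the treated group, which ordinary homoskedastic standard errors would miss.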

  18. Hierarchical Scheduling Framework Based on Compositional Analysis Using Uppaal

    DEFF Research Database (Denmark)

    Boudjadar, Jalil; David, Alexandre; Kim, Jin Hyun

    2014-01-01

    This paper introduces a reconfigurable compositional scheduling framework, in which the hierarchical structure, the scheduling policies, the concrete task behavior and the shared resources can all be reconfigured. The behavior of each periodic preemptive task is given as a list of timed actions, ...

  19. Meta-regression Analysis of the Chinese Labor Reallocation Effect

    Institute of Scientific and Technical Information of China (English)

    Longhua; YUE; Shiyan; YANG; Rongtai; SHEN

    2013-01-01

    A meta-regression analysis was applied to 23 papers on the effect of Chinese labor reallocation on economic growth. The results showed that the choice between the World Bank (1996) and M. Syrquin (1986) methods had little impact on the results, while the way the stock of physical capital was calculated had a positive impact. Estimates based on panel data were larger than those obtained from time-series data, and the time span of a study had little influence on the results. It is therefore necessary to measure the stock of physical capital in China precisely in order to evaluate the Chinese labor reallocation effect.

  20. Multivariate study and regression analysis of gluten-free granola

    Directory of Open Access Journals (Sweden)

    Lilian Maria Pagamunici

    2014-03-01

    Full Text Available This study developed a gluten-free granola and evaluated it during storage with the application of multivariate and regression analysis of the sensory and instrumental parameters. The physicochemical, sensory, and nutritional characteristics of a product containing quinoa, amaranth and linseed were evaluated. The crude protein and lipid contents were 97.49 and 122.72 g kg-1 of food, respectively. The polyunsaturated/saturated and n-6:n-3 fatty acid ratios were 2.82 and 2.59:1, respectively. The granola had the best alpha-linolenic acid content, nutritional indices in the lipid fraction, and mineral content. There were good hygienic and sanitary conditions during storage, probably due to the low water activity of the formulation, which contributed to inhibiting microbial growth. The sensory attributes ranged from 'like very much' to 'like slightly', and the regression models were highly fitted and correlated during the storage period. A reduction in the sensory attribute levels and in the product's physical stabilisation was verified by principal component analysis. The use of the affective acceptance test and instrumental analysis combined with statistical methods allowed us to obtain promising results about the characteristics of gluten-free granola.

  1. Regression analysis application for designing the vibration dampers

    Directory of Open Access Journals (Sweden)

    A. V. Ivanov

    2014-01-01

    Full Text Available Multi-frequency vibration dampers protect air power lines and fiber optic communication channels against Aeolian vibrations. To achieve maximum efficiency, the natural frequencies of dampers should be evenly distributed over the entire operating frequency range from 3 to 150 Hz. A traditional approach to damper design is to investigate damper features using full-scale models. As a result, a conclusion on the damper capabilities is drawn, and design changes are made to achieve the required natural frequencies. The article describes a direct optimization method to design dampers. This method leads to a clear-cut definition of the geometrical and mass parameters of dampers from their natural frequencies. The direct design method is based on an active plan and design of experiment. Based on regression analysis, a regression model is obtained as a second-order polynomial to establish a unique relation between the input design parameters (element dimensions, the weights of cargos) and the output ones (natural frequencies). Different problems of designing dampers are considered using the developed regression models. As a result, it has been found that a satisfactory accuracy of the mathematical models, relating the input design parameters to the output ones, is achieved. Depending on the number of input parameters and the nature of the restrictions, the statement of the design problem, including an optimization one, can differ when restrictions on design parameters must meet conflicting requirements. The proposed optimization method for solving a direct design problem allows us to determine directly the damper element dimensions for any natural frequencies and, at the initial stage of the analysis, based on the methods of nonlinear programming, to identify problems with no solution. The developed approach can be successfully applied to design various mechanical systems with complicated nonlinear interactions between the input and output parameters.
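    A minimal sketch of the kind of second-order polynomial regression model described above, on invented data: the two inputs, the true response surface, and the function name are our own assumptions, not the article's damper model.

```python
import numpy as np

def quadratic_design(X):
    """Second-order polynomial design matrix: intercept, linear terms,
    and all squared and cross-product terms."""
    n, p = X.shape
    cols = [np.ones(n)] + [X[:, i] for i in range(p)]
    cols += [X[:, i] * X[:, j] for i in range(p) for j in range(i, p)]
    return np.column_stack(cols)

rng = np.random.default_rng(9)
X = rng.uniform(-1, 1, size=(80, 2))     # e.g. coded element dimension, mass
truth = 10 + 3 * X[:, 0] - 2 * X[:, 1] + 1.5 * X[:, 0] * X[:, 1] + X[:, 1] ** 2
y = truth + rng.normal(scale=0.05, size=80)

D = quadratic_design(X)
coef, *_ = np.linalg.lstsq(D, y, rcond=None)
pred = D @ coef
```

    Once such a model is fitted, inverting it (e.g. by nonlinear programming) gives the design parameters that produce a target natural frequency, which is the direct-design idea in the article.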

  2. Optimum short-time polynomial regression for signal analysis

    Indian Academy of Sciences (India)

    A SREENIVASA MURTHY; CHANDRA SEKHAR SEELAMANTULA; T V SREENIVAS

    2016-11-01

    We propose a short-time polynomial regression (STPR) for time-varying signal analysis. The advantage of using polynomials is that the notion of a spectrum is not needed and the signals can be analyzed in the time domain over short durations. In the presence of noise, such modeling becomes important, because the polynomial approximation performs smoothing leading to noise suppression. The problem of optimal smoothing depends on the duration over which a fixed-order polynomial regression is performed. Considering the STPR of a noisy signal, we derive the optimal smoothing window by minimizing the mean-square error (MSE). For a fixed polynomial order, the smoothing window duration depends on the rate of signal variation, which, in turn, depends on its derivatives. Since the derivatives are not available a priori, exact optimization is not feasible. However, approximate optimization can be achieved using only the variance expressions and the intersection-of-confidence-intervals (ICI) technique. The ICI technique is based on a consistency measure across confidence intervals corresponding to different window lengths. An approximate asymptotic analysis to determine the optimal confidence interval width shows that the asymptotic expressions are the same irrespective of whether one starts with a uniform sampling grid or a nonuniform one. Simulation results on sinusoids, chirps, and electrocardiogram (ECG) signals, and comparisons with standard wavelet denoising techniques, show that the proposed method is robust particularly in the low signal-to-noise ratio regime.
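    A bare-bones illustration of short-time polynomial regression as smoothing: fit a fixed-order polynomial over a sliding window and keep the fitted value at the window centre. The MSE-optimal window selection and the ICI technique from the paper are omitted; the signal, window length, and function name are our own.

```python
import numpy as np

def stpr_denoise(x, order, win):
    """Short-time polynomial regression: fit a polynomial of the given
    order on each sliding window and keep the fitted value at the centre."""
    half = win // 2
    n = len(x)
    y = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        t = np.arange(lo, hi)
        coef = np.polyfit(t, x[lo:hi], order)   # local least-squares fit
        y[i] = np.polyval(coef, i)
    return y

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 400)
clean = np.sin(2 * np.pi * 3 * t)
noisy = clean + rng.normal(scale=0.3, size=t.size)
smoothed = stpr_denoise(noisy, order=2, win=41)
```

    A longer window suppresses more noise but biases fast signal variations; that variance-bias trade-off is exactly what the paper's optimal window selection addresses.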

  3. Hierarchical Modelling of Flood Risk for Engineering Decision Analysis

    DEFF Research Database (Denmark)

    Custer, Rocco

    Societies around the world are faced with flood risk, prompting authorities and decision makers to manage risk to protect population and assets. With climate change, urbanisation and population growth, flood risk changes constantly, requiring flood risk management strategies that are flexible...... and robust. Traditional risk management solutions, e.g. dike construction, are not particularly flexible, as they are difficult to adapt to changing risk. Conversely, the recent concept of integrated flood risk management, entailing a combination of several structural and non-structural risk management...... measures, allows identifying flexible and robust flood risk management strategies. Based on it, this thesis investigates hierarchical flood protection systems, which encompass two, or more, hierarchically integrated flood protection structures on different spatial scales (e.g. dikes, local flood barriers...

  4. Selection of higher order regression models in the analysis of multi-factorial transcription data.

    Directory of Open Access Journals (Sweden)

    Olivia Prazeres da Costa

    Full Text Available INTRODUCTION: Many studies examine gene expression data that has been obtained under the influence of multiple factors, such as genetic background, environmental conditions, or exposure to diseases. The interplay of multiple factors may lead to effect modification and confounding. Higher order linear regression models can account for these effects. We present a new methodology for linear model selection and apply it to microarray data of bone marrow-derived macrophages. This experiment investigates the influence of three variable factors: the genetic background of the mice from which the macrophages were obtained, Yersinia enterocolitica infection (two strains and a mock control), and treatment or non-treatment with interferon-γ. RESULTS: We set up four different linear regression models in a hierarchical order. We introduce the eruption plot as a new practical tool for model selection complementary to global testing. It visually compares the size and significance of effect estimates between two nested models. Using this methodology we were able to select the most appropriate model by keeping only relevant factors showing additional explanatory power. Application to experimental data allowed us to qualify the interaction of factors as either neutral (no interaction), alleviating (co-occurring effects are weaker than expected from the single effects), or aggravating (stronger than expected). We find a biologically meaningful gene cluster of putative C2TA target genes that appear to be co-regulated with MHC class II genes. CONCLUSIONS: We introduced the eruption plot as a tool for visual model comparison to identify relevant higher order interactions in the analysis of expression data obtained under the influence of multiple factors. We conclude that model selection in higher order linear regression models should generally be performed for the analysis of multi-factorial microarray data.

  5. A COMPARISON BETWEEN SINGLE LINKAGE AND COMPLETE LINKAGE IN AGGLOMERATIVE HIERARCHICAL CLUSTER ANALYSIS FOR IDENTIFYING TOURISTS SEGMENTS

    OpenAIRE

    Noor Rashidah Rashid

    2012-01-01

    Cluster analysis is a multivariate method in statistics, and agglomerative hierarchical cluster analysis is one of its approaches. Two linkage methods in agglomerative hierarchical cluster analysis are single linkage and complete linkage. The purpose of this study is to compare single linkage and complete linkage in agglomerative hierarchical cluster analysis. The comparison of the performance of these linkage methods was shown using the Kruskal-Wallis tes...
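    The difference between the two linkage methods can be demonstrated with a naive agglomerative clusterer on toy 1-D data (our own construction; real analyses would use, e.g., scipy.cluster.hierarchy). Single linkage chains through evenly spaced points, while complete linkage keeps cluster diameters small:

```python
import numpy as np

def agglomerative(points, threshold, linkage="single"):
    """Naive agglomerative clustering of 1-D data: repeatedly merge the
    closest pair of clusters while their linkage distance <= threshold.
    Single linkage = minimum pairwise distance, complete = maximum."""
    clusters = [[i] for i in range(len(points))]
    d = np.abs(points[:, None] - points[None, :])
    agg = min if linkage == "single" else max
    while True:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dist = agg(d[i, j] for i in clusters[a] for j in clusters[b])
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        if best is None or best[0] > threshold:
            return [sorted(points[c].tolist()) for c in clusters]
        _, a, b = best
        clusters[a] += clusters.pop(b)

# Two chains of evenly spaced points (gap 0.9), cut at distance 1.0
x = np.array([0.0, 0.9, 1.8, 2.7, 6.0, 6.9, 7.8])
single = agglomerative(x, threshold=1.0, linkage="single")
complete = agglomerative(x, threshold=1.0, linkage="complete")
```

    Single linkage recovers the two chains as two clusters; complete linkage stops merging once a cluster's diameter would exceed the threshold and leaves four smaller clusters.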

  6. Polygraph Test Results Assessment by Regression Analysis Methods

    Directory of Open Access Journals (Sweden)

    K. A. Leontiev

    2014-01-01

    Full Text Available The paper considers the problem of defining the importance of questions asked of the examinee under judicial and psychophysiological polygraph examination by methods of mathematical statistics. It offers a classification algorithm based on logistic regression as an optimum Bayesian classifier, considering weight coefficients of information for the polygraph-recorded physiological parameters with no condition of independence of the measured signs. Binary classification is executed on the results of polygraph examination with preliminary normalization and standardization of primary results, with a check of the hypothesis that the distribution of the obtained data is normal, as well as with calculation of the coefficients of linear regression between input values and responses by the method of maximum likelihood. Further, the logistic curve divides the signs into two classes, "significant" and "insignificant". The efficiency of the model is estimated by means of ROC analysis (Receiver Operating Characteristic). It is shown that the necessary minimum sample has to contain the results of at least 45 measurements. This approach ensures a reliable result provided that an expert polygraphologist possesses sufficient qualification and follows testing techniques.
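    A minimal sketch of the core step, binary classification by logistic regression fitted by maximum likelihood: here via plain gradient ascent on simulated data. The data, learning rate, and function names are our own; the paper's normalization, normality checks, and ROC analysis are omitted.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Logistic regression fitted by gradient ascent on the log-likelihood;
    the gradient is X'(y - p)."""
    X = np.column_stack([np.ones(len(X)), X])        # add intercept
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w += lr * X.T @ (y - p) / len(y)
    return w

def predict(w, X):
    """Classify at the 0.5 probability threshold."""
    X = np.column_stack([np.ones(len(X)), X])
    return (1.0 / (1.0 + np.exp(-X @ w)) >= 0.5).astype(int)

rng = np.random.default_rng(2)
n = 400
x = np.column_stack([rng.normal(size=n), rng.normal(size=n)])
logit = 1.5 * x[:, 0] - 1.0 * x[:, 1]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

w = fit_logistic(x, y)
acc = np.mean(predict(w, x) == y)
```

    The fitted weights recover the signs of the true coefficients; the decision boundary classifies each "sign" as belonging to one of the two classes.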

  7. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Steed, Chad A [ORNL; SwanII, J. Edward [Mississippi State University (MSU); Fitzpatrick, Patrick J. [Mississippi State University (MSU); Jankun-Kelly, T.J. [Mississippi State University (MSU)

    2012-02-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive, parallel-coordinates-based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  8. Cardiorespiratory fitness and laboratory stress: a meta-regression analysis.

    Science.gov (United States)

    Jackson, Erica M; Dishman, Rod K

    2006-01-01

    We performed a meta-regression analysis of 73 studies that examined whether cardiorespiratory fitness mitigates cardiovascular responses during and after acute laboratory stress in humans. The cumulative evidence indicates that fitness is related to slightly greater reactivity, but better recovery. However, effects varied according to several study features and were smallest in the better controlled studies. Fitness did not mitigate integrated stress responses such as heart rate and blood pressure, which were the focus of most of the studies we reviewed. Nonetheless, potentially important areas, particularly hemodynamic and vascular responses, have been understudied. Women, racial/ethnic groups, and cardiovascular patients were underrepresented. Randomized controlled trials, including naturalistic studies of real-life responses, are needed to clarify whether a change in fitness alters putative stress mechanisms linked with cardiovascular health.

  9. Multivariate Regression Analysis of Gravitational Waves from Rotating Core Collapse

    CERN Document Server

    Engels, William J; Ott, Christian D

    2014-01-01

    We present a new multivariate regression model for analysis and parameter estimation of gravitational waves observed from well but not perfectly modeled sources such as core-collapse supernovae. Our approach is based on a principal component decomposition of simulated waveform catalogs. Instead of reconstructing waveforms by direct linear combination of physically meaningless principal components, we solve via least squares for the relationship that encodes the connection between chosen physical parameters and the principal component basis. Although our approach is linear, the waveforms' parameter dependence may be non-linear. For the case of gravitational waves from rotating core collapse, we show, using statistical hypothesis testing, that our method is capable of identifying the most important physical parameters that govern waveform morphology in the presence of simulated detector noise. We also demonstrate our method's ability to predict waveforms from a principal component basis given a set of physical ...
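    The two-step construction described above, a principal component decomposition of a waveform catalog followed by a least-squares map from physical parameters to PC coefficients, can be sketched on a toy catalog. The waveform family and parameter ranges below are invented for illustration only.

```python
import numpy as np

# Toy catalog: each row is a simulated waveform generated from two
# physical parameters (invented stand-ins for, e.g., rotation measures).
rng = np.random.default_rng(10)
t = np.linspace(0, 1, 256)
P = rng.uniform(0.5, 2.0, size=(40, 2))
waves = np.array([p0 * np.sin(2 * np.pi * 5 * t)
                  + p1 * np.exp(-((t - 0.5) ** 2) / 0.01) for p0, p1 in P])

# Principal component decomposition of the centred catalog
mean_w = waves.mean(axis=0)
_, _, Vt = np.linalg.svd(waves - mean_w, full_matrices=False)
k = 2
A = (waves - mean_w) @ Vt[:k].T          # PC coefficients of each waveform

# Least squares for the map from physical parameters to PC coefficients
X = np.column_stack([np.ones(len(P)), P])
G, *_ = np.linalg.lstsq(X, A, rcond=None)

# Predict the waveform for a new, unseen parameter set
p_new = np.array([1.0, 1.2])
w_pred = mean_w + (np.array([1.0, *p_new]) @ G) @ Vt[:k]
```

    Because the map is solved for parameters-to-coefficients rather than by direct linear combination of components, the individual principal components need carry no physical meaning themselves.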

  10. Spatial regression analysis of traffic crashes in Seoul.

    Science.gov (United States)

    Rhee, Kyoung-Ah; Kim, Joon-Ki; Lee, Young-ihn; Ulfarsson, Gudmundur F

    2016-06-01

    Traffic crashes can be spatially correlated events and the analysis of the distribution of traffic crash frequency requires evaluation of parameters that reflect spatial properties and correlation. Typically this spatial aspect of crash data is not used in everyday practice by planning agencies and this contributes to a gap between research and practice. A database of traffic crashes in Seoul, Korea, in 2010 was developed at the traffic analysis zone (TAZ) level with a number of GIS developed spatial variables. Practical spatial models using available software were estimated. The spatial error model was determined to be better than the spatial lag model and an ordinary least squares baseline regression. A geographically weighted regression model provided useful insights about localization of effects. The results found that an increased length of roads with speed limit below 30 km/h and a higher ratio of residents below age of 15 were correlated with lower traffic crash frequency, while a higher ratio of residents who moved to the TAZ, more vehicle-kilometers traveled, and a greater number of access points with speed limit difference between side roads and mainline above 30 km/h all increased the number of traffic crashes. This suggests, for example, that better control or design for merging lower speed roads with higher speed roads is important. A key result is that the length of bus-only center lanes had the largest effect on increasing traffic crashes. This is important as bus-only center lanes with bus stop islands have been increasingly used to improve transit times. Hence the potential negative safety impacts of such systems need to be studied further and mitigated through improved design of pedestrian access to center bus stop islands.

  11. Classification and regression tree analysis vs. multivariable linear and logistic regression methods as statistical tools for studying haemophilia.

    Science.gov (United States)

    Henrard, S; Speybroeck, N; Hermans, C

    2015-11-01

    Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis previously published in this field are explained didactically in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
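    A single step of a regression tree is just an exhaustive search for the split that minimises the within-node squared error. A minimal sketch on invented data (function name ours; real analyses would use a full recursive implementation such as rpart or scikit-learn):

```python
import numpy as np

def best_split(x, y):
    """One CART step for a continuous outcome: the split point on x that
    minimises the total within-node sum of squared errors."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_sse, best_cut = np.inf, None
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue
        left, right = ys[:i], ys[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse, best_cut = sse, (xs[i - 1] + xs[i]) / 2
    return best_cut

# Toy outcome with a true change point at x = 0.4
rng = np.random.default_rng(3)
x = rng.uniform(0, 1, 300)
y = np.where(x < 0.4, 1.0, 3.0) + rng.normal(scale=0.2, size=x.size)
split = best_split(x, y)
```

    A full tree applies this search recursively to each resulting subgroup (and, for classification trees, replaces squared error by an impurity measure such as the Gini index).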

  12. Online Statistical Modeling (Regression Analysis) for Independent Responses

    Science.gov (United States)

    Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus

    2017-06-01

    Regression analysis (statistical modelling) is among the statistical methods most frequently needed in analyzing quantitative data, especially to model the relationship between response and explanatory variables. Nowadays, statistical models have been developed in various directions to model various types of data and complex relationships. Rich varieties of advanced and recent statistical modelling are mostly available in open source software (one of them is R). However, these advanced statistical models are not very friendly to novice R users, since they are based on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and Shiny), so that the most recent and advanced statistical modelling is readily available, accessible and applicable on the web. We have previously made an interface in the form of an e-tutorial for several modern and advanced statistical models in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including modelling using computer-intensive statistics (bootstrap and Markov chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply and easier to compare in order to find the most appropriate model for the data.
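    One of the computer-intensive methods mentioned, the bootstrap, can be sketched for a simple regression model: resample cases, refit, and read a percentile confidence interval off the resampled slopes. The data and function name below are illustrative, not part of the described web interface.

```python
import numpy as np

def bootstrap_slope_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for an OLS slope:
    resample (x, y) pairs with replacement and refit each time."""
    rng = np.random.default_rng(seed)
    n = len(x)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)              # resample cases
        slopes[b] = np.polyfit(x[idx], y[idx], 1)[0]
    return np.quantile(slopes, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(4)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)
lo, hi = bootstrap_slope_ci(x, y)
```

    The same resample-and-refit loop applies unchanged to GLM, GAM, or GAMLSS fits, which is what makes the bootstrap attractive in a unified modelling interface.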

  13. Determinants of early cognitive development: hierarchical analysis of a longitudinal study.

    Science.gov (United States)

    Marques dos Santos, Letícia; Neves dos Santos, Darci; Bastos, Ana Cecília Sousa; Assis, Ana Marlúcia Oliveira; Prado, Matildes Silva; Barreto, Mauricio L

    2008-02-01

    The study describes the relationship between anthropometric status, socioeconomic conditions, and quality of home environment and child cognitive development in 320 children from 20 to 42 months of age, randomly selected from 20,000 households that represent the range of socioeconomic and environmental conditions in Salvador, Bahia, Northeast Brazil. The inclusion criterion was to be less than 42 months of age between January and July 1999. Child cognitive development was assessed using the Bayley Scales for Infant Development, and the Home Observation for Measurement of the Environment Inventory (HOME) was applied to assess quality of home environment. Anthropometric status was measured using the indicators weight/age and height/age ratios (z-scores), and socioeconomic data were collected through a standard questionnaire. Statistical analysis was conducted through univariate and hierarchical linear regression. Socioeconomic factors were found to have an indirect impact on early cognitive development mediated by the child's proximal environment factors, such as appropriate play materials and games available and school attendance. No independent association was seen between nutritional status and early cognitive development.

  14. An Effect Size for Regression Predictors in Meta-Analysis

    Science.gov (United States)

    Aloe, Ariel M.; Becker, Betsy Jane

    2012-01-01

    A new effect size representing the predictive power of an independent variable from a multiple regression model is presented. The index, denoted as r[subscript sp], is the semipartial correlation of the predictor with the outcome of interest. This effect size can be computed when multiple predictor variables are included in the regression model…
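    The semipartial correlation can be computed directly from raw data as the correlation of the outcome with the residualised focal predictor. A small numpy sketch on simulated data; this is our own direct-data construction, not the paper's method of recovering r_sp from reported regression statistics.

```python
import numpy as np

def semipartial_r(y, x_focal, X_other):
    """Semipartial correlation: correlate y with the part of the focal
    predictor not explained by the other predictors."""
    Z = np.column_stack([np.ones(len(y)), X_other])
    resid = x_focal - Z @ np.linalg.lstsq(Z, x_focal, rcond=None)[0]
    return np.corrcoef(y, resid)[0, 1]

rng = np.random.default_rng(5)
n = 1000
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)       # correlated predictors
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=n)
r_sp = semipartial_r(y, x1, x2[:, None])
```

    Unlike the raw correlation of y with x1, r_sp credits the focal predictor only with the variance it uniquely explains, which is what makes it comparable across regression models in a meta-analysis.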

  15. Hierarchical linear model: thinking outside the traditional repeated-measures analysis-of-variance box.

    Science.gov (United States)

    Lininger, Monica; Spybrook, Jessaca; Cheatham, Christopher C

    2015-04-01

    Longitudinal designs are common in the field of athletic training. For example, in the Journal of Athletic Training from 2005 through 2010, authors of 52 of the 218 original research articles used longitudinal designs. In 50 of the 52 studies, a repeated-measures analysis of variance was used to analyze the data. A possible alternative to this approach is the hierarchical linear model, which has been readily accepted in other medical fields. In this short report, we demonstrate the use of the hierarchical linear model for analyzing data from a longitudinal study in athletic training. We discuss the relevant hypotheses, model assumptions, analysis procedures, and output from the HLM 7.0 software. We also examine the advantages and disadvantages of using the hierarchical linear model with repeated measures and repeated-measures analysis of variance for longitudinal data.

  16. Hierarchical Linear Model: Thinking Outside the Traditional Repeated-Measures Analysis-of-Variance Box

    Science.gov (United States)

    Lininger, Monica; Spybrook, Jessaca; Cheatham, Christopher C.

    2015-01-01

    Longitudinal designs are common in the field of athletic training. For example, in the Journal of Athletic Training from 2005 through 2010, authors of 52 of the 218 original research articles used longitudinal designs. In 50 of the 52 studies, a repeated-measures analysis of variance was used to analyze the data. A possible alternative to this approach is the hierarchical linear model, which has been readily accepted in other medical fields. In this short report, we demonstrate the use of the hierarchical linear model for analyzing data from a longitudinal study in athletic training. We discuss the relevant hypotheses, model assumptions, analysis procedures, and output from the HLM 7.0 software. We also examine the advantages and disadvantages of using the hierarchical linear model with repeated measures and repeated-measures analysis of variance for longitudinal data. PMID:25875072

  17. An Analysis of Bank Service Satisfaction Based on Quantile Regression and Grey Relational Analysis

    Directory of Open Access Journals (Sweden)

    Wen-Tsao Pan

    2016-01-01

    Full Text Available Bank service satisfaction is vital to the success of a bank. In this paper, we propose to use the grey relational analysis to gauge the levels of service satisfaction of the banks. With the grey relational analysis, we compared the effects of different variables on service satisfaction. We gave ranks to the banks according to their levels of service satisfaction. We further used the quantile regression model to find the variables that affected the satisfaction of a customer at a specific quantile of satisfaction level. The result of the quantile regression analysis provided a bank manager with information to formulate policies to further promote satisfaction of the customers at different quantiles of satisfaction level. We also compared the prediction accuracies of the regression models at different quantiles. The experiment result showed that, among the seven quantile regression models, the median regression model has the best performance in terms of RMSE, RTIC, and CE performance measures.
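As an illustration of the grey relational analysis step described above, the following minimal Python sketch computes grey relational grades for hypothetical banks against an ideal reference profile. The data, the function name, and the distinguishing coefficient rho = 0.5 (the conventional default) are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def grey_relational_grades(X, ref, rho=0.5):
    """Grey relational grade of each comparison series (rows of X)
    against a reference series `ref`. Inputs are assumed already
    normalized to a common scale; rho is the distinguishing coefficient."""
    X = np.asarray(X, dtype=float)
    ref = np.asarray(ref, dtype=float)
    delta = np.abs(X - ref)                              # deviation sequences
    dmin, dmax = delta.min(), delta.max()                # global extremes
    coef = (dmin + rho * dmax) / (delta + rho * dmax)    # relational coefficients
    return coef.mean(axis=1)                             # grade = mean coefficient

# Hypothetical satisfaction scores for 3 banks on 4 service variables,
# compared against an ideal (reference) profile of all ones.
ref = np.ones(4)
X = np.array([[1.0, 1.0, 1.0, 1.0],    # identical to the reference
              [0.9, 0.8, 0.95, 0.85],
              [0.5, 0.4, 0.6, 0.3]])
grades = grey_relational_grades(X, ref)
```

A series identical to the reference attains the maximum grade of 1; series further from the reference receive lower grades, which yields the kind of ranking the paper describes.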

  18. Regression Analysis of Restricted Mean Survival Time Based on Pseudo-Observations

    DEFF Research Database (Denmark)

    Andersen, Per Kragh; Hansen, Mette Gerster; Klein, John P.

    2004-01-01

censoring; hazard function; health economics; mean survival time; pseudo-observations; regression model; restricted mean survival time; survival analysis
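The pseudo-observation approach this record refers to can be sketched in a few lines: compute the restricted mean survival time (RMST) as the area under the Kaplan-Meier curve up to a horizon tau, then form leave-one-out jackknife pseudo-observations, which can be used as responses in an ordinary regression. This is an illustrative sketch, not the authors' code; the function names are ours:

```python
import numpy as np

def km_rmst(time, event, tau):
    """Restricted mean survival time: area under the Kaplan-Meier
    curve from 0 to tau. `event` is 1 for an observed event, 0 for
    censoring."""
    order = np.argsort(time)
    t, d = np.asarray(time)[order], np.asarray(event)[order]
    n = len(t)
    surv, rmst, prev_t, at_risk = 1.0, 0.0, 0.0, n
    for i in range(n):
        if t[i] > tau:
            break
        rmst += surv * (t[i] - prev_t)       # area up to this time point
        if d[i]:
            surv *= 1.0 - 1.0 / at_risk      # KM step at an event time
        at_risk -= 1
        prev_t = t[i]
    if prev_t < tau:
        rmst += surv * (tau - prev_t)        # remaining area up to tau
    return rmst

def rmst_pseudo_obs(time, event, tau):
    """Leave-one-out (jackknife) pseudo-observations for the RMST."""
    n = len(time)
    full = km_rmst(time, event, tau)
    return np.array([
        n * full - (n - 1) * km_rmst(np.delete(time, i), np.delete(event, i), tau)
        for i in range(n)
    ])
```

A convenient sanity check: with complete (uncensored) data, the pseudo-observations reduce exactly to min(T_i, tau), since the KM estimate of the RMST is then a sample mean.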

  19. Regression analysis of restricted mean survival time based on pseudo-observations

    DEFF Research Database (Denmark)

    Andersen, Per Kragh; Hansen, Mette Gerster; Klein, John P.

censoring; hazard function; health economics; regression model; survival analysis; mean survival time; restricted mean survival time; pseudo-observations

  20. Unsupervised Transient Light Curve Analysis Via Hierarchical Bayesian Inference

    CERN Document Server

    Sanders, Nathan; Soderberg, Alicia

    2014-01-01

    Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometr...

  1. UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE

    Energy Technology Data Exchange (ETDEWEB)

    Sanders, N. E.; Soderberg, A. M. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Betancourt, M., E-mail: nsanders@cfa.harvard.edu [Department of Statistics, University of Warwick, Coventry CV4 7AL (United Kingdom)

    2015-02-10

    Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.

  2. Genomic analysis of the hierarchical structure of regulatory networks

    Science.gov (United States)

    Yu, Haiyuan; Gerstein, Mark

    2006-01-01

    A fundamental question in biology is how the cell uses transcription factors (TFs) to coordinate the expression of thousands of genes in response to various stimuli. The relationships between TFs and their target genes can be modeled in terms of directed regulatory networks. These relationships, in turn, can be readily compared with commonplace “chain-of-command” structures in social networks, which have characteristic hierarchical layouts. Here, we develop algorithms for identifying generalized hierarchies (allowing for various loop structures) and use these approaches to illuminate extensive pyramid-shaped hierarchical structures existing in the regulatory networks of representative prokaryotes (Escherichia coli) and eukaryotes (Saccharomyces cerevisiae), with most TFs at the bottom levels and only a few master TFs on top. These masters are situated near the center of the protein–protein interaction network, a different type of network from the regulatory one, and they receive most of the input for the whole regulatory hierarchy through protein interactions. Moreover, they have maximal influence over other genes, in terms of affecting expression-level changes. Surprisingly, however, TFs at the bottom of the regulatory hierarchy are more essential to the viability of the cell. Finally, one might think master TFs achieve their wide influence through directly regulating many targets, but TFs with most direct targets are in the middle of the hierarchy. We find, in fact, that these midlevel TFs are “control bottlenecks” in the hierarchy, and this great degree of control for “middle managers” has parallels in efficient social structures in various corporate and governmental settings. PMID:17003135

  3. Regression and kriging analysis for grid power factor estimation

    Directory of Open Access Journals (Sweden)

    Rajesh Guntaka

    2014-12-01

Full Text Available The measurement of power factor (PF) in electrical utility grids is a mainstay of load balancing and is also a critical element of transmission and distribution efficiency. The measurement of PF dates back to the earliest periods of electrical power distribution to public grids. In the wide-area distribution grid, measurement of current waveforms is trivial and may be accomplished at any point in the grid using a current tap transformer. However, voltage measurement requires a reference to ground and so is more problematic, and measurements are normally constrained to points that have ready and easy access to a ground source. We present two mathematical analysis methods, based on kriging and linear least square estimation (LLSE; regression), to derive PF at nodes with unknown voltages that are within a perimeter of sample nodes with ground reference across a selected power grid. Our results indicate an average error of 1.884%, which is within acceptable tolerances for PF measurements used in load balancing tasks.
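The regression (LLSE) half of this approach can be sketched as an ordinary least squares fit of PF against node coordinates; the numbers below are invented, and a real grid's PF field is not literally linear (the paper's kriging component additionally models spatially correlated residuals):

```python
import numpy as np

# Illustrative LLSE sketch: estimate power factor (PF) at unmetered nodes
# from nodes that have a ground reference, assuming a linear spatial trend.
rng = np.random.default_rng(0)
known_xy = rng.uniform(0.0, 10.0, size=(20, 2))   # sampled node coordinates

def true_pf(xy):
    # Hypothetical PF field for the illustration (noiseless).
    return 0.95 - 0.004 * xy[:, 0] + 0.002 * xy[:, 1]

pf_known = true_pf(known_xy)

# Least squares fit of PF ~ 1 + x + y via the design matrix [1, x, y].
A = np.column_stack([np.ones(len(known_xy)), known_xy])
coef, *_ = np.linalg.lstsq(A, pf_known, rcond=None)

# Estimate PF at two unmetered query nodes.
query_xy = np.array([[5.0, 5.0], [2.0, 8.0]])
pf_est = np.column_stack([np.ones(len(query_xy)), query_xy]) @ coef
```

Because the synthetic field is exactly linear and noiseless, the fit recovers it exactly; with real measurements the residual error is what the paper's 1.884% figure summarizes.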

  4. A simplified procedure of linear regression in a preliminary analysis

    Directory of Open Access Journals (Sweden)

    Silvia Facchinetti

    2013-05-01

Full Text Available The analysis of a large statistical data set can begin with the study of a particularly interesting variable Y (the regressand) and an explicative variable X, chosen among the remaining variables and jointly observed. The study gives a simplified procedure to obtain the functional link y = y(x) by partitioning the data set into m subsets, in which the observations are synthesized by location indices (mean or median) of X and Y. Polynomial models for y(x) of order r are considered to verify the characteristics of the given procedure; in particular, we assume r = 1 and 2. The distributions of the parameter estimators are obtained by simulation when the fitting is done for m = r + 1. Comparisons of the results, in terms of distribution and efficiency, are made with those obtained by the ordinary least squares method. The study also gives some considerations on the consistency of the parameters estimated by the given procedure.
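For r = 1 and m = r + 1 = 2, the procedure amounts to splitting the data (ordered by x) into two subsets, summarizing each by its mean point, and taking the line through the two mean points. A minimal sketch under those assumptions (the function name is ours; the paper also allows medians as location indices):

```python
import numpy as np

def group_mean_line(x, y):
    """Simplified fit for r = 1: split the data sorted by x into m = 2
    subsets, summarize each by its mean point, and return the line
    through the two mean points as (slope, intercept)."""
    order = np.argsort(x)
    x, y = np.asarray(x, float)[order], np.asarray(y, float)[order]
    half = len(x) // 2
    x1, y1 = x[:half].mean(), y[:half].mean()
    x2, y2 = x[half:].mean(), y[half:].mean()
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

# On exactly linear data the simplified fit and OLS coincide.
x = np.linspace(0.0, 10.0, 20)
y = 3.0 * x + 2.0
slope, intercept = group_mean_line(x, y)
ols_slope, ols_intercept = np.polyfit(x, y, 1)
```

With noisy data the two estimators generally differ, which is exactly the efficiency comparison the study carries out by simulation.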

  5. Fast nonlinear regression method for CT brain perfusion analysis.

    Science.gov (United States)

    Bennink, Edwin; Oosterbroek, Jaap; Kudo, Kohsuke; Viergever, Max A; Velthuis, Birgitta K; de Jong, Hugo W A M

    2016-04-01

    Although computed tomography (CT) perfusion (CTP) imaging enables rapid diagnosis and prognosis of ischemic stroke, current CTP analysis methods have several shortcomings. We propose a fast nonlinear regression method with a box-shaped model (boxNLR) that has important advantages over the current state-of-the-art method, block-circulant singular value decomposition (bSVD). These advantages include improved robustness to attenuation curve truncation, extensibility, and unified estimation of perfusion parameters. The method is compared with bSVD and with a commercial SVD-based method. The three methods were quantitatively evaluated by means of a digital perfusion phantom, described by Kudo et al. and qualitatively with the aid of 50 clinical CTP scans. All three methods yielded high Pearson correlation coefficients ([Formula: see text]) with the ground truth in the phantom. The boxNLR perfusion maps of the clinical scans showed higher correlation with bSVD than the perfusion maps from the commercial method. Furthermore, it was shown that boxNLR estimates are robust to noise, truncation, and tracer delay. The proposed method provides a fast and reliable way of estimating perfusion parameters from CTP scans. This suggests it could be a viable alternative to current commercial and academic methods.

  6. Influencing Academic Library Use in Tanzania: A Multiple Regression Analysis

    Directory of Open Access Journals (Sweden)

    Leocardia L Juventus

    2016-12-01

Full Text Available Library use is influenced by many factors. This study uses a multiple regression analysis to ascertain the connection between the level of library use and several of these factors, based on questionnaire responses from 158 undergraduate students who use academic libraries at two Tanzanian universities: Muhimbili University of Health and Allied Sciences (MUHAS) and Hubert Kairuki Memorial University (HKMU). It was found that users of academic libraries in Tanzania are influenced by the need to search for and access online materials, check for new books and other resources, check out books and other materials, and enjoy a friendly environment for study. However, their library use is influenced neither by the free wireless network nor by consultation with librarians. It is argued that academic libraries need to devise and implement plans that can make them better learning environments and platforms to drive socio-economic development, particularly in developing nations such as Tanzania. It is further argued that this can be enhanced through investment in modern academic library infrastructure.

  7. A Novel Multiobjective Evolutionary Algorithm Based on Regression Analysis

    Directory of Open Access Journals (Sweden)

    Zhiming Song

    2015-01-01

Full Text Available As is known, the Pareto set of a continuous multiobjective optimization problem with m objective functions is a piecewise continuous (m - 1)-dimensional manifold in the decision space under some mild conditions. However, how to utilize this regularity to design multiobjective optimization algorithms has become a research focus. In this paper, based on this regularity, a model-based multiobjective evolutionary algorithm with regression analysis (MMEA-RA) is put forward to solve continuous multiobjective optimization problems with variable linkages. In the algorithm, the optimization problem is modelled as a promising area in the decision space by a probability distribution, and the centroid of the probability distribution is an (m - 1)-dimensional piecewise continuous manifold. The least squares method is used to construct such a model. A selection strategy based on nondominated sorting is used to choose the individuals for the next generation. The new algorithm is tested and compared with NSGA-II and RM-MEDA. The result shows that MMEA-RA outperforms RM-MEDA and NSGA-II on the test instances with variable linkages. At the same time, MMEA-RA has higher efficiency than the other two algorithms. A few shortcomings of MMEA-RA have also been identified and discussed in this paper.

  8. A novel multiobjective evolutionary algorithm based on regression analysis.

    Science.gov (United States)

    Song, Zhiming; Wang, Maocai; Dai, Guangming; Vasile, Massimiliano

    2015-01-01

    As is known, the Pareto set of a continuous multiobjective optimization problem with m objective functions is a piecewise continuous (m - 1)-dimensional manifold in the decision space under some mild conditions. However, how to utilize the regularity to design multiobjective optimization algorithms has become the research focus. In this paper, based on this regularity, a model-based multiobjective evolutionary algorithm with regression analysis (MMEA-RA) is put forward to solve continuous multiobjective optimization problems with variable linkages. In the algorithm, the optimization problem is modelled as a promising area in the decision space by a probability distribution, and the centroid of the probability distribution is (m - 1)-dimensional piecewise continuous manifold. The least squares method is used to construct such a model. A selection strategy based on the nondominated sorting is used to choose the individuals to the next generation. The new algorithm is tested and compared with NSGA-II and RM-MEDA. The result shows that MMEA-RA outperforms RM-MEDA and NSGA-II on the test instances with variable linkages. At the same time, MMEA-RA has higher efficiency than the other two algorithms. A few shortcomings of MMEA-RA have also been identified and discussed in this paper.

  9. Meta-Analysis in Higher Education: An Illustrative Example Using Hierarchical Linear Modeling

    Science.gov (United States)

    Denson, Nida; Seltzer, Michael H.

    2011-01-01

    The purpose of this article is to provide higher education researchers with an illustrative example of meta-analysis utilizing hierarchical linear modeling (HLM). This article demonstrates the step-by-step process of meta-analysis using a recently-published study examining the effects of curricular and co-curricular diversity activities on racial…

  11. Augmenting Visual Analysis in Single-Case Research with Hierarchical Linear Modeling

    Science.gov (United States)

    Davis, Dawn H.; Gagne, Phill; Fredrick, Laura D.; Alberto, Paul A.; Waugh, Rebecca E.; Haardorfer, Regine

    2013-01-01

    The purpose of this article is to demonstrate how hierarchical linear modeling (HLM) can be used to enhance visual analysis of single-case research (SCR) designs. First, the authors demonstrated the use of growth modeling via HLM to augment visual analysis of a sophisticated single-case study. Data were used from a delayed multiple baseline…

  12. Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis

    Science.gov (United States)

    Kim, Rae Seon

    2011-01-01

    When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…

  13. Regression modeling strategies with applications to linear models, logistic and ordinal regression, and survival analysis

    CERN Document Server

Harrell, Jr., Frank E

    2015-01-01

    This highly anticipated second edition features new chapters and sections, 225 new references, and comprehensive R software. In keeping with the previous edition, this book is about the art and science of data analysis and predictive modeling, which entails choosing and using multiple tools. Instead of presenting isolated techniques, this text emphasizes problem solving strategies that address the many issues arising when developing multivariable models using real data and not standard textbook examples. It includes imputation methods for dealing with missing data effectively, methods for fitting nonlinear relationships and for making the estimation of transformations a formal part of the modeling process, methods for dealing with "too many variables to analyze and not enough observations," and powerful model validation techniques based on the bootstrap.  The reader will gain a keen understanding of predictive accuracy, and the harm of categorizing continuous predictors or outcomes.  This text realistically...

  14. Mining Sequential Update Summarization with Hierarchical Text Analysis

    Directory of Open Access Journals (Sweden)

    Chunyun Zhang

    2016-01-01

Full Text Available The outbreak of unexpected news events, such as major accidents or natural disasters, brings about a new information access problem where traditional approaches fail. News of these events is typically sparse early on and redundant later. Hence, it is very important to get updates and provide individuals with timely and important information about these incidents during their development, especially when applied in wireless and mobile Internet of Things (IoT) settings. In this paper, we define the problem of sequential update summarization extraction and present a new hierarchical update mining system that can broadcast useful, new, and timely sentence-length updates about a developing event. The system proposes a novel method that incorporates techniques from topic-level and sentence-level summarization. To evaluate the performance of the proposed system, we apply it to the sequential update summarization task of the temporal summarization (TS) track at the Text Retrieval Conference (TREC) 2013 and compute four measurements of the update mining system: expected gain, expected latency gain, comprehensiveness, and latency comprehensiveness. Experimental results show that our proposed method performs well.

  15. Category theoretic analysis of hierarchical protein materials and social networks

    CERN Document Server

    Spivak, David I; Buehler, Markus J

    2011-01-01

    Materials in biology span all the scales from Angstroms to meters and typically consist of complex hierarchical assemblies of simple building blocks. Here we review an application of category theory to describe structural and resulting functional properties of biological protein materials by developing so-called ologs. An olog is like a "concept web" or "semantic network" except that it follows a rigorous mathematical formulation based on category theory. This key difference ensures that an olog is unambiguous, highly adaptable to evolution and change, and suitable for sharing concepts with other ologs. We consider a simple example of an alpha-helical and an amyloid-like protein filament subjected to axial extension and develop an olog representation of their structural and resulting mechanical properties. We also construct a representation of a social network in which people send text-messages to their nearest neighbors and act as a team to perform a task. We show that the olog for the protein and the olog f...

  16. Design and analysis of experiments classical and regression approaches with SAS

    CERN Document Server

    Onyiah, Leonard C

    2008-01-01

    Introductory Statistical Inference and Regression Analysis Elementary Statistical Inference Regression Analysis Experiments, the Completely Randomized Design (CRD)-Classical and Regression Approaches Experiments Experiments to Compare Treatments Some Basic Ideas Requirements of a Good Experiment One-Way Experimental Layout or the CRD: Design and Analysis Analysis of Experimental Data (Fixed Effects Model) Expected Values for the Sums of Squares The Analysis of Variance (ANOVA) Table Follow-Up Analysis to Check fo

  17. Buffalos milk yield analysis using random regression models

    Directory of Open Access Journals (Sweden)

    A.S. Schierholt

    2010-02-01

Full Text Available Data comprising 1,719 milk yield records from 357 females (predominantly Murrah breed), daughters of 110 sires, with births from 1974 to 2004, obtained from the Programa de Melhoramento Genético de Bubalinos (PROMEBUL) and from records of the EMBRAPA Amazônia Oriental (EAO) herd, located in Belém, Pará, Brazil, were used to compare random regression models for estimating variance components and predicting breeding values of the sires. The data were analyzed by different models using Legendre polynomial functions of second to fourth order. The random regression models included the effects of herd-year, month of parity, and date of the control; regression coefficients for age of females (to describe the fixed part of the lactation curve); and random regression coefficients related to the direct genetic and permanent environment effects. The comparisons among the models were based on the Akaike Information Criterion. The random regression model using third-order Legendre polynomials with four classes of the environmental effect was the one that best described the additive genetic variation in milk yield. The heritability estimates varied from 0.08 to 0.40. The genetic correlation between milk yields at younger ages was close to unity, but at older ages it was low.
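Random regression models of this kind evaluate a Legendre polynomial basis at ages standardized to [-1, 1], the interval on which the polynomials are orthogonal. A minimal sketch of constructing a third-order basis (the ages are invented; the standardization is the usual convention, not a detail from the paper):

```python
import numpy as np
from numpy.polynomial import legendre

# Hypothetical ages at recording (months), mapped to [-1, 1].
age = np.array([24.0, 36.0, 48.0, 60.0, 72.0])
z = 2.0 * (age - age.min()) / (age.max() - age.min()) - 1.0

# Third-order Legendre basis: columns are P0..P3 evaluated at each age.
# These columns form the covariables for both the fixed lactation-curve
# part and the random genetic/permanent-environment regressions.
order = 3
basis = legendre.legvander(z, order)
```

Fitting the variance components themselves requires a mixed-model solver (e.g. REML), which is beyond this sketch; the point here is only how the polynomial covariables are built.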

  18. Quantile regression provides a fuller analysis of speed data.

    Science.gov (United States)

    Hewson, Paul

    2008-03-01

    Considerable interest already exists in terms of assessing percentiles of speed distributions, for example monitoring the 85th percentile speed is a common feature of the investigation of many road safety interventions. However, unlike the mean, where t-tests and ANOVA can be used to provide evidence of a statistically significant change, inference on these percentiles is much less common. This paper examines the potential role of quantile regression for modelling the 85th percentile, or any other quantile. Given that crash risk may increase disproportionately with increasing relative speed, it may be argued these quantiles are of more interest than the conditional mean. In common with the more usual linear regression, quantile regression admits a simple test as to whether the 85th percentile speed has changed following an intervention in an analogous way to using the t-test to determine if the mean speed has changed by considering the significance of parameters fitted to a design matrix. Having briefly outlined the technique and briefly examined an application with a widely published dataset concerning speed measurements taken around the introduction of signs in Cambridgeshire, this paper will demonstrate the potential for quantile regression modelling by examining recent data from Northamptonshire collected in conjunction with a "community speed watch" programme. Freely available software is used to fit these models and it is hoped that the potential benefits of using quantile regression methods when examining and analysing speed data are demonstrated.
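The quantile regression machinery rests on the check (pinball) loss: the empirical tau-quantile minimizes it over constant predictors, just as the mean minimizes squared error. A small self-contained sketch with simulated speeds (all values invented, not the Cambridgeshire or Northamptonshire data):

```python
import numpy as np

def pinball_loss(speeds, q, tau):
    """Mean check (pinball) loss of a candidate constant q at quantile tau.
    The empirical tau-quantile minimizes this loss over constants."""
    u = speeds - q
    return np.mean(np.maximum(tau * u, (tau - 1.0) * u))

rng = np.random.default_rng(1)
speeds = rng.normal(48.0, 6.0, size=500)   # simulated spot speeds, mph

tau = 0.85
q85 = np.quantile(speeds, tau)             # the 85th-percentile speed

# The 85th percentile beats other constants (e.g. the mean) under this loss.
loss_at_q85 = pinball_loss(speeds, q85, tau)
loss_at_mean = pinball_loss(speeds, speeds.mean(), tau)
```

Full quantile regression replaces the constant with a linear predictor and minimizes the same loss, which is what lets one test whether the 85th-percentile speed changed after an intervention, analogous to a t-test on the mean.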

  19. Analysis of retirement income adequacy using quantile regression: A case study in Malaysia

    Science.gov (United States)

    Alaudin, Ros Idayuwati; Ismail, Noriszura; Isa, Zaidi

    2015-09-01

    Quantile regression is a statistical analysis that does not restrict attention to the conditional mean and therefore, permitting the approximation of the whole conditional distribution of a response variable. Quantile regression is a robust regression to outliers compared to mean regression models. In this paper, we demonstrate how quantile regression approach can be used to analyze the ratio of projected wealth to needs (wealth-needs ratio) during retirement.

  20. Hierarchical Linear Modeling for Analysis of Ecological Momentary Assessment Data in Physical Medicine and Rehabilitation Research.

    Science.gov (United States)

    Terhorst, Lauren; Beck, Kelly Battle; McKeon, Ashlee B; Graham, Kristin M; Ye, Feifei; Shiffman, Saul

    2017-08-01

Ecological momentary assessment (EMA) methods collect real-time data in real-world environments, which allows physical medicine and rehabilitation researchers to examine objective outcome data and reduces bias from retrospective recall. The statistical analysis of EMA data is directly related to the research question and the temporal design of the study. Hierarchical linear modeling, which accounts for multiple observations from the same participant, is a particularly useful approach to analyzing EMA data. The objective of this paper is to introduce the process of conducting hierarchical linear modeling analyses with EMA data, using exemplars from the recent physical medicine and rehabilitation literature.

  1. Multiple regression for physiological data analysis: the problem of multicollinearity.

    Science.gov (United States)

    Slinker, B K; Glantz, S A

    1985-07-01

    Multiple linear regression, in which several predictor variables are related to a response variable, is a powerful statistical tool for gaining quantitative insight into complex in vivo physiological systems. For these insights to be correct, all predictor variables must be uncorrelated. However, in many physiological experiments the predictor variables cannot be precisely controlled and thus change in parallel (i.e., they are highly correlated). There is a redundancy of information about the response, a situation called multicollinearity, that leads to numerical problems in estimating the parameters in regression equations; the parameters are often of incorrect magnitude or sign or have large standard errors. Although multicollinearity can be avoided with good experimental design, not all interesting physiological questions can be studied without encountering multicollinearity. In these cases various ad hoc procedures have been proposed to mitigate multicollinearity. Although many of these procedures are controversial, they can be helpful in applying multiple linear regression to some physiological problems.
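A standard diagnostic for the multicollinearity discussed above is the variance inflation factor (VIF): regress each predictor on the others and report 1/(1 - R²). A self-contained sketch with simulated data (the data, the thresholds, and the function name are illustrative):

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X: regress each
    predictor on the remaining predictors (plus an intercept) and
    return 1 / (1 - R^2)."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        Z = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
        resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
        r2 = 1.0 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)                 # independent of x1
x3 = x1 + 0.05 * rng.normal(size=200)     # nearly collinear with x1
vifs = vif(np.column_stack([x1, x2, x3]))
```

Here the two nearly collinear predictors get very large VIFs while the independent one stays near 1, mirroring the "redundancy of information" the abstract describes: parameter estimates for the collinear pair will be unstable, with inflated standard errors.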

  2. Analysis of some methods for reduced rank Gaussian process regression

    DEFF Research Database (Denmark)

    Quinonero-Candela, J.; Rasmussen, Carl Edward

    2005-01-01

While there is strong motivation for using Gaussian Processes (GPs) due to their excellent performance in regression and classification problems, their computational complexity makes them impractical when the size of the training set exceeds a few thousand cases. This has motivated the recent proliferation of a number of cost-effective approximations to GPs, both for classification and for regression. In this paper we analyze one popular approximation to GPs for regression: the reduced rank approximation. While generally GPs are equivalent to infinite linear models, we show that Reduced Rank Gaussian Processes (RRGPs) are equivalent to finite sparse linear models. We also introduce the concept of degenerate GPs and show that they correspond to inappropriate priors. We show how to modify the RRGP to prevent it from being degenerate at test time. Training RRGPs consists both in learning...

  3. A combined multidimensional scaling and hierarchical clustering view for the exploratory analysis of multidimensional data

    Science.gov (United States)

    Craig, Paul; Roa-Seïler, Néna

    2013-01-01

    This paper describes a novel information visualization technique that combines multidimensional scaling and hierarchical clustering to support the exploratory analysis of multidimensional data. The technique displays the results of multidimensional scaling using a scatter plot where the proximity of any two items' representations is approximate to their similarity according to a Euclidean distance metric. The results of hierarchical clustering are overlaid onto this view by drawing smoothed outlines around each nested cluster. The difference in similarity between successive cluster combinations is used to colour code clusters and make stronger natural clusters more prominent in the display. When a cluster or group of items is selected, multidimensional scaling and hierarchical clustering are re-applied to a filtered subset of the data, and animation is used to smooth the transition between successive filtered views. As a case study we demonstrate the technique being used to analyse survey data relating to the appropriateness of different phrases to different emotionally charged situations.
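The multidimensional-scaling half of this technique can be sketched with classical (Torgerson) MDS, which embeds items in k dimensions from a matrix of pairwise Euclidean distances; the hierarchical-clustering overlay, colour coding, and animation from the paper are omitted. The data and function name are illustrative:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: embed n items in k dimensions from an
    n x n matrix of pairwise Euclidean distances D."""
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    w, V = np.linalg.eigh(B)                   # eigenvalues ascending
    idx = np.argsort(w)[::-1][:k]              # keep k largest eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Points that genuinely live in 2-D: the embedding reproduces all
# pairwise distances exactly (up to rotation/reflection).
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
emb = classical_mds(D, k=2)
D_emb = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
```

For real survey data the distances are only approximately Euclidean, so the scatter-plot proximities are approximations of similarity, which is why the paper overlays hierarchical-cluster outlines as a second cue.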

  4. Mean-field analysis of phase transitions in the emergence of hierarchical society

    Science.gov (United States)

    Okubo, Tsuyoshi; Odagaki, Takashi

    2007-09-01

    Emergence of hierarchical society is analyzed by use of a simple agent-based model. We extend the mean-field model of Bonabeau [Physica A 217, 373 (1995)] to societies obeying complex diffusion rules where each individual selects a moving direction following their power rankings. We apply this mean-field analysis to the pacifist society model recently investigated by use of Monte Carlo simulation [Physica A 367, 435 (2006)]. We show analytically that the self-organization of hierarchies occurs in two steps as the individual density is increased and there are three phases: one egalitarian and two hierarchical states. We also highlight that the transition from the egalitarian phase to the first hierarchical phase is a continuous change in the order parameter and the second transition causes a discontinuous jump in the order parameter.

  5. A Hierarchical Generalized Linear Model in Combination with Dispersion Modeling to Improve Sib-Pair Linkage Analysis.

    Science.gov (United States)

    Lee, Woojoo; Kim, Jeonghwan; Lee, Youngjo; Park, Taesung; Suh, Young Ju

    2015-01-01

    We explored a hierarchical generalized linear model (HGLM) in combination with dispersion modeling to improve sib-pair linkage analysis based on the revised Haseman-Elston regression model for a quantitative trait. A dispersion modeling technique was investigated for sib-pair linkage analysis using simulation studies and real data applications. We considered four heterogeneous dispersion settings according to the signal-to-noise ratio (SNR) in various statistical models based on the Haseman-Elston regression model. Our numerical studies demonstrated that susceptibility loci could be detected well by modeling the dispersion parameter appropriately. In particular, the HGLM performed better than the linear regression model and the ordinary linear mixed model when the SNR is low, i.e., when substantial noise is present in the data. The study shows that the HGLM in combination with dispersion modeling can be utilized to accurately identify multiple markers showing linkage to familial complex traits. Appropriate dispersion modeling may be more powerful for identifying markers closest to the major genes that determine a quantitative trait. © 2015 S. Karger AG, Basel.
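
    The HGLM itself is not reproduced here, but the value of modeling dispersion can be illustrated with a minimal numpy sketch: when the noise variance differs across observations (a low SNR in part of the data), weighting by the inverse variance, here assumed known, recovers the regression coefficients more efficiently than ordinary least squares. All data below are simulated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(0.0, 10.0, n)
sigma = 0.5 + 0.5 * x                      # dispersion grows with x (low SNR at high x)
y = 1.0 + 2.0 * x + rng.normal(0.0, sigma)

X = np.column_stack([np.ones(n), x])

# Ordinary least squares ignores the dispersion structure
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Weighted least squares with weights proportional to 1/variance
w = np.sqrt(1.0 / sigma**2)
beta_wls, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
print(beta_ols, beta_wls)
```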

  6. "Sentido de Pertenencia": A Hierarchical Analysis Predicting Sense of Belonging among Latino College Students

    Science.gov (United States)

    Strayhorn, Terrell Lamont

    2008-01-01

    The present study estimated the influence of academic and social collegiate experiences on Latino students' sense of belonging, controlling for background differences, using hierarchical analysis techniques with a nested design. In addition, results were compared between Latino students and their White counterparts. Findings reveal that grades,…

  7. A Hierarchical Linear Model with Factor Analysis Structure at Level 2

    Science.gov (United States)

    Miyazaki, Yasuo; Frank, Kenneth A.

    2006-01-01

    In this article the authors develop a model that employs a factor analysis structure at Level 2 of a two-level hierarchical linear model (HLM). The model (HLM2F) imposes a structure on a deficient rank Level 2 covariance matrix [tau], and facilitates estimation of a relatively large [tau] matrix. Maximum likelihood estimators are derived via the…

  8. Collaborative Hierarchical Sparse Modeling

    CERN Document Server

    Sprechmann, Pablo; Sapiro, Guillermo; Eldar, Yonina C

    2010-01-01

    Sparse modeling is a powerful framework for data analysis and processing. Traditionally, encoding in this framework is done by solving an l_1-regularized linear regression problem, usually called Lasso. In this work we first combine the sparsity-inducing property of the Lasso model, at the individual feature level, with the block-sparsity property of the group Lasso model, where sparse groups of features are jointly encoded, obtaining a sparsity pattern hierarchically structured. This results in the hierarchical Lasso, which shows important practical modeling advantages. We then extend this approach to the collaborative case, where a set of simultaneously coded signals share the same sparsity pattern at the higher (group) level but not necessarily at the lower one. Signals then share the same active groups, or classes, but not necessarily the same active set. This is very well suited for applications such as source separation. An efficient optimization procedure, which guarantees convergence to the global opt...
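
    A minimal numpy sketch of the hierarchical sparsity mechanism described above: the proximal operator of the combined l1 plus group-l2 penalty (the building block of sparse-group and hierarchical Lasso solvers) zeroes weak groups entirely while preserving within-group sparsity in surviving groups. The function name, penalty weights, and inputs are illustrative, not the authors' code.

```python
import numpy as np

def hilasso_prox(v, lam1, lam2):
    """Prox of lam1*||.||_1 + lam2*||.||_2 for one coefficient group."""
    u = np.sign(v) * np.maximum(np.abs(v) - lam1, 0.0)  # within-group soft-threshold
    norm = np.linalg.norm(u)
    if norm <= lam2:
        return np.zeros_like(u)                          # whole group set to zero
    return (1.0 - lam2 / norm) * u                       # group-level shrinkage

strong = hilasso_prox(np.array([3.0, -2.5, 0.1]), lam1=0.5, lam2=1.0)
weak = hilasso_prox(np.array([0.3, -0.2, 0.1]), lam1=0.5, lam2=1.0)
print(strong, weak)
```

    The "strong" group survives but its small coefficient is zeroed (Lasso-style), while the "weak" group is eliminated as a whole (group-Lasso-style), giving the hierarchically structured sparsity pattern.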

  9. Regression analysis of censored data using pseudo-observations

    DEFF Research Database (Denmark)

    Parner, Erik T.; Andersen, Per Kragh

    2010-01-01

    We draw upon a series of articles in which a method based on pseudovalues is proposed for direct regression modeling of the survival function, the restricted mean, and the cumulative incidence function in competing risks with right-censored data. The models, once the pseudovalues have been...

  10. Measuring Habituation in Infants: An Approach Using Regression Analysis.

    Science.gov (United States)

    Ashmead, Daniel H.; Davis, DeFord L.

    1996-01-01

    Used computer simulations to examine effectiveness of different criteria for measuring infant visual habituation. Found that a criterion based on fitting a second-order polynomial regression function to looking-time data produced more accurate estimation of looking times and higher power for detecting novelty effects than did the traditional…
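
    The criterion this study evaluates, fitting a second-order polynomial to looking-time data, can be sketched with numpy on hypothetical habituation data (looking time declining across trials, then plateauing):

```python
import numpy as np

# Hypothetical looking times (s) over ten habituation trials
trials = np.arange(1, 11)
looking = np.array([12.0, 10.5, 9.0, 7.2, 6.1, 5.0, 4.6, 4.3, 4.2, 4.1])

coeffs = np.polyfit(trials, looking, deg=2)   # second-order polynomial fit
fitted = np.polyval(coeffs, trials)
print(coeffs)
```

    A positive quadratic coefficient captures the decelerating decline that characterizes habituation.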

  11. Grades, Gender, and Encouragement: A Regression Discontinuity Analysis

    Science.gov (United States)

    Owen, Ann L.

    2010-01-01

    The author employs a regression discontinuity design to provide direct evidence on the effects of grades earned in economics principles classes on the decision to major in economics and finds a differential effect for male and female students. Specifically, for female students, receiving an A for a final grade in the first economics class is…

  12. Development of Hierarchical Bayesian Model Based on Regional Frequency Analysis and Its Application to Estimate Areal Rainfall in South Korea

    Science.gov (United States)

    Kim, J.; Kwon, H. H.

    2014-12-01

    The existing regional frequency analysis has disadvantages in that it is difficult to consider geographical characteristics in estimating areal rainfall. In this regard, this study aims to develop a hierarchical Bayesian model-based regional frequency analysis in which spatial patterns of the design rainfall with geographical information are explicitly incorporated. This study assumes that the parameters of the Gumbel distribution are a function of geographical characteristics (e.g., altitude, latitude, and longitude) within a general linear regression framework. Posterior distributions of the regression parameters are estimated by the Bayesian Markov Chain Monte Carlo (MCMC) method, and the identified functional relationship is used to spatially interpolate the parameters of the Gumbel distribution using digital elevation models (DEM) as inputs. The proposed model is applied to derive design rainfalls over the entire Han River watershed. It was found that the proposed Bayesian regional frequency analysis model showed results similar to those of L-moment based regional frequency analysis. In addition, the model showed an advantage in quantifying the uncertainty of the design rainfall and in estimating areal rainfall that takes geographical information into account. Acknowledgement: This research was supported by a grant (14AWMP-B079364-01) from the Water Management Research Program funded by the Ministry of Land, Infrastructure and Transport of the Korean government.
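
    A much-reduced sketch of one ingredient of this approach: fitting a Gumbel distribution to a synthetic annual-maximum rainfall series with SciPy and reading off a design quantile. The full hierarchical Bayesian regression of the Gumbel parameters on geographical covariates is not reproduced; the series below is simulated.

```python
import numpy as np
from scipy.stats import gumbel_r

rng = np.random.default_rng(42)
# Hypothetical annual-maximum rainfall series (mm) at a single site
annmax = gumbel_r.rvs(loc=120.0, scale=30.0, size=60, random_state=rng)

loc, scale = gumbel_r.fit(annmax)                        # Gumbel parameter estimates
design_100yr = gumbel_r.ppf(0.99, loc=loc, scale=scale)  # 100-year design rainfall
print(round(loc, 1), round(scale, 1), round(design_100yr, 1))
```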

  13. Academic Achievement, Intelligence, and Creativity: A Regression Surface Analysis.

    Science.gov (United States)

    Marjoribanks, K

    1976-01-01

    Data collected on 400 12-year-old English school children were used to examine relations between measures of intelligence, creativity and academic achievement. Complex multiple regression models, which included terms to account for the possible interaction and curvilinear relations between intelligence, creativity and academic achievement were used to construct regression surfaces. The surfaces showed that the traditional threshold hypothesis, which suggests that beyond a certain level of intelligence academic achievement is related increasingly to creativity and ceases to be related strongly to intelligence, was not supported. For some areas of academic performance the results suggest an alternate proposition, that creativity ceases to be related to achievement after a threshold level of intelligence has been reached. It was also found that at high levels of verbal ability, non-verbal ability and creativity appeared to have differential relations with academic achievement.

  14. Model performance analysis and model validation in logistic regression

    Directory of Open Access Journals (Sweden)

    Rosa Arboretti Giancristofaro

    2007-10-01

    In this paper a new model validation procedure for a logistic regression model is presented. First, we give a brief review of different techniques of model validation. Next, we define a number of properties required for a model to be considered "good", and a number of quantitative performance measures. Lastly, we describe a methodology for the assessment of the performance of a given model by using an example taken from a management study.
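
    The paper's own validation procedure is not reproduced here, but a minimal scikit-learn sketch shows two common quantitative performance measures for a logistic model on held-out data: discrimination via the area under the ROC curve, and probability accuracy via the Brier score. The data are synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
p = model.predict_proba(X_te)[:, 1]

auc = roc_auc_score(y_te, p)         # discrimination on held-out data
brier = brier_score_loss(y_te, p)    # accuracy of the predicted probabilities
print(auc, brier)
```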

  15. Modeling Heterogeneity in Relationships between Initial Status and Rates of Change: Treating Latent Variable Regression Coefficients as Random Coefficients in a Three-Level Hierarchical Model

    Science.gov (United States)

    Choi, Kilchan; Seltzer, Michael

    2010-01-01

    In studies of change in education and numerous other fields, interest often centers on how differences in the status of individuals at the start of a period of substantive interest relate to differences in subsequent change. In this article, the authors present a fully Bayesian approach to estimating three-level Hierarchical Models in which latent…

  16. A New Approach in Regression Analysis for Modeling Adsorption Isotherms

    Directory of Open Access Journals (Sweden)

    Dana D. Marković

    2014-01-01

    Numerous regression approaches to isotherm parameter estimation appear in the literature. Real insight into the proper modeling pattern can be achieved only by testing methods on a very large number of cases. Experimentally, this cannot be done in a reasonable time, so the Monte Carlo simulation method was applied. The objective of this paper is to introduce and compare numerical approaches that involve different levels of knowledge about the noise structure of the analytical method used for initial and equilibrium concentration determination. Six levels of homoscedastic noise and five types of heteroscedastic noise precision models were considered. Performance of the methods was statistically evaluated based on median percentage error and mean absolute relative error in parameter estimates. The present study showed a clear distinction between two cases. When equilibrium experiments are performed only once, for the homoscedastic case the winning error function is ordinary least squares, while for the heteroscedastic case the use of orthogonal distance regression or Marquardt's percent standard deviation is suggested. It was found that, when experiments are repeated three times, the simple method of weighted least squares performed as well as the more complicated orthogonal distance regression method.
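
    A minimal SciPy sketch of the weighted-least-squares idea discussed above: fitting a Langmuir isotherm with `curve_fit`, passing an assumed heteroscedastic noise precision model through the `sigma` argument. The isotherm parameters, concentration range, and noise model are hypothetical, not the paper's simulation settings.

```python
import numpy as np
from scipy.optimize import curve_fit

def langmuir(c, qmax, k):
    # Langmuir isotherm: adsorbed amount as a function of equilibrium concentration
    return qmax * k * c / (1.0 + k * c)

rng = np.random.default_rng(3)
c_eq = np.linspace(0.5, 20.0, 15)          # equilibrium concentrations
q_true = langmuir(c_eq, 10.0, 0.8)
noise_sd = 0.02 * q_true + 0.05            # assumed heteroscedastic precision model
q_obs = q_true + rng.normal(0.0, noise_sd)

# Weighted least squares: sigma passes the noise model into the fit
params, _ = curve_fit(langmuir, c_eq, q_obs, p0=[5.0, 0.5],
                      sigma=noise_sd, absolute_sigma=True)
print(params)
```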

  17. HIERARCHICAL ADAPTIVE ROOD PATTERN SEARCH FOR MOTION ESTIMATION AT VIDEO SEQUENCE ANALYSIS

    Directory of Open Access Journals (Sweden)

    V. T. Nguyen

    2016-05-01

    Subject of Research. The paper deals with motion estimation algorithms for the analysis of video sequences in the MPEG-4 Visual and H.264 compression standards. A new algorithm is proposed based on an analysis of the advantages and disadvantages of existing algorithms. Method. The algorithm is called hierarchical adaptive rood pattern search (Hierarchical ARPS, HARPS). This new algorithm combines the classic adaptive rood pattern search (ARPS) and hierarchical search MP (hierarchical search or mean pyramid). All motion estimation algorithms have been implemented using the MATLAB package and tested with several video sequences. Main Results. The criteria for evaluating the algorithms were speed, peak signal-to-noise ratio, mean square error, and mean absolute deviation. The proposed method showed much better performance at a comparable error and deviation. The peak signal-to-noise ratio in different video sequences shows both better and worse results than those of known algorithms, so it requires further investigation. Practical Relevance. Application of this algorithm in MPEG-4 and H.264 codecs instead of the standard one can significantly reduce compression time. This makes it suitable for telecommunication systems for multimedia data storage, transmission, and processing.

  18. Sensitivity analysis and optimization of system dynamics models : Regression analysis and statistical design of experiments

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    1995-01-01

    This tutorial discusses what-if analysis and optimization of System Dynamics models. These problems are solved, using the statistical techniques of regression analysis and design of experiments (DOE). These issues are illustrated by applying the statistical techniques to a System Dynamics model for

  19. Analysis of U.S. freight-train derailment severity using zero-truncated negative binomial regression and quantile regression.

    Science.gov (United States)

    Liu, Xiang; Saat, M Rapik; Qin, Xiao; Barkan, Christopher P L

    2013-10-01

    Derailments are the most common type of freight-train accidents in the United States. Derailments cause damage to infrastructure and rolling stock, disrupt services, and may cause casualties and harm the environment. Accordingly, derailment analysis and prevention has long been a high priority in the rail industry and government. Despite the low probability of a train derailment, the potential for severe consequences justify the need to better understand the factors influencing train derailment severity. In this paper, a zero-truncated negative binomial (ZTNB) regression model is developed to estimate the conditional mean of train derailment severity. Recognizing that the mean is not the only statistic describing data distribution, a quantile regression (QR) model is also developed to estimate derailment severity at different quantiles. The two regression models together provide a better understanding of train derailment severity distribution. Results of this work can be used to estimate train derailment severity under various operational conditions and by different accident causes. This research is intended to provide insights regarding development of cost-efficient train safety policies. Copyright © 2013 Elsevier Ltd. All rights reserved.
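
    The zero-truncated negative binomial part can be sketched by maximum likelihood in SciPy: the zero-truncated likelihood rescales the negative binomial pmf by 1 - P(X = 0), since a derailment with zero severity is never recorded. The data are simulated, and the parameterization (log/logit transforms to keep n > 0 and 0 < p < 1) is an implementation choice, not the authors'.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import nbinom

rng = np.random.default_rng(7)
# Hypothetical severity counts: simulate a negative binomial, keep positives only
raw = nbinom.rvs(3, 0.3, size=5000, random_state=rng)
counts = raw[raw > 0]

def ztnb_negloglik(theta):
    # log/logit transforms keep n > 0 and 0 < p < 1 during optimization
    n, p = np.exp(theta[0]), 1.0 / (1.0 + np.exp(-theta[1]))
    # Zero truncation: rescale the NB pmf by 1 - P(X = 0)
    logpmf = nbinom.logpmf(counts, n, p) - np.log1p(-nbinom.pmf(0, n, p))
    return -logpmf.sum()

res = minimize(ztnb_negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
n_hat, p_hat = np.exp(res.x[0]), 1.0 / (1.0 + np.exp(-res.x[1]))
print(n_hat, p_hat)
```

    A full ZTNB regression would additionally make the mean a function of operational covariates; quantile regression would then complement this conditional-mean picture at other points of the severity distribution.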

  20. Hierarchical Cluster Analysis: Comparison of Three Linkage Measures and Application to Psychological Data

    Directory of Open Access Journals (Sweden)

    Odilia Yim

    2015-02-01

    Cluster analysis refers to a class of data reduction methods used for sorting cases, observations, or variables of a given dataset into homogeneous groups that differ from each other. The present paper focuses on hierarchical agglomerative cluster analysis, a statistical technique where groups are sequentially created by systematically merging similar clusters together, as dictated by the distance and linkage measures chosen by the researcher. Specific distance and linkage measures are reviewed, including a discussion of how these choices can influence the clustering process by comparing three common linkage measures (single linkage, complete linkage, average linkage. The tutorial guides researchers in performing a hierarchical cluster analysis using the SPSS statistical software. Through an example, we demonstrate how cluster analysis can be used to detect meaningful subgroups in a sample of bilinguals by examining various language variables.
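
    The three linkage measures the tutorial compares can be run side by side with SciPy (rather than SPSS, which the tutorial uses) on synthetic two-group data:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(5)
# Two compact, well-separated groups of points in the plane
points = np.vstack([rng.normal(0, 0.3, (15, 2)), rng.normal(6, 0.3, (15, 2))])
d = pdist(points)  # condensed pairwise-distance matrix

results = {}
for method in ("single", "complete", "average"):
    results[method] = fcluster(linkage(d, method=method), t=2, criterion="maxclust")
    print(method, results[method])
```

    On clean data like this all three linkages agree; the tutorial's point is that on chained or unevenly sized clusters they can diverge substantially.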

  1. Quantifying Change During Outpatient Stroke Rehabilitation: A Retrospective Regression Analysis.

    Science.gov (United States)

    Lohse, Keith; Bland, Marghuretta D; Lang, Catherine E

    2016-09-01

    To examine change and individual trajectories for balance, upper extremity motor capacity, and mobility in people poststroke during the time they received outpatient therapies. Retrospective analyses of an observational cohort using hierarchical linear modeling. Outpatient rehabilitation. Persons poststroke (N=366). Usual outpatient physical and occupational therapy. Berg Balance Scale (BBS), Action Research Arm Test (ARAT), and walking speed were used to assess the 3 domains. Initial scores at the start of outpatient therapy (intercepts), rate of change during outpatient therapy (slopes), and covariance between slopes and intercepts were modeled as random effects. Additional variables modeled as fixed effects were duration (months of outpatient therapy), time (days poststroke), age (y), and inpatient status (if the patient went to an inpatient rehabilitation facility [IRF]). A patient with average age and time started at 37 points on the BBS with a change of 1.8 points per month, at 35 points on the ARAT with a change of 2 points per month, and with a walking speed of .59m/s with a change of .09m/s per month. When controlling for other variables, patients started with lower scores on the BBS and ARAT or had slower walking speeds at admission if they started outpatient therapy later than average or went to an IRF. Patients generally improved over the course of outpatient therapy, but there was considerable variability in individual trajectories. Average rates of change across all 3 domains were small. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
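
    The study's hierarchical linear model is not reproduced here; the following numpy sketch only illustrates the random-effects idea behind it: simulate patient-specific intercepts and monthly-change slopes centered on the reported averages, estimate each patient's slope by OLS, and summarize both the average rate of change and its heterogeneity. All data are simulated.

```python
import numpy as np

rng = np.random.default_rng(11)
n_patients, n_visits = 50, 6
months = np.arange(n_visits)

# Patient-specific baselines and monthly gains, centered on the reported averages
intercepts = rng.normal(37.0, 8.0, n_patients)   # e.g. BBS points at admission
slopes = rng.normal(1.8, 0.8, n_patients)        # points gained per month

est_slopes = []
for i in range(n_patients):
    y = intercepts[i] + slopes[i] * months + rng.normal(0.0, 2.0, n_visits)
    b, _ = np.polyfit(months, y, 1)              # per-patient OLS slope
    est_slopes.append(b)

print(np.mean(est_slopes), np.std(est_slopes))   # average change vs. heterogeneity
```

    An HLM additionally shrinks the noisy per-patient slopes toward the group mean and models the intercept-slope covariance, which simple per-patient OLS does not.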

  2. Analysis of some methods for reduced rank Gaussian process regression

    DEFF Research Database (Denmark)

    Quinonero-Candela, J.; Rasmussen, Carl Edward

    2005-01-01

    While there is strong motivation for using Gaussian Processes (GPs) due to their excellent performance in regression and classification problems, their computational complexity makes them impractical when the size of the training set exceeds a few thousand cases. This has motivated the recent...... Gaussian Processes (RRGPs) are equivalent to finite sparse linear models. We also introduce the concept of degenerate GPs and show that they correspond to inappropriate priors. We show how to modify the RRGP to prevent it from being degenerate at test time. Training RRGPs consists both in learning...

  3. External Tank Liquid Hydrogen (LH2) Prepress Regression Analysis Independent Review Technical Consultation Report

    Science.gov (United States)

    Parsons, Vickie S.

    2009-01-01

    The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center NESC on September 20, 2005. The NESC team performed an independent review of regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.

  4. Fractal analysis of the hierarchic structure of fossil coal surface

    Energy Technology Data Exchange (ETDEWEB)

    Alekseev, A.D.; Vasilenko, T.A.; Kirillov, A.K. [National Academy of Sciences, Donetsk (Ukraine)

    2008-05-15

    Fractal analysis is described as a method for studying images of the surface of fossil coal, one of the natural sorbents, with the aim of determining its structural surface heterogeneity. The deformation effect, a reduction in the dimensions of heterogeneity boundaries, is considered. It is shown that the theory of nonequilibrium dynamic systems permits assessment of the formation level of heterogeneities in the sorbent composition by means of the Hurst factor.
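
    The Hurst factor mentioned above is commonly estimated by rescaled-range (R/S) analysis; the abstract does not give the paper's exact procedure, so the following numpy implementation is a generic sketch. For white noise the estimate should fall near 0.5.

```python
import numpy as np

def hurst_rs(x, min_chunk=8):
    """Estimate the Hurst exponent by rescaled-range (R/S) analysis."""
    n = len(x)
    sizes, rs_vals = [], []
    size = min_chunk
    while size <= n // 2:
        rs = []
        for start in range(0, n - size + 1, size):
            chunk = x[start:start + size]
            dev = np.cumsum(chunk - chunk.mean())   # cumulative deviation from mean
            r = dev.max() - dev.min()               # range of the deviation
            s = chunk.std()                         # chunk standard deviation
            if s > 0:
                rs.append(r / s)
        sizes.append(size)
        rs_vals.append(np.mean(rs))
        size *= 2
    # Hurst exponent = slope of log(R/S) against log(chunk size)
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_vals), 1)
    return slope

rng = np.random.default_rng(2)
h_white = hurst_rs(rng.normal(size=4096))
print(h_white)
```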

  5. Hierarchical cluster analysis of progression patterns in open-angle glaucoma patients with medical treatment.

    Science.gov (United States)

    Bae, Hyoung Won; Rho, Seungsoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2014-04-29

    To classify medically treated open-angle glaucoma (OAG) by the pattern of progression using hierarchical cluster analysis, and to determine OAG progression characteristics by comparing clusters. Ninety-five eyes of 95 OAG patients who received medical treatment and who had undergone visual field (VF) testing at least once per year for 5 or more years. OAG was classified into subgroups using hierarchical cluster analysis based on the following five variables: baseline mean deviation (MD), baseline visual field index (VFI), MD slope, VFI slope, and Glaucoma Progression Analysis (GPA) printout. Other parameters were then compared between clusters. Hierarchical cluster analysis yielded two clusters. Cluster 1 showed -4.06 ± 2.43 dB baseline MD, 92.58% ± 6.27% baseline VFI, -0.28 ± 0.38 dB per year MD slope, -0.52% ± 0.81% per year VFI slope, and all "no progression" cases in the GPA printout, whereas cluster 2 showed -8.68 ± 3.81 baseline MD, 77.54 ± 12.98 baseline VFI, -0.72 ± 0.55 MD slope, -2.22 ± 1.89 VFI slope, and seven "possible" and four "likely" progression cases in the GPA printout. There were no significant differences in age, sex, mean IOP, central corneal thickness, or axial length between clusters. However, cluster 2 included significantly more high-tension glaucoma patients and used a greater number of antiglaucoma eye drops compared with cluster 1. Hierarchical cluster analysis of progression patterns divided OAG into slow and fast progression groups, as evidenced by assessing the parameters of glaucomatous progression in VF testing. In the fast progression group, the prevalence of high-tension glaucoma was greater and the number of antiglaucoma medications administered was higher than in the slow progression group. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  6. Social Influence on Information Technology Adoption and Sustained Use in Healthcare: A Hierarchical Bayesian Learning Method Analysis

    Science.gov (United States)

    Hao, Haijing

    2013-01-01

    Information technology adoption and diffusion is currently a significant challenge in the healthcare delivery setting. This thesis includes three papers that explore social influence on information technology adoption and sustained use in the healthcare delivery environment using conventional regression models and novel hierarchical Bayesian…

  7. Implementation of Hierarchical Task Analysis for User Interface Design in Drawing Application for Early Childhood Education

    National Research Council Canada - National Science Library

    Mira Kania Sabariah; Veronikha Effendy; Muhamad Fachmi Ichsan

    2016-01-01

    ... of learning and characteristics of early childhood (4-6 years). Based on the results, the Hierarchical Task Analysis method generated a list of tasks that must be completed in designing a user interface that represents the user experience in learning to draw. Heuristic Evaluation then showed that the usability of the model reached a very good level of understanding, and that it can be further enhanced to produce a better model.

  8. Simulation Experiments in Practice : Statistical Design and Regression Analysis

    NARCIS (Netherlands)

    Kleijnen, J.P.C.

    2007-01-01

    In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obta

  9. Design And Analysis Of Low Power Hierarchical Decoder

    Directory of Open Access Journals (Sweden)

    Abhinav Singh

    2012-11-01

    Due to the high degree of miniaturization possible today in semiconductor technology, the size and complexity of designs that may be implemented in hardware have increased dramatically. Process scaling has been used in the miniaturization process to reduce the area needed for logic functions in an effort to lower product costs. Precharged Complementary Metal Oxide Semiconductor (CMOS) domino logic techniques may be applied to functional blocks to reduce power. Domino logic forms an attractive design style for high-performance designs, since its low switching threshold and reduced transistor count lead to fast and area-efficient circuit implementations. In this paper all the necessary components required to form a 5-to-32 bit decoder using domino logic are designed to perform different analyses at the 180 nm and 350 nm technologies. The decoder implemented through domino logic is compared to a static decoder.

  10. Modeling Information Content Via Dirichlet-Multinomial Regression Analysis.

    Science.gov (United States)

    Ferrari, Alberto

    2017-02-16

    Shannon entropy is being increasingly used in biomedical research as an index of complexity and information content in sequences of symbols, e.g. languages, amino acid sequences, DNA methylation patterns and animal vocalizations. Yet, distributional properties of information entropy as a random variable have seldom been the object of study, leading to researchers mainly using linear models or simulation-based analytical approach to assess differences in information content, when entropy is measured repeatedly in different experimental conditions. Here a method to perform inference on entropy in such conditions is proposed. Building on results coming from studies in the field of Bayesian entropy estimation, a symmetric Dirichlet-multinomial regression model, able to deal efficiently with the issue of mean entropy estimation, is formulated. Through a simulation study the model is shown to outperform linear modeling in a vast range of scenarios and to have promising statistical properties. As a practical example, the method is applied to a data set coming from a real experiment on animal communication.
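
    A minimal sketch of the Bayesian flavor of entropy estimation referenced above: with a symmetric Dirichlet prior over symbol probabilities, the posterior distribution of Shannon entropy can be summarized by sampling. The counts and prior concentration are hypothetical; this is not the authors' Dirichlet-multinomial regression model itself.

```python
import numpy as np
from scipy.stats import dirichlet, entropy

rng = np.random.default_rng(4)
counts = np.array([30, 12, 5, 2, 1])          # hypothetical symbol counts

# Symmetric Dirichlet(1) prior -> posterior is Dirichlet(counts + 1)
posterior = dirichlet(counts + 1.0)
samples = posterior.rvs(size=5000, random_state=rng)

# Posterior distribution of Shannon entropy (in nats)
h = np.array([entropy(p) for p in samples])
print(h.mean(), h.std())                      # point estimate and its uncertainty
```

    The regression model in the paper extends this idea so that mean entropy can be compared across experimental conditions rather than estimated one sequence at a time.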

  11. Air Pollution Analysis using Ontologies and Regression Models

    Directory of Open Access Journals (Sweden)

    Parul Choudhary

    2016-07-01

    The explosive worldwide growth of the Web and of a rapidly changing market, characterized by short product cycles and demand for increased flexibility, has produced a society managed through data. This new socio-economic system relies increasingly on the movement and allocation of data, whose daily production, refinement, and exchange shape economy and industry. Cooperative, multidisciplinary engineering built on human collaboration is one good example; the Semantic Web, a new form of Web content meaningful to computers, is another. Communication, vision sharing, and data exchange have become society's new commercial stakes. Urban air pollution modeling and data processing techniques require this high degree of integration. Artificial intelligence offers numerous breakthrough technologies that can address environmental problems. Formal ontologies give data a precise and unambiguous meaning, allowing knowledge to be represented explicitly. In this work we survey regression models for ontologies and air pollution.

  12. Survival analysis of cervical cancer using stratified Cox regression

    Science.gov (United States)

    Purnami, S. W.; Inayati, K. D.; Sari, N. W. Wulan; Chosuvivatwong, V.; Sriplung, H.

    2016-04-01

    Cervical cancer is one of the most common causes of cancer death among women worldwide, including in Indonesia. Most cervical cancer patients arrive at the hospital already at an advanced stage. As a result, treatment of cervical cancer becomes more difficult and the risk of death can even increase. One parameter that can be used to assess the success of treatment is the probability of survival. This study examines the survival of cervical cancer patients at Dr. Soetomo Hospital using stratified Cox regression based on six factors: age, stage, treatment initiation, accompanying disease, complication, and anemia. The stratified Cox model is used because one independent variable, stage, does not satisfy the proportional hazards assumption. The results of the stratified Cox model show that the complication variable is a significant factor influencing the survival probability of cervical cancer patients. The obtained hazard ratio is 7.35, meaning that a cervical cancer patient with a complication is at a 7.35 times greater risk of dying than a patient without a complication. The adjusted survival curves showed that stage IV had the lowest probability of survival.
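
    The stratified Cox fit itself is not reproduced here; the following numpy sketch only illustrates the survival contrast behind a hazard ratio of that size, using hand-rolled Kaplan-Meier estimates on simulated data in which the complication group has a roughly sevenfold hazard. All data and rates are hypothetical.

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates at each (sorted) observation time."""
    order = np.argsort(times)
    times, events = times[order], events[order]
    at_risk, s, surv = len(times), 1.0, []
    for e in events:
        if e:                              # death observed (not censored)
            s *= (at_risk - 1) / at_risk
        surv.append(s)
        at_risk -= 1
    return times, np.array(surv)

rng = np.random.default_rng(6)
n = 200
complication = rng.integers(0, 2, n)
# Roughly sevenfold hazard with a complication (cf. the reported HR of 7.35)
event_time = rng.exponential(1.0 / np.where(complication == 1, 0.7, 0.1))
censor_time = rng.exponential(20.0, n)
times = np.minimum(event_time, censor_time)
events = event_time <= censor_time

surv2 = {}                                 # estimated S(t = 2) per group
for grp in (0, 1):
    t, s = kaplan_meier(times[complication == grp], events[complication == grp])
    surv2[grp] = s[t <= 2.0][-1]
print(surv2)
```

    A stratified Cox model goes further by estimating the hazard ratio while allowing each stage stratum its own baseline hazard.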

  13. Periorbital melasma: Hierarchical cluster analysis of clinical features in Asian patients.

    Science.gov (United States)

    Jung, Y S; Bae, J M; Kim, B J; Kang, J-S; Cho, S B

    2017-03-19

    Studies have shown melasma lesions to be distributed across the face in centrofacial, malar, and mandibular patterns. However, melasma lesions of the periorbital area have yet to be thoroughly described. We analyzed normal and ultraviolet light-exposed photographs of patients with melasma. The periorbital melasma lesions were measured according to anatomical reference points and a hierarchical cluster analysis was performed. The periorbital melasma lesions showed clinical features of fine and homogeneous melasma pigmentation involving both the upper and lower eyelids, extending to other anatomical sites with a darker and coarser appearance. The hierarchical cluster analysis indicated that patients with periorbital melasma can be categorized into two clusters according to the surface anatomy of the face. Significant differences between cluster 1 and cluster 2 were found in lateral distance and inferolateral distance, but not in medial distance or superior distance. Comparing the two clusters, patients in cluster 2 were significantly older and more commonly had accompanying melasma lesions of the temple and medial cheek. Our hierarchical cluster analysis of periorbital melasma lesions demonstrated that Asian patients with periorbital melasma can be categorized into two clusters according to the surface anatomy of the face. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  14. Analysis of Energy Optimized Hierarchical Routing Protocols in WSN

    Directory of Open Access Journals (Sweden)

    Er. Shelly Jain

    2013-05-01

    Modern wireless sensor networks can be expanded into large geographical areas via cheap sensor devices that must sustain themselves on limited energy, so developing an energy-efficient protocol is a major challenge. Currently, routing in wireless sensor networks faces multiple challenges, such as scalability, coverage, packet loss, interference, real-time audio and video streaming, weather reports, energy constraints, and so forth. Clustering sensor nodes is an effective topology control approach. LEACH is an energy-efficient clustering protocol because of its node distribution capabilities, but it still has limitations because it leads to uneven energy distribution. PEGASIS is an enhancement of LEACH that uses a chain-based technique to optimize energy consumption; this protocol also has certain disadvantages, such as delays in larger networks. HEED is a more advanced protocol that removes the disadvantages of LEACH and PEGASIS by using a distributed algorithm for selecting cluster heads (CHs); it does not make any assumptions about the infrastructure or capabilities of nodes. The LEACH, PEGASIS, and HEED routing algorithms are compared using Matlab simulation on a WiMAX network, and the results and analysis are based upon the simulation experiments. Simulation results demonstrate that HEED is effective in prolonging the network lifetime and also overcomes the disadvantages of both LEACH and PEGASIS.

  15. Introduction to mixed modelling beyond regression and analysis of variance

    CERN Document Server

    Galwey, N W

    2007-01-01

    Mixed modelling is one of the most promising and exciting areas of statistical analysis, enabling more powerful interpretation of data through the recognition of random effects. However, many perceive mixed modelling as an intimidating and specialized technique.

  16. A regression analysis on the green olives debittering

    Directory of Open Access Journals (Sweden)

    Kopsidas, Gerassimos C.

    1991-12-01

    Full Text Available In this paper, a regression model is fitted that gives the debittering time t as a function of the sodium hydroxide concentration C and the debittering temperature T, for the debittering of medium-size green olive fruit of the Conservolea variety. The model has the simple form t = a0·C^a1·e^(a2/T), where a0, a1, and a2 are constants. The values of a0, a1, and a2 are determined by the method of least squares from a set of experimental data. The fitted model is very satisfactory for the conditions under which Greek green olives are debittered.

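
    The least-squares fit of the model t = a0·C^a1·e^(a2/T) can be sketched by log-linearization, ln t = ln a0 + a1·ln C + a2/T, which is linear in the unknowns. The Python fragment below recovers assumed constants from hypothetical noise-free data; it is an illustration of the linearized fit, not the paper's data or code.

```python
import math

# Hypothetical debittering data (C = NaOH concentration, T = temperature in K,
# t = debittering time), generated from assumed constants purely for illustration.
a0_true, a1_true, a2_true = 50.0, -0.8, 900.0
data = [(C, T, a0_true * C**a1_true * math.exp(a2_true / T))
        for C in (1.0, 1.5, 2.0, 2.5) for T in (288.0, 298.0, 308.0)]

# Log-linearize: ln t = ln a0 + a1*ln C + a2*(1/T), then solve the 3x3 normal equations.
X = [(1.0, math.log(C), 1.0 / T) for C, T, _ in data]
y = [math.log(t) for _, _, t in data]
XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]

# Gaussian elimination with partial pivoting for the small linear system.
def solve(A, b):
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

b0, a1_hat, a2_hat = solve(XtX, Xty)
a0_hat = math.exp(b0)
print(a0_hat, a1_hat, a2_hat)  # recovers the assumed constants on noise-free data
```

Because the data are noise-free, the fit reproduces the assumed constants up to floating-point error; with real measurements the same calculation gives the least-squares estimates of the linearized model.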

  17. A Quality Assessment Tool for Non-Specialist Users of Regression Analysis

    Science.gov (United States)

    Argyrous, George

    2015-01-01

    This paper illustrates the use of a quality assessment tool for regression analysis. It is designed for non-specialist "consumers" of evidence, such as policy makers. The tool provides a series of questions such consumers of evidence can ask to interrogate regression analysis, and is illustrated with reference to a recent study published…

  18. Statistical mechanical analysis of a hierarchical random code ensemble in signal processing

    Energy Technology Data Exchange (ETDEWEB)

    Obuchi, Tomoyuki [Department of Earth and Space Science, Faculty of Science, Osaka University, Toyonaka 560-0043 (Japan); Takahashi, Kazutaka [Department of Physics, Tokyo Institute of Technology, Tokyo 152-8551 (Japan); Takeda, Koujin, E-mail: takeda@sp.dis.titech.ac.jp [Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology, Yokohama 226-8502 (Japan)

    2011-02-25

    We study a random code ensemble with a hierarchical structure, which is closely related to the generalized random energy model with discrete energy values. Based on this correspondence, we analyze the hierarchical random code ensemble using the replica method in two situations: lossy data compression and channel coding. For both situations, the large-deviation exponents characterizing the performance of the ensemble, namely the distortion rate of lossy data compression and the error exponent of channel coding in Gallager's formalism, are accessible through a generating function of the generalized random energy model. We argue that the transitions of those exponents observed in preceding work can be interpreted as phase transitions with respect to the replica number. We also show that replica symmetry breaking plays an essential role in these transitions.

  19. Monitoring Post Disturbance Forest Regeneration with Hierarchical Object-Based Image Analysis

    Directory of Open Access Journals (Sweden)

    L. Monika Moskal

    2013-10-01

    Full Text Available The main goal of this exploratory project was to quantify seedling density in post-fire regeneration sites, with the following objectives: to evaluate the application of second-order image texture (SOIT) in image segmentation, and to apply the object-based image analysis (OBIA) approach to develop a hierarchical classification. With the utilization of image texture we successfully developed a methodology to classify hyperspatial (high spatial resolution) imagery to the fine detail level of tree crowns, shadows and understory, while still allowing discrimination between density classes and between mature forest and burn classes. At the most detailed hierarchical Level I, classification accuracies reached 78.8%; the Level II stand density classification produced accuracies of 89.1%, and the same accuracy was achieved by the coarse general classification at Level III. Our interpretation of these results suggests hyperspatial imagery can be applied to post-fire forest density and regeneration mapping.

  20. Accounting for uncertainty in ecological analysis: the strengths and limitations of hierarchical statistical modeling.

    Science.gov (United States)

    Cressie, Noel; Calder, Catherine A; Clark, James S; Ver Hoef, Jay M; Wikle, Christopher K

    2009-04-01

    Analyses of ecological data should account for the uncertainty in the process(es) that generated the data. However, accounting for these uncertainties is a difficult task, since ecology is known for its complexity. Measurement and/or process errors are often the only sources of uncertainty modeled when addressing complex ecological problems, yet analyses should also account for uncertainty in sampling design, in model specification, in parameters governing the specified model, and in initial and boundary conditions. Only then can we be confident in the scientific inferences and forecasts made from an analysis. Probability and statistics provide a framework that accounts for multiple sources of uncertainty. Given the complexities of ecological studies, the hierarchical statistical model is an invaluable tool. This approach is not new in ecology, and there are many examples (both Bayesian and non-Bayesian) in the literature illustrating the benefits of this approach. In this article, we provide a baseline for concepts, notation, and methods, from which discussion on hierarchical statistical modeling in ecology can proceed. We have also planted some seeds for discussion and tried to show where the practical difficulties lie. Our thesis is that hierarchical statistical modeling is a powerful way of approaching ecological analysis in the presence of inevitable but quantifiable uncertainties, even if practical issues sometimes require pragmatic compromises.

  1. Methods of Detecting Outliers in A Regression Analysis Model ...

    African Journals Online (AJOL)

    PROF. O. E. OSUAGWU

    2013-06-01

    Jun 1, 2013 ... Capacity), X2 (Design Pressure), X3 (Boiler Type), X4 (Drum Type) were used. The analysis of the ... 1.2 Identification Of Outliers. There is no such thing as a simple test. However, there are many ..... Psychological. Bulletin, 95 ...

  2. A Noncentral "t" Regression Model for Meta-Analysis

    Science.gov (United States)

    Camilli, Gregory; de la Torre, Jimmy; Chiu, Chia-Yi

    2010-01-01

    In this article, three multilevel models for meta-analysis are examined. Hedges and Olkin suggested that effect sizes follow a noncentral "t" distribution and proposed several approximate methods. Raudenbush and Bryk further refined this model; however, this procedure is based on a normal approximation. In the current research literature, this…

  3. An improved multiple linear regression and data analysis computer program package

    Science.gov (United States)

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

  4. Random Decrement and Regression Analysis of Traffic Responses of Bridges

    DEFF Research Database (Denmark)

    Asmussen, J. C.; Ibrahim, S. R.; Brincker, Rune

    The topic of this paper is the estimation of modal parameters from ambient data by applying the Random Decrement technique. The data from the Queensborough Bridge over the Fraser River in Vancouver, Canada have been applied. The loads producing the dynamic response are ambient, e.g. wind, traffic...... and small ground motion. The Random Decrement technique is used to estimate the correlation function or the free decays from the ambient data. From these functions, the modal parameters are extracted using the Ibrahim Time Domain method. The possible influence of the traffic mass load on the bridge...... of the analysis using the Random Decrement technique are compared with results from an analysis based on fast Fourier transformations....

  5. Random Decrement and Regression Analysis of Traffic Responses of Bridges

    DEFF Research Database (Denmark)

    Asmussen, J. C.; Ibrahim, S. R.; Brincker, Rune

    1996-01-01

    The topic of this paper is the estimation of modal parameters from ambient data by applying the Random Decrement technique. The data from the Queensborough Bridge over the Fraser River in Vancouver, Canada have been applied. The loads producing the dynamic response are ambient, e.g. wind, traffic...... and small ground motion. The Random Decrement technique is used to estimate the correlation function or the free decays from the ambient data. From these functions, the modal parameters are extracted using the Ibrahim Time Domain method. The possible influence of the traffic mass load on the bridge...... of the analysis using the Random Decrement technique are compared with results from an analysis based on fast Fourier transformations....

  6. Analysis of cost regression and post-accident absence

    Science.gov (United States)

    Wojciech, Drozd

    2017-07-01

    The article presents issues related to the costs of work safety. It argues that economic aspects cannot be overlooked in effective management of occupational health and safety and that adequate expenditure on safety can bring tangible benefits to a company. A reliable analysis of this problem is essential for describing work safety. The article attempts to carry out such an analysis using the procedures of mathematical statistics [1, 2, 3].

  7. Analysis of household data on influenza epidemic with Bayesian hierarchical model.

    Science.gov (United States)

    Hsu, C Y; Yen, A M F; Chen, L S; Chen, H H

    2015-03-01

    Data used for modelling the household transmission of infectious diseases, such as influenza, have an inherently multilevel and correlated structure, which makes the widely used conventional infectious disease transmission models (including the Greenwood model and the Reed-Frost model) not directly applicable within the context of a household (due to the crowded domestic conditions or socioeconomic status of the household). Thus, at the household level, the effects of individual-level factors, such as vaccination, may be confounded or modified in some way. We proposed a Bayesian hierarchical random-effects (random intercepts and random slopes) model within the context of the generalised linear model to capture heterogeneity and variation at the individual, generation, and household levels. It was applied to empirical surveillance data on the influenza epidemic in Taiwan. The parameters of interest were estimated using the Markov chain Monte Carlo method in conjunction with Bayesian directed acyclic graphical models. Comparisons between models were made using the deviance information criterion. Based on the random-slope Bayesian hierarchical model within the context of the Reed-Frost transmission model, the regression coefficient for the protective effect of vaccination varied statistically significantly from household to household. This heterogeneity was robust to the use of different prior distributions (including non-informative, sceptical, and enthusiastic ones). By integrating out the uncertainty of the parameters of the posterior distribution, the predictive distribution was computed to forecast the number of influenza cases, allowing for the random household effect.

  8. The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard

    This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...... to avoid this problem. The main objective is to investigate the applicability of the nonparametric kernel regression method in applied production analysis. The focus of the empirical analyses included in this thesis is the agricultural sector in Poland. Data on Polish farms are used to investigate...... practically and politically relevant problems and to illustrate how nonparametric regression methods can be used in applied microeconomic production analysis both in panel data and cross-section data settings. The thesis consists of four papers. The first paper addresses problems of parametric...

  9. Measurement and Analysis of Test Suite Volume Metrics for Regression Testing

    Directory of Open Access Journals (Sweden)

    S Raju

    2014-01-01

    Full Text Available Regression testing intends to ensure that a software application works as specified after changes made to it during maintenance. It is an important phase in the software development lifecycle. Regression testing is the re-execution of some subset of test cases that have already been executed; it is an expensive process used to detect defects due to regressions. Regression testing has been used to support software-testing activities and to assure appropriate quality through several versions of a software product during its development and maintenance, thereby assuring the quality of modified applications. In this work, a study and analysis of metrics related to test suite volume was undertaken. It was shown that the software under test needs more test cases after changes are made to it. A comparative analysis was performed to find the change in test suite size before and after the regression test.

  10. Diversity of Xiphinema americanum-group Species and Hierarchical Cluster Analysis of Morphometrics.

    Science.gov (United States)

    Lamberti, F; Ciancio, A

    1993-09-01

    Of the 39 species composing the Xiphinema americanum group, 14 were described originally from North America and two others have been reported from this region. Many species are very similar morphologically and can be distinguished only by a difficult comparison of various combinations of some morphometric characters. Study of morphometrics of 49 populations, including the type populations of the 39 species attributed to this group, by principal component analysis and hierarchical cluster analysis placed the populations into five subgroups, proposed here as the X. brevicolle subgroup (seven species), the X. americanum subgroup (17 species), the X. taylori subgroup (two species), the X. pachtaicum subgroup (eight species), and the X. lambertii subgroup (five species).
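
    To illustrate the kind of hierarchical (agglomerative) clustering used above to place populations into subgroups, here is a minimal single-linkage sketch on made-up two-dimensional "morphometric" vectors; it is not the study's data or software, only the generic algorithm.

```python
import math

# Toy morphometric vectors (e.g., two body measurements) for six hypothetical
# populations; the values are illustrative only, not taken from the study.
pops = {"p1": (2.0, 1.0), "p2": (2.1, 1.1), "p3": (1.9, 0.9),
        "p4": (5.0, 3.0), "p5": (5.2, 3.1), "p6": (4.9, 2.9)}

# Agglomerative single-linkage clustering: repeatedly merge the two clusters
# whose closest members are nearest, until k clusters remain.
def single_linkage(points, k):
    clusters = [[name] for name in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

result = single_linkage(pops, 2)
print(result)  # the two well-separated groups are recovered
```

Cutting the merge sequence at two clusters recovers the two obvious groups; real morphometric studies typically run the same idea on many characters after a principal component step.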

  11. Mapping informative clusters in a hierarchical [corrected] framework of FMRI multivariate analysis.

    Directory of Open Access Journals (Sweden)

    Rui Xu

    Full Text Available Pattern recognition methods have become increasingly popular in fMRI data analysis; they are powerful in discriminating between multi-voxel patterns of brain activity associated with different mental states. However, when they are used in functional brain mapping, the location of discriminative voxels varies significantly, raising difficulties in interpreting the locus of the effect. Here we proposed a hierarchical multivariate framework that maps informative clusters rather than voxels to achieve reliable functional brain mapping without compromising discriminative power. In particular, we first searched for local homogeneous clusters consisting of voxels with similar response profiles. Then, a multi-voxel classifier was built for each cluster to extract discriminative information from the multi-voxel patterns. Finally, through multivariate ranking, outputs from the classifiers served as a multi-cluster pattern to identify informative clusters by examining interactions among clusters. Results from both simulated and real fMRI data demonstrated that this hierarchical approach yields more robust functional brain mapping than traditional voxel-based multivariate methods. In addition, the mapped clusters were highly overlapping for two perceptually equivalent object categories, further confirming the validity of our approach. In short, the hierarchical multivariate framework is suitable for both pattern classification and brain mapping in fMRI studies.

  12. Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis.

    Science.gov (United States)

    Li, Ben; Li, Yunxiao; Qin, Zhaohui S

    2017-06-01

    Modern high-throughput biotechnologies such as microarray and next-generation sequencing produce a massive amount of information for each sample assayed. However, in a typical high-throughput experiment, only a limited amount of data is observed for each individual feature, giving rise to the classical 'large p, small n' problem. The Bayesian hierarchical model, capable of borrowing strength across features within the same dataset, has been recognized as an effective tool in analyzing such data. However, the shrinkage effect, the most prominent feature of hierarchical models, can lead to undesirable over-correction for some features. In this work, we discuss possible causes of the over-correction problem and propose several alternative solutions. Our strategy is rooted in the fact that in the Big Data era, large amounts of historical data are available and should be taken advantage of. Our strategy presents a new framework to enhance the Bayesian hierarchical model. Through simulation and real data analysis, we demonstrate the superior performance of the proposed strategy. The new strategy also enables borrowing information across different platforms, which could be extremely useful with the emergence of new technologies and the accumulation of data from different platforms in the Big Data era. Our method has been implemented in the R package "adaptiveHM", which is freely available from https://github.com/benliemory/adaptiveHM.
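
    The shrinkage effect discussed above can be illustrated with a minimal normal-normal empirical Bayes sketch (this is a generic textbook construction, not the adaptiveHM method); the feature names, the data, and the "known" within-feature variance are all assumptions for illustration.

```python
# Each feature i has a small-sample mean xbar_i; feature-level means are
# modeled as N(mu, tau2) and observations as N(theta_i, sigma2). The posterior
# mean shrinks each xbar_i toward the grand mean mu.
observations = {
    "geneA": [1.1, 0.9, 1.0],
    "geneB": [0.8, 1.2, 1.0],
    "geneC": [5.0, 5.2, 4.8],   # an outlying feature that shrinkage pulls inward
}
sigma2 = 0.04  # assumed known within-feature variance (illustrative)

means = {g: sum(v) / len(v) for g, v in observations.items()}
mu = sum(means.values()) / len(means)                            # grand mean
tau2 = sum((m - mu) ** 2 for m in means.values()) / len(means)   # crude between-feature variance

shrunk = {}
for g, v in observations.items():
    w = tau2 / (tau2 + sigma2 / len(v))   # weight on the feature's own mean
    shrunk[g] = w * means[g] + (1 - w) * mu

# Every shrunk estimate lies between the raw feature mean and the grand mean.
for g in observations:
    lo, hi = sorted((means[g], mu))
    assert lo <= shrunk[g] <= hi
print(shrunk)
```

The outlying feature geneC is pulled toward the grand mean, which is exactly the over-correction risk the paper addresses: when a feature genuinely differs, this borrowing of strength biases its estimate, and incorporating historical data is one way to temper it.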

  13. Modeling type 1 and type 2 diabetes mellitus incidence in youth: an application of Bayesian hierarchical regression for sparse small area data.

    Science.gov (United States)

    Song, Hae-Ryoung; Lawson, Andrew; D'Agostino, Ralph B; Liese, Angela D

    2011-03-01

    Sparse count data violate the assumptions of traditional Poisson models due to an excessive number of zeros, making such data challenging to model. Since aggregation to reduce sparseness may result in biased estimates of risk, solutions need to be found at the level of disaggregated data. We investigated different statistical approaches within a Bayesian hierarchical framework for modeling sparse data without aggregation. We compared our proposed models with the traditional Poisson model and the zero-inflated model on simulated data. We applied the statistical models to type 1 and type 2 diabetes in youth aged 10-19 years, both rare diseases, and compared the models using the inference results and various model diagnostic tools. We showed that one of our proposed models, a sparse Poisson convolution model, performed better than the other models in both the simulation and the application, based on the deviance information criterion (DIC) and the mean squared prediction error.

  14. Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression

    Science.gov (United States)

    Verdoolaege, G.; Shabbir, A.; Hornung, G.

    2016-11-01

    Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression that is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of application of GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standard least squares.

  15. Development of a User Interface for a Regression Analysis Software Tool

    Science.gov (United States)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    An easy-to-use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed so that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface's design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen, which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and the graphical user interface are discussed in order to illustrate the user interface's overall design approach.

  16. The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard

    This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...... function. However, the a priori specification of a functional form involves the risk of choosing one that is not similar to the “true” but unknown relationship between the regressors and the dependent variable. This problem, known as parametric misspecification, can result in biased parameter estimates...... and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences...

  17. Hierarchical Direct Time Integration Method and Adaptive Procedure for Dynamic Analysis

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    A new hierarchical direct time integration method for structural dynamic analysis is developed by using Taylor series expansions in each time step. Very accurate results can be obtained by increasing the order of the Taylor series. Furthermore, the local error can be estimated by simply comparing the solutions obtained by the proposed method with higher-order solutions. This local estimate is then used to develop an adaptive order-control technique. Numerical examples are given to illustrate the performance of the present method and its adaptive procedure.
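
    The order-comparison error estimate described above can be sketched for the scalar test equation y' = -y, whose k-th derivative is simply (-1)^k y. The step size, order, and interval below are arbitrary illustrative choices, not the paper's structural dynamics examples.

```python
import math

# One Taylor-series time step of the given order for y' = -y: builds the
# partial sum of y * sum_k (-h)^k / k!.
def taylor_step(y, h, order):
    term, total = y, y
    for k in range(1, order + 1):
        term *= -h / k          # accumulates (-1)^k * y * h^k / k!
        total += term
    return total

y, h, t_end = 1.0, 0.1, 1.0
order = 4
t = 0.0
max_local_err = 0.0
while t < t_end - 1e-12:
    low = taylor_step(y, h, order)
    high = taylor_step(y, h, order + 1)
    # Local error estimate: difference between adjacent-order solutions.
    max_local_err = max(max_local_err, abs(high - low))
    y = high
    t += h

print(y, math.exp(-1.0), max_local_err)
```

An adaptive scheme would raise the order (or shrink the step) whenever this estimate exceeds a tolerance; here the estimate stays tiny and the computed y(1) agrees closely with the exact e^(-1).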

  18. Automation of control and analysis of execution of official duties and instructions in the hierarchical organization

    Directory of Open Access Journals (Sweden)

    Demchenko A.I.

    2017-01-01

    Full Text Available The article considers the problem of monitoring the execution of official duties by employees. This problem is characteristic of enterprises with a hierarchical management structure. The functions and modes of monitoring are defined, and the types of analysis of staff activities are described. A description is given of the program complex that allows functions and instructions to be distributed among the employees. The developed computer program allows tracking performance and creating reports. It provides separation of access rights and can be operated in both local and large-scale networks.

  19. Proximate analysis, backwards stepwise regression between gross calorific value, ultimate and chemical analysis of wood.

    Science.gov (United States)

    Telmo, C; Lousada, J; Moreira, N

    2010-06-01

    The gross calorific value (GCV) and the proximate, ultimate and chemical analysis of debarked wood in Portugal were studied for future utilization in the wood pellets industry, and the results were compared with CEN/TS 14961. The relationships between GCV and the ultimate and chemical analyses were determined by backward stepwise multiple regression. The comparison between hardwoods and softwoods did not show significant statistical differences in the proximate, ultimate and chemical analyses. Significant statistical differences were found in carbon between National hardwoods and softwoods, and between National and tropical hardwoods in volatile matter, fixed carbon, carbon and oxygen; in the chemical analysis, differences were found in F between National hardwoods and softwoods and in Br between National and tropical hardwoods. GCV was highly positively related to C (0.79***) and negatively to O (-0.71***). The final independent variables of the model were (C, O, S, Zn, Ni, Br), with R^2 = 0.86 and F = 27.68***. Hydrogen did not contribute statistically to the energy content.
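
    A minimal sketch of backward stepwise selection, here using adjusted R^2 as the elimination criterion on invented noise-free data (the paper's exact criterion and data are not reproduced here, so this only illustrates the generic procedure of dropping predictors that do not help the fit).

```python
# y depends on x1 and x2 only; x3 is an irrelevant predictor to be eliminated.
rows = [
    (1.0, 2.0, 0.3), (2.0, 1.0, 0.7), (3.0, 4.0, 0.1), (4.0, 2.0, 0.9),
    (5.0, 5.0, 0.4), (6.0, 3.0, 0.2), (7.0, 6.0, 0.8), (8.0, 4.0, 0.5),
]
y = [2 * x1 - x2 for x1, x2, _ in rows]
names = ["x1", "x2", "x3"]

# Gaussian elimination with partial pivoting for the normal equations.
def solve(A, b):
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Ordinary least squares on a subset of predictors; returns adjusted R^2.
def adj_r2(keep):
    X = [[1.0] + [row[i] for i in keep] for row in rows]
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    resid = [yi - sum(b * xi for b, xi in zip(beta, r)) for r, yi in zip(X, y)]
    sse = sum(e * e for e in resid)
    ybar = sum(y) / len(y)
    sst = sum((yi - ybar) ** 2 for yi in y)
    n = len(y)
    return 1 - (sse / (n - k)) * (n - 1) / sst

# Backward elimination: drop a predictor while doing so does not hurt
# adjusted R^2 (small tolerance guards against floating-point noise).
keep = [0, 1, 2]
improved = True
while improved and len(keep) > 1:
    improved = False
    current = adj_r2(keep)
    for i in list(keep):
        trial = [j for j in keep if j != i]
        if adj_r2(trial) >= current - 1e-9:
            keep, improved = trial, True
            break

print([names[i] for i in keep])
```

On this toy data the procedure correctly eliminates x3 and retains x1 and x2; statistical packages typically use partial F-tests or p-values instead of adjusted R^2, but the elimination loop is the same.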

  20. Hierarchical Classifiers for Multi-Way Sentiment Analysis of Arabic Reviews

    Directory of Open Access Journals (Sweden)

    Mahmoud Al-Ayyoub

    2016-02-01

    Full Text Available Sentiment Analysis (SA) is one of the hottest fields in data mining (DM) and natural language processing (NLP). The goal of SA is to extract the sentiment conveyed in a certain text based on its content. While most current works focus on the simple problem of determining whether the sentiment is positive or negative, Multi-Way Sentiment Analysis (MWSA) focuses on sentiments conveyed through a rating or scoring system (e.g., a 5-star scoring system). In such scoring systems, the sentiments conveyed in two reviews with close scores (such as 4 stars and 5 stars) can be very similar, creating an added challenge compared to traditional SA. One intuitive way of handling this challenge is a divide-and-conquer approach in which the MWSA problem is divided into a set of sub-problems, allowing the use of customized classifiers to differentiate between reviews of close scores. A hierarchical classification structure can be used with this approach, where each node represents a different classification sub-problem and the decision from it may lead to the invocation of another classifier. In this work, we show how this divide-and-conquer hierarchical structure of classifiers can generate better results than existing flat classifiers for the MWSA problem. We focus on the Arabic language for many reasons, such as the importance of this language and the scarcity of prior works and available tools for it. To the best of our knowledge, very few papers have been published on MWSA of Arabic reviews. One notable work is that of Ali and Atiya, in which the authors collected a large-scale Arabic Book Reviews (LABR) dataset and made it publicly available. Unfortunately, the baseline experiments on this dataset had very low accuracy. We present two different hierarchical structures and compare their accuracies with the flat structure using different core classifiers. The comparison is based on standard accuracy measures such as precision and recall in addition to

  1. Detecting overdispersion in count data: A zero-inflated Poisson regression analysis

    Science.gov (United States)

    Afiqah Muhamad Jamil, Siti; Asrul Affendi Abdullah, M.; Kek, Sie Long; Nor, Maria Elena; Mohamed, Maryati; Ismail, Norradihah

    2017-09-01

    This study focuses on analysing count data on butterfly communities in Jasin, Melaka. For a count dependent variable, the Poisson regression model is the benchmark model for regression analysis. Continuing from previous literature that used Poisson regression analysis, this study uses zero-inflated Poisson (ZIP) regression analysis to gain precision in analysing the count data of butterfly communities in Jasin, Melaka. When extra zeros are present, Poisson regression should be abandoned in favour of count data models that can take the extra zeros into account explicitly; by far one of the most popular such models is the ZIP regression model. The data on butterfly communities, referred to as the number of subjects in this study, were collected in Jasin, Melaka and consist of 131 subjects. The data set covers five families of butterfly, which represent the five variables involved in the analysis, namely the types of subjects. The ZIP analysis used the SAS overdispersion procedure for analysing zero values, and the main purpose of continuing the previous study was to compare which model is better when zero values exist in the observed count data. The analysis used the AIC, BIC and the Vuong test at the 5% significance level in order to achieve the objectives. The findings indicate the presence of overdispersion due to the zero values, and that the ZIP regression model is better than the Poisson regression model when zero values exist.
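
    The excess-zeros situation that motivates ZIP over plain Poisson can be illustrated with a short simulation; the mixing probability and Poisson rate below are arbitrary choices, and the comparison simply contrasts the observed zero fraction with the zero probability implied by a Poisson distribution of the same mean. This is a generic sketch, not the study's SAS analysis.

```python
import math
import random

random.seed(0)

# Knuth's method for sampling Poisson(lam) variates.
def poisson(lam):
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

# Zero-inflated Poisson simulation: with probability pi the count is a
# structural zero, otherwise it is Poisson(lam). pi and lam are illustrative
# values, not estimates from the butterfly data.
pi, lam, n = 0.3, 3.0, 5000
counts = [0 if random.random() < pi else poisson(lam) for _ in range(n)]

mean = sum(counts) / n
observed_zero_frac = counts.count(0) / n
# Zero probability implied by a plain Poisson model with the same mean:
poisson_zero_frac = math.exp(-mean)

print(observed_zero_frac, poisson_zero_frac)
```

The observed zero fraction greatly exceeds what a single Poisson with the same mean could produce; this gap is exactly the excess-zeros signal that ZIP models capture with their separate zero-inflation component.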

  2. A primer for biomedical scientists on how to execute model II linear regression analysis.

    Science.gov (United States)

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions.
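
    A minimal sketch of ordinary least products (geometric mean, also called reduced major axis) regression, one standard way of executing a Model II fit: the slope magnitude is sd(y)/sd(x), signed by the correlation. The data are invented, and this is not the smatr, systat or Statistica implementation.

```python
import math

# Made-up paired measurements where both variables carry error.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.8]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sx = math.sqrt(sum((v - mx) ** 2 for v in x) / (n - 1))
sy = math.sqrt(sum((v - my) ** 2 for v in y) / (n - 1))
r = sum((a - mx) * (b - my) for a, b in zip(x, y)) / ((n - 1) * sx * sy)

slope_olp = math.copysign(sy / sx, r)   # Model II (least products) slope
intercept_olp = my - slope_olp * mx
slope_ols = r * sy / sx                 # Model I (least squares) slope for comparison

print(slope_olp, intercept_olp, slope_ols)
```

Because |r| < 1, the ordinary least squares slope is always attenuated toward zero relative to the least products slope, which is why Model I regression is inappropriate when x is measured with error; confidence intervals for the OLP slope are best obtained by bootstrapping, as the abstract notes for smatr.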

  3. Hierarchical modeling for reliability analysis using Markov models. B.S./M.S. Thesis - MIT

    Science.gov (United States)

    Fagundo, Arturo

    1994-01-01

    Markov models represent an extremely attractive tool for the reliability analysis of many systems. However, Markov model state space grows exponentially with the number of components in a given system. Thus, for very large systems Markov modeling techniques alone become intractable in both memory and CPU time. Often a particular subsystem can be found within some larger system where the dependence of the larger system on the subsystem is of a particularly simple form. This simple dependence can be used to decompose such a system into one or more subsystems. A hierarchical technique is presented which can be used to evaluate these subsystems in such a way that their reliabilities can be combined to obtain the reliability for the full system. This hierarchical approach is unique in that it allows the subsystem model to pass multiple aggregate state information to the higher level model, allowing more general systems to be evaluated. Guidelines are developed to assist in the system decomposition. An appropriate method for determining subsystem reliability is also developed. This method gives rise to some interesting numerical issues. Numerical error due to roundoff and integration are discussed at length. Once a decomposition is chosen, the remaining analysis is straightforward but tedious. However, an approach is developed for simplifying the recombination of subsystem reliabilities. Finally, a real world system is used to illustrate the use of this technique in a more practical context.
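
    A toy sketch of the idea, under the simplifying assumption of independent single-component subsystems combined in series (the thesis handles far more general aggregate-state passing between levels):

    ```python
    import math

    def subsystem_reliability(lam, t, steps=10000):
        """One-component Markov model (up -> failed at rate lam): P(up at t),
        integrated with forward Euler; analytically this is exp(-lam * t)."""
        p_up, dt = 1.0, t / steps
        for _ in range(steps):
            p_up -= lam * p_up * dt
        return p_up

    # hierarchical combination: evaluate each subsystem separately, then let
    # the top-level model combine the results (here, a simple series structure)
    r1 = subsystem_reliability(0.001, 100.0)
    r2 = subsystem_reliability(0.002, 100.0)
    r_system = r1 * r2
    ```

    The step count also illustrates the numerical point made above: coarse integration steps introduce error that compounds when subsystem results are recombined.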

  4. Extending hierarchical task analysis to identify cognitive demands and information design requirements.

    Science.gov (United States)

    Phipps, Denham L; Meakin, George H; Beatty, Paul C W

    2011-07-01

    While hierarchical task analysis (HTA) is well established as a general task analysis method, there appears to be a need to make more explicit both the cognitive elements of a task and the design requirements that arise from an analysis. One way of achieving this is to make use of extensions to the standard HTA. The aim of the current study is to evaluate the use of two such extensions--the sub-goal template (SGT) and the skills-rules-knowledge (SRK) framework--to analyse the cognitive activity that takes place during the planning and delivery of anaesthesia. In quantitative terms, the two methods were found to have relatively poor inter-rater reliability; however, qualitative evidence suggests that the two methods were nevertheless of value in generating insights about anaesthetists' information handling and cognitive performance. Implications for the use of an extended HTA to analyse work systems are discussed.

  5. An exploratory analysis of treatment completion and client and organizational factors using hierarchical linear modeling.

    Science.gov (United States)

    Woodward, Albert; Das, Abhik; Raskin, Ira E; Morgan-Lopez, Antonio A

    2006-11-01

    Data from the Alcohol and Drug Services Study (ADSS) are used to analyze the structure and operation of the substance abuse treatment industry in the United States. Published literature contains little systematic empirical analysis of the interaction between organizational characteristics and treatment outcomes. This paper addresses that deficit. It develops and tests a hierarchical linear model (HLM) to address questions about the empirical relationship between treatment inputs (industry costs, types and use of counseling and medical personnel, diagnosis mix, patient demographics, and the nature and level of services used in substance abuse treatment), and patient outcomes (retention and treatment completion rates). The paper adds to the literature by demonstrating a direct and statistically significant link between treatment completion and the organizational and staffing structure of the treatment setting. Related reimbursement issues, questions for future analysis, and limitations of the ADSS for this analysis are discussed.

  6. Two-dimensional finite element neutron diffusion analysis using hierarchic shape functions

    Energy Technology Data Exchange (ETDEWEB)

    Carpenter, D.C.

    1997-04-01

    Recent advances have been made in the use of p-type finite element method (FEM) for structural and fluid dynamics problems that hold promise for reactor physics problems. These advances include using hierarchic shape functions, element-by-element iterative solvers and more powerful mapping techniques. Use of the hierarchic shape functions allows greater flexibility and efficiency in implementing energy-dependent flux expansions and incorporating localized refinement of the solution space. The irregular matrices generated by the p-type FEM can be solved efficiently using element-by-element conjugate gradient iterative solvers. These solvers do not require storage of either the global or local stiffness matrices and can be highly vectorized. Mapping techniques based on blending function interpolation allow exact representation of curved boundaries using coarse element grids. These features were implemented in a developmental two-dimensional neutron diffusion program based on the use of hierarchic shape functions (FEM2DH). Several aspects in the effective use of p-type analysis were explored. Two choices of elemental preconditioning were examined--the proper selection of the polynomial shape functions and the proper number of functions to use. Of the five shape function polynomials tested, the integral Legendre functions were the most effective. The serendipity set of functions is preferable over the full tensor product set. Two global preconditioners were also examined--simple diagonal and incomplete Cholesky. The full effectiveness of the finite element methodology was demonstrated on a two-region, two-group cylindrical problem but solved in the x-y coordinate space, using a non-structured element grid. The exact, analytic eigenvalue solution was achieved with FEM2DH using various combinations of element grids and flux expansions.
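
    A brief sketch of the integral Legendre shape functions favoured above (the normalisation here is arbitrary). The key hierarchic property is that each function of order p >= 2 vanishes at the element endpoints, so raising the polynomial order adds functions without altering the lower-order ones:

    ```python
    def legendre(p, x):
        """Legendre polynomial P_p(x) via the Bonnet recurrence."""
        if p == 0:
            return 1.0
        pm2, pm1 = 1.0, x
        for n in range(1, p):
            pm2, pm1 = pm1, ((2 * n + 1) * x * pm1 - n * pm2) / (n + 1)
        return pm1

    def hierarchic_shape(p, x):
        """Integral-Legendre hierarchic shape function (p >= 2), proportional
        to the integral of P_{p-1}; it vanishes at x = -1 and x = +1."""
        return (legendre(p, x) - legendre(p - 2, x)) / (2.0 * p - 1.0)
    ```

    The endpoint property is what makes localized p-refinement cheap: enriching one element does not disturb the interface degrees of freedom shared with its neighbours.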

  7. A hierarchical model for probabilistic independent component analysis of multi-subject fMRI studies.

    Science.gov (United States)

    Guo, Ying; Tang, Li

    2013-12-01

    An important goal in fMRI studies is to decompose the observed series of brain images to identify and characterize underlying brain functional networks. Independent component analysis (ICA) has been shown to be a powerful computational tool for this purpose. Classic ICA has been successfully applied to single-subject fMRI data. The extension of ICA to group inferences in neuroimaging studies, however, is challenging due to the unavailability of a pre-specified group design matrix. Existing group ICA methods generally concatenate observed fMRI data across subjects on the temporal domain and then decompose multi-subject data in a similar manner to single-subject ICA. The major limitation of existing methods is that they ignore between-subject variability in spatial distributions of brain functional networks in group ICA. In this article, we propose a new hierarchical probabilistic group ICA method to formally model subject-specific effects in both temporal and spatial domains when decomposing multi-subject fMRI data. The proposed method provides model-based estimation of brain functional networks at both the population and subject level. An important advantage of the hierarchical model is that it provides a formal statistical framework to investigate similarities and differences in brain functional networks across subjects, for example, subjects with mental disorders or neurodegenerative diseases such as Parkinson's as compared to normal subjects. We develop an EM algorithm for model estimation where both the E-step and M-step have explicit forms. We compare the performance of the proposed hierarchical model with that of two popular group ICA methods via simulation studies. We illustrate our method with application to an fMRI study of Zen meditation.

  8. Do drug treatment variables predict cognitive performance in multidrug-treated opioid-dependent patients? A regression analysis study

    Directory of Open Access Journals (Sweden)

    Rapeli Pekka

    2012-11-01

    Full Text Available Abstract Background Cognitive deficits and multiple psychoactive drug regimens are both common in patients treated for opioid-dependence. Therefore, we examined whether the cognitive performance of patients in opioid-substitution treatment (OST) is associated with their drug treatment variables. Methods Opioid-dependent patients (N = 104) who were treated either with buprenorphine or methadone (n = 52 in both groups) were given attention, working memory, verbal, and visual memory tests after they had been a minimum of six months in treatment. Group-wise results were analysed by analysis of variance. Predictors of cognitive performance were examined by hierarchical regression analysis. Results Buprenorphine-treated patients performed statistically significantly better in a simple reaction time test than methadone-treated ones. No other significant differences between groups in cognitive performance were found. In each OST drug group, approximately 10% of the attention performance could be predicted by drug treatment variables. Use of benzodiazepine medication predicted about 10% of performance variance in working memory. Treatment with more than one other psychoactive drug (other than an opioid or BZD) and frequent substance abuse during the past month predicted about 20% of verbal memory performance. Conclusions Although this study does not prove a causal relationship between multiple prescription drug use and poor cognitive functioning, the results are relevant for psychosocial recovery, vocational rehabilitation, and psychological treatment of OST patients. Especially for patients with BZD treatment, other treatment options should be actively sought.
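
    Hierarchical regression of the kind used above enters predictor blocks in stages and reads off the increment in explained variance. A self-contained sketch with synthetic data (pure-Python OLS via the normal equations; the variable names are illustrative, not the study's):

    ```python
    def ols_r2(X, y):
        """R-squared of an OLS fit with intercept, via the normal equations."""
        n = len(y)
        Xd = [[1.0] + list(row) for row in X]            # prepend intercept column
        p = len(Xd[0])
        A = [[sum(Xd[i][j] * Xd[i][k] for i in range(n)) for k in range(p)]
             for j in range(p)]                           # X'X
        c = [sum(Xd[i][j] * y[i] for i in range(n)) for j in range(p)]  # X'y
        for j in range(p):                                # Gaussian elimination
            piv = max(range(j, p), key=lambda r: abs(A[r][j]))
            A[j], A[piv] = A[piv], A[j]
            c[j], c[piv] = c[piv], c[j]
            for r in range(j + 1, p):
                f = A[r][j] / A[j][j]
                for k in range(j, p):
                    A[r][k] -= f * A[j][k]
                c[r] -= f * c[j]
        b = [0.0] * p
        for j in reversed(range(p)):                      # back substitution
            b[j] = (c[j] - sum(A[j][k] * b[k] for k in range(j + 1, p))) / A[j][j]
        yhat = [sum(bj * xj for bj, xj in zip(b, row)) for row in Xd]
        ybar = sum(y) / n
        ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
        ss_tot = sum((yi - ybar) ** 2 for yi in y)
        return 1.0 - ss_res / ss_tot

    # step 1: first predictor block only; step 2: add a second block
    x1 = list(range(10))                     # e.g. a treatment-variable score
    x2 = [5, 1, 4, 2, 8, 0, 3, 7, 6, 9]      # e.g. a substance-use score
    y = [2.0 * a + 3.0 * b for a, b in zip(x1, x2)]
    r2_step1 = ols_r2([[a] for a in x1], y)
    r2_step2 = ols_r2([[a, b] for a, b in zip(x1, x2)], y)
    delta_r2 = r2_step2 - r2_step1           # variance explained by the added block
    ```

    The delta-R-squared is the quantity reported in abstracts like the one above ("predicted about 10% of performance variance").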

  9. Advancing the Parameter-elevation Regressions on Independent Slopes Model (PRISM) to Accommodate Atmospheric River Influences Using a Hierarchical Estimation Structure

    Science.gov (United States)

    Hsu, C.; Cifelli, R.; Zamora, R. J.; Schneider, T.

    2014-12-01

    The PRISM monthly climatology has been widely used by various agencies for diverse purposes. In the River Forecast Centers (RFCs), the PRISM monthly climatology is used to support tasks such as QPE, quality control of point precipitation observations, and fine-tuning QPFs. Validation studies by forecasters and researchers have shown that interpolation involving PRISM climatology can effectively reduce the estimation bias for locations where moderate or little orographic phenomena occur. However, many studies have pointed out limitations in the PRISM monthly climatology. These limitations are especially apparent in storm events with fast-moving wet air masses or with storm tracks that differ from climatology. In order to upgrade the PRISM climatology so that it can characterize the climatology of storm events, it is critical to integrate large-scale atmospheric conditions with the original PRISM predictor variables and to simulate them at a temporal resolution higher than monthly. To this end, a simple, flexible, and powerful framework for precipitation estimation modeling that can be applied to very large data sets was developed. In this project, a decision-tree-based estimation structure was developed to perform the aforementioned variable integration work. Three Atmospheric River events (ARs) were selected to explore the hierarchical relationships among these variables and how these relationships shape the event-based precipitation distribution pattern across California. Several atmospheric variables, including vertically Integrated Vapor Transport (IVT), temperature, zonal wind (u), meridional wind (v), and omega (ω), were added to enhance the sophistication of the tree-based structure in estimating precipitation. To develop a direction-based climatology, the directions in which the ARs move over the Pacific Ocean were also calculated and parameterized within the tree estimation structure. The results show that the involvement of the

  10. An Isogeometric Design-through-analysis Methodology based on Adaptive Hierarchical Refinement of NURBS, Immersed Boundary Methods, and T-spline CAD Surfaces

    Science.gov (United States)

    2012-01-22

    ICES REPORT 12-05, January 2012. An Isogeometric Design-through-analysis Methodology based on Adaptive Hierarchical Refinement of NURBS, Immersed Boundary Methods, and T-spline CAD Surfaces. … M.J. Borden, E. Rank, T.J.R. Hughes.

  11. A conditional likelihood approach for regression analysis using biomarkers measured with batch-specific error.

    Science.gov (United States)

    Wang, Ming; Flanders, W Dana; Bostick, Roberd M; Long, Qi

    2012-12-20

    Measurement error is common in epidemiological and biomedical studies. When biomarkers are measured in batches or groups, measurement error is potentially correlated within each batch or group. In regression analysis, most existing methods are not applicable in the presence of batch-specific measurement error in predictors. We propose a robust conditional likelihood approach to account for batch-specific error in predictors when the batch effect is additive and the predominant source of error, which requires no assumptions on the distribution of measurement error. Although a regression model with batch as a categorical covariate yields the same parameter estimates as the proposed conditional likelihood approach for linear regression, this result does not hold in general for all generalized linear models, in particular, logistic regression. Our simulation studies show that the conditional likelihood approach achieves better finite sample performance than the regression calibration approach or a naive approach without adjustment for measurement error. In the case of logistic regression, our proposed approach is shown to also outperform the regression approach with batch as a categorical covariate. In addition, we also examine a 'hybrid' approach combining the conditional likelihood method and the regression calibration method, which is shown in simulations to achieve good performance in the presence of both batch-specific and measurement-specific errors. We illustrate our method by using data from a colorectal adenoma study.
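
    The noted equivalence for linear regression can be seen directly: an additive batch shift cancels under within-batch centering, which is what conditioning on batch achieves. A sketch with hypothetical numbers:

    ```python
    def slope(x, y):
        """OLS slope of y on x."""
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        return (sum((a - mx) * (b - my) for a, b in zip(x, y))
                / sum((a - mx) ** 2 for a in x))

    def centered(values, batch):
        """Subtract the within-batch mean from each value (batches kept in order)."""
        out = []
        for b in sorted(set(batch)):
            grp = [v for v, bb in zip(values, batch) if bb == b]
            m = sum(grp) / len(grp)
            out.extend(v - m for v in grp)
        return out

    # true model y = 2 * x_true; the biomarker is measured with an additive,
    # batch-specific shift (batch 1 reads 5 units high)
    x_true = [1, 2, 3, 4, 5, 6]
    batch = [0, 0, 0, 1, 1, 1]
    x_obs = [x + (5.0 if b == 1 else 0.0) for x, b in zip(x_true, batch)]
    y = [2.0 * x for x in x_true]

    naive = slope(x_obs, y)                                     # biased: 0.8, not 2
    within = slope(centered(x_obs, batch), centered(y, batch))  # recovers 2 exactly
    ```

    As the abstract warns, this clean cancellation is special to the linear model; for logistic regression the conditional likelihood and the batch-covariate fit no longer agree.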

  12. [Tooth decay and associated factors among adolescents in the north of the State of Minas Gerais, Brazil: a hierarchical analysis].

    Science.gov (United States)

    Silveira, Marise Fagundes; Freire, Rafael Silveira; Nepomuceno, Marcela Oliveira; Martins, Andréa Maria Eleutério de Barros Lima; Marcopito, Luiz Francisco

    2015-11-01

    This is a cross-sectional population-based study (n = 763) conducted in the north of the State of Minas Gerais, which aimed to investigate the prevalence of tooth decay among adolescents and to identify its potential determinants. Probability sampling by conglomerates in multiple stages was used. Trained and calibrated professionals performed the data collection by means of intraoral examination and interviews in the previously selected households. In the analysis of the determinant factors for the presence of tooth decay, hierarchical binary logistic regression models were used. The prevalence of tooth decay and of decayed, missing and filled teeth were 71.3%, 36.5%, 55.6% and 16%, respectively. The following averages were observed: DMFT (3.4 teeth), number of decayed (0.8 teeth), restored (2.4 teeth) and missing (0.2 teeth) teeth. The incidence of tooth decay was higher among adolescents who stated they were black/indigenous/brown (OR = 1.76), lived in crowded households (OR = 2.4), did not regularly visit or had never been to a dentist (OR = 1.9), used public or philanthropic services (OR = 1.8), had smoking habits (OR = 4.1), consumed alcohol (OR = 1.8), perceived their oral health negatively (OR = 5.9 and OR = 1.9) and had had toothache in the last six months (OR = 2.0).

  13. Evaluation of syngas production unit cost of bio-gasification facility using regression analysis techniques

    Energy Technology Data Exchange (ETDEWEB)

    Deng, Yangyang; Parajuli, Prem B.

    2011-08-10

    Evaluation of the economic feasibility of a bio-gasification facility requires understanding of its unit cost under different production capacities. The objective of this study was to evaluate the unit cost of syngas production at capacities from 60 through 1800 Nm3/h using an economic model with three regression analysis techniques (simple regression, reciprocal regression, and log-log regression). The preliminary result of this study showed that the reciprocal regression technique gave the best-fit curve between unit cost and production capacity, with a sum of error squares (SES) lower than 0.001 and a coefficient of determination (R2) of 0.996. The regression analysis determined a minimum unit cost of syngas production for micro-scale bio-gasification facilities of $0.052/Nm3, at a capacity of 2,880 Nm3/h. The results of this study suggest that, to reduce cost, facilities should run at a high production capacity. In addition, the contribution of this technique could be a new categorical criterion for evaluating micro-scale bio-gasification facilities from the perspective of economic analysis.
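
    Reciprocal regression of the kind found best above is just OLS after transforming the predictor to 1/capacity. A sketch with hypothetical cost data (the coefficients a = 0.05 and b = 60 are illustrative, not the study's):

    ```python
    def fit_reciprocal(capacity, cost):
        """Fit cost = a + b / capacity by OLS on the transformed predictor 1/capacity."""
        z = [1.0 / c for c in capacity]
        n = len(z)
        mz, my = sum(z) / n, sum(cost) / n
        b = (sum((zi - mz) * (yi - my) for zi, yi in zip(z, cost))
             / sum((zi - mz) ** 2 for zi in z))
        return my - b * mz, b

    # hypothetical unit-cost data generated from a = 0.05, b = 60
    capacity = [60.0, 300.0, 900.0, 1800.0]
    cost = [0.05 + 60.0 / c for c in capacity]
    a, b = fit_reciprocal(capacity, cost)
    ```

    The fitted curve falls steeply at small capacities and flattens toward a, which is why the study concludes that running at high capacity minimises unit cost.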

  14. Regression analysis understanding and building business and economic models using Excel

    CERN Document Server

    Wilson, J Holton

    2012-01-01

    The technique of regression analysis is used so often in business and economics today that an understanding of its use is necessary for almost everyone engaged in the field. This book will teach you the essential elements of building and understanding regression models in a business/economic context in an intuitive manner. The authors take a non-theoretical treatment that is accessible even if you have a limited statistical background. It is specifically designed to teach the correct use of regression, while advising you of its limitations and teaching about common pitfalls. This book describes

  15. [Local Regression Algorithm Based on Net Analyte Signal and Its Application in Near Infrared Spectral Analysis].

    Science.gov (United States)

    Zhang, Hong-guang; Lu, Jian-gang

    2016-02-01

    To overcome the problems of significant differences among samples and nonlinearity between the property and spectra of samples in spectral quantitative analysis, a local regression algorithm is proposed in this paper. In this algorithm, the net analyte signal (NAS) method was first used to obtain the net analyte signal of the calibration samples and unknown samples; then the Euclidean distance between the net analyte signal of each unknown sample and that of the calibration samples was calculated and used as a similarity index. According to the defined similarity index, a local calibration set was individually selected for each unknown sample. Finally, a local PLS regression model was built on each local calibration set for each unknown sample. The proposed method was applied to a set of near infrared spectra of meat samples. The results demonstrate that the prediction precision and model complexity of the proposed method are superior to the global PLS regression method and the conventional local regression algorithm based on spectral Euclidean distance.

  16. Distance Based Root Cause Analysis and Change Impact Analysis of Performance Regressions

    Directory of Open Access Journals (Sweden)

    Junzan Zhou

    2015-01-01

    Full Text Available Performance regression testing is applied to uncover both performance and functional problems of software releases. A performance problem revealed by performance testing can be high response time, low throughput, or even being out of service. Mature performance testing process helps systematically detect software performance problems. However, it is difficult to identify the root cause and evaluate the potential change impact. In this paper, we present an approach leveraging server side logs for identifying root causes of performance problems. Firstly, server side logs are used to recover call tree of each business transaction. We define a novel distance based metric computed from call trees for root cause analysis and apply inverted index from methods to business transactions for change impact analysis. Empirical studies show that our approach can effectively and efficiently help developers diagnose root cause of performance problems.

  17. Microglia Morphological Categorization in a Rat Model of Neuroinflammation by Hierarchical Cluster and Principal Components Analysis

    Science.gov (United States)

    Fernández-Arjona, María del Mar; Grondona, Jesús M.; Granados-Durán, Pablo; Fernández-Llebrez, Pedro; López-Ávalos, María D.

    2017-01-01

    It is known that microglia morphology and function are closely related, but only a few studies have objectively described different morphological subtypes. To address this issue, morphological parameters of microglial cells were analyzed in a rat model of aseptic neuroinflammation. After the injection of a single dose of the enzyme neuraminidase (NA) within the lateral ventricle (LV), an acute inflammatory process occurs. Sections from NA-injected animals and sham controls were immunolabeled with the microglial marker IBA1, which highlights ramifications and features of the cell shape. Using images obtained by section scanning, individual microglial cells were sampled from various regions (septofimbrial nucleus, hippocampus and hypothalamus) at different times post-injection (2, 4 and 12 h). Each cell yielded a set of 15 morphological parameters by means of image analysis software. Five initial parameters (including fractal measures) were statistically different in cells from NA-injected rats (most of them IL-1β positive, i.e., M1-state) compared to those from control animals (none of them IL-1β positive, i.e., surveillant state). However, additional multimodal parameters proved more suitable for hierarchical cluster analysis (HCA). This method pointed to a classification of the microglia population into four clusters. Furthermore, a linear discriminant analysis (LDA) suggested three specific parameters to objectively classify any microglia by a decision tree. In addition, a principal components analysis (PCA) revealed two extra valuable variables that allowed microglia to be further classified into a total of eight sub-clusters or types. The spatio-temporal distribution of these different morphotypes in our rat inflammation model made it possible to relate specific morphotypes to microglial activation status and brain location. An objective method for microglia classification based on morphological parameters is proposed. Main points Microglia undergo a quantifiable
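
    Agglomerative clustering of morphological parameter vectors, as used here, can be sketched naively (single linkage, hypothetical two-group data; real analyses use optimised library routines and full 15-parameter vectors):

    ```python
    def euclid(a, b):
        """Euclidean distance between two parameter vectors."""
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def single_linkage(points, k):
        """Naive agglomerative clustering (single linkage), stopping at k clusters."""
        clusters = [[p] for p in points]
        while len(clusters) > k:
            best = None
            for i in range(len(clusters)):
                for j in range(i + 1, len(clusters)):
                    d = min(euclid(a, b) for a in clusters[i] for b in clusters[j])
                    if best is None or d < best[0]:
                        best = (d, i, j)
            _, i, j = best
            clusters[i] = clusters[i] + clusters[j]   # merge the closest pair
            del clusters[j]
        return clusters

    # two well-separated groups of hypothetical morphology vectors
    pts = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (5.0, 5.1), (5.2, 4.9)]
    groups = single_linkage(pts, 2)
    ```

    Cutting the merge sequence at different heights yields the coarser four-cluster and finer eight-subtype partitions described above.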

  18. Quantile regression for the statistical analysis of immunological data with many non-detects

    OpenAIRE

    Eilers Paul HC; Röder Esther; Savelkoul Huub FJ; van Wijk Roy

    2012-01-01

    Abstract Background Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Methods and results Quantile regression, a genera...

  19. A Hierarchical Allometric Scaling Analysis of Chinese Cities: 1991-2014

    CERN Document Server

    Chen, Yanguang

    2016-01-01

    The law of allometric scaling based on Zipf distributions can be employed to research hierarchies of cities in a geographical region. However, the allometric patterns are easily influenced by random disturbance from the noises in observational data. In theory, both the allometric growth law and Zipf's law are equivalent to hierarchical scaling laws associated with fractal structure. In this paper, the scaling laws of hierarchies with cascade structure are used to study Chinese cities, and the method of R/S analysis is applied to analyzing the change of the allometric scaling exponents. The results show that the hierarchical scaling relations of Chinese cities became clearer and clearer from 1991 to 2014, the global allometric scaling exponent values fluctuated around 0.85, and the local scaling exponent approached 0.85. The Hurst exponent of the allometric parameter change is greater than 1/2, indicating persistence and a long-term memory of urban evolution. The main conclusions can be reached as foll...
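
    The allometric scaling exponent is conventionally estimated by OLS on log-transformed data. A sketch using hypothetical city data constructed to follow y = a * x**0.85 exactly:

    ```python
    import math

    def allometric_exponent(x, y):
        """Estimate b in y = a * x**b by OLS on log-log transformed data."""
        lx = [math.log(v) for v in x]
        ly = [math.log(v) for v in y]
        n = len(lx)
        mx, my = sum(lx) / n, sum(ly) / n
        return (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
                / sum((a - mx) ** 2 for a in lx))

    # hypothetical city measures following y = 2 * x**0.85 exactly
    x = [10.0, 100.0, 1000.0, 10000.0]
    y = [2.0 * v ** 0.85 for v in x]
    b_hat = allometric_exponent(x, y)
    ```

    With noisy observational data the fitted exponent fluctuates, which is the disturbance the paper's R/S analysis is designed to characterise.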

  20. Analysis of the effects of the global financial crisis on the Turkish economy, using hierarchical methods

    Science.gov (United States)

    Kantar, Ersin; Keskin, Mustafa; Deviren, Bayram

    2012-04-01

    We have analyzed the topology of 50 important Turkish companies for the period 2006-2010 using the concept of hierarchical methods (the minimal spanning tree (MST) and hierarchical tree (HT)). We investigated the statistical reliability of links between companies in the MST by using the bootstrap technique. We also used the average linkage cluster analysis (ALCA) technique to observe the cluster structures much better. The MST and HT are known as useful tools to perceive and detect global structure, taxonomy, and hierarchy in financial data. We obtained four clusters of companies according to their proximity. We also observed that the Banks and Holdings cluster always forms in the centre of the MSTs for the periods 2006-2007, 2008, and 2009-2010. The clusters match nicely with their common production activities or their strong interrelationship. The effects of the Automobile sector increased after the global financial crisis due to the temporary incentives provided by the Turkish government. We find that Turkish companies were not very affected by the global financial crisis.

  1. Exploratory regression analysis: a tool for selecting models and determining predictor importance.

    Science.gov (United States)

    Braun, Michael T; Oswald, Frederick L

    2011-06-01

    Linear regression analysis is one of the most important tools in a researcher's toolbox for creating and testing predictive models. Although linear regression analysis indicates how strongly a set of predictor variables, taken together, will predict a relevant criterion (i.e., the multiple R), the analysis cannot indicate which predictors are the most important. Although there is no definitive or unambiguous method for establishing predictor variable importance, there are several accepted methods. This article reviews those methods for establishing predictor importance and provides a program (in Excel) for implementing them (available for direct download at http://dl.dropbox.com/u/2480715/ERA.xlsm?dl=1). The program investigates all 2^p - 1 submodels and produces several indices of predictor importance. This exploratory approach to linear regression, similar to other exploratory data analysis techniques, has the potential to yield both theoretical and practical benefits.
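
    The all-possible-subsets enumeration at the heart of the program is straightforward; a sketch (predictor names are placeholders):

    ```python
    from itertools import combinations

    def all_submodels(predictors):
        """Enumerate all 2**p - 1 non-empty subsets of the predictors."""
        subs = []
        for r in range(1, len(predictors) + 1):
            subs.extend(combinations(predictors, r))
        return subs

    # placeholder predictor names; the program would fit OLS to each subset
    models = all_submodels(["x1", "x2", "x3"])   # 2**3 - 1 = 7 candidate models
    ```

    Fitting every subset and comparing the R-squared values across them is what makes indices such as dominance-style importance measures computable.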

  2. Validation of hierarchical cluster analysis for identification of bacterial species using 42 bacterial isolates

    Science.gov (United States)

    Ghebremedhin, Meron; Yesupriya, Shubha; Luka, Janos; Crane, Nicole J.

    2015-03-01

    Recent studies have demonstrated the potential advantages of the use of Raman spectroscopy in the biomedical field due to its rapidity and noninvasive nature. In this study, Raman spectroscopy is applied as a method for differentiating between bacterial isolates by Gram status and genus. We created models for identifying 28 bacterial isolates using spectra collected with a 785 nm laser excitation Raman spectroscopic system. In order to investigate the groupings of these samples, partial least squares discriminant analysis (PLSDA) and hierarchical cluster analysis (HCA) were implemented. In addition, cluster analyses of the isolates were performed using various data types consisting of biochemical tests, gene sequence alignment, high resolution melt (HRM) analysis and antimicrobial susceptibility tests of minimum inhibitory concentration (MIC) and degree of antimicrobial resistance (SIR). In order to evaluate the ability of these models to correctly classify bacterial isolates using solely Raman spectroscopic data, a set of 14 validation samples was tested using the PLSDA models and consequently the HCA models. External cluster evaluation criteria of purity and Rand index were calculated at different taxonomic levels to compare the performance of clustering using Raman spectra as well as the other datasets. Results showed that Raman spectra performed comparably to, and in some cases better than, the other data types, with Rand index and purity values up to 0.933 and 0.947, respectively. This study clearly demonstrates that the discrimination of bacterial species using Raman spectroscopic data and hierarchical cluster analysis is possible and has the potential to be a powerful point-of-care tool in clinical settings.

  3. Numerical analysis on mechanical behaviors of hierarchical cellular structures with negative Poisson’s ratio

    Science.gov (United States)

    Li, Dong; Yin, Jianhua; Dong, Liang; Lakes, Roderic S.

    2017-02-01

    Two-dimensional hierarchical re-entrant honeycomb structures were designed and the mechanical behaviors of the structures were studied using a finite element method. Hierarchical re-entrant structure of order n (n ≥ 1) was constructed by replacing each vertex of a lower order (n - 1) hierarchical re-entrant structure with a smaller re-entrant hexagon with identical strut aspect ratio. The Poisson’s ratio and energy absorption capacity of re-entrant structures of different hierarchical orders were studied under different compression velocities. The results showed that the Poisson’s ratio of the first and second order hierarchical structures can reach -1.36 and -1.33 with appropriate aspect ratio, 13.8% and 12.1% lower than that of the zeroth order hierarchical structure. The energy absorption capacity of the three models increased with an increasing compression velocity; the second order hierarchical structure exhibited the highest rate of increase in energy absorption capacity with an increasing compression velocity. The plateau stresses of the first and second order hierarchical structures were slightly lower than that of the zeroth order hierarchical structure; however the second order hierarchical structure exhibited the highest energy absorption capacity at high compression velocity (60 m s⁻¹).

  4. Investigation of the degree of organisational influence on patient experience scores in acute medical admission units in all acute hospitals in England using multilevel hierarchical regression modelling

    Science.gov (United States)

    Sullivan, Paul

    2017-01-01

    Objectives Previous studies found that hospital and specialty have limited influence on patient experience scores, and patient level factors are more important. This could be due to heterogeneity of experience delivery across subunits within organisations. We aimed to determine whether organisation level factors have greater impact if scores for the same subspecialty microsystem are analysed in each hospital. Setting Acute medical admission units in all NHS Acute Trusts in England. Participants We analysed patient experience data from the English Adult Inpatient Survey, which is administered to 850 patients annually in each acute NHS Trust in England. We selected all 8753 patients who returned the survey and who were emergency medical admissions and stayed in their admission unit for 1–2 nights, so as to isolate the experience delivered during the acute admission process. Primary and secondary outcome measures We used multilevel logistic regression to determine the apportioned influence of host organisation and of organisation level factors (size and teaching status), and patient level factors (demographics, presence of long-term conditions and disabilities). We selected ‘being treated with respect and dignity’ and ‘pain control’ as primary outcome parameters. Other Picker Domain question scores were analysed as secondary parameters. Results The proportion of overall variance attributable at organisational level was small: 0.5% (NS) for respect and dignity, 0.4% (NS) for pain control. Long-standing conditions and consequent disabilities were associated with low scores. Other item scores also showed that most influence was from patient level factors. Conclusions When a single microsystem, the acute medical admission process, is isolated, variance in experience scores is mainly explainable by patient level factors with limited organisational level influence. This has implications for the use of generic patient experience surveys for comparison between
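
    For a multilevel logistic model, the organisation-level share of variance is usually computed with the latent-variable approximation, which takes the level-1 residual variance to be pi^2 / 3. A sketch (the between-organisation variance value is hypothetical, chosen to give an ICC near the 0.5% reported):

    ```python
    import math

    def latent_icc(var_between):
        """Organisation-level share of variance in a multilevel logistic model,
        using the latent-variable approximation (level-1 variance = pi**2 / 3)."""
        return var_between / (var_between + math.pi ** 2 / 3.0)

    # hypothetical between-Trust variance on the logit scale
    icc = latent_icc(0.016)
    ```

    Because the logistic residual variance is fixed at about 3.29, even a modest cluster variance on the logit scale translates into only a fraction of a percent of total variance, matching the pattern reported above.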

  5. Prediction of hearing outcomes by multiple regression analysis in patients with idiopathic sudden sensorineural hearing loss.

    Science.gov (United States)

    Suzuki, Hideaki; Tabata, Takahisa; Koizumi, Hiroki; Hohchi, Nobusuke; Takeuchi, Shoko; Kitamura, Takuro; Fujino, Yoshihisa; Ohbuchi, Toyoaki

    2014-12-01

    This study aimed to create a multiple regression model for predicting hearing outcomes of idiopathic sudden sensorineural hearing loss (ISSNHL). The participants were 205 consecutive patients (205 ears) with ISSNHL (hearing level ≥ 40 dB, interval between onset and treatment ≤ 30 days). They received systemic steroid administration combined with intratympanic steroid injection. Data were examined by simple and multiple regression analyses. Three hearing indices (percentage hearing improvement, hearing gain, and posttreatment hearing level [HLpost]) and 7 prognostic factors (age, days from onset to treatment, initial hearing level, initial hearing level at low frequencies, initial hearing level at high frequencies, presence of vertigo, and contralateral hearing level) were included in the multiple regression analysis as dependent and explanatory variables, respectively. In the simple regression analysis, the percentage hearing improvement, hearing gain, and HLpost showed significant correlation with 2, 5, and 6 of the 7 prognostic factors, respectively. The multiple correlation coefficients were 0.396, 0.503, and 0.714 for the percentage hearing improvement, hearing gain, and HLpost, respectively. Predicted values of HLpost calculated by the multiple regression equation were reliable with 70% probability with a 40-dB-width prediction interval. Prediction of HLpost by the multiple regression model may be useful to estimate the hearing prognosis of ISSNHL. © The Author(s) 2014.
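
A multiple regression model of this kind can be sketched with ordinary least squares; the data and coefficients below are synthetic placeholders, not the patients' data or the authors' fitted equation:

```python
import numpy as np

# Synthetic illustration: predict a posttreatment hearing level from
# two prognostic factors via OLS (np.linalg.lstsq). All numbers are
# invented for the sketch.
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(20, 80, n)
initial_level = rng.uniform(40, 100, n)
y = 5.0 + 0.1 * age + 0.6 * initial_level + rng.normal(0, 2, n)

X = np.column_stack([np.ones(n), age, initial_level])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

resid = y - X @ beta
dof = n - X.shape[1]
sigma = float(np.sqrt(resid @ resid / dof))  # residual standard error
print(np.round(beta, 2), round(sigma, 2))
```

The residual standard error is what widens the prediction interval around a predicted value, analogous to the 40-dB-width interval quoted in the abstract.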

  6. Modeling of retardance in ferrofluid with Taguchi-based multiple regression analysis

    Science.gov (United States)

    Lin, Jing-Fung; Wu, Jyh-Shyang; Sheu, Jer-Jia

    2015-03-01

    The citric acid (CA) coated Fe3O4 ferrofluids are prepared by a co-precipitation method and the magneto-optical retardance property is measured by a Stokes polarimeter. Optimization and multiple regression of retardance in ferrofluids are carried out by combining the Taguchi method with Excel. From the nine tests over four parameters, including pH of suspension, molar ratio of CA to Fe3O4, volume of CA, and coating temperature, the ranking of parameter influence and the optimal parameter combination are found. Multiple regression analysis and an F-test on the significance of the regression equation are performed. The model F value is much larger than Fcritical, with significance level P < 0.0001, so it can be concluded that the regression model has statistically significant predictive ability. Substituting the optimal combination into the equation, the retardance is obtained as 32.703°, higher than the highest value in the tests by 11.4%.
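
The overall F-test mentioned above compares explained to residual variance of the fitted regression. A generic sketch (illustrative data, not the ferrofluid measurements):

```python
import numpy as np

def regression_f_stat(X, y):
    """Overall F statistic for an OLS fit: mean square due to
    regression over mean square error. X excludes the intercept."""
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    yhat = Xd @ beta
    ssr = np.sum((yhat - y.mean()) ** 2)   # explained sum of squares
    sse = np.sum((y - yhat) ** 2)          # residual sum of squares
    return (ssr / p) / (sse / (n - p - 1))

# Nine runs (as in an L9 orthogonal-array design) with one factor;
# responses are invented and nearly linear, so F is very large.
x = np.arange(9, dtype=float).reshape(-1, 1)
y = 30.0 + 0.4 * x.ravel() + np.array(
    [.1, -.1, .05, -.05, .1, -.1, .05, -.05, 0.])
print(regression_f_stat(x, y))
```

A model F value far above the critical value, as in the abstract, corresponds to a very small p-value for the regression as a whole.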

  7. Assessment of Differential Item Functioning in Health-Related Outcomes: A Simulation and Empirical Analysis with Hierarchical Polytomous Data

    Directory of Open Access Journals (Sweden)

    Zahra Sharafi

    2017-01-01

    Background. The purpose of this study was to evaluate the effectiveness of two methods of detecting differential item functioning (DIF) in the presence of multilevel data and polytomously scored items. The assessment of DIF with multilevel data (e.g., patients nested within hospitals, hospitals nested within districts) from large-scale assessment programs has received considerable attention but very few studies evaluated the effect of the hierarchical structure of data on DIF detection for polytomously scored items. Methods. The ordinal logistic regression (OLR) and hierarchical ordinal logistic regression (HOLR) were utilized to assess DIF in simulated and real multilevel polytomous data. Six factors (DIF magnitude, grouping variable, intraclass correlation coefficient, number of clusters, number of participants per cluster, and item discrimination parameter) with a fully crossed design were considered in the simulation study. Furthermore, data of the Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 collected from 576 healthy school children were analyzed. Results. Overall, results indicate that both methods performed equivalently in terms of controlling Type I error and detection power rates. Conclusions. The current study showed negligible difference between OLR and HOLR in detecting DIF with polytomously scored items in a hierarchical structure. Implications and considerations while analyzing real data were also discussed.

  8. Applied Bayesian Hierarchical Methods

    CERN Document Server

    Congdon, Peter D

    2010-01-01

    Bayesian methods facilitate the analysis of complex models and data structures. Emphasizing data applications, alternative modeling specifications, and computer implementation, this book provides a practical overview of methods for Bayesian analysis of hierarchical models.

  9. Trend Analysis of Cancer Mortality and Incidence in Panama, Using Joinpoint Regression Analysis

    Science.gov (United States)

    Politis, Michael; Higuera, Gladys; Chang, Lissette Raquel; Gomez, Beatriz; Bares, Juan; Motta, Jorge

    2015-01-01

    Abstract Cancer is one of the leading causes of death worldwide and its incidence is expected to increase in the future. In Panama, cancer is also one of the leading causes of death. In 1964, a nationwide cancer registry was started and it was restructured and improved in 2012. The aim of this study is to utilize Joinpoint regression analysis to study the trends of the incidence and mortality of cancer in Panama in the last decade. Cancer mortality was estimated from the Panamanian National Institute of Census and Statistics Registry for the period 2001 to 2011. Cancer incidence was estimated from the Panamanian National Cancer Registry for the period 2000 to 2009. The Joinpoint Regression Analysis program, version 4.0.4, was used to calculate trends by age-adjusted incidence and mortality rates for selected cancers. Overall, the trend of age-adjusted cancer mortality in Panama has declined over the last 10 years (−1.12% per year). The cancers for which there was a significant increase in the trend of mortality were female breast cancer and ovarian cancer; while the highest increases in incidence were shown for breast cancer, liver cancer, and prostate cancer. Significant decrease in the trend of mortality was evidenced for the following: prostate cancer, lung and bronchus cancer, and cervical cancer; with respect to incidence, only oral and pharynx cancer in both sexes had a significant decrease. Some cancers showed no significant trends in incidence or mortality. This study reveals contrasting trends in cancer incidence and mortality in Panama in the last decade. Although Panama is considered an upper middle income nation, this study demonstrates that some cancer mortality trends, like the ones seen in cervical and lung cancer, behave similarly to the ones seen in high income countries. In contrast, other types, like breast cancer, follow a pattern seen in countries undergoing a transition to a developed economy with its associated lifestyle, nutrition, and
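
Joinpoint regression fits piecewise-linear trends whose breakpoints are chosen to minimize residual error. A toy grid-search sketch of one joinpoint (synthetic rates, not the Panama registry data):

```python
import numpy as np

def best_joinpoint(x, y, candidates):
    """Fit a continuous two-segment linear trend for each candidate
    breakpoint k (design: 1, x, max(x - k, 0)) and keep the k with
    the smallest residual sum of squares."""
    best = None
    for k in candidates:
        X = np.column_stack([np.ones_like(x), x, np.maximum(x - k, 0.0)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = float(np.sum((y - X @ beta) ** 2))
        if best is None or sse < best[1]:
            best = (k, sse)
    return best

# Synthetic trend: a rate rises until year 5, then declines.
x = np.arange(11, dtype=float)
y = np.where(x <= 5, 10.0 + x, 15.0 - (x - 5))
k, sse = best_joinpoint(x, y, candidates=range(1, 10))
print(k, sse)
```

Production tools such as the Joinpoint program add permutation tests to decide how many joinpoints are statistically justified; this sketch only shows the core segmented fit.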

  10. Econometric analysis of realized covariation: high frequency based covariance, regression, and correlation in financial economics

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Shephard, N.

    2004-01-01

    This paper analyses multivariate high frequency financial data using realized covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis, and covariance. It will be based on a fixed interval of time (e.g., a day or week), allowing the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions, and covariances change through time. In particular we provide confidence intervals for each of these quantities.
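
The realized covariance over a fixed interval is the sum of outer products of the intra-period return vectors. A minimal sketch with invented returns:

```python
import numpy as np

def realized_covariance(returns: np.ndarray) -> np.ndarray:
    """Realized covariance over one interval: the sum of outer
    products of intra-period return vectors (rows of `returns`)."""
    return returns.T @ returns

# Two assets observed over four intra-day intervals (toy numbers).
r = np.array([[ 0.01,  0.02],
              [-0.01,  0.00],
              [ 0.02,  0.01],
              [ 0.00, -0.01]])
rc = realized_covariance(r)
corr = float(rc[0, 1] / np.sqrt(rc[0, 0] * rc[1, 1]))
print(rc)
print(corr)
```

The off-diagonal entry divided by the square roots of the diagonal entries gives the realized correlation, one of the quantities for which the paper derives confidence intervals.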

  11. Analysis of Functional Data with Focus on Multinomial Regression and Multilevel Data

    DEFF Research Database (Denmark)

    Mousavi, Seyed Nourollah

    Functional data analysis (FDA) is a fast growing area in statistical research with an increasingly diverse range of applications from economics, medicine, agriculture, chemometrics, etc. Functional regression is an area of FDA which has received the most attention both in aspects of application and methodological development...

  12. Factor analysis and multiple regression between topography and precipitation on Jeju Island, Korea

    Science.gov (United States)

    Um, Myoung-Jin; Yun, Hyeseon; Jeong, Chang-Sam; Heo, Jun-Haeng

    2011-11-01

    In this study, new factors that influence precipitation were extracted from geographic variables using factor analysis, which allow for an accurate estimation of orographic precipitation. Correlation analysis was also used to examine the relationship between nine topographic variables from digital elevation models (DEMs) and the precipitation in Jeju Island. In addition, a spatial analysis was performed in order to verify the validity of the regression model. From the results of the correlation analysis, it was found that all of the topographic variables had a positive correlation with the precipitation. The relations between the variables also changed in accordance with a change in the precipitation duration. However, upon examining the correlation matrix, no significant relationship between the latitude and the aspect was found. According to the factor analysis, eight topographic variables (latitude being the exception) were found to have a direct influence on the precipitation. Three factors were then extracted from the eight topographic variables. By directly comparing the multiple regression model with the factors (model 1) to the multiple regression model with the topographic variables (model 3), it was found that model 1 did not violate the limits of statistical significance and multicollinearity. As such, model 1 was considered to be appropriate for estimating the precipitation when taking into account the topography. In the study of model 1, the multiple regression model using factor analysis was found to be the best method for estimating the orographic precipitation on Jeju Island.
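
Factor extraction of this kind typically starts from an eigendecomposition of the correlation matrix of the variables, retaining components with large eigenvalues. A sketch with an invented correlation matrix (not the Jeju data):

```python
import numpy as np

# Invented correlation matrix for three topographic variables
# (e.g. elevation, slope, distance to coast).
corr = np.array([[1.00, 0.90, 0.80],
                 [0.90, 1.00, 0.85],
                 [0.80, 0.85, 1.00]])

eigvals, eigvecs = np.linalg.eigh(corr)      # ascending order
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()          # share of variance per factor
print(np.round(explained, 3))
```

Using the resulting factor scores as regressors (as in "model 1" above) removes the multicollinearity present among the raw topographic variables, since the factors are mutually orthogonal by construction.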

  13. Predicting the decision to pursue mediation in civil disputes: a hierarchical classes analysis.

    Science.gov (United States)

    Reich, Warren A; Kressel, Kenneth; Scanlon, Kathleen M; Weiner, Gary A

    2007-11-01

    Clients (N = 185) involved in civil court cases completed the CPR Institute's Mediation Screen, which is designed to assist in making a decision about pursuing mediation. The authors modeled data using hierarchical classes analysis (HICLAS), a clustering algorithm that places clients into 1 set of classes and CPRMS items into another set of classes. HICLAS then links the sets of classes so that any class of clients can be identified in terms of the classes of items they endorsed. HICLAS-derived item classes reflected 2 underlying themes: (a) suitability of the dispute for a problem-solving process and (b) potential benefits of mediation. All clients who perceived that mediation would be beneficial also believed that the context of their conflict was favorable to mediation; however, not all clients who saw a favorable context believed they would benefit from mediation. The majority of clients who agreed to pursue mediation endorsed items reflecting both contextual suitability and perceived benefits of mediation.

  14. Implementation of Hierarchical Task Analysis for User Interface Design in Drawing Application for Early Childhood Education

    Directory of Open Access Journals (Sweden)

    Mira Kania Sabariah

    2016-05-01

    Drawing is an important lesson in early childhood education, full of stimulation for the process of children's growth and development, and it helps train fine motor skills. Many applications, including interactive learning applications, can be used for such learning. Observations showed that the experiences offered by existing applications are very diverse and have not yet been able to represent the learning model and characteristics of early childhood (4–6 years). Based on these results, the Hierarchical Task Analysis method generated a list of tasks that must be carried out in designing a user interface that represents the user experience in drawing instruction. Evaluation with the Heuristic Evaluation method then showed that the usability of the resulting model reached a very good level of understanding and that the model can be further refined.

  15. Associations among attachment, sexuality, and marital satisfaction in adult Chilean couples: a linear hierarchical models analysis.

    Science.gov (United States)

    Heresi Milad, Eliana; Rivera Ottenberger, Diana; Huepe Artigas, David

    2014-01-01

    This study aimed to explore the associations among attachment system type, sexual satisfaction, and marital satisfaction in adult couples in stable relationships. Participants were 294 couples between the ages of 20 and 70 years who answered self-administered questionnaires. Hierarchical linear modeling revealed that the anxiety and avoidance, sexual satisfaction, and marital satisfaction dimensions were closely related. Specifically, the avoidance dimension, but not the anxiety dimension, corresponded to lower levels of sexual and marital satisfaction. Moreover, for the sexual satisfaction variable, an interaction effect was observed between the gender of the actor and avoidance of the partner, which was observed only in men. In the marital satisfaction dimension, effects were apparent only at the individual level; a positive relation was found between the number of years spent living together and greater contentment with the relationship. These results confirm the hypothetical association between attachment and sexual and marital satisfaction and demonstrate the relevance of methodologies when the unit of analysis is the couple.

  16. Automatic Contrast Enhancement of Brain MR Images Using Hierarchical Correlation Histogram Analysis.

    Science.gov (United States)

    Chen, Chiao-Min; Chen, Chih-Cheng; Wu, Ming-Chi; Horng, Gwoboa; Wu, Hsien-Chu; Hsueh, Shih-Hua; Ho, His-Yun

    Parkinson's disease is a progressive neurodegenerative disorder that has a higher probability of occurrence in middle-aged and older adults than in the young. With the use of a computer-aided diagnosis (CAD) system, abnormal cell regions can be identified, and this identification can help medical personnel to evaluate the chance of disease. This study proposes a hierarchical correlation histogram analysis based on the grayscale distribution degree of pixel intensity, which constructs a correlation histogram and improves adaptive contrast enhancement for specific objects. The proposed method produces significant results during contrast enhancement preprocessing and facilitates subsequent CAD processes, thereby reducing recognition time and improving accuracy. The experimental results show that the proposed method is superior to existing methods according to two quantitative image-quality measures, PSNR and average gradient. Furthermore, the edge information pertaining to specific cells can effectively increase the accuracy of the results.
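
For context, the baseline that such adaptive methods improve on is classical global histogram equalization, which maps each intensity through the normalized cumulative histogram. This is the textbook method, not the authors' hierarchical correlation histogram:

```python
import numpy as np

def equalize(img: np.ndarray) -> np.ndarray:
    """Global histogram equalization for an 8-bit grayscale image.
    Assumes the image is not constant (otherwise the denominator
    would be zero)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                # first occupied bin
    lut = np.round((cdf - cdf_min) / (img.size - cdf_min) * 255)
    return lut.astype(np.uint8)[img]

# A tiny image whose histogram is already uniform maps to itself.
img = np.array([[0, 0, 128], [128, 255, 255]], dtype=np.uint8)
print(equalize(img))
```

Adaptive variants differ by computing such mappings locally or, as in the paper, by weighting the histogram with correlation information rather than raw counts.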

  17. Applying of hierarchical clustering to analysis of protein patterns in the human cancer-associated liver.

    Directory of Open Access Journals (Sweden)

    Natalia A Petushkova

    There are two ways that statistical methods can learn from biomedical data. One way is to learn classifiers to identify diseases and to predict outcomes using a training dataset with established diagnosis for each sample. When the training dataset is not available the task can be to mine for the presence of meaningful groups (clusters) of samples and to explore the underlying data structure (unsupervised learning). We investigated the proteomic profiles of the cytosolic fraction of human liver samples using two-dimensional electrophoresis (2DE). Samples were resected upon surgical treatment of hepatic metastases in colorectal cancer. Unsupervised hierarchical clustering of 2DE gel images (n = 18) revealed a pair of clusters, containing 11 and 7 samples. Previously we used the same specimens to measure biochemical profiles based on cytochrome P450-dependent enzymatic activities and also found that samples were clearly divided into two well-separated groups by cluster analysis. It turned out that the groups by enzyme activity almost perfectly match the groups identified from proteomic data. Of the 271 reproducible spots on our 2DE gels, we selected 15 to distinguish the human liver cytosolic clusters. Using MALDI-TOF peptide mass fingerprinting, we identified 12 proteins for the selected spots, including known cancer-associated species. Our results highlight the importance of hierarchical cluster analysis of proteomic data, and showed concordance between results of biochemical and proteomic approaches. Grouping of the human liver samples and/or patients into differing clusters may provide insights into possible molecular mechanisms of drug metabolism and creates a rationale for personalized treatment.
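
Agglomerative hierarchical clustering of the kind used on the 2DE gel images can be sketched naively as repeated merging of the two nearest clusters; the feature vectors below are toy numbers, not spot intensities:

```python
import numpy as np

def single_linkage(points: np.ndarray, n_clusters: int):
    """Naive agglomerative clustering: repeatedly merge the two
    clusters whose closest members are nearest (single linkage)
    until `n_clusters` remain. Returns a list of index lists."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(np.linalg.norm(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return clusters

# Toy spot-intensity profiles forming two well-separated groups.
pts = np.array([[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
                [5.0, 5.1], [5.1, 5.0]])
print(sorted(sorted(c) for c in single_linkage(pts, 2)))
```

Library implementations (e.g. `scipy.cluster.hierarchy`) additionally build the full dendrogram and offer other linkage criteria; this sketch just shows the merging logic.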

  18. Analysing count data of Butterflies communities in Jasin, Melaka: A Poisson regression analysis

    Science.gov (United States)

    Afiqah Muhamad Jamil, Siti; Asrul Affendi Abdullah, M.; Kek, Sie Long; Nor, Maria Elena; Mohamed, Maryati; Ismail, Norradihah

    2017-09-01

    Count outcomes are typically skewed toward the right and often characterized by a large number of zeros. The butterfly community data were collected in Jasin, Melaka and consist of 131 subject visits. In this paper, Poisson regression analysis is applied to these count data, as it is better suited to the counting process. This research paper analyses count data with zero observations from an ecological survey of butterfly communities in Jasin, Melaka by using Poisson regression analysis. Software for Poisson regression is readily available and it is becoming more widely used in many fields of research; here the data were analysed using SAS software. The purpose of the analysis was to provide a framework for identifying these concerns. Besides, by using Poisson regression analysis, the study determines the fitness of the data for assessing the reliability of using count data. The findings indicate that the highest and lowest numbers of subjects come from the third family (Nymphalidae) and the fifth family (Hesperiidae), and the Poisson distribution seems to fit the zero values.
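
Poisson regression models the log of the expected count as linear in the covariates and is usually fitted by iteratively reweighted least squares. A minimal sketch (synthetic counts, not the butterfly data):

```python
import numpy as np

def poisson_irls(X, y, iters=50):
    """Fit a Poisson log-linear model by iteratively reweighted
    least squares (Newton's method for the canonical log link)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu            # working response
        W = mu                             # IRLS weights
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta

# Noise-free check: responses generated exactly from the model,
# so the fitted coefficients recover the true ones.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
y = np.exp(0.5 + 0.3 * x)
print(np.round(poisson_irls(X, y), 4))
```

With excess zeros, as the abstract hints, zero-inflated or negative binomial extensions of this fit are the usual next step.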

  19. Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data

    Science.gov (United States)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
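
One standard metric for detecting the near-linear dependencies between regression model terms mentioned above is the variance inflation factor (VIF); this generic sketch is not the paper's algorithm:

```python
import numpy as np

def vif(X: np.ndarray) -> np.ndarray:
    """Variance inflation factor for each column of X: regress it on
    the remaining columns (plus intercept) and return 1 / (1 - R^2)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        yj = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, yj, rcond=None)
        resid = yj - others @ beta
        r2 = 1.0 - resid @ resid / np.sum((yj - yj.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

# Orthogonal regressors -> VIFs of 1 (no inflation).
X = np.array([[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])
print(np.round(vif(X), 3))
```

Large VIFs flag candidate terms to drop before checking the statistical significance of the remaining terms.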

  20. Hierarchical statistical analysis of complex analog and mixed-signal systems

    Science.gov (United States)

    Webb, Matthew; Tang, Hua

    2014-12-01

    With increasing process parameter variations in nanometre regime, circuits and systems encounter significant performance variations and therefore statistical analysis has become increasingly important. For complex analog and mixed-signal circuits and systems, efficient yet accurate statistical analysis has been a challenge mainly due to significant simulation and modelling time. In the past years, there have been various approaches proposed for statistical analysis of analog and mixed-signal circuits. A recent work is reported to address statistical analysis for continuous-time Delta-Sigma modulators. In this article, we generalise that method and present a hierarchical method for efficient statistical analysis of complex analog and mixed-signal circuits while maintaining reasonable accuracy. At circuit level, we use the response surface modelling method to extract quadratic models of circuit-level performance parameters in terms of process parameters. Then at system level, we use behavioural models and apply the Monte-Carlo method for statistical evaluation of system performance parameters. We illustrate and validate the method on a continuous-time Delta-Sigma modulator and an analog filter.
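
The two-level flow described above can be sketched as a quadratic response-surface fit at circuit level followed by Monte-Carlo sampling at system level; all numbers below are invented, not measurements of a real modulator:

```python
import numpy as np

rng = np.random.default_rng(7)

# Circuit level: fit a quadratic response surface of a performance
# parameter vs. one process parameter from a few simulated corners.
p = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
perf = 60.0 - 1.5 * p - 0.8 * p ** 2       # hypothetical SNDR in dB
coeffs = np.polyfit(p, perf, deg=2)        # highest power first

# System level: Monte-Carlo over the process parameter using the
# cheap surrogate model instead of full circuit simulation.
samples = rng.normal(0.0, 0.5, 100_000)
mc = np.polyval(coeffs, samples)
print(np.round(coeffs, 6))
print(float(mc.mean()), float(mc.std()))
```

The saving comes from evaluating the fitted polynomial per Monte-Carlo sample rather than re-simulating the circuit.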

  1. Regression analysis with missing data and unknown colored noise: application to the MICROSCOPE space mission

    CERN Document Server

    Baghi, Q; Bergé, J; Christophe, B; Touboul, P; Rodrigues, M

    2015-01-01

    The analysis of physical measurements often copes with highly correlated noises and interruptions caused by outliers, saturation events or transmission losses. We assess the impact of missing data on the performance of linear regression analysis involving the fit of modeled or measured time series. We show that data gaps can significantly alter the precision of the regression parameter estimation in the presence of colored noise, due to the frequency leakage of the noise power. We present a regression method which cancels this effect and estimates the parameters of interest with a precision comparable to the complete data case, even if the noise power spectral density (PSD) is not known a priori. The method is based on an autoregressive (AR) fit of the noise, which allows us to build an approximate generalized least squares estimator approaching the minimal variance bound. The method, which can be applied to any similar data processing, is tested on simulated measurements of the MICROSCOPE space mission, whos...
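
An autoregressive fit of the noise, as used above to whiten the residuals, can be estimated for the AR(1) case by regressing each sample on its predecessor; a minimal sketch with a noise-free decay:

```python
import numpy as np

def ar1_coefficient(x: np.ndarray) -> float:
    """Least-squares estimate of the AR(1) coefficient:
    regress x[t] on x[t-1] (no intercept)."""
    return float(x[1:] @ x[:-1] / (x[:-1] @ x[:-1]))

# Noise-free geometric decay: the estimate recovers phi exactly.
phi = 0.5
x = phi ** np.arange(8)
print(ar1_coefficient(x))
```

Higher-order AR fits generalize this to a small linear system (the Yule-Walker equations), from which an approximate noise PSD and a generalized least squares weighting can be built.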

  2. Spline Nonparametric Regression Analysis of Stress-Strain Curve of Confined Concrete

    Directory of Open Access Journals (Sweden)

    Tavio Tavio

    2008-01-01

    Due to enormous uncertainties in confinement models associated with the maximum compressive strength and ductility of concrete confined by rectilinear ties, the implementation of spline nonparametric regression analysis is proposed herein as an alternative approach. The statistical evaluation is carried out based on 128 large-scale column specimens of either normal- or high-strength concrete tested under uniaxial compression. The main advantage of this kind of analysis is that it can be applied when the trend of the relation between predictor and response variables is not obvious. The error in the analysis can, therefore, be minimized so that it does not depend on the assumption of a particular shape of the curve. This provides higher flexibility in the application. The results of the statistical analysis indicate that the stress-strain curves of confined concrete obtained from the spline nonparametric regression analysis are in good agreement with the experimental curves available in the literature.

  3. Quantitative and Chemical Fingerprint Analysis for the Quality Evaluation of Receptaculum Nelumbinis by RP-HPLC Coupled with Hierarchical Clustering Analysis

    Directory of Open Access Journals (Sweden)

    Jin-Zhong Wu

    2013-01-01

    A simple and reliable method of high-performance liquid chromatography with photodiode array detection (HPLC-DAD) was developed to evaluate the quality of Receptaculum Nelumbinis (dried receptacle of Nelumbo nucifera) through establishing a chromatographic fingerprint and simultaneous determination of five flavonol glycosides, including hyperoside, isoquercitrin, quercetin-3-O-β-d-glucuronide, isorhamnetin-3-O-β-d-galactoside and syringetin-3-O-β-d-glucoside. In quantitative analysis, the five components showed good regression (R > 0.9998) within linear ranges, and their recoveries were in the range of 98.31%–100.32%. In the chromatographic fingerprint, twelve peaks were selected as the characteristic peaks to assess the similarities of different samples collected from different origins in China according to the State Food and Drug Administration (SFDA) requirements. Furthermore, hierarchical cluster analysis (HCA) was also applied to evaluate the variation of chemical components among different sources of Receptaculum Nelumbinis in China. This study indicated that the combination of quantitative and chromatographic fingerprint analysis can be readily utilized as a quality control method for Receptaculum Nelumbinis and its related traditional Chinese medicinal preparations.

  4. No rationale for 1 variable per 10 events criterion for binary logistic regression analysis

    Directory of Open Access Journals (Sweden)

    Maarten van Smeden

    2016-11-01

    Background. Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion, only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. Methods. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth’s correction, are compared. Results. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect (‘separation’). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth’s correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. Conclusions. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.

  5. Accurate and Efficient Analysis of Printed Reflectarrays With Arbitrary Elements Using Higher-Order Hierarchical Legendre Basis Functions

    DEFF Research Database (Denmark)

    Zhou, Min; Jørgensen, Erik; Kim, Oleksiy S.;

    2012-01-01

    , thus providing the flexibility required in the analysis of printed reflectarrays. A comparison to DTU-ESA Facility measurements of a reference offset reflectarray shows that higher-order hierarchical Legendre basis functions produce results of the same accuracy as those obtained using singular basis...

  6. A hierarchical cluster analysis of normal-tension glaucoma using spectral-domain optical coherence tomography parameters.

    Science.gov (United States)

    Bae, Hyoung Won; Ji, Yongwoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2015-01-01

    Normal-tension glaucoma (NTG) is a heterogenous disease, and there is still controversy about subclassifications of this disorder. On the basis of spectral-domain optical coherence tomography (SD-OCT), we subdivided NTG with hierarchical cluster analysis using optic nerve head (ONH) parameters and retinal nerve fiber layer (RNFL) thicknesses. A total of 200 eyes of 200 NTG patients between March 2011 and June 2012 underwent SD-OCT scans to measure ONH parameters and RNFL thicknesses. We classified NTG into homogenous subgroups based on these variables using a hierarchical cluster analysis, and compared clusters to evaluate diverse NTG characteristics. Three clusters were found after hierarchical cluster analysis. Cluster 1 (62 eyes) had the thickest RNFL and widest rim area, and showed early glaucoma features. Cluster 2 (60 eyes) was characterized by the largest cup/disc ratio and cup volume, and showed advanced glaucomatous damage. Cluster 3 (78 eyes) had small disc areas in SD-OCT and were comprised of patients with significantly younger age, longer axial length, and greater myopia than the other 2 groups. A hierarchical cluster analysis of SD-OCT scans divided NTG patients into 3 groups based upon ONH parameters and RNFL thicknesses. It is anticipated that the small disc area group comprised of younger and more myopic patients may show unique features unlike the other 2 groups.

  7. A method of spherical harmonic analysis in the geosciences via hierarchical Bayesian inference

    Science.gov (United States)

    Muir, J. B.; Tkalčić, H.

    2015-11-01

    The problem of decomposing irregular data on the sphere into a set of spherical harmonics is common in many fields of geosciences where it is necessary to build a quantitative understanding of a globally varying field. For example, in global seismology, a compressional or shear wave speed that emerges from tomographic images is used to interpret current state and composition of the mantle, and in geomagnetism, secular variation of magnetic field intensity measured at the surface is studied to better understand the changes in the Earth's core. Optimization methods are widely used for spherical harmonic analysis of irregular data, but they typically do not treat the dependence of the uncertainty estimates on the imposed regularization. This can cause significant difficulties in interpretation, especially when the best-fit model requires more variables as a result of underestimating data noise. Here, with the above limitations in mind, the problem of spherical harmonic expansion of irregular data is treated within the hierarchical Bayesian framework. The hierarchical approach significantly simplifies the problem by removing the need for regularization terms and user-supplied noise estimates. The use of the corrected Akaike Information Criterion for picking the optimal maximum degree of spherical harmonic expansion and the resulting spherical harmonic analyses are first illustrated on a noisy synthetic data set. Subsequently, the method is applied to two global data sets sensitive to the Earth's inner core and lowermost mantle, consisting of PKPab-df and PcP-P differential traveltime residuals relative to a spherically symmetric Earth model. The posterior probability distributions for each spherical harmonic coefficient are calculated via Markov Chain Monte Carlo sampling; the uncertainty obtained for the coefficients thus reflects the noise present in the real data and the imperfections in the spherical harmonic expansion.

  8. Family background variables as instruments for education in income regressions: A Bayesian analysis

    NARCIS (Netherlands)

    L.F. Hoogerheide (Lennart); J.H. Block (Jörn); A.R. Thurik (Roy)

    2012-01-01

    The validity of family background variables instrumenting education in income regressions has been much criticized. In this paper, we use data from the 2004 German Socio-Economic Panel and Bayesian analysis to analyze to what degree violations of the strict validity assumption affect the

  9. A systematic review and meta-regression analysis of mivacurium for tracheal intubation

    NARCIS (Netherlands)

    Vanlinthout, L.E.H.; Mesfin, S.H.; Hens, N.; Vanacker, B.F.; Robertson, E.N.; Booij, L.H.D.J.

    2014-01-01

    We systematically reviewed factors associated with intubation conditions in randomised controlled trials of mivacurium, using random-effects meta-regression analysis. We included 29 studies of 1050 healthy participants. Four factors explained 72.9% of the variation in the probability of excellent in

  10. Explaining Differences in Civic Knowledge: Multi-Level Regression Analysis of Student Data from 27 Countries.

    Science.gov (United States)

    Schulz, Wolfram

    Differences in student knowledge about democracy, institutions, and citizenship and students' skills in interpreting political communication were studied through multilevel regression analysis of results from the second International Education Association (IEA) Study. This study provides data on 14-year-old students from 28 countries in Europe,…

  11. Isolating the Effects of Training Using Simple Regression Analysis: An Example of the Procedure.

    Science.gov (United States)

    Waugh, C. Keith

    This paper provides a case example of simple regression analysis, a forecasting procedure used to isolate the effects of training from an identified extraneous variable. This case example focuses on results of a three-day sales training program to improve bank loan officers' knowledge, skill-level, and attitude regarding solicitation and sale of…

  12. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    Science.gov (United States)

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…
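    A hedged sketch of the kind of binary logistic regression described, fitted with scikit-learn on simulated data; the five binary predictors and their effect sizes are invented stand-ins, not the YRBS variables:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Five hypothetical binary predictors standing in for survey items
n = 2000
X = rng.integers(0, 2, size=(n, 5)).astype(float)

# Simulated log-odds: only the first two predictors truly matter
log_odds = -1.0 + 1.2 * X[:, 0] - 0.8 * X[:, 1]
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-log_odds))).astype(int)

model = LogisticRegression().fit(X, y)
odds_ratios = np.exp(model.coef_[0])        # exponentiated coefficients
print("estimated odds ratios:", np.round(odds_ratios, 2))
```

    The exponentiated coefficients are odds ratios: values above 1 indicate a predictor that raises the odds of the outcome, values below 1 a predictor that lowers them.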

  13. Assessing the Impact of Influential Observations on Multiple Regression Analysis on Human Resource Research.

    Science.gov (United States)

    Bates, Reid A.; Holton, Elwood F., III; Burnett, Michael F.

    1999-01-01

    A case study of learning transfer demonstrates the possible effect of influential observation on linear regression analysis. A diagnostic method that tests for violation of assumptions, multicollinearity, and individual and multiple influential observations helps determine which observation to delete to eliminate bias. (SK)

  14. Declining Bias and Gender Wage Discrimination? A Meta-Regression Analysis

    Science.gov (United States)

    Jarrell, Stephen B.; Stanley, T. D.

    2004-01-01

    The meta-regression analysis reveals a strong tendency for discrimination estimates to fall, although wage discrimination against women still exists. The biasing effects of researchers' gender and of failing to correct for selection bias have weakened, and changes in the labor market have made it less important.

  15. Comprehensive Evaluation of Entropy-hierarchical Grey Correlation Analysis for Highway Safety Life Protection Engineering

    Directory of Open Access Journals (Sweden)

    Jin Shuxins

    2016-01-01

    Full Text Available Decision-making among alternative highway safety life protection engineering schemes is important. Achieving the goals of an optimal highway safety life protection engineering scheme can not only improve the function and service level of highway facilities, but also reduce the traffic accidents caused by imperfect highway facilities. Such decision-making is a multi-target, multi-layer, multi-scheme system evaluation problem. Given the lack of concrete data for this kind of problem, the analytic hierarchy process and entropy value analysis are combined with the grey relational comprehensive evaluation method to obtain the entropy-hierarchical grey correlation analysis method. This is a combined qualitative and quantitative decision method: the comparison principle of the analytic hierarchy process (AHP) and the entropy principle of the entropy value analysis method are used to determine the relative weights of the indexes in the factor layers, layer by layer. Grey relational analysis is then applied, from the low layer to the high layer step by step, to the candidate schemes and a reference scheme. Finally, the comprehensive correlation degree between each candidate scheme and the reference scheme is calculated, and the plan with the maximum grey correlation degree is selected as the best.

  16. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    Directory of Open Access Journals (Sweden)

    I. Crawford

    2015-07-01

    Full Text Available In this paper we present improved methods for discriminating and quantifying Primary Biological Aerosol Particles (PBAP) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1×10⁶ points on a desktop computer, allowing for each fluorescent particle in a dataset to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient dataset. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4), where the optical size, asymmetry factor and fluorescence measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points, respectively. The best performing methods were applied to the BEACHON-RoMBAS ambient dataset, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP), where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the underestimation of the bacterial aerosol concentration by a factor of 5. We suggest that this is likely due to errors arising from misattribution.

  17. Evaluation of hierarchical agglomerative cluster analysis methods for discrimination of primary biological aerosol

    Directory of Open Access Journals (Sweden)

    I. Crawford

    2015-11-01

    Full Text Available In this paper we present improved methods for discriminating and quantifying primary biological aerosol particles (PBAPs) by applying hierarchical agglomerative cluster analysis to multi-parameter ultraviolet-light-induced fluorescence (UV-LIF) spectrometer data. The methods employed in this study can be applied to data sets in excess of 1 × 10⁶ points on a desktop computer, allowing for each fluorescent particle in a data set to be explicitly clustered. This reduces the potential for misattribution found in subsampling and comparative attribution methods used in previous approaches, improving our capacity to discriminate and quantify PBAP meta-classes. We evaluate the performance of several hierarchical agglomerative cluster analysis linkages and data normalisation methods using laboratory samples of known particle types and an ambient data set. Fluorescent and non-fluorescent polystyrene latex spheres were sampled with a Wideband Integrated Bioaerosol Spectrometer (WIBS-4), where the optical size, asymmetry factor and fluorescence measurements were used as inputs to the analysis package. It was found that the Ward linkage with z-score or range normalisation performed best, correctly attributing 98 and 98.1 % of the data points, respectively. The best-performing methods were applied to the BEACHON-RoMBAS (Bio–hydro–atmosphere interactions of Energy, Aerosols, Carbon, H2O, Organics and Nitrogen–Rocky Mountain Biogenic Aerosol Study) ambient data set, where it was found that the z-score and range normalisation methods yield similar results, with each method producing clusters representative of fungal spores and bacterial aerosol, consistent with previous results. The z-score result was compared to clusters generated with previous approaches (WIBS AnalysiS Program, WASP), where we observe that the subsampling and comparative attribution method employed by WASP results in the overestimation of the fungal spore concentration by a factor of 1.5 and the underestimation of the bacterial aerosol concentration by a factor of 5.
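    As a rough illustration of the approach the two records above evaluate, the following sketch applies Ward-linkage hierarchical clustering with z-score normalisation to synthetic two-class "particle" data; the channels, means and spreads are invented, not WIBS-4 measurements:

```python
import numpy as np
from scipy.stats import zscore
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(4)

# Synthetic stand-ins for two particle classes measured on very different scales:
# optical size (um), asymmetry factor, fluorescence intensity (a.u.)
class_a = np.column_stack([rng.normal(3.0, 0.3, 40),
                           rng.normal(20, 3, 40),
                           rng.normal(2000, 200, 40)])
class_b = np.column_stack([rng.normal(1.0, 0.2, 40),
                           rng.normal(35, 4, 40),
                           rng.normal(500, 150, 40)])
X = np.vstack([class_a, class_b])

# z-score normalisation stops the large-valued channel dominating the linkage
Z = linkage(zscore(X, axis=0), method="ward")
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

    Without the normalisation step, the fluorescence channel (values in the thousands) would dominate the Euclidean distances that Ward linkage minimises.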

  18. Analysis of designed experiments by stabilised PLS Regression and jack-knifing

    DEFF Research Database (Denmark)

    Martens, Harald; Høy, M.; Westad, F.

    2001-01-01

    Pragmatical, visually oriented methods for assessing and optimising bi-linear regression models are described, and applied to PLS Regression (PLSR) analysis of multi-response data from controlled experiments. The paper outlines some ways to stabilise the PLSR method to extend its range...... the reliability of the linear and bi-linear model parameter estimates. The paper illustrates how the obtained PLSR "significance" probabilities are similar to those from conventional factorial ANOVA, but the PLSR is shown to give important additional overview plots of the main relevant structures in the multi...

  19. Automatic regression analysis for use in a complex system of evaluation of plant genetic resources

    Directory of Open Access Journals (Sweden)

    Cs. ARKOSSY

    1984-08-01

    Full Text Available In accordance with the general requirements regarding computerization in gene banks and germplasm research, a computer program has been compiled for the analysis of univariate response in crop germplasm evaluation. The program is written in COBOL and runs on a FELIX C-256 computer. The different modules of the program allow for: (1) data control and error listing; (2) computation of the regression function; (3) listing of the differences between the values measured and computed; (4) sorting of the individual samples; (5) construction of two-dimensional scattergrams of the measured values with simultaneous representation of the regression line; (6) listing of the examined samples in the sequence required in evaluation.

  20. Methods and applications of linear models regression and the analysis of variance

    CERN Document Server

    Hocking, Ronald R

    2013-01-01

    Praise for the Second Edition"An essential desktop reference book . . . it should definitely be on your bookshelf." -Technometrics A thoroughly updated book, Methods and Applications of Linear Models: Regression and the Analysis of Variance, Third Edition features innovative approaches to understanding and working with models and theory of linear regression. The Third Edition provides readers with the necessary theoretical concepts, which are presented using intuitive ideas rather than complicated proofs, to describe the inference that is appropriate for the methods being discussed. The book

  1. [Comparison of application of Cochran-Armitage trend test and linear regression analysis for rate trend analysis in epidemiology study].

    Science.gov (United States)

    Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H

    2017-05-10

    We described the time trend of the acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 with the Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend test P value < linear regression P value). The statistical power of the CAT test decreased, while the result of the linear regression analysis remained the same, when the population size was reduced by a factor of 100 and the AMI incidence rate remained unchanged. The two statistical methods have their advantages and disadvantages. It is necessary to choose the statistical method according to the fitting degree of the data, or to comprehensively analyze the results of both methods.
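    The population-size effect described above can be reproduced in a small simulation. The yearly counts below are invented, and the Cochran-Armitage statistic follows the standard textbook form: because the test uses the binomial counts, shrinking the population weakens it, while a linear regression on the rates alone is unchanged.

```python
import numpy as np
from scipy import stats

def cochran_armitage(events, trials, scores):
    """Cochran-Armitage trend test: Z statistic and two-sided p-value."""
    events, trials, scores = map(np.asarray, (events, trials, scores))
    N = trials.sum()
    pbar = events.sum() / N
    T = np.sum(scores * (events - trials * pbar))
    var = pbar * (1 - pbar) * (np.sum(trials * scores ** 2)
                               - np.sum(trials * scores) ** 2 / N)
    z = T / np.sqrt(var)
    return z, 2 * stats.norm.sf(abs(z))

# Hypothetical yearly event counts with an upward incidence trend
years = np.arange(15)
trials = np.full(15, 100_000)
events = np.array([500, 520, 540, 530, 560, 580, 610, 600,
                   630, 650, 640, 670, 690, 700, 720])

z1, p_cat1 = cochran_armitage(events, trials, years)
p_lin1 = stats.linregress(years, events / trials).pvalue

# Shrink the population 10-fold while keeping the rates identical
z2, p_cat2 = cochran_armitage(events // 10, trials // 10, years)
p_lin2 = stats.linregress(years, (events // 10) / (trials // 10)).pvalue

print(f"CAT p: {p_cat1:.3g} -> {p_cat2:.3g}; linear p: {p_lin1:.3g} -> {p_lin2:.3g}")
```

    The CAT p-value grows when the denominators shrink, whereas the rate-based regression p-value is exactly the same in both runs.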

  2. Ca analysis: an Excel based program for the analysis of intracellular calcium transients including multiple, simultaneous regression analysis.

    Science.gov (United States)

    Greensmith, David J

    2014-01-01

    Here I present an Excel-based program for the analysis of intracellular Ca transients recorded using fluorescent indicators. The program can perform all the necessary steps which convert recorded raw voltage changes into meaningful physiological information. The program performs two fundamental processes. (1) It can prepare the raw signal by several methods. (2) It can then be used to analyze the prepared data to provide information such as absolute intracellular Ca levels. Also, the rates of change of Ca can be measured using multiple, simultaneous regression analysis. I demonstrate that this program performs as well as commercially available software, but has numerous advantages, namely creating a simplified, self-contained analysis workflow.

  3. MIXED DENTITION SPACE ANALYSIS OF A SOUTHERN ITALIAN POPULATION: NEW REGRESSION EQUATIONS FOR UNERUPTED TEETH.

    Science.gov (United States)

    Cirulli, N; Ballini, A; Cantore, S; Farronato, D; Inchingolo, F; Dipalma, G; Gatto, M R; Alessandri Bonetti, G

    2015-01-01

    Mixed dentition analysis forms a critical aspect of early orthodontic treatment. In fact an accurate space analysis is one of the important criteria in determining whether the treatment plan may involve serial extraction, guidance of eruption, space maintenance, space regaining or just periodic observation of the patients. The aim of the present study was to calculate linear regression equations in mixed dentition space analysis, measuring 230 dental casts mesiodistal tooth widths, obtained from southern Italian patients (118 females, 112 males, mean age 15±3 years). Student’s t-test or Wilcoxon test for independent and paired samples were used to determine right/left side and male/female differences. On the basis of the sum of the mesiodistal diameters of the 4 mandibular incisors as predictors for the sum of the widths of the canines and premolars in the mandibular mixed dentition, a new linear regression equation was found: y = 0.613x+7.294 (r= 0.701) for both genders in a southern Italian population. To better estimate the size of leeway space, a new regression equation was found to calculate the mesiodistal size of the second premolar using the sum of the four mandibular incisors, canine and first premolar as a predictor. The equation is y = 0.241x+1.224 (r= 0.732). In conclusion, new regression equations were derived for a southern Italian population.
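    The first published equation can be applied directly; the 23.0 mm incisor sum below is a hypothetical input, not a value from the study:

```python
# Predicted sum of mandibular canine + premolar widths (mm) from the paper's
# equation y = 0.613x + 7.294, where x is the sum of the four mandibular incisors
def predict_canine_premolar(incisor_sum_mm):
    return 0.613 * incisor_sum_mm + 7.294

# Hypothetical patient with a 23.0 mm incisor sum
print(round(predict_canine_premolar(23.0), 3))
```

    For an incisor sum of 23.0 mm the equation predicts 0.613 × 23.0 + 7.294 = 21.393 mm of space needed for the unerupted canine and premolars on one side's analysis.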

  4. Analysis of the Influence of Quantile Regression Model on Mainland Tourists’ Service Satisfaction Performance

    Directory of Open Access Journals (Sweden)

    Wen-Cheng Wang

    2014-01-01

    Full Text Available It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models.

  5. Analysis of the Influence of Quantile Regression Model on Mainland Tourists' Service Satisfaction Performance

    Science.gov (United States)

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916

  6. Hierarchical linear modeling of longitudinal pedigree data for genetic association analysis.

    Science.gov (United States)

    Tan, Qihua; B Hjelmborg, Jacob V; Thomassen, Mads; Jensen, Andreas Kryger; Christiansen, Lene; Christensen, Kaare; Zhao, Jing Hua; Kruse, Torben A

    2014-01-01

    Genetic association analysis on complex phenotypes under a longitudinal design involving pedigrees encounters the problem of correlation within pedigrees, which could affect statistical assessment of the genetic effects. Approaches have been proposed to integrate kinship correlation into the mixed-effect models to explicitly model the genetic relationship. These have proved to be an efficient way of dealing with sample clustering in pedigree data. Although current algorithms implemented in popular statistical packages are useful for adjusting relatedness in the mixed modeling of genetic effects on the mean level of a phenotype, they are not sufficiently straightforward to handle the kinship correlation on the time-dependent trajectories of a phenotype. We introduce a 2-level hierarchical linear model to separately assess the genetic associations with the mean level and the rate of change of a phenotype, integrating kinship correlation in the analysis. We apply our method to the Genetic Analysis Workshop 18 genome-wide association studies data on chromosome 3 to estimate the genetic effects on systolic blood pressure measured over time in large pedigrees. Our method identifies genetic variants associated with blood pressure with estimated inflation factors of 0.99, suggesting that our modeling of random effects efficiently handles the genetic relatedness in pedigrees. Application to simulated data captures important variants specified in the simulation. Our results show that the method is useful for genetic association studies in related samples using longitudinal design.

  7. Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance.

    Science.gov (United States)

    Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si

    2017-07-01

    Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data driven approaches for water quality assessment. The cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected form coastal water of Bohai Sea and North Yellow Sea of China, and apply clustering results to evaluate its water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which are widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.

  8. Bayesian Method of Moments (BMOM) Analysis of Mean and Regression Models

    CERN Document Server

    Zellner, Arnold

    2008-01-01

    A Bayesian method of moments/instrumental variable (BMOM/IV) approach is developed and applied in the analysis of the important mean and multiple regression models. Given a single set of data, it is shown how to obtain posterior and predictive moments without the use of likelihood functions, prior densities and Bayes' Theorem. The posterior and predictive moments, based on a few relatively weak assumptions, are then used to obtain maximum entropy densities for parameters, realized error terms and future values of variables. Posterior means for parameters and realized error terms are shown to be equal to certain well known estimates and rationalized in terms of quadratic loss functions. Conditional maxent posterior densities for means and regression coefficients given scale parameters are in the normal form while scale parameters' maxent densities are in the exponential form. Marginal densities for individual regression coefficients, realized error terms and future values are in the Laplace or double-exponenti...

  9. Groping Toward Linear Regression Analysis: Newton's Analysis of Hipparchus' Equinox Observations

    CERN Document Server

    Belenkiy, Ari

    2008-01-01

    In 1700, Newton, in designing a new universal calendar contained in the manuscripts known as Yahuda MS 24 from Jewish National and University Library at Jerusalem and analyzed in our recent article in Notes & Records Royal Society (59 (3), Sept 2005, pp. 223-54), attempted to compute the length of the tropical year using the ancient equinox observations reported by a famous Greek astronomer Hipparchus of Rhodes, ten in number. Though Newton had a very thin sample of data, he obtained a tropical year only a few seconds longer than the correct length. The reason lies in Newton's application of a technique similar to modern regression analysis. Actually he wrote down the first of the two so-called "normal equations" known from the Ordinary Least Squares method. Newton also had a vague understanding of qualitative variables. This paper concludes by discussing open historico-astronomical problems related to the inclination of the Earth's axis of rotation. In particular, ignorance about the long-range variation...
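    The "normal equations" mentioned above are the pair that defines the ordinary least-squares line; a small numpy sketch with invented observation times shows them being solved directly and agreeing with a library fit:

```python
import numpy as np

# Hypothetical equinox-style data: observation times t (years) and a slowly
# drifting quantity y; estimating the drift is the fitted-slope problem in miniature
t = np.array([0.0, 19.0, 43.0, 70.0, 97.0, 120.0, 146.0])
y = 365.25 + 0.0001 * t + np.array([0.01, -0.02, 0.015, 0.0, -0.01, 0.02, -0.005])

# The two normal equations of ordinary least squares for y = a + b*t:
#   sum(y)   = a*n      + b*sum(t)
#   sum(t*y) = a*sum(t) + b*sum(t^2)
n = len(t)
A = np.array([[n, t.sum()],
              [t.sum(), (t ** 2).sum()]])
rhs = np.array([y.sum(), (t * y).sum()])
a, b = np.linalg.solve(A, rhs)

# Cross-check against numpy's polynomial least-squares fit
b_np, a_np = np.polyfit(t, y, 1)
print(a, b)
```

    The first of the two equations is the one Newton is credited with writing down; solving both together is exactly the Ordinary Least Squares estimate.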

  10. Generic Approach for Hierarchical Modulation Performance Analysis: Application to DVB-SH

    CERN Document Server

    Méric, Hugo; Amiot-Bazile, Caroline; Arnal, Fabrice; Boucheret, Marie-Laure

    2011-01-01

    Broadcasting systems have to deal with channel diversity in order to offer the best rate to the users. Hierarchical modulation is a practical solution to provide several rates in function of the channel quality. Unfortunately the performance evaluation of such modulations requires time consuming simulations. We propose in this paper a novel approach based on the channel capacity to avoid these simulations. The method allows to study the performance in terms of spectrum efficiency of hierarchical and also classical modulations combined with error correcting codes. Our method will be applied to the DVB-SH standard which considers hierarchical modulation as an optional feature.

  11. Replica analysis of overfitting in regression models for time-to-event data

    Science.gov (United States)

    Coolen, A. C. C.; Barrett, J. E.; Paga, P.; Perez-Vicente, C. J.

    2017-09-01

    Overfitting, which happens when the number of parameters in a model is too large compared to the number of data points available for determining these parameters, is a serious and growing problem in survival analysis. While modern medicine presents us with data of unprecedented dimensionality, these data cannot yet be used effectively for clinical outcome prediction. Standard error measures in maximum likelihood regression, such as p-values and z-scores, are blind to overfitting, and even for Cox’s proportional hazards model (the main tool of medical statisticians), one finds in literature only rules of thumb on the number of samples required to avoid overfitting. In this paper we present a mathematical theory of overfitting in regression models for time-to-event data, which aims to increase our quantitative understanding of the problem and provide practical tools with which to correct regression outcomes for the impact of overfitting. It is based on the replica method, a statistical mechanical technique for the analysis of heterogeneous many-variable systems that has been used successfully for several decades in physics, biology, and computer science, but not yet in medical statistics. We develop the theory initially for arbitrary regression models for time-to-event data, and verify its predictions in detail for the popular Cox model.
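    The core phenomenon, in-sample fit improving while out-of-sample error grows as the number of parameters approaches the sample size, can be illustrated with an ordinary least-squares stand-in (not the Cox model or the replica theory itself); all data below are simulated:

```python
import numpy as np

rng = np.random.default_rng(5)

n_train, n_test, p_max = 50, 1000, 40
X = rng.standard_normal((n_train + n_test, p_max))
# Only the first 2 covariates carry signal; the rest is noise available to overfit
y = X[:, 0] - X[:, 1] + rng.standard_normal(n_train + n_test)
Xtr, Xte = X[:n_train], X[n_train:]
ytr, yte = y[:n_train], y[n_train:]

def train_test_mse(p):
    """Fit least squares on the first p covariates; return (train MSE, test MSE)."""
    coef, *_ = np.linalg.lstsq(Xtr[:, :p], ytr, rcond=None)
    return (np.mean((ytr - Xtr[:, :p] @ coef) ** 2),
            np.mean((yte - Xte[:, :p] @ coef) ** 2))

in2, out2 = train_test_mse(2)
in40, out40 = train_test_mse(40)
print(f"p=2:  train {in2:.2f}, test {out2:.2f}")
print(f"p=40: train {in40:.2f}, test {out40:.2f}")
```

    The training error keeps falling as parameters are added, exactly the behaviour that makes in-sample error measures (and p-values computed from them) blind to overfitting, while the test error deteriorates sharply once p is a sizeable fraction of n.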

  12. FRICTION MODELING OF Al-Mg ALLOY SHEETS BASED ON MULTIPLE REGRESSION ANALYSIS AND NEURAL NETWORKS

    Directory of Open Access Journals (Sweden)

    Hirpa G. Lemu

    2017-03-01

    Full Text Available This article reports a proposed approach to describing frictional resistance in sheet metal forming processes that enables determination of the friction coefficient value under a wide range of friction conditions without performing time-consuming experiments. The motivation for this proposal is the fact that a considerable number of factors affect the friction coefficient value, and as a result building an analytical friction model for specified process conditions is practically impossible. In the proposed approach, a mathematical model of friction behaviour is created using multiple regression analysis and artificial neural networks. The regression analysis was performed using a subroutine in MATLAB programming code, and STATISTICA Neural Networks was utilized to build the artificial neural network models. The effect of different training strategies on the quality of the neural networks was studied. The results of a strip drawing friction test were utilized as input variables for the regression model and for the training of radial basis function networks, generalized regression neural networks and multilayer networks. Four kinds of Al-Mg alloy sheets were used as the test material.

  13. Multilayer perceptron for robust nonlinear interval regression analysis using genetic algorithms.

    Science.gov (United States)

    Hu, Yi-Chung

    2014-01-01

    On the basis of fuzzy regression, computational models in intelligence such as neural networks have the capability to be applied to nonlinear interval regression analysis for dealing with uncertain and imprecise data. When training data are not contaminated by outliers, computational models perform well by including almost all given training data in the data interval. Nevertheless, since training data are often corrupted by outliers, robust learning algorithms employed to resist outliers for interval regression analysis have been an interesting area of research. Several approaches involving computational intelligence are effective for resisting outliers, but the required parameters for these approaches are related to whether the collected data contain outliers or not. Since it seems difficult to prespecify the degree of contamination beforehand, this paper uses multilayer perceptron to construct the robust nonlinear interval regression model using the genetic algorithm. Outliers beyond or beneath the data interval will impose slight effect on the determination of data interval. Simulation results demonstrate that the proposed method performs well for contaminated datasets.

  14. Multilayer Perceptron for Robust Nonlinear Interval Regression Analysis Using Genetic Algorithms

    Science.gov (United States)

    2014-01-01

    On the basis of fuzzy regression, computational models in intelligence such as neural networks have the capability to be applied to nonlinear interval regression analysis for dealing with uncertain and imprecise data. When training data are not contaminated by outliers, computational models perform well by including almost all given training data in the data interval. Nevertheless, since training data are often corrupted by outliers, robust learning algorithms employed to resist outliers for interval regression analysis have been an interesting area of research. Several approaches involving computational intelligence are effective for resisting outliers, but the required parameters for these approaches are related to whether the collected data contain outliers or not. Since it seems difficult to prespecify the degree of contamination beforehand, this paper uses multilayer perceptron to construct the robust nonlinear interval regression model using the genetic algorithm. Outliers beyond or beneath the data interval will impose slight effect on the determination of data interval. Simulation results demonstrate that the proposed method performs well for contaminated datasets. PMID:25110755

  15. Ingredients and Process Standardization of Thepla: An Indian Unleavened Vegetable Flatbread using Hierarchical Cluster Analysis

    Directory of Open Access Journals (Sweden)

    S.S. Arya

    2012-10-01

    Full Text Available Thepla is an Indian unleavened flatbread made from whole-wheat flour with added spices and vegetables. It is consumed particularly in the western zone of India. The preparation of thepla is tedious, time consuming and requires skill. In the present study, standardization of thepla ingredients was carried out by standardizing each ingredient on the basis of its Overall Acceptability (OA) score. Sensory analysis was carried out using a nine-point hedonic rating scale with ten trained panellists. The standardized ingredients of thepla were: salt 3%, red chili powder 2.5%, fenugreek leaves 12%, cumin seed powder 0.6%, coriander seed powder 0.6%, ginger garlic paste (1:1) 6%, asafoetida 0.6% and oil 3% w/w of whole wheat flour, on the basis of the highest sensory OA score. Further, thepla process parameters such as time, temperature, diameter of thepla and weight of dough were standardized on the basis of sensory OA score. The obtained sensory score data were processed by Hierarchical Cluster Analysis (HCA).

  16. Time-domain analysis of neural tracking of hierarchical linguistic structures.

    Science.gov (United States)

    Zhang, Wen; Ding, Nai

    2017-02-01

    When listening to continuous speech, cortical activity measured by MEG concurrently follows the rhythms of multiple linguistic structures, e.g., syllables, phrases, and sentences. This phenomenon was previously characterized in the frequency domain. Here, we investigate the waveform of neural activity tracking linguistic structures in the time domain and quantify the coherence of neural response phases over subjects listening to the same stimulus. These analyses are achieved by decomposing the multi-channel MEG recordings into components that maximize the correlation between neural response waveforms across listeners. Each MEG component can be viewed as the recording from a virtual sensor that is spatially tuned to a cortical network showing coherent neural activity over subjects. This analysis reveals information not available from previous frequency-domain analysis of MEG global field power: First, concurrent neural tracking of hierarchical linguistic structures emerges at the beginning of the stimulus, rather than slowly building up after repetitions of the same sentential structure. Second, neural tracking of the sentential structure is reflected by slow neural fluctuations, rather than, e.g., a series of short-lasting transient responses at sentential boundaries. Lastly and most importantly, it shows that the MEG responses tracking the syllabic rhythm are spatially separable from the MEG responses tracking the sentential and phrasal rhythms.

  17. Improving water quality assessments through a hierarchical Bayesian analysis of variability.

    Science.gov (United States)

    Gronewold, Andrew D; Borsuk, Mark E

    2010-10-15

    Water quality measurement error and variability, while well documented in laboratory-scale studies, are rarely acknowledged or explicitly resolved in model-based water body assessments, including those conducted in compliance with the United States Environmental Protection Agency (USEPA) Total Maximum Daily Load (TMDL) program. Consequently, proposed pollutant loading reductions in TMDLs and similar water quality management programs may be biased, resulting in either slower-than-expected rates of water quality restoration and designated use reinstatement or, in some cases, overly conservative management decisions. To address this problem, we present a hierarchical Bayesian approach for relating actual in situ or model-predicted pollutant concentrations to multiple sampling and analysis procedures, each with distinct sources of variability. We apply this method to recently approved TMDLs to investigate whether appropriate accounting for measurement error and variability would lead to different management decisions. We find that required pollutant loading reductions may in fact vary depending not only on how measurement variability is addressed but also on which water quality analysis procedure is used to assess standard compliance. As a general strategy, our Bayesian approach to quantifying variability may represent an alternative to the common practice of addressing all forms of uncertainty through an arbitrary margin of safety (MOS).

  18. Tutorial on Biostatistics: Linear Regression Analysis of Continuous Correlated Eye Data.

    Science.gov (United States)

    Ying, Gui-Shuang; Maguire, Maureen G; Glynn, Robert; Rosner, Bernard

    2017-04-01

    To describe and demonstrate appropriate linear regression methods for analyzing correlated continuous eye data. We describe several approaches to regression analysis involving both eyes, including mixed effects and marginal models under various covariance structures to account for inter-eye correlation. We demonstrate, with SAS statistical software, applications in a study comparing baseline refractive error between one eye with choroidal neovascularization (CNV) and the unaffected fellow eye, and in a study determining factors associated with visual field in the elderly. When refractive error from both eyes was analyzed with standard linear regression without accounting for inter-eye correlation (adjusting for demographic and ocular covariates), the difference between eyes with CNV and fellow eyes was 0.15 diopters (D; 95% confidence interval, CI -0.03 to 0.32 D, p = 0.10). Using a mixed effects model or a marginal model, the estimated difference was the same but with a narrower 95% CI (0.01 to 0.28 D, p = 0.03). Standard regression for visual field data from both eyes provided biased (generally underestimated) standard errors and smaller p-values, while analysis of the worse eye alone provided larger p-values than mixed effects and marginal models. In research involving both eyes, ignoring inter-eye correlation can lead to invalid inferences. Analysis using only right or left eyes is valid but decreases power. Worse-eye analysis can provide less power and biased estimates of effect. Mixed effects or marginal models using the eye as the unit of analysis should be used to appropriately account for inter-eye correlation and to maximize power and precision.
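    The consequence of ignoring inter-eye correlation can be illustrated with a small simulation (plain NumPy, not the paper's SAS models): two eyes per subject share a subject-level effect, so a naive standard error that treats all eyes as independent is too small:

    ```python
    # Each subject contributes two eyes sharing a subject-level effect, so the
    # 2N eye measurements carry less information than 2N independent ones.
    import numpy as np

    rng = np.random.default_rng(42)
    n_subjects = 2000
    subject_effect = rng.normal(0.0, 1.0, n_subjects)   # shared between eyes
    eye_noise = rng.normal(0.0, 0.5, (n_subjects, 2))   # eye-specific noise
    eyes = subject_effect[:, None] + eye_noise          # e.g. refractive error

    # Naive SE: pretend all 2N eye measurements are independent.
    naive_se = eyes.std(ddof=1) / np.sqrt(eyes.size)

    # Subject-level SE: average the two eyes first (one value per subject).
    subject_means = eyes.mean(axis=1)
    correct_se = subject_means.std(ddof=1) / np.sqrt(n_subjects)

    print(naive_se, correct_se)  # the naive SE is markedly too small
    ```

    A mixed effects or marginal model recovers the correct precision while still using both eyes, which is why the abstract recommends them over naive eye-level regression.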

  19. Predictors of sentinel lymph node status in cutaneous melanoma: a classification and regression tree analysis.

    Science.gov (United States)

    Tejera-Vaquerizo, A; Martín-Cuevas, P; Gallego, E; Herrera-Acosta, E; Traves, V; Herrera-Ceballos, E; Nagore, E

    2015-04-01

    The main aim of this study was to identify predictors of sentinel lymph node (SN) metastasis in cutaneous melanoma. This was a retrospective cohort study of 818 patients in 2 tertiary-level hospitals. The primary outcome variable was SN involvement. Independent predictors were identified using multiple logistic regression and a classification and regression tree (CART) analysis. Ulceration, tumor thickness, and a high mitotic rate (≥6 mitoses/mm²) were independently associated with SN metastasis in the multiple regression analysis. The most important predictor in the CART analysis was Breslow thickness. Absence of an inflammatory infiltrate, patient age, and tumor location were predictive of SN metastasis in patients with tumors thicker than 2 mm. In the case of thinner melanomas, the predictors were mitotic rate (>6 mitoses/mm²), presence of ulceration, and tumor thickness. Patient age, mitotic rate, and tumor thickness and location were predictive of survival. A high mitotic rate predicts a higher risk of SN involvement and worse survival. CART analysis improves the prediction of regional metastasis, resulting in better clinical management of melanoma patients. It may also help select suitable candidates for inclusion in clinical trials. Copyright © 2014 Elsevier España, S.L.U. and AEDV. All rights reserved.
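    The core of a CART analysis is an exhaustive search for the split that most reduces node impurity; full CART then recurses on the resulting child nodes. A minimal sketch of that single-split step (with invented thickness values, not the study's cohort):

    ```python
    # Find the binary split on one predictor that minimizes weighted Gini
    # impurity — the elementary step of CART. Data are synthetic.
    import numpy as np

    def gini(y):
        if len(y) == 0:
            return 0.0
        p = np.bincount(y, minlength=2) / len(y)
        return 1.0 - np.sum(p ** 2)

    def best_split(x, y):
        """Return (threshold, weighted impurity) of the best split x <= t."""
        best = (None, np.inf)
        for t in np.unique(x):
            left, right = y[x <= t], y[x > t]
            imp = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if imp < best[1]:
                best = (t, imp)
        return best

    # Synthetic "Breslow thickness" (mm); SN positivity mostly above ~2 mm.
    thickness = np.array([0.5, 0.8, 1.0, 1.5, 1.9, 2.2, 3.0, 4.0, 5.5, 7.0])
    sn_positive = np.array([0, 0, 0, 0, 0, 1, 1, 0, 1, 1])

    threshold, impurity = best_split(thickness, sn_positive)
    print(threshold, impurity)
    ```

    On these toy data the search recovers a cut just below 2 mm, mirroring the role Breslow thickness plays at the root of the study's tree.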

  20. Proximate analysis based multiple regression models for higher heating value estimation of low rank coals

    Energy Technology Data Exchange (ETDEWEB)

    Akkaya, Ali Volkan [Department of Mechanical Engineering, Yildiz Technical University, 34349 Besiktas, Istanbul (Turkey)

    2009-02-15

    In this paper, multiple nonlinear regression models for estimating the higher heating value (HHV) of coals are developed using proximate analysis data obtained mostly from low rank coal samples on an as-received basis. Three main model structures are first categorized according to the number of proximate analysis parameters used as independent variables (moisture, ash, volatile matter and fixed carbon). Secondly, sub-model structures with different arrangements of the independent variables are considered. Each sub-model structure is analyzed with a number of model equations in order to find the best fit using multiple nonlinear regression. Based on the regression results, the best model for each sub-structure is determined, and among these, the models giving the highest correlation for the three main structures are selected. Although all three selected models predict HHV rather accurately, the model involving four independent variables provides the most accurate estimation. Additionally, when this four-variable model and a model from the literature are tested with additional proximate analysis data, the model developed in this study gives the more accurate prediction of HHV. It can be concluded that the developed model is an effective tool for HHV estimation of low rank coals. (author)
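    Fitting one candidate model structure by multiple nonlinear regression can be sketched as follows; the model form, coefficients and proximate-analysis values are illustrative assumptions, not the paper's equations:

    ```python
    # Least-squares fit of one hypothetical HHV model structure built from
    # moisture (M), ash (A), volatile matter (VM) and fixed carbon (FC).
    import numpy as np
    from scipy.optimize import curve_fit

    def hhv_model(X, a, b, c, d):
        M, A, VM, FC = X
        # Example structure: linear VM/FC terms plus a moisture-ash interaction.
        return a + b * VM + c * FC + d * M * A

    rng = np.random.default_rng(1)
    M = rng.uniform(5, 30, 50)     # moisture, % (as-received)
    A = rng.uniform(5, 20, 50)     # ash, %
    VM = rng.uniform(20, 40, 50)   # volatile matter, %
    FC = 100 - M - A - VM          # fixed carbon, %

    true = (2.0, 0.15, 0.35, -0.002)          # assumed coefficients
    hhv = hhv_model((M, A, VM, FC), *true) + rng.normal(0, 0.05, 50)

    popt, _ = curve_fit(hhv_model, (M, A, VM, FC), hhv)
    print(popt)
    ```

    In the paper's workflow, each sub-model structure would be fit this way and the structures compared by their correlation with measured HHV.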

  1. Analysis of the Evolution of the Gross Domestic Product by Means of Cyclic Regressions

    Directory of Open Access Journals (Sweden)

    Catalin Angelo Ioan

    2011-08-01

    Full Text Available In this article, we analyze the regularity of the Gross Domestic Product of a country, in our case the United States. The analysis is based on a new method – cyclic regressions built on the Fourier series of a function. Another point of view is to consider, instead of the growth rate of GDP, the speed of variation of this rate, computed as a numerical derivative. The results show a 71-year cycle for this indicator, with a mean square error of 0.93%. The method described allows a prognosis of short-term trends in GDP.
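    A cyclic regression in this sense amounts to least-squares fitting of a truncated Fourier series. A sketch on a synthetic periodic series (the article's GDP data are not reproduced; the 71-year period is taken as given):

    ```python
    # Fit a constant plus two sine/cosine harmonics of a known period by
    # ordinary least squares. The series here is synthetic.
    import numpy as np

    t = np.arange(200, dtype=float)
    period = 71.0
    signal = 1.5 + 2.0 * np.sin(2 * np.pi * t / period) \
                 + 0.7 * np.cos(4 * np.pi * t / period)
    noisy = signal + np.random.default_rng(3).normal(0, 0.1, t.size)

    # Design matrix: constant plus sine/cosine pairs for harmonics 1 and 2.
    cols = [np.ones_like(t)]
    for k in (1, 2):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    X = np.column_stack(cols)

    coef, *_ = np.linalg.lstsq(X, noisy, rcond=None)
    rmse = np.sqrt(np.mean((X @ coef - noisy) ** 2))
    print(coef.round(2), rmse)
    ```

    In practice the period itself would be chosen to minimize the mean square error, which is how a dominant cycle length such as 71 years is identified.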

  2. Multiple Regression Analysis of Unconfined Compression Strength of Mine Tailings Matrices

    Directory of Open Access Journals (Sweden)

    Mahmood Ali A.

    2017-01-01

    Full Text Available As part of a novel approach of sustainable development of mine tailings, experimental and numerical analysis is carried out on newly formulated tailings matrices. Several physical characteristic tests are carried out including the unconfined compression strength test to ascertain the integrity of these matrices when subjected to loading. The current paper attempts a multiple regression analysis of the unconfined compressive strength test results of these matrices to investigate the most pertinent factors affecting their strength. Results of this analysis showed that the suggested equation is reasonably applicable to the range of binder combinations used.

  3. Fabrication of micro/nano hierarchical structures with analysis on the surface mechanics

    Science.gov (United States)

    Jheng, Yu-Sheng; Lee, Yeeu-Chang

    2016-10-01

    Biomimicry refers to the imitation of mechanisms and features found in living creatures using artificial methods. This study used optical lithography, colloidal lithography, and dry etching to mimic the micro/nano hierarchical structures covering the soles of gecko feet. We measured the static contact angle and contact angle hysteresis to reveal the behavior of liquid drops on the hierarchical structures. Pulling tests were also performed to measure the resistance of movement between the hierarchical structures and a testing plate. Our results reveal that hierarchical structures at the micro-/nano-scale are considerably hydrophobic, they provide good flow characteristics, and they generate more contact force than do surfaces with micro-scale cylindrical structures.

  4. Hierarchical modeling and inference in ecology: the analysis of data from populations, metapopulations and communities

    National Research Council Canada - National Science Library

    Royle, J. Andrew; Dorazio, Robert M

    2008-01-01

    "This book describes a general and flexible framework for modeling and inference in ecological systems based on hierarchical modeling in which a strict focus on probability models and parametric inference is adopted...

  5. Nonresident Undergraduates' Performance in English Writing Classes-Hierarchical Linear Modeling Analysis

    National Research Council Canada - National Science Library

    Allison A Vaughn; Matthew Bergman; Barry Fass-Holmes

    2015-01-01

    ...) in the fall term of the five most recent academic years. Hierarchical linear modeling analyses showed that the predictors with the largest effect sizes were English writing programs and class level...

  6. Risk assessment of dengue fever in Zhongshan, China: a time-series regression tree analysis.

    Science.gov (United States)

    Liu, K-K; Wang, T; Huang, X-D; Wang, G-L; Xia, Y; Zhang, Y-T; Jing, Q-L; Huang, J-W; Liu, X-X; Lu, J-H; Hu, W-B

    2017-02-01

    Dengue fever (DF) is the most prevalent and rapidly spreading mosquito-borne disease globally. Control of DF is limited by barriers to vector control and integrated management approaches. This study aimed to explore the potential risk factors for autochthonous DF transmission and to estimate the threshold effects of high-order interactions among risk factors. A time-series regression tree model was applied to estimate the hierarchical relationship between reported autochthonous DF cases and potential risk factors including the timeliness of DF surveillance systems (median time interval between symptom onset date and diagnosis date, MTIOD), mosquito density, imported cases and meteorological factors in Zhongshan, China from 2001 to 2013. We found that MTIOD was the most influential factor in autochthonous DF transmission. The monthly autochthonous DF incidence rate increased by 36.02-fold [relative risk (RR) 36.02, 95% confidence interval (CI) 25.26-46.78, compared to the average DF incidence rate during the study period] when the 2-month lagged moving average of MTIOD was >4.15 days and the 3-month lagged moving average of the mean Breteau Index (BI) was ≥16.57. If the 2-month lagged moving average of MTIOD was between 1.11 and 4.15 days and the monthly maximum diurnal temperature range at a lag of 1 month was <9.6 °C, the monthly mean autochthonous DF incidence rate increased by 14.67-fold (RR 14.67, 95% CI 8.84-20.51, compared to the average DF incidence rate during the study period). This study demonstrates that the timeliness of DF surveillance systems, mosquito density and diurnal temperature range play critical roles in autochthonous DF transmission in Zhongshan. Better assessment and prediction of the risk of DF transmission is beneficial for establishing scientific strategies for DF early warning surveillance and control.

  7. Extreme Rainfall Analysis using Bayesian Hierarchical Modeling in the Willamette River Basin, Oregon

    Science.gov (United States)

    Love, C. A.; Skahill, B. E.; AghaKouchak, A.; Karlovits, G. S.; England, J. F.; Duren, A. M.

    2016-12-01

    We present preliminary results of ongoing research directed at evaluating the worth of including various covariate data to support extreme rainfall analysis in the Willamette River basin using Bayesian hierarchical modeling (BHM). We also compare the BHM-derived extreme rainfall estimates with their counterparts obtained from a traditional regional frequency analysis (RFA) using the same set of rain gage extreme rainfall data. The U.S. Army Corps of Engineers (USACE) Portland District operates thirteen dams in the 11,478-square-mile Willamette River basin (WRB) in northwestern Oregon. The basin's 187-mile-long main stem, the Willamette River, a major tributary of the Columbia River, flows northward between the Coastal and Cascade Ranges. The WRB contains approximately two-thirds of Oregon's population and 20 of the 25 most populous cities in the state. Extreme rainfall estimates are required to support risk-informed hydrologic analyses for these projects as part of the USACE Dam Safety Program. We analyze annual daily rainfall maxima for the WRB utilizing the spatial BHM R package "spatial.gev.bma", which has been shown to be efficient in developing coherent maps of extreme rainfall by return level. Our intent is to profile for the USACE an alternative to the RFA, which was developed in 2008 owing to the lack of an official NOAA Atlas 14 update for the state of Oregon. Unlike RFA, a BHM-based analysis of hydrometeorological extremes can account for non-stationarity while providing robust estimates of uncertainty. BHM also allows for the inclusion of geographical and climatological factors, which we show influence regional rainfall extremes in the WRB. Moreover, the Bayesian framework permits additional data types to be combined into the analysis, for example information derived via elicitation and causal information expansion, both opportunities for future related research.

  8. Hierarchical sliding mode control for under-actuated cranes design, analysis and simulation

    CERN Document Server

    Qian, Dianwei

    2015-01-01

    This book reports on the latest developments in sliding mode overhead crane control, presenting novel research ideas and findings on sliding mode control (SMC), hierarchical SMC and compensator design-based hierarchical sliding mode. The results, which were previously scattered across various journals and conference proceedings, are now presented in a systematic and unified form. The book will be of interest to researchers, engineers and graduate students in control engineering and mechanical engineering who want to learn the methods and applications of SMC.

  9. Comparison of Artificial Neural Networks and Logistic Regression Analysis in the Credit Risk Prediction

    Directory of Open Access Journals (Sweden)

    Hüseyin BUDAK

    2012-11-01

    Full Text Available Credit scoring is a vital topic for banks, since limited financial resources must be used effectively. Banks use several credit scoring methods; one is to estimate whether a credit-demanding customer's repayment will be regular or not. In this study, artificial neural networks and logistic regression analysis were used to support banks' credit risk prediction by estimating whether a credit-demanding customer's repayment will be regular. The results showed that the artificial neural network method is more reliable than logistic regression analysis for this estimation task.
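    A minimal logistic-regression credit-scoring sketch, fit by gradient ascent on synthetic applicant features (the banks' data and the compared network architecture are not reproduced):

    ```python
    # Logistic regression for a binary "regular repayment" outcome, fit by
    # plain gradient ascent on the log-likelihood. Features are synthetic.
    import numpy as np

    rng = np.random.default_rng(7)
    n = 1000
    income = rng.normal(0, 1, n)            # standardized applicant features
    debt_ratio = rng.normal(0, 1, n)
    X = np.column_stack([np.ones(n), income, debt_ratio])

    true_w = np.array([0.5, 1.5, -2.0])     # higher income helps, debt hurts
    p = 1 / (1 + np.exp(-X @ true_w))
    y = (rng.random(n) < p).astype(float)   # 1 = repays regularly

    w = np.zeros(3)
    for _ in range(5000):                   # gradient ascent, fixed step
        grad = X.T @ (y - 1 / (1 + np.exp(-X @ w))) / n
        w += 0.5 * grad

    print(w.round(2))
    ```

    A neural-network scorer replaces the single linear score `X @ w` with a learned nonlinear function, which is the comparison the study makes.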

  10. Analysis of Functional Data with Focus on Multinomial Regression and Multilevel Data

    DEFF Research Database (Denmark)

    Mousavi, Seyed Nourollah

    Functional data analysis (FDA) is a fast growing area in statistical research with an increasingly diverse range of applications in economics, medicine, agriculture, chemometrics, etc. Functional regression is the area of FDA which has received the most attention, in both application...... and methodological development. Our main...... and the prediction of the response at time t depends only on the concurrently observed predictor. We introduce a version of this model for multilevel functional data of the type subject-unit, with the unit-level data being functional observations. Finally, in the fourth paper we show how registration can be applied......

  11. Forecasting municipal solid waste generation using prognostic tools and regression analysis.

    Science.gov (United States)

    Ghinea, Cristina; Drăgoi, Elena Niculina; Comăniţă, Elena-Diana; Gavrilescu, Marius; Câmpean, Teofil; Curteanu, Silvia; Gavrilescu, Maria

    2016-11-01

    For adequate planning of waste management systems, accurate forecasting of waste generation is an essential step, since various factors can affect waste trends. Predictive and prognostic models are useful tools and reliable support for decision-making processes. In this paper, indicators such as number of residents, population age, urban life expectancy and total municipal solid waste were used as input variables in prognostic models in order to predict the amounts of solid waste fractions. We applied the Waste Prognostic Tool, regression analysis and time series analysis to forecast municipal solid waste generation and composition for the Iasi, Romania case study. Regression equations were determined for six solid waste fractions (paper, plastic, metal, glass, biodegradable and other waste). Accuracy measures were calculated, and the results showed that the S-curve trend model is the most suitable for municipal solid waste (MSW) prediction.

  12. Regression analysis of growth responses to water depth in three wetland plant species

    DEFF Research Database (Denmark)

    Sorrell, Brian K; Tanner, Chris C; Brix, Hans

    2012-01-01

    ) differing in depth preferences in wetlands, using non-linear and quantile regression analyses to establish how flooding tolerance can explain field zonation. Methodology Plants were established for 8 months in outdoor cultures in waterlogged soil without standing water, and then randomly allocated to water...... depths from 0 – 0.5 m. Morphological and growth responses to depth were followed for 54 days before harvest, and then analysed by repeated measures analysis of covariance, and non-linear and quantile regression analysis (QRA), to compare flooding tolerances. Principal results Growth responses to depth...... differed between the three species, and were non-linear. P. tenax growth rapidly decreased in standing water > 0.25 m depth, C. secta growth increased initially with depth but then decreased at depths > 0.30 m, accompanied by increased shoot height and decreased shoot density, and T. orientalis...

  13. Regression And Time Series Analysis Of Loan Default At Minescho Cooperative Credit Union Tarkwa

    Directory of Open Access Journals (Sweden)

    Otoo

    2015-08-01

    Full Text Available Abstract Lending in the form of loans is a principal business activity for banks, credit unions and other financial institutions, and loans form a substantial share of a bank's assets. When these loans are defaulted on, however, the effects on the financial institution can be serious. This study sought to determine the trend of, and to forecast, loan default at Minescho Credit Union, Tarkwa. Secondary data from the Credit Union were analyzed using regression analysis and the Box-Jenkins method of time series analysis. The regression analysis showed a moderately strong relationship between the amount of loan default and time, with an increasing trend. The two-year forecast of the amount of loan default oscillated initially and remained constant from 2016 onwards.

  14. Regression Analysis of Right-censored Failure Time Data with Missing Censoring Indicators

    Institute of Scientific and Technical Information of China (English)

    Ping Chen; Ren He; Jun-shan Shen; Jian-guo Sun

    2009-01-01

    This paper discusses regression analysis of right-censored failure time data when censoring indicators are missing for some subjects. Several methods have been developed for the analysis under different situations; in particular, Goetghebeur and Ryan [4] considered the situation where both the failure time and the censoring time follow proportional hazards models marginally, and developed an estimating equation approach. One limitation of their approach is that the two baseline hazard functions were assumed to be proportional to each other. We consider the same problem and present an efficient estimation procedure for the regression parameters that does not require the proportionality assumption. An EM algorithm is developed, and the method is evaluated by a simulation study, which indicates that the proposed methodology performs well in practical situations. An illustrative example is provided.

  15. [Regression analysis of an instrumental conditioned tentacular reflex in the edible snail].

    Science.gov (United States)

    Stepanov, I I; Kuntsevich, S V; Lokhov, M I

    1989-01-01

    Regression analysis showed that the learning curves of the conditioned tentacle reflex can be approximated by an exponential mathematical model. Retention of the reflex persisted for more than three weeks. There were some quantitative differences between conditioning of the right and the left tentacle. During the spring period the reflex was formed within every session but was not retained between sessions. The conditioned tentacle reflex may be employed in neuropharmacological studies.
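    Fitting an exponential model to a learning curve can be sketched with nonlinear least squares; the trial data and parameter values below are invented, not the paper's measurements:

    ```python
    # Exponential approximation of a learning curve: performance rises toward
    # an asymptote at an exponential rate. Synthetic conditioning data.
    import numpy as np
    from scipy.optimize import curve_fit

    def learning_curve(trial, asymptote, scale, rate):
        return asymptote - scale * np.exp(-rate * trial)

    trials = np.arange(1, 21, dtype=float)
    rng = np.random.default_rng(5)
    observed = learning_curve(trials, 0.9, 0.8, 0.3) + rng.normal(0, 0.02, 20)

    popt, _ = curve_fit(learning_curve, trials, observed, p0=(1.0, 1.0, 0.1))
    print(popt.round(3))
    ```

    Comparing the fitted `rate` and `asymptote` between conditions (e.g. right versus left tentacle) is the kind of quantitative contrast such a regression supports.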

  16. FORECASTING THE FINANCIAL RETURNS FOR USING MULTIPLE REGRESSION BASED ON PRINCIPAL COMPONENT ANALYSIS

    Directory of Open Access Journals (Sweden)

    Nop Sopipan

    2013-01-01

    Full Text Available The aim of this study was to forecast the returns of the Stock Exchange of Thailand (SET) Index by adding explanatory variables and a stationary autoregressive process of order p (AR(p)) to the mean equation of returns. In addition, we used Principal Component Analysis (PCA) to remove possible complications caused by multicollinearity. Results showed that the multiple regression based on PCA has the best performance.
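    Principal-component regression of this kind can be sketched in a few lines: extract components from the (collinear) predictors, then regress the response on the leading scores. The data here are synthetic stand-ins for the SET-index variables:

    ```python
    # PCA via SVD of centered predictors, then OLS of the response on the
    # leading component scores. Predictors are deliberately collinear.
    import numpy as np

    rng = np.random.default_rng(11)
    n = 300
    common = rng.normal(0, 1, n)
    # Three highly collinear predictors driven by one common factor.
    X = np.column_stack([common + rng.normal(0, 0.1, n) for _ in range(3)])
    y = 2.0 * common + rng.normal(0, 0.2, n)

    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)      # variance share per component

    k = 1                                    # one component suffices here
    scores = Xc @ Vt[:k].T
    A = np.column_stack([np.ones(n), scores])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)

    resid = y - A @ beta
    r2 = 1 - resid.var() / y.var()
    print(explained.round(3), r2.round(3))
    ```

    Regressing on orthogonal scores rather than the raw predictors sidesteps the unstable coefficients that multicollinearity causes in plain multiple regression.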

  17. Monitoring heavy metal Cr in soil based on hyperspectral data using regression analysis

    Science.gov (United States)

    Zhang, Ningyu; Xu, Fuyun; Zhuang, Shidong; He, Changwei

    2016-10-01

    Heavy metal pollution in soils is one of the most critical problems for global ecology and environmental safety today. Hyperspectral remote sensing offers high speed and low cost with little risk or damage, and provides a good method for detecting heavy metals in soil. This paper proposes monitoring the content of the heavy metal Cr at sample points in soil, for environmental protection, by stepwise multiple regression between the spectral data and the measured Cr content. A FieldSpec HandHeld spectroradiometer was used to collect reflectance spectra of the sample points over the wavelength range 325-1075 nm. The spectral data were then preprocessed to reduce the influence of external factors, using first-order differentiation, second-order differentiation and the continuum removal method. Stepwise multiple regression equations were established for each preprocessing method, and the accuracy of each equation was tested. The results showed that first-order differentiation works best, which makes it feasible to predict the Cr content by stepwise multiple regression.

  18. Robust estimation for homoscedastic regression in the secondary analysis of case-control data

    KAUST Repository

    Wei, Jiawei

    2012-12-04

    Primary analysis of case-control studies focuses on the relationship between disease D and a set of covariates of interest (Y, X). A secondary application of the case-control study, which is often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated owing to the case-control sampling, where the regression of Y on X is different from what it is in the population. Previous work has assumed a parametric distribution for Y given X and derived semiparametric efficient estimation and inference without any distributional assumptions about X. We take up the issue of estimation of a regression function when Y given X follows a homoscedastic regression model, but otherwise the distribution of Y is unspecified. The semiparametric efficient approaches can be used to construct semiparametric efficient estimates, but they suffer from a lack of robustness to the assumed model for Y given X. We take an entirely different approach. We show how to estimate the regression parameters consistently even if the assumed model for Y given X is incorrect, and thus the estimates are model robust. For this we make the assumption that the disease rate is known or well estimated. The assumption can be dropped when the disease is rare, which is typically so for most case-control studies, and the estimation algorithm simplifies. Simulations and empirical examples are used to illustrate the approach.

  19. Estimation of Displacement and Rotation by Magnetic Tactile Sensor Using Stepwise Regression Analysis

    Directory of Open Access Journals (Sweden)

    Hiroyuki Nakamoto

    2014-01-01

    Full Text Available The human body is covered with soft skin containing tactile receptors. The skin deforms along a contact surface, and the tactile receptors detect this mechanical deformation, which is essential for the tactile sensation. We propose a magnetic tactile sensor which has a soft surface and eight magnetoresistive elements. The soft surface contains a permanent magnet, and the magnetoresistive elements under the surface measure the magnetic flux density of the magnet. The tactile sensor estimates the displacement and rotation of the surface from the change in magnetic flux density. Deriving an estimation equation analytically is difficult because the displacement and rotation are not geometrically determined by the magnetic flux density; in this paper, a stepwise regression analysis determines the estimation equations instead. The outputs of the magnetoresistive elements are the explanatory variables, and the three-axis displacement and two-axis rotation are the response variables in the regression analysis. Simulation and experiment confirm that the regression analysis is effective for determining the estimation equations, and the results show that the tactile sensor measures both the displacement and the rotation generated on the surface using the determined equations.
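    Stepwise regression of the kind used here can be sketched as greedy forward selection: repeatedly add the explanatory variable that most reduces the residual sum of squares. The sensor outputs below are synthetic, not the actual magnetoresistive measurements:

    ```python
    # Forward-stepwise selection over eight candidate "element outputs",
    # adding at each step the variable giving the lowest residual sum of
    # squares. Only two synthetic outputs truly drive the response.
    import numpy as np

    def rss(X, y):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta
        return float(r @ r)

    rng = np.random.default_rng(9)
    n, p = 200, 8
    X = rng.normal(0, 1, (n, p))
    y = 3.0 * X[:, 2] - 2.0 * X[:, 5] + rng.normal(0, 0.1, n)

    selected, remaining = [], list(range(p))
    for _ in range(2):              # add two terms for this sketch
        best = min(remaining,
                   key=lambda j: rss(np.column_stack(
                       [np.ones(n)] + [X[:, i] for i in selected + [j]]), y))
        selected.append(best)
        remaining.remove(best)

    print(sorted(selected))
    ```

    A full stepwise procedure would also test whether each added (or previously added) term is statistically worth keeping, e.g. by an F-test, rather than fixing the number of steps.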

  20. Non-Stationary Hydrologic Frequency Analysis using B-Splines Quantile Regression

    Science.gov (United States)

    Nasri, B.; St-Hilaire, A.; Bouezmarni, T.; Ouarda, T.

    2015-12-01

    Hydrologic frequency analysis is commonly used by engineers and hydrologists to provide the basic information for planning, design and management of hydraulic structures and water resources systems under the assumption of stationarity. However, with increasing evidence of a changing climate, it is possible that the assumption of stationarity is no longer valid and the results of conventional analysis become questionable. In this study, we consider a framework for frequency analysis of extreme flows based on B-splines quantile regression, which allows modeling of non-stationary data that depend on covariates; such covariates may have linear or nonlinear dependence. A Markov Chain Monte Carlo (MCMC) algorithm is used to estimate quantiles and their posterior distributions. A coefficient of determination for quantile regression is proposed to evaluate the fit of the proposed model at each quantile level. The method is applied to annual maximum and minimum streamflow records in Ontario, Canada. Climate indices are used to describe the non-stationarity in these variables and to estimate the quantiles. The results show large differences between the non-stationary quantiles and their stationary equivalents for annual maximum and minimum discharge at high annual non-exceedance probabilities. Keywords: quantile regression, B-splines functions, MCMC, streamflow, climate indices, non-stationarity.
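    Setting aside the B-splines basis and MCMC machinery, the essence of quantile regression is minimizing the pinball (check) loss. A linear sketch on synthetic heteroscedastic "streamflow" data (all values invented; a spline basis would replace the linear covariate term in practice):

    ```python
    # Estimate conditional quantiles by minimizing the pinball loss with a
    # general-purpose optimizer, warm-started from the OLS solution.
    import numpy as np
    from scipy.optimize import minimize

    def pinball(beta, X, y, tau):
        u = y - X @ beta
        return np.mean(np.where(u >= 0, tau * u, (tau - 1) * u))

    rng = np.random.default_rng(2)
    n = 500
    covariate = rng.uniform(0, 1, n)            # e.g. a climate index
    X = np.column_stack([np.ones(n), covariate])
    # Heteroscedastic flows: the spread grows with the covariate.
    y = 10 + 5 * covariate + rng.normal(0, 1 + 2 * covariate)

    ols = np.linalg.lstsq(X, y, rcond=None)[0]  # warm start near the optimum

    def fit_quantile(tau):
        res = minimize(pinball, ols, args=(X, y, tau), method="Nelder-Mead",
                       options={"maxiter": 2000, "xatol": 1e-6, "fatol": 1e-9})
        return res.x

    q50, q90 = fit_quantile(0.5), fit_quantile(0.9)
    print(q50.round(2), q90.round(2))
    ```

    Because the spread grows with the covariate, the fitted 0.9-quantile line is steeper than the median line, which is exactly the covariate-dependent behavior a stationary analysis would miss.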

  1. Hierarchical Bayesian analysis of censored microbiological contamination data for use in risk assessment and mitigation.

    Science.gov (United States)

    Busschaert, P; Geeraerd, A H; Uyttendaele, M; Van Impe, J F

    2011-06-01

    Microbiological contamination data are often censored because of the presence of non-detects, or because measurement outcomes are known only to be smaller than, greater than, or between certain boundary values imposed by the laboratory procedures. Therefore, it is not straightforward to fit distributions that summarize contamination data for use in quantitative microbiological risk assessment, especially when variability and uncertainty are to be characterized separately. In this paper, distributions are fit using Bayesian analysis, and the results are compared to those obtained with a methodology based on maximum likelihood estimation and the non-parametric bootstrap. The Bayesian model is also extended hierarchically to estimate the effects of the individual levels of a covariate such as, for example, on a national level, the food processing company where the analyzed food samples were processed, or, on an international level, the geographical origin of the contamination data. Including this extra information allows a risk assessor to differentiate between several scenarios and to increase the specificity of the estimate of the risk of illness, or to compare scenarios to each other. Furthermore, inference is made on the predictive importance of several different covariates while taking uncertainty into account, indicating which covariates are influential factors determining contamination.

  2. Asian pollution climatically modulates mid-latitude cyclones following hierarchical modelling and observational analysis.

    Science.gov (United States)

    Wang, Yuan; Zhang, Renyi; Saravanan, R

    2014-01-01

    Increasing levels of anthropogenic aerosols in Asia have raised considerable concern regarding their potential impact on the global atmosphere, but the magnitude of the associated climate forcing remains to be quantified. Here, using a novel hierarchical modelling approach and observational analysis, we demonstrate that Asian pollution has modulated mid-latitude cyclones over the past three decades. Regional and seasonal simulations using a cloud-resolving model show that Asian pollution invigorates winter cyclones over the northwest Pacific, increasing precipitation by 7% and net cloud radiative forcing by 1.0 W m⁻² at the top of the atmosphere and by 1.7 W m⁻² at the Earth's surface. A global climate model incorporating the diabatic heating anomalies from Asian pollution produces a 9% enhanced transient eddy meridional heat flux and reconciles a decadal variation of mid-latitude cyclones derived from reanalysis data. Our results unambiguously reveal a large impact of the Asian pollutant outflows on the global general circulation and climate.

  3. Hierarchical adaptation scheme for multiagent data fusion and resource management in situation analysis

    Science.gov (United States)

    Benaskeur, Abder R.; Roy, Jean

    2001-08-01

    Sensor Management (SM) has to do with how to best manage, coordinate and organize the use of sensing resources in a manner that synergistically improves the process of data fusion. Based on the contextual information, SM develops options for collecting further information, allocates and directs the sensors towards the achievement of the mission goals and/or tunes the parameters for the real-time improvement of the effectiveness of the sensing process. Conscious of the important role that SM has to play in modern data fusion systems, we are currently studying advanced SM concepts that would help increase the survivability of the current Halifax and Iroquois Class ships, as well as their possible future upgrades. For this purpose, a hierarchical scheme has been proposed for data fusion and resource management adaptation, based on control theory and within the process refinement paradigm of the JDL data fusion model, and taking into account the multi-agent model put forward by the SASS Group for the situation analysis process. The novelty of this work lies in the unified framework that has been defined for tackling the adaptation of both the fusion process and the sensor/weapon management.

  4. Multiple-response regression analysis links magnetic resonance imaging features to de-regulated protein expression and pathway activity in lower grade glioma.

    Science.gov (United States)

    Lehrer, Michael; Bhadra, Anindya; Ravikumar, Visweswaran; Chen, James Y; Wintermark, Max; Hwang, Scott N; Holder, Chad A; Huang, Erich P; Fevrier-Sullivan, Brenda; Freymann, John B; Rao, Arvind

    2017-05-01

    Lower grade gliomas (LGGs), lesions of WHO grades II and III, comprise 10-15% of primary brain tumors. In this first-of-a-kind study, we aim to carry out a radioproteomic characterization of LGGs using proteomics data from the TCGA and imaging data from the TCIA cohorts, to obtain an association between tumor MRI characteristics and protein measurements. The availability of linked imaging and molecular data permits the assessment of relationships between tumor genomic/proteomic measurements with phenotypic features. Multiple-response regression of the image-derived, radiologist scored features with reverse-phase protein array (RPPA) expression levels generated correlation coefficients for each combination of image-feature and protein or phospho-protein in the RPPA dataset. Significantly-associated proteins for VASARI features were analyzed with Ingenuity Pathway Analysis software. Hierarchical clustering of the results of the pathway analysis was used to determine which feature groups were most strongly correlated with pathway activity and cellular functions. The multiple-response regression approach identified multiple proteins associated with each VASARI imaging feature. VASARI features were found to be correlated with expression of IL8, PTEN, PI3K/Akt, Neuregulin, ERK/MAPK, p70S6K and EGF signaling pathways. Radioproteomics analysis might enable an insight into the phenotypic consequences of molecular aberrations in LGGs.

  5. Individual patient data meta-analysis of survival data using Poisson regression models

    Directory of Open Access Journals (Sweden)

    Crowther Michael J

    2012-03-01

    Full Text Available Abstract Background An Individual Patient Data (IPD) meta-analysis is often considered the gold standard for synthesising survival data from clinical trials. An IPD meta-analysis can be achieved by either a two-stage or a one-stage approach, depending on whether the trials are analysed separately or simultaneously. A range of one-stage hierarchical Cox models have been previously proposed, but these are known to be computationally intensive and are not currently available in all standard statistical software. We describe an alternative approach using Poisson based Generalised Linear Models (GLMs). Methods We illustrate, through application and simulation, the Poisson approach both classically and in a Bayesian framework, in two-stage and one-stage approaches. We outline the benefits of our one-stage approach through extension to modelling treatment-covariate interactions and non-proportional hazards. Ten trials of hypertension treatment, with all-cause death the outcome of interest, are used to apply and assess the approach. Results We show that the Poisson approach obtains almost identical estimates to the Cox model, is additionally computationally efficient and directly estimates the baseline hazard. Some downward bias is observed in classical estimates of the heterogeneity in the treatment effect, with improved performance from the Bayesian approach. Conclusion Our approach provides a highly flexible and computationally efficient framework, available in all standard statistical software, for investigating not only heterogeneity but also non-proportional hazards and treatment effect modifiers.
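
    The equivalence exploited above rests on the fact that, with log person-time as an offset, a Poisson GLM for event counts estimates the same log hazard ratio as an exponential survival model. A minimal single-trial sketch on simulated data (not the authors' ten-trial hypertension dataset):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 2000
trt = rng.integers(0, 2, n)                     # treatment indicator
true_loghr = -0.4
rate = 0.1 * np.exp(true_loghr * trt)           # constant hazard per arm
t_event = rng.exponential(1.0 / rate)
t_cens = rng.uniform(0.0, 20.0, n)              # administrative censoring, illustrative
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(float)

def nll(beta):
    # Poisson log-likelihood with log person-time as offset:
    # events ~ Poisson(time * exp(b0 + b1 * trt))
    log_mu = np.log(time) + beta[0] + beta[1] * trt
    return -(event * log_mu - np.exp(log_mu)).sum()

b = minimize(nll, x0=[0.0, 0.0], method="Nelder-Mead").x
log_hr_hat = b[1]

# Closed-form check: log of the event-rate ratio (events per person-time).
d1, T1 = event[trt == 1].sum(), time[trt == 1].sum()
d0, T0 = event[trt == 0].sum(), time[trt == 0].sum()
closed_form = np.log(d1 / T1) - np.log(d0 / T0)
```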

  6. Assessing Credit Default using Logistic Regression and Multiple Discriminant Analysis: Empirical Evidence from Bosnia and Herzegovina

    Directory of Open Access Journals (Sweden)

    Deni Memić

    2015-01-01

    Full Text Available This article aims to assess credit default prediction on the banking market in Bosnia and Herzegovina nationwide as well as in its constitutional entities (Federation of Bosnia and Herzegovina and Republika Srpska). The ability to classify companies into different predefined groups, or to find an appropriate tool that could replace human assessment in classifying companies into good and bad buckets, has long been one of the main interests of risk management researchers. We investigated the possibility and accuracy of default prediction using the traditional statistical methods of logistic regression (logit) and multiple discriminant analysis (MDA) and compared their predictive abilities. The results show that the created models have high predictive ability. For the logit models, some variables are more influential on default prediction than others. Return on assets (ROA) is statistically significant in all four periods prior to default, having very high regression coefficients, or a high impact on the model's ability to predict default. Similar results are obtained for the MDA models. It is also found that predictive ability differs between logistic regression and multiple discriminant analysis.

  7. Correlation Study and Regression Analysis of Drinking Water Quality in Kashan City, Iran

    Directory of Open Access Journals (Sweden)

    Mohammad Mehdi HEYDARI

    2013-06-01

    Full Text Available Chemical and statistical regression analysis of drinking water samples at five fields (21 sampling wells) in the hot, dry climate of Kashan city, central Iran, was carried out. Samples were collected from October 2006 to May 2007 (25-30 °C). Comparing the results with the drinking water quality standards issued by the World Health Organization (WHO), it is found that some of the water samples are not potable. Hydrochemical facies determined using a Piper diagram indicate that in most parts of the city the chemical character of the water is dominated by NaCl. All samples showed sulfate and sodium ion contents higher, and K+ and F- contents lower, than the permissible limits. A strongly positive correlation is observed between TDS and EC (R = 0.995) and between Ca2+ and TH (R = 0.948). The results showed that several regression relations share the same correlation coefficients: (I) pH-TH and EC-TH (R = 0.520); (II) NO3(-)-pH and TH-pH (R = 0.520); (III) Ca2+-SO4(2-), TH-SO4(2-) and Cl(-)-SO4(2-) (R = 0.630). The results revealed that systematic calculation of correlation coefficients between water parameters, together with regression analysis, provides a useful means for rapid monitoring of water quality.

  8. Multichannel biomedical time series clustering via hierarchical probabilistic latent semantic analysis.

    Science.gov (United States)

    Wang, Jin; Sun, Xiangping; Nahavandi, Saeid; Kouzani, Abbas; Wu, Yuchuan; She, Mary

    2014-11-01

    Biomedical time series clustering that automatically groups a collection of time series according to their internal similarity is of importance for medical record management and inspection such as bio-signals archiving and retrieval. In this paper, a novel framework that automatically groups a set of unlabelled multichannel biomedical time series according to their internal structural similarity is proposed. Specifically, we treat a multichannel biomedical time series as a document and extract local segments from the time series as words. We extend a topic model, i.e., the Hierarchical probabilistic Latent Semantic Analysis (H-pLSA), which was originally developed for visual motion analysis to cluster a set of unlabelled multichannel time series. The H-pLSA models each channel of the multichannel time series using a local pLSA in the first layer. The topics learned in the local pLSA are then fed to a global pLSA in the second layer to discover the categories of multichannel time series. Experiments on a dataset extracted from multichannel Electrocardiography (ECG) signals demonstrate that the proposed method performs better than previous state-of-the-art approaches and is relatively robust to the variations of parameters including length of local segments and dictionary size. Although the experimental evaluation used the multichannel ECG signals in a biometric scenario, the proposed algorithm is a universal framework for multichannel biomedical time series clustering according to their structural similarity, which has many applications in biomedical time series management. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  9. Analysis of ontogenetic spectra of populations of plants and lichens via ordinal regression

    Science.gov (United States)

    Sofronov, G. Yu.; Glotov, N. V.; Ivanov, S. M.

    2015-03-01

    Ontogenetic spectra of plants and lichens tend to vary across the populations. This means that if several subsamples within a sample (or a population) were collected, then the subsamples would not be homogeneous. Consequently, the statistical analysis of the aggregated data would not be correct, which could potentially lead to false biological conclusions. In order to take into account the heterogeneity of the subsamples, we propose to use ordinal regression, which is a type of generalized linear regression. In this paper, we study the populations of cowberry Vaccinium vitis-idaea L. and epiphytic lichens Hypogymnia physodes (L.) Nyl. and Pseudevernia furfuracea (L.) Zopf. We obtain estimates for the proportions of between-sample variability in the total variability of the ontogenetic spectra of the populations.

  10. The Effects of Agricultural Informatization on Agricultural Economic Growth: An Empirical Analysis Based on Regression Model

    Institute of Scientific and Technical Information of China (English)

    Lingling; TAN

    2013-01-01

    This article selects major factors influencing agricultural economic growth, such as labor, capital input, farmland area, fertilizer input and information input. Information input is in turn described by factors such as the number of websites owned; the types of books, magazines and newspapers published; the number of telephones per 100 households; the number of home computers per 100 households; farmers' spending on transportation and communication, culture, education, entertainment and services; and the total number of agricultural science and technology service personnel. Using a regression model, this article conducts regression analysis of cross-section data on 31 provinces, autonomous regions and municipalities in 2010. The results show that the building of information infrastructure, the use of means of information, and the popularization and promotion of knowledge of agricultural science and technology play an important role in promoting agricultural economic growth.

  11. Sub-pixel estimation of tree cover and bare surface densities using regression tree analysis

    Directory of Open Access Journals (Sweden)

    Carlos Augusto Zangrando Toneli

    2011-09-01

    Full Text Available Sub-pixel analysis is capable of generating continuous fields, which represent the spatial variability of certain thematic classes. The aim of this work was to develop numerical models to represent the variability of tree cover and bare surfaces within the study area. This research was conducted in the riparian buffer within a watershed of the São Francisco River in the North of Minas Gerais, Brazil. IKONOS and Landsat TM imagery were used with the GUIDE algorithm to construct the models. The results were two index images derived with regression trees for the entire study area, one representing tree cover and the other representing bare surface. The use of non-parametric and non-linear regression tree models presented satisfactory results to characterize wetland, deciduous and savanna patterns of forest formation.

  12. THE PROGNOSIS OF RUSSIAN DEFENSE INDUSTRY DEVELOPMENT IMPLEMENTED THROUGH REGRESSION ANALYSIS

    Directory of Open Access Journals (Sweden)

    L.M. Kapustina

    2007-03-01

    Full Text Available The article presents the results of an investigation of the major internal and external factors which influence the development of the defense industry, as well as the results of a regression analysis which quantifies each factor's contribution to the growth rate of the Russian defense industry. On the basis of the calculated regression dependences, the authors produced a medium-term forecast for the defense industry. Optimistic and inertial versions of the defense product growth rate for the period up to 2009 are based on scenario conditions for the Russian economy worked out by the Ministry of Economy and Development. In conclusion, the authors point out which factors and conditions have the largest impact on the successful and stable operation of the Russian defense industry.

  13. Predicting electricity energy consumption: A comparison of regression analysis, decision tree and neural networks

    Energy Technology Data Exchange (ETDEWEB)

    Tso, Geoffrey K.F.; Yau, Kelvin K.W. [City University of Hong Kong, Kowloon, Hong Kong (China). Department of Management Sciences

    2007-09-15

    This study presents three modeling techniques for the prediction of electricity energy consumption. In addition to the traditional regression analysis, decision tree and neural networks are considered. Model selection is based on the square root of average squared error. In an empirical application to an electricity energy consumption study, the decision tree and neural network models appear to be viable alternatives to the stepwise regression model in understanding energy consumption patterns and predicting energy consumption levels. With the emergence of the data mining approach for predictive modeling, different types of models can be built in a unified platform: to implement various modeling techniques, assess the performance of different models and select the most appropriate model for future prediction. (author)

  14. COLOR IMAGE RETRIEVAL BASED ON FEATURE FUSION THROUGH MULTIPLE LINEAR REGRESSION ANALYSIS

    Directory of Open Access Journals (Sweden)

    K. Seetharaman

    2015-08-01

    Full Text Available This paper proposes a novel technique based on feature fusion using multiple linear regression analysis, and the least-squares estimation method is employed to estimate the parameters. The given input query image is segmented into various regions according to the structure of the image. The color and texture features are extracted from each region of the query image, and the features are fused together using the multiple linear regression model. The estimated parameters of the model, which is built on the features, form a vector called a feature vector. The Canberra distance measure is adopted to compare the feature vectors of the query and target images. The F-measure is applied to evaluate the performance of the proposed technique. The results show that the proposed technique is comparable to existing techniques.

  15. A refined method for multivariate meta-analysis and meta-regression.

    Science.gov (United States)

    Jackson, Daniel; Riley, Richard D

    2014-02-20

    Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.
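
    The refined univariate method described, a scaling factor applied to the pooled estimate's standard error, resembles the Hartung-Knapp-type adjustment; the sketch below applies such a scaling on top of a DerSimonian-Laird random-effects fit. The effect sizes and variances are invented for illustration, not taken from the paper's examples.

```python
import numpy as np

# Illustrative study-level effect estimates (e.g. log odds ratios) and within-study variances.
y = np.array([0.30, 0.10, 0.45, -0.05, 0.25, 0.15])
v = np.array([0.010, 0.020, 0.015, 0.030, 0.012, 0.025])
k = len(y)

# DerSimonian-Laird estimate of the between-study variance tau^2.
w_fixed = 1.0 / v
y_fixed = np.sum(w_fixed * y) / np.sum(w_fixed)
Q = np.sum(w_fixed * (y - y_fixed) ** 2)
c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max(0.0, (Q - (k - 1)) / c)

# Random-effects pooled estimate and its conventional standard error.
w = 1.0 / (v + tau2)
mu = np.sum(w * y) / np.sum(w)
se_naive = np.sqrt(1.0 / np.sum(w))

# Hartung-Knapp-style scaling factor applied to the standard error.
q_scale = np.sum(w * (y - mu) ** 2) / (k - 1)
se_refined = se_naive * np.sqrt(q_scale)
```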

  16. Implementing informative priors for heterogeneity in meta-analysis using meta-regression and pseudo data.

    Science.gov (United States)

    Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T

    2016-12-20

    Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  17. Hierarchical cluster-tendency analysis of the group structure in the foreign exchange market

    Science.gov (United States)

    Wu, Xin-Ye; Zheng, Zhi-Gang

    2013-08-01

    A hierarchical cluster-tendency (HCT) method in analyzing the group structure of networks of the global foreign exchange (FX) market is proposed by combining the advantages of both the minimal spanning tree (MST) and the hierarchical tree (HT). Fifty currencies of the top 50 World GDP in 2010 according to World Bank's database are chosen as the underlying system. By using the HCT method, all nodes in the FX market network can be "colored" and distinguished. We reveal that the FX networks can be divided into two groups, i.e., the Asia-Pacific group and the Pan-European group. The results given by the hierarchical cluster-tendency method agree well with the formerly observed geographical aggregation behavior in the FX market. Moreover, an oil-resource aggregation phenomenon is discovered by using our method. We find that gold could be a better numeraire for the weekly-frequency FX data.

  18. Multi-scaling hierarchical structure analysis on the sequence of E. coli complete genome

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    We have applied the newly developed hierarchical structure theory for complex systems to analyze the multi-scaling structures of the nucleotide density distribution along a linear DNA sequence from the complete Escherichia coli genome. The hierarchical symmetry in the nucleotide density distribution was demonstrated. In particular, we have shown that the G, C density distribution that represents a strong H-bonding between the two DNA chains is more coherent with smaller similarity parameter compared to that of A, T density distribution, indicating a better organized multi-scaling fluctuation field for G, C density distribution along the genome sequence. The biological significance of these findings is under investigation.

  19. Comparison of various texture classification methods using multiresolution analysis and linear regression modelling.

    Science.gov (United States)

    Dhanya, S; Kumari Roshni, V S

    2016-01-01

    Textures play an important role in image classification. This paper proposes a high performance texture classification method using a combination of multiresolution analysis tool and linear regression modelling by channel elimination. The correlation between different frequency regions has been validated as a sort of effective texture characteristic. This method is motivated by the observation that there exists a distinctive correlation between the image samples belonging to the same kind of texture, at different frequency regions obtained by a wavelet transform. Experimentally, it is observed that this correlation differs across textures. The linear regression modelling is employed to analyze this correlation and extract texture features that characterize the samples. Our method considers not only the frequency regions but also the correlation between these regions. This paper primarily focuses on applying the Dual Tree Complex Wavelet Packet Transform and the Linear Regression model for classification of the obtained texture features. Additionally the paper also presents a comparative assessment of the classification results obtained from the above method with two more types of wavelet transform methods namely the Discrete Wavelet Transform and the Discrete Wavelet Packet Transform.

  20. Bayesian Nonparametric Regression Analysis of Data with Random Effects Covariates from Longitudinal Measurements

    KAUST Repository

    Ryu, Duchwan

    2010-09-28

    We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.
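
    A rough flavor of spline-based smoothing of a nonlinear covariate effect: the sketch below uses simple penalized least squares on a truncated-power basis with synthetic data, rather than the paper's Bayesian P-splines, MCMC sampling, or longitudinal random-effects structure.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x = np.sort(rng.uniform(0.0, 1.0, n))
f_true = np.sin(2.0 * np.pi * x)                 # smooth nonlinear effect to recover
y = f_true + rng.normal(0.0, 0.3, n)

# Truncated-power cubic basis with interior knots (a simple stand-in for B-splines).
knots = np.linspace(0.1, 0.9, 9)
B = np.column_stack([np.ones(n), x, x ** 2, x ** 3]
                    + [np.clip(x - k_, 0.0, None) ** 3 for k_ in knots])

# Ridge penalty on the knot coefficients only; the cubic polynomial part is unpenalized.
lam = 1e-3
P = np.eye(B.shape[1])
P[:4, :4] = 0.0
coef = np.linalg.solve(B.T @ B + lam * P, B.T @ y)
fit = B @ coef
rmse = np.sqrt(np.mean((fit - f_true) ** 2))     # error against the true curve
```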

  2. Evaluation of a LASSO regression approach on the unrelated samples of Genetic Analysis Workshop 17.

    Science.gov (United States)

    Guo, Wei; Elston, Robert C; Zhu, Xiaofeng

    2011-11-29

    The Genetic Analysis Workshop 17 data we used comprise 697 unrelated individuals genotyped at 24,487 single-nucleotide polymorphisms (SNPs) from a mini-exome scan, using real sequence data for 3,205 genes annotated by the 1000 Genomes Project and simulated phenotypes. We studied 200 sets of simulated phenotypes of trait Q2. An important feature of this data set is that most SNPs are rare, with 87% of the SNPs having a minor allele frequency less than 0.05. For rare SNP detection, in this study we performed a least absolute shrinkage and selection operator (LASSO) regression and F tests at the gene level and calculated the generalized degrees of freedom to avoid any selection bias. For comparison, we also carried out linear regression and the collapsing method, which sums the rare SNPs, modified for a quantitative trait and with two different allele frequency thresholds. The aim of this paper is to evaluate these four approaches in this mini-exome data and compare their performance in terms of power and false positive rates. In most situations the LASSO approach is more powerful than linear regression and collapsing methods. We also note the difficulty in determining the optimal threshold for the collapsing method and the significant role that linkage disequilibrium plays in detecting rare causal SNPs. If a rare causal SNP is in strong linkage disequilibrium with a common marker in the same gene, power will be much improved.
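
    The LASSO estimator at the heart of this comparison solves a least-squares problem with an L1 penalty, which shrinks most coefficients exactly to zero. A self-contained cyclic coordinate-descent sketch on synthetic genotype-like data (not the GAW17 mini-exome data; the penalty value and dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(11)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]                 # three "causal" predictors
y = X @ beta_true + rng.normal(0.0, 1.0, n)

def lasso_cd(X, y, lam, n_sweeps=200):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                                  # running residual y - Xb
    for _ in range(n_sweeps):
        for j in range(p):
            # Partial correlation of feature j with the residual (plus its own term back).
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new_bj = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (b[j] - new_bj)        # update residual incrementally
            b[j] = new_bj
    return b

b_hat = lasso_cd(X, y, lam=0.1)
```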

  3. A comparison of ordinal regression models in an analysis of factors associated with periodontal disease

    Directory of Open Access Journals (Sweden)

    Javali Shivalingappa

    2010-01-01

    Full Text Available Aim: The study aimed to determine the factors associated with periodontal disease (at different levels of severity) by using different regression models for ordinal data. Design: A cross-sectional design was employed using clinical examination and a 'questionnaire with interview' method. Materials and Methods: The study was conducted from June 2008 to October 2008 in Dharwad, Karnataka, India. It involved a systematic random sample of 1760 individuals aged 18-40 years. The periodontal disease examination was conducted using the Community Periodontal Index for Treatment Needs (CPITN). Statistical Analysis Used: Regression models for ordinal data with different built-in link functions were used to determine factors associated with periodontal disease. Results: The study findings indicated that the ordinal regression models with four built-in link functions (logit, probit, clog-log and nlog-log) displayed similar results, with negligible differences in the significant factors associated with periodontal disease. Factors such as religion, caste, source of drinking water, timing of sweet consumption, timing of cleaning or brushing the teeth, and materials used for brushing the teeth were significantly associated with periodontal disease in all ordinal models. Conclusions: The ordinal regression model with the clog-log link is a better fit for determining significant factors associated with periodontal disease than models with logit, probit and nlog-log built-in link functions. Factors such as caste and timing of sweet consumption are negatively associated with periodontal disease, while religion, source of drinking water, timing of cleaning or brushing the teeth, and materials used for brushing the teeth are significantly and positively associated with periodontal disease.
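
    The cumulative-link models compared above can all be written as P(Y <= k | x) = g(theta_k - x*beta) for an ordered outcome, with g determined by the link. A minimal maximum-likelihood sketch with the logit link on simulated data (three ordered categories, one covariate; not the study's CPITN data):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n = 1000
x = rng.normal(size=n)
# Latent-variable generation: higher x pushes the outcome into higher categories.
latent = 1.2 * x + rng.logistic(size=n)
ycat = np.digitize(latent, [-1.0, 1.0])          # ordered categories 0 < 1 < 2

def nll(params):
    """Negative log-likelihood of the proportional-odds (cumulative logit) model."""
    beta, t0, log_gap = params
    t1 = t0 + np.exp(log_gap)                    # enforce ordered thresholds t0 < t1
    eta = beta * x
    p0 = 1.0 / (1.0 + np.exp(-(t0 - eta)))       # P(Y <= 0)
    p1 = 1.0 / (1.0 + np.exp(-(t1 - eta)))       # P(Y <= 1)
    probs = np.column_stack([p0, p1 - p0, 1.0 - p1])
    return -np.log(probs[np.arange(n), ycat] + 1e-12).sum()

fit = minimize(nll, x0=[0.0, -0.5, 0.0], method="Nelder-Mead").x
beta_hat, t0_hat = fit[0], fit[1]
```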

  4. Analysis of sparse data in logistic regression in medical research: A newer approach

    Directory of Open Access Journals (Sweden)

    S Devika

    2016-01-01

    Full Text Available Background and Objective: In the analysis of dichotomous type response variable, logistic regression is usually used. However, the performance of logistic regression in the presence of sparse data is questionable. In such a situation, a common problem is the presence of high odds ratios (ORs with very wide 95% confidence interval (CI (OR: >999.999, 95% CI: 999.999. In this paper, we addressed this issue by using penalized logistic regression (PLR method. Materials and Methods: Data from case-control study on hyponatremia and hiccups conducted in Christian Medical College, Vellore, Tamil Nadu, India was used. The outcome variable was the presence/absence of hiccups and the main exposure variable was the status of hyponatremia. Simulation dataset was created with different sample sizes and with a different number of covariates. Results: A total of 23 cases and 50 controls were used for the analysis of ordinary and PLR methods. The main exposure variable hyponatremia was present in nine (39.13% of the cases and in four (8.0% of the controls. Of the 23 hiccup cases, all were males and among the controls, 46 (92.0% were males. Thus, the complete separation between gender and the disease group led into an infinite OR with 95% CI (OR: >999.999, 95% CI: 999.999 whereas there was a finite and consistent regression coefficient for gender (OR: 5.35; 95% CI: 0.42, 816.48 using PLR. After adjusting for all the confounding variables, hyponatremia entailed 7.9 (95% CI: 2.06, 38.86 times higher risk for the development of hiccups as was found using PLR whereas there was an overestimation of risk OR: 10.76 (95% CI: 2.17, 53.41 using the conventional method. Simulation experiment shows that the estimated coverage probability of this method is near the nominal level of 95% even for small sample sizes and for a large number of covariates. 
Conclusions: PLR is almost equal to the ordinary logistic regression when the sample size is large and is superior in small cell
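
    The separation problem this record describes is easy to reproduce. The sketch below is a minimal stand-in, not the paper's Firth-type PLR: it adds a simple L2 (ridge) penalty to the logistic log-likelihood and fits by Newton's method, which likewise keeps coefficients finite when a covariate perfectly separates the outcome. All data are toy values.

```python
import numpy as np

def ridge_logistic(X, y, lam=1.0, n_iter=50):
    """L2-penalized logistic regression fitted by Newton's method (IRLS).
    The penalty keeps coefficients finite even under complete separation."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))           # predicted probabilities
        W = mu * (1.0 - mu)                        # IRLS weights
        grad = X.T @ (y - mu) - lam * beta         # penalized score
        hess = X.T @ (X * W[:, None]) + lam * np.eye(p)
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Completely separated toy data: the covariate perfectly predicts y,
# so the unpenalized MLE diverges, but the penalized fit stays finite.
X = np.column_stack([np.ones(6), [-3., -2., -1., 1., 2., 3.]])
y = np.array([0., 0., 0., 1., 1., 1.])
beta = ridge_logistic(X, y, lam=0.5)
print(beta)  # finite intercept and slope despite complete separation
```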

  5. FREQFIT: Computer program which performs numerical regression and statistical chi-squared goodness of fit analysis

    Energy Technology Data Exchange (ETDEWEB)

    Hofland, G.S.; Barton, C.C.

    1990-10-01

    The computer program FREQFIT is designed to perform regression and statistical chi-squared goodness of fit analysis on one-dimensional or two-dimensional data. The program features an interactive user dialogue, numerous help messages, an option for screen or line printer output, and the flexibility to use practically any commercially available graphics package to create plots of the program's results. FREQFIT is written in Microsoft QuickBASIC, for IBM-PC compatible computers. A listing of the QuickBASIC source code for the FREQFIT program, a user manual, and sample input data, output, and plots are included. 6 refs., 1 fig.

  6. Application of nonlinear regression analysis for ammonium exchange by natural (Bigadic) clinoptilolite

    Energy Technology Data Exchange (ETDEWEB)

    Gunay, Ahmet [Deparment of Environmental Engineering, Faculty of Engineering and Architecture, Balikesir University (Turkey)], E-mail: ahmetgunay2@gmail.com

    2007-09-30

    The experimental data of ammonium exchange by natural Bigadic clinoptilolite were evaluated using nonlinear regression analysis. Three two-parameter isotherm models (Langmuir, Freundlich and Temkin) and three three-parameter isotherm models (Redlich-Peterson, Sips and Khan) were used to analyse the equilibrium data. The fit of the isotherm models was assessed using the standard normalization error (SNE) procedure and the coefficient of determination (R{sup 2}). The HYBRID error function provided the lowest sum of normalized errors, and the Khan model performed best in modeling the equilibrium data. Thermodynamic investigation indicated that ammonium removal by clinoptilolite was favorable at lower temperatures and exothermic in nature.
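
    Nonlinear isotherm fitting of the kind this record describes can be sketched with `scipy.optimize.curve_fit`. Below, the two-parameter Langmuir model is fitted to synthetic equilibrium data; the concentrations, parameters and noise level are invented for illustration, not the Bigadic measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def langmuir(Ce, qmax, KL):
    """Langmuir isotherm: q = qmax * KL * Ce / (1 + KL * Ce)."""
    return qmax * KL * Ce / (1.0 + KL * Ce)

# Synthetic equilibrium data (illustrative only).
rng = np.random.default_rng(0)
Ce = np.linspace(0.5, 50, 12)                   # equilibrium concentration
q_obs = langmuir(Ce, qmax=18.0, KL=0.25) + rng.normal(0, 0.2, Ce.size)

# Nonlinear least-squares fit with a rough starting guess.
(qmax_hat, KL_hat), _ = curve_fit(langmuir, Ce, q_obs, p0=[10.0, 0.1])

# Coefficient of determination as a goodness-of-fit summary.
ss_res = np.sum((q_obs - langmuir(Ce, qmax_hat, KL_hat)) ** 2)
ss_tot = np.sum((q_obs - q_obs.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(qmax_hat, KL_hat, r2)
```

    The same pattern extends to the three-parameter models (Redlich-Peterson, Sips, Khan) by swapping in the corresponding model function.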

  7. Classification of Error-Diffused Halftone Images Based on Spectral Regression Kernel Discriminant Analysis

    Directory of Open Access Journals (Sweden)

    Zhigao Zeng

    2016-01-01

    Full Text Available This paper proposes a novel algorithm to solve the challenging problem of classifying error-diffused halftone images. We first design the class feature matrices, after extracting the image patches according to their statistical characteristics, to classify the error-diffused halftone images. Then, spectral regression kernel discriminant analysis is used for feature dimension reduction. The error-diffused halftone images are finally classified using an idea similar to the nearest centroids classifier. As demonstrated by the experimental results, our method is fast and can achieve a high classification accuracy rate with an added benefit of robustness in tackling noise.

  8. A systematic review and meta-regression analysis of mivacurium for tracheal intubation

    OpenAIRE

    Vanlinthout, L.E.H.; Mesfin, S.H.; Hens, Niel; Vanacker, B. F.; Robertson, E. N.; Booij, L. H. D. J.

    2014-01-01

    We systematically reviewed factors associated with intubation conditions in randomised controlled trials of mivacurium, using random-effects meta-regression analysis. We included 29 studies of 1050 healthy participants. Four factors explained 72.9% of the variation in the probability of excellent intubation conditions: mivacurium dose, 24.4%; opioid use, 29.9%; time to intubation and age together, 18.6%. The odds ratio (95% CI) for excellent intubation was 3.14 (1.65–5.73) for doubling the mi...

  9. Prediction of cavity growth rate during underground coal gasification using multiple regression analysis

    Institute of Scientific and Technical Information of China (English)

    Mehdi Najafi; Seyed Mohammad Esmaiel Jalali; Reza KhaloKakaie; Farrokh Forouhandeh

    2015-01-01

    During underground coal gasification (UCG), whereby coal is converted to syngas in situ, a cavity is formed in the coal seam. The cavity growth rate (CGR) or the moving rate of the gasification face is affected by controllable (operation pressure, gasification time, geometry of UCG panel) and uncontrollable (coal seam properties) factors. The CGR is usually predicted by mathematical models and laboratory experiments, which are time consuming, cumbersome and expensive. In this paper, a new simple model for CGR is developed using non-linear regression analysis, based on data from 11 UCG field trials. The empirical model compares satisfactorily with Perkins model and can reliably predict CGR.
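
    An empirical model of this kind is often posited as a power law and estimated by ordinary multiple linear regression after a log transform. The sketch below is purely illustrative: the model form, variable choices and coefficients are assumptions, not those of the paper or the 11 field trials.

```python
import numpy as np

# Hypothetical power-law model CGR = a * P^b * t^c. Taking logs turns it
# into a multiple linear regression: log CGR = log a + b*log P + c*log t.
rng = np.random.default_rng(1)
P = rng.uniform(1.0, 10.0, 30)     # operating pressure (illustrative units)
t = rng.uniform(5.0, 100.0, 30)    # gasification time
cgr = 0.5 * P**0.8 * t**-0.3 * np.exp(rng.normal(0, 0.05, 30))

X = np.column_stack([np.ones(30), np.log(P), np.log(t)])
coef, *_ = np.linalg.lstsq(X, np.log(cgr), rcond=None)
a, b, c = np.exp(coef[0]), coef[1], coef[2]
print(a, b, c)  # estimates of the assumed power-law parameters
```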

  11. An Econometric Analysis of Modulated Realised Covariance, Regression and Correlation in Noisy Diffusion Models

    DEFF Research Database (Denmark)

    Kinnebrock, Silja; Podolskij, Mark

    This paper introduces a new estimator to measure the ex-post covariation between high-frequency financial time series under market microstructure noise. We provide an asymptotic limit theory (including feasible central limit theorems) for standard methods such as regression, correlation analysis and covariance, for which we obtain the optimal rate of convergence. We demonstrate some positive semidefinite estimators of the covariation and construct a positive semidefinite estimator of the conditional covariance matrix in the central limit theorem. Furthermore, we indicate how the assumptions on the noise...

  12. Regression analysis as an objective tool of economic management of rolling mill

    Directory of Open Access Journals (Sweden)

    Š. Vilamová

    2015-07-01

    Full Text Available The ability to optimize costs plays a key role in maintaining competitiveness of the company, because without detailed knowledge of costs, companies are not able to make the right decisions that will ensure their long-term growth. The aim of this article is to outline the problematic areas related to company costs and to contribute to a debate on the method used to determine the amount of fixed and variable costs, their monitoring and follow-up control. This article presents a potential use of regression analysis as an objective tool of economic management in metallurgical companies, as these companies have several specific features

  14. Estimating the causes of traffic accidents using logistic regression and discriminant analysis.

    Science.gov (United States)

    Karacasu, Murat; Ergül, Barış; Altin Yavuz, Arzu

    2014-01-01

    Factors that affect traffic accidents have been analysed in various ways. In this study, we use the methods of logistic regression and discriminant analysis to determine the damages due to injury and non-injury accidents in the Eskisehir Province. Data were obtained from the accident reports of the General Directorate of Security in Eskisehir; 2552 traffic accidents between January and December 2009 were investigated regarding whether they resulted in injury. According to the results, the effects of traffic accidents were reflected in the variables. These results provide a wealth of information that may aid future measures toward the prevention of undesired results.
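
    Of the two methods this record compares, linear discriminant analysis is the easier to sketch from first principles. Below is a minimal two-class LDA with a pooled covariance matrix and equal priors, run on synthetic data with no relation to the Eskisehir accident records.

```python
import numpy as np

def lda_predict(X, y, Xnew):
    """Two-class linear discriminant analysis: classify by projecting onto
    the discriminant direction w = Sp^-1 (m1 - m0), assuming equal priors."""
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    X0c, X1c = X[y == 0] - m0, X[y == 1] - m1
    Sp = (X0c.T @ X0c + X1c.T @ X1c) / (len(X) - 2)   # pooled covariance
    w = np.linalg.solve(Sp, m1 - m0)                   # discriminant direction
    thresh = w @ (m0 + m1) / 2.0                       # midpoint cutoff
    return (Xnew @ w > thresh).astype(int)

# Synthetic "injury" vs "non-injury" classes with separated means.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 1, (50, 2)), rng.normal(2.5, 1, (50, 2))])
y = np.repeat([0, 1], 50)
acc = (lda_predict(X, y, X) == y).mean()
print(acc)  # training accuracy on well-separated classes
```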

  15. Neck-focused panic attacks among Cambodian refugees; a logistic and linear regression analysis.

    Science.gov (United States)

    Hinton, Devon E; Chhean, Dara; Pich, Vuth; Um, Khin; Fama, Jeanne M; Pollack, Mark H

    2006-01-01

    Consecutive Cambodian refugees attending a psychiatric clinic were assessed for the presence and severity of current--i.e., at least one episode in the last month--neck-focused panic. Among the whole sample (N=130), in a logistic regression analysis, the Anxiety Sensitivity Index (ASI; odds ratio=3.70) and the Clinician-Administered PTSD Scale (CAPS; odds ratio=2.61) significantly predicted the presence of current neck panic (NP). Among the neck panic patients (N=60), in the linear regression analysis, NP severity was significantly predicted by NP-associated flashbacks (beta=.42), NP-associated catastrophic cognitions (beta=.22), and CAPS score (beta=.28). Further analysis revealed the effect of the CAPS score to be significantly mediated (Sobel test [Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182]) by both NP-associated flashbacks and catastrophic cognitions. In the care of traumatized Cambodian refugees, NP severity, as well as NP-associated flashbacks and catastrophic cognitions, should be specifically assessed and treated.
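
    The Sobel test cited in this mediation analysis has a closed form: for an indirect effect a*b (a the predictor-to-mediator path, b the mediator-to-outcome path), z = ab / sqrt(b^2 SE_a^2 + a^2 SE_b^2). A sketch with invented path estimates, not the paper's coefficients:

```python
import math

def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic and two-sided normal p-value for the indirect
    (mediated) effect a*b."""
    z = (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p

# Illustrative path estimates and standard errors (hypothetical values).
z, p = sobel_test(a=0.42, se_a=0.10, b=0.35, se_b=0.09)
print(z, p)
```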

  16. Surface Roughness Prediction Model using Zirconia Toughened Alumina (ZTA) Turning Inserts: Taguchi Method and Regression Analysis

    Science.gov (United States)

    Mandal, Nilrudra; Doloi, Biswanath; Mondal, Biswanath

    2016-01-01

    In the present study, an attempt has been made to apply the Taguchi parameter design method and regression analysis to optimize the cutting conditions for surface finish while machining AISI 4340 steel with the help of the newly developed yttria-based Zirconia Toughened Alumina (ZTA) inserts. These inserts are prepared through a wet chemical co-precipitation route followed by a powder metallurgy process. Experiments have been carried out based on an L9 orthogonal array with three parameters (cutting speed, depth of cut and feed rate) at three levels (low, medium and high). Based on the mean response and signal-to-noise ratio (SNR), the best optimal cutting condition was found to be A3B1C1, i.e. a cutting speed of 420 m/min, a depth of cut of 0.5 mm and a feed rate of 0.12 m/min, using the smaller-the-better criterion. Analysis of Variance (ANOVA) is applied to find the significance and percentage contribution of each parameter. The mathematical model of surface roughness has been developed using regression analysis as a function of the above-mentioned independent variables. The predicted values from the developed model and the experimental values are found to be very close to each other, justifying the significance of the model. A confirmation run has been carried out at the 95 % confidence level to verify the optimized result, and the values obtained are within the prescribed limit.
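
    The "smaller is better" signal-to-noise ratio used in Taguchi analysis of a response such as surface roughness has a simple closed form, SNR = -10 log10(mean(y^2)); the replicate values below are illustrative only.

```python
import numpy as np

def sn_smaller_is_better(y):
    """Taguchi S/N ratio for a 'smaller is better' response such as
    surface roughness: SNR = -10 * log10(mean(y^2)). Higher is better."""
    y = np.asarray(y, dtype=float)
    return -10.0 * np.log10(np.mean(y ** 2))

# Hypothetical roughness replicates for two parameter settings:
print(sn_smaller_is_better([0.82, 0.85, 0.80]))   # smoother setting, higher SNR
print(sn_smaller_is_better([1.40, 1.55, 1.48]))   # rougher setting, lower SNR
```

    Ranking the SNR means across the orthogonal-array levels is what yields a best setting such as A3B1C1.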

  17. MULTIVARIATE STEPWISE LOGISTIC REGRESSION ANALYSIS ON RISK FACTORS OF VENTILATOR-ASSOCIATED PNEUMONIA IN COMPREHENSIVE ICU

    Institute of Scientific and Technical Information of China (English)

    管军; 杨兴易; 赵良; 林兆奋; 郭昌星; 李文放

    2003-01-01

    Objective To investigate the incidence, crude mortality and independent risk factors of ventilator-associated pneumonia (VAP) in a comprehensive ICU in China. Methods The clinical and microbiological data of all 97 patients receiving mechanical ventilation (>48 hr) in our comprehensive ICU from January 1999 to December 2000 were retrospectively collected and analysed. First, several statistically significant risk factors were screened out with univariate analysis; then independent risk factors were determined with multivariate stepwise logistic regression analysis. Results The incidence of VAP was 54.64% (15.60 cases per 1000 ventilation days), and the crude mortality was 47.42%. The interval between the establishment of an artificial airway and the diagnosis of VAP was 6.9 ± 4.3 d. Univariate analysis suggested that indwelling naso-gastric tube, corticosteroid, acid inhibitor, third-generation cephalosporin/imipenem, non-infection lung disease, and extrapulmonary infection were the statistically significant risk factors of

  18. Selenium Exposure and Cancer Risk: an Updated Meta-analysis and Meta-regression.

    Science.gov (United States)

    Cai, Xianlei; Wang, Chen; Yu, Wanqi; Fan, Wenjie; Wang, Shan; Shen, Ning; Wu, Pengcheng; Li, Xiuyang; Wang, Fudi

    2016-01-20

    The objective of this study was to investigate the associations between selenium exposure and cancer risk. We identified 69 studies and applied meta-analysis, meta-regression and dose-response analysis to obtain the available evidence. The results indicated that high selenium exposure had a protective effect on cancer risk (pooled OR = 0.78; 95%CI: 0.73-0.83). Linear and nonlinear dose-response analyses indicated that high serum/plasma selenium and toenail selenium were protective against cancer. However, we did not find a protective efficacy of selenium supplements. High selenium exposure may have different effects on specific types of cancer. It decreased the risk of breast cancer, lung cancer, esophageal cancer, gastric cancer, and prostate cancer, but it was not associated with colorectal cancer, bladder cancer, and skin cancer.
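
    A pooled OR like the one quoted is typically obtained by inverse-variance weighting on the log-OR scale. The sketch below is the simplest fixed-effect version with three invented study results; a full meta-analysis such as this one would normally also fit a random-effects model and report heterogeneity.

```python
import math

def pool_odds_ratios(ors, ci_los, ci_his):
    """Fixed-effect inverse-variance pooling of odds ratios. Standard
    errors are recovered from the 95% CIs on the log scale."""
    num = den = 0.0
    for or_, lo, hi in zip(ors, ci_los, ci_his):
        se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # SE of log OR
        w = 1.0 / se**2                                  # inverse-variance weight
        num += w * math.log(or_)
        den += w
    log_pooled = num / den
    se_pooled = math.sqrt(1.0 / den)
    ci = (math.exp(log_pooled - 1.96 * se_pooled),
          math.exp(log_pooled + 1.96 * se_pooled))
    return math.exp(log_pooled), ci

# Three illustrative study-level ORs (not the 69 studies in the review).
pooled, ci = pool_odds_ratios([0.75, 0.82, 0.70],
                              [0.60, 0.65, 0.50],
                              [0.94, 1.03, 0.98])
print(pooled, ci)
```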

  19. Analysis of Scattering by Inhomogeneous Dielectric Objects Using Higher-Order Hierarchical MoM

    DEFF Research Database (Denmark)

    Kim, Oleksiy S.; Jørgensen, Erik; Meincke, Peter;

    2003-01-01

    developed higher-order hierarchical Legendre basis functions for expansion of the electric flux density and higher-order geometry modeling. An unstructured mesh composed by trilinear (8-node) and/or curved (27-node) hexahedral elements is used to represent the dielectric object accurately. It is shown...

  20. A spatial analysis of hierarchical waste transport structures under growing demand.

    Science.gov (United States)

    Tanguy, Audrey; Glaus, Mathias; Laforest, Valérie; Villot, Jonathan; Hausler, Robert

    2016-10-01

    The design of waste management systems rarely accounts for the spatio-temporal evolution of the demand. However, recent studies suggest that this evolution affects the planning of waste management activities, such as the choice and location of treatment facilities. As a result, the transport structure could also be affected by these changes. The objective of this paper is to study the influence of the spatio-temporal evolution of the demand on the strategic planning of a waste transport structure. More particularly, this study aims at evaluating the effect of varying spatial parameters on the economic performance of hierarchical structures (with one transfer station). To this end, three consecutive generations of three different spatial distributions were tested for hierarchical and non-hierarchical transport structures based on cost minimization. Results showed that a hierarchical structure is economically viable for large and clustered spatial distributions. The distance parameter was decisive, but the loading ratio of trucks and the formation of clusters of sources also impacted the attractiveness of the transfer station. Thus the territory's morphology should influence strategies regarding the installation of transfer stations. Spatially explicit tools, such as the transport model presented in this work, that take into account the territory's evolution are needed to help waste managers in the strategic planning of waste transport structures.
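
    The economic trade-off behind a transfer station can be caricatured as a break-even calculation: a per-tonne handling charge buys a cheaper line-haul rate in larger trucks, which pays off only beyond a certain distance. All rates below are invented placeholders, not values from the study.

```python
def direct_cost(tonnes, km, collection_rate=0.45):
    """Cost of hauling waste directly in small collection trucks
    (cost units per tonne-km; illustrative rate)."""
    return tonnes * km * collection_rate

def transfer_cost(tonnes, km_to_ts, km_ts_plant,
                  collection_rate=0.45, line_haul_rate=0.12, handling=3.0):
    """Cost via a transfer station: short collection leg, a per-tonne
    handling charge, then a cheaper line-haul leg in larger trucks."""
    return (tonnes * km_to_ts * collection_rate
            + tonnes * handling
            + tonnes * km_ts_plant * line_haul_rate)

# For a distant treatment plant the hierarchical route wins...
print(direct_cost(100, 60), transfer_cost(100, 10, 50))
# ...while for short hauls the direct (non-hierarchical) route is cheaper.
print(direct_cost(100, 5), transfer_cost(100, 2, 3))
```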

  1. Hierarchical spatial point process analysis for a plant community with high biodiversity

    DEFF Research Database (Denmark)

    Illian, Janine B.; Møller, Jesper; Waagepetersen, Rasmus

    2009-01-01

    A complex multivariate spatial point pattern of a plant community with high biodiversity is modelled using a hierarchical multivariate point process model. In the model, interactions between plants with different post-fire regeneration strategies are of key interest. We consider initially a maximum...

  2. Evaluating the Impacts of ICT Use: A Multi-Level Analysis with Hierarchical Linear Modeling

    Science.gov (United States)

    Song, Hae-Deok; Kang, Taehoon

    2012-01-01

    The purpose of this study is to evaluate the impacts of ICT use on achievements by considering not only ICT use, but also the process and background variables that influence ICT use at both the student- and school-level. This study was conducted using data from the 2010 Survey of Seoul Education Longitudinal Research. A Hierarchical Linear…

  3. Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: computational analysis.

    Science.gov (United States)

    Frank, Michael J; Badre, David

    2012-03-01

    Growing evidence suggests that the prefrontal cortex (PFC) is organized hierarchically, with more anterior regions having increasingly abstract representations. How does this organization support hierarchical cognitive control and the rapid discovery of abstract action rules? We present computational models at different levels of description. A neural circuit model simulates interacting corticostriatal circuits organized hierarchically. In each circuit, the basal ganglia gate frontal actions, with some striatal units gating the inputs to PFC and others gating the outputs to influence response selection. Learning at all of these levels is accomplished via dopaminergic reward prediction error signals in each corticostriatal circuit. This functionality allows the system to exhibit conditional if-then hypothesis testing and to learn rapidly in environments with hierarchical structure. We also develop a hybrid Bayesian-reinforcement learning mixture of experts (MoE) model, which can estimate the most likely hypothesis state of individual participants based on their observed sequence of choices and rewards. This model yields accurate probabilistic estimates about which hypotheses are attended by manipulating attentional states in the generative neural model and recovering them with the MoE model. This 2-pronged modeling approach leads to multiple quantitative predictions that are tested with functional magnetic resonance imaging in the companion paper.

  4. Hierarchical linear modeling of longitudinal pedigree data for genetic association analysis

    DEFF Research Database (Denmark)

    Tan, Qihua; B Hjelmborg, Jacob V; Thomassen, Mads;

    2014-01-01

    on the mean level of a phenotype, they are not sufficiently straightforward to handle the kinship correlation on the time-dependent trajectories of a phenotype. We introduce a 2-level hierarchical linear model to separately assess the genetic associations with the mean level and the rate of change...

  5. Hierarchical modeling and inference in ecology: The analysis of data from populations, metapopulations and communities

    Science.gov (United States)

    Royle, J. Andrew; Dorazio, Robert M.

    2008-01-01

    A guide to data collection, modeling and inference strategies for biological survey data using Bayesian and classical statistical methods. This book describes a general and flexible framework for modeling and inference in ecological systems based on hierarchical models, with a strict focus on the use of probability models and parametric inference. Hierarchical models represent a paradigm shift in the application of statistics to ecological inference problems because they combine explicit models of ecological system structure or dynamics with models of how ecological systems are observed. The principles of hierarchical modeling are developed and applied to problems in population, metapopulation, community, and metacommunity systems. The book provides the first synthetic treatment of many recent methodological advances in ecological modeling and unifies disparate methods and procedures. The authors apply principles of hierarchical modeling to ecological problems, including:
    * occurrence or occupancy models for estimating species distribution
    * abundance models based on many sampling protocols, including distance sampling
    * capture-recapture models with individual effects
    * spatial capture-recapture models based on camera trapping and related methods
    * population and metapopulation dynamic models
    * models of biodiversity, community structure and dynamics

  6. Automated particle identification through regression analysis of size, shape and colour

    Science.gov (United States)

    Rodriguez Luna, J. C.; Cooper, J. M.; Neale, S. L.

    2016-04-01

    Rapid point-of-care diagnostic tests and tests to provide therapeutic information are now available for a range of specific conditions, from the measurement of blood glucose levels for diabetes to card agglutination tests for parasitic infections. Due to a lack of specificity, these tests are often backed up by more conventional lab-based diagnostic methods; for example, a card agglutination test may be carried out for a suspected parasitic infection in the field and, if positive, a blood sample can then be sent to a lab for confirmation. The eventual diagnosis is often achieved by microscopic examination of the sample. In this paper we propose a computerized vision system for aiding in the diagnostic process; this system uses a novel particle recognition algorithm to improve specificity and speed during the diagnostic process. We show the detection and classification of different types of cells in a diluted blood sample using regression analysis of their size, shape and colour. The first step is to define the objects to be tracked, using a Gaussian Mixture Model for background subtraction and binary opening and closing for noise suppression. After subtracting the objects of interest from the background, the next challenge is to predict whether a given object belongs to a certain category. This is a classification problem, and the output of the algorithm is a Boolean value (true/false). As such, the computer program should be able to "predict" with a reasonable level of confidence whether a given particle belongs to the kind we are looking for. We show the use of a binary logistic regression analysis with three continuous predictors: size, shape and colour histogram. The results suggest these variables could be very useful in a logistic regression equation, as they proved to have a relatively high predictive value on their own.

  7. Predictive equations using regression analysis of pulmonary function for healthy children in Northeast China.

    Directory of Open Access Journals (Sweden)

    Ya-Nan Ma

    Full Text Available BACKGROUND: There have been few published studies on spirometric reference values for healthy children in China. We hypothesize that there would have been changes in lung function that would not have been precisely predicted by the existing spirometric reference equations. The objective of the study was to develop more accurate predictive equations for spirometric reference values for children aged 9 to 15 years in Northeast China. METHODOLOGY/PRINCIPAL FINDINGS: Spirometric measurements were obtained from 3,922 children, including 1,974 boys and 1,948 girls, who were randomly selected from five cities of Liaoning province, Northeast China, using the ATS (American Thoracic Society) and ERS (European Respiratory Society) standards. The data was then randomly split into a training subset containing 2078 cases and a validation subset containing 1844 cases. Predictive equations used multiple linear regression techniques with three predictor variables: height, age and weight. Model goodness of fit was examined using the coefficient of determination, or the R(2), and adjusted R(2). The predicted values were compared with those obtained from the existing spirometric reference equations. The results showed the prediction equations using linear regression analysis performed well for most spirometric parameters. Paired t-tests were used to compare the predicted values obtained from the developed and existing spirometric reference equations based on the validation subset. The t-test for males was not statistically significant (p>0.01). The predictive accuracy of the developed equations was higher than the existing equations and the predictive ability of the model was also validated. CONCLUSION/SIGNIFICANCE: We developed prediction equations using linear regression analysis of spirometric parameters for children aged 9-15 years in Northeast China. These equations represent the first attempt at predicting lung function for Chinese children following the ATS

  8. The analysis of internet addiction scale using multivariate adaptive regression splines.

    Science.gov (United States)

    Kayri, M

    2010-01-01

    Determining the real effects on internet dependency requires an unbiased and robust statistical method. MARS is a new non-parametric method, in use in the literature, for parameter estimation in cause-and-effect research. MARS can both obtain legible model curves and make unbiased parametric predictions. In order to examine the performance of MARS, its findings are compared to Classification and Regression Tree (C&RT) findings, which are considered in the literature to be efficient in revealing correlations between variables. The data set for the study is taken from "The Internet Addiction Scale" (IAS), which attempts to reveal the addiction levels of individuals. The population of the study consists of 754 secondary school students (301 female and 443 male students, with 10 missing data). The MARS 2.0 trial version is used for the MARS analysis, and the C&RT analysis was done in SPSS. MARS obtained six basis functions for the model, and from these six functions the regression equation of the model was found. MARS showed that average daily Internet-use time, the purpose of Internet-use, the grade of students and the occupations of mothers had a significant effect on dependency level prediction. The extent to which MARS revealed how the significant variables change the character of the model was also observed in this study.
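
    MARS builds its regression surface from hinge (truncated linear) basis functions of the form max(0, x - t) and max(0, t - x). The sketch below fixes a single knot and solves the resulting least-squares problem on synthetic data; the real MARS algorithm additionally searches over knot locations and interactions, and the data here bear no relation to the IAS sample.

```python
import numpy as np

# Piecewise-linear truth: flat below x = 4, slope 2 above it, plus noise.
rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 200)
y = np.where(x < 4, 1.0, 1.0 + 2.0 * (x - 4)) + rng.normal(0, 0.1, 200)

t = 4.0                                       # knot (assumed known here)
B = np.column_stack([np.ones_like(x),
                     np.maximum(0, x - t),    # right hinge basis function
                     np.maximum(0, t - x)])   # left hinge basis function
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
print(coef)  # approximately [1, 2, 0]: flat below the knot, slope 2 above
```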

  9. Determination of baroreflex sensitivity during the modified Oxford maneuver by trigonometric regressive spectral analysis.

    Directory of Open Access Journals (Sweden)

    Julia Gasch

    Full Text Available BACKGROUND: Differences in spontaneous and drug-induced baroreflex sensitivity (BRS) have been attributed to its different operating ranges. The current study attempted to compare BRS estimates during cardiovascular steady state and pharmacological stimulation using an innovative algorithm for dynamic determination of baroreflex gain. METHODOLOGY/PRINCIPAL FINDINGS: Forty-five volunteers underwent the modified Oxford maneuver in the supine and 60° tilted positions with blood pressure and heart rate being continuously recorded. Drug-induced BRS estimates were calculated from data obtained by bolus injections of nitroprusside and phenylephrine. Spontaneous indices were derived from data obtained during rest (stationary) and under pharmacological stimulation (non-stationary) using the algorithm of trigonometric regressive spectral (TRS) analysis. Spontaneous and drug-induced BRS values were significantly correlated and displayed directionally similar changes under different situations. Using the Bland-Altman method, systematic differences between spontaneous and drug-induced estimates were found, revealing that the discrepancy can be as large as the gain itself. Fixed bias was not evident with ordinary least products regression. The correlation and agreement between the estimates increased significantly when BRS was calculated by TRS in non-stationary mode during the drug injection period. TRS-BRS significantly increased during phenylephrine and decreased under nitroprusside. CONCLUSIONS/SIGNIFICANCE: The TRS analysis provides a reliable, non-invasive assessment of human BRS not only under static steady-state conditions, but also during pharmacological perturbation of the cardiovascular system.

  10. Analysis of pulsed eddy current data using regression models for steam generator tube support structure inspection

    Science.gov (United States)

    Buck, J. A.; Underhill, P. R.; Morelli, J.; Krause, T. W.

    2016-02-01

    Nuclear steam generators (SGs) are a critical component for ensuring safe and efficient operation of a reactor. Life management strategies are implemented in which SG tubes are regularly inspected by conventional eddy current testing (ECT) and ultrasonic testing (UT) technologies to size flaws, and the safe operating life of SGs is predicted based on growth models. ECT, the more commonly used technique due to the rapidity with which a full SG tube wall inspection can be performed, is challenged when inspecting ferromagnetic support structure materials in the presence of magnetite sludge and multiple overlapping degradation modes. In this work, an emerging inspection method, pulsed eddy current (PEC), is being investigated to address some of these particular inspection conditions. Time-domain signals were collected by an 8-coil array PEC probe while ferromagnetic drilled support hole diameter, depth of rectangular tube frets and 2D tube off-centering were varied. Data sets were analyzed with a modified principal components analysis (MPCA) to extract dominant signal features. Multiple linear regression models were applied to the MPCA scores to size hole diameter as well as rectangular outer-diameter tube frets. Models were improved through exploratory factor analysis, which was applied to the MPCA scores to refine the selection of regression model inputs by removing nonessential information.
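
    The two-step pipeline this record describes, principal components followed by regression on the scores, can be sketched in a few lines. Everything below (the signal model, noise level and hidden target) is synthetic, and plain PCA via SVD stands in for the paper's modified PCA.

```python
import numpy as np

# Synthetic "time-domain signals": each trace decays at a rate driven by a
# hidden target quantity (here a hole diameter), plus measurement noise.
rng = np.random.default_rng(4)
diameter = rng.uniform(4.0, 6.0, 40)                    # hidden target
signals = (np.outer(diameter, np.linspace(1, 0, 50))    # diameter-driven shape
           + rng.normal(0, 0.05, (40, 50)))             # measurement noise

# Step 1: PCA via SVD of the mean-centred signal matrix.
Xc = signals - signals.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U[:, :3] * S[:3]                               # first 3 PC scores

# Step 2: multiple linear regression of the target on the PC scores.
A = np.column_stack([np.ones(40), scores])
coef, *_ = np.linalg.lstsq(A, diameter, rcond=None)
rmse = np.sqrt(np.mean((A @ coef - diameter) ** 2))
print(rmse)  # small, since the leading component carries the size information
```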

  11. Page Layout Analysis of the Document Image Based on the Region Classification in a Decision Hierarchical Structure

    Directory of Open Access Journals (Sweden)

    Hossein Pourghassem

    2010-10-01

    Full Text Available The conversion of a document image to its electronic version is a very important problem in saving, searching and retrieval applications in office automation systems. For this purpose, analysis of the document image is necessary. In this paper, a hierarchical classification structure based on a two-stage segmentation algorithm is proposed. In this structure, the image is segmented using the proposed two-stage segmentation algorithm. Then, the type of the image regions, such as document and non-document regions, is determined using multiple classifiers in the hierarchical classification structure. The proposed segmentation algorithm uses two algorithms based on wavelet transform and thresholding. Texture features such as correlation, homogeneity and entropy extracted from the co-occurrence matrix, as well as two new features based on the wavelet transform, are used to classify and label the regions of the image. The hierarchical classifier consists of two Multilayer Perceptron (MLP) classifiers and a Support Vector Machine (SVM) classifier. The proposed algorithm is evaluated on a database of document and non-document images collected from the Internet. The experimental results show the efficiency of the proposed approach in region segmentation and classification. The proposed algorithm provides an accuracy rate of 97.5% on classification of the regions.

  12. High Adherence to Iron/Folic Acid Supplementation during Pregnancy Time among Antenatal and Postnatal Care Attendant Mothers in Governmental Health Centers in Akaki Kality Sub City, Addis Ababa, Ethiopia: Hierarchical Negative Binomial Poisson Regression

    Science.gov (United States)

    Gebreamlak, Bisratemariam; Dadi, Abel Fekadu; Atnafu, Azeb

    2017-01-01

    Background Iron deficiency during pregnancy is a risk factor for anemia, preterm delivery, and low birth weight. Iron/Folic Acid supplementation with optimal adherence can effectively prevent anemia in pregnancy. However, studies that address this area of adherence are very limited. Therefore, the current study was conducted to assess adherence and to identify factors associated with the number of Iron/Folic Acid tablets taken during pregnancy among mothers attending antenatal and postnatal care follow-up in Akaki Kality sub city. Methods An institution-based cross-sectional study was conducted on a sample of 557 pregnant women attending antenatal and postnatal care services. Systematic random sampling was used to select study subjects. The mothers were interviewed, and the collected data were cleaned, entered into Epi Info 3.5.1, and analyzed with R version 3.2.0. A Hierarchical Negative Binomial Poisson Regression Model was fitted to identify the factors associated with the number of Iron/Folic Acid tablets taken. The adjusted incidence rate ratio (IRR) with 95% confidence interval (CI) was computed to assess the strength and significance of the association. Result More than 90% of the mothers took at least one Iron/Folic Acid pill per week during their pregnancy. Sixty percent of the mothers adhered (took four or more tablets per week) (95% CI: 56%-64.1%). A higher IRR of Iron/Folic Acid supplementation was observed among women who received health education, who were privately employed, who had achieved secondary education, and who believed that Iron/Folic Acid supplements increase blood, whereas mothers who reported a side effect, who were from families with relatively better monthly income, and who took the supplement when sick were more likely to adhere. Conclusion Adherence to Iron/Folic Acid supplementation during pregnancy among mothers attending antenatal and postnatal care was found to be high. Activities that would address the
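
    As an illustration of how an incidence rate ratio (IRR) is read off a count-regression fit, here is a minimal sketch using a plain Poisson log-linear model fitted by iteratively reweighted least squares; this is a simplified stand-in for the study's hierarchical negative binomial model, and the predictor, sample size and effect size are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: weekly count of Iron/Folic Acid tablets taken, with one
# hypothetical binary predictor "received health education".
n = 500
educ = rng.integers(0, 2, n)
true_irr = 1.4                                  # assumed effect size
mu = np.exp(np.log(3.0) + np.log(true_irr) * educ)
y = rng.poisson(mu)

# Poisson log-linear regression by IRLS (simplified stand-in for the
# hierarchical negative binomial model used in the study)
X = np.column_stack([np.ones(n), educ])
beta = np.zeros(2)
for _ in range(25):
    eta = X @ beta
    w = np.exp(eta)                             # Poisson working weights (= mean)
    z = eta + (y - w) / w                       # working response
    WX = X * w[:, None]
    beta = np.linalg.solve(X.T @ WX, WX.T @ z)

irr_educ = np.exp(beta[1])                      # IRR for the education predictor
```

    Exponentiating a coefficient gives the multiplicative change in the expected count, which is how the adjusted IRRs in the abstract are interpreted.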

  13. Using Spline Regression in Semi-Parametric Stochastic Frontier Analysis: An Application to Polish Dairy Farms

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    The estimation of technical efficiency comprises a vast literature in the field of applied production economics. There are two predominant approaches: the non-parametric and non-stochastic Data Envelopment Analysis (DEA) and the parametric Stochastic Frontier Analysis (SFA). The DEA...... of specifying an unsuitable functional form and thus, model misspecification and biased parameter estimates. Given these problems of the DEA and the SFA, Fan, Li and Weersink (1996) proposed a semi-parametric stochastic frontier model that estimates the production function (frontier) by non-parametric......), Kumbhakar et al. (2007), and Henningsen and Kumbhakar (2009). The aim of this paper and its main contribution to the existing literature is the estimation of semi-parametric stochastic frontier models using a different non-parametric estimation technique: spline regression (Ma et al. 2011). We apply...

  14. Within-session analysis of the extinction of pavlovian fear-conditioning using robust regression

    Directory of Open Access Journals (Sweden)

    Vargas-Irwin, Cristina

    2010-06-01

    Full Text Available Traditionally, the analysis of extinction data in fear conditioning experiments has involved the use of standard linear models, mostly ANOVA of between-group differences of subjects that have undergone different extinction protocols, pharmacological manipulations or some other treatment. Although some studies report individual differences in quantities such as suppression rates or freezing percentages, these differences are not included in the statistical modeling. Within-subject response patterns are then averaged using coarse-grain time windows which can overlook these individual performance dynamics. Here we illustrate an alternative analytical procedure consisting of two steps: the estimation of a trend for within-session data and the analysis of group differences in trend as the main outcome. This procedure is tested on real fear-conditioning extinction data, comparing trend estimates via Ordinary Least Squares (OLS) and robust Least Median of Squares (LMS) regression, as well as comparing between-group differences and analyzing mean freezing percentage versus LMS slopes as outcomes.
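
    The OLS-versus-LMS contrast can be sketched on synthetic extinction data; the exhaustive pair search below is one simple way to compute a least-median-of-squares line, and the trial count, slopes and outlier positions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic within-session extinction data: freezing % declining over 30 trials,
# with a few late outlier trials that flatten the OLS slope.
trials = np.arange(30, dtype=float)
freezing = 80.0 - 2.0 * trials + rng.normal(0, 3, 30)
freezing[[22, 25, 28]] += 60.0                   # outlier trials (invented)

# OLS trend estimate
ols_slope, ols_icpt = np.polyfit(trials, freezing, 1)

# Least Median of Squares: test lines through every pair of points and keep
# the one minimizing the median squared residual (exhaustive pair search)
best, lms_slope = np.inf, 0.0
for i in range(len(trials)):
    for j in range(i + 1, len(trials)):
        b = (freezing[j] - freezing[i]) / (trials[j] - trials[i])
        a = freezing[i] - b * trials[i]
        med = np.median((freezing - (a + b * trials)) ** 2)
        if med < best:
            best, lms_slope = med, b
```

    The robust slope stays close to the true extinction rate while the OLS slope is pulled toward zero by the outlier trials, which is the motivation for the within-session trend analysis.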

  15. The Analysis Of The Correlations And Regressions Between Some Characters On A Wheat Isogenic Varieties Assortment

    Science.gov (United States)

    Păniţă, Ovidiu

    2015-09-01

    In the years 2012-2014, an assortment of 25 isogenic wheat lines (Triticum aestivum ssp. vulgare) was tested at the Banu-Maracine DRS; the analyzed characters were the number of seeds/spike, seed weight/spike (g), number of spikes/m2, weight of a thousand seeds (WTS) (g), and number of emerged plants/m2. Based on the recorded data and their statistical processing, a number of links between these characters were identified, and regression models relating some of the studied characters were derived. Component analysis showed that the number of seeds/spike and seed weight/spike are components that account for over 88% of the variance, with a total of seven genotypes scoring positively on both factors.

  16. A frailty model approach for regression analysis of multivariate current status data.

    Science.gov (United States)

    Chen, Man-Hua; Tong, Xingwei; Sun, Jianguo

    2009-11-30

    This paper discusses regression analysis of multivariate current status failure time data (The Statistical Analysis of Interval-censored Failure Time Data. Springer: New York, 2006), which occur quite often in, for example, tumorigenicity experiments and epidemiologic investigations of the natural history of a disease. For the problem, several marginal approaches have been proposed that model each failure time of interest individually (Biometrics 2000; 56:940-943; Statist. Med. 2002; 21:3715-3726). In this paper, we present a full likelihood approach based on the proportional hazards frailty model. For estimation, an Expectation-Maximization (EM) algorithm is developed, and simulation studies suggest that the presented approach performs well in practical situations. The approach is applied to a set of bivariate current status data arising from a tumorigenicity experiment.

  17. Stability and adaptability of runner peanut genotypes based on nonlinear regression and AMMI analysis

    Directory of Open Access Journals (Sweden)

    Roseane Cavalcanti dos Santos

    2012-08-01

    Full Text Available The objective of this work was to estimate the stability and adaptability of pod and seed yield in runner peanut genotypes based on nonlinear regression and AMMI analysis. Yield data from 11 trials, distributed across six environments and three harvests, carried out in the Northeast region of Brazil during the rainy season, were used. Significant effects of genotypes (G), environments (E), and GE interactions were detected in the analysis, indicating different behaviors among genotypes in favorable and unfavorable environmental conditions. The genotypes BRS Pérola Branca and LViPE‑06 are more stable and adapted to the semiarid environment, whereas LGoPE‑06 is a promising material for pod production, despite being highly dependent on favorable environments.

  18. Hierarchical photocatalysts.

    Science.gov (United States)

    Li, Xin; Yu, Jiaguo; Jaroniec, Mietek

    2016-05-01

    As a green and sustainable technology, semiconductor-based heterogeneous photocatalysis has received much attention in the last few decades because it has potential to solve both energy and environmental problems. To achieve efficient photocatalysts, various hierarchical semiconductors have been designed and fabricated at the micro/nanometer scale in recent years. This review presents a critical appraisal of fabrication methods, growth mechanisms and applications of advanced hierarchical photocatalysts. Especially, the different synthesis strategies such as two-step templating, in situ template-sacrificial dissolution, self-templating method, in situ template-free assembly, chemically induced self-transformation and post-synthesis treatment are highlighted. Finally, some important applications including photocatalytic degradation of pollutants, photocatalytic H2 production and photocatalytic CO2 reduction are reviewed. A thorough assessment of the progress made in photocatalysis may open new opportunities in designing highly effective hierarchical photocatalysts for advanced applications ranging from thermal catalysis, separation and purification processes to solar cells.

  19. Evaluation of Visual Field Progression in Glaucoma: Quasar Regression Program and Event Analysis.

    Science.gov (United States)

    Díaz-Alemán, Valentín T; González-Hernández, Marta; Perera-Sanz, Daniel; Armas-Domínguez, Karintia

    2016-01-01

    To determine the sensitivity, specificity and agreement between the Quasar program, glaucoma progression analysis (GPA II) event analysis and expert opinion in the detection of glaucomatous progression. The Quasar program is based on linear regression analysis of both mean defect (MD) and pattern standard deviation (PSD). Each series of visual fields was evaluated by three methods: Quasar, GPA II and four experts. The sensitivity, specificity and agreement (kappa) for each method were calculated, using expert opinion as the reference standard. The study included 439 SITA Standard visual fields of 56 eyes of 42 patients, with a mean of 7.8 ± 0.8 visual fields per eye. When suspected cases of progression were considered stable, sensitivity and specificity of Quasar, GPA II and the experts were 86.6% and 70.7%, 26.6% and 95.1%, and 86.6% and 92.6% respectively. When suspected cases of progression were considered as progressing, sensitivity and specificity of Quasar, GPA II and the experts were 79.1% and 81.2%, 45.8% and 90.6%, and 85.4% and 90.6% respectively. The agreement between Quasar and GPA II when suspected cases were considered stable or progressing was 0.03 and 0.28 respectively. The degree of agreement between Quasar and the experts when suspected cases were considered stable or progressing was 0.472 and 0.507. The degree of agreement between GPA II and the experts when suspected cases were considered stable or progressing was 0.262 and 0.342. The combination of MD and PSD regression analysis in the Quasar program showed better agreement with the experts and higher sensitivity than GPA II.
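
    The regression-based idea behind such trend programs can be sketched as fitting MD against exam time for one eye; the series, slope threshold and trend criterion below are illustrative assumptions, not Quasar's actual algorithm (which combines MD and PSD):

```python
import numpy as np

# Hypothetical series of 8 visual-field exams over 4 years for one eye:
# mean defect (MD, dB) worsening at roughly -0.6 dB/year plus noise.
years = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0])
md = np.array([-2.1, -2.5, -2.6, -3.2, -3.3, -3.8, -4.0, -4.6])

slope, intercept = np.polyfit(years, md, 1)   # dB per year
r = np.corrcoef(years, md)[0, 1]

# Hedged, illustrative criterion: flag progression when MD worsens faster
# than -0.5 dB/year with a strong linear trend.
progressing = (slope < -0.5) and (r < -0.9)
```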

  20. Standardizing effect size from linear regression models with log-transformed variables for meta-analysis.

    Science.gov (United States)

    Rodríguez-Barranco, Miguel; Tobías, Aurelio; Redondo, Daniel; Molina-Portillo, Elena; Sánchez, María José

    2017-03-17

    Meta-analysis is very useful to summarize the effect of a treatment or a risk factor for a given disease. Often studies report results based on log-transformed variables in order to achieve the principal assumptions of a linear regression model. If this is the case for some, but not all studies, the effects need to be homogenized. We derived a set of formulae to transform absolute changes into relative ones, and vice versa, to allow including all results in a meta-analysis. We applied our procedure to all possible combinations of log-transformed independent or dependent variables. We also evaluated it in a simulation based on two variables either normally or asymmetrically distributed. In all the scenarios, and based on different change criteria, the effect size estimated by the derived set of formulae was equivalent to the real effect size. To avoid biased estimates of the effect, this procedure should be used with caution in the case of independent variables with asymmetric distributions that significantly differ from the normal distribution. We illustrate this procedure with an application to a meta-analysis on the potential effects on neurodevelopment in children exposed to arsenic and manganese. The procedure proposed has been shown to be valid and capable of expressing the effect size of a linear regression model based on different change criteria in the variables. Homogenizing the results from different studies beforehand allows them to be combined in a meta-analysis, independently of whether the transformations had been performed on the dependent and/or independent variables.
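
    Two of the standard conversions involved can be illustrated directly; the coefficients below are hypothetical, and the paper's full formula set covers more combinations than these two common cases:

```python
import math

# Case 1: the dependent variable was ln-transformed. A slope beta_ln means each
# one-unit increase in exposure multiplies the outcome by exp(beta_ln), i.e. a
# relative (percent) change.
beta_ln = 0.05                                # hypothetical coefficient on ln(outcome)
pct_change = 100 * (math.exp(beta_ln) - 1)    # percent change per unit exposure

# Case 2: the independent variable was ln-transformed. The absolute change in
# the outcome for a c% increase in exposure is beta * ln(1 + c/100).
beta = -1.8                                   # hypothetical coefficient on ln(exposure)
c = 50                                        # effect of a 50% increase in exposure
abs_change = beta * math.log(1 + c / 100)
```

    Expressing every study's effect on a common scale (e.g. absolute change per 50% increase in exposure) is what allows the results to be pooled in one meta-analysis.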

  1. Error structure of enzyme kinetic experiments. Implications for weighting in regression analysis of experimental data.

    Science.gov (United States)

    Askelöf, P; Korsfeldt, M; Mannervik, B

    1976-10-01

    Knowledge of the error structure of a given set of experimental data is a necessary prerequisite for incisive analysis and for discrimination between alternative mathematical models of the data set. A reaction system consisting of glutathione S-transferase A (glutathione S-aryltransferase), glutathione, and 3,4-dichloro-1-nitrobenzene was investigated under steady-state conditions. It was found that the experimental error increased with initial velocity, v, and that the variance (estimated by replicates) could be described by a polynomial in v, Var(v) = K0 + K1·v + K2·v², or by a power function, Var(v) = K0 + K1·v^K2. These equations were good approximations irrespective of whether different v values were generated by changing substrate or enzyme concentrations. The selection of these models was based mainly on experiments involving varying enzyme concentration, which, unlike v, is not considered a stochastic variable. Different models of the variance, expressed as functions of enzyme concentration, were examined by regression analysis, and the models could then be transformed to functions in which velocity is substituted for enzyme concentration owing to the proportionality between these variables. Thus, neither the absolute nor the relative error was independent of velocity, a result previously obtained for glutathione reductase in this laboratory [BioSystems 7, 101-119 (1975)]. If the experimental errors or velocities were standardized by division by their corresponding mean velocity value they showed a normal (Gaussian) distribution provided that the coefficient of variation was approximately constant for the data considered. Furthermore, it was established that the errors in the independent variables (enzyme and substrate concentrations) were small in comparison with the error in the velocity determinations. For weighting in regression analysis the inverted value of the local variance in each experimental point should be used. It was found that the
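
    The weighting recommendation can be sketched as follows: estimate the local variance from replicates, fit the polynomial variance model Var(v) = K0 + K1·v + K2·v², and take inverse fitted variances as regression weights. The data and constants are synthetic stand-ins, not the paper's estimates:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic replicate velocity measurements whose spread grows with velocity,
# mimicking the reported error structure (constants invented).
v_true = np.linspace(0.5, 8.0, 15)              # "true" initial velocities
sd_true = 0.02 + 0.04 * v_true                  # error grows with v
reps = v_true + sd_true * rng.standard_normal((20, 15))   # 20 replicates per point

v_mean = reps.mean(axis=0)
var_local = reps.var(axis=0, ddof=1)            # local variance at each point

# Fit the polynomial variance model Var(v) = K0 + K1*v + K2*v^2
A = np.column_stack([np.ones_like(v_mean), v_mean, v_mean**2])
K, *_ = np.linalg.lstsq(A, var_local, rcond=None)
var_fit = A @ K

# Weights for subsequent regression: inverse of the (fitted) local variance
weights = 1.0 / var_fit
```

    In a weighted least-squares fit of a rate equation, points with larger velocity (and thus larger error) then contribute proportionally less to the parameter estimates.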

  2. Measuring treatment and scale bias effects by linear regression in the analysis of OHI-S scores.

    Science.gov (United States)

    Moore, B J

    1977-05-01

    A linear regression model is presented for estimating unbiased treatment effects from OHI-S scores. An example is given to illustrate an analysis and to compare results of an unbiased regression estimator with those based on a biased simple difference estimator.

  3. The Impact of Outliers on Net-Benefit Regression Model in Cost-Effectiveness Analysis.

    Science.gov (United States)

    Wen, Yu-Wen; Tsai, Yi-Wen; Wu, David Bin-Chia; Chen, Pei-Fen

    2013-01-01

    Ordinary least square (OLS) in regression has been widely used to analyze patient-level data in cost-effectiveness analysis (CEA). However, the estimates, inference and decision making in the economic evaluation based on OLS estimation may be biased by the presence of outliers. Instead, robust estimation can remain unaffected and provide results that are resistant to outliers. The objective of this study is to explore the impact of outliers on net-benefit regression (NBR) in CEA using OLS and to propose a potential solution by using robust estimations, i.e., Huber M-estimation, Hampel M-estimation, Tukey's bisquare M-estimation, MM-estimation and least trimmed squares estimation. Simulations under different outlier-generating scenarios and an empirical example were used to obtain the regression estimates of NBR by OLS and five robust estimations. Empirical size and empirical power of both OLS and robust estimations were then compared in the context of hypothesis testing. Simulations showed that the five robust approaches compared with OLS estimation led to lower empirical sizes and achieved higher empirical powers in testing cost-effectiveness. Using a real example of antiplatelet therapy, the estimated incremental net-benefit by OLS estimation was lower than those by robust approaches because of outliers in cost data. Robust estimations demonstrated higher probability of cost-effectiveness compared to OLS estimation. The presence of outliers can bias the results of NBR and its interpretations. Robust estimation in NBR is recommended as an appropriate way to avoid such biased decision making.
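
    Of the robust estimators compared, Huber M-estimation is easily sketched as iteratively reweighted least squares; the net-benefit data, effect size and outlier pattern below are synthetic, illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic patient-level net benefit with a true incremental effect of 5,
# contaminated by a few extreme-cost outliers in the treated arm.
n = 200
treat = rng.integers(0, 2, n)
nb = 10.0 + 5.0 * treat + rng.normal(0, 2, n)
nb[np.where(treat == 1)[0][:5]] -= 50.0          # invented outliers: huge costs

X = np.column_stack([np.ones(n), treat])

# OLS estimate of the incremental net benefit
ols = np.linalg.lstsq(X, nb, rcond=None)[0]

# Huber M-estimation via iteratively reweighted least squares
beta = ols.copy()
k = 1.345                                        # conventional Huber tuning constant
for _ in range(50):
    r = nb - X @ beta
    s = np.median(np.abs(r - np.median(r))) / 0.6745   # robust scale (MAD)
    u = np.abs(r / s)
    w = np.minimum(1.0, k / np.maximum(u, 1e-12))      # Huber weights
    WX = X * w[:, None]
    beta = np.linalg.solve(X.T @ WX, WX.T @ nb)

huber_effect = beta[1]
```

    The OLS effect is dragged down by the few extreme observations, while the Huber fit downweights them and stays near the true incremental net benefit.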

  4. Stature estimation from footprint measurements in Indian Tamils by regression analysis

    Directory of Open Access Journals (Sweden)

    T. Nataraja Moorthy

    2014-03-01

    Full Text Available Stature estimation is of particular interest to forensic scientists for its importance in human identification. A footprint is one piece of valuable physical evidence encountered at crime scenes, and its identification can facilitate narrowing down the suspects and establishing the identity of the criminals. Analysis of footprints helps in estimating an individual's stature because of the strong correlation between footprint and height. Foot impressions are still found at crime scenes, since offenders often tend to remove their footwear either to avoid noise or to gain a better grip in climbing walls, etc., while entering or exiting. In Asian countries like India, there are people who still have the habit of walking barefoot. The present study aims to estimate stature from a sample of 2,040 bilateral footprints collected from 1,020 healthy adult male Indian Tamils, an ethnic group in Tamilnadu State, India, who consented to participate in the study and who range in age from 19 to 42 years old; this study will help to generate population-specific equations using a simple linear regression statistical method. All footprint lengths exhibit a statistically significant positive correlation with stature (p-value < 0.01), and the correlation coefficient (r) ranges from 0.546 to 0.578. The accuracy of the regression equations was verified by comparing the estimated stature with the actual stature. Regression equations derived in this research can be used to estimate stature from the complete or even partial footprints among Indian Tamils.
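
    The population-specific equation approach can be sketched on synthetic data with a correlation in the reported range; all sample values below are invented, not the study's measurements:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical sample: footprint length (cm) and stature (cm), generated so the
# correlation is comparable to the r = 0.546-0.578 reported in the study.
n = 300
foot = rng.normal(25.0, 1.2, n)
stature = 70.0 + 3.8 * foot + rng.normal(0, 6.5, n)

# Simple linear regression: stature = a + b * footprint length
b, a = np.polyfit(foot, stature, 1)
r = np.corrcoef(foot, stature)[0, 1]

# Estimated stature for a recovered 26 cm footprint (illustrative)
est = a + b * 26.0
```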

  5. A geostatistics-informed hierarchical sensitivity analysis method for complex groundwater flow and transport modeling: GEOSTATISTICAL SENSITIVITY ANALYSIS

    Energy Technology Data Exchange (ETDEWEB)

    Dai, Heng [Pacific Northwest National Laboratory, Richland Washington USA; Chen, Xingyuan [Pacific Northwest National Laboratory, Richland Washington USA; Ye, Ming [Department of Scientific Computing, Florida State University, Tallahassee Florida USA; Song, Xuehang [Pacific Northwest National Laboratory, Richland Washington USA; Zachara, John M. [Pacific Northwest National Laboratory, Richland Washington USA

    2017-05-01

    Sensitivity analysis is an important tool for quantifying uncertainty in the outputs of mathematical models, especially for complex systems with a high dimension of spatially correlated parameters. Variance-based global sensitivity analysis has gained popularity because it can quantify the relative contribution of uncertainty from different sources. However, its computational cost increases dramatically with the complexity of the considered model and the dimension of model parameters. In this study we developed a hierarchical sensitivity analysis method that (1) constructs an uncertainty hierarchy by analyzing the input uncertainty sources, and (2) accounts for the spatial correlation among parameters at each level of the hierarchy using geostatistical tools. The contribution of the uncertainty source at each hierarchy level is measured by sensitivity indices calculated using the variance decomposition method. Using this methodology, we identified the most important uncertainty source for a dynamic groundwater flow and solute transport model at the Department of Energy (DOE) Hanford site. The results indicate that boundary conditions and the permeability field contribute the most uncertainty to the simulated head field and tracer plume, respectively. The relative contribution from each source varied spatially and temporally as driven by the dynamic interaction between groundwater and river water at the site. By using a geostatistical approach to reduce the number of realizations needed for the sensitivity analysis, the computational cost of implementing the developed method was reduced to a practically manageable level. The developed sensitivity analysis method is generally applicable to a wide range of hydrologic and environmental problems that deal with high-dimensional spatially-distributed parameters.

  6. [Band depth analysis and partial least square regression based winter wheat biomass estimation using hyperspectral measurements].

    Science.gov (United States)

    Fu, Yuan-Yuan; Wang, Ji-Hua; Yang, Gui-Jun; Song, Xiao-Yu; Xu, Xin-Gang; Feng, Hai-Kuan

    2013-05-01

    The major limitation of using existing vegetation indices for crop biomass estimation is that they approach a saturation level asymptotically beyond a certain range of biomass. In order to resolve this problem, band depth analysis and partial least squares regression (PLSR) were combined to establish a winter wheat biomass estimation model in the present study. The models based on the combination of band depth analysis and PLSR were subsequently compared with models based on common vegetation indices in terms of estimation accuracy. Band depth analysis was conducted in the visible spectral domain (550-750 nm). Band depth, band depth ratio (BDR), normalized band depth index, and band depth normalized to area were utilized to represent band depth information. Among the calibrated estimation models, the models based on the combination of band depth analysis and PLSR reached higher accuracy than those based on the vegetation indices. Among them, the combination of BDR and PLSR achieved the highest accuracy (R2 = 0.792, RMSE = 0.164 kg x m(-2)). The results indicated that the combination of band depth analysis and PLSR could well overcome the saturation problem and improve the biomass estimation accuracy when winter wheat biomass is large.
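
    The PLSR step can be sketched with a small NIPALS implementation of single-response PLS on synthetic "band depth" features; the data-generating model below is an illustrative assumption, not the field data:

```python
import numpy as np

rng = np.random.default_rng(11)

# Synthetic band-depth features: 60 plots x 40 spectral bands, where biomass
# drives two latent spectral patterns plus noise (invented stand-in).
n, p = 60, 40
biomass = rng.uniform(0.2, 1.6, n)                   # kg m^-2
loadings = rng.standard_normal((2, p))
X = (np.outer(biomass, loadings[0]) + np.outer(biomass**2, loadings[1])
     + 0.05 * rng.standard_normal((n, p)))
y = biomass

def pls1(X, y, n_comp):
    """Single-response PLS regression via the NIPALS algorithm."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)                       # weight vector
        t = Xk @ w                                   # scores
        tt = t @ t
        p_load = Xk.T @ t / tt                       # X loadings
        W.append(w); P.append(p_load); q.append((yk @ t) / tt)
        Xk = Xk - np.outer(t, p_load)                # deflate
        yk = yk - q[-1] * t
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)              # regression coefficients
    return B, x_mean, y_mean

B, xm, ym = pls1(X, y, n_comp=3)
pred = (X - xm) @ B + ym
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
```

    Because PLS compresses the collinear bands into a few latent components before regressing, it handles many correlated predictors where ordinary least squares on all bands would overfit.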

  7. Thermodynamic analysis on an anisotropically superhydrophobic surface with a hierarchical structure

    Science.gov (United States)

    Zhao, Jieliang; Su, Zhengliang; Yan, Shaoze

    2015-12-01

    Superhydrophobic surfaces, which refer to the surfaces with contact angle higher than 150° and hysteresis less than 10°, have been reported in various studies. However, studies on the superhydrophobicity of anisotropic, hierarchical surfaces are limited and the corresponding thermodynamic mechanisms could not be explained thoroughly. Here we propose a simplified surface model of anisotropic patterned surface with dual scale roughness. Based on the thermodynamic method, we calculate the equilibrium contact angle (ECA) and the contact angle hysteresis (CAH) on the given surface. We show here that the hierarchical structure has much better anisotropic wetting properties than the single-scale one, and the results shed light on the potential application in controllable micro-/nano-fluidic systems. Our studies can be potentially applied for the fabrication of anisotropically superhydrophobic surfaces.

  8. Prognostics of Lithium-Ion Batteries Based on Battery Performance Analysis and Flexible Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Shuai Wang

    2014-10-01

    Full Text Available Accurate prediction of the remaining useful life (RUL) of lithium-ion batteries is important for battery management systems. Traditional empirical data-driven approaches for RUL prediction usually require multidimensional physical characteristics including the current, voltage, usage duration, battery temperature, and ambient temperature. From a capacity fading analysis of lithium-ion batteries, it is found that the energy efficiency and battery working temperature are closely related to capacity degradation, and together these two quantities account for the performance metrics of lithium-ion batteries that are relevant to the RUL, as well as for the relationships between some of these metrics. Thus, we devise a non-iterative prediction model based on flexible support vector regression (F-SVR) and an iterative multi-step prediction model based on support vector regression (SVR) using the energy efficiency and battery working temperature as input physical characteristics. The experimental results show that the proposed prognostic models have high prediction accuracy while using fewer dimensions for the input data than the traditional empirical models.
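
    The iterative multi-step prediction scheme can be sketched on a synthetic capacity-fade curve; to keep the example dependency-light, ridge regression on lagged capacities stands in for the paper's SVR model, and the fade constants are invented:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic capacity-fade curve: capacity (Ah) declining over charge cycles
cycles = np.arange(300)
capacity = 2.0 * np.exp(-0.002 * cycles) + 0.002 * rng.standard_normal(300)

# Lagged features: predict capacity[k] from the previous 3 cycles (train on 0-199)
lag = 3
Xtr = np.column_stack([capacity[i:200 - lag + i] for i in range(lag)])
ytr = capacity[lag:200]

# Ridge-regression one-step model (dependency-light stand-in for SVR)
A = np.column_stack([np.ones(len(ytr)), Xtr])
lam = 1e-4
beta = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ ytr)

# Iterative multi-step prediction: feed each prediction back in as an input
window = list(capacity[200 - lag:200])
preds = []
for _ in range(100):
    nxt = beta[0] + float(np.dot(beta[1:], window))
    preds.append(nxt)
    window = window[1:] + [nxt]
preds = np.array(preds)
```

    The RUL would then be read off as the number of forecast cycles until the predicted capacity crosses an end-of-life threshold (e.g. 80% of nominal capacity).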

  9. An innovative land use regression model incorporating meteorology for exposure analysis.

    Science.gov (United States)

    Su, Jason G; Brauer, Michael; Ainslie, Bruce; Steyn, Douw; Larson, Timothy; Buzzelli, Michael

    2008-02-15

    The advent of spatial analysis and geographic information systems (GIS) has led to studies of chronic exposure and health effects based on the rationale that intra-urban variations in ambient air pollution concentrations are as great as inter-urban differences. Such studies typically rely on local spatial covariates (e.g., traffic, land use type) derived from circular areas (buffers) to predict concentrations/exposures at receptor sites, as a means of averaging the annual net effect of meteorological influences (i.e., wind speed, wind direction and insolation). This is the approach taken in the now popular land use regression (LUR) method. However spatial studies of chronic exposures and temporal studies of acute exposures have not been adequately integrated. This paper presents an innovative LUR method implemented in a GIS environment that reflects both temporal and spatial variability and considers the role of meteorology. The new source area LUR integrates wind speed, wind direction and cloud cover/insolation to estimate hourly nitric oxide (NO) and nitrogen dioxide (NO2) concentrations from land use types (i.e., road network, commercial land use) and these concentrations are then used as covariates to regress against NO and NO2 measurements at various receptor sites across the Vancouver region and compared directly with estimates from a regular LUR. The results show that, when variability in seasonal concentration measurements is present, the source area LUR or SA-LUR model is a better option for concentration estimation.

  10. Improved Regression Analysis of Temperature-Dependent Strain-Gage Balance Calibration Data

    Science.gov (United States)

    Ulbrich, N.

    2015-01-01

    An improved approach is discussed that may be used to directly include first and second order temperature effects in the load prediction algorithm of a wind tunnel strain-gage balance. The improved approach was designed for the Iterative Method that fits strain-gage outputs as a function of calibration loads and uses a load iteration scheme during the wind tunnel test to predict loads from measured gage outputs. The improved approach assumes that the strain-gage balance is at a constant uniform temperature when it is calibrated and used. First, the method introduces a new independent variable for the regression analysis of the balance calibration data. The new variable is defined as the difference between the uniform temperature of the balance and a global reference temperature. This reference temperature should be the primary calibration temperature of the balance so that, if needed, a tare load iteration can be performed. Then, two temperature-dependent terms are included in the regression models of the gage outputs. They are the temperature difference itself and the square of the temperature difference. Simulated temperature-dependent data obtained from Triumph Aerospace's 2013 calibration of NASA's ARC-30K five component semi-span balance is used to illustrate the application of the improved approach.
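
    The core of the improved approach (adding the temperature difference and its square as regressors for each gage output) can be sketched on simulated calibration data; all coefficients and ranges below are illustrative, not the balance's actual sensitivities:

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated single-gage calibration data: output depends on the applied load
# plus first- and second-order effects of the balance temperature (invented).
n = 150
load = rng.uniform(-100, 100, n)                 # applied calibration load
temp = rng.uniform(10, 50, n)                    # uniform balance temperature
t_ref = 20.0                                     # global reference temperature
dT = temp - t_ref                                # the new independent variable

output = 50.0 + 2.0 * load + 0.3 * dT + 0.01 * dT**2 + rng.normal(0, 0.2, n)

# Regression model of the gage output including the dT and dT^2 terms
X = np.column_stack([np.ones(n), load, dT, dT**2])
coef, *_ = np.linalg.lstsq(X, output, rcond=None)
```

    The fitted coefficients on dT and dT² capture the first- and second-order temperature effects, so the load-prediction step can correct gage outputs measured at temperatures away from the calibration reference.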

  11. Multiple Regression Analysis of mRNA-miRNA Associations in Colorectal Cancer Pathway

    Directory of Open Access Journals (Sweden)

    Fengfeng Wang

    2014-01-01

    Full Text Available Background. MicroRNA (miRNA is a short and endogenous RNA molecule that regulates posttranscriptional gene expression. It is an important factor for tumorigenesis of colorectal cancer (CRC, and a potential biomarker for diagnosis, prognosis, and therapy of CRC. Our objective is to identify the related miRNAs and their associations with genes frequently involved in CRC microsatellite instability (MSI and chromosomal instability (CIN signaling pathways. Results. A regression model was adopted to identify the miRNAs significantly associated with a set of candidate genes frequently involved in colorectal cancer MSI and CIN pathways. Multiple linear regression analysis was used to construct the model and find the significant mRNA-miRNA associations. We identified three significantly associated mRNA-miRNA pairs: BCL2 was positively associated with miR-16 and SMAD4 was positively associated with miR-567 in the CRC tissue, while MSH6 was positively associated with miR-142-5p in the normal tissue. As for the whole model, the BCL2 and SMAD4 models were not significant, and the MSH6 model was significant. The significant associations were different in the normal and the CRC tissues. Conclusion. Our results lay a solid foundation for the exploration of novel CRC mechanisms and the identification of miRNA roles as oncomirs or tumor suppressor mirs in CRC.

  12. A genetic algorithm for variable selection in logistic regression analysis of radiotherapy treatment outcomes.

    Science.gov (United States)

    Gayou, Olivier; Das, Shiva K; Zhou, Su-Min; Marks, Lawrence B; Parda, David S; Miften, Moyed

    2008-12-01

    A given outcome of radiotherapy treatment can be modeled by analyzing its correlation with a combination of dosimetric, physiological, biological, and clinical factors, through a logistic regression fit of a large patient population. The quality of the fit is measured by the combination of the predictive power of this particular set of factors and the statistical significance of the individual factors in the model. We developed a genetic algorithm (GA), in which a small sample of all the possible combinations of variables is fitted to the patient data. New models are derived from the best models through crossover and mutation operations, and are in turn fitted. The process is repeated until the sample converges to the combination of factors that best predicts the outcome. The GA was tested on a data set that investigated the incidence of lung injury in NSCLC patients treated with 3DCRT. The GA identified a model with two variables as the best predictor of radiation pneumonitis: the V30 (p=0.048) and the ongoing use of tobacco at the time of referral (p=0.074). This two-variable model was confirmed as the best model by analyzing all possible combinations of factors. In conclusion, genetic algorithms provide a reliable and fast way to select significant factors in logistic regression analysis of large clinical studies.
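
    A compact version of the GA-over-variable-subsets idea can be sketched as follows, with logistic regression fitted by Newton-Raphson and AIC serving as the fitness; the population size, rates and synthetic cohort are illustrative assumptions, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic cohort: 300 patients, 6 candidate factors, where only factors
# 0 and 1 truly drive the binary complication outcome (invented data).
n, n_var = 300, 6
X = rng.standard_normal((n, n_var))
logit = 1.2 * X[:, 0] - 1.2 * X[:, 1]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

def aic_logistic(Xs, y, iters=30):
    """Fit logistic regression by Newton-Raphson and return the AIC."""
    Xb = np.column_stack([np.ones(len(y)), Xs])
    beta = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-np.clip(Xb @ beta, -30, 30)))
        H = Xb.T @ (Xb * (p * (1 - p))[:, None]) + 1e-8 * np.eye(Xb.shape[1])
        beta += np.linalg.solve(H, Xb.T @ (y - p))
    p = 1 / (1 + np.exp(-np.clip(Xb @ beta, -30, 30)))
    ll = np.sum(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    return 2 * Xb.shape[1] - 2 * ll

def fitness(mask):
    return aic_logistic(X[:, mask.astype(bool)], y) if mask.any() else np.inf

# Genetic algorithm over binary variable-inclusion masks
pop = rng.integers(0, 2, (20, n_var))
pop[pop.sum(axis=1) == 0, 0] = 1                     # avoid empty models
for gen in range(15):
    scores = np.array([fitness(m) for m in pop])
    elite = pop[np.argsort(scores)[:10]]             # keep the best half
    children = []
    for _ in range(10):
        a, b = elite[rng.integers(10, size=2)]
        cut = rng.integers(1, n_var)                 # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(n_var) < 0.05              # mutation
        child = np.where(flip, 1 - child, child)
        if not child.any():
            child[rng.integers(n_var)] = 1
        children.append(child)
    pop = np.vstack([elite, children])

scores = np.array([fitness(m) for m in pop])
best = pop[np.argmin(scores)]                        # best variable subset found
```

    Because each generation only fits a small sample of subsets, the GA scales to factor counts where exhaustive enumeration of all combinations would be impractical.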

  13. Bayesian linear regression with skew-symmetric error distributions with applications to survival analysis

    KAUST Repository

    Rubio, Francisco J.

    2016-02-09

    We study Bayesian linear regression models with skew-symmetric scale mixtures of normal error distributions. These kinds of models can be used to capture departures from the usual assumption of normality of the errors in terms of heavy tails and asymmetry. We propose a general noninformative prior structure for these regression models and show that the corresponding posterior distribution is proper under mild conditions. We extend these propriety results to cases where the response variables are censored. The latter scenario is of interest in the context of accelerated failure time models, which are relevant in survival analysis. We present a simulation study that demonstrates good frequentist properties of the posterior credible intervals associated with the proposed priors. This study also sheds some light on the trade-off between increased model flexibility and the risk of over-fitting. We illustrate the performance of the proposed models with real data. Although we focus on models with univariate response variables, we also present some extensions to the multivariate case in the Supporting Information.

  14. Thermodynamic Analysis of Simple Gas Turbine Cycle with Multiple Regression Modelling and Optimization

    Directory of Open Access Journals (Sweden)

    Abdul Ghafoor Memon

    2014-03-01

    Full Text Available In this study, thermodynamic and statistical analyses were performed on a gas turbine system to assess the impact of some important operating parameters like CIT (Compressor Inlet Temperature), PR (Pressure Ratio) and TIT (Turbine Inlet Temperature) on its performance characteristics such as net power output, energy efficiency, exergy efficiency and fuel consumption. Each performance characteristic was expressed as a function of the operating parameters, followed by a parametric study and optimization. The results showed that the performance characteristics increase with an increase in the TIT and a decrease in the CIT, except fuel consumption, which behaves oppositely. The net power output and efficiencies increase with the PR up to certain values and then start to decrease, whereas the fuel consumption always decreases with an increase in the PR. The results of exergy analysis showed the combustion chamber to be the major contributor to exergy destruction, followed by the stack gas. Subsequently, multiple regression models were developed to correlate each of the response variables (performance characteristics) with the predictor variables (operating parameters). The regression model equations showed a significant statistical relationship between the predictor and response variables.

  15. Exergy Analysis of a Subcritical Reheat Steam Power Plant with Regression Modeling and Optimization

    Directory of Open Access Journals (Sweden)

    MUHIB ALI RAJPER

    2016-07-01

    Full Text Available In this paper, exergy analysis of a 210 MW SPP (Steam Power Plant) is performed. Firstly, the plant is modeled and validated, followed by a parametric study to show the effects of various operating parameters on the performance parameters. The net power output, energy efficiency, and exergy efficiency are taken as the performance parameters, while the condenser pressure, main steam pressure, bled steam pressures, main steam temperature, and reheat steam temperature are nominated as the operating parameters. Moreover, multiple polynomial regression models are developed to correlate each performance parameter with the operating parameters. The performance is then optimized by using the direct search method. According to the results, the net power output, energy efficiency, and exergy efficiency are calculated as 186.5 MW, 31.37% and 30.41%, respectively, under normal operating conditions as a base case. The condenser is a major contributor to the energy loss, followed by the boiler, whereas the highest irreversibilities occur in the boiler and turbine. According to the parametric study, variation in the operating parameters greatly influences the performance parameters. The regression models have appeared to be good estimators of the performance parameters. The optimum net power output, energy efficiency and exergy efficiency are obtained as 227.6 MW, 37.4% and 36.4%, respectively, calculated along with the optimal values of the selected operating parameters.
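
The direct-search optimization step can be sketched as a compass search over a fitted response surface. The quadratic surrogate below is hypothetical, not the paper's regression model; it merely illustrates axis-wise moves with step halving.

```python
def surrogate(x, y):
    # Hypothetical fitted response surface with peak efficiency at (3.0, 2.0).
    return 37.4 - (x - 3.0) ** 2 - 2 * (y - 2.0) ** 2

def compass_search(f, x, y, step=1.0, tol=1e-6):
    """Maximize f by axis-wise (direct search) moves, halving the step
    whenever no move improves the objective."""
    while step > tol:
        moved = False
        for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
            if f(x + dx, y + dy) > f(x, y):
                x, y = x + dx, y + dy
                moved = True
        if not moved:
            step /= 2
    return x, y

x, y = compass_search(surrogate, 0.0, 0.0)
print(round(x, 3), round(y, 3))  # → 3.0 2.0
```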

  16. Classification of Effective Soil Depth by Using Multinomial Logistic Regression Analysis

    Science.gov (United States)

    Chang, C. H.; Chan, H. C.; Chen, B. A.

    2016-12-01

    Classification of effective soil depth is a task in determining the slopeland utilizable limitation in Taiwan. The "Slopeland Conservation and Utilization Act" categorizes slopeland into agriculture and husbandry land, land suitable for forestry, and land for enhanced conservation according to factors including average slope, effective soil depth, soil erosion and parent rock. However, site investigation of the effective soil depth requires costly field work. This research aimed to classify the effective soil depth by using multinomial logistic regression with environmental factors. The Wen-Shui Watershed, located in central Taiwan, was selected as the study area. The multinomial logistic regression analysis was performed with the assistance of a Geographic Information System (GIS). The effective soil depth was categorized into four levels: deeper, deep, shallow and shallower. The environmental factors of slope, aspect, elevation (from a digital elevation model, DEM), curvature and normalized difference vegetation index (NDVI) were selected for classifying the soil depth. An error matrix was then used to assess the model accuracy. The results showed an overall accuracy of 75%. Finally, a map of effective soil depth was produced to help planners and decision makers determine the slopeland utilizable limitation in the study area.
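
The accuracy assessment via an error matrix can be sketched as follows; the soil-depth labels for the eight hypothetical sites are invented so that the overall accuracy comes out at 0.75.

```python
from collections import Counter

def error_matrix(truth, pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(truth, pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

def overall_accuracy(matrix):
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical soil-depth classes for 8 sites.
labels = ["deeper", "deep", "shallow", "shallower"]
truth = ["deep", "deep", "shallow", "deeper", "shallower", "deep", "shallow", "deeper"]
pred  = ["deep", "shallow", "shallow", "deeper", "shallower", "deep", "shallow", "deep"]
m = error_matrix(truth, pred, labels)
print(overall_accuracy(m))  # → 0.75
```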

  17. Personality disorders, violence, and antisocial behavior: a systematic review and meta-regression analysis.

    Science.gov (United States)

    Yu, Rongqin; Geddes, John R; Fazel, Seena

    2012-10-01

    The risk of antisocial outcomes in individuals with personality disorder (PD) remains uncertain. The authors synthesize the current evidence on the risks of antisocial behavior, violence, and repeat offending in PD, and they explore sources of heterogeneity in risk estimates through a systematic review and meta-regression analysis of observational studies comparing antisocial outcomes in personality-disordered individuals with control groups. Fourteen studies examined the risk of antisocial and violent behavior in 10,007 individuals with PD, compared with over 12 million general population controls. There was a substantially increased risk of violent outcomes in studies including all PDs (random-effects pooled odds ratio [OR] = 3.0, 95% CI = 2.6 to 3.5). Meta-regression revealed that antisocial PD and gender were associated with higher risks (p = .01 and .07, respectively). The odds of all antisocial outcomes were also elevated. Twenty-five studies reported the risk of repeat offending in PD compared with other offenders. The risk of a repeat offense was also increased (fixed-effects pooled OR = 2.4, 95% CI = 2.2 to 2.7) in offenders with PD. The authors conclude that although PD is associated with antisocial outcomes and repeat offending, the risk appears to differ by PD category, gender, and whether individuals are offenders or not.
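
Fixed-effects pooling of odds ratios, as in the meta-analytic step above, is typically done by inverse-variance weighting on the log-OR scale. A minimal sketch; the three study-level ORs and confidence intervals are hypothetical, not taken from the review.

```python
import math

def pooled_or_fixed(studies):
    """Inverse-variance fixed-effects pooling of odds ratios.
    studies: list of (odds_ratio, ci_low, ci_high) with 95% CIs."""
    num = den = 0.0
    for or_, lo, hi in studies:
        # Standard error of log(OR) recovered from the 95% CI width.
        se = (math.log(hi) - math.log(lo)) / (2 * 1.96)
        w = 1 / se ** 2
        num += w * math.log(or_)
        den += w
    return math.exp(num / den)

# Hypothetical study-level odds ratios (not the values from the review above).
studies = [(2.0, 1.5, 2.7), (3.0, 2.0, 4.5), (2.5, 1.8, 3.5)]
pooled = pooled_or_fixed(studies)
print(round(pooled, 2))
```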

  18. A quantile regression approach to the analysis of the quality of life determinants in the elderly

    Directory of Open Access Journals (Sweden)

    Serena Broccoli

    2013-05-01

    Full Text Available Objective. The aim of this study is to explain the effect of important covariates on health-related quality of life (HRQoL) in elderly subjects. Methods. Data were collected within a longitudinal study involving 5256 subjects aged 65 or older. The Visual Analogue Scale included in the EQ-5D questionnaire, the EQ-VAS, was used to obtain a synthetic measure of quality of life. To model the EQ-VAS score, a quantile regression analysis was employed. This methodological approach was preferred to an OLS regression because of the typical distribution of the EQ-VAS score. The main covariates are: amount of weekly physical activity, reported problems in Activities of Daily Living (ADL), presence of cardiovascular diseases, diabetes, hypercholesterolemia, hypertension, and joint pains, as well as socio-demographic information. Main results. (1) Even a low level of physical activity significantly influences quality of life in a positive way; (2) ADL problems, at least one cardiovascular disease, and joint pain strongly decrease the quality of life.
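
The core of quantile regression is the check (pinball) loss, in place of the squared error used by OLS. A minimal sketch with hypothetical EQ-VAS-like scores: grid-searching the loss recovers the empirical 25% quantile.

```python
def pinball_loss(tau, y, q):
    """Average check (pinball) loss of predicting value q for observations y."""
    return sum(tau * (yi - q) if yi >= q else (tau - 1) * (yi - q)
               for yi in y) / len(y)

# Hypothetical EQ-VAS-like scores (0-100 scale).
scores = [40, 55, 60, 70, 75, 80, 85, 90, 95, 100]
tau = 0.25
# The minimizer of the pinball loss is the empirical tau-quantile.
best_q = min(range(0, 101), key=lambda q: pinball_loss(tau, scores, q))
print(best_q)  # → 60
```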

  19. Thermodynamic analysis on an anisotropically superhydrophobic surface with a hierarchical structure

    Energy Technology Data Exchange (ETDEWEB)

    Zhao, Jieliang [Division of Intelligent and Biomechanical Systems, State Key Laboratory of Tribology, Tsinghua University, Room 3407, Building 9003, 100084 Beijing (China); Su, Zhengliang [Division of Intelligent and Biomechanical Systems, State Key Laboratory of Tribology, Tsinghua University, Room 3407, Building 9003, 100084 Beijing (China); Department of Automotive Engineering, Tsinghua University, Beijing 100084 (China); Yan, Shaoze, E-mail: yansz@mail.tsinghua.edu.cn [Division of Intelligent and Biomechanical Systems, State Key Laboratory of Tribology, Tsinghua University, Room 3407, Building 9003, 100084 Beijing (China)

    2015-12-01

    Graphical abstract: - Highlights: • We model the superhydrophobic surface with anisotropic and hierarchical structure. • Anisotropic wetting only shows in noncomposite state (not in composite state). • Transition from noncomposite to composite state on dual-scale structure is hard. • Droplets tend to roll in the particular direction. • Droplets tend to stably remain in one preferred thermodynamic state. - Abstract: Superhydrophobic surfaces, which refer to the surfaces with contact angle higher than 150° and hysteresis less than 10°, have been reported in various studies. However, studies on the superhydrophobicity of anisotropic, hierarchical surfaces are limited and the corresponding thermodynamic mechanisms could not be explained thoroughly. Here we propose a simplified surface model of anisotropic patterned surface with dual scale roughness. Based on the thermodynamic method, we calculate the equilibrium contact angle (ECA) and the contact angle hysteresis (CAH) on the given surface. We show here that the hierarchical structure has much better anisotropic wetting properties than the single-scale one, and the results shed light on the potential application in controllable micro-/nano-fluidic systems. Our studies can be potentially applied for the fabrication of anisotropically superhydrophobic surfaces.

  20. Bayesian Data Analysis with the Bivariate Hierarchical Ornstein-Uhlenbeck Process Model.

    Science.gov (United States)

    Oravecz, Zita; Tuerlinckx, Francis; Vandekerckhove, Joachim

    2016-01-01

    In this paper, we propose a multilevel process modeling approach to describing individual differences in within-person changes over time. To characterize changes within an individual, repeated measures over time are modeled in terms of three person-specific parameters: a baseline level, intraindividual variation around the baseline, and regulatory mechanisms adjusting toward baseline. Variation due to measurement error is separated from meaningful intraindividual variation. The proposed model allows for the simultaneous analysis of longitudinal measurements of two linked variables (bivariate longitudinal modeling) and captures their relationship via two person-specific parameters. Relationships between explanatory variables and model parameters can be studied in a one-stage analysis, meaning that model parameters and regression coefficients are estimated simultaneously. Mathematical details of the approach, including a description of the core process model-the Ornstein-Uhlenbeck model-are provided. We also describe a user friendly, freely accessible software program that provides a straightforward graphical interface to carry out parameter estimation and inference. The proposed approach is illustrated by analyzing data collected via self-reports on affective states.
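
A minimal Euler-Maruyama simulation of the core Ornstein-Uhlenbeck process, dX = theta*(mu - X)dt + sigma dW, with illustrative (not estimated) person-specific parameters: mu is the baseline, theta the regulatory adjustment toward baseline, and sigma the intraindividual variation.

```python
import random

def simulate_ou(theta, mu, sigma, x0, dt, n, seed=42):
    """Euler-Maruyama simulation of dX = theta*(mu - X)dt + sigma dW."""
    rng = random.Random(seed)
    x, path = x0, [x0]
    for _ in range(n):
        x += theta * (mu - x) * dt + sigma * (dt ** 0.5) * rng.gauss(0, 1)
        path.append(x)
    return path

# Illustrative person-specific parameters (hypothetical values).
path = simulate_ou(theta=2.0, mu=5.0, sigma=0.5, x0=0.0, dt=0.01, n=2000)
# After a burn-in, the path fluctuates around the baseline mu.
long_run = sum(path[1000:]) / len(path[1000:])
print(round(long_run, 2))
```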

  1. Factors predicting the failure of Bernese periacetabular osteotomy: a meta-regression analysis.

    Science.gov (United States)

    Sambandam, Senthil Nathan; Hull, Jason; Jiranek, William A

    2009-12-01

    There is no clear evidence regarding the outcome of Bernese periacetabular osteotomy (PAO) in different patient populations. We performed a systematic meta-regression analysis of 23 eligible studies. There were 1,113 patients, of whom 61 had total hip arthroplasty (THA) (endpoint) as a result of failed Bernese PAO. Univariate analysis revealed significant correlations between THA and the presence of grade 2/grade 3 arthritis, Merle d'Aubigné score (MDS), Harris hip score and Tonnis angle, change in lateral centre edge (LCE) angle, late proximal femoral osteotomies, and heterotopic ossification (HO) resection. Multivariate analysis showed that the odds of having THA increase with grade 2/grade 3 osteoarthritis (3.36 times), joint penetration (3.12 times), low preoperative MDS (1.59 times), late PFO (1.59 times), presence of preoperative subluxation (1.22 times), previous hip operations (1.14 times), and concomitant PFO (1.09 times). In the absence of randomised controlled studies, the findings of this analysis can help the surgeon make treatment decisions.

  2. A Performance Study of Data Mining Techniques: Multiple Linear Regression vs. Factor Analysis

    CERN Document Server

    Taneja, Abhishek

    2011-01-01

    The growing volume of data creates an interesting challenge and a need for data analysis tools that discover regularities in these data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of hidden knowledge, and autonomous decision making in many application domains. The purpose of this study is to compare the performance of two data mining techniques, factor analysis and multiple linear regression, for different sample sizes on three unique sets of data. The performance of the two techniques is compared on the following parameters: mean square error (MSE), R-square, adjusted R-square, condition number, root mean square error (RMSE), number of variables included in the prediction model, modified coefficient of efficiency, F-value, and test of normality. These parameters have been computed using various data mining tools like SPSS, XLstat, Stata, and MS-Excel. It is seen that for all the given datasets, factor analysis outperforms multiple linear re...

  3. Regression analysis between body and head measurements of Chinese alligators (Alligator sinensis) in the captive population

    Directory of Open Access Journals (Sweden)

    Wu, X. B.

    2006-06-01

    Full Text Available Four body-size and fourteen head-size measurements were taken from each Chinese alligator (Alligator sinensis) according to the measurements adapted from Verdade. Regression equations between body-size and head-size variables are presented to predict body size from head dimensions. The coefficients of determination of captive animals concerning body- and head-size variables can be considered extremely high, which means most of the head-size variables studied can be useful for predicting body length. The result of multivariate allometric analysis indicated that the head elongates as in most other species of crocodilians. The allometric coefficients of snout length (SL) and lower ramus (LM) were greater than those of other head variables, which was considered to be possibly correlated with fighting and prey capture. On the contrary, the allometric coefficients for the variables of the orbit (OW, OL) and postorbital cranial roof (LCR) were lower than those of other variables.

  4. Deterministic Assessment of Continuous Flight Auger Construction Durations Using Regression Analysis

    Directory of Open Access Journals (Sweden)

    Hossam E. Hosny

    2015-07-01

    Full Text Available One of the primary functions of construction equipment management is to calculate the production rate of equipment, which is a major input to the processes of time estimation, cost estimation and overall project planning. Accordingly, it is crucial for stakeholders to be able to compute equipment production rates using an accurate, reliable and easy tool. The objective of this research is to provide a simple model that can be used by specialists to predict the duration of a proposed Continuous Flight Auger job. The model was obtained using a prioritizing technique based on expert judgment and then multi-regression analysis on a representative sample. The model was then validated on a selected sample of projects. The average error of the model was calculated to be about 3%-6%.

  5. Biological stability in drinking water: a regression analysis of influencing factors

    Institute of Scientific and Technical Information of China (English)

    LU Wei; ZHANG Xiao-jian

    2005-01-01

    Some parameters, such as assimilable organic carbon (AOC), chloramine residual, water temperature, and water residence time, were measured in drinking water from distribution systems in a northern city of China. The measurement results illustrate that when the chloramine residual is more than 0.3 mg/L or the AOC content is below 50 μg/L, the biological stability of drinking water can be controlled. Both chloramine residual and AOC have a good relationship with heterotrophic plate counts (HPC) (log value); the correlation coefficients were -0.64 and 0.33, respectively. By regression analysis of the survey data, a statistical equation is presented, and it is concluded that disinfectant residual exerts the strongest influence on bacterial growth and that AOC is a suitable index to assess the biological stability of drinking water.
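
The reported correlations can be reproduced in form (not in value) with a plain Pearson coefficient; the paired measurements below are hypothetical, chosen only to show a strong negative association like that between chloramine residual and log HPC.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical paired measurements: chloramine residual (mg/L) vs. log HPC.
chloramine = [0.1, 0.2, 0.3, 0.4, 0.5]
log_hpc = [3.2, 2.8, 2.1, 1.8, 1.1]
r = pearson_r(chloramine, log_hpc)
print(round(r, 2))  # → -0.99
```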

  6. Logistic Regression Analysis on Factors Affecting Adoption of Rice-Fish Farming in North Iran

    Institute of Scientific and Technical Information of China (English)

    Seyyed Ali NOORHOSSEINI-NIYAKI; Mohammad Sadegh ALLAHYARI

    2012-01-01

    We evaluated the factors influencing the adoption of rice-fish farming in the Tavalesh region near the Caspian Sea in northern Iran. We conducted a survey with open-ended questions. Data were collected from 184 respondents (61 adopters and 123 non-adopters) randomly sampled from selected villages and analyzed using logistic regression and multi-response analysis. Family size, number of contacts with an extension agent, participation in extension-education activities, membership in social institutions and the presence of farm workers were the most important socioeconomic factors for the adoption of the rice-fish farming system. In addition, economic problems were the most common issue reported by adopters. Other issues such as lack of access to appropriate fish food, losses of fish, lack of access to high quality fish fingerlings, dehydration and poor water quality were also important to a number of farmers.

  7. ANALYSIS OF TUITION GROWTH RATES BASED ON CLUSTERING AND REGRESSION MODELS

    Directory of Open Access Journals (Sweden)

    Long Cheng

    2016-07-01

    Full Text Available Tuition plays a significant role in determining whether a student could afford higher education, which is one of the major driving forces for country development and social prosperity. So it is necessary to fully understand what factors might affect the tuition and how they affect it. However, many existing studies on the tuition growth rate either lack sufficient real data and proper quantitative models to support their conclusions, or are limited to focus on only a few factors that might affect the tuition growth rate, failing to make a comprehensive analysis. In this paper, we explore a wide variety of factors that might affect the tuition growth rate by use of large amounts of authentic data and different quantitative methods such as clustering and regression models.

  8. A generalized Defries-Fulker regression framework for the analysis of twin data.

    Science.gov (United States)

    Lazzeroni, Laura C; Ray, Amrita

    2013-01-01

    Twin studies compare the similarity between monozygotic twins to that between dizygotic twins in order to investigate the relative contributions of latent genetic and environmental factors influencing a phenotype. Statistical methods for twin data include likelihood estimation and Defries-Fulker regression. We propose a new generalization of the Defries-Fulker model that fully incorporates the effects of observed covariates on both members of a twin pair and is robust to violations of the Normality assumption. A simulation study demonstrates that the method is competitive with likelihood analysis. The Defries-Fulker strategy yields new insight into the parameter space of the twin model and provides a novel, prediction-based interpretation of twin study results that unifies continuous and binary traits. Due to the simplicity of its structure, extensions of the model have the potential to encompass generalized linear models, censored and truncated data; and gene by environment interactions.

  9. Electricity price forecasting using generalized regression neural network based on principal components analysis

    Institute of Scientific and Technical Information of China (English)

    牛东晓; 刘达; 邢棉

    2008-01-01

    A combined model based on principal components analysis (PCA) and a generalized regression neural network (GRNN) was adopted to forecast the electricity price in a day-ahead electricity market. PCA was applied to extract the main influences on the day-ahead price, avoiding the strong correlation between the input factors that might influence the electricity price, such as the load of the forecasting hour, other historical loads and prices, weather and temperature; the GRNN was then employed to forecast the electricity price from the main information extracted by PCA. To prove the efficiency of the combined model, a case from the PJM (Pennsylvania-New Jersey-Maryland) day-ahead electricity market was evaluated. Compared to a back-propagation (BP) neural network and a standard GRNN, the combined method reduces the mean absolute percentage error by about 3%.
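
A GRNN prediction is essentially a Gaussian-kernel-weighted average of training targets (a Nadaraya-Watson estimator). A minimal sketch with hypothetical PCA-reduced inputs and prices; the bandwidth sigma is the single smoothing parameter.

```python
import math

def grnn_predict(x_train, y_train, x, sigma=0.5):
    """GRNN prediction: Gaussian-kernel-weighted average of training targets."""
    weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(xi, x)) / (2 * sigma ** 2))
               for xi in x_train]
    return sum(w * y for w, y in zip(weights, y_train)) / sum(weights)

# Hypothetical PCA-reduced inputs (two components) and day-ahead prices.
x_train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
prices = [20.0, 30.0, 25.0, 40.0]
# With a small bandwidth, a query at a training point reproduces its target.
pred = grnn_predict(x_train, prices, (0.0, 0.0), sigma=0.1)
print(round(pred, 1))  # → 20.0
```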

  10. Sensitivity Analysis to Select the Most Influential Risk Factors in a Logistic Regression Model

    Directory of Open Access Journals (Sweden)

    Jassim N. Hussain

    2008-01-01

    Full Text Available The traditional variable selection methods for survival data depend on iteration procedures, and control of this process assumes tuning parameters that are problematic and time consuming, especially if the models are complex and have a large number of risk factors. In this paper, we propose a new method based on the global sensitivity analysis (GSA to select the most influential risk factors. This contributes to simplification of the logistic regression model by excluding the irrelevant risk factors, thus eliminating the need to fit and evaluate a large number of models. Data from medical trials are suggested as a way to test the efficiency and capability of this method and as a way to simplify the model. This leads to construction of an appropriate model. The proposed method ranks the risk factors according to their importance.

  11. Melanin and blood concentration in human skin studied by multiple regression analysis: experiments

    Science.gov (United States)

    Shimada, M.; Yamada, Y.; Itoh, M.; Yatagai, T.

    2001-09-01

    Knowledge of the mechanism of human skin colour and measurement of melanin and blood concentrations in human skin are needed in the medical and cosmetic fields. The absorbance spectrum derived from the reflectance of human skin at visible wavelengths increases under several conditions such as sunburn or scalding. This change in the reflectance-derived absorbance spectrum, which includes the scattering effect, does not correspond to the molar absorption spectra of melanin and blood. The modified Beer-Lambert law is applied to the change in the reflectance-derived absorbance spectrum of human skin, as the changes in melanin and blood are assumed to be small. The concentrations of melanin and blood were estimated from the absorbance spectrum using multiple regression analysis. Estimated concentrations were compared with the measured ones in a phantom experiment, and the method was applied to in vivo skin.
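
Under the modified Beer-Lambert model, the estimation reduces to a linear least-squares problem in the concentration changes across wavelengths. The sketch below solves the two-chromophore normal equations; the extinction coefficients and concentrations are synthetic illustrations, not the paper's values.

```python
def solve_concentrations(eps_mel, eps_blood, delta_a):
    """Least-squares fit of dA(lambda) = eps_mel*dc_mel + eps_blood*dc_blood
    across wavelengths (modified Beer-Lambert, path length folded into eps)."""
    # Normal equations for the two unknowns dc_mel and dc_blood.
    a11 = sum(e * e for e in eps_mel)
    a12 = sum(e * f for e, f in zip(eps_mel, eps_blood))
    a22 = sum(f * f for f in eps_blood)
    b1 = sum(e * d for e, d in zip(eps_mel, delta_a))
    b2 = sum(f * d for f, d in zip(eps_blood, delta_a))
    det = a11 * a22 - a12 ** 2
    return (b1 * a22 - b2 * a12) / det, (b2 * a11 - b1 * a12) / det

# Hypothetical extinction coefficients at four wavelengths; the synthetic
# absorbance change is generated from dc_mel = 0.3, dc_blood = 0.1.
eps_mel = [1.0, 0.8, 0.6, 0.4]
eps_blood = [0.2, 0.5, 0.9, 1.2]
delta_a = [m * 0.3 + b * 0.1 for m, b in zip(eps_mel, eps_blood)]
c_mel, c_blood = solve_concentrations(eps_mel, eps_blood, delta_a)
print(round(c_mel, 3), round(c_blood, 3))  # → 0.3 0.1
```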

  12. Model selection for marginal regression analysis of longitudinal data with missing observations and covariate measurement error.

    Science.gov (United States)

    Shen, Chung-Wei; Chen, Yi-Hau

    2015-10-01

    Missing observations and covariate measurement error commonly arise in longitudinal data. However, existing methods for model selection in marginal regression analysis of longitudinal data fail to address the potential bias resulting from these issues. To tackle this problem, we propose a new model selection criterion, the Generalized Longitudinal Information Criterion, which is based on an approximately unbiased estimator for the expected quadratic error of a considered marginal model accounting for both data missingness and covariate measurement error. The simulation results reveal that the proposed method performs quite well in the presence of missing data and covariate measurement error. On the contrary, the naive procedures without taking care of such complexity in data may perform quite poorly. The proposed method is applied to data from the Taiwan Longitudinal Study on Aging to assess the relationship of depression with health and social status in the elderly, accommodating measurement error in the covariate as well as missing observations.

  13. Functional Unfold Principal Component Regression Methodology for Analysis of Industrial Batch Process Data

    DEFF Research Database (Denmark)

    Mears, Lisa; Nørregaard, Rasmus; Sin, Gürkan;

    2016-01-01

    This work proposes a methodology utilizing functional unfold principal component regression (FUPCR) for application to industrial batch process data as a process modeling and optimization tool. The methodology is applied to an industrial fermentation dataset containing 30 batches of a production process operating at Novozymes A/S. Following the FUPCR methodology, the final product concentration could be predicted with an average prediction error of 7.4%. Multiple iterations of preprocessing were applied by implementing the methodology to identify the best data handling methods for the model. It is shown that application of functional data analysis and the choice of variance scaling method have the greatest impact on the prediction accuracy. Considering the vast amount of batch process data continuously generated in industry, this methodology can potentially contribute as a tool to identify...

  14. A Note on Penalized Regression Spline Estimation in the Secondary Analysis of Case-Control Data

    KAUST Repository

    Gazioglu, Suzan

    2013-05-25

    Primary analysis of case-control studies focuses on the relationship between disease (D) and a set of covariates of interest (Y, X). A secondary application of the case-control study, often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated due to the case-control sampling, and to avoid the biased sampling that arises from the design, it is typical to use the control data only. In this paper, we develop penalized regression spline methodology that uses all the data, and improves precision of estimation compared to using only the controls. A simulation study and an empirical example are used to illustrate the methodology.

  15. Logistic regression analysis of the risk factors of acute renal failure complicating limb war injuries

    Directory of Open Access Journals (Sweden)

    Chang-zhi CHENG

    2011-06-01

    Full Text Available Objective. To explore the risk factors for the complication of acute renal failure (ARF) in war injuries of the limbs. Methods. The clinical data of 352 patients with limb injuries admitted to 303 Hospital of PLA from 1968 to 2002 were retrospectively analyzed. The patients were divided into an ARF group (n=9) and a non-ARF group (n=343) according to the occurrence of ARF, and a case-control study was carried out. Ten factors that might lead to death were analyzed by logistic regression to screen the risk factors for ARF, including cause of trauma, shock after injury, time to hospital admission after injury, injured sites, combined trauma, number of surgical procedures, presence of foreign matter, features of fractures, amputation, and tourniquet time. Results. Fifteen of the 352 patients died (4.3%); among them, 7 (46.7%) died of ARF, 3 (20.0%) of pulmonary embolism, 3 (20.0%) of gas gangrene, and 2 (13.3%) of multiple organ failure. Univariate analysis revealed that shock, time before admission, amputation and tourniquet time were risk factors for ARF in the wounded with limb injuries, while logistic regression analysis showed that only amputation was a risk factor for ARF (P < 0.05). Conclusion. ARF is the primary cause of death in the wounded with limb injury. Prompt and accurate treatment and an optimal time for amputation may help decrease the incidence and mortality of ARF in the wounded with severe limb injury and ischemic necrosis.

  16. [The hierarchical clustering analysis of hyperspectral image based on probabilistic latent semantic analysis].

    Science.gov (United States)

    Yi, Wen-Bin; Shen, Li; Qi, Yin-Feng; Tang, Hong

    2011-09-01

    The paper introduces Probabilistic Latent Semantic Analysis (PLSA) to image clustering, and an effective clustering algorithm for hyperspectral images using the semantic information from PLSA is proposed. Firstly, the ISODATA algorithm is used to obtain an initial clustering of the hyperspectral image, and the clusters of this initial result are taken as the visual words of the PLSA. Secondly, an object-oriented image segmentation algorithm is used to partition the hyperspectral image, and segments with relatively pure pixels are regarded as documents in the PLSA. Thirdly, a variety of identification methods that can estimate the best number of cluster centers are combined to determine the number of latent semantic topics. Then the conditional distributions of visual words within topics and the mixtures of topics in the different documents are estimated using PLSA. Finally, the conditional probabilities of the latent semantic topics are distinguished using a statistical pattern recognition method, the topic type for each visual word in each document is assigned, and the clustering result of the hyperspectral image is thereby achieved. Experimental results show that the clusters of the proposed algorithm are better than those of K-MEANS and ISODATA in terms of object-oriented properties, and the clustering result is closer to the real spatial distribution of the surface.

  17. Automated Detection of Connective Tissue by Tissue Counter Analysis and Classification and Regression Trees

    Directory of Open Access Journals (Sweden)

    Josef Smolle

    2001-01-01

    Full Text Available Objective: To evaluate the feasibility of the CART (Classification and Regression Tree) procedure for the recognition of microscopic structures in tissue counter analysis. Methods: Digital microscopic images of H&E-stained slides of normal human skin and of primary malignant melanoma were overlaid with regularly distributed square measuring masks (elements), and grey value, texture and colour features within each mask were recorded. In the learning set, elements were interactively labeled as representing either connective tissue of the reticular dermis, other tissue components or background. Subsequently, CART models were based on these data sets. Results: Implementation of the CART classification rules into the image analysis program showed that, in an independent test set, 94.1% of elements classified as connective tissue of the reticular dermis were correctly labeled. Automated measurements of the total amount of tissue and of the amount of connective tissue within a slide showed high reproducibility (r=0.97 and r=0.94, respectively; p < 0.001). Conclusions: The CART procedure in tissue counter analysis yields simple and reproducible classification rules for tissue elements.
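
A single CART split chooses the feature threshold minimizing the weighted Gini impurity of the two resulting groups. A minimal sketch on hypothetical grey values for two element classes:

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Find the threshold on one feature minimizing weighted Gini impurity,
    as a single CART split would."""
    best = (None, float("inf"))
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(values)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical grey values for elements labeled connective tissue vs. background.
grey = [30, 35, 40, 80, 85, 90]
label = ["tissue", "tissue", "tissue", "background", "background", "background"]
threshold, impurity = best_split(grey, label)
print(threshold, impurity)  # → 40 0.0
```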

  18. Analysis of Hierarchical Diff-EDF Schedulability over Heterogeneous Real-Time Packet Networks

    Directory of Open Access Journals (Sweden)

    M. Saleh

    2007-01-01

    Full Text Available Packet networks currently enable the integration of traffic with a wide range of characteristics, extending from video traffic with stringent QoS requirements to best-effort traffic requiring no guarantees. QoS guarantees can be provided in conventional packet networks by the use of proper packet scheduling algorithms. Many scheduling algorithms have been proposed to provide different schemes of QoS guarantees, with Earliest Deadline First (EDF) the most popular. Under EDF scheduling, all flows receive the same miss rate regardless of their traffic characteristics and deadlines. This makes the standard EDF algorithm unsuitable when different flows have different miss rate requirements, since meeting all of them requires limiting admissions so as to satisfy the flow with the most stringent requirement. In this paper, we propose a new priority assignment scheduling algorithm, Hierarchical Diff-EDF (Differentiated Earliest Deadline First), which can meet the real-time needs of these applications while continuing to provide best-effort service to non-real-time traffic. The Hierarchical Diff-EDF features a feedback control mechanism that detects overload conditions and modifies packet priority assignments accordingly. To examine the proposed scheduler model, we first attempted an exact analytical solution; this proved very complicated due to the high interdependence between the service of the system queues, so the use of simulation techniques was inevitable. The simulation results showed that the Hierarchical Diff-EDF achieved the minimum packet average delay when compared with both the EDF and Diff-EDF schedulers.
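
The baseline EDF discipline the paper builds on can be sketched with a priority queue keyed by deadline. The `DiffEDFScheduler` variant below, which shifts deadlines earlier by a per-class offset, is only a hypothetical illustration of how deadline differentiation might work, not the authors' Hierarchical Diff-EDF:

```python
import heapq

class EDFScheduler:
    """Earliest-Deadline-First queue: pop always returns the packet
    whose deadline is soonest."""
    def __init__(self):
        self._heap = []
        self._seq = 0          # tie-breaker for equal deadlines (FIFO)
    def push(self, packet, deadline):
        heapq.heappush(self._heap, (deadline, self._seq, packet))
        self._seq += 1
    def pop(self):
        return heapq.heappop(self._heap)[2]

class DiffEDFScheduler(EDFScheduler):
    """Hypothetical Diff-EDF-style variant: each flow class gets its
    deadline shifted earlier by a configured offset, so stricter
    classes win ties against best-effort traffic."""
    def __init__(self, class_offsets):
        super().__init__()
        self.offsets = class_offsets
    def push(self, packet, deadline, flow_class):
        super().push(packet, deadline - self.offsets[flow_class])
```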

  19. Genetic analysis of somatic cell score in Norwegian cattle using random regression test-day models.

    Science.gov (United States)

    Odegård, J; Jensen, J; Klemetsdal, G; Madsen, P; Heringstad, B

    2003-12-01

    The dataset used in this analysis contained a total of 341,736 test-day observations of somatic cell scores from 77,110 primiparous daughters of 1965 Norwegian Cattle sires. Initial analyses, using simple random regression models without genetic effects, indicated that use of a homogeneous residual variance was appropriate. Further analyses were carried out using a repeatability model and 12 random regression sire models. Legendre polynomials of varying order were used to model both permanent environmental and sire effects, as were the Wilmink function, the Lidauer-Mäntysaari function, and the Ali-Schaeffer function. For all these models, heritability estimates were lowest at the beginning (0.05 to 0.07) and higher at the end (0.09 to 0.12) of lactation. Genetic correlations between somatic cell scores early and late in lactation were moderate to high (0.38 to 0.71), whereas genetic correlations for adjacent DIM were near unity. Models were compared based on likelihood ratio tests, Bayesian information criterion, Akaike information criterion, residual variance, and predictive ability. Based on prediction of randomly excluded observations, models with 4 coefficients for the permanent environmental effect were preferred over simpler models. More highly parameterized models did not substantially increase predictive ability. Evaluation of the different model selection criteria indicated that a reduced order of fit for sire effects was desirable. Models with zeroth- or first-order fit for sire effects and a higher order of fit for permanent environmental effects probably underestimated sire variance. The chosen model had Legendre polynomials with 3 coefficients for sire effects and 4 coefficients for permanent environmental effects. For this model, trajectories of sire variance and heritability were similar assuming either a homogeneous or a heterogeneous residual variance structure.
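
The Legendre-polynomial covariables used in such random regression test-day models can be built as follows; the days-in-milk range used for rescaling to [-1, 1] is an assumed example, not taken from the paper:

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(dim, n_coef, dim_min=5, dim_max=305):
    """Evaluate the first n_coef Legendre polynomials P_0..P_{n_coef-1}
    at days-in-milk values rescaled to [-1, 1], as commonly done in
    random regression test-day models (assumed lactation range)."""
    x = 2.0 * (np.asarray(dim, float) - dim_min) / (dim_max - dim_min) - 1.0
    # legval with the k-th unit coefficient vector evaluates P_k(x)
    return np.column_stack([legendre.legval(x, np.eye(n_coef)[k])
                            for k in range(n_coef)])
```

Each row of the returned matrix is the set of regression covariables for one test-day record; a 3-coefficient sire effect and 4-coefficient permanent environmental effect would use `n_coef=3` and `n_coef=4`, respectively.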

  20. Neighborhood Predictors of Intimate Partner Violence: A Theory-Informed Analysis Using Hierarchical Linear Modeling.

    Science.gov (United States)

    Voith, Laura A; Brondino, Michael J

    2017-09-01

    Due to high prevalence rates and deleterious effects on individuals, families, and communities, intimate partner violence (IPV) is a significant public health problem. Because IPV occurs in the context of communities and neighborhoods, research must examine the broader environment in addition to individual-level factors to successfully facilitate behavior change. Drawing from the Social Determinants of Health framework and Social Disorganization Theory, neighborhood predictors of IPV were tested using hierarchical linear modeling. Results indicated that concentrated disadvantage and female-to-male partner violence were robust predictors of women's IPV victimization. Implications for theory, practice, policy, and future research are discussed. © Society for Community Research and Action 2017.

  1. Generalized multilevel function-on-scalar regression and principal component analysis.

    Science.gov (United States)

    Goldsmith, Jeff; Zipunnikov, Vadim; Schrack, Jennifer

    2015-06-01

    This manuscript considers regression models for generalized, multilevel functional responses: functions are generalized in that they follow an exponential family distribution and multilevel in that they are clustered within groups or subjects. This data structure is increasingly common across scientific domains and is exemplified by our motivating example, in which binary curves indicating physical activity or inactivity are observed for nearly 600 subjects over 5 days. We use a generalized linear model to incorporate scalar covariates into the mean structure, and decompose subject-specific and subject-day-specific deviations using multilevel functional principal components analysis. Thus, functional fixed effects are estimated while accounting for within-function and within-subject correlations, and major directions of variability within and between subjects are identified. Fixed effect coefficient functions and principal component basis functions are estimated using penalized splines; model parameters are estimated in a Bayesian framework using Stan, a programming language that implements a Hamiltonian Monte Carlo sampler. Simulations designed to mimic the application show good estimation and inferential properties with reasonable computation times for moderate datasets, in both cross-sectional and multilevel scenarios; code is publicly available. In the application we identify effects of age and BMI on the time-specific change in probability of being active over a 24-hour period; in addition, the principal components analysis identifies the patterns of activity that distinguish subjects and days within subjects.

  2. The value of a statistical life: a meta-analysis with a mixed effects regression model.

    Science.gov (United States)

    Bellavance, François; Dionne, Georges; Lebeau, Martin

    2009-03-01

    The value of a statistical life (VSL) is a very controversial topic, but one which is essential to the optimization of governmental decisions. We see a great variability in the values obtained from different studies. The source of this variability needs to be understood, in order to offer public decision-makers better guidance in choosing a value and to set clearer guidelines for future research on the topic. This article presents a meta-analysis based on 39 observations obtained from 37 studies (from nine different countries) which all use a hedonic wage method to calculate the VSL. Our meta-analysis is innovative in that it is the first to use the mixed effects regression model [Raudenbush, S.W., 1994. Random effects models. In: Cooper, H., Hedges, L.V. (Eds.), The Handbook of Research Synthesis. Russel Sage Foundation, New York] to analyze studies on the value of a statistical life. We conclude that the variability found in the values studied stems in large part from differences in methodologies.
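
For flavor, the classic random-effects pooling step that underlies such meta-analyses can be sketched with the DerSimonian-Laird estimator. Note this is a simpler stand-in named plainly, not the mixed effects regression model of Raudenbush that the paper actually uses:

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate via the DerSimonian-Laird method.

    effects: per-study effect sizes (e.g. VSL estimates)
    variances: per-study sampling variances
    Returns (pooled estimate, its standard error, between-study variance tau^2)."""
    y = np.asarray(effects, float)
    v = np.asarray(variances, float)
    w = 1.0 / v
    fixed = np.sum(w * y) / np.sum(w)                 # fixed-effect pooled mean
    q = np.sum(w * (y - fixed) ** 2)                  # Cochran's Q heterogeneity
    df = len(y) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)                     # between-study variance
    w_star = 1.0 / (v + tau2)                         # random-effects weights
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return pooled, se, tau2
```

A mixed effects regression model generalizes this by letting study-level covariates (country, methodology, sample characteristics) explain part of the between-study variance.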

  3. QUANTITATIVE ELECTRONIC STRUCTURE-ACTIVITY RELATIONSHIP ANALYSIS OF ANTIMUTAGENIC BENZALACETONE DERIVATIVES BY PRINCIPAL COMPONENT REGRESSION APPROACH

    Directory of Open Access Journals (Sweden)

    Yuliana Yuliana

    2010-06-01

    Full Text Available A Quantitative Electronic Structure-Activity Relationship (QSAR) analysis of a series of benzalacetones has been carried out based on semi-empirical PM3 calculation data using Principal Component Regression (PCR). The investigation was based on the antimutagenic activity of benzalacetone compounds (expressed as log 1/IC50), studied as a linear correlation with latent variables (Tx) resulting from the transformation of atomic net charges using Principal Component Analysis (PCA). The QSAR equation was determined based on the distribution of selected components and then analysed with PCR. The result is described by the following QSAR equation: log 1/IC50 = 6.555 + (2.177)T1 + (2.284)T2 + (1.933)T3. The equation is significant at the 95% level with statistical parameters n = 28, r = 0.766, SE = 0.245, Fcalculation/Ftable = 3.780, and gave a PRESS of 0.002, meaning that there were only relatively few deviations between the experimental and theoretical antimutagenic activity data. New types of benzalacetone derivative compounds were designed and their theoretical activities predicted based on the best QSAR equation. Compounds number 29, 30, 31, 32, 33, 35, 36, 37, 38, 40, 41, 42, 44, 47, 48, 49 and 50 were found to have relatively high antimutagenic activity. Keywords: QSAR; antimutagenic activity; benzalacetone; atomic net charge
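
Principal Component Regression itself — PCA on the descriptors, then least squares on the retained components — can be sketched with scikit-learn. The descriptor matrix below is synthetic, standing in for the atomic net charges; note how PCR handles the deliberately collinear column that would destabilize ordinary least squares:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
# Hypothetical descriptors for 28 compounds: three independent charges
# plus one near-duplicate (collinear) column
base = rng.normal(size=(28, 3))
X = np.column_stack([base, base[:, 0] + 0.01 * rng.normal(size=28)])
# Hypothetical activity (log 1/IC50) driven by the first two descriptors
y = 6.5 + 2.0 * base[:, 0] - 1.0 * base[:, 1] + 0.05 * rng.normal(size=28)

# PCR: project onto the leading principal components, then regress
pcr = make_pipeline(PCA(n_components=3), LinearRegression()).fit(X, y)
r2 = pcr.score(X, y)
```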

  4. Binary Logistic Regression Analysis of Foramen Magnum Dimensions for Sex Determination

    Science.gov (United States)

    Kamath, Venkatesh Gokuldas

    2015-01-01

    Purpose. The structural integrity of the foramen magnum is usually preserved in fire accidents and explosions due to its resistant nature and secluded anatomical position, and this study attempts to determine its sexing potential. Methods. The sagittal and transverse diameters and area of the foramen magnum of seventy-two skulls (41 male and 31 female) from a south Indian population were measured. The analysis was done using Student's t-test, linear correlation, histograms, Q-Q plots, and Binary Logistic Regression (BLR) to obtain a model for sex determination. The predicted probabilities of BLR were analysed using a Receiver Operating Characteristic (ROC) curve. Result. BLR analysis and the ROC curve revealed that the predictability of the dimensions in sexing the crania was 69.6% for sagittal diameter, 66.4% for transverse diameter, and 70.3% for area of the foramen. Conclusion. The sexual dimorphism of foramen magnum dimensions is established. However, due to considerable overlapping of male and female values, it is unwise to rely singularly on the foramen measurements. Nevertheless, considering the high sex predictability percentage of its dimensions in the present study and the studies preceding it, the foramen measurements can be used to supplement other sexing evidence available so as to precisely ascertain the sex of the skeleton. PMID:26346917
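
The BLR-plus-ROC workflow can be sketched with scikit-learn. The diameters below are simulated with heavily overlapping male/female distributions merely to mimic the setting; they are not the paper's measurements:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
# Hypothetical sagittal/transverse diameters (mm): males slightly
# larger on average, with substantial overlap between the sexes
n_m, n_f = 41, 31
male = np.column_stack([rng.normal(35.5, 2, n_m), rng.normal(30.0, 2, n_m)])
female = np.column_stack([rng.normal(33.5, 2, n_f), rng.normal(28.5, 2, n_f)])
X = np.vstack([male, female])
y = np.r_[np.ones(n_m), np.zeros(n_f)]   # 1 = male, 0 = female

clf = LogisticRegression().fit(X, y)
# Area under the ROC curve of the predicted probabilities
auc = roc_auc_score(y, clf.predict_proba(X)[:, 1])
```

An AUC well above 0.5 but well below 1 is exactly the pattern the paper reports: real dimorphism, but too much overlap to rely on these measurements alone.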

  5. A calibration method of Argo floats based on multiple regression analysis

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Argo floats are free-moving floats that report vertical profiles of salinity, temperature and pressure at regular time intervals. These floats give good measurements of temperature and pressure, but salinity measurements may show significant sensor drift with time. It is found that the sensor drift with time is not purely linear, as presupposed by Wong (2003). A new method is developed to calibrate conductivity data measured by Argo floats. In this method, Wong's objective analysis method is adopted to estimate the background climatological salinity field on potential temperature surfaces from nearby historical data in WOD01. Furthermore, temperature and time factors are taken into account, and stepwise regression is used to fit a time-varying or temperature-varying slope in potential conductivity space to correct the drift in the profiling float salinity data. The results show that salinity errors using this method are smaller than those of Wong's method, and that both quantitative and qualitative analysis of the conductivity sensor can be carried out with our method.

  6. Regression analysis in modeling of air surface temperature and factors affecting its value in Peninsular Malaysia

    Science.gov (United States)

    Rajab, Jasim Mohammed; Jafri, Mohd. Zubir Mat; Lim, Hwee San; Abdullah, Khiruddin

    2012-10-01

    This study encompasses air surface temperature (AST) modeling in the lower atmosphere. A dataset of four atmospheric pollutant gases (CO, O3, CH4, and H2O), retrieved from the National Aeronautics and Space Administration Atmospheric Infrared Sounder (AIRS) from 2003 to 2008, was employed to develop a model to predict AST values over the Malaysian peninsula using the multiple regression method. For the entire period, the pollutants were highly correlated (R=0.821) with predicted AST. Comparisons among five stations in 2009 showed close agreement between the predicted AST and the observed AST from AIRS, especially in the southwest monsoon (SWM) season, within 1.3 K, and for in situ data, within 1 to 2 K. Validation of the predicted AST against AST from AIRS showed high correlation coefficients (R=0.845 to 0.918), indicating the model's efficiency and accuracy. Statistical analysis in terms of β showed that H2O (0.565 to 1.746) tended to contribute significantly to high AST values during the northeast monsoon season. Generally, these results clearly indicate the advantage of using the satellite AIRS data and a correlation analysis to investigate the impact of atmospheric greenhouse gases on AST over the Malaysian peninsula. The developed model is capable of retrieving the peninsular Malaysian AST in all weather conditions, with total uncertainties ranging between 1 and 2 K.

  7. A factor analysis-multiple regression model for source apportionment of suspended particulate matter

    Science.gov (United States)

    Okamoto, Shin'ichi; Hayashi, Masayuki; Nakajima, Masaomi; Kainuma, Yasutaka; Shiozawa, Kiyoshige

    A factor analysis-multiple regression (FA-MR) model has been used for a source apportionment study in the Tokyo metropolitan area. By a varimax rotated factor analysis, five source types could be identified: refuse incineration, soil and automobile, secondary particles, sea salt and steel mill. Quantitative estimations using the FA-MR model corresponded to the contributing concentrations calculated with a weighted least-squares CMB model. However, the source type of refuse incineration identified by the FA-MR model was similar to that of biomass burning, rather than that produced by an incineration plant. The estimated contributions of sea salt and steel mill by the FA-MR model contained those of other sources that have the same temporal variation of contributing concentrations. This artifact was caused by multicollinearity. Although this result shows the limitation of the multivariate receptor model, it gives useful information concerning source types and their distribution when compared with the results of the CMB model. In the Tokyo metropolitan area, the contributions from soil (including road dust), automobile, secondary particles and refuse incineration (biomass burning) were larger than the industrial contributions: fuel oil combustion and steel mill. However, since vanadium is highly correlated with SO4(2-) and other secondary-particle-related elements, a major portion of secondary particles is considered to be related to fuel oil combustion.
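
The FA-MR idea — extract factor scores from element concentrations, then regress total particulate mass on the scores to apportion sources — can be sketched as follows. The two latent sources, loadings, and element concentrations are all synthetic:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 300
# Two hypothetical latent sources (say, soil dust and sea salt)
# driving six measured element concentrations
sources = rng.lognormal(size=(n, 2))
loadings = np.array([[1.0, 0.0], [0.8, 0.1], [0.9, 0.0],
                     [0.0, 1.0], [0.1, 0.9], [0.0, 0.7]])
elements = sources @ loadings.T + 0.05 * rng.normal(size=(n, 6))
# Total suspended particulate mass, mostly explained by the two sources
total_mass = sources @ np.array([5.0, 3.0]) + 0.1 * rng.normal(size=n)

# Step 1: factor analysis of the element concentrations
scores = FactorAnalysis(n_components=2, random_state=0).fit_transform(elements)
# Step 2: multiple regression of total mass on the factor scores
fa_mr = LinearRegression().fit(scores, total_mass)
r2 = fa_mr.score(scores, total_mass)
```

When two sources share the same temporal variation, their scores become collinear and the regression splits their contributions arbitrarily, which is the multicollinearity artifact the paper describes.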

  8. Machine learning of swimming data via wisdom of crowd and regression analysis.

    Science.gov (United States)

    Xie, Jiang; Xu, Junfu; Nie, Celine; Nie, Qing

    2017-04-01

    Every performance by a registered USA swimmer in an officially sanctioned meet is recorded in an online database, with times dating back to 1980. For the first time, statistical analysis and machine learning methods are systematically applied to 4,022,631 swim records. In this study, we investigate performance features for all strokes as a function of age and gender. The variances in the performance of males and females for different ages and strokes were studied, and the correlations of performances across ages were estimated using the Pearson correlation. Regression analyses show the performance trends for both males and females at different ages and suggest critical ages for peak training. Moreover, we assess twelve popular machine learning methods for predicting or classifying swimmer performance. Each method exhibited different strengths or weaknesses in different cases, indicating that no one method could predict well for all strokes. To address this problem, we propose a new method that combines multiple inference methods to derive a Wisdom of Crowd Classifier (WoCC). Our simulation experiments demonstrate that the WoCC is a consistent method with better overall prediction accuracy. Our study reveals several new age-dependent trends in swimming and provides an accurate method for classifying and predicting swimming times.
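
Combining several classifiers by majority vote — the general idea behind a wisdom-of-crowd classifier — can be sketched with scikit-learn's `VotingClassifier`. This is a generic stand-in on synthetic data, not the paper's WoCC construction:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for swim-record features and a binary outcome
X, y = make_classification(n_samples=600, n_features=8, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Hard voting: each base learner casts one vote per example
crowd = VotingClassifier([
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier()),
    ("tree", DecisionTreeClassifier(random_state=0)),
], voting="hard").fit(Xtr, ytr)
acc = crowd.score(Xte, yte)
```

The appeal, as in the paper, is robustness: no single base method needs to be best everywhere for the combined vote to be consistently good.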

  9. Weighing risk factors associated with bee colony collapse disorder by classification and regression tree analysis.

    Science.gov (United States)

    VanEngelsdorp, Dennis; Speybroeck, Niko; Evans, Jay D; Nguyen, Bach Kim; Mullin, Chris; Frazier, Maryann; Frazier, Jim; Cox-Foster, Diana; Chen, Yanping; Tarpy, David R; Haubruge, Eric; Pettis, Jeffrey S; Saegerman, Claude

    2010-10-01

    Colony collapse disorder (CCD), a syndrome whose defining trait is the rapid loss of adult worker honey bees, Apis mellifera L., is thought to be responsible for a minority of the large overwintering losses experienced by U.S. beekeepers since the winter of 2006-2007. Using the same data set developed to perform a monofactorial analysis (PLoS ONE 4: e6481, 2009), we conducted a classification and regression tree (CART) analysis in an attempt to better understand the relative importance of, and interrelations among, different risk variables in explaining CCD. Fifty-five explanatory variables were used to construct two CART models: one with and one without a cost for misclassifying a CCD-diagnosed colony as a non-CCD colony. The resulting model tree that permitted misclassification had a sensitivity and specificity of 85 and 74%, respectively. Although factors measuring colony stress (e.g., adult bee physiological measures, such as fluctuating asymmetry or mass of head) were important discriminating values, six of the 19 variables having the greatest discriminatory value were pesticide levels in different hive matrices. Notably, coumaphos levels in brood (a miticide commonly used by beekeepers) had the highest discriminatory value and were highest in control (healthy) colonies. Our CART analysis provides evidence that CCD is probably the result of several factors acting in concert, making afflicted colonies more susceptible to disease. This analysis highlights several areas that warrant further attention, including the effect of sublethal pesticide exposure on pathogen prevalence and the role of variability in bee tolerance to pesticides on colony survivorship.

  10. Driver injury severity outcome analysis in rural interstate highway crashes: a two-level Bayesian logistic regression interpretation.

    Science.gov (United States)

    Chen, Cong; Zhang, Guohui; Liu, Xiaoyue Cathy; Ci, Yusheng; Huang, Helai; Ma, Jianming; Chen, Yanyan; Guan, Hongzhi

    2016-12-01

    There is a high potential for severe injury outcomes in traffic crashes on rural interstate highways due to the significant amount of high-speed traffic on these corridors. Hierarchical Bayesian models are capable of incorporating between-crash variance and within-crash correlations into traffic crash data analysis and are increasingly utilized in traffic crash severity analysis. This paper applies a hierarchical Bayesian logistic model to examine the significant factors at the crash and vehicle/driver levels and their heterogeneous impacts on driver injury severity in rural interstate highway crashes. Analysis results indicate that the majority of the total variance is induced by the between-crash variance, showing the appropriateness of the utilized hierarchical modeling approach. Three crash-level variables and six vehicle/driver-level variables are found significant in predicting driver injury severities: road curve, maximum vehicle damage in a crash, number of vehicles in a crash, wet road surface, vehicle type, driver age, driver gender, driver seatbelt use and driver alcohol or drug involvement. Among these variables, road curve, functional and disabled vehicle damage in a crash, single-vehicle crashes, female drivers, senior drivers, motorcycles and driver alcohol or drug involvement tend to increase the odds of drivers being incapacitated or killed in rural interstate crashes, while wet road surface, male drivers and driver seatbelt use are more likely to decrease the probability of severe driver injuries. The developed methodology and estimation results provide insightful understanding of the internal mechanisms of rural interstate crashes and useful references for developing effective countermeasures for rural interstate crash prevention.

  11. The Jackknife Interval Estimation of Parameters in Partial Least Squares Regression Model for Poverty Data Analysis

    Directory of Open Access Journals (Sweden)

    Pudji Ismartini

    2010-08-01

    Full Text Available One of the major problems facing data modelling in social research is multicollinearity. Multicollinearity can have a significant impact on the quality and stability of the fitted regression model. The common classical regression technique using the least squares estimate is highly sensitive to the multicollinearity problem. In such problem areas, Partial Least Squares Regression (PLSR) is a useful and flexible tool for statistical model building; however, PLSR yields only point estimates. This paper constructs interval estimates for the PLSR regression parameters by applying the jackknife technique to poverty data. A SAS macro programme is developed to obtain the jackknife interval estimator for PLSR.

  12. THE ANALYSIS OF THIN WALLED COMPOSITE LAMINATED HELICOPTER ROTOR WITH HIERARCHICAL WARPING FUNCTIONS AND FINITE ELEMENT METHOD

    Institute of Scientific and Technical Information of China (English)

    诸德超; 邓忠民; 王荇卫

    2001-01-01

    In the present paper, a series of hierarchical warping functions is developed to analyze the static and dynamic problems of thin walled composite laminated helicopter rotors composed of several layers with a single closed cell. This method develops and extends the traditional constrained warping theory of thin walled metallic beams, which has proved very successful since the 1940s. The warping distribution along the perimeter of each layer is expanded into a series of successively corrective warping functions, with the traditional warping function caused by free torsion or free bending as the first term, and is assumed to be piecewise linear along the thickness direction of the layers. The governing equations are derived based upon the variational principle of minimum potential energy for static analysis and the Rayleigh quotient for free vibration analysis. Then the hierarchical finite element method is introduced to form a numerical algorithm. Both static and natural vibration problems of sample box beams are analyzed with the present method to show the main mechanical behavior of the thin walled composite laminated helicopter rotor.

  13. A Resting-State Brain Functional Network Study in MDD Based on Minimum Spanning Tree Analysis and the Hierarchical Clustering

    Directory of Open Access Journals (Sweden)

    Xiaowei Li

    2017-01-01

    Full Text Available A large number of studies have demonstrated that major depressive disorder (MDD) is characterized by alterations in brain functional connections, which are identifiable even during the brain’s “resting state.” However, conventional approaches to constructing functional connectivity are often biased by the choice of threshold, and more attention has been paid to the number and length of links in brain networks while the clustering partitioning of nodes has remained unclear. Therefore, minimum spanning tree (MST) analysis and hierarchical clustering were applied to depression for the first time in this study. Resting-state electroencephalogram (EEG) sources were assessed from 15 healthy and 23 major depressive subjects. Then the coherence, the MST, and the hierarchical clustering were obtained. In the theta band, coherence analysis showed that the EEG coherence of the MDD patients was significantly higher than that of the healthy controls, especially in the left temporal region. The MST results indicated a higher leaf fraction in the depressed group. Compared with the normal group, the major depressive patients lost clustering in frontal regions. Our findings suggest that there is stronger brain interaction in the MDD group and a left-right functional imbalance in the frontal regions of MDD patients.
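
The MST leaf fraction reported here can be computed from a coherence matrix as follows. Converting coherence to a distance by inversion (strong links become short edges) is one common convention and an assumption of this sketch:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def leaf_fraction(coherence):
    """Build the MST of a channel-by-channel coherence matrix and
    return the fraction of leaf nodes (degree 1) in the tree."""
    dist = 1.0 / (coherence + 1e-12)   # high coherence -> low edge weight
    np.fill_diagonal(dist, 0.0)        # zero entries are treated as no edge
    mst = minimum_spanning_tree(dist)
    adj = (mst + mst.T).toarray() > 0  # symmetrize the tree's adjacency
    degrees = adj.sum(axis=0)
    return (degrees == 1).mean()
```

A star-like network (one hub connected to everything) maximizes the leaf fraction, which is why a higher leaf fraction indicates a more integrated, hub-dominated topology in the depressed group.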

  14. Hierarchical Model for the Analysis of Scattering Data of Complex Materials

    Science.gov (United States)

    Oyedele, Akinola; Mcnutt, Nicholas W.; Rios, Orlando; Keffer, David J.

    2016-06-01

    Interpreting scattering data for complex materials with a hierarchical structure in which at least one phase is amorphous presents a significant challenge. Often the interpretation relies on large-scale molecular dynamics (MD) simulations, in which a structure is hypothesized and from which a radial distribution function (RDF) can be extracted and directly compared against an experimental RDF. This computationally intensive approach presents a bottleneck in the efficient characterization of the atomic structure of new materials. Here, we propose and demonstrate a hierarchical decomposition of the RDF in which MD simulations are replaced by a combination of tractable models and theory at the atomic scale and the mesoscale, which together yield the RDF. We apply the procedure to a carbon composite in which graphitic nanocrystallites are distributed in an amorphous domain. We compare the model with the RDF from both MD simulation and neutron scattering data. This procedure is applicable for understanding the fundamental processing-structure-property relationships in complex magnetic materials.

  15. Hierarchical Bayesian Analysis of Biased Beliefs and Distributional Other-Regarding Preferences

    Directory of Open Access Journals (Sweden)

    Jeroen Weesie

    2013-02-01

    Full Text Available This study investigates the relationship between an actor’s beliefs about others’ other-regarding (social) preferences and her own other-regarding preferences, using an “avant-garde” hierarchical Bayesian method. We estimate two distributional other-regarding preference parameters, α and β, of actors using incentivized choice data in binary Dictator Games. Simultaneously, we estimate the distribution of actors’ beliefs about others’ α and β, conditional on actors’ own α and β, with incentivized belief elicitation. We demonstrate the benefits of the Bayesian method compared to its hierarchical frequentist counterparts. Results show a positive association between an actor’s own (α, β) and her beliefs about the average (α, β) in the population. The association between own preferences and the variance in beliefs about others’ preferences in the population, however, is curvilinear for α and insignificant for β. These results are partially consistent with the cone effect [1,2], which is described in detail below. Because in the Bayesian-Nash equilibrium concept beliefs and own preferences are assumed to be independent, these results cast doubt on the application of the Bayesian-Nash equilibrium concept to experimental data.

  16. Regression for economics

    CERN Document Server

    Naghshpour, Shahdad

    2012-01-01

    Regression analysis is the most commonly used statistical method in the world. Although few would characterize this technique as simple, regression is in fact both simple and elegant. The complexity that many attribute to regression analysis is often a reflection of their lack of familiarity with the language of mathematics. But regression analysis can be understood even without a mastery of sophisticated mathematical concepts. This book provides the foundation and will help demystify regression analysis using examples from economics, with real data showing the applications of the method.

  17. Comparative analysis of regression and artificial neural network models for wind speed prediction

    Science.gov (United States)

    Bilgili, Mehmet; Sahin, Besir

    2010-11-01

    In this study, wind speed was modeled by linear regression (LR), nonlinear regression (NLR) and artificial neural network (ANN) methods. A three-layer feedforward artificial neural network structure was constructed, and a backpropagation algorithm was used for training the ANNs. To obtain a successful simulation, the correlation coefficients between all of the meteorological variables (wind speed, ambient temperature, atmospheric pressure, relative humidity and rainfall) were first calculated, taking two variables at a time for each calculation. All independent variables were added to the simple regression model, and the method of stepwise multiple regression was then applied to select the “best” regression equation (model). Thus, the best independent variables were selected for the LR and NLR models and were also used in the input layer of the ANN. The results obtained by all methods were compared to each other. Finally, the ANN method was found to provide better performance than the LR and NLR methods.

  18. Using the classical linear regression model in analysis of the dependences of conveyor belt life

    Directory of Open Access Journals (Sweden)

    Miriam Andrejiová

    2013-12-01

    Full Text Available The paper deals with a classical linear regression model of the dependence of conveyor belt life on selected parameters: thickness of the paint layer, width and length of the belt, conveyor speed and quantity of transported material. The first part of the article covers regression model design, point and interval estimation of parameters, verification of the statistical significance of the model, and the parameters of the proposed regression model. The second part deals with the identification of influential and extreme values that can have an impact on the estimation of regression model parameters. The third part focuses on the assumptions of the classical regression model, i.e. on verification of the assumptions of independence, normality and homoscedasticity of the residuals.
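
The first and third parts — point/interval estimation and residual-based assumption checks — can be sketched with plain least squares. The conveyor-belt predictors below are simulated, not the paper's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 120
# Hypothetical predictors: paint thickness, width, length, speed, load
X = np.column_stack([np.ones(n)] + [rng.uniform(0, 1, n) for _ in range(5)])
beta = np.array([10.0, 3.0, -1.0, 2.0, -0.5, 1.5])   # "true" coefficients
y = X @ beta + rng.normal(0, 0.3, n)                 # belt life with noise

# Point estimates via least squares
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b

# t-based 95% confidence intervals for each coefficient
dof = n - X.shape[1]
s2 = resid @ resid / dof
cov = s2 * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(cov))
tcrit = stats.t.ppf(0.975, dof)
ci = np.column_stack([b - tcrit * se, b + tcrit * se])

# One classical-model assumption check: normality of residuals
shapiro_p = stats.shapiro(resid).pvalue
```

Independence and homoscedasticity would be checked analogously (e.g. by inspecting residuals against fitted values), which is the third part of the paper's programme.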

  19. Automatic denoising of functional MRI data: combining independent component analysis and hierarchical fusion of classifiers.

    Science.gov (United States)

    Salimi-Khorshidi, Gholamreza; Douaud, Gwenaëlle; Beckmann, Christian F; Glasser, Matthew F; Griffanti, Ludovica; Smith, Stephen M

    2014-04-15

    Many sources of fluctuation contribute to the fMRI signal, and this makes identifying the effects that are truly related to the underlying neuronal activity difficult. Independent component analysis (ICA) - one of the most widely used techniques for the exploratory analysis of fMRI data - has been shown to be a powerful technique for identifying various sources of neuronally-related and artefactual fluctuation in fMRI data (both with the application of external stimuli and with the subject "at rest"). ICA decomposes fMRI data into patterns of activity (a set of spatial maps and their corresponding time series) that are statistically independent and add linearly to explain voxel-wise time series. Given the set of ICA components, if the components representing "signal" (brain activity) can be distinguished from the "noise" components (effects of motion, non-neuronal physiology, scanner artefacts and other nuisance sources), the latter can then be removed from the data, providing an effective cleanup of structured noise. Manual classification of components is labour intensive and requires expertise; hence, a fully automatic noise detection algorithm that can reliably detect various types of noise sources (in both task and resting fMRI) is desirable. In this paper, we introduce FIX ("FMRIB's ICA-based X-noiseifier"), which provides an automatic solution for denoising fMRI data via accurate classification of ICA components. For each ICA component FIX generates a large number of distinct spatial and temporal features, each describing a different aspect of the data (e.g., what proportion of temporal fluctuations are at high frequencies). The set of features is then fed into a multi-level classifier (built around several different classifiers). Once trained through the hand-classification of a sufficient number of training datasets, the classifier can then automatically classify new datasets. The noise components can then be subtracted from (or regressed out of) the original data.
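
The overall pipeline — ICA decomposition, feature-based labelling of components as noise, then regressing the noise components out — can be sketched on toy data. The single high-frequency-power feature used here to flag noise is a drastic simplification of FIX's large feature set and multi-level classifier:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(6)
t = np.arange(400) / 100.0                         # 4 s at 100 Hz (toy scale)
signal = np.sin(2 * np.pi * 0.5 * t)               # slow "neural" fluctuation
noise = np.sign(np.sin(2 * np.pi * 30 * t))        # high-frequency artefact
mixing = rng.normal(size=(2, 10))
data = (np.outer(signal, mixing[0]) + np.outer(noise, mixing[1])
        + 0.01 * rng.normal(size=(400, 10)))       # 10 "voxel" time series

# Step 1: ICA decomposition into component time courses
ica = FastICA(n_components=2, random_state=0)
sources = ica.fit_transform(data)

# Step 2: one crude temporal feature per component
def high_freq_fraction(x, fs=100.0):
    """Fraction of spectral power above 5 Hz."""
    p = np.abs(np.fft.rfft(x)) ** 2
    f = np.fft.rfftfreq(len(x), 1 / fs)
    return p[f > 5].sum() / p.sum()

is_noise = np.array([high_freq_fraction(s) > 0.5 for s in sources.T])

# Step 3: regress the flagged noise components out of the data
N = sources[:, is_noise]
cleaned = data - N @ np.linalg.lstsq(N, data, rcond=None)[0]
```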

  20. Effect of acute hypoxia on cognition: A systematic review and meta-regression analysis.

    Science.gov (United States)

    McMorris, Terry; Hale, Beverley J; Barwood, Martin; Costello, Joseph; Corbett, Jo

    2017-03-01

    A systematic meta-regression analysis of the effects of acute hypoxia on the performance of central executive and non-executive tasks, and the effects of the moderating variables, arterial partial pressure of oxygen (PaO2) and hypobaric versus normobaric hypoxia, was undertaken. Studies were included if they were performed on healthy humans; a within-subject design was used; data were reported giving the PaO2 or that allowed the PaO2 to be estimated (e.g. arterial oxygen saturation and/or altitude); and the duration of being in a hypoxic state prior to cognitive testing was ≤ 6 days. Twenty-two experiments met the criteria for inclusion and demonstrated a moderate, negative mean effect size (g = -0.49, 95% CI -0.64 to -0.34, p < 0.001). There were no significant differences between central executive and non-executive (perception/attention and short-term memory) tasks. Low (35-60 mmHg) PaO2 was the key predictor of cognitive performance (R² = 0.45, p < 0.001), and this was independent of whether the exposure was under hypobaric hypoxic or normobaric hypoxic conditions.
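The pooled effect reported above can be illustrated in miniature. The sketch below uses invented per-study effect sizes and variances (not the review's data) and a DerSimonian-Laird random-effects estimator — an assumption, since the abstract does not name the estimator used:

```python
import numpy as np

def random_effects_mean(g, v):
    """DerSimonian-Laird random-effects pooled effect.
    g: per-study effect sizes (Hedges' g); v: their sampling variances."""
    w = 1.0 / v                               # fixed-effect (inverse-variance) weights
    g_fe = np.sum(w * g) / np.sum(w)          # fixed-effect pooled mean
    q = np.sum(w * (g - g_fe) ** 2)           # Cochran's Q heterogeneity statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(g) - 1)) / c)   # between-study variance estimate
    w_re = 1.0 / (v + tau2)                   # random-effects weights
    mean = np.sum(w_re * g) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return mean, (mean - 1.96 * se, mean + 1.96 * se)

# Illustrative study-level data only.
g = np.array([-0.6, -0.4, -0.5, -0.45])
v = np.array([0.02, 0.03, 0.025, 0.02])
mean, ci = random_effects_mean(g, v)
```

A full meta-regression would additionally regress `g` on moderators such as PaO2, weighting by `w_re`.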

  1. A simplified calculation procedure for mass isotopomer distribution analysis (MIDA) based on multiple linear regression.

    Science.gov (United States)

    Fernández-Fernández, Mario; Rodríguez-González, Pablo; García Alonso, J Ignacio

    2016-10-01

    We have developed a novel, rapid and easy calculation procedure for Mass Isotopomer Distribution Analysis based on multiple linear regression which allows the simultaneous calculation of the precursor pool enrichment and the fraction of newly synthesized labelled proteins (fractional synthesis) using linear algebra. To test this approach, we used the peptide RGGGLK as a model tryptic peptide containing three subunits of glycine. We selected glycine labelled in two ¹³C atoms (¹³C₂-glycine) as the labelled amino acid to demonstrate that spectral overlap is not a problem in the proposed methodology. The developed methodology was tested first in vitro by changing the precursor pool enrichment from 10 to 40% of ¹³C₂-glycine. Secondly, a simulated in vivo synthesis of proteins was designed by combining the natural abundance RGGGLK peptide and 10 or 20% ¹³C₂-glycine at 1:1, 1:3 and 3:1 ratios. Precursor pool enrichments and fractional synthesis values were calculated with satisfactory precision and accuracy using a simple spreadsheet. This novel approach can provide a relatively rapid and easy means to measure protein turnover based on stable isotope tracers. Copyright © 2016 John Wiley & Sons, Ltd.
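The core of the procedure can be sketched as a single least-squares solve. The isotope patterns below are invented for illustration (they are not the paper's measured spectra); the idea is that the observed pattern is a linear mixture of theoretical isotopomer spectra, so multiple linear regression recovers the mixing fractions:

```python
import numpy as np

# Hypothetical relative-abundance patterns for a model peptide: columns are
# the theoretical spectra of the unlabelled peptide and of the peptide
# carrying one, two or three labelled glycine units (illustrative values).
A = np.array([
    [0.80, 0.05, 0.00, 0.00],
    [0.15, 0.10, 0.05, 0.00],
    [0.05, 0.80, 0.10, 0.05],
    [0.00, 0.05, 0.80, 0.15],
    [0.00, 0.00, 0.05, 0.80],
])

# Simulate an observed spectrum: 70% pre-existing (unlabelled) protein plus
# newly synthesized protein spread over the labelled isotopomers.
true_frac = np.array([0.70, 0.05, 0.10, 0.15])
y = A @ true_frac

# Multiple linear regression: one least-squares solve recovers all fractions.
frac, *_ = np.linalg.lstsq(A, y, rcond=None)
fractional_synthesis = 1.0 - frac[0]   # fraction of newly synthesized protein
```

With noise-free simulated data the fractions are recovered exactly; real spectra would add measurement error to `y`.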

  2. A New Global Regression Analysis Method for the Prediction of Wind Tunnel Model Weight Corrections

    Science.gov (United States)

    Ulbrich, Norbert Manfred; Bridge, Thomas M.; Amaya, Max A.

    2014-01-01

    A new global regression analysis method is discussed that predicts wind tunnel model weight corrections for strain-gage balance loads during a wind tunnel test. The method determines corrections by combining "wind-on" model attitude measurements with least squares estimates of the model weight and center of gravity coordinates that are obtained from "wind-off" data points. The method treats the least squares fit of the model weight separate from the fit of the center of gravity coordinates. Therefore, it performs two fits of "wind-off" data points and uses the least squares estimator of the model weight as an input for the fit of the center of gravity coordinates. Explicit equations for the least squares estimators of the weight and center of gravity coordinates are derived that simplify the implementation of the method in the data system software of a wind tunnel. In addition, recommendations for sets of "wind-off" data points are made that take typical model support system constraints into account. Explicit equations of the confidence intervals on the model weight and center of gravity coordinates and two different error analyses of the model weight prediction are also discussed in the appendices of the paper.
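A minimal version of the two-fit idea can be sketched with simulated "wind-off" points. The angles, weight, center-of-gravity values, and the simple weight/attitude force model below are all invented for the sketch (the paper derives its own explicit estimators): the weight is estimated first from the force equations, then held fixed while the moment equation is solved for the center-of-gravity coordinates.

```python
import numpy as np

# Hypothetical wind-off points: pitch angles and the resulting balance
# normal force, axial force, and pitching moment (simulated, noise-free).
theta = np.radians(np.array([-10.0, -5.0, 0.0, 5.0, 10.0]))
W_true, x_true, z_true = 50.0, 0.20, 0.05
N = W_true * np.cos(theta)                 # normal-force component of weight
A = -W_true * np.sin(theta)                # axial-force component of weight
M = W_true * (x_true * np.cos(theta) + z_true * np.sin(theta))

# Fit 1: stack both force equations and solve for the single unknown W.
X1 = np.concatenate([np.cos(theta), -np.sin(theta)])[:, None]
y1 = np.concatenate([N, A])
W_hat = np.linalg.lstsq(X1, y1, rcond=None)[0][0]

# Fit 2: with W_hat fixed, solve the moment equation for the CG coordinates.
X2 = np.column_stack([W_hat * np.cos(theta), W_hat * np.sin(theta)])
cg_hat = np.linalg.lstsq(X2, M, rcond=None)[0]   # [x_cg, z_cg]
```

Treating the weight fit separately, as above, keeps the second fit linear in the CG coordinates.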

  3. Linear and nonlinear regression analysis for heavy metals removal using Agaricus bisporus macrofungus

    Directory of Open Access Journals (Sweden)

    Boldizsar Nagy

    2017-05-01

    Full Text Available In the present study the biosorption characteristics of Cd(II) and Zn(II) ions from monocomponent aqueous solutions by Agaricus bisporus macrofungus were investigated. The initial metal ion concentration, contact time, initial pH and temperature were the parameters that influenced the biosorption. Maximum removal efficiencies of up to 76.10% and 70.09% (at 318 K) for Cd(II) and Zn(II), respectively, and adsorption capacities of up to 3.49 and 2.39 mg/g for Cd(II) and Zn(II), respectively, at the highest concentration, were calculated. The experimental data were analyzed using pseudo-first- and pseudo-second-order kinetic models and various isotherm models in linear and nonlinear (CMA-ES optimization algorithm) regression, and thermodynamic parameters were calculated. The results showed that the biosorption of both studied metal ions followed pseudo-second-order kinetics, while the equilibrium is best described by the Sips isotherm. The changes in morphological structure after heavy metal-biomass interactions were evaluated by SEM analysis. Our results confirmed that the macrofungus A. bisporus could be used as a cost-effective, efficient biosorbent for the removal of Cd(II) and Zn(II) from synthetic aqueous solutions.
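The pseudo-second-order fit mentioned above can be illustrated with its standard linearization. The rate constant and the time grid below are invented (the capacity is chosen near the reported 3.49 mg/g), and the data are simulated without noise:

```python
import numpy as np

# Pseudo-second-order kinetics: qt = qe**2 * k * t / (1 + qe * k * t),
# with linearized form t/qt = 1/(k*qe**2) + t/qe — a straight line in t.
qe_true, k_true = 3.49, 0.05          # illustrative values (mg/g, g/mg/min)
t = np.array([5.0, 10, 20, 40, 60, 90, 120])
qt = qe_true**2 * k_true * t / (1 + qe_true * k_true * t)

# Linear regression of t/qt on t: slope = 1/qe, intercept = 1/(k*qe**2).
X = np.column_stack([t, np.ones_like(t)])
slope, intercept = np.linalg.lstsq(X, t / qt, rcond=None)[0]
qe_fit = 1.0 / slope
k_fit = 1.0 / (intercept * qe_fit**2)
```

A nonlinear fit (as done in the study with CMA-ES) would minimize the residuals of `qt` directly instead of transforming the data, which avoids the error distortion the linearization introduces with noisy measurements.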

  4. A systematic review and meta-regression analysis of mivacurium for tracheal intubation.

    Science.gov (United States)

    Vanlinthout, L E H; Mesfin, S H; Hens, N; Vanacker, B F; Robertson, E N; Booij, L H D J

    2014-12-01

    We systematically reviewed factors associated with intubation conditions in randomised controlled trials of mivacurium, using random-effects meta-regression analysis. We included 29 studies of 1050 healthy participants. Four factors explained 72.9% of the variation in the probability of excellent intubation conditions: mivacurium dose, 24.4%; opioid use, 29.9%; time to intubation and age together, 18.6%. The odds ratio (95% CI) for excellent intubation was 3.14 (1.65-5.73) for doubling the mivacurium dose, 5.99 (2.14-15.18) for adding opioids to the intubation sequence, and, for increasing the delay between mivacurium injection and airway insertion from 1 to 2 min, 6.55 (6.01-7.74) in subjects aged 25 years and 2.17 (2.01-2.69) in subjects aged 70 years, p < 0.001 for all. We conclude that good conditions for tracheal intubation are more likely when laryngoscopy is delayed after injection of a higher dose of mivacurium combined with an opioid, particularly in older people.

  5. Determination of useful ranges of mixing conditions for glycerin Fatty Acid ester by multiple regression analysis.

    Science.gov (United States)

    Uchimoto, Takeaki; Iwao, Yasunori; Hattori, Hiroaki; Noguchi, Shuji; Itai, Shigeru

    2013-01-01

    The interacting effects of the triglycerin full behenate (TR-FB) concentration and the mixing time on lubrication and tablet properties were analyzed under a two-factor central composite design and compared with those of magnesium stearate (Mg-St). Various amounts of lubricant (0.07-3.0%) were added to granules and mixed for 1-30 min. A multiple linear regression analysis was performed to identify the effect of the mixing conditions on each physicochemical property. The mixing conditions did not significantly affect the lubrication properties of TR-FB. For tablet properties, tensile strength decreased and disintegration time increased when the lubricant concentration and the mixing time were increased for Mg-St. The interaction of the Mg-St concentration and the mixing time had a significant negative effect on the disintegration time. In contrast, no mixing condition of TR-FB affected the tablet properties. In addition, the range of mixing conditions that satisfied the lubrication and tablet property criteria was broader for TR-FB than for Mg-St, suggesting that TR-FB allows tablets with high quality attributes to be produced consistently. Therefore, TR-FB is a potential alternative lubricant to Mg-St.

  6. Performance Prediction Modelling for Flexible Pavement on Low Volume Roads Using Multiple Linear Regression Analysis

    Directory of Open Access Journals (Sweden)

    C. Makendran

    2015-01-01

    Full Text Available Prediction models for low volume village roads in India are developed to evaluate the progression of different types of distress such as roughness, cracking, and potholes. Even though the Government of India invests large sums in road construction every year, poor control over the quality of road construction and its subsequent maintenance leads to faster road deterioration. In this regard, it is essential that scientific maintenance procedures be developed on the basis of the performance of low volume flexible pavements. Accordingly, this research develops prediction models to understand the progression of roughness, cracking, and potholes in flexible pavements that receive little or no routine maintenance. Distress data were collected from low volume rural roads covering about 173 stretches spread across the state of Tamil Nadu in India. Based on the collected data, distress prediction models have been developed using multiple linear regression analysis, and the models have been validated using independent field data. It can be concluded that the models developed in this study can serve as useful tools for practicing engineers maintaining flexible pavements on low volume roads.
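A distress model of this kind reduces to an ordinary least-squares fit. The predictors, coefficients, and data below are entirely hypothetical (the study's actual variables and estimates are not reproduced); the sketch shows the fit-then-predict workflow:

```python
import numpy as np

# Hypothetical data: predictors are pavement age (years) and cumulative
# traffic (thousand standard axles); response is roughness (mm/km).
age     = np.array([1.0, 2, 3, 4, 5, 6, 7, 8])
traffic = np.array([20.0, 45, 60, 95, 120, 150, 170, 210])
rough   = 2000 + 150 * age + 4.0 * traffic   # simulated, noise-free

# Multiple linear regression via ordinary least squares.
X = np.column_stack([np.ones_like(age), age, traffic])
beta, *_ = np.linalg.lstsq(X, rough, rcond=None)

def predict_roughness(a, tr):
    """Predict roughness for a new (age, traffic) condition."""
    return beta[0] + beta[1] * a + beta[2] * tr
```

Validation against independent field data, as in the study, would compare `predict_roughness` output with held-out measurements.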

  7. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression.

    Science.gov (United States)

    Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa

    2015-11-03

    Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for predicting group membership. Although they are very similar, LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable for predicting the accuracy of the outcome. The present study compared the LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated for the LR and LDA models. CE revealed no superiority of one model over the other, but the results showed that LR performed better than LDA on the B and Q indices in all situations. Sample size had no significant effect on CE for the selection of an optimal model. Assessment of the accuracy of prediction on real data indicated that the B and Q indices are appropriate for selecting an optimal model. The results of this study showed that, based on CE, LR performs better in some cases and LDA in others. The CE index is not appropriate for classification, whereas the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.
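LDA's extra assumptions (Gaussian classes with a shared covariance) make its fit a closed-form computation. Below is a minimal two-class LDA on synthetic data, with the classification error evaluated on the training set; the data are invented and the study's B and Q indices are not implemented:

```python
import numpy as np

def lda_fit_predict(X, y, X_new):
    """Minimal two-class linear discriminant analysis (pooled covariance)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance (shared-covariance assumption).
    S = (np.cov(X0, rowvar=False) * (len(X0) - 1)
         + np.cov(X1, rowvar=False) * (len(X1) - 1)) / (len(X) - 2)
    w = np.linalg.solve(S, m1 - m0)          # discriminant direction
    c = w @ (m0 + m1) / 2                    # midpoint threshold
    prior = np.log(len(X1) / len(X0))        # class-prior adjustment
    return (X_new @ w - c + prior > 0).astype(int)

rng = np.random.default_rng(0)
X0 = rng.normal([0.0, 0.0], 1.0, size=(100, 2))
X1 = rng.normal([3.0, 3.0], 1.0, size=(100, 2))
X = np.vstack([X0, X1]); y = np.repeat([0, 1], 100)
pred = lda_fit_predict(X, y, X)
classification_error = np.mean(pred != y)
```

Logistic regression makes no distributional assumption about the predictors and is instead fit iteratively by maximum likelihood, which is where the two models diverge in practice.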

  8. Applying different independent component analysis algorithms and support vector regression for IT chain store sales forecasting.

    Science.gov (United States)

    Dai, Wensheng; Wu, Jui-Yu; Lu, Chi-Jie

    2014-01-01

    Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating a feature extraction method with a prediction tool such as support vector regression (SVR) is a useful way to construct an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique that has been widely applied to various forecasting problems. But, up to now, only the basic ICA method (i.e., the temporal ICA model) has been applied to the sales forecasting problem. In this paper, we utilize three different ICA methods, namely spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA), to extract features from the sales data and compare their performance in sales forecasting for an IT chain store. Experimental results on real sales data show that the sales forecasting scheme integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data, and the extracted features can improve the prediction performance of SVR for sales forecasting.

  9. Factors Influencing Water System Functionality in Nigeria and Tanzania: A Regression and Bayesian Network Analysis.

    Science.gov (United States)

    Cronk, Ryan; Bartram, Jamie

    2017-09-21

    Sufficient, safe, and continuously available water services are important for human development and health yet many water systems in low- and middle-income countries are nonfunctional. Monitoring data were analyzed using regression and Bayesian networks (BNs) to explore factors influencing the functionality of 82 503 water systems in Nigeria and Tanzania. Functionality varied by system type. In Tanzania, Nira handpumps were more functional than Afridev and India Mark II handpumps. Higher functionality was associated with fee collection in Nigeria. In Tanzania, functionality was higher if fees were collected monthly rather than in response to system breakdown. Systems in Nigeria were more likely to be functional if they were used for both human and livestock consumption. In Tanzania, systems managed by private operators were more functional than community-managed systems. The BNs found strong dependencies between functionality and system type and administrative unit (e.g., district). The BNs predicted functionality increased from 68% to 89% in Nigeria and from 53% to 68% in Tanzania when best observed conditions were in place. Improvements to water system monitoring and analysis of monitoring data with different modeling techniques may be useful for identifying water service improvement opportunities and informing evidence-based decision-making for better management, policy, programming, and practice.

  10. Comparison of Bayesian and Classical Analysis of Weibull Regression Model: A Simulation Study

    Directory of Open Access Journals (Sweden)

    İmran KURT ÖMÜRLÜ

    2011-01-01

    Full Text Available Objective: The purpose of this study was to compare the performance of the classical Weibull Regression Model (WRM) and Bayesian-WRM under varying conditions using Monte Carlo simulations. Material and Methods: Data were generated and analysed with both classical WRM and Bayesian-WRM under varying informative priors and sample sizes using our simulation algorithm. In the simulation studies, the sample sizes were n = 50, 100 and 250, and informative priors for b1 were specified using a normal prior distribution. For each situation, 1000 simulations were performed. Results: Bayesian-WRM with a proper informative prior showed good performance with very little bias. The bias of Bayesian-WRM increased for all sample sizes as the priors departed from the true parameter values. Furthermore, Bayesian-WRM obtained predictions with smaller standard errors than the classical WRM in both small and large samples when proper priors were used. Conclusion: In this simulation study, Bayesian-WRM performed better than the classical method when subjective analysis was performed using expert opinion and historical knowledge about the parameters. Consequently, Bayesian-WRM should be preferred when reliable informative priors exist; otherwise, classical WRM should be preferred.
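The classical (frequentist) side of this comparison — Weibull regression fit by maximum likelihood — can be sketched as follows. The data-generating values, the single covariate, and the accelerated-failure-time parameterisation are assumptions for illustration; a Bayesian fit would add priors on the parameters, which is not shown:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 400
x = rng.uniform(0, 1, n)
shape_true, b0_true, b1_true = 1.5, 0.5, 1.0
scale = np.exp(b0_true + b1_true * x)          # covariate-dependent Weibull scale
t = scale * rng.weibull(shape_true, n)         # simulated survival times

def negloglik(params):
    """Negative log-likelihood of the Weibull regression model."""
    log_k, b0, b1 = params
    k = np.exp(log_k)                          # keep the shape positive
    lam = np.exp(b0 + b1 * x)
    z = t / lam
    # log f(t) = log k - log lam + (k-1) log(t/lam) - (t/lam)^k
    return -np.sum(np.log(k) - np.log(lam) + (k - 1) * np.log(z) - z**k)

res = minimize(negloglik, x0=[0.0, 0.0, 0.0], method="Nelder-Mead",
               options={"maxiter": 5000})
k_hat = np.exp(res.x[0])
b1_hat = res.x[2]
```

The simulation study's comparison would repeat such a fit many times per condition and contrast the bias and standard errors against the posterior estimates from the Bayesian model.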

  11. A Vehicle Traveling Time Prediction Method Based on Grey Theory and Linear Regression Analysis

    Institute of Scientific and Technical Information of China (English)

    TU Jun; LI Yan-ming; LIU Cheng-liang

    2009-01-01

    Vehicle traveling time prediction is an important part of research on intelligent transportation systems. To date, various methods for vehicle traveling time prediction have been proposed, but few consider both the time and space aspects. In this paper, a vehicle traveling time prediction method based on grey theory (GT) and linear regression analysis (LRA) is presented. In the time dimension, we use the historical data sequence of bus speed on a certain road to predict the future bus speed on that road by GT. In the space dimension, we calculate the traffic-affecting factors between various roads by LRA. Using these factors we can predict the vehicle's speed on the downstream road if the vehicle's speed on the current road is known. Finally, we use time and space factors as the weighting factors of the two results predicted by GT and LRA respectively to obtain the final result, thus calculating the vehicle's traveling time. The method also considers factors such as dwell time, making the prediction more accurate.
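The grey-theory half of the method typically means a GM(1,1) model. Below is a standard GM(1,1) forecast applied to an invented bus-speed history (the paper's data and its LRA spatial factors are not reproduced):

```python
import numpy as np

def gm11_forecast(x0, steps=1):
    """GM(1,1) grey forecast: fit the whitened equation dx1/dt + a*x1 = b
    on the accumulated series x1, extrapolate, then difference back."""
    x1 = np.cumsum(x0)                            # accumulated generating series
    z = 0.5 * (x1[1:] + x1[:-1])                  # background (mean) values
    B = np.column_stack([-z, np.ones(len(z))])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    k = np.arange(len(x0) + steps)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    x0_hat = np.concatenate([[x1_hat[0]], np.diff(x1_hat)])
    return x0_hat[-steps:]

# Hypothetical bus-speed history (km/h) on one road segment.
speeds = np.array([40.0, 38.5, 37.2, 36.0, 34.9])
next_speed = gm11_forecast(speeds, steps=1)[0]
```

The method described above would then blend such a GT forecast with an LRA prediction from the upstream road, using time and space weighting factors.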

  12. Variable Selection for Functional Logistic Regression in fMRI Data Analysis

    Directory of Open Access Journals (Sweden)

    Nedret BILLOR

    2015-03-01

    Full Text Available This study was motivated by a classification problem in Functional Magnetic Resonance Imaging (fMRI), a noninvasive imaging technique which allows an experimenter to take images of a subject's brain over time. As fMRI studies usually have a small number of subjects, and we assume that there is a smooth, underlying curve describing the observations in fMRI data, this results in incredibly high-dimensional datasets that are functional in nature. High dimensionality is one of the biggest problems in the statistical analysis of fMRI data, and there is also a need for better classification methods. One of the greatest strengths of the fMRI technique is its noninvasiveness; if statistical classification methods are improved, they could aid the advancement of noninvasive diagnostic techniques for mental illness or even degenerative diseases such as Alzheimer's. In this paper, we develop a variable selection technique based on L1 regularization (group lasso) for the functional logistic regression model, which tackles the high dimensionality and correlation problems in fMRI data, where the response is binary and represents two separate classes and the predictors are functional. We assess our method with a simulation study and an application to a real fMRI dataset.

  13. Evaluating Alcoholics Anonymous's Effect on Drinking in Project MATCH Using Cross-Lagged Regression Panel Analysis

    Science.gov (United States)

    Magura, Stephen; Cleland, Charles M.; Tonigan, J. Scott

    2013-01-01

    Objective: The objective of the study is to determine whether Alcoholics Anonymous (AA) participation leads to reduced drinking and problems related to drinking within Project MATCH (Matching Alcoholism Treatments to Client Heterogeneity), an existing national alcoholism treatment data set. Method: The method used is structural equation modeling of panel data with cross-lagged partial regression coefficients. The main advantage of this technique for the analysis of AA outcomes is that potential reciprocal causation between AA participation and drinking behavior can be explicitly modeled through the specification of finite causal lags. Results: For the outpatient subsample (n = 952), the results strongly support the hypothesis that AA attendance leads to increases in alcohol abstinence and reduces drinking/problems, whereas a causal effect in the reverse direction is unsupported. For the aftercare subsample (n = 774), the results are not as clear but also suggest that AA attendance leads to better outcomes. Conclusions: Although randomized controlled trials are the surest means of establishing causal relations between interventions and outcomes, such trials are rare in AA research for practical reasons. The current study successfully exploited the multiple data waves in Project MATCH to examine evidence of causality between AA participation and drinking outcomes. The study obtained unique statistical results supporting the effectiveness of AA primarily in the context of primary outpatient treatment for alcoholism. PMID:23490566
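The cross-lagged logic of the study — each wave-2 variable regressed on both lagged variables, so the two causal directions can be compared — can be sketched with plain OLS on simulated two-wave data. The data below are invented with a one-directional effect built in (AA participation at wave 1 lowers drinking at wave 2, but not vice versa); the study's full structural equation model is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
aa_t1 = rng.normal(size=n)                   # AA participation, wave 1
drink_t1 = rng.normal(size=n)                # drinking severity, wave 1
# One-directional causal structure: AA at t1 reduces drinking at t2;
# drinking at t1 has no effect on AA at t2.
drink_t2 = 0.5 * drink_t1 - 0.4 * aa_t1 + rng.normal(scale=0.5, size=n)
aa_t2 = 0.6 * aa_t1 + rng.normal(scale=0.5, size=n)

def ols(y, *cols):
    """Ordinary least squares with an intercept; returns coefficients."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Cross-lagged regressions: each outcome on its own lag and the other lag.
b_drink = ols(drink_t2, drink_t1, aa_t1)   # [const, autoregressive, cross-lag]
b_aa = ols(aa_t2, aa_t1, drink_t1)
```

A negative cross-lag in `b_drink` together with a near-zero cross-lag in `b_aa` is the asymmetric pattern the study interprets as evidence that AA participation drives outcomes rather than the reverse.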

  14. Applying Different Independent Component Analysis Algorithms and Support Vector Regression for IT Chain Store Sales Forecasting

    Directory of Open Access Journals (Sweden)

    Wensheng Dai

    2014-01-01

    Full Text Available Sales forecasting is one of the most important issues in managing information technology (IT) chain store sales since an IT chain store has many branches. Integrating a feature extraction method with a prediction tool such as support vector regression (SVR) is a useful way to construct an effective sales forecasting scheme. Independent component analysis (ICA) is a novel feature extraction technique that has been widely applied to various forecasting problems. But, up to now, only the basic ICA method (i.e., the temporal ICA model) has been applied to the sales forecasting problem. In this paper, we utilize three different ICA methods, namely spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA), to extract features from the sales data and compare their performance in sales forecasting for an IT chain store. Experimental results on real sales data show that the sales forecasting scheme integrating stICA and SVR outperforms the comparison models in terms of forecasting error. The stICA is a promising tool for extracting effective features from branch sales data, and the extracted features can improve the prediction performance of SVR for sales forecasting.

  15. Association between parity and fistula location in women with obstetric fistula: a multivariate regression analysis.

    Science.gov (United States)

    Sih, A M; Kopp, D M; Tang, J H; Rosenberg, N E; Chipungu, E; Harfouche, M; Moyo, M; Mwale, M; Wilkinson, J P

    2016-04-01

    To compare primiparous and multiparous women who develop obstetric fistula (OF) and to assess predictors of fistula location. Cross-sectional study. Fistula Care Centre at Bwaila Hospital, Lilongwe, Malawi. Women with OF who presented between September 2011 and July 2014 with a complete obstetric history were eligible for the study. Women with OF were surveyed for their obstetric history. Women were classified as multiparous if prior vaginal or caesarean delivery was reported. The location of the fistula was determined at operation: OF involving the urethra, bladder neck, and midvagina were classified as low; OF involving the vaginal apex, cervix, uterus, and ureters were classified as high. Demographic information was compared between primiparous and multiparous women using chi-squared and Mann-Whitney U-tests. Multivariate logistic regression models were implemented to assess the relationship between variables of interest and fistula location. During the study period, 533 women presented for repair, of which 452 (84.8%) were included in the analysis. The majority (56.6%) were multiparous when the fistula formed. Multiparous women were more likely to have laboured for less than a day and to have a high fistula location (37.5 versus 11.2%). Multiparity was common in our cohort, and these women were more likely to have a high fistula. Additional research is needed to understand the aetiology of high fistula including potential iatrogenic causes. Multiparity and caesarean delivery were associated with a high tract fistula in our Malawian cohort. © 2016 Royal College of Obstetricians and Gynaecologists.
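The multivariate logistic regression underlying such an analysis can be sketched in miniature. The predictors, effect sizes, and simulated cohort below are invented for illustration (the study's actual covariates and estimates are not reproduced); the fit uses Newton-Raphson, the standard algorithm for logistic maximum likelihood:

```python
import numpy as np

def logistic_fit(X, y, iters=25):
    """Multivariate logistic regression by Newton-Raphson (IRLS)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1 - p)
        # Newton step: beta += (X' W X)^-1 X' (y - p)
        beta += np.linalg.solve((X * W[:, None]).T @ X, X.T @ (y - p))
    return beta

# Hypothetical binary predictors: multiparity and caesarean delivery;
# outcome: high-tract fistula. Simulated so multiparity raises the odds.
rng = np.random.default_rng(3)
multi = rng.integers(0, 2, 450)
caes = rng.integers(0, 2, 450)
logit = -1.5 + 1.2 * multi + 0.8 * caes
y = (rng.uniform(size=450) < 1 / (1 + np.exp(-logit))).astype(float)
beta = logistic_fit(np.column_stack([multi, caes]), y)
odds_ratio_multiparity = np.exp(beta[1])
```

Exponentiating a coefficient gives the adjusted odds ratio for that predictor, holding the others fixed — the quantity such a study reports.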

  16. Association between parity and fistula location in Malawian women with obstetric fistula: a multivariate regression analysis

    Science.gov (United States)

    Sih, Allison M.; Kopp, Dawn M.; Tang, Jennifer H.; Rosenberg, Nora E.; Chipungu, Ennet; Harfouche, Melike; Moyo, Margaret; Mwale, Mwawi; Wilkinson, Jeffrey P.

    2016-01-01

    Objective To compare primiparous and multiparous women who develop obstetric fistula (OF) and to assess predictors of fistula location Design Cross-sectional study Setting Fistula Care Center at Bwaila Hospital, Lilongwe, Malawi Population Women with OF who presented between September 2011 and July 2014 with a complete obstetric history were eligible for the study. Methods Women with OF were surveyed for their obstetric history. Women were classified as multiparous if prior vaginal or cesarean delivery was reported. Location of fistula was determined at operation. OF involving the urethra, bladder neck, and midvagina were classified as low; OF involving the vaginal apex, cervix, uterus, and ureters were classified as high. Main Outcome Measures Demographic information was compared between primiparous and multiparous women using Chi-squared and Mann-Whitney U tests. Multivariate logistic regression models were implemented to assess the relationship between variables of interest and fistula location. Results During the study period, 533 women presented for repair, of which 452 (84.8%) were included in the analysis. The majority (56.6%) were multiparous when the fistula formed. Multiparous women were more likely to have labored less than a day (62.4% vs 44.5%) and to have a high fistula location (37.5% vs 11.2%). Conclusions Multiparity was common in our cohort, and these women were more likely to have a high fistula. Additional research is needed to understand the etiology of high fistula including potential iatrogenic causes. PMID:26853525

  17. Italian Manufacturing and Service Firms' Labor Productivity: A Longitudinal Quantile Regression Analysis

    Directory of Open Access Journals (Sweden)

    Margherita Velucchi

    2014-09-01

    Full Text Available Labor productivity is very complex to analyze across time, sectors and countries. In Italy in particular, labor productivity has shown a prolonged slowdown, but sector analyses highlight the presence of specific niches with good levels of productivity and performance. This paper investigates how firms' characteristics might have affected the dynamics of Italian service and manufacturing firms' labor productivity in recent years (1998-2007), comparing them and focusing on some relevant sectors. We use an original micro-level panel from the Italian National Institute of Statistics (ISTAT) and a longitudinal quantile regression approach that allows us to show that labor productivity is highly heterogeneous across sectors and that the links between labor productivity and firms' characteristics are not constant across quantiles. We show that average estimates obtained via GLS do not capture the complex dynamics and heterogeneity of service and manufacturing firms' labor productivity. Using this approach, we show that innovativeness and human capital, in particular, have a very strong impact on fostering the labor productivity of less productive firms. From the analysis of four service sectors (restaurants & hotels, trade distributors, trade shops, and legal & accountants) we show that heterogeneity is more intense at the sector level, and we derive some common features that may be useful in terms of policy implications.
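Quantile regression, unlike GLS, estimates a separate coefficient vector at each quantile, which is how effects on less productive firms can differ from effects at the top. A minimal sketch using the linear-programming formulation follows; the firm-size covariate and the heavy-tailed productivity data are invented, and the study's panel structure is not modeled:

```python
import numpy as np
from scipy.optimize import linprog

def quantile_regression(X, y, tau=0.5):
    """Quantile regression via its linear-programming formulation:
    minimize tau*u+ + (1-tau)*u-  subject to  X@beta + u+ - u- = y."""
    n, p = X.shape
    # Decision vector: [beta (free), u+ (>=0), u- (>=0)].
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]

rng = np.random.default_rng(4)
n = 200
size = rng.uniform(1, 10, n)                       # hypothetical firm size
prod = 2.0 + 0.5 * size + rng.standard_t(3, n)     # heavy-tailed productivity
X = np.column_stack([np.ones(n), size])
beta_med = quantile_regression(X, prod, tau=0.5)   # median regression
beta_q90 = quantile_regression(X, prod, tau=0.9)   # upper-tail regression
```

Comparing `beta_med` with `beta_q90` (and with other quantiles) reveals exactly the kind of quantile-varying links the paper documents.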

  18. An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy: The Case of Neighbourhoods and Health.

    Directory of Open Access Journals (Sweden)

    Juan Merlo

    Full Text Available Many multilevel logistic regression analyses of "neighbourhood and health" focus on interpreting measures of association (e.g., the odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that distinguishes between "specific" (measures of association) and "general" (measures of variance) contextual effects. Performing two empirical examples, we illustrate the methodology, interpret the results and discuss the implications of this kind of analysis in public health. We analyse 43,291 individuals residing in 218 neighbourhoods in the city of Malmö, Sweden in 2006. We study two individual outcomes (psychotropic drug use and choice of private vs. public general practitioner, GP) for which the relative importance of neighbourhood as a source of individual variation differs substantially. In Step 1 of the analysis, we evaluate the OR and the area under the receiver operating characteristic (AUC) curve for individual-level covariates (i.e., age, sex and individual low income). In Step 2, we assess general contextual effects using the AUC. Finally, in Step 3 the OR for a specific neighbourhood characteristic (i.e., neighbourhood income) is interpreted jointly with the proportional change in variance (PCV) and the proportion of ORs in the opposite direction (POOR) statistics. For both outcomes, information on individual characteristics (Step 1) provided low discriminatory accuracy (AUC = 0.616 for psychotropic drugs; 0.600 for choosing a private GP). Accounting for neighbourhood of residence (Step 2) only improved the AUC for choosing a private GP (+0.295 units). High neighbourhood income (Step 3) was strongly associated with choosing a private GP (OR = 3.50), but the PCV was only 11% and the POOR 33%. Applying an innovative stepwise multilevel analysis, we observed that, in Malmö, the neighbourhood context per se had a negligible influence on individual use of psychotropic drugs, but

  19. [Multiple dependent variables LS-SVM regression algorithm and its application in NIR spectral quantitative analysis].

    Science.gov (United States)

    An, Xin; Xu, Shuo; Zhang, Lu-Da; Su, Shi-Guang

    2009-01-01

    In the present paper, on the basis of the LS-SVM algorithm, we built a multiple dependent variables LS-SVM (MLS-SVM) regression model whose weights can be optimized, and gave the corresponding algorithm. Furthermore, we theoretically explained the relationship between MLS-SVM and LS-SVM. Sixty-four broomcorn samples were taken as experimental material, with a ratio of modeling set to predicting set of 51:13. We first selected five weight groups randomly and uniformly in the interval [0, 1], and then, following the leave-one-out (LOO) rule, determined one appropriate weight group and the model parameters (penalty and kernel parameters) according to the criterion of the minimum average relative error. A multiple dependent variables quantitative analysis model was then built from the NIR spectra to simultaneously analyze three chemical constituents: protein, lysine and starch. Finally, the average relative errors between actual and predicted values for the predicting set were 1.65%, 6.47% and 1.37% for the three components, respectively, and the correlation coefficients were 0.9940, 0.8392 and 0.8825, respectively. For comparison, LS-SVM was also utilized, for which the average relative errors were 1.68%, 6.25% and 1.47%, respectively, and the correlation coefficients were 0.9941, 0.8310 and 0.8800, respectively. It is obvious that the MLS-SVM algorithm is comparable to the LS-SVM algorithm in modeling performance, and both give satisfying results. The results show that the MLS-SVM model is capable of multi-component NIR quantitative analysis simultaneously. Thus the MLS-SVM algorithm offers a new multiple dependent variables quantitative analysis approach for chemometrics. In addition, the weights have a certain effect on the prediction performance of the MLS-SVM model, which is consistent with our intuition and is validated in this study. Therefore, it is necessary to optimize
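The LS-SVM that MLS-SVM builds on replaces the SVM's quadratic program with one linear system. Below is a minimal single-output LS-SVM regression with an RBF kernel on toy data (the paper's multi-output weighting scheme and NIR spectra are not reproduced):

```python
import numpy as np

def lssvm_train(X, y, gamma=100.0, sigma=1.0):
    """LS-SVM regression: solve the single linear system
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y] with an RBF kernel."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2 * sigma**2))
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma          # ridge term from penalty gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    b, alpha = sol[0], sol[1:]

    def predict(X_new):
        sq = np.sum((X_new[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq / (2 * sigma**2)) @ alpha + b
    return predict

# Toy data: fit a sine curve with 40 training points.
X = np.linspace(0, 2 * np.pi, 40)[:, None]
y = np.sin(X[:, 0])
predict = lssvm_train(X, y)
fitted = predict(X)
```

The penalty `gamma` and kernel width `sigma` are the hyperparameters the paper tunes by the LOO rule; an MLS-SVM extends this setup to several dependent variables with optimizable weights.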

  20. Quantile regression

    CERN Document Server

    Hao, Lingxin

    2007-01-01

    Quantile Regression, the first book of Hao and Naiman's two-book series, establishes the seldom recognized link between inequality studies and quantile regression models. Though separate methodological literature exists for each subject, the authors seek to explore the natural connections between this increasingly sought-after tool and research topics in the social sciences. Quantile regression as a method does not rely on assumptions as restrictive as those for the classical linear regression; though more traditional models such as least squares linear regression are more widely utilized, Hao
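
    A minimal illustration of the tool the book treats: the τ-th quantile minimizes the expected pinball (check) loss, so an intercept-only quantile regression simply recovers the sample quantile. The skewed synthetic outcome below is an assumption for illustration.

```python
import numpy as np

def pinball_loss(c, y, tau):
    """Mean check loss of the constant predictor c at quantile level tau."""
    r = y - c
    return np.mean(np.maximum(tau * r, (tau - 1.0) * r))

rng = np.random.default_rng(0)
y = rng.exponential(2.0, size=5000)  # right-skewed outcome

tau = 0.9
grid = np.linspace(y.min(), y.max(), 2001)  # candidate intercepts
c_hat = grid[np.argmin([pinball_loss(c, y, tau) for c in grid])]
# c_hat coincides (up to grid resolution) with the empirical 90th percentile
```

    With covariates, the same loss is minimized over regression coefficients, which is what makes quantile regression robust to the error distribution in a way least squares is not.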

  1. Hierarchical classification for the topography analysis of Asteroid (4179) Toutatis from the Chang'E-2 images

    Science.gov (United States)

    Zheng, Chen; Ping, Jinsong; Wang, Mingyuan

    2016-11-01

    High-spatial-resolution images of the near-Earth Asteroid (4179) Toutatis were obtained during a successful flyby of the Chang'E-2 spacecraft. These optical images give us a chance to observe the surface of this asteroid closely. However, some local low-contrast regions in the Chang'E-2 images limit the accuracy of topography recognition. To solve this problem, a hierarchical classification method is developed to assist the topography analysis of Toutatis based on the Chang'E-2 optical images. The proposed method first classifies the image at the macro level and the micro level, respectively. The topography of Toutatis can then be explored using the hierarchical classification result. Experimental results demonstrate that the method can not only provide a new perspective for analyzing previously reported topographic objects, but also reveal new characteristics in low-contrast regions. Specifically, two new topographic characteristics are revealed: one is a connection region with a particular spectral value located at the corner of the large lobe, and the other is an object resembling a fixed star in the background of the images.

  2. Taxonomy of Manufacturing Flexibility at Manufacturing Companies Using Imperialist Competitive Algorithms, Support Vector Machines and Hierarchical Cluster Analysis

    Directory of Open Access Journals (Sweden)

    M. Khoobiyan

    2017-04-01

    Full Text Available Manufacturing flexibility is a multidimensional concept, and manufacturing companies act differently in using these dimensions. The purpose of this study is to investigate the taxonomy of manufacturing flexibility and identify its dominant groups. Dimensions of manufacturing flexibility were extracted by content analysis of the literature and expert judgement. Manufacturing flexibility was measured using a questionnaire developed to survey managers of manufacturing companies, with a sample size of 379. To identify dominant flexibility groups based on the determined dimensions, Hierarchical Cluster Analysis (HCA), Imperialist Competitive Algorithms (ICAs), and Support Vector Machines (SVMs) were applied and compared using cluster validity indices. The best clustering algorithm was SVMs with three clusters, designated as leading delivery-based flexibility, frugal flexibility, and sufficient plan-based flexibility.

  3. Comparing transfusion reaction rates for various plasma types: a systematic review and meta-analysis/regression.

    Science.gov (United States)

    Saadah, Nicholas H; van Hout, Fabienne M A; Schipperus, Martin R; le Cessie, Saskia; Middelburg, Rutger A; Wiersum-Osselton, Johanna C; van der Bom, Johanna G

    2017-09-01

    We estimated rates for common plasma-associated transfusion reactions and compared reported rates for various plasma types. We performed a systematic review and meta-analysis of peer-reviewed articles that reported plasma transfusion reaction rates. Random-effects pooled rates were calculated and compared between plasma types. Meta-regression was used to compare various plasma types with regard to their reported plasma transfusion reaction rates. Forty-eight studies reported transfusion reaction rates for fresh-frozen plasma (FFP; mixed-sex and male-only), amotosalen INTERCEPT FFP, methylene blue-treated FFP, and solvent/detergent-treated pooled plasma. Random-effects pooled average rates for FFP were: allergic reactions, 92/10^5 units transfused (95% confidence interval [CI], 46-184/10^5 units transfused); febrile nonhemolytic transfusion reactions (FNHTRs), 12/10^5 units transfused (95% CI, 7-22/10^5 units transfused); transfusion-associated circulatory overload (TACO), 6/10^5 units transfused (95% CI, 1-30/10^5 units transfused); transfusion-related acute lung injury (TRALI), 1.8/10^5 units transfused (95% CI, 1.2-2.7/10^5 units transfused); and anaphylactic reactions, 0.8/10^5 units transfused (95% CI, 0-45.7/10^5 units transfused). Risk differences between plasma types were not significant for allergic reactions, TACO, or anaphylactic reactions. Methylene blue-treated FFP led to fewer FNHTRs than FFP (risk difference = -15.3 FNHTRs/10^5 units transfused; 95% CI, -24.7 to -7.1 reactions/10^5 units transfused), and male-only FFP led to fewer cases of TRALI than mixed-sex FFP (risk difference = -0.74 TRALI/10^5 units transfused; 95% CI, -2.42 to -0.42 injuries/10^5 units transfused). Meta-regression demonstrates that the rate of FNHTRs is lower for methylene blue-treated FFP than for FFP, and the rate of TRALI is lower for male-only than for mixed-sex FFP; whereas no significant differences are observed between plasma types for allergic
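
    Random-effects pooled rates of the kind reported here are conventionally computed with a DerSimonian-Laird estimator applied to (log-transformed) study-level rates. A sketch under that assumption, with made-up study effects rather than the review's data:

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its SE, and between-study variance tau^2."""
    effects, variances = np.asarray(effects, float), np.asarray(variances, float)
    w = 1.0 / variances                        # fixed-effect (inverse-variance) weights
    mu_fe = np.sum(w * effects) / np.sum(w)
    Q = np.sum(w * (effects - mu_fe) ** 2)     # Cochran's Q heterogeneity statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - (len(effects) - 1)) / c)
    w_re = 1.0 / (variances + tau2)            # random-effects weights
    mu_re = np.sum(w_re * effects) / np.sum(w_re)
    return mu_re, np.sqrt(1.0 / np.sum(w_re)), tau2

# Hypothetical log reaction rates (per 10^5 units) from five studies
log_rates = [4.2, 4.8, 4.4, 4.6, 4.5]
log_rate_vars = [0.04, 0.09, 0.05, 0.10, 0.06]
pooled, se, tau2 = dersimonian_laird(log_rates, log_rate_vars)
```

    Exponentiating `pooled` and its confidence limits returns the estimate to the rate scale; when tau^2 = 0 the estimator reduces to the fixed-effect inverse-variance average.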

  4. Regional Flood Frequency Analysis using Support Vector Regression under historical and future climate

    Science.gov (United States)

    Gizaw, Mesgana Seyoum; Gan, Thian Yew

    2016-07-01

    Regional Flood Frequency Analysis (RFFA) is a statistical method widely used to estimate flood quantiles of catchments with limited streamflow data. Moreover, when estimating the flood quantiles of ungauged sites, only a limited number of stations with complete datasets may be available from hydrologically similar, surrounding catchments. Besides traditional regression-based RFFA methods, recent applications of machine learning algorithms such as the artificial neural network (ANN) have shown encouraging results in regional flood quantile estimation. Another machine learning technique that is becoming widely applied in the hydrologic community is Support Vector Regression (SVR). In this study, an RFFA model based on SVR was developed to estimate regional flood quantiles for two study areas, one with 26 catchments located in southeastern British Columbia (BC) and another with 23 catchments located in southern Ontario (ON), Canada. The SVR-RFFA model for both study sites was developed from 13 sets of physiographic and climatic predictors for the historical period. The Ef (Nash-Sutcliffe coefficient) and R^2 of the SVR-RFFA model were about 0.7 when estimating flood quantiles of 10-, 25-, 50- and 100-year return periods, which indicates satisfactory model performance in both study areas. In addition, the SVR-RFFA model also performed well on other goodness-of-fit statistics such as BIAS (mean bias) and BIASr (relative BIAS). When the amount of data available for training RFFA models is limited, the SVR-RFFA model was found to perform better than an ANN-based RFFA model, with significantly lower median CV (coefficient of variation) of the estimated flood quantiles. The SVR-RFFA model was then used to project changes in flood quantiles over the two study areas under the impact of climate change using the RCP4.5 and RCP8.5 climate projections of five Coupled Model Intercomparison Project (CMIP5) GCMs (Global Climate Models) for the 2041
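
    An SVR-based RFFA model of this kind can be sketched with scikit-learn, scoring the fit with the Nash-Sutcliffe coefficient (Ef). The three predictors and the synthetic "catchments" below are illustrative assumptions, not the study's 13 predictor sets.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for 26 catchments with 3 physiographic/climatic predictors
rng = np.random.default_rng(42)
X = rng.uniform(size=(26, 3))  # e.g. scaled area, slope, rainfall (assumed names)
q100 = 50 * X[:, 0] + 20 * X[:, 1] + rng.normal(0, 1, 26)  # "100-year quantile"

# Standardizing predictors before the RBF kernel is essential for SVR
svr_rffa = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0, epsilon=0.1))
svr_rffa.fit(X, q100)
pred = svr_rffa.predict(X)

# Nash-Sutcliffe efficiency (Ef): 1 - SSE / variance of observations
ef = 1.0 - np.sum((q100 - pred) ** 2) / np.sum((q100 - q100.mean()) ** 2)
```

    In practice Ef would be computed on held-out catchments (e.g. leave-one-out over the region) rather than in-sample, since the point of RFFA is prediction at ungauged sites.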

  5. Duloxetine compared with fluoxetine and venlafaxine: use of meta-regression analysis for indirect comparisons

    Directory of Open Access Journals (Sweden)

    Lançon Christophe

    2006-07-01

    Full Text Available Abstract Background Data comparing duloxetine with existing antidepressant treatments are limited. A comparison of duloxetine with fluoxetine has been performed, but no comparison with venlafaxine, the other antidepressant in the same therapeutic class with a significant market share, has been undertaken. In the absence of relevant data to assess the place that duloxetine should occupy in the therapeutic arsenal, indirect comparisons are the most rigorous available approach. We conducted a systematic review of the efficacy of duloxetine, fluoxetine and venlafaxine versus placebo in the treatment of Major Depressive Disorder (MDD), and performed indirect comparisons through meta-regressions. Methods The bibliography of the Agency for Health Care Policy and Research and the CENTRAL, Medline, and Embase databases were interrogated using advanced search strategies based on a combination of text and index terms. The search focused on randomized placebo-controlled clinical trials involving adult patients treated for acute-phase Major Depressive Disorder. All outcomes were derived so as to account for varying placebo responses across studies. The primary outcome was treatment efficacy as measured by Hedge's g effect size. Secondary outcomes were response and dropout rates as measured by log odds ratios. Meta-regressions were run to indirectly compare the drugs. Sensitivity analyses, assessing the influence of individual studies over the results and the influence of patients' characteristics, were run. Results 22 studies involving fluoxetine, 9 involving duloxetine and 8 involving venlafaxine were selected. Using indirect comparison methodology, estimated effect sizes for efficacy compared with duloxetine were 0.11 [-0.14;0.36] for fluoxetine and 0.22 [0.06;0.38] for venlafaxine. Response log odds ratios were -0.21 [-0.44;0.03] and 0.70 [0.26;1.14], respectively. Dropout log odds ratios were -0.02 [-0.33;0.29] and 0.21 [-0.13;0.55], respectively. Sensitivity analyses showed that results were
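
    An indirect comparison by meta-regression amounts to inverse-variance weighted least squares of trial-level effect sizes on drug indicator variables. The sketch below uses fabricated effect sizes chosen so the contrast is recoverable exactly; it is not the paper's data.

```python
import numpy as np

def wls(X, y, w):
    """Closed-form weighted least squares: solve (X'WX) beta = X'Wy."""
    XtW = X.T * w  # scales each sample's column of X' by its weight
    return np.linalg.solve(XtW @ X, XtW @ y)

# Hypothetical Hedge's g per placebo-controlled trial: a reference drug
# (6 trials, g = 0.30) and a comparator (4 trials, shifted by +0.11).
g = np.array([0.30, 0.30, 0.30, 0.30, 0.30, 0.30,
              0.41, 0.41, 0.41, 0.41])
var_g = np.full(10, 0.02)                        # within-trial variances
comparator = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1], dtype=float)

X = np.column_stack([np.ones(10), comparator])   # intercept + drug dummy
beta = wls(X, g, 1.0 / var_g)
# beta[1] is the indirect drug-vs-drug contrast in effect-size units
```

    Because every trial is placebo-controlled, the dummy coefficient compares the two drugs through their common placebo anchor, which is the core of the indirect-comparison logic.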

  6. Modeling and regression analysis of semiochemical dose-response curves of insect antennal reception and behavior.

    Science.gov (United States)

    Byers, John A

    2013-08-01

    Dose-response curves of the effects of semiochemicals on neurophysiology and behavior are reported in many articles in insect chemical ecology. Most curves are shown in figures representing points connected by straight lines, in which the x-axis has order of magnitude increases in dosage vs. responses on the y-axis. The lack of regression curves indicates that the nature of the dose-response relationship is not well understood. Thus, a computer model was developed to simulate a flux of various numbers of pheromone molecules (10^3 to 5 × 10^6) passing by 10^4 receptors distributed among 10^6 positions along an insect antenna. Each receptor was depolarized by at least one strike by a molecule, and subsequent strikes had no additional effect. The simulations showed that with an increase in pheromone release rate, the antennal response would increase in a convex fashion and not in a logarithmic relation as suggested previously. Non-linear regression showed that a family of kinetic formation functions fit the simulated data nearly perfectly (R^2 > 0.999). This is reasonable because olfactory receptors have proteins that bind to the pheromone molecule and are expected to exhibit enzyme kinetics. Over 90 dose-response relationships reported in the literature of electroantennographic and behavioral bioassays in the laboratory and field were analyzed by the logarithmic and kinetic formation functions. This analysis showed that in 95% of the cases, the kinetic functions explained the relationships better than the logarithmic (mean of about 20% better). The kinetic curves become sigmoid when graphed on a log scale on the x-axis. Dose-catch relationships in the field are similar to dose-EAR (effective attraction radius, in which a spherical radius indicates the trapping effect of a lure) and the circular EARc in two dimensions used in mass trapping models. The use of kinetic formation functions for dose-response curves of attractants, and kinetic decay curves for
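
    The simulation's premise (each receptor fires once struck, further strikes add nothing) implies a saturating kinetic formation curve of the form R(d) = Rmax(1 − e^(−kd)), which can be fitted by non-linear regression with scipy. The dose grid and parameter values below are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def kinetic_formation(dose, rmax, k):
    """Fraction-of-receptors-struck model: saturating, convex dose-response."""
    return rmax * (1.0 - np.exp(-k * dose))

# Noise-free synthetic antennal responses at order-of-magnitude doses
dose = np.array([1e3, 1e4, 1e5, 1e6, 5e6])
resp = kinetic_formation(dose, 100.0, 2e-6)

popt, _ = curve_fit(kinetic_formation, dose, resp, p0=[80.0, 1e-6])
rmax_hat, k_hat = popt
```

    Plotted against log(dose), the fitted curve takes the sigmoid shape the abstract describes, whereas a purely logarithmic model would grow without bound instead of saturating at Rmax.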

  7. A Bayesian ridge regression analysis of congestion's impact on urban expressway safety.

    Science.gov (United States)

    Shi, Qi; Abdel-Aty, Mohamed; Lee, Jaeyoung

    2016-03-01

    With the rapid growth of traffic in urban areas, concerns about congestion and traffic safety have been heightened. This study leveraged both the Automatic Vehicle Identification (AVI) system and the Microwave Vehicle Detection System (MVDS) installed on an expressway in Central Florida to explore how congestion impacts crash occurrence in urban areas. Multiple congestion measures from the two systems were developed. To ensure more precise estimates of congestion's effects, the traffic data were aggregated into peak and non-peak hours. Multicollinearity among traffic parameters was examined. The results showed the presence of multicollinearity, especially during peak hours. In response, ridge regression was introduced to cope with this issue. Poisson models with uncorrelated random effects, correlated random effects, and both correlated random effects and random parameters were constructed within the Bayesian framework. Correlated random effects were shown to significantly enhance model performance. The random parameters model has similar goodness-of-fit compared with the model with only correlated random effects; however, by accounting for unobserved heterogeneity, more variables were found to be significantly related to crash frequency. The models indicated that congestion increased crash frequency during peak hours, while during non-peak hours it was not a major crash contributing factor. Using the random parameters model, the three congestion measures were compared. All congestion indicators had similar effects, while the Congestion Index (CI) derived from MVDS data was a better congestion indicator for safety analysis. Analyses also showed that segments with higher congestion intensity experienced increases not only in property-damage-only (PDO) crashes but also in more severe crashes. In addition, the issues regarding the necessity to incorporate a specific congestion indicator for congestion's effects on safety and to take care of the
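
    Ridge regression copes with multicollinearity by adding λI to X'X before inversion, shrinking the unstable coefficients of near-duplicate predictors. A sketch with two synthetic, highly collinear congestion measures (not the AVI/MVDS data):

```python
import numpy as np

def ridge_coef(X, y, lam):
    """Closed-form ridge solution: (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Two nearly duplicate congestion measures (hypothetical stand-ins for the
# AVI- and MVDS-derived indicators), plus a crash-frequency-like response
rng = np.random.default_rng(5)
speed_drop = rng.normal(20.0, 5.0, 200)
occupancy = 0.9 * speed_drop + rng.normal(0.0, 0.5, 200)  # highly collinear
crashes = 0.5 * speed_drop + 0.1 * occupancy + rng.normal(0.0, 2.0, 200)

X = np.column_stack([speed_drop, occupancy])
beta_ols = ridge_coef(X, crashes, 0.0)     # ordinary least squares
beta_ridge = ridge_coef(X, crashes, 50.0)  # shrunk, more stable estimates
```

    The study embeds this shrinkage idea in Bayesian Poisson models rather than plain linear regression, but the mechanism for taming the correlated peak-hour measures is the same.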

  8. Flood susceptible analysis at Kelantan river basin using remote sensing and logistic regression model

    Science.gov (United States)

    Pradhan, Biswajeet

    Recently, in 2006 and 2007, heavy monsoon rainfall triggered floods along Malaysia's east coast as well as in the southern state of Johor. The hardest-hit areas were along the east coast of peninsular Malaysia in the states of Kelantan, Terengganu and Pahang; the city of Johor was particularly hard hit in the south. The floods cost nearly a billion ringgit in property and many lives. The extent of damage could have been reduced or minimized if an early warning system had been in place. This paper deals with flood susceptibility analysis using a logistic regression model. We evaluated flood susceptibility and the effect of flood-related factors along the Kelantan river basin using a Geographic Information System (GIS) and remote sensing data. Previously flooded areas were extracted from archived Radarsat images using image processing tools. Flood susceptibility mapping was conducted in the study area along the Kelantan River using Radarsat imagery and then enlarged to a 1:25,000 scale. Topographical, hydrological and geological data and satellite images were collected, processed, and constructed into a spatial database using GIS and image processing. The factors chosen that influence flood occurrence were: topographic slope, topographic aspect, topographic curvature, DEM and distance from river drainage, all from the topographic database; flow direction and flow accumulation, extracted from the hydrological database; geology and distance from lineament, taken from the geologic database; land use from SPOT satellite images; soil texture from the soil database; and the vegetation index value from SPOT satellite images. Flood-susceptible areas were analyzed and mapped using the probability-logistic regression model. Results indicate that flood susceptibility mapping can be performed at 1:25,000, which is comparable to some conventional flood hazard map scales. The flood-prone areas delineated on these maps correspond to areas that would be inundated by significant flooding
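
    The probability-logistic susceptibility model can be sketched as follows; the three factors, their effect sizes and the synthetic grid cells are illustrative assumptions, not the Kelantan spatial database.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic grid cells with three of the factor types named in the abstract
rng = np.random.default_rng(7)
n = 500
slope = rng.uniform(0.0, 30.0, n)          # degrees
dist_river = rng.uniform(0.0, 5000.0, n)   # metres from drainage
elevation = rng.uniform(0.0, 200.0, n)     # metres (DEM)

# Assumed "true" process: flooding more likely on low, flat, riverside cells
logit = 2.0 - 0.08 * slope - 0.0008 * dist_river - 0.01 * elevation
flooded = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))  # radar-derived labels

X = np.column_stack([slope, dist_river, elevation])
clf = LogisticRegression(max_iter=2000).fit(X, flooded)
susceptibility = clf.predict_proba(X)[:, 1]  # per-cell flood probability map
```

    Applied cell by cell across the basin, `susceptibility` yields the continuous probability surface that is then classed and drawn at map scale.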

  9. Superior water repellency of water strider legs with hierarchical structures: experiments and analysis.

    Science.gov (United States)

    Feng, Xi-Qiao; Gao, Xuefeng; Wu, Ziniu; Jiang, Lei; Zheng, Quan-Shui

    2007-04-24

    Water striders are a type of insect with the remarkable ability to stand effortlessly and walk quickly on water. This article reports the water repellency mechanism of water strider legs. Scanning electron microscope (SEM) observations reveal the uniquely hierarchical structure on the legs, consisting of numerous oriented needle-shaped microsetae with elaborate nanogrooves. The maximal supporting force of a single leg against water surprisingly reaches up to 152 dynes, about 15 times the total body weight of this insect. We theoretically demonstrate that the cooperation of nanogroove structures on the oriented microsetae, in conjunction with the wax on the leg, renders such water repellency. This finding might be helpful in the design of innovative miniature aquatic devices and nonwetting materials.

  10. Meta-analysis methods for synthesizing treatment effects in multisite studies: hierarchical linear modeling (HLM perspective

    Directory of Open Access Journals (Sweden)

    Sema A. Kalaian

    2003-06-01

    Full Text Available The objectives of the present mixed-effects meta-analytic application are to provide practical guidelines to: (a) calculate treatment effect sizes from multiple sites; (b) calculate the overall mean of the site effect sizes and their variances; (c) model the heterogeneity in these site treatment effects as a function of site and program characteristics plus unexplained random error using Hierarchical Linear Modeling (HLM); (d) improve the ability of multisite evaluators and policy makers to reach sound conclusions about the effectiveness of educational and social interventions based on multisite evaluations; and (e) illustrate the proposed methodology by applying these methods to real multi-site research data.

  11. The Analysis of Internet Addiction Scale Using Multivariate Adaptive Regression Splines

    Directory of Open Access Journals (Sweden)

    M Kayri

    2010-12-01

    Full Text Available Background: Determining the real effects on internet dependency requires an unbiased and robust statistical method. MARS is a relatively new non-parametric method used in the literature for parameter estimation in cause-and-effect research. MARS can both produce legible model curves and make unbiased parametric predictions. Methods: In order to examine the performance of MARS, its findings are compared to those of Classification and Regression Trees (C&RT), which the literature considers efficient in revealing correlations between variables. The data set for the study is taken from "The Internet Addiction Scale" (IAS), which attempts to reveal the addiction levels of individuals. The population of the study consists of 754 secondary school students (301 female, 443 male) with 10 missing data. The MARS 2.0 trial version was used for the MARS analysis, and the C&RT analysis was done with SPSS. Results: MARS obtained six basis functions for the model, and from these six functions the regression equation of the model was found. For the predicted variable, MARS showed that average daily Internet-use time, the purpose of Internet use, students' grade, and mothers' occupation had significant effects (P < 0.05). In this comparative study, MARS obtained findings different from C&RT in predicting dependency level. Conclusion: This study observed that MARS reveals the extent to which a variable considered significant changes the character of the model.
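
    MARS builds its basis from paired hinge functions max(0, x − t) and max(0, t − x) at data-driven knots. Fitting such a basis by least squares at a single fixed knot shows the idea; the knot location and the synthetic piecewise-linear response below are assumptions.

```python
import numpy as np

def hinge_design(x, knots):
    """Intercept plus the paired MARS hinge functions max(0, x-t), max(0, t-x)."""
    cols = [np.ones_like(x)]
    for t in knots:
        cols.append(np.maximum(0.0, x - t))
        cols.append(np.maximum(0.0, t - x))
    return np.column_stack(cols)

# Piecewise-linear relationship whose slope changes at x = 5
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 10.0, 300)
y = np.where(x < 5.0, 2.0 * x, 10.0 - (x - 5.0)) + rng.normal(0.0, 0.3, 300)

B = hinge_design(x, knots=[5.0])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
r2 = 1.0 - np.sum((y - B @ coef) ** 2) / np.sum((y - y.mean()) ** 2)
```

    Full MARS additionally searches over candidate knots and variables in a forward pass and prunes terms by generalized cross-validation, which is what makes its basis-function selection automatic.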

  12. Predictive model of biliocystic communication in liver hydatid cysts using classification and regression tree analysis

    Directory of Open Access Journals (Sweden)

    Souadka Amine

    2010-04-01

    Full Text Available Abstract Background The incidence of liver hydatid cyst (LHC) rupture ranges from 15% to 40% of all cases, and most ruptures involve the bile duct tree. Patients with biliocystic communication (BCC) have specific clinical and therapeutic aspects. The purpose of this study was to determine which patients with LHC may develop BCC using classification and regression tree (CART) analysis. Methods A retrospective study of 672 patients with liver hydatid cyst treated at surgery department "A" of Ibn Sina University Hospital, Rabat, Morocco. Fourteen risk factors for BCC occurrence were entered into CART analysis to build an algorithm that best predicts the occurrence of BCC. Results The incidence of BCC was 24.5%. High-risk subgroups were patients with jaundice and a thick pericyst (risk 73.2%) and patients with a thick pericyst, no jaundice, age 36.5 years or younger, and no past history of LHC (risk 40.5%). The developed CART model had a sensitivity of 39.6%, specificity of 93.3%, positive predictive value of 65.6%, negative predictive value of 82.6%, and classification accuracy of 80.1%. The discriminating ability of the model was good (82%). Conclusion We developed a simple classification tool to identify LHC patients at high risk of BCC during a routine clinic visit (based only on clinical history and examination followed by ultrasonography). Predictive factors were based on pericyst aspect, jaundice, age, past history of liver hydatidosis, and morphological Gharbi cyst aspect. We think that this classification can be used effectively to direct patients to appropriate medical structures.
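
    A CART model on predictors of the kind named here (jaundice, pericyst aspect, age) can be sketched with scikit-learn's decision tree, reporting sensitivity and specificity as the paper does. The cohort below is simulated with made-up risk rates, not the Rabat data.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Simulated cohort: binary clinical signs plus age, with assumed BCC risks
rng = np.random.default_rng(3)
n = 672
jaundice = rng.random(n) < 0.2
thick_pericyst = rng.random(n) < 0.4
age = rng.uniform(10.0, 80.0, n)
bcc = rng.random(n) < (0.05 + 0.4 * jaundice + 0.3 * thick_pericyst)

X = np.column_stack([jaundice, thick_pericyst, age])
cart = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, bcc)
pred = cart.predict(X)

sensitivity = np.sum(pred & bcc) / np.sum(bcc)
specificity = np.sum(~pred & ~bcc) / np.sum(~bcc)
accuracy = np.mean(pred == bcc)
```

    Restricting `max_depth` keeps the tree readable as a bedside decision rule, matching the paper's goal of a tool usable during a routine clinic visit.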

  13. Integrative analysis of multiple diverse omics datasets by sparse group multitask regression

    Directory of Open Access Journals (Sweden)

    Dongdong eLin

    2014-10-01

    Full Text Available A variety of high-throughput genome-wide assays enable the exploration of genetic risk factors underlying complex traits. Although these studies have had remarkable impact on identifying susceptible biomarkers, they suffer from issues such as limited sample size and low reproducibility. Combining individual studies of different genetic levels/platforms holds the promise of improving the power and consistency of biomarker identification. In this paper, we propose a novel integrative method, namely sparse group multitask regression, for integrating diverse omics datasets, platforms and populations to identify risk genes/factors of complex diseases. This method combines multitask learning with sparse group regularization, which will: (1) treat biomarker identification in each single study as a task and then combine them by multitask learning; (2) group variables from all studies for identifying significant genes; (3) enforce a sparse constraint on groups of variables to overcome the 'small sample, but large variables' problem. We introduce two sparse group penalties, sparse group lasso and sparse group ridge, in our multitask model, and provide an effective algorithm for each model. In addition, we propose a significance test for the identification of potential risk genes. Two simulation studies are performed to evaluate the performance of our integrative method by comparing it with a conventional meta-analysis method. The results show that our sparse group multitask method significantly outperforms the meta-analysis method. In an application to our osteoporosis studies, 7 genes are identified as significant by our method and are found to have significant effects in three other independent studies for validation. The most significant gene, SOD2, had been identified in our previous osteoporosis study involving the same expression dataset. Several other genes such as TREML2, HTR1E and GLO1 are shown to be novel susceptibility genes for osteoporosis, as confirmed
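
    scikit-learn's MultiTaskLasso applies an l2,1 (group-across-tasks) penalty and can serve as a simplified stand-in for the sparse group multitask regression proposed here: each "study" is a task, and variables are selected jointly for all tasks. The dimensions and signal below are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

# 100 samples, 50 candidate genes, 3 "studies" (tasks) sharing 5 true genes
rng = np.random.default_rng(1)
n, p, t = 100, 50, 3
X = rng.normal(size=(n, p))
B = np.zeros((p, t))
B[:5, :] = rng.normal(2.0, 0.5, size=(5, t))  # shared risk genes
Y = X @ B + rng.normal(0.0, 0.5, size=(n, t))

# The l2,1 penalty zeroes whole coefficient rows, so a gene is kept or
# dropped for all tasks at once (coef_ has shape tasks x features)
mtl = MultiTaskLasso(alpha=0.5).fit(X, Y)
selected = np.where(np.any(mtl.coef_ != 0.0, axis=0))[0]
```

    The paper's method layers an additional within-group sparsity term and a permutation-style significance test on top of this joint-selection idea.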

  14. Variables that influence HIV-1 cerebrospinal fluid viral load in cryptococcal meningitis: a linear regression analysis

    Directory of Open Access Journals (Sweden)

    Cecchini Diego M

    2009-11-01

    Full Text Available Abstract Background The central nervous system is considered a sanctuary site for HIV-1 replication. Variables associated with HIV cerebrospinal fluid (CSF) viral load in the context of opportunistic CNS infections are poorly understood. Our objective was to evaluate the relation between: (1) CSF HIV-1 viral load and CSF cytological and biochemical characteristics (leukocyte count, protein concentration, cryptococcal antigen titer); (2) CSF HIV-1 viral load and HIV-1 plasma viral load; and (3) CSF leukocyte count and the peripheral blood CD4+ T lymphocyte count. Methods Our approach was a prospective collection and analysis of pre-treatment, paired CSF and plasma samples from antiretroviral-naive HIV-positive patients with cryptococcal meningitis attended at the Francisco J Muñiz Hospital, Buenos Aires, Argentina (period: 2004 to 2006). We measured HIV CSF and plasma levels by polymerase chain reaction using the Cobas Amplicor HIV-1 Monitor Test version 1.5 (Roche). Data were processed with Statistix 7.0 software (linear regression analysis). Results Samples from 34 patients were analyzed. CSF leukocyte count showed a statistically significant correlation with CSF HIV-1 viral load (r = 0.4, 95% CI = 0.13-0.63, p = 0.01). No correlation was found with plasma viral load, CSF protein concentration or cryptococcal antigen titer. A positive correlation was found between the peripheral blood CD4+ T lymphocyte count and the CSF leukocyte count (r = 0.44, 95% CI = 0.125-0.674, p = 0.0123). Conclusion Our study suggests that the CSF leukocyte count influences CSF HIV-1 viral load in patients with meningitis caused by Cryptococcus neoformans.

  15. Regression Analysis of Top of Descent Location for Idle-thrust Descents

    Science.gov (United States)

    Stell, Laurel; Bronsvoort, Jesper; McDonald, Greg

    2013-01-01

    In this paper, multiple regression analysis is used to model the top of descent (TOD) location of user-preferred descent trajectories computed by the flight management system (FMS) on over 1000 commercial flights into Melbourne, Australia. The independent variables cruise altitude, final altitude, cruise Mach, descent speed, wind, and engine type were also recorded or computed post-operations. Both first-order and second-order models are considered, where cross-validation, hypothesis testing, and additional analysis are used to compare models. This identifies the models that should give the smallest errors if used to predict TOD location for new data in the future. A model that is linear in TOD altitude, final altitude, descent speed, and wind gives an estimated standard deviation of 3.9 nmi for TOD location given the trajectory parameters, which means about 80% of predictions would have error less than 5 nmi in absolute value. This accuracy is better than demonstrated by other ground automation predictions using kinetic models. Furthermore, this approach would enable online learning of the model. Additional data or further knowledge of algorithms is necessary to conclude definitively that no second-order terms are appropriate. Possible applications of the linear model are described, including enabling arriving aircraft to fly optimized descents computed by the FMS even in congested airspace. In particular, a model for TOD location that is linear in the independent variables would enable decision support tool human-machine interfaces for which a kinetic approach would be computationally too slow.

  16. Shock index correlates with extravasation on angiographs of gastrointestinal hemorrhage: a logistic regression analysis.

    Science.gov (United States)

    Nakasone, Yutaka; Ikeda, Osamu; Yamashita, Yasuyuki; Kudoh, Kouichi; Shigematsu, Yoshinori; Harada, Kazunori

    2007-01-01

    We applied multivariate analysis to the clinical findings in patients with acute gastrointestinal (GI) hemorrhage and compared the relationship between these findings and angiographic evidence of extravasation. Our study population consisted of 46 patients with acute GI bleeding, divided into two groups. In group 1, we retrospectively analyzed 41 angiograms obtained in 29 patients (age range, 25-91 years; average, 71 years). Their clinical findings, including the shock index (SI), diastolic blood pressure, hemoglobin, platelet counts, and age, were quantitatively analyzed. In group 2, consisting of 17 patients (age range, 21-78 years; average, 60 years), we prospectively applied statistical analysis with a logistic regression model to the clinical findings and then assessed 21 angiograms obtained in these patients to determine whether our model was useful for predicting the presence of angiographic evidence of extravasation. On 18 of 41 (43.9%) angiograms in group 1 there was evidence of extravasation; in 3 patients it was demonstrated only by selective angiography. Factors significantly associated with angiographic visualization of extravasation were the SI and patient age. For differentiation between cases with and without angiographic evidence of extravasation, the optimal cutoff point was between 0.51 and 0.53. Of the 21 angiograms obtained in group 2, 13 (61.9%) showed evidence of extravasation; in 1 patient it was demonstrated only on selective angiograms. We found that in 90% of the cases, the prospective application of our model correctly predicted the angiographically confirmed presence or absence of extravasation. We conclude that in patients with GI hemorrhage, angiographic visualization of extravasation is associated with the pre-embolization SI. Patients with a high SI value should undergo study to facilitate optimal treatment planning.

  17. Integration Analysis of Three Omics Data Using Penalized Regression Methods: An Application to Bladder Cancer.

    Science.gov (United States)

    Pineda, Silvia; Real, Francisco X; Kogevinas, Manolis; Carrato, Alfredo; Chanock, Stephen J; Malats, Núria; Van Steen, Kristel

    2015-12-01

    Omics data integration is becoming necessary to investigate the genomic mechanisms involved in complex diseases. During the integration process, many challenges arise, such as data heterogeneity, a smaller number of individuals than of parameters, multicollinearity, and interpretation and validation of results due to their complexity and lack of knowledge about biological processes. To overcome some of these issues, innovative statistical approaches are being developed. In this work, we propose a permutation-based method to concomitantly assess significance and correct for multiple testing with the MaxT algorithm. This was applied with penalized regression methods (LASSO and ENET) when exploring relationships between common genetic variants, DNA methylation and gene expression measured in bladder tumor samples. The overall analysis flow consisted of three steps: (1) SNPs/CpGs were selected for each gene probe within a 1-Mb window upstream and downstream of the gene; (2) LASSO and ENET were applied to assess the association between each expression probe and the selected SNPs/CpGs in three multivariable models (SNP, CpG, and Global models, the latter integrating SNPs and CpGs); and (3) the significance of each model was assessed using the permutation-based MaxT method. We identified 48 genes whose expression levels were significantly associated with both SNPs and CpGs. Importantly, 36 (75%) of them were replicated in an independent data set (TCGA), and the performance of the proposed method was checked with a simulation study. We further support our results with a biological interpretation based on an enrichment analysis. The approach we propose reduces computational time and is flexible and easy to implement when analyzing several types of omics data. Our results highlight the importance of integrating omics data by applying appropriate statistical strategies to discover new insights into the complex genetic mechanisms involved in disease
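
    The permutation-based MaxT idea (compare an observed max statistic against its distribution under outcome permutation) can be sketched with LASSO coefficients as the statistic. The penalty value, data and permutation count below are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.linear_model import Lasso

def maxT_pvalue(X, y, alpha=0.1, n_perm=100, seed=0):
    """Permutation p-value for the largest |LASSO coefficient| (maxT statistic)."""
    rng = np.random.default_rng(seed)
    obs = np.max(np.abs(Lasso(alpha=alpha).fit(X, y).coef_))
    # Refit on permuted outcomes to build the null distribution of the max
    null = np.array([
        np.max(np.abs(Lasso(alpha=alpha).fit(X, rng.permutation(y)).coef_))
        for _ in range(n_perm)
    ])
    return (1 + np.sum(null >= obs)) / (1 + n_perm)  # add-one correction

# Toy model: one expression probe driven by the first of ten SNPs
rng = np.random.default_rng(4)
X = rng.normal(size=(80, 10))
y = 3.0 * X[:, 0] + rng.normal(0.0, 1.0, 80)
p_val = maxT_pvalue(X, y)
```

    Taking the maximum over coefficients inside each permutation is what gives MaxT its family-wise error control, since the whole model is re-tested as one unit.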

  18. Establishing the change in antibiotic resistance of Enterococcus faecium strains isolated from Dutch broilers by logistic regression and survival analysis

    NARCIS (Netherlands)

    Stegeman, J.A.; Vernooij, J.C.M.; Khalifa, O.A.; Broek, van den J.; Mevius, D.J.

    2006-01-01

    In this study, we investigated the change in the resistance of Enterococcus faecium strains isolated from Dutch broilers against erythromycin and virginiamycin in 1998, 1999 and 2001 by logistic regression analysis and survival analysis. The E. faecium strains were isolated from caecal samples that

  19. A hierarchical task analysis of shoulder arthroscopy for a virtual arthroscopic tear diagnosis and evaluation platform (VATDEP).

    Science.gov (United States)

    Demirel, Doga; Yu, Alexander; Cooper-Baer, Seth; Dendukuri, Aditya; Halic, Tansel; Kockara, Sinan; Kockara, Nizamettin; Ahmadi, Shahryar

    2017-09-01

    Shoulder arthroscopy is a minimally invasive surgical procedure for diagnosis and treatment of shoulder pathology. The procedure is performed with a fiber optic camera, called an arthroscope, and instruments inserted through very small incisions made around the shoulder. The confined shoulder space, unintuitive camera orientation and constrained instrument motions complicate the procedure. Therefore, surgical competence in arthroscopy entails extensive training, especially for psychomotor skill development. Conventional arthroscopy training methods such as mannequins, cadavers or the apprenticeship model have limited use owing to their low realism, cost inefficiency or high risk. However, virtual reality (VR) based surgical simulators offer a realistic, low-cost, risk-free training and assessment platform where trainees can repeatedly perform arthroscopy and receive quantitative feedback on their performance. Therefore, we are developing a VR-based shoulder arthroscopy simulation, specifically for rotator cuff ailments, that can quantify surgical performance. Development of such a VR simulation requires a thorough task analysis that describes the steps and goals of the procedure, and comprehensive metrics for quantitative and objective assessment of skills and surgical technique. We analyzed shoulder arthroscopic rotator cuff surgeries and created a hierarchical task tree. We introduced novel surgery metrics to reduce the subjectivity of the existing grading metrics and performed video analysis of 14 surgery recordings in the operating room (OR). We also analyzed our video analysis results with respect to existing metrics proposed in the literature. We used Pearson's correlation tests to find any correlations among task times, scores and surgery-specific information. We determined strong positive correlation between cleaning time vs difficulty in tying suture, cleaning time vs difficulty in passing suture, cleaning time vs scar

  20. Comparison of multianalyte proficiency test results by sum of ranking differences, principal component analysis, and hierarchical cluster analysis.

    Science.gov (United States)

    Škrbić, Biljana; Héberger, Károly; Durišić-Mladenović, Nataša

    2013-10-01

    Sum of ranking differences (SRD) was applied for comparing multianalyte results obtained by several analytical methods used in one or in different laboratories, i.e., for ranking the overall performances of the methods (or laboratories) in simultaneous determination of the same set of analytes. The data sets for testing the applicability of SRD contained the results reported during one of the proficiency tests (PTs) organized by the EU Reference Laboratory for Polycyclic Aromatic Hydrocarbons (EU-RL-PAH). In this way, the SRD was also tested as an alternative discriminant method to the existing average performance scores used to compare multianalyte PT results. SRD should be used along with the z scores--the most commonly used PT performance statistics. SRD was further developed to handle the same rankings (ties) among laboratories. Two benchmark concentration series were selected as reference: (a) the assigned PAH concentrations (determined precisely beforehand by the EU-RL-PAH) and (b) the averages of all individual PAH concentrations determined by each laboratory. Ranking relative to the assigned values and also to the average (or median) values pointed to the laboratories with the most extreme results, and revealed groups of laboratories with similar overall performances. SRD reveals differences between methods or laboratories even if classical test(s) cannot. The ranking was validated using comparison of ranks by random numbers (a randomization test) and using sevenfold cross-validation, which highlighted the similarities among the (methods used in) laboratories. Principal component analysis and hierarchical cluster analysis justified the findings based on SRD ranking/grouping. If the PAH concentrations are row-scaled (i.e., z scores are analyzed as input for ranking), SRD can still be used for checking the normality of errors. Moreover, cross-validation of SRD on z scores groups the laboratories similarly.
The SRD technique is general in nature, i.e., it can
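
The core SRD computation can be sketched in a few lines: rank each laboratory's reported values, rank the reference series, and sum the absolute rank differences per laboratory. This is a minimal numpy sketch with invented toy numbers, not EU-RL-PAH data, and it omits the randomization-test validation described above.

```python
import numpy as np

def ranks(v):
    """1-based average ranks, handling ties as SRD requires."""
    order = np.argsort(v, kind="stable")
    r = np.empty(len(v), float)
    r[order] = np.arange(1, len(v) + 1)
    for val in np.unique(v):        # average ranks within tied groups
        m = v == val
        r[m] = r[m].mean()
    return r

def srd(results, reference):
    """Sum of ranking differences of each column (lab/method)
    against a reference series, e.g. assigned or average values."""
    ref_rank = ranks(reference)
    return np.array([np.abs(ranks(col) - ref_rank).sum()
                     for col in results.T])

# toy example: 3 labs reporting 5 analytes; reference = row averages
results = np.array([[1.0, 1.1, 2.0],
                    [2.0, 2.1, 1.0],
                    [3.0, 2.9, 3.5],
                    [4.0, 4.2, 5.0],
                    [5.0, 4.8, 4.0]])
scores = srd(results, results.mean(axis=1))   # -> [0, 0, 4]
```

A score of 0 means the laboratory ranks the analytes exactly as the reference does; larger scores flag laboratories with more discordant results.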

  1. Regression models for air pollution and daily mortality: analysis of data from Birmingham, Alabama

    Energy Technology Data Exchange (ETDEWEB)

    Smith, R.L. [University of North Carolina, Chapel Hill, NC (United States). Dept. of Statistics; Davis, J.M. [North Carolina State University, Raleigh, NC (United States). Dept. of Marine, Earth and Atmospheric Sciences; Sacks, J. [National Institute of Statistical Sciences, Research Triangle Park, NC (United States); Speckman, P. [University of Missouri, Columbia, MO (United States). Dept. of Statistics; Styer, P.

    2000-11-01

    In recent years, a very large literature has built up on the human health effects of air pollution. Many studies have been based on time series analyses in which daily mortality counts, or some other measure such as hospital admissions, have been decomposed through regression analysis into contributions based on long-term trend and seasonality, meteorological effects, and air pollution. There has been a particular focus on particulate air pollution represented by PM{sub 10} (particulate matter of aerodynamic diameter 10 {mu}m or less), though in recent years more attention has been given to very small particles of diameter 2.5 {mu}m or less. Most existing studies, however, are based on PM{sub 10} because of the wide availability of monitoring data for this variable. The persistence of the resulting effects across many different studies is widely cited as evidence that this is not mere statistical association, but indeed establishes a causal relationship. These studies have been cited by the United States Environmental Protection Agency (USEPA) as justification for a tightening of particulate matter standards in the 1997 revision of the National Ambient Air Quality Standard (NAAQS), which is the basis for air pollution regulation in the United States. The purpose of the present paper is to propose a systematic approach to the regression analyses that are central to this kind of research. We argue that the results may depend on a number of ad hoc features of the analysis, including which meteorological variables to adjust for, and the manner in which different lagged values of particulate matter are combined into a single 'exposure measure'. We also examine the question of whether the effects are linear or nonlinear, with particular attention to the possibility of a 'threshold effect', i.e. that significant effects occur only above some threshold. These points are illustrated with a data set from Birmingham, Alabama, first cited by

  2. A Vector Auto Regression Model Applied to Real Estate Development Investment: A Statistic Analysis

    National Research Council Canada - National Science Library

    Liu, Fengyun; Matsuno, Shuji; Malekian, Reza; Yu, Jin; Li, Zhixiong

    2016-01-01

    .... The above theoretical model is empirically evidenced with VAR (Vector Auto Regression) methodology. A panel VAR model shows that land leasing and real estate price appreciation positively affect local government general fiscal revenue...

  3. Application of ordinal logistic regression analysis in determining risk factors of child malnutrition in Bangladesh

    OpenAIRE

    Das Sumonkanti; Rahman Rajwanur M

    2011-01-01

    Abstract Background The study attempts to develop an ordinal logistic regression (OLR) model to identify the determinants of child malnutrition instead of developing traditional binary logistic regression (BLR) model using the data of Bangladesh Demographic and Health Survey 2004. Methods Based on weight-for-age anthropometric index (Z-score) child nutrition status is categorized into three groups-severely undernourished (< -3.0), moderately undernourished (-3.0 to -2.01) and nourished (≥-2.0...

  4. Regression analysis to predict growth performance from dietary net energy in growing-finishing pigs.

    Science.gov (United States)

    Nitikanchana, S; Dritz, S S; Tokach, M D; DeRouchey, J M; Goodband, R D; White, B J

    2015-06-01

    Data from 41 trials with multiple energy levels (285 observations) were used in a meta-analysis to predict growth performance based on dietary NE concentration. Nutrient and energy concentrations in all diets were estimated using the NRC ingredient library. Predictor variables examined for best-fit models using the Akaike information criterion included linear and quadratic terms of NE, BW, CP, standardized ileal digestible (SID) Lys, crude fiber, NDF, ADF, fat, ash, and their interactions. The initial best-fit models included interactions between NE and CP or SID Lys. After removal of the observations that fed SID Lys below the suggested requirement, these terms were no longer significant. Including dietary fat in the model with NE and BW significantly improved the G:F prediction model, indicating that NE may underestimate the influence of fat on G:F. The meta-analysis indicated that, as long as diets are adequate for other nutrients (i.e., Lys), dietary NE is adequate to predict changes in ADG across different dietary ingredients and conditions. The analysis indicates that ADG increases with increasing dietary NE and BW but decreases when BW is above 87 kg. The G:F ratio improves with increasing dietary NE and fat but decreases with increasing BW. The regression equations were then evaluated by comparing the actual and predicted performance of 543 finishing pigs in 2 trials fed 5 dietary treatments that included 3 different levels of NE, achieved by adding wheat middlings, soybean hulls, dried distillers grains with solubles (DDGS; 8 to 9% oil), or choice white grease (CWG) to a corn-soybean meal-based diet. Diets were 1) 30% DDGS, 20% wheat middlings, and 4 to 5% soybean hulls (low energy); 2) 20% wheat middlings and 4 to 5% soybean hulls (low energy); 3) a corn-soybean meal diet (medium energy); 4) diet 2 supplemented with 3.7% CWG to equalize the NE level to diet 3 (medium energy); and 5) a corn-soybean meal diet with 3.7% CWG (high energy). 
Only small differences were observed
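
Screening candidate predictors by Akaike information criterion, as in the meta-analysis above, can be sketched as follows. The data, coefficients, and variable names here are simulated placeholders, not the trial data; the point is only the mechanics of comparing nested OLS models by AIC.

```python
import numpy as np

def aic_linear(X, y):
    """AIC for an OLS fit with Gaussian errors: n*ln(RSS/n) + 2k."""
    X1 = np.column_stack([np.ones(len(y)), X])   # add intercept
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    rss = np.sum((y - X1 @ beta) ** 2)
    return len(y) * np.log(rss / len(y)) + 2 * X1.shape[1]

rng = np.random.default_rng(0)
n = 200
ne = rng.uniform(9, 11, n)            # hypothetical dietary NE, MJ/kg
bw = rng.uniform(30, 120, n)          # hypothetical body weight, kg
adg = 0.4 + 0.05 * ne + 0.002 * bw + rng.normal(0, 0.05, n)

models = {
    "NE only": aic_linear(ne[:, None], adg),
    "NE + BW": aic_linear(np.column_stack([ne, bw]), adg),
}
best = min(models, key=models.get)    # lowest AIC wins
```

AIC trades goodness of fit (RSS) against model size (k parameters), so a predictor is retained only when its fit improvement outweighs the complexity penalty.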

  5. Numerical analysis of fuel regression rate distribution characteristics in hybrid rocket motors with different fuel types

    Institute of Scientific and Technical Information of China (English)

    LI; XinTian; TIAN; Hui; CAI; GuoBiao

    2013-01-01

    This paper presents three-dimensional numerical simulations of a hybrid rocket motor with a hydrogen peroxide (HP) and hydroxyl-terminated polybutadiene (HTPB) propellant combination and investigates the fuel regression rate distribution characteristics of different fuel types. The numerical models are established by coupling the Navier-Stokes equations with turbulence, chemical reactions, solid fuel pyrolysis and solid-gas interfacial boundary conditions. Simulation results, including temperature contours and fuel regression rate distributions, are presented for the tube, star and wagon wheel grains. The results demonstrate that the axial trends of the regression rate are similar for all fuel types: the rate decreases sharply near the leading edge of the fuel and then gradually increases with increasing axial location. The regression rates of the star and wagon wheel grains show apparent three-dimensional characteristics, and they are higher in the regions of the fuel surface near the central core oxidizer flow. The average regression rates increase as the oxidizer mass flux rises for all fuel types. However, under the same oxidizer mass flux, the average regression rates of the star and wagon wheel grains are much larger than that of the tube grain due to their lower hydraulic diameters.

  6. When to Use Hierarchical Linear Modeling

    Directory of Open Access Journals (Sweden)

    Veronika Huta

    2014-04-01

    Full Text Available Previous publications on hierarchical linear modeling (HLM) have provided guidance on how to perform the analysis, yet there is relatively little information on two questions that arise even before analysis: Does HLM apply to one’s data and research question? And if it does apply, how does one choose between HLM and other methods sometimes used in these circumstances, including multiple regression, repeated-measures or mixed ANOVA, and structural equation modeling or path analysis? The purpose of this tutorial is to briefly introduce HLM and then to review some of the considerations that are helpful in answering these questions, including the nature of the data, the model to be tested, and the information desired on the output. Some examples of how the same analysis could be performed in HLM, repeated-measures or mixed ANOVA, and structural equation modeling or path analysis are also provided.
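
One common first check on "does HLM apply to one's data" is the intraclass correlation: how much of the outcome variance lies between clusters rather than within them. The sketch below (not from the tutorial itself; data are simulated) computes ICC(1) from a balanced one-way ANOVA; a non-trivial ICC is the usual signal that observations are not independent and a multilevel model is worth considering.

```python
import numpy as np

def icc_oneway(groups):
    """ICC(1) from a balanced one-way ANOVA:
    (MSB - MSW) / (MSB + (n-1)*MSW) for k groups of n observations."""
    k = len(groups)                      # number of clusters
    n = len(groups[0])                   # observations per cluster
    grand = np.mean([g.mean() for g in groups])
    msb = n * sum((g.mean() - grand) ** 2 for g in groups) / (k - 1)
    msw = sum(((g - g.mean()) ** 2).sum() for g in groups) / (k * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)

rng = np.random.default_rng(2)
# 30 clusters of 20 observations; cluster means (sd 1) and
# within-cluster noise (sd 1) contribute equally, so true ICC = 0.5
clustered = [rng.normal(m, 1.0, 20) for m in rng.normal(0, 1.0, 30)]
icc = icc_oneway(clustered)
```

An ICC near 0 suggests ordinary regression may suffice; a substantial ICC means single-level methods will understate standard errors.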

  7. Expert Involvement Predicts mHealth App Downloads: Multivariate Regression Analysis of Urology Apps

    Science.gov (United States)

    Osório, Luís; Cavadas, Vitor; Fraga, Avelino; Carrasquinho, Eduardo; Cardoso de Oliveira, Eduardo; Castelo-Branco, Miguel; Roobol, Monique J

    2016-01-01

    Background Urological mobile medical (mHealth) apps are gaining popularity with both clinicians and patients. mHealth is a rapidly evolving and heterogeneous field, with some urology apps being downloaded over 10,000 times and others not at all. The factors that contribute to medical app downloads have yet to be identified, including the hypothetical influence of expert involvement in app development. Objective The objective of our study was to identify predictors of the number of urology app downloads. Methods We reviewed urology apps available in the Google Play Store and collected publicly available data. Multivariate ordinal logistic regression evaluated the effect of publicly available app variables on the number of apps being downloaded. Results Of 129 urology apps eligible for study, only 2 (1.6%) had >10,000 downloads, with half having ≤100 downloads and 4 (3.1%) having none at all. Apps developed with expert urologist involvement (P=.003), optional in-app purchases (P=.01), higher user rating (P<.001), and more user reviews (P<.001) were more likely to be installed. App cost was inversely related to the number of downloads (P<.001). Only data from the Google Play Store and the developers’ websites, but not other platforms, were publicly available for analysis, and the level and nature of expert involvement was not documented. Conclusions The explicit participation of urologists in app development is likely to enhance its chances to have a higher number of downloads. This finding should help in the design of better apps and further promote urologist involvement in mHealth. Official certification processes are required to ensure app quality and user safety. PMID:27421338

  8. Risk factors for violence in psychosis: systematic review and meta-regression analysis of 110 studies.

    Science.gov (United States)

    Witt, Katrina; van Dorn, Richard; Fazel, Seena

    2013-01-01

    Previous reviews on risk and protective factors for violence in psychosis have produced contrasting findings. There is therefore a need to clarify the direction and strength of association of risk and protective factors for violent outcomes in individuals with psychosis. We conducted a systematic review and meta-analysis using 6 electronic databases (CINAHL, EBSCO, EMBASE, Global Health, PsycINFO, PUBMED) and Google Scholar. Studies were identified that reported factors associated with violence in adults diagnosed, using DSM or ICD criteria, with schizophrenia and other psychoses. We considered non-English language studies and dissertations. Risk and protective factors were meta-analysed if reported in three or more primary studies. Meta-regression examined sources of heterogeneity. A novel meta-epidemiological approach was used to group similar risk factors into one of 10 domains. Sub-group analyses were then used to investigate whether risk domains differed for studies reporting severe violence (rather than aggression or hostility) and studies based in inpatient (rather than outpatient) settings. There were 110 eligible studies reporting on 45,533 individuals, 8,439 (18.5%) of whom were violent. A total of 39,995 (87.8%) were diagnosed with schizophrenia, 209 (0.4%) were diagnosed with bipolar disorder, and 5,329 (11.8%) were diagnosed with other psychoses. Dynamic (or modifiable) risk factors included hostile behaviour, recent drug misuse, and non-adherence with psychological therapies (all p values < 0.001). In studies reporting severe violence, these associations did not change materially. In studies investigating inpatient violence, associations differed in strength but not direction. Certain dynamic risk factors are strongly associated with increased violence risk in individuals with psychosis and their role in risk assessment and management warrants further examination.

  9. Principal component regression and linear mixed model in association analysis of structured samples: competitors or complements?

    Science.gov (United States)

    Zhang, Yiwei; Pan, Wei

    2015-03-01

    Genome-wide association studies (GWAS) have been established as a major tool to identify genetic variants associated with complex traits, such as common diseases. However, GWAS may suffer from false positives and false negatives due to confounding population structures, including known or unknown relatedness. Another important issue is unmeasured environmental risk factors. Among many methods for adjusting for population structures, two approaches stand out: one is principal component regression (PCR) based on principal component analysis, which is perhaps the most popular due to its early appearance, simplicity, and general effectiveness; the other is based on a linear mixed model (LMM) that has emerged recently as perhaps the most flexible and effective, especially for samples with complex structures as in model organisms. As shown previously, the PCR approach can be regarded as an approximation to an LMM; such an approximation depends on the number of the top principal components (PCs) used, the choice of which is often difficult in practice. Hence, in the presence of population structure, the LMM appears to outperform the PCR method. However, due to the different treatments of fixed vs. random effects in the two approaches, we show an advantage of PCR over LMM: in the presence of an unknown but spatially confined environmental confounder (e.g., environmental pollution or lifestyle), the PCs may be able to implicitly and effectively adjust for the confounder whereas the LMM cannot. Accordingly, to adjust for both population structures and nongenetic confounders, we propose a hybrid method combining the use and, thus, strengths of PCR and LMM. We use real genotype data and simulated phenotypes to confirm the above points, and establish the superior performance of the hybrid method across all scenarios.
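
The PCR side of the comparison can be sketched in numpy: the top principal components of the (standardized) genotype matrix pick up the population-structure axis, and regressing the trait on those component scores adjusts for it. The genotypes and phenotypes below are simulated with an artificial two-subpopulation structure; this is an illustration of the mechanics, not the authors' hybrid method.

```python
import numpy as np

def pc_regression(X, y, n_pcs):
    """Principal component regression: project standardized X onto its
    top principal components, then fit OLS on those scores."""
    Xs = (X - X.mean(0)) / X.std(0)
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    scores = Xs @ Vt[:n_pcs].T                    # top-PC scores
    A = np.column_stack([np.ones(len(y)), scores])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, A @ beta

rng = np.random.default_rng(3)
# two "subpopulations" with shifted feature means (population structure)
X = np.vstack([rng.normal(0, 1, (50, 10)), rng.normal(1, 1, (50, 10))])
# phenotype driven mainly by subpopulation membership (confounding)
y = np.r_[np.zeros(50), np.ones(50)] + rng.normal(0, 0.1, 100)

beta, fitted = pc_regression(X, y, n_pcs=2)
r2 = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
```

Because PC1 aligns with the between-group axis, the PCs explain most of the structure-driven phenotype variance, which is exactly what makes them useful covariates when testing individual variants.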

  10. Diabetes mortality in Serbia, 1991-2015 (a nationwide study): A joinpoint regression analysis.

    Science.gov (United States)

    Ilic, Milena; Ilic, Irena

    2017-02-01

    The aim of this study was to analyze the mortality trends of diabetes mellitus in Serbia (excluding the Autonomous Province of Kosovo and Metohia). A population-based cross-sectional study analyzing diabetes mortality in Serbia in the period 1991-2015 was carried out based on official data. The age-standardized mortality rates (per 100,000) were calculated by direct standardization, using the European Standard Population. Average annual percentage of change (AAPC) and the corresponding 95% confidence interval (CI) were computed using joinpoint regression analysis. More than 63,000 diabetes deaths (about 27,000 in men and 36,000 in women) occurred in Serbia from 1991 to 2015. Death rates from diabetes were almost equal in men and in women (about 24.0 per 100,000), placing Serbia among the countries with the highest diabetes mortality rates in Europe. Since 1991, mortality from diabetes in men significantly increased by +1.2% per year (95% CI 0.7-1.7), but non-significantly increased in women by +0.2% per year (95% CI -0.4 to 0.7). Increased trends in diabetes type 1 mortality rates were significant in both genders in Serbia. Trends in mortality for diabetes type 2 showed a significant decrease in both genders since 2010. Given that diabetes mortality trends showed different patterns during the studied period, our results imply that further observation of the trend is needed. Copyright © 2016 Primary Care Diabetes Europe. Published by Elsevier Ltd. All rights reserved.
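
Joinpoint software fits a sequence of log-linear segments and reports an annual percent change per segment. The per-segment computation reduces to the sketch below, shown for a single segment with synthetic rates (not the Serbian data); detecting the joinpoints themselves requires the full segmented-regression search, which is omitted here.

```python
import numpy as np

def annual_percent_change(years, rates):
    """APC from a log-linear fit ln(rate) = a + b*year:
    APC = (exp(b) - 1) * 100, the per-segment metric joinpoint reports."""
    b, a = np.polyfit(years, np.log(rates), 1)
    return (np.exp(b) - 1) * 100

years = np.arange(1991, 2016)
rates = 20.0 * 1.012 ** (years - 1991)   # synthetic 1.2%/year increase
apc = annual_percent_change(years, rates)   # -> 1.2
```

Working on the log scale is what turns a constant percentage change into a straight line, so the fitted slope converts directly to a percent change per year.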

  11. Source apportionment based on an atmospheric dispersion model and multiple linear regression analysis

    Science.gov (United States)

    Fushimi, Akihiro; Kawashima, Hiroto; Kajihara, Hideo

    Understanding the contribution of each emission source of air pollutants to ambient concentrations is important to establish effective measures for risk reduction. We have developed a source apportionment method based on an atmospheric dispersion model and multiple linear regression analysis (MLR) in conjunction with ambient concentrations simultaneously measured at points in a grid network. We used a Gaussian plume dispersion model developed by the US Environmental Protection Agency called the Industrial Source Complex model (ISC) in the method. Our method does not require emission amounts or source profiles. The method was applied to the case of benzene in the vicinity of the Keiyo Central Coastal Industrial Complex (KCCIC), one of the biggest industrial complexes in Japan. Benzene concentrations were simultaneously measured from December 2001 to July 2002 at sites in a grid network established in the KCCIC and the surrounding residential area. The method was used to estimate benzene emissions from the factories in the KCCIC and from automobiles along a section of a road, and then the annual average contribution of the KCCIC to the ambient concentrations was estimated based on the estimated emissions. The estimated contributions of the KCCIC were 65% inside the complex, 49% at 0.5-km sites, 35% at 1.5-km sites, 20% at 3.3-km sites, and 9% at a 5.6-km site. The estimated concentrations agreed well with the measured values. The estimated emissions from the factories and the road were slightly larger than those reported in the first Pollutant Release and Transfer Register (PRTR). These results support the reliability of our method. This method can be applied to other chemicals or regions to achieve reasonable source apportionments.
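
The regression step of such a method can be sketched as follows: each predictor column holds the dispersion-modeled concentration at the monitoring sites for a unit emission from one source, so the fitted coefficients act as emission estimates and the per-source products give contributions. All numbers below are toy values, not the KCCIC benzene data, and a single least-squares solve stands in for the full ISC-plus-MLR workflow.

```python
import numpy as np

rng = np.random.default_rng(4)
# D[i, j]: modeled concentration at site i per unit emission from source j
D = rng.uniform(0.1, 1.0, (12, 2))          # 12 sites, 2 sources (toy)
true_emissions = np.array([5.0, 2.0])
# "measured" concentrations = dispersion prediction + measurement noise
c = D @ true_emissions + rng.normal(0, 0.01, 12)

# regress measured concentrations on unit-emission predictions:
# coefficients estimate the unknown emission rates (no source profiles needed)
emissions, *_ = np.linalg.lstsq(D, c, rcond=None)

contrib = D * emissions                      # per-source contribution per site
share = contrib[:, 0] / contrib.sum(1)       # e.g. source 1's share at each site
```

The contribution shares computed this way are the analogue of the distance-dependent KCCIC percentages reported above.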

  12. Malignant lymphatic and hematopoietic neoplasms mortality in Serbia, 1991-2010: a joinpoint regression analysis.

    Directory of Open Access Journals (Sweden)

    Milena Ilic

    Full Text Available BACKGROUND: Limited data on mortality from malignant lymphatic and hematopoietic neoplasms have been published for Serbia. METHODS: The study covered population of Serbia during the 1991-2010 period. Mortality trends were assessed using the joinpoint regression analysis. RESULTS: Trend for overall death rates from malignant lymphoid and haematopoietic neoplasms significantly decreased: by -2.16% per year from 1991 through 1998, and then significantly increased by +2.20% per year for the 1998-2010 period. The growth during the entire period was on average +0.8% per year (95% CI 0.3 to 1.3. Mortality was higher among males than among females in all age groups. According to the comparability test, mortality trends from malignant lymphoid and haematopoietic neoplasms in men and women were parallel (final selected model failed to reject parallelism, P = 0.232. Among younger Serbian population (0-44 years old in both sexes: trends significantly declined in males for the entire period, while in females 15-44 years of age mortality rates significantly declined only from 2003 onwards. Mortality trend significantly increased in elderly in both genders (by +1.7% in males and +1.5% in females in the 60-69 age group, and +3.8% in males and +3.6% in females in the 70+ age group. According to the comparability test, mortality trend for Hodgkin's lymphoma differed significantly from mortality trends for all other types of malignant lymphoid and haematopoietic neoplasms (P<0.05. CONCLUSION: Unfavourable mortality trend in Serbia requires targeted intervention for risk factors control, early diagnosis and modern therapy.

  13. Oral health-related risk behaviours and attitudes among Croatian adolescents--multiple logistic regression analysis.

    Science.gov (United States)

    Spalj, Stjepan; Spalj, Vedrana Tudor; Ivanković, Luida; Plancak, Darije

    2014-03-01

    The aim of this study was to explore the patterns of oral health-related risk behaviours in relation to dental status, attitudes, motivation and knowledge among Croatian adolescents. The assessment was conducted in a sample of 750 male subjects - military recruits aged 18-28 in Croatia - using a questionnaire and clinical examination. Mean number of decayed, missing and filled teeth (DMFT) and Significant Caries Index (SiC) were calculated. Multiple logistic regression models were created for analysis. Although the models of risk behaviours were statistically significant, their explanatory values were quite low. Five of them--rarely toothbrushing, not using hygiene auxiliaries, rarely visiting the dentist, toothache as a primary reason to visit the dentist, and demand for tooth extraction due to toothache--had the highest explanatory values, ranging from 21-29%, and correctly classified 73-89% of subjects. Toothache as a primary reason to visit the dentist, extraction as preferable therapy when toothache occurs, not having brushing education in school and frequent gingival bleeding were significantly related to the population with high caries experience (DMFT > or = 14 according to SiC), producing odds ratios of 1.6 (95% CI 1.07-2.46), 2.1 (95% CI 1.29-3.25), 1.8 (95% CI 1.21-2.74) and 2.4 (95% CI 1.21-2.74) respectively. The DMFT > or = 14 model had a low explanatory value of 6.5% and correctly classified 83% of subjects. It can be concluded that oral health-related risk behaviours are interrelated. Poor association was seen between attitudes concerning oral health and oral health-related risk behaviours, indicating insufficient motivation to change lifestyle and habits. Self-reported oral hygiene habits were not strongly related to dental status.
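
The odds ratios with 95% confidence intervals reported in such analyses come from the logistic-regression coefficients; for a single binary predictor the same quantity can be computed directly from a 2x2 table, as in this sketch. The counts below are hypothetical, chosen only so the arithmetic is easy to follow, and the Woolf (log-OR) interval is one standard choice.

```python
import numpy as np

def odds_ratio(a, b, c, d):
    """OR and Woolf 95% CI for a 2x2 table:
    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    or_ = (a * d) / (b * c)
    se = np.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(OR)
    lo, hi = np.exp(np.log(or_) + np.array([-1.96, 1.96]) * se)
    return or_, lo, hi

# hypothetical counts: toothache-driven dental visits vs high caries status
or_, lo, hi = odds_ratio(60, 90, 50, 120)   # OR = 1.6
```

An interval whose lower bound stays above 1, as here, is what makes a factor like those listed in the abstract count as significantly associated with high caries experience.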

  14. Regression analysis of time trends in perinatal mortality in Germany 1980-1993.

    Science.gov (United States)

    Scherb, H; Weigelt, E; Brüske-Hohlfeld, I

    2000-02-01

    Numerous investigations have been carried out on the possible impact of the Chernobyl accident on the prevalence of anomalies at birth and on perinatal mortality. In many cases the studies were aimed at the detection of differences in pregnancy outcome measurements between regions or time periods. Most authors conclude that there is no evidence of a detrimental physical effect on congenital anomalies or other outcomes of pregnancy following the accident. In this paper, we report on statistical analyses of time trends of perinatal mortality in Germany. Our main intention is to investigate whether perinatal mortality, as reflected in official records, was increased in 1987 as a possible effect of the Chernobyl accident. We show that, in Germany as a whole, there was a significantly elevated perinatal mortality proportion in 1987 as compared to the trend function. The increase is 4.8% (p = 0.0046) of the expected perinatal death proportion for 1987. Even more pronounced levels of 8.2% (p = 0.0458) and 8.5% (p = 0.0702) may be found in the more highly contaminated areas of the former German Democratic Republic (GDR), including West Berlin, and of Bavaria, respectively. To investigate the impact of statistical models on results, we applied three standard regression techniques. The observed significant increase in 1987 is independent of the statistical model used. Stillbirth proportions show essentially the same behavior as perinatal death proportions, but the results for all of Germany are nonsignificant due to the smaller numbers involved. Analysis of the association of stillbirth proportions with the (137)Cs deposition on a district level in Bavaria discloses a significant relationship. Our results are in contrast to those of many analyses of the health consequences of the Chernobyl accident and contradict the present radiobiologic knowledge. As we are dealing with highly aggregated data, other causes or artifacts may explain the observed effects. Hence, the findings

  15. Flexible survival regression modelling

    DEFF Research Database (Denmark)

    Cortese, Giuliana; Scheike, Thomas H; Martinussen, Torben

    2009-01-01

    Regression analysis of survival data, and more generally event history data, is typically based on Cox's regression model. We here review some recent methodology, focusing on the limitations of Cox's regression model. The key limitation is that the model is not well suited to represent time-varyi...

  16. Hierarchical materials: Background and perspectives

    DEFF Research Database (Denmark)

    2016-01-01

    Hierarchical design draws inspiration from analysis of biological materials and has opened new possibilities for enhancing performance and enabling new functionalities and extraordinary properties. With the development of nanotechnology, the necessary technological requirements for the manufactur...

  17. Hierarchical clustering for graph visualization

    CERN Document Server

    Clémençon, Stéphan; Rossi, Fabrice; Tran, Viet Chi

    2012-01-01

    This paper describes a graph visualization methodology based on hierarchical maximal modularity clustering, with interactive and significant coarsening and refining possibilities. An application of this method to HIV epidemic analysis in Cuba is outlined.

  18. Analysis of the Performance of Hierarchical Teaching Reform

    Institute of Scientific and Technical Information of China (English)

    夏莉

    2012-01-01

    Through statistical analysis and comparison of examination papers for the Economic Mathematics course, this paper finds that the main factors influencing students' achievement are their elementary mathematics background and the choice of teaching mode. Practice with the hierarchical teaching reform has strengthened individualized education and targeted cultivation of students and has improved the teaching quality of the Economic Mathematics course.

  19. [Study on the application of ridge regression to near-infrared spectroscopy quantitative analysis and optimum wavelength selection].

    Science.gov (United States)

    Zhang, Man; Liu, Xu-Hua; He, Xiong-Kui; Zhang, Lu-Da; Zhao, Long-Lian; Li, Jun-Hui

    2010-05-01

    In the present paper, taking 66 wheat samples as test materials, ridge regression for near-infrared (NIR) spectroscopy quantitative analysis was investigated. A NIR-ridge regression model for determining protein content was built from the NIR spectral data of 44 wheat samples and used to predict the protein content of the remaining 22 samples. The average relative error between the predictions and the Kjeldahl values (chemical analysis values) was 0.01518. The predictions were also compared with those obtained by the partial least squares (PLS) method, showing that ridge regression is a suitable choice for NIR spectroscopy quantitative analysis. Furthermore, one effective way to reduce the disturbance that irrelevant information causes to the predictive capacity of a quantitative model is to screen the wavelength information. To select the spectral information that carries more compositional information and correlates more strongly with the composition or nature of the samples, and thereby improve the model's predictive accuracy, ridge regression was also used for wavelength selection. A NIR-ridge regression model built on 4 wavelength points, selected from 1297 wavelength points, was used to predict the protein content of the 22 samples; the average relative error was 0.0137 and the correlation coefficient between the predictions and the Kjeldahl values reached 0.9817. The results show that ridge regression can screen the essential wavelength information out of a large amount of spectral data. It not only simplifies the model and effectively reduces the disturbance caused by collinear information, but also has practical significance for designing special NIR analysis instruments for specific components in particular samples.
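The closed-form ridge estimator behind this kind of analysis is beta = (X'X + lam*I)^{-1} X'y. The sketch below is illustrative only: the nearly collinear predictors stand in for neighboring NIR wavelengths, the data are synthetic, and only the calibration-set size (44) is taken from the abstract.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: beta = (X'X + lam*I)^-1 X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
n = 44                                    # calibration-set size, as in the abstract
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)       # nearly collinear "wavelengths"
X = np.column_stack([x1, x2])
y = 2.0 * x1 + 1.0 * x2 + 0.1 * rng.normal(size=n)

beta_ols = ridge_fit(X, y, 0.0)           # lam=0 reduces to OLS, unstable here
beta_ridge = ridge_fit(X, y, 1.0)         # shrunken, stabilised estimate
print(beta_ols, beta_ridge)
```

Increasing lam shrinks the coefficient vector toward zero, which is exactly what tames the collinearity among neighboring wavelengths at the cost of a small bias.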

  20. Significant drivers of the virtual water trade evaluated with a multivariate regression analysis

    Science.gov (United States)

    Tamea, Stefania; Laio, Francesco; Ridolfi, Luca

    2014-05-01

    International trade of food is vital for the food security of many countries, which rely on trade to compensate for agricultural production insufficient to feed their populations. At the same time, food trade has implications for the distribution and use of water resources, because through the international trade of food commodities countries virtually displace the water used for food production, known as "virtual water". Trade thus implies a network of virtual water fluxes from exporting to importing countries, which has been estimated to displace more than 2 billion m3 of water per year, or about 2% of the annual global precipitation over land. It is thus important to adequately identify the dynamics and the controlling factors of the virtual water trade, as it supports and enables world food security. Using the FAOSTAT database of international trade and the virtual water content available from the Water Footprint Network, we reconstructed 25 years (1986-2010) of virtual water fluxes. We then analyzed the dependence of exchanged fluxes on a set of major relevant factors, which includes: population, gross domestic product, arable land, virtual water embedded in agricultural production and dietary consumption, and geographical distance between countries. Significant drivers were identified by means of a multivariate regression analysis, applied separately to the export and import fluxes of each country; temporal trends are outlined, and the relative importance of the drivers is assessed by a commonality analysis. Results indicate that population, gross domestic product and geographical distance are the major drivers of virtual water fluxes, with a minor (but non-negligible) contribution from the agricultural production of exporting countries. These drivers have become relevant for an increasing number of countries over the years, with an increasing share of variance explained by the distance between countries and a decreasing role of the gross
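A gravity-style log-log multivariate regression of the kind the abstract describes can be sketched with ordinary least squares. Everything below is hypothetical: the country-pair data are synthetic, and the elasticities (0.8 for population, 0.5 for GDP, -1.2 for distance) are made-up values used only to show that the regression recovers them.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200                                   # hypothetical country-pair observations

# Synthetic drivers on a log scale: population, GDP, geographical distance
log_pop = rng.normal(16.0, 1.0, n)
log_gdp = rng.normal(24.0, 1.0, n)
log_dist = rng.normal(8.0, 0.5, n)

# Gravity-style data-generating process: flux grows with population and GDP,
# and decays with distance (assumed coefficients, for illustration only)
log_flux = 1.0 + 0.8 * log_pop + 0.5 * log_gdp - 1.2 * log_dist \
           + rng.normal(0.0, 0.1, n)

# Multivariate OLS: design matrix with intercept, solved by least squares
D = np.column_stack([np.ones(n), log_pop, log_gdp, log_dist])
beta, *_ = np.linalg.lstsq(D, log_flux, rcond=None)
print(beta)   # estimated [intercept, population, GDP, distance] elasticities
```

In the study itself such a regression is fitted separately to each country's export and import fluxes, and a commonality analysis then apportions the explained variance among the correlated drivers.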