Repeated Results Analysis for Middleware Regression Benchmarking
Czech Academy of Sciences Publication Activity Database
Bulej, Lubomír; Kalibera, T.; Tůma, P.
2005-01-01
Roč. 60, - (2005), s. 345-358 ISSN 0166-5316 R&D Projects: GA ČR GA102/03/0672 Institutional research plan: CEZ:AV0Z10300504 Keywords : middleware benchmarking * regression benchmarking * regression testing Subject RIV: JD - Computer Applications, Robotics Impact factor: 0.756, year: 2005
Augmenting Data with Published Results in Bayesian Linear Regression
de Leeuw, Christiaan; Klugkist, Irene
2012-01-01
In most research, linear regression analyses are performed without taking into account published results (i.e., reported summary statistics) of similar previous studies. Although the prior density in Bayesian linear regression could accommodate such prior knowledge, formal models for doing so are absent from the literature. The goal of this…
Mapping the results of local statistics: Using geographically weighted regression
Directory of Open Access Journals (Sweden)
Stephen A. Matthews
2012-03-01
Full Text Available BACKGROUND The application of geographically weighted regression (GWR - a local spatial statistical technique used to test for spatial nonstationarity - has grown rapidly in the social, health, and demographic sciences. GWR is a useful exploratory analytical tool that generates a set of location-specific parameter estimates which can be mapped and analysed to provide information on spatial nonstationarity in the relationships between predictors and the outcome variable. OBJECTIVE A major challenge to users of GWR methods is how best to present and synthesize the large number of mappable results, specifically the local parameter parameter estimates and local t-values, generated from local GWR models. We offer an elegant solution. METHODS This paper introduces a mapping technique to simultaneously display local parameter estimates and local t-values on one map based on the use of data selection and transparency techniques. We integrate GWR software and GIS software package (ArcGIS and adapt earlier work in cartography on bivariate mapping. We compare traditional mapping strategies (i.e., side-by-side comparison and isoline overlay maps with our method using an illustration focusing on US county infant mortality data. CONCLUSIONS The resultant map design is more elegant than methods used to date. This type of map presentation can facilitate the exploration and interpretation of nonstationarity, focusing map reader attention on the areas of primary interest.
Two SPSS programs for interpreting multiple regression results.
Lorenzo-Seva, Urbano; Ferrando, Pere J; Chico, Eliseo
2010-02-01
When multiple regression is used in explanation-oriented designs, it is very important to determine both the usefulness of the predictor variables and their relative importance. Standardized regression coefficients are routinely provided by commercial programs. However, they generally function rather poorly as indicators of relative importance, especially in the presence of substantially correlated predictors. We provide two user-friendly SPSS programs that implement currently recommended techniques and recent developments for assessing the relevance of the predictors. The programs also allow the user to take into account the effects of measurement error. The first program, MIMR-Corr.sps, uses a correlation matrix as input, whereas the second program, MIMR-Raw.sps, uses the raw data and computes bootstrap confidence intervals of different statistics. The SPSS syntax, a short manual, and data files related to this article are available as supplemental materials from http://brm.psychonomic-journals.org/content/supplemental.
Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul
2015-11-04
Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.
Directory of Open Access Journals (Sweden)
Svetlana O. Musienko
2017-03-01
Full Text Available Objective to develop the economicmathematical model of the dependence of revenue on other balance sheet items taking into account the sectoral affiliation of the companies. Methods using comparative analysis the article studies the existing approaches to the construction of the company management models. Applying the regression analysis and the least squares method which is widely used for financial management of enterprises in Russia and abroad the author builds a model of the dependence of revenue on other balance sheet items taking into account the sectoral affiliation of the companies which can be used in the financial analysis and prediction of small enterprisesrsquo performance. Results the article states the need to identify factors affecting the financial management efficiency. The author analyzed scientific research and revealed the lack of comprehensive studies on the methodology for assessing the small enterprisesrsquo management while the methods used for large companies are not always suitable for the task. The systematized approaches of various authors to the formation of regression models describe the influence of certain factors on the company activity. It is revealed that the resulting indicators in the studies were revenue profit or the company relative profitability. The main drawback of most models is the mathematical not economic approach to the definition of the dependent and independent variables. Basing on the analysis it was determined that the most correct is the model of dependence between revenues and total assets of the company using the decimal logarithm. The model was built using data on the activities of the 507 small businesses operating in three spheres of economic activity. Using the presented model it was proved that there is direct dependence between the sales proceeds and the main items of the asset balance as well as differences in the degree of this effect depending on the economic activity of small
Sampson, Maureen L; Gounden, Verena; van Deventer, Hendrik E; Remaley, Alan T
2016-02-01
The main drawback of the periodic analysis of quality control (QC) material is that test performance is not monitored in time periods between QC analyses, potentially leading to the reporting of faulty test results. The objective of this study was to develop a patient based QC procedure for the more timely detection of test errors. Results from a Chem-14 panel measured on the Beckman LX20 analyzer were used to develop the model. Each test result was predicted from the other 13 members of the panel by multiple regression, which resulted in correlation coefficients between the predicted and measured result of >0.7 for 8 of the 14 tests. A logistic regression model, which utilized the measured test result, the predicted test result, the day of the week and time of day, was then developed for predicting test errors. The output of the logistic regression was tallied by a daily CUSUM approach and used to predict test errors, with a fixed specificity of 90%. The mean average run length (ARL) before error detection by CUSUM-Logistic Regression (CSLR) was 20 with a mean sensitivity of 97%, which was considerably shorter than the mean ARL of 53 (sensitivity 87.5%) for a simple prediction model that only used the measured result for error detection. A CUSUM-Logistic Regression analysis of patient laboratory data can be an effective approach for the rapid and sensitive detection of clinical laboratory errors. Published by Elsevier Inc.
Linear regression metamodeling as a tool to summarize and present simulation model results.
Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M
2013-10-01
Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
Directory of Open Access Journals (Sweden)
Guangbin Liu
2015-04-01
Full Text Available Wool is an important material in textile manufacturing. In order to investigate the intrinsic factors that regulate wool follicle cycling and wool fiber properties, Illumina sequencing was performed on wool follicle bulb samples from the middle anagen, catagen and late telogen/early anagen phases. In total, 13,898 genes were identified. KRTs and KRTAPs are the most highly expressed gene families in wool follicle bulb. In addition, 438 and 203 genes were identified to be differentially expressed in wool follicle bulb samples from the middle anagen phase compared to the catagen phase and the samples from the catagen phase compared to the late telogen/early anagen phase, respectively. Finally, our data revealed that two groups of genes presenting distinct expression patterns during the phase transformation may have important roles for wool follicle bulb regression and regeneration. In conclusion, our results demonstrated the gene expression patterns in the wool follicle bulb and add new data towards an understanding of the mechanisms involved in wool fiber growth in sheep.
Posterior consistency for Bayesian inverse problems through stability and regression results
International Nuclear Information System (INIS)
Vollmer, Sebastian J
2013-01-01
In the Bayesian approach, the a priori knowledge about the input of a mathematical model is described via a probability measure. The joint distribution of the unknown input and the data is then conditioned, using Bayes’ formula, giving rise to the posterior distribution on the unknown input. In this setting we prove posterior consistency for nonlinear inverse problems: a sequence of data is considered, with diminishing fluctuations around a single truth and it is then of interest to show that the resulting sequence of posterior measures arising from this sequence of data concentrates around the truth used to generate the data. Posterior consistency justifies the use of the Bayesian approach very much in the same way as error bounds and convergence results for regularization techniques do. As a guiding example, we consider the inverse problem of reconstructing the diffusion coefficient from noisy observations of the solution to an elliptic PDE in divergence form. This problem is approached by splitting the forward operator into the underlying continuum model and a simpler observation operator based on the output of the model. In general, these splittings allow us to conclude posterior consistency provided a deterministic stability result for the underlying inverse problem and a posterior consistency result for the Bayesian regression problem with the push-forward prior. Moreover, we prove posterior consistency for the Bayesian regression problem based on the regularity, the tail behaviour and the small ball probabilities of the prior. (paper)
Directory of Open Access Journals (Sweden)
Cedric Simillion
2017-10-01
Full Text Available About one in 15 of the world’s population is chronically infected with either hepatitis virus B (HBV or C (HCV, with enormous public health consequences. The metabolic alterations caused by these infections have never been directly compared and contrasted. We investigated groups of HBV-positive, HCV-positive, and uninfected healthy controls using gas chromatography-mass spectrometry analyses of their plasma and urine. A robust regression analysis of the metabolite data was conducted to reveal correlations between metabolite pairs. Ten metabolite correlations appeared for HBV plasma and urine, with 18 for HCV plasma and urine, none of which were present in the controls. Metabolic perturbation networks were constructed, which permitted a differential view of the HBV- and HCV-infected liver. HBV hepatitis was consistent with enhanced glucose uptake, glycolysis, and pentose phosphate pathway metabolism, the latter using xylitol and producing threonic acid, which may also be imported by glucose transporters. HCV hepatitis was consistent with impaired glucose uptake, glycolysis, and pentose phosphate pathway metabolism, with the tricarboxylic acid pathway fueled by branched-chain amino acids feeding gluconeogenesis and the hepatocellular loss of glucose, which most probably contributed to hyperglycemia. It is concluded that robust regression analyses can uncover metabolic rewiring in disease states.
Shabani, Farzin; Kumar, Lalit; Solhjouy-fard, Samaneh
2017-08-01
The aim of this study was to have a comparative investigation and evaluation of the capabilities of correlative and mechanistic modeling processes, applied to the projection of future distributions of date palm in novel environments and to establish a method of minimizing uncertainty in the projections of differing techniques. The location of this study on a global scale is in Middle Eastern Countries. We compared the mechanistic model CLIMEX (CL) with the correlative models MaxEnt (MX), Boosted Regression Trees (BRT), and Random Forests (RF) to project current and future distributions of date palm ( Phoenix dactylifera L.). The Global Climate Model (GCM), the CSIRO-Mk3.0 (CS) using the A2 emissions scenario, was selected for making projections. Both indigenous and alien distribution data of the species were utilized in the modeling process. The common areas predicted by MX, BRT, RF, and CL from the CS GCM were extracted and compared to ascertain projection uncertainty levels of each individual technique. The common areas identified by all four modeling techniques were used to produce a map indicating suitable and unsuitable areas for date palm cultivation for Middle Eastern countries, for the present and the year 2100. The four different modeling approaches predict fairly different distributions. Projections from CL were more conservative than from MX. The BRT and RF were the most conservative methods in terms of projections for the current time. The combination of the final CL and MX projections for the present and 2100 provide higher certainty concerning those areas that will become highly suitable for future date palm cultivation. According to the four models, cold, hot, and wet stress, with differences on a regional basis, appears to be the major restrictions on future date palm distribution. The results demonstrate variances in the projections, resulting from different techniques. The assessment and interpretation of model projections requires reservations
An illustration of harmonic regression based on the results of the fast Fourier transformation
Directory of Open Access Journals (Sweden)
Bertfai Imre
2002-01-01
Full Text Available The well-known methodology of the Fourier analysis is put against the background in the 2nd half of the century parallel to the development of the time-domain approach in the analysis of mainly economical time series. However, from the author's point of view, the former possesses some hidden analytical advantages which deserve to be re-introduced to the toolbox of analysts. This paper, through several case studies, reports research results for computer algorithm providing a harmonic model for time series. The starting point of the particular method is a harmonic analysis (Fourier-analysis or Lomb-periodogram. The results are optimized in a multifold manner resulting in a model which is easy to handle and able to forecast the underlying data. The results provided are particularly free from limitations characteristic for that methods. Furthermore, the calculated results are easy to interpret and use for further decisions. Nevertheless, the author intends to enhance the procedure in several ways. The method shown seems to be very effective and useful in modeling time series consisting of periodic terms. An additional advantage is the easy interpretation of the obtained parameters.
Directory of Open Access Journals (Sweden)
Christoph Königs
2018-05-01
Full Text Available B lymphocytes are key players in humoral immunity, expressing diverse surface immunoglobulin receptors directed against specific antigenic epitopes. The development and profile of distinct subpopulations have gained awareness in the setting of primary immunodeficiency disorders, primary or secondary autoimmunity and as therapeutic targets of specific antibodies in various diseases. The major B cell subpopulations in peripheral blood include naïve (CD19+ or CD20+IgD+CD27−, non-switched memory (CD19+ or CD20+IgD+CD27+ and switched memory B cells (CD19+ or CD20+IgD−CD27+. Furthermore, less common B cell subpopulations have also been described as having a role in the suppressive capacity of B cells to maintain self-tolerance. Data on reference values for B cell subpopulations are limited and only available for older age groups, neglecting the continuous process of human B cell development in children and adolescents. This study was designed to establish an exponential regression model to produce continuous reference values for main B cell subpopulations to reflect the dynamic maturation of the human immune system in healthy children.
Pan, Xin; Cao, Chen; Yang, Yingbao; Li, Xiaolong; Shan, Liangliang; Zhu, Xi
2018-04-01
The land surface temperature (LST) derived from thermal infrared satellite images is a meaningful variable in many remote sensing applications. However, at present, the spatial resolution of the satellite thermal infrared remote sensing sensor is coarser, which cannot meet the needs. In this study, LST image was downscaled by a random forest model between LST and multiple predictors in an arid region with an oasis-desert ecotone. The proposed downscaling approach was evaluated using LST derived from the MODIS LST product of Zhangye City in Heihe Basin. The primary result of LST downscaling has been shown that the distribution of downscaled LST matched with that of the ecosystem of oasis and desert. By the way of sensitivity analysis, the most sensitive factors to LST downscaling were modified normalized difference water index (MNDWI)/normalized multi-band drought index (NMDI), soil adjusted vegetation index (SAVI)/ shortwave infrared reflectance (SWIR)/normalized difference vegetation index (NDVI), normalized difference building index (NDBI)/SAVI and SWIR/NDBI/MNDWI/NDWI for the region of water, vegetation, building and desert, with LST variation (at most) of 0.20/-0.22 K, 0.92/0.62/0.46 K, 0.28/-0.29 K and 3.87/-1.53/-0.64/-0.25 K in the situation of +/-0.02 predictor perturbances, respectively.
Rapakoulia, Trisevgeni
2017-08-09
Motivation: Drug combination therapy for treatment of cancers and other multifactorial diseases has the potential of increasing the therapeutic effect, while reducing the likelihood of drug resistance. In order to reduce time and cost spent in comprehensive screens, methods are needed which can model additive effects of possible drug combinations. Results: We here show that the transcriptional response to combinatorial drug treatment at promoters, as measured by single molecule CAGE technology, is accurately described by a linear combination of the responses of the individual drugs at a genome wide scale. We also find that the same linear relationship holds for transcription at enhancer elements. We conclude that the described approach is promising for eliciting the transcriptional response to multidrug treatment at promoters and enhancers in an unbiased genome wide way, which may minimize the need for exhaustive combinatorial screens.
Energy Technology Data Exchange (ETDEWEB)
Xu, Bin [School of Statistics, Jiangxi University of Finance and Economics, Nanchang, Jiangxi 330013 (China); Research Center of Applied Statistics, Jiangxi University of Finance and Economics, Nanchang, Jiangxi 330013 (China); Lin, Boqiang, E-mail: bqlin@xmu.edu.cn [Collaborative Innovation Center for Energy Economics and Energy Policy, China Institute for Studies in Energy Policy, Xiamen University, Xiamen, Fujian 361005 (China)
2017-03-15
China is currently the world's largest carbon dioxide (CO{sub 2}) emitter. Moreover, total energy consumption and CO{sub 2} emissions in China will continue to increase due to the rapid growth of industrialization and urbanization. Therefore, vigorously developing the high–tech industry becomes an inevitable choice to reduce CO{sub 2} emissions at the moment or in the future. However, ignoring the existing nonlinear links between economic variables, most scholars use traditional linear models to explore the impact of the high–tech industry on CO{sub 2} emissions from an aggregate perspective. Few studies have focused on nonlinear relationships and regional differences in China. Based on panel data of 1998–2014, this study uses the nonparametric additive regression model to explore the nonlinear effect of the high–tech industry from a regional perspective. The estimated results show that the residual sum of squares (SSR) of the nonparametric additive regression model in the eastern, central and western regions are 0.693, 0.054 and 0.085 respectively, which are much less those that of the traditional linear regression model (3.158, 4.227 and 7.196). This verifies that the nonparametric additive regression model has a better fitting effect. Specifically, the high–tech industry produces an inverted “U–shaped” nonlinear impact on CO{sub 2} emissions in the eastern region, but a positive “U–shaped” nonlinear effect in the central and western regions. Therefore, the nonlinear impact of the high–tech industry on CO{sub 2} emissions in the three regions should be given adequate attention in developing effective abatement policies. - Highlights: • The nonlinear effect of the high–tech industry on CO{sub 2} emissions was investigated. • The high–tech industry yields an inverted “U–shaped” effect in the eastern region. • The high–tech industry has a positive “U–shaped” nonlinear effect in other regions. • The linear impact
Sobrinho, L G; Almeida-Costa, J M
1992-01-01
Pathological hyperprolactinaemia (PH) is significantly associated with: (1) paternal deprivation during childhood, (2) depression, (3) non-specific symptoms including obesity and weight gain. The clinical onset of the symptoms often follows pregnancy or a loss. Prolactin is an insulin antagonist which does not promote weight gain. Hyperprolactinaemia and increased metabolic efficiency are parts of a system of interdependent behavioural and metabolic mechanisms necessary for the care of the young. We call this system, which is available as a whole package, maternal subroutine (MS). An important number of cases of PH are due to activation of the MS that is not induced by pregnancy. The same occurs in surrogate maternity and in some animal models. Most women with PH developed a malignant symbiotic relationship with their mothers in the setting of absence, alcoholism or devaluation of the father. These women may regress to early developmental stages to the point that they identify themselves both with their lactating mother and with the nursing infant as has been found in psychoanalysed patients and in the paradigmatic condition of pseudopregnancy. Such regression can be associated with activation of the MS. Prolactinomas represent the extreme of the spectrum of PH and may result from somatic mutations occurring in hyperstimulated lactotrophs.
Spady, Richard; Stouli, Sami
2012-01-01
We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while avoiding the need for repairing the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach introduces a mathematical programming characterization of conditional distribution f...
Dyer, Betsey D.; Kahn, Michael J.; LeBlanc, Mark D.
2008-01-01
Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results. PMID:19054742
Elfaki, Tayseer Elamin Mohamed; Arndts, Kathrin; Wiszniewsky, Anna; Ritter, Manuel; Goreish, Ibtisam A; Atti El Mekki, Misk El Yemen A; Arriens, Sandra; Pfarr, Kenneth; Fimmers, Rolf; Doenhoff, Mike; Hoerauf, Achim; Layland, Laura E
2016-05-01
In the Sudan, Schistosoma mansoni infections are a major cause of morbidity in school-aged children and infection rates are associated with available clean water sources. During infection, immune responses pass through a Th1 followed by Th2 and Treg phases and patterns can relate to different stages of infection or immunity. This retrospective study evaluated immunoepidemiological aspects in 234 individuals (range 4-85 years old) from Kassala and Khartoum states in 2011. Systemic immune profiles (cytokines and immunoglobulins) and epidemiological parameters were surveyed in n = 110 persons presenting patent S. mansoni infections (egg+), n = 63 individuals positive for S. mansoni via PCR in sera but egg negative (SmPCR+) and n = 61 people who were infection-free (Sm uninf). Immunoepidemiological findings were further investigated using two binary multivariable regression analysis. Nearly all egg+ individuals had no access to latrines and over 90% obtained water via the canal stemming from the Atbara River. With regards to age, infection and an egg+ status was linked to young and adolescent groups. In terms of immunology, S. mansoni infection per se was strongly associated with increased SEA-specific IgG4 but not IgE levels. IL-6, IL-13 and IL-10 were significantly elevated in patently-infected individuals and positively correlated with egg load. In contrast, IL-2 and IL-1β were significantly lower in SmPCR+ individuals when compared to Sm uninf and egg+ groups which was further confirmed during multivariate regression analysis. Schistosomiasis remains an important public health problem in the Sudan with a high number of patent individuals. In addition, SmPCR diagnostics revealed another cohort of infected individuals with a unique immunological profile and provides an avenue for future studies on non-patent infection states. Future studies should investigate the downstream signalling pathways/mechanisms of IL-2 and IL-1β as potential diagnostic markers in order to
Li, L.; Yang, C.
2017-12-01
Climate extremes often manifest as rare events in terms of surface air temperature and precipitation with an annual reoccurrence period. In order to represent the manifold characteristics of climate extremes for monitoring and analysis, the Expert Team on Climate Change Detection and Indices (ETCCDI) had worked out a set of 27 core indices based on daily temperature and precipitation data, describing extreme weather and climate events on an annual basis. The CLIMDEX project (http://www.climdex.org) had produced public domain datasets of such indices for data from a variety of sources, including output from global climate models (GCM) participating in the Coupled Model Intercomparison Project Phase 5 (CMIP5). Among the 27 ETCCDI indices, there are six percentile-based temperature extremes indices that may fall into two groups: exceedance rates (ER) (TN10p, TN90p, TX10p and TX90p) and durations (CSDI and WSDI). Percentiles must be estimated prior to the calculation of the indices, and could more or less be biased by the adopted algorithm. Such biases will in turn be propagated to the final results of indices. The CLIMDEX used an empirical quantile estimator combined with a bootstrap resampling procedure to reduce the inhomogeneity in the annual series of the ER indices. However, there are still some problems remained in the CLIMDEX datasets, namely the overestimated climate variability due to unaccounted autocorrelation in the daily temperature data, seasonally varying biases and inconsistency between algorithms applied to the ER indices and to the duration indices. We now present new results of the six indices through a semiparametric quantile regression approach for the CMIP5 model output. By using the base-period data as a whole and taking seasonality and autocorrelation into account, this approach successfully addressed the aforementioned issues and came out with consistent results. The new datasets cover the historical and three projected (RCP2.6, RCP4.5 and RCP
Salas, M.M.; Nascimento, G.G.; Vargas-Ferreira, F.; Tarquinio, S.B.; Huysmans, M.C.D.N.J.M.; Demarco, F.F.
2015-01-01
OBJECTIVE: The aim of the present study was to assess the influence of diet in tooth erosion presence in children and adolescents by meta-analysis and meta-regression. DATA: Two reviewers independently performed the selection process and the quality of studies was assessed. SOURCES: Studies
Directory of Open Access Journals (Sweden)
Tayseer Elamin Mohamed Elfaki
2016-05-01
Full Text Available In the Sudan, Schistosoma mansoni infections are a major cause of morbidity in school-aged children and infection rates are associated with available clean water sources. During infection, immune responses pass through a Th1 followed by Th2 and Treg phases and patterns can relate to different stages of infection or immunity.This retrospective study evaluated immunoepidemiological aspects in 234 individuals (range 4-85 years old from Kassala and Khartoum states in 2011. Systemic immune profiles (cytokines and immunoglobulins and epidemiological parameters were surveyed in n = 110 persons presenting patent S. mansoni infections (egg+, n = 63 individuals positive for S. mansoni via PCR in sera but egg negative (SmPCR+ and n = 61 people who were infection-free (Sm uninf. Immunoepidemiological findings were further investigated using two binary multivariable regression analysis.Nearly all egg+ individuals had no access to latrines and over 90% obtained water via the canal stemming from the Atbara River. With regards to age, infection and an egg+ status was linked to young and adolescent groups. In terms of immunology, S. mansoni infection per se was strongly associated with increased SEA-specific IgG4 but not IgE levels. IL-6, IL-13 and IL-10 were significantly elevated in patently-infected individuals and positively correlated with egg load. In contrast, IL-2 and IL-1β were significantly lower in SmPCR+ individuals when compared to Sm uninf and egg+ groups which was further confirmed during multivariate regression analysis.Schistosomiasis remains an important public health problem in the Sudan with a high number of patent individuals. In addition, SmPCR diagnostics revealed another cohort of infected individuals with a unique immunological profile and provides an avenue for future studies on non-patent infection states. Future studies should investigate the downstream signalling pathways/mechanisms of IL-2 and IL-1β as potential diagnostic markers
Differentiating regressed melanoma from regressed lichenoid keratosis.
Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A
2017-04-01
Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Sjølie, A K; Klein, R; Porta, M; Orchard, T; Fuller, J; Parving, H H; Bilous, R; Aldington, S; Chaturvedi, N
2011-03-01
To study the association between baseline retinal microaneurysm score and progression and regression of diabetic retinopathy, and response to treatment with candesartan in people with diabetes. This was a multicenter randomized clinical trial. The progression analysis included 893 patients with Type 1 diabetes and 526 patients with Type 2 diabetes with retinal microaneurysms only at baseline. For regression, 438 with Type 1 and 216 with Type 2 diabetes qualified. Microaneurysms were scored from yearly retinal photographs according to the Early Treatment Diabetic Retinopathy Study (ETDRS) protocol. Retinopathy progression and regression was defined as two or more step change on the ETDRS scale from baseline. Patients were normoalbuminuric, and normotensive with Type 1 and Type 2 diabetes or treated hypertensive with Type 2 diabetes. They were randomized to treatment with candesartan 32 mg daily or placebo and followed for 4.6 years. A higher microaneurysm score at baseline predicted an increased risk of retinopathy progression (HR per microaneurysm score 1.08, P diabetes; HR 1.07, P = 0.0174 in Type 2 diabetes) and reduced the likelihood of regression (HR 0.79, P diabetes; HR 0.85, P = 0.0009 in Type 2 diabetes), all adjusted for baseline variables and treatment. Candesartan reduced the risk of microaneurysm score progression. Microaneurysm counts are important prognostic indicators for worsening of retinopathy, thus microaneurysms are not benign. Treatment with renin-angiotensin system inhibitors is effective in the early stages and may improve mild diabetic retinopathy. Microaneurysm scores may be useful surrogate endpoints in clinical trials. © 2011 The Authors. Diabetic Medicine © 2011 Diabetes UK.
Directory of Open Access Journals (Sweden)
Regis Wendpouire Oubida
2015-03-01
Full Text Available Local adaptation to climate in temperate forest trees involves the integration of multiple physiological, morphological, and phenological traits. Latitudinal clines are frequently observed for these traits, but environmental constraints also track longitude and altitude. We combined extensive phenotyping of 12 candidate adaptive traits, multivariate regression trees, quantitative genetics, and a genome-wide panel of SNP markers to better understand the interplay among geography, climate, and adaptation to abiotic factors in Populus trichocarpa. Heritabilities were low to moderate (0.13 to 0.32 and population differentiation for many traits exceeded the 99th percentile of the genome-wide distribution of FST, suggesting local adaptation. When climate variables were taken as predictors and the 12 traits as response variables in a multivariate regression tree analysis, evapotranspiration (Eref explained the most variation, with subsequent splits related to mean temperature of the warmest month, frost-free period (FFP, and mean annual precipitation (MAP. These grouping matched relatively well the splits using geographic variables as predictors: the northernmost groups (short FFP and low Eref had the lowest growth, and lowest cold injury index; the southern British Columbia group (low Eref and intermediate temperatures had average growth and cold injury index; the group from the coast of California and Oregon (high Eref and FFP had the highest growth performance and the highest cold injury index; and the southernmost, high-altitude group (with high Eref and low FFP performed poorly, had high cold injury index, and lower water use efficiency. Taken together, these results suggest variation in both temperature and water availability across the range shape multivariate adaptive traits in poplar.
Zhang, Hongyang; Welch, William J.; Zamar, Ruben H.
2017-01-01
Tomal et al. (2015) introduced the notion of "phalanxes" in the context of rare-class detection in two-class classification problems. A phalanx is a subset of features that work well for classification tasks. In this paper, we propose a different class of phalanxes for application in regression settings. We define a "Regression Phalanx" - a subset of features that work well together for prediction. We propose a novel algorithm which automatically chooses Regression Phalanxes from high-dimensi...
Riley, Richard D.
2017-01-01
An important question for clinicians appraising a meta‐analysis is: are the findings likely to be valid in their own practice—does the reported effect accurately represent the effect that would occur in their own clinical population? To this end we advance the concept of statistical validity—where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple (‘leave‐one‐out’) cross‐validation technique, we demonstrate how we may test meta‐analysis estimates for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta‐analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta‐analysis and a tailored meta‐regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but for fewer studies Vn has greater power but a higher type 1 error rate than Q. The power and type 1 error rate of Vn are also shown to depend on the within‐study variance, between‐study variance, study sample size, and the number of studies in the meta‐analysis. Finally, we apply Vn to two published meta‐analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta‐analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28620945
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: ""This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable."" -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
Matson, Johnny L.; Kozlowski, Alison M.
2010-01-01
Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…
Directory of Open Access Journals (Sweden)
Jason W. Osborne
2012-06-01
Full Text Available Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These outcomes represent important social science lines of research: retention in, or dropout from school, using illicit drugs, underage alcohol consumption, antisocial behavior, purchasing decisions, voting patterns, risky behavior, and so on. The goal of this paper is to briefly lead the reader through the surprisingly simple mathematics that underpins logistic regression: probabilities, odds, odds ratios, and logits. Anyone with spreadsheet software or a scientific calculator can follow along, and in turn, this knowledge can be used to make much more interesting, clear, and accurate presentations of results (especially to non-technical audiences. In particular, I will share an example of an interaction in logistic regression, how it was originally graphed, and how the graph was made substantially more user-friendly by converting the original metric (logits to a more readily interpretable metric (probability through three simple steps.
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
International Nuclear Information System (INIS)
Clarkson, Richard WE; Wayland, Matthew T; Lee, Jennifer; Freeman, Tom; Watson, Christine J
2004-01-01
In order to gain a better understanding of the molecular processes that underlie apoptosis and tissue regression in mammary gland, we undertook a large-scale analysis of transcriptional changes during the mouse mammary pregnancy cycle, with emphasis on the transition from lactation to involution. Affymetrix microarrays, representing 8618 genes, were used to compare mammary tissue from 12 time points (one virgin, three gestation, three lactation and five involution stages). Six animals were used for each time point. Common patterns of gene expression across all time points were identified and related to biological function. The majority of significantly induced genes in involution were also differentially regulated at earlier stages in the pregnancy cycle. This included a marked increase in inflammatory mediators during involution and at parturition, which correlated with leukaemia inhibitory factor–Stat3 (signal transducer and activator of signalling-3) signalling. Before involution, expected increases in cell proliferation, biosynthesis and metabolism-related genes were observed. During involution, the first 24 hours after weaning was characterized by a transient increase in expression of components of the death receptor pathways of apoptosis, inflammatory cytokines and acute phase response genes. After 24 hours, regulators of intrinsic apoptosis were induced in conjunction with markers of phagocyte activity, matrix proteases, suppressors of neutrophils and soluble components of specific and innate immunity. We provide a resource of mouse mammary gene expression data for download or online analysis. Here we highlight the sequential induction of distinct apoptosis pathways in involution and the stimulation of immunomodulatory signals, which probably suppress the potentially damaging effects of a cellular inflammatory response while maintaining an appropriate antimicrobial and phagocytic environment
Directory of Open Access Journals (Sweden)
Fabian Horst
Full Text Available Traditionally, gait analysis has been centered on the idea of average behavior and normality. On one hand, clinical diagnoses and therapeutic interventions typically assume that average gait patterns remain constant over time. On the other hand, it is well known that all our movements are accompanied by a certain amount of variability, which does not allow us to make two identical steps. The purpose of this study was to examine changes in the intra-individual gait patterns across different time-scales (i.e., tens-of-mins, tens-of-hours.Nine healthy subjects performed 15 gait trials at a self-selected speed on 6 sessions within one day (duration between two subsequent sessions from 10 to 90 mins. For each trial, time-continuous ground reaction forces and lower body joint angles were measured. A supervised learning model using a kernel-based discriminant regression was applied for classifying sessions within individual gait patterns.Discernable characteristics of intra-individual gait patterns could be distinguished between repeated sessions by classification rates of 67.8 ± 8.8% and 86.3 ± 7.9% for the six-session-classification of ground reaction forces and lower body joint angles, respectively. Furthermore, the one-on-one-classification showed that increasing classification rates go along with increasing time durations between two sessions and indicate that changes of gait patterns appear at different time-scales.Discernable characteristics between repeated sessions indicate continuous intrinsic changes in intra-individual gait patterns and suggest a predominant role of deterministic processes in human motor control and learning. Natural changes of gait patterns without any externally induced injury or intervention may reflect continuous adaptations of the motor system over several time-scales. Accordingly, the modelling of walking by means of average gait patterns that are assumed to be near constant over time needs to be reconsidered in the
Horst, Fabian; Eekhoff, Alexander; Newell, Karl M; Schöllhorn, Wolfgang I
2017-01-01
Traditionally, gait analysis has been centered on the idea of average behavior and normality. On one hand, clinical diagnoses and therapeutic interventions typically assume that average gait patterns remain constant over time. On the other hand, it is well known that all our movements are accompanied by a certain amount of variability, which does not allow us to make two identical steps. The purpose of this study was to examine changes in the intra-individual gait patterns across different time-scales (i.e., tens-of-mins, tens-of-hours). Nine healthy subjects performed 15 gait trials at a self-selected speed on 6 sessions within one day (duration between two subsequent sessions from 10 to 90 mins). For each trial, time-continuous ground reaction forces and lower body joint angles were measured. A supervised learning model using a kernel-based discriminant regression was applied for classifying sessions within individual gait patterns. Discernable characteristics of intra-individual gait patterns could be distinguished between repeated sessions by classification rates of 67.8 ± 8.8% and 86.3 ± 7.9% for the six-session-classification of ground reaction forces and lower body joint angles, respectively. Furthermore, the one-on-one-classification showed that increasing classification rates go along with increasing time durations between two sessions and indicate that changes of gait patterns appear at different time-scales. Discernable characteristics between repeated sessions indicate continuous intrinsic changes in intra-individual gait patterns and suggest a predominant role of deterministic processes in human motor control and learning. Natural changes of gait patterns without any externally induced injury or intervention may reflect continuous adaptations of the motor system over several time-scales. Accordingly, the modelling of walking by means of average gait patterns that are assumed to be near constant over time needs to be reconsidered in the context of
Understanding logistic regression analysis
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using ex...
Directory of Open Access Journals (Sweden)
Pinto João
2011-08-01
Full Text Available Abstract Background Anopheles gambiae M and S molecular forms, the major malaria vectors in the Afro-tropical region, are ongoing a process of ecological diversification and adaptive lineage splitting, which is affecting malaria transmission and vector control strategies in West Africa. These two incipient species are defined on the basis of single nucleotide differences in the IGS and ITS regions of multicopy rDNA located on the X-chromosome. A number of PCR and PCR-RFLP approaches based on form-specific SNPs in the IGS region are used for M and S identification. Moreover, a PCR-method to detect the M-specific insertion of a short interspersed transposable element (SINE200 has recently been introduced as an alternative identification approach. However, a large-scale comparative analysis of four widely used PCR or PCR-RFLP genotyping methods for M and S identification was never carried out to evaluate whether they could be used interchangeably, as commonly assumed. Results The genotyping of more than 400 A. gambiae specimens from nine African countries, and the sequencing of the IGS-amplicon of 115 of them, highlighted discrepancies among results obtained by the different approaches due to different kinds of biases, which may result in an overestimation of MS putative hybrids, as follows: i incorrect match of M and S specific primers used in the allele specific-PCR approach; ii presence of polymorphisms in the recognition sequence of restriction enzymes used in the PCR-RFLP approaches; iii incomplete cleavage during the restriction reactions; iv presence of different copy numbers of M and S-specific IGS-arrays in single individuals in areas of secondary contact between the two forms. Conclusions The results reveal that the PCR and PCR-RFLP approaches most commonly utilized to identify A. gambiae M and S forms are not fully interchangeable as usually assumed, and highlight limits of the actual definition of the two molecular forms, which might
Directory of Open Access Journals (Sweden)
Lauren M G Davis
Full Text Available Prebiotics are selectively fermented ingredients that allow specific changes in the gastrointestinal microbiota that confer health benefits to the host. However, the effects of prebiotics on the human gut microbiota are incomplete as most studies have relied on methods that fail to cover the breadth of the bacterial community. The goal of this research was to use high throughput multiplex community sequencing of 16S rDNA tags to gain a community wide perspective of the impact of prebiotic galactooligosaccharide (GOS on the fecal microbiota of healthy human subjects. Fecal samples from eighteen healthy adults were previously obtained during a feeding trial in which each subject consumed a GOS-containing product for twelve weeks, with four increasing dosages (0, 2.5, 5, and 10 gram of GOS. Multiplex sequencing of the 16S rDNA tags revealed that GOS induced significant compositional alterations in the fecal microbiota, principally by increasing the abundance of organisms within the Actinobacteria. Specifically, several distinct lineages of Bifidobacterium were enriched. Consumption of GOS led to five- to ten-fold increases in bifidobacteria in half of the subjects. Increases in Firmicutes were also observed, however, these changes were detectable in only a few individuals. The enrichment of bifidobacteria was generally at the expense of one group of bacteria, the Bacteroides. The responses to GOS and the magnitude of the response varied between individuals, were reversible, and were in accordance with dosage. The bifidobacteria were the only bacteria that were consistently and significantly enriched by GOS, although this substrate supported the growth of diverse colonic bacteria in mono-culture experiments. These results suggest that GOS can be used to enrich bifidobacteria in the human gastrointestinal tract with remarkable specificity, and that the bifidogenic properties of GOS that occur in vivo are caused by selective fermentation as well as by
McCormick, Patrick W.; Lewis, Gary D.; Dujovny, Manuel; Ausman, James I.; Stewart, Mick; Widman, Ronald A.
1992-05-01
Near infrared light generated by specialized instrumentation was passed through artificially oxygenated human blood during simultaneous sampling by a co-oximeter. Characteristic absorption spectra were analyzed to calculate the ratio of oxygenated to reduced hemoglobin. A positive linear regression fit between diffuse transmission oximetry and measured blood oxygenation over the range 23% to 99% (r2 equals .98, p signal was observed in the patient over time. The procedure was able to be performed clinically without difficulty; rSO2 values recorded continuously demonstrate the usefulness of the technique. Using the same instrumentation, arterial input and cerebral response functions, generated by IV tracer bolus, were deconvoluted to measure mean cerebral transit time. Date collected over time provided a sensitive index of changes in cerebral blood flow as a result of therapeutic maneuvers.
Understanding logistic regression analysis.
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
Osborne, Jason W.
2012-01-01
Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These…
Yamashita, Takashi; Kart, Cary S; Noe, Douglas A
2012-12-01
Type 2 diabetes is known to contribute to health disparities in the U.S. and failure to adhere to recommended self-care behaviors is a contributing factor. Intervention programs face difficulties as a result of patient diversity and limited resources. With data from the 2005 Behavioral Risk Factor Surveillance System, this study employs a logistic regression tree algorithm to identify characteristics of sub-populations with type 2 diabetes according to their reported frequency of adherence to four recommended diabetes self-care behaviors including blood glucose monitoring, foot examination, eye examination and HbA1c testing. Using Andersen's health behavior model, need factors appear to dominate the definition of which sub-groups were at greatest risk for low as well as high adherence. Findings demonstrate the utility of easily interpreted tree diagrams to design specific culturally appropriate intervention programs targeting sub-populations of diabetes patients who need to improve their self-care behaviors. Limitations and contributions of the study are discussed.
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study gives attention to indicators in predictive power of the Generalized Linear Model (GLM) which are widely used; however, often having some restrictions. We are interested in regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, and defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was modifying regression correlation coefficient for Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and having multicollinearity in independent variables. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient based on Bias and the Root Mean Square Error (RMSE).
Meckfessel, Sandra; Stühmer, Constantin; Bormann, Kai-Hendrik; Kupka, Thomas; Behrends, Marianne; Matthies, Herbert; Vaske, Bernhard; Stiesch, Meike; Gellrich, Nils-Claudius; Rücker, Martin
2011-01-01
Because a traditionally instructed dental radiology lecture course is very time-consuming and labour-intensive, online courseware, including an interactive-learning module, was implemented to support the lectures. The purpose of this study was to evaluate the perceptions of students who have worked with web-based courseware as well as the effect on their results in final examinations. Users (n(3+4)=138) had access to the e-program from any networked computer at any time. Two groups (n(3)=71, n(4)=67) had to pass a final exam after using the e-course. Results were compared with two groups (n(1)=42, n(2)=48) who had studied the same content by attending traditional lectures. In addition a survey of the students was statistically evaluated. Most of the respondents reported a positive attitude towards e-learning and would have appreciated more access to computer-assisted instruction. Two years after initiating the e-course the failure rate in the final examination dropped significantly, from 40% to less than 2%. The very positive response to the e-program and improved test scores demonstrated the effectiveness of our e-course as a learning aid. Interactive modules in step with clinical practice provided learning that is not achieved by traditional teaching methods alone. To what extent staff savings are possible is part of a further study. Copyright © 2010 European Association for Cranio-Maxillo-Facial Surgery. Published by Elsevier Ltd. All rights reserved.
Chong, A; Nazarian, N; Chandrananth, J; Tacey, M; Shepherd, D; Tran, P
2015-02-01
This study sought to determine the medium-term patient-reported and radiographic outcomes in patients undergoing surgery for hallux valgus. A total of 118 patients (162 feet) underwent surgery for hallux valgus between January 2008 and June 2009. The Manchester-Oxford Foot Questionnaire (MOXFQ), a validated tool for the assessment of outcome after surgery for hallux valgus, was used and patient satisfaction was sought. The medical records and radiographs were reviewed retrospectively. At a mean of 5.2 years (4.7 to 6.0) post-operatively, the median combined MOXFQ score was 7.8 (IQR:0 to 32.8). The median domain scores for pain, walking/standing, and social interaction were 10 (IQR: 0 to 45), 0 (IQR: 0 to 32.1) and 6.3 (IQR: 0 to 25) respectively. A total of 119 procedures (73.9%, in 90 patients) were reported as satisfactory but only 53 feet (32.7%, in 43 patients) were completely asymptomatic. The mean (SD) correction of hallux valgus, intermetatarsal, and distal metatarsal articular angles was 18.5° (8.8°), 5.7° (3.3°), and 16.6° (8.8°), respectively. Multivariable regression analysis identified that an American Association of Anesthesiologists grade of >1 (Incident Rate Ratio (IRR) = 1.67, p-value = 0.011) and recurrent deformity (IRR = 1.77, p-value = 0.003) were associated with significantly worse MOXFQ scores. No correlation was found between the severity of deformity, the type, or degree of surgical correction and the outcome. When using a validated outcome score for the assessment of outcome after surgery for hallux valgus, the long-term results are worse than expected when compared with the short- and mid-term outcomes, with 25.9% of patients dissatisfied at a mean follow-up of 5.2 years. ©2015 The British Editorial Society of Bone & Joint Surgery.
and Multinomial Logistic Regression
African Journals Online (AJOL)
This work presented the results of an experimental comparison of two models: Multinomial Logistic Regression (MLR) and Artificial Neural Network (ANN) for classifying students based on their academic performance. The predictive accuracy for each model was measured by their average Classification Correct Rate (CCR).
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Krivolutsky, Alexei A.; Nazarova, Margarita; Knyazeva, Galina
Solar activity influences on atmospheric photochemical system via its changebale electromag-netic flux with eleven-year period and also by energetic particles during solar proton event (SPE). Energetic particles penetrate mostly into polar regions and induce additional produc-tion of NOx and HOx chemical compounds, which can destroy ozone in photochemical catalytic cycles. Solar irradiance variations cause in-phase variability of ozone in accordance with photo-chemical theory. However, real ozone response caused by these two factors, which has different physical nature, is not so clear on long-term time scale. In order to understand the situation multiply linear regression statistical method was used. Three data series, which covered the period 1958-2006, have been used to realize such analysis: yearly averaged total ozone at dif-ferent latitudes (World Ozone Data Centre, Canada, WMO); yearly averaged proton fluxes with E¿ 10 MeV ( IMP, GOES, METEOR satellites); yearly averaged numbers of solar spots (Solar Data). Then, before the analysis, the data sets of ozone deviations from the mean values for whole period (1958-2006) at each latitudinal belt were prepared. The results of multiply regression analysis (two factors) revealed rather complicated time-dependent behavior of ozone response with clear negative peaks for the years of strong SPEs. The magnitudes of such peaks on annual mean basis are not greater than 10 DU. The unusual effect -positive response of ozone to solar proton activity near both poles-was discovered by statistical analysis. The pos-sible photochemical nature of found effect is discussed. This work was supported by Russian Science Foundation for Basic Research (grant 09-05-009949) and by the contract 1-6-08 under Russian Sub-Program "Research and Investigation of Antarctica".
Chanat, Jeffrey G.; Moyer, Douglas L.; Blomquist, Joel D.; Hyer, Kenneth E.; Langland, Michael J.
2016-01-13
In the Chesapeake Bay watershed, estimated fluxes of nutrients and sediment from the bay’s nontidal tributaries into the estuary are the foundation of decision making to meet reductions prescribed by the Chesapeake Bay Total Maximum Daily Load (TMDL) and are often the basis for refining scientific understanding of the watershed-scale processes that influence the delivery of these constituents to the bay. Two regression-based flux and trend estimation models, ESTIMATOR and Weighted Regressions on Time, Discharge, and Season (WRTDS), were compared using data from 80 watersheds in the Chesapeake Bay Nontidal Water-Quality Monitoring Network (CBNTN). The watersheds range in size from 62 to 70,189 square kilometers and record lengths range from 6 to 28 years. ESTIMATOR is a constant-parameter model that estimates trends only in concentration; WRTDS uses variable parameters estimated with weighted regression, and estimates trends in both concentration and flux. WRTDS had greater explanatory power than ESTIMATOR, with the greatest degree of improvement evident for records longer than 25 years (30 stations; improvement in median model R2= 0.06 for total nitrogen, 0.08 for total phosphorus, and 0.05 for sediment) and the least degree of improvement for records of less than 10 years, for which the two models performed nearly equally. Flux bias statistics were comparable or lower (more favorable) for WRTDS for any record length; for 30 stations with records longer than 25 years, the greatest degree of improvement was evident for sediment (decrease of 0.17 in median statistic) and total phosphorus (decrease of 0.05). The overall between-station pattern in concentration trend direction and magnitude for all constituents was roughly similar for both models. A detailed case study revealed that trends in concentration estimated by WRTDS can operationally be viewed as a less-constrained equivalent to trends in concentration estimated by ESTIMATOR. Estimates of annual mean flow
Tzeng, I-Shiang; Liu, Su-Hsun; Chen, Kuan-Fu; Wu, Chin-Chieh; Chen, Jih-Chang
2016-10-01
To reduce patient boarding time at the emergency department (ED) and to improve the overall quality of the emergent care system in Taiwan, the Minister of Health and Welfare of Taiwan (MOHW) piloted the Grading Responsible Hospitals for Acute Care (GRHAC) audit program in 2007-2009.The aim of the study was to evaluate the impact of the GRHAC audit program on the identification and management of acute myocardial infarction (AMI)-associated ED visits by describing and comparing the incidence of AMI-associated ED visits before (2003-2007), during (2007-2009), and after (2009-2012) the initial audit program implementation.Using aggregated data from the MOHW of Taiwan, we estimated the annual incidence of AMI-associated ED visits by Poisson regression models. We used segmented regression techniques to evaluate differences in the annual rates and in the year-to-year changes in AMI-associated ED visits between 2003 and 2012. Medical comorbidities such as diabetes mellitus, hyperlipidemia, and hypertensive disease were considered as potential confounders.Overall, the number of AMI-associated patient visits increased from 8130 visits in 2003 to 12,695 visits in 2012 (P-value for trend capacity for timely and correctly diagnosing and managing patients presenting with AMI-associated symptoms or signs at the ED.
Bayesian ARTMAP for regression.
Sasu, L M; Andonie, R
2013-10-01
Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. Copyright © 2013 Elsevier Ltd. All rights reserved.
Logic regression and its extensions.
Schwender, Holger; Ruczinski, Ingo
2010-01-01
Logic regression is an adaptive classification and regression procedure, initially developed to reveal interacting single nucleotide polymorphisms (SNPs) in genetic association studies. In general, this approach can be used in any setting with binary predictors, when the interaction of these covariates is of primary interest. Logic regression searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome variable, and thus, reveals variables and interactions that are associated with the response and/or have predictive capabilities. The logic expressions are embedded in a generalized linear regression framework, and thus, logic regression can handle a variety of outcome types, such as binary responses in case-control studies, numeric responses, and time-to-event data. In this chapter, we provide an introduction to the logic regression methodology, list some applications in public health and medicine, and summarize some of the direct extensions and modifications of logic regression that have been proposed in the literature. Copyright © 2010 Elsevier Inc. All rights reserved.
Better Autologistic Regression
Directory of Open Access Journals (Sweden)
Mark A. Wolters
2017-11-01
Full Text Available Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding—the two numbers used to represent the two possible states of the variables—might differ. Common coding choices are (zero, one and (minus one, plus one. Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
Mohammad, Fahim; Theisen-Toupal, Jesse C.; Arnaout, Ramy
2014-01-01
Laboratory testing is the single highest-volume medical activity, making it useful to ask how well one can anticipate whether a given test result will be high, low, or within the reference interval ("normal"). We analyzed 10 years of electronic health records--a total of 69.4 million blood tests--to see how well standard rule-mining techniques can anticipate test results based on patient age and gender, recent diagnoses, and recent laboratory test results. We evaluated rules according to thei...
Regression to Causality : Regression-style presentation influences causal attribution
DEFF Research Database (Denmark)
Bordacconi, Mats Joe; Larsen, Martin Vinæs
2014-01-01
of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...... models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results...... more likely. Our experiment drew on a sample of 235 university students from three different social science degree programs (political science, sociology and economics), all of whom had received substantial training in statistics. The subjects were asked to compare and evaluate the validity...
DEFF Research Database (Denmark)
Brok, J.; Thorlund, K.; Gluud, C.
2008-01-01
in 80% (insufficient information size). TSA(15%) and TSA(LBHIS) found that 95% and 91% had absence of evidence. The remaining nonsignificant meta-analyses had evidence of lack of effect. CONCLUSION: TSA reveals insufficient information size and potentially false positive results in many meta......OBJECTIVES: To evaluate meta-analyses with trial sequential analysis (TSA). TSA adjusts for random error risk and provides the required number of participants (information size) in a meta-analysis. Meta-analyses not reaching information size are analyzed with trial sequential monitoring boundaries...... analogous to interim monitoring boundaries in a single trial. STUDY DESIGN AND SETTING: We applied TSA on meta-analyses performed in Cochrane Neonatal reviews. We calculated information sizes and monitoring boundaries with three different anticipated intervention effects of 30% relative risk reduction (TSA...
Directory of Open Access Journals (Sweden)
Fahim Mohammad
Full Text Available Laboratory testing is the single highest-volume medical activity, making it useful to ask how well one can anticipate whether a given test result will be high, low, or within the reference interval ("normal". We analyzed 10 years of electronic health records--a total of 69.4 million blood tests--to see how well standard rule-mining techniques can anticipate test results based on patient age and gender, recent diagnoses, and recent laboratory test results. We evaluated rules according to their positive and negative predictive value (PPV and NPV and area under the receiver-operator characteristic curve (ROC AUCs. Using a stringent cutoff of PPV and/or NPV≥0.95, standard techniques yield few rules for sendout tests but several for in-house tests, mostly for repeat laboratory tests that are part of the complete blood count and basic metabolic panel. Most rules were clinically and pathophysiologically plausible, and several seemed clinically useful for informing pre-test probability of a given result. But overall, rules were unlikely to be able to function as a general substitute for actually ordering a test. Improving laboratory utilization will likely require different input data and/or alternative methods.
Mohammad, Fahim; Theisen-Toupal, Jesse C; Arnaout, Ramy
2014-01-01
Laboratory testing is the single highest-volume medical activity, making it useful to ask how well one can anticipate whether a given test result will be high, low, or within the reference interval ("normal"). We analyzed 10 years of electronic health records--a total of 69.4 million blood tests--to see how well standard rule-mining techniques can anticipate test results based on patient age and gender, recent diagnoses, and recent laboratory test results. We evaluated rules according to their positive and negative predictive value (PPV and NPV) and area under the receiver-operator characteristic curve (ROC AUCs). Using a stringent cutoff of PPV and/or NPV≥0.95, standard techniques yield few rules for sendout tests but several for in-house tests, mostly for repeat laboratory tests that are part of the complete blood count and basic metabolic panel. Most rules were clinically and pathophysiologically plausible, and several seemed clinically useful for informing pre-test probability of a given result. But overall, rules were unlikely to be able to function as a general substitute for actually ordering a test. Improving laboratory utilization will likely require different input data and/or alternative methods.
Mason, Cindi; Twomey, Janet; Wright, David; Whitman, Lawrence
2018-01-01
As the need for engineers continues to increase, a growing focus has been placed on recruiting students into the field of engineering and retaining the students who select engineering as their field of study. As a result of this concentration on student retention, numerous studies have been conducted to identify, understand, and confirm…
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based......Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights...... by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even...
DEFF Research Database (Denmark)
Hansen, Henrik; Tarp, Finn
2001-01-01
This paper examines the relationship between foreign aid and growth in real GDP per capita as it emerges from simple augmentations of popular cross country growth specifications. It is shown that aid in all likelihood increases the growth rate, and this result is not conditional on ‘good’ policy....... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regressions are used for policy purposes.......This paper examines the relationship between foreign aid and growth in real GDP per capita as it emerges from simple augmentations of popular cross country growth specifications. It is shown that aid in all likelihood increases the growth rate, and this result is not conditional on ‘good’ policy...
Tumor regression patterns in retinoblastoma
International Nuclear Information System (INIS)
Zafar, S.N.; Siddique, S.N.; Zaheer, N.
2016-01-01
To observe the types of tumor regression after treatment, and identify the common pattern of regression in our patients. Study Design: Descriptive study. Place and Duration of Study: Department of Pediatric Ophthalmology and Strabismus, Al-Shifa Trust Eye Hospital, Rawalpindi, Pakistan, from October 2011 to October 2014. Methodology: Children with unilateral and bilateral retinoblastoma were included in the study. Patients were referred to Pakistan Institute of Medical Sciences, Islamabad, for chemotherapy. After every cycle of chemotherapy, dilated funds examination under anesthesia was performed to record response of the treatment. Regression patterns were recorded on RetCam II. Results: Seventy-four tumors were included in the study. Out of 74 tumors, 3 were ICRB group A tumors, 43 were ICRB group B tumors, 14 tumors belonged to ICRB group C, and remaining 14 were ICRB group D tumors. Type IV regression was seen in 39.1% (n=29) tumors, type II in 29.7% (n=22), type III in 25.6% (n=19), and type I in 5.4% (n=4). All group A tumors (100%) showed type IV regression. Seventeen (39.5%) group B tumors showed type IV regression. In group C, 5 tumors (35.7%) showed type II regression and 5 tumors (35.7%) showed type IV regression. In group D, 6 tumors (42.9%) regressed to type II non-calcified remnants. Conclusion: The response and success of the focal and systemic treatment, as judged by the appearance of different patterns of tumor regression, varies with the ICRB grouping of the tumor. (author)
Potempa, Krzysztof; Graham, Christine M.; Moreira-Teixeira, Lucia; McNab, Finlay W.; Howes, Ashleigh; Stavropoulos, Evangelos; Pascual, Virginia; Banchereau, Jacques; Chaussabel, Damien; O’Garra, Anne
2016-01-01
Analysis of the mouse transcriptional response to Listeria monocytogenes infection reveals that a large set of genes are perturbed in both blood and tissue and that these transcriptional responses are enriched for pathways of the immune response. Further we identified enrichment for both type I and type II interferon (IFN) signaling molecules in the blood and tissues upon infection. Since type I IFN signaling has been reported widely to impair bacterial clearance we examined gene expression from blood and tissues of wild type (WT) and type I IFNαβ receptor-deficient (Ifnar1-/-) mice at the basal level and upon infection with L. monocytogenes. Measurement of the fold change response upon infection in the absence of type I IFN signaling demonstrated an upregulation of specific genes at day 1 post infection. A less marked reduction of the global gene expression signature in blood or tissues from infected Ifnar1-/- as compared to WT mice was observed at days 2 and 3 after infection, with marked reduction in key genes such as Oasg1 and Stat2. Moreover, on in depth analysis, changes in gene expression in uninfected mice of key IFN regulatory genes including Irf9, Irf7, Stat1 and others were identified, and although induced by an equivalent degree upon infection this resulted in significantly lower final gene expression levels upon infection of Ifnar1-/- mice. These data highlight how dysregulation of this network in the steady state and temporally upon infection may determine the outcome of this bacterial infection and how basal levels of type I IFN-inducible genes may perturb an optimal host immune response to control intracellular bacterial infections such as L. monocytogenes. PMID:26918359
Introduction to regression graphics
Cook, R Dennis
2009-01-01
Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts ".an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models.highly recommend[ed].for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
Abstract Expression Grammar Symbolic Regression
Korns, Michael F.
This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, which is faster, and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, aged layered populations, plus discrete and continuous differential evolution are used to produce an improved symbolic regression sytem. Nine base test cases, from the literature, are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial strength symbolic regression systems.
Regression analysis with categorized regression calibrated exposure: some interesting findings
Directory of Open Access Journals (Sweden)
Hjartåker Anette
2006-07-01
Full Text Available Abstract Background Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC. Results In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a
Directory of Open Access Journals (Sweden)
Matthias Schmid
Full Text Available Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1. Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
Weisberg, Sanford
2013-01-01
Praise for the Third Edition ""...this is an excellent book which could easily be used as a course text...""-International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
Hosmer, David W; Sturdivant, Rodney X
2013-01-01
A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
Examination of influential observations in penalized spline regression
Türkan, Semra
2013-10-01
In parametric or nonparametric regression models, the results of regression analysis are affected by some anomalous observations in the data set. Thus, detection of these observations is one of the major steps in regression analysis. These observations are precisely detected by well-known influence measures. Pena's statistic is one of them. In this study, Pena's approach is formulated for penalized spline regression in terms of ordinary residuals and leverages. The real data and artificial data are used to see illustrate the effectiveness of Pena's statistic as to Cook's distance on detecting influential observations. The results of the study clearly reveal that the proposed measure is superior to Cook's Distance to detect these observations in large data set.
Understanding poisson regression.
Hayat, Matthew J; Higgins, Melinda
2014-04-01
Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.
Directory of Open Access Journals (Sweden)
Mok Tik
2014-06-01
Full Text Available This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as, polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to rep- resent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
Multicollinearity and Regression Analysis
Daoud, Jamal I.
2017-12-01
In regression analysis it is obvious to have a correlation between the response and predictor(s), but having correlation among predictors is something undesired. The number of predictors included in the regression model depends on many factors among which, historical data, experience, etc. At the end selection of most important predictors is something objective due to the researcher. Multicollinearity is a phenomena when two or more predictors are correlated, if this happens, the standard error of the coefficients will increase [8]. Increased standard errors means that the coefficients for some or all independent variables may be found to be significantly different from In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on the multicollinearity, reasons and consequences on the reliability of the regression model.
DEFF Research Database (Denmark)
Bache, Stefan Holst
A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is......, however, a different and therefore new estimator. It allows for both linear- and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question....
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface...... for predicting the covariate specific absolute risks, their confidence intervals, and their confidence bands based on right censored time to event data. We provide explicit formulas for our implementation of the estimator of the (stratified) baseline hazard function in the presence of tied event times. As a by...... functionals. The software presented here is implemented in the riskRegression package....
Regression Analysis by Example. 5th Edition
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Directory of Open Access Journals (Sweden)
Claas Wodarczyk
2009-09-01
Full Text Available Polycystin-1 (PC-1, the product of the PKD1 gene, mutated in the majority of cases of Autosomal Dominant Polycystic Kidney Disease (ADPKD, is a very large (approximately 520 kDa plasma membrane receptor localized in several subcellular compartments including cell-cell/matrix junctions as well as cilia. While heterologous over-expression systems have allowed identification of several of the potential biological roles of this receptor, its precise function remains largely elusive. Studying PC-1 in vivo has been a challenging task due to its complexity and low expression levels. To overcome these limitations and facilitate the study of endogenous PC-1, we have inserted HA- or Myc-tag sequences into the Pkd1 locus by homologous recombination. Here, we show that our approach was successful in generating a fully functional and easily detectable endogenous PC-1. Characterization of PC-1 distribution in vivo showed that it is expressed ubiquitously and is developmentally-regulated in most tissues. Furthermore, our novel tool allowed us to investigate the role of PC-1 in brain, where the protein is abundantly expressed. Subcellular localization of PC-1 revealed strong and specific staining in ciliated ependymal and choroid plexus cells. Consistent with this distribution, we observed hydrocephalus formation both in the ubiquitous knock-out embryos and in newborn mice with conditional inactivation of the Pkd1 gene in the brain. Both choroid plexus and ependymal cilia were morphologically normal in these mice, suggesting a role for PC-1 in ciliary function or signalling in this compartment, rather than in ciliogenesis. We propose that the role of PC-1 in the brain cilia might be to prevent hydrocephalus, a previously unrecognized role for this receptor and one that might have important implications for other genetic or sporadic diseases.
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Bayesian logistic regression analysis
Van Erp, H.R.N.; Van Gelder, P.H.A.J.M.
2012-01-01
In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuissance parameters, the Jacobian transformation is an
Seber, George A F
2012-01-01
Concise, mathematically clear, and comprehensive treatment of the subject.* Expanded coverage of diagnostics and methods of model fitting.* Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models.* More than 200 problems throughout the book plus outline solutions for the exercises.* This revision has been extensively class-tested.
Ritz, Christian; Parmigiani, Giovanni
2009-01-01
R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. This book provides a coherent treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology.
Bounded Gaussian process regression
DEFF Research Database (Denmark)
Jensen, Bjørn Sand; Nielsen, Jens Brehm; Larsen, Jan
2013-01-01
We extend the Gaussian process (GP) framework for bounded regression by introducing two bounded likelihood functions that model the noise on the dependent variable explicitly. This is fundamentally different from the implicit noise assumption in the previously suggested warped GP framework. We...... with the proposed explicit noise-model extension....
Mechanisms of neuroblastoma regression
Brodeur, Garrett M.; Bagatell, Rochelle
2014-01-01
Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
Directory of Open Access Journals (Sweden)
Greta E Weiss
2015-02-01
Full Text Available During blood stage Plasmodium falciparum infection, merozoites invade uninfected erythrocytes via a complex, multistep process involving a series of distinct receptor-ligand binding events. Understanding each element in this process increases the potential to block the parasite's life cycle via drugs or vaccines. To investigate specific receptor-ligand interactions, they were systematically blocked using a combination of genetic deletion, enzymatic receptor cleavage and inhibition of binding via antibodies, peptides and small molecules, and the resulting temporal changes in invasion and morphological effects on erythrocytes were filmed using live cell imaging. Analysis of the videos have shown receptor-ligand interactions occur in the following sequence with the following cellular morphologies; 1 an early heparin-blockable interaction which weakly deforms the erythrocyte, 2 EBA and PfRh ligands which strongly deform the erythrocyte, a process dependant on the merozoite's actin-myosin motor, 3 a PfRh5-basigin binding step which results in a pore or opening between parasite and host through which it appears small molecules and possibly invasion components can flow and 4 an AMA1-RON2 interaction that mediates tight junction formation, which acts as an anchor point for internalization. In addition to enhancing general knowledge of apicomplexan biology, this work provides a rational basis to combine sequentially acting merozoite vaccine candidates in a single multi-receptor-blocking vaccine.
Ridge Regression Signal Processing
Kuhl, Mark R.
1990-01-01
The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
Subset selection in regression
Miller, Alan
2002-01-01
Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition:A separate chapter on Bayesian methodsComplete revision of the chapter on estimationA major example from the field of near infrared spectroscopyMore emphasis on cross-validationGreater focus on bootstrappingStochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible Software available on the Internet for implementing many of the algorithms presentedMore examplesSubset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...
Regression in organizational leadership.
Kernberg, O F
1979-02-01
The choice of good leaders is a major task for all organizations. Inforamtion regarding the prospective administrator's personality should complement questions regarding his previous experience, his general conceptual skills, his technical knowledge, and the specific skills in the area for which he is being selected. The growing psychoanalytic knowledge about the crucial importance of internal, in contrast to external, object relations, and about the mutual relationships of regression in individuals and in groups, constitutes an important practical tool for the selection of leaders.
Classification and regression trees
Breiman, Leo; Olshen, Richard A; Stone, Charles J
1984-01-01
The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
Hilbe, Joseph M
2009-01-01
This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect-great clarity.The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … .-Annette J. Dobson, Biometric...
Sheliga, Boris M; Quaia, Christian; FitzGibbon, Edmond J; Cumming, Bruce G
2016-01-01
White noise stimuli are frequently used to study the visual processing of broadband images in the laboratory. A common goal is to describe how responses are derived from Fourier components in the image. We investigated this issue by recording the ocular-following responses (OFRs) to white noise stimuli in human subjects. For a given speed we compared OFRs to unfiltered white noise with those to noise filtered with band-pass filters and notch filters. Removing components with low spatial frequency (SF) reduced OFR magnitudes, and the SF associated with the greatest reduction matched the SF that produced the maximal response when presented alone. This reduction declined rapidly with SF, compatible with a winner-take-all operation. Removing higher SF components increased OFR magnitudes. For higher speeds this effect became larger and propagated toward lower SFs. All of these effects were quantitatively well described by a model that combined two factors: (a) an excitatory drive that reflected the OFRs to individual Fourier components and (b) a suppression by higher SF channels where the temporal sampling of the display led to flicker. This nonlinear interaction has an important practical implication: Even with high refresh rates (150 Hz), the temporal sampling introduced by visual displays has a significant impact on visual processing. For instance, we show that this distorts speed tuning curves, shifting the peak to lower speeds. Careful attention to spectral content, in the light of this nonlinearity, is necessary to minimize the resulting artifact when using white noise patterns undergoing apparent motion.
Frampton, Dan; Gallo Cassarino, Tiziano; Raffle, Jade; Hubb, Jonathan; Ferns, R. Bridget; Waters, Laura; Tong, C. Y. William; Kozlakidis, Zisis; Hayward, Andrew; Kellam, Paul; Pillay, Deenan; Clark, Duncan; Nastouli, Eleni; Leigh Brown, Andrew J.
2018-01-01
Background & methods The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. Results The pipeline generated sequences of at least 1Kb of length (median = 7.46Kb, IQR = 4.01Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the down-stream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%) and the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset. Conclusions
Steganalysis using logistic regression
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-art 686-dimensional SPAM feature set, in three image sets.
SEPARATION PHENOMENA LOGISTIC REGRESSION
Directory of Open Access Journals (Sweden)
Ikaro Daniel de Carvalho Barreto
2014-03-01
Full Text Available This paper proposes an application of concepts about the maximum likelihood estimation of the binomial logistic regression model to the separation phenomena. It generates bias in the estimation and provides different interpretations of the estimates on the different statistical tests (Wald, Likelihood Ratio and Score and provides different estimates on the different iterative methods (Newton-Raphson and Fisher Score. It also presents an example that demonstrates the direct implications for the validation of the model and validation of variables, the implications for estimates of odds ratios and confidence intervals, generated from the Wald statistics. Furthermore, we present, briefly, the Firth correction to circumvent the phenomena of separation.
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface......-product we obtain fast access to the baseline hazards (compared to survival::basehaz()) and predictions of survival probabilities, their confidence intervals and confidence bands. Confidence intervals and confidence bands are based on point-wise asymptotic expansions of the corresponding statistical...
Adaptive metric kernel regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
2000-01-01
Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Adaptive Metric Kernel Regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
1998-01-01
Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...
Should metacognition be measured by logistic regression?
Rausch, Manuel; Zehetleitner, Michael
2017-03-01
Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.
Conjoined legs: Sirenomelia or caudal regression syndrome?
Directory of Open Access Journals (Sweden)
Sakti Prasad Das
2013-01-01
Full Text Available Presence of single umbilical persistent vitelline artery distinguishes sirenomelia from caudal regression syndrome. We report a case of a12-year-old boy who had bilateral umbilical arteries presented with fusion of both legs in the lower one third of leg. Both feet were rudimentary. The right foot had a valgus rocker-bottom deformity. All toes were present but rudimentary. The left foot showed absence of all toes. Physical examination showed left tibia vara. The chest evaluation in sitting revealed pigeon chest and elevated right shoulder. Posterior examination of the trunk showed thoracic scoliosis with convexity to right. The patient was operated and at 1 year followup the boy had two separate legs with a good aesthetic and functional results.
Conjoined legs: Sirenomelia or caudal regression syndrome?
Das, Sakti Prasad; Ojha, Niranjan; Ganesh, G Shankar; Mohanty, Ram Narayan
2013-07-01
Presence of single umbilical persistent vitelline artery distinguishes sirenomelia from caudal regression syndrome. We report a case of a12-year-old boy who had bilateral umbilical arteries presented with fusion of both legs in the lower one third of leg. Both feet were rudimentary. The right foot had a valgus rocker-bottom deformity. All toes were present but rudimentary. The left foot showed absence of all toes. Physical examination showed left tibia vara. The chest evaluation in sitting revealed pigeon chest and elevated right shoulder. Posterior examination of the trunk showed thoracic scoliosis with convexity to right. The patient was operated and at 1 year followup the boy had two separate legs with a good aesthetic and functional results.
Directory of Open Access Journals (Sweden)
Salabura Piotr
2017-01-01
Full Text Available HADES experiment at GSI is the only high precision experiment probing nuclear matter in the beam energy range of a few AGeV. Pion, proton and ion beams are used to study rare dielectron and strangeness probes to diagnose properties of strongly interacting matter in this energy regime. Selected results from p + A and A + A collisions are presented and discussed.
Polylinear regression analysis in radiochemistry
International Nuclear Information System (INIS)
Kopyrin, A.A.; Terent'eva, T.N.; Khramov, N.N.
1995-01-01
A number of radiochemical problems have been formulated in the framework of polylinear regression analysis, which permits the use of conventional mathematical methods for their solution. The authors have considered features of the use of polylinear regression analysis for estimating the contributions of various sources to the atmospheric pollution, for studying irradiated nuclear fuel, for estimating concentrations from spectral data, for measuring neutron fields of a nuclear reactor, for estimating crystal lattice parameters from X-ray diffraction patterns, for interpreting data of X-ray fluorescence analysis, for estimating complex formation constants, and for analyzing results of radiometric measurements. The problem of estimating the target parameters can be incorrect at certain properties of the system under study. The authors showed the possibility of regularization by adding a fictitious set of data open-quotes obtainedclose quotes from the orthogonal design. To estimate only a part of the parameters under consideration, the authors used incomplete rank models. In this case, it is necessary to take into account the possibility of confounding estimates. An algorithm for evaluating the degree of confounding is presented which is realized using standard software or regression analysis
Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun
2016-07-01
In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
A Simulation Investigation of Principal Component Regression.
Allen, David E.
Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…
Suppression Situations in Multiple Linear Regression
Shieh, Gwowen
2006-01-01
This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…
Polynomial regression analysis and significance test of the regression function
International Nuclear Information System (INIS)
Gao Zhengming; Zhao Juan; He Shengping
2012-01-01
In order to analyze the decay heating power of a certain radioactive isotope per kilogram with polynomial regression method, the paper firstly demonstrated the broad usage of polynomial function and deduced its parameters with ordinary least squares estimate. Then significance test method of polynomial regression function is derived considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and significance test of the polynomial function are done to the decay heating power of the iso tope per kilogram in accord with the authors' real work. (authors)
Recursive Algorithm For Linear Regression
Varanasi, S. V.
1988-01-01
Order of model determined easily. Linear-regression algorithhm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactory.
Combining Alphas via Bounded Regression
Directory of Open Access Journals (Sweden)
Zura Kakushadze
2015-11-01
Full Text Available We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.
Regression in autistic spectrum disorders.
Stefanatos, Gerry A
2008-12-01
A significant proportion of children diagnosed with Autistic Spectrum Disorder experience a developmental regression characterized by a loss of previously-acquired skills. This may involve a loss of speech or social responsitivity, but often entails both. This paper critically reviews the phenomena of regression in autistic spectrum disorders, highlighting the characteristics of regression, age of onset, temporal course, and long-term outcome. Important considerations for diagnosis are discussed and multiple etiological factors currently hypothesized to underlie the phenomenon are reviewed. It is argued that regressive autistic spectrum disorders can be conceptualized on a spectrum with other regressive disorders that may share common pathophysiological features. The implications of this viewpoint are discussed.
Linear regression in astronomy. I
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
Regression of environmental noise in LIGO data
International Nuclear Information System (INIS)
Tiwari, V; Klimenko, S; Mitselmakher, G; Necula, V; Drago, M; Prodi, G; Frolov, V; Yakushin, I; Re, V; Salemi, F; Vedovato, G
2015-01-01
We address the problem of noise regression in the output of gravitational-wave (GW) interferometers, using data from the physical environmental monitors (PEM). The objective of the regression analysis is to predict environmental noise in the GW channel from the PEM measurements. One of the most promising regression methods is based on the construction of Wiener–Kolmogorov (WK) filters. Using this method, the seismic noise cancellation from the LIGO GW channel has already been performed. In the presented approach the WK method has been extended, incorporating banks of Wiener filters in the time–frequency domain, multi-channel analysis and regulation schemes, which greatly enhance the versatility of the regression analysis. Also we present the first results on regression of the bi-coherent noise in the LIGO data. (paper)
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
Linear regression in astronomy. II
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Time-adaptive quantile regression
DEFF Research Database (Denmark)
Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik
2008-01-01
and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power......An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method...... production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered....
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of Quantile Regression models This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensivedescription of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and
Variable and subset selection in PLS regression
DEFF Research Database (Denmark)
Høskuldsson, Agnar
2001-01-01
The purpose of this paper is to present some useful methods for introductory analysis of variables and subsets in relation to PLS regression. We present here methods that are efficient in finding the appropriate variables or subset to use in the PLS regression. The general conclusion...... is that variable selection is important for successful analysis of chemometric data. An important aspect of the results presented is that lack of variable selection can spoil the PLS regression, and that cross-validation measures using a test set can show larger variation, when we use different subsets of X, than...
Two Paradoxes in Linear Regression Analysis
FENG, Ge; PENG, Jing; TU, Dongke; ZHENG, Julia Z.; FENG, Changyong
2016-01-01
Summary Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection. PMID:28638214
Caudal regression syndrome : a case report
International Nuclear Information System (INIS)
Lee, Eun Joo; Kim, Hi Hye; Kim, Hyung Sik; Park, So Young; Han, Hye Young; Lee, Kwang Hun
1998-01-01
Caudal regression syndrome is a rare congenital anomaly, which results from a developmental failure of the caudal mesoderm during the fetal period. We present a case of caudal regression syndrome composed of a spectrum of anomalies including sirenomelia, dysplasia of the lower lumbar vertebrae, sacrum, coccyx and pelvic bones,genitourinary and anorectal anomalies, and dysplasia of the lung, as seen during infantography and MR imaging
Caudal regression syndrome : a case report
Energy Technology Data Exchange (ETDEWEB)
Lee, Eun Joo; Kim, Hi Hye; Kim, Hyung Sik; Park, So Young; Han, Hye Young; Lee, Kwang Hun [Chungang Gil Hospital, Incheon (Korea, Republic of)
1998-07-01
Caudal regression syndrome is a rare congenital anomaly, which results from a developmental failure of the caudal mesoderm during the fetal period. We present a case of caudal regression syndrome composed of a spectrum of anomalies including sirenomelia, dysplasia of the lower lumbar vertebrae, sacrum, coccyx and pelvic bones,genitourinary and anorectal anomalies, and dysplasia of the lung, as seen during infantography and MR imaging.
Panel Smooth Transition Regression Models
DEFF Research Database (Denmark)
González, Andrés; Terasvirta, Timo; Dijk, Dick van
We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Testing discontinuities in nonparametric regression
Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun
2017-01-01
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Logistic Regression: Concept and Application
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
International Nuclear Information System (INIS)
Leng Ling; Zhang Tianyi; Kleinman, Lawrence; Zhu Wei
2007-01-01
Regression analysis, especially the ordinary least squares method which assumes that errors are confined to the dependent variable, has seen a fair share of its applications in aerosol science. The ordinary least squares approach, however, could be problematic due to the fact that atmospheric data often does not lend itself to calling one variable independent and the other dependent. Errors often exist for both measurements. In this work, we examine two regression approaches available to accommodate this situation. They are orthogonal regression and geometric mean regression. Comparisons are made theoretically as well as numerically through an aerosol study examining whether the ratio of organic aerosol to CO would change with age
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
Optimized support vector regression for drilling rate of penetration estimation
Bodaghi, Asadollah; Ansari, Hamid Reza; Gholami, Mahsa
2015-12-01
In the petroleum industry, drilling optimization involves the selection of operating conditions for achieving the desired depth with the minimum expenditure while requirements of personal safety, environment protection, adequate information of penetrated formations and productivity are fulfilled. Since drilling optimization is highly dependent on the rate of penetration (ROP), estimation of this parameter is of great importance during well planning. In this research, a novel approach called `optimized support vector regression' is employed for making a formulation between input variables and ROP. Algorithms used for optimizing the support vector regression are the genetic algorithm (GA) and the cuckoo search algorithm (CS). Optimization implementation improved the support vector regression performance by virtue of selecting proper values for its parameters. In order to evaluate the ability of optimization algorithms in enhancing SVR performance, their results were compared to the hybrid of pattern search and grid search (HPG) which is conventionally employed for optimizing SVR. The results demonstrated that the CS algorithm achieved further improvement on prediction accuracy of SVR compared to the GA and HPG as well. Moreover, the predictive model derived from back propagation neural network (BPNN), which is the traditional approach for estimating ROP, is selected for comparisons with CSSVR. The comparative results revealed the superiority of CSSVR. This study inferred that CSSVR is a viable option for precise estimation of ROP.
Multivariate Regression of Liver on Intestine of Mice: A ...
African Journals Online (AJOL)
Multivariate Regression of Liver on Intestine of Mice: A Chemotherapeutic Evaluation of Plant ... Using an analysis of covariance model, the effects ... The findings revealed, with the aid of likelihood-ratio statistic, a marked improvement in
Spontaneous regression of metastases from malignant melanoma: a case report
DEFF Research Database (Denmark)
Kalialis, Louise V; Drzewiecki, Krzysztof T; Mohammadi, Mahin
2008-01-01
A case of a 61-year-old male with widespread metastatic melanoma is presented 5 years after complete spontaneous cure. Spontaneous regression occurred in cutaneous, pulmonary, hepatic and cerebral metastases. A review of the literature reveals seven cases of regression of cerebral metastases; thi...
Quantile Regression With Measurement Error
Wei, Ying; Carroll, Raymond J.
2009-01-01
. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a
From Rasch scores to regression
DEFF Research Database (Denmark)
Christensen, Karl Bang
2006-01-01
Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale propertie....... This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score....
Testing Heteroscedasticity in Robust Regression
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2011-01-01
Roč. 1, č. 4 (2011), s. 25-28 ISSN 2045-3345 Grant - others:GA ČR(CZ) GA402/09/0557 Institutional research plan: CEZ:AV0Z10300504 Keywords : robust regression * heteroscedasticity * regression quantiles * diagnostics Subject RIV: BB - Applied Statistics , Operational Research http://www.researchjournals.co.uk/documents/Vol4/06%20Kalina.pdf
Regression methods for medical research
Tai, Bee Choo
2013-01-01
Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demands an understanding of more complex and sophisticated analytic procedures.The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
Forecasting with Dynamic Regression Models
Pankratz, Alan
2012-01-01
One of the most widely used tools in statistical forecasting, single equation regression models is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the auto correlation patterns of regression disturbance. It also includes six case studies.
Discriminative Elastic-Net Regularized Linear Regression.
Zhang, Zheng; Lai, Zhihui; Xu, Yong; Shao, Ling; Wu, Jian; Xie, Guo-Sen
2017-03-01
In this paper, we aim at learning compact and discriminative linear regression models. Linear regression has been widely used in different problems. However, most of the existing linear regression methods exploit the conventional zero-one matrix as the regression targets, which greatly narrows the flexibility of the regression model. Another major limitation of these methods is that the learned projection matrix fails to precisely project the image features to the target space due to their weak discriminative capability. To this end, we present an elastic-net regularized linear regression (ENLR) framework, and develop two robust linear regression models which possess the following special characteristics. First, our methods exploit two particular strategies to enlarge the margins of different classes by relaxing the strict binary targets into a more feasible variable matrix. Second, a robust elastic-net regularization of singular values is introduced to enhance the compactness and effectiveness of the learned projection matrix. Third, the resulting optimization problem of ENLR has a closed-form solution in each iteration, which can be solved efficiently. Finally, rather than directly exploiting the projection matrix for recognition, our methods employ the transformed features as the new discriminate representations to make final image classification. Compared with the traditional linear regression model and some of its variants, our method is much more accurate in image classification. Extensive experiments conducted on publicly available data sets well demonstrate that the proposed framework can outperform the state-of-the-art methods. The MATLAB codes of our methods can be available at http://www.yongxu.org/lunwen.html.
Kornmann, Marko; Staib, Ludger; Wiegel, Thomas; Kron, Martina; Henne-Bruns, Doris; Link, Karl-Heinrich; Formentini, Andrea
2013-03-01
Two identical randomized controlled trials designed to optimize adjuvant treatment of colon cancer (CC) (n =855) and rectal cancer (RC) (n = 796) were performed. Long-term evaluation confirmed that the addition of folinic acid (FA) to 5-fluorouracil (5-FU) improved 7-year overall survival (OS) in CC but not in RC and revealed different patterns of recurrence in patients with CC and those with RC. Our aim was to compare long-term results of adjuvant treatment of colon cancer (CC) and rectal cancer (RC). Adjuvant chemotherapy of CC improved overall survival (OS), whereas that of RC remained at the level achieved by 5-fluorouracil (5-FU). We separately conducted 2 identically designed adjuvant trials in CC and RC. Patients were assigned to adjuvant chemotherapy with 5-FU alone, 5-FU + folinic acid (FA), or 5-FU + interferon-alfa. The first study enrolled patients with stage IIb/III CC, and the second study enrolled patients with stage II/III RC. All patients with RC received postoperative irradiation. Median follow-up for all patients with CC (n = 855) and RC (n = 796) was 4.9 years. The pattern and frequency of recurrence differed significantly, especially lung metastases, which occurred more frequently in RC (12.7%) than in CC (7.3%; P < .001). Seven-year OS rates for 5-FU, 5-FU + FA, and 5-FU + IFN-alfa were 54.1% (95% confidence interval [CI], 46.5-61.0), 66.8% (95% CI, 59.4-73.1), and 56.7% (95% CI, 49.3-63.4) in CC and 50.6% (95% CI, 43.0-57.7), 56.3% (95% CI, 49.4-62.7), and 54.8% (95% CI, 46.7-62.2) in RC, respectively. A subgroup analysis pointed to a reduced local recurrence (LR) rate and an increased OS by the addition of FA in stage II RC (n = 271) but not in stage III RC (n = 525). FA increased 7-year OS by 12.7 percentage points in CC but was not effective in RC. Based on these results and the pattern of metastases, our results suggest that the chemosensitivity of CC and RC may be different. Strategies different from those used in CC may be successful to
Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun
2014-12-01
Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.
Spontaneous regression of metastases from malignant melanoma: a case report
DEFF Research Database (Denmark)
Kalialis, Louise V; Drzewiecki, Krzysztof T; Mohammadi, Mahin
2008-01-01
A case of a 61-year-old male with widespread metastatic melanoma is presented 5 years after complete spontaneous cure. Spontaneous regression occurred in cutaneous, pulmonary, hepatic and cerebral metastases. A review of the literature reveals seven cases of regression of cerebral metastases......; this report is the first to document complete spontaneous regression of cerebral metastases from malignant melanoma by means of computed tomography scans. Spontaneous regression is defined as the partial or complete disappearance of a malignant tumour in the absence of all treatment or in the presence...
Acupuncture and Spontaneous Regression of a Radiculopathic Cervical Herniated Disc
Directory of Open Access Journals (Sweden)
Kim Sung-Ha
2012-06-01
Full Text Available The spontaneous regression of herniated cervical discs is not a well-established phenomenon. However, we encountered a case of a spontaneous regression of a severe radiculopathic herniated cervical disc that was treated with acupuncture, pharmacopuncture, and herb medicine. The symptoms were improved within 12 months of treatment. Magnetic resonance imaging (MRI conducted at that time revealed marked regression of the herniated disc. This case provides an additional example of spontaneous regression of a herniated cervical disc documented by MRI following non-surgical treatment.
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
Hossain, Md Golam; Saw, Aik; Alam, Rashidul; Ohtsuki, Fumio; Kamarul, Tunku
2013-09-01
Cephalic index (CI), the ratio of head breadth to head length, is widely used to categorise human populations. The aim of this study was to access the impact of anthropometric measurements on the CI of male Japanese university students. This study included 1,215 male university students from Tokyo and Kyoto, selected using convenient sampling. Multiple regression analysis was used to determine the effect of anthropometric measurements on CI. The variance inflation factor (VIF) showed no evidence of a multicollinearity problem among independent variables. The coefficients of the regression line demonstrated a significant positive relationship between CI and minimum frontal breadth (p regression analysis showed a greater likelihood for minimum frontal breadth (p regression analysis revealed bizygomatic breadth, head circumference, minimum frontal breadth, head height and morphological facial height to be the best predictor craniofacial measurements with respect to CI. The results suggest that most of the variables considered in this study appear to influence the CI of adult male Japanese students.
Producing The New Regressive Left
DEFF Research Database (Denmark)
Crone, Christine
members, this thesis investigates a growing political trend and ideological discourse in the Arab world that I have called The New Regressive Left. On the premise that a media outlet can function as a forum for ideology production, the thesis argues that an analysis of this material can help to trace...... the contexture of The New Regressive Left. If the first part of the thesis lays out the theoretical approach and draws the contextual framework, through an exploration of the surrounding Arab media-and ideoscapes, the second part is an analytical investigation of the discourse that permeates the programmes aired...... becomes clear from the analytical chapters is the emergence of the new cross-ideological alliance of The New Regressive Left. This emerging coalition between Shia Muslims, religious minorities, parts of the Arab Left, secular cultural producers, and the remnants of the political,strategic resistance...
A Matlab program for stepwise regression
Directory of Open Access Journals (Sweden)
Yanhong Qi
2016-03-01
Full Text Available The stepwise linear regression is a multi-variable regression for identifying statistically significant variables in the linear regression equation. In present study, we presented the Matlab program of stepwise regression.
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
Regression filter for signal resolution
International Nuclear Information System (INIS)
Matthes, W.
1975-01-01
The problem considered is that of resolving a measured pulse height spectrum of a material mixture, e.g. gamma ray spectrum, Raman spectrum, into a weighed sum of the spectra of the individual constituents. The model on which the analytical formulation is based is described. The problem reduces to that of a multiple linear regression. A stepwise linear regression procedure was constructed. The efficiency of this method was then tested by transforming the procedure in a computer programme which was used to unfold test spectra obtained by mixing some spectra, from a library of arbitrary chosen spectra, and adding a noise component. (U.K.)
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Regression Benchmarking: An Approach to Quality Assurance in Performance
Bulej, Lubomír
2005-01-01
The paper presents a short summary of our work in the area of regression benchmarking and its application to software development. Specially, we explain the concept of regression benchmarking, the requirements for employing regression testing in a software project, and methods used for analyzing the vast amounts of data resulting from repeated benchmarking. We present the application of regression benchmarking on a real software project and conclude with a glimpse at the challenges for the fu...
Panel data specifications in nonparametric kernel regression
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...
Influence diagnostics in meta-regression model.
Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua
2017-09-01
This paper studies the influence diagnostics in meta-regression model including case deletion diagnostic and local influence analysis. We derive the subset deletion formulae for the estimation of regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measure are defined. The local influence analysis based on case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme are explored. We introduce a method by simultaneous perturbing responses, covariate, and within-variance to obtain the local influence measure, which has an advantage of capable to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.
Principal component regression for crop yield estimation
Suryanarayana, T M V
2016-01-01
This book highlights the estimation of crop yield in Central Gujarat, especially with regard to the development of Multiple Regression Models and Principal Component Regression (PCR) models using climatological parameters as independent variables and crop yield as a dependent variable. It subsequently compares the multiple linear regression (MLR) and PCR results, and discusses the significance of PCR for crop yield estimation. In this context, the book also covers Principal Component Analysis (PCA), a statistical procedure used to reduce a number of correlated variables into a smaller number of uncorrelated variables called principal components (PC). This book will be helpful to the students and researchers, starting their works on climate and agriculture, mainly focussing on estimation models. The flow of chapters takes the readers in a smooth path, in understanding climate and weather and impact of climate change, and gradually proceeds towards downscaling techniques and then finally towards development of ...
Cactus: An Introduction to Regression
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
Regression Models for Repairable Systems
Czech Academy of Sciences Publication Activity Database
Novák, Petr
2015-01-01
Roč. 17, č. 4 (2015), s. 963-972 ISSN 1387-5841 Institutional support: RVO:67985556 Keywords : Reliability analysis * Repair models * Regression Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.782, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/novak-0450902.pdf
Survival analysis II: Cox regression
Stel, Vianda S.; Dekker, Friedo W.; Tripepi, Giovanni; Zoccali, Carmine; Jager, Kitty J.
2011-01-01
In contrast to the Kaplan-Meier method, Cox proportional hazards regression can provide an effect estimate by quantifying the difference in survival between patient groups and can adjust for confounding effects of other variables. The purpose of this article is to explain the basic concepts of the
Kernel regression with functional response
Ferraty, Frédéric; Laksaci, Ali; Tadj, Amel; Vieu, Philippe
2011-01-01
We consider kernel regression estimate when both the response variable and the explanatory one are functional. The rates of uniform almost complete convergence are stated as function of the small ball probability of the predictor and as function of the entropy of the set on which uniformity is obtained.
Chapuis, Aude G; Roberts, Ilana M; Thompson, John A; Margolin, Kim A; Bhatia, Shailender; Lee, Sylvia M; Sloan, Heather L; Lai, Ivy P; Farrar, Erik A; Wagener, Felecia; Shibuya, Kendall C; Cao, Jianhong; Wolchok, Jedd D; Greenberg, Philip D; Yee, Cassian
2016-11-01
Purpose Peripheral blood-derived antigen-specific cytotoxic T cells (CTLs) provide a readily available source of effector cells that can be administered with minimal toxicity in an outpatient setting. In metastatic melanoma, this approach results in measurable albeit modest clinical responses in patients resistant to conventional therapy. We reasoned that concurrent cytotoxic T-cell lymphocyte antigen-4 (CTLA-4) checkpoint blockade might enhance the antitumor activity of adoptively transferred CTLs. Patients and Methods Autologous MART1-specific CTLs were generated by priming with peptide-pulsed dendritic cells in the presence of interleukin-21 and enriched by peptide-major histocompatibility complex multimer-guided cell sorting. This expeditiously yielded polyclonal CTL lines uniformly expressing markers associated with an enhanced survival potential. In this first-in-human strategy, 10 patients with stage IV melanoma received the MART1-specific CTLs followed by a standard course of anti-CTLA-4 (ipilimumab). Results The toxicity profile of the combined treatment was comparable to that of ipilimumab monotherapy. Evaluation of best responses at 12 weeks yielded two continuous complete remissions, one partial response (PR) using RECIST criteria (two PRs using immune-related response criteria), and three instances of stable disease. Infused CTLs persisted with frequencies up to 2.9% of CD8 + T cells for as long as the patients were monitored (up to 40 weeks). In patients who experienced complete remissions, PRs, or stable disease, the persisting CTLs acquired phenotypic and functional characteristics of long-lived memory cells. Moreover, these patients also developed responses to nontargeted tumor antigens (epitope spreading). Conclusion We demonstrate that combining antigen-specific CTLs with CTLA-4 blockade is safe and produces durable clinical responses, likely reflecting both enhanced activity of transferred cells and improved recruitment of new responses
Chapuis, Aude G.; Roberts, Ilana M.; Thompson, John A.; Margolin, Kim A.; Bhatia, Shailender; Lee, Sylvia M.; Sloan, Heather L.; Lai, Ivy P.; Farrar, Erik A.; Wagener, Felecia; Shibuya, Kendall C.; Cao, Jianhong; Wolchok, Jedd D.; Greenberg, Philip D.
2016-01-01
Purpose Peripheral blood–derived antigen-specific cytotoxic T cells (CTLs) provide a readily available source of effector cells that can be administered with minimal toxicity in an outpatient setting. In metastatic melanoma, this approach results in measurable albeit modest clinical responses in patients resistant to conventional therapy. We reasoned that concurrent cytotoxic T-cell lymphocyte antigen-4 (CTLA-4) checkpoint blockade might enhance the antitumor activity of adoptively transferred CTLs. Patients and Methods Autologous MART1-specific CTLs were generated by priming with peptide-pulsed dendritic cells in the presence of interleukin-21 and enriched by peptide-major histocompatibility complex multimer-guided cell sorting. This expeditiously yielded polyclonal CTL lines uniformly expressing markers associated with an enhanced survival potential. In this first-in-human strategy, 10 patients with stage IV melanoma received the MART1-specific CTLs followed by a standard course of anti–CTLA-4 (ipilimumab). Results The toxicity profile of the combined treatment was comparable to that of ipilimumab monotherapy. Evaluation of best responses at 12 weeks yielded two continuous complete remissions, one partial response (PR) using RECIST criteria (two PRs using immune-related response criteria), and three instances of stable disease. Infused CTLs persisted with frequencies up to 2.9% of CD8+ T cells for as long as the patients were monitored (up to 40 weeks). In patients who experienced complete remissions, PRs, or stable disease, the persisting CTLs acquired phenotypic and functional characteristics of long-lived memory cells. Moreover, these patients also developed responses to nontargeted tumor antigens (epitope spreading). Conclusion We demonstrate that combining antigen-specific CTLs with CTLA-4 blockade is safe and produces durable clinical responses, likely reflecting both enhanced activity of transferred cells and improved recruitment of new responses
Poisson Mixture Regression Models for Heart Disease Prediction.
Mufudza, Chipo; Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model.
International Nuclear Information System (INIS)
Jafri, Y.Z.; Kamal, L.
2007-01-01
Various statistical techniques was used on five-year data from 1998-2002 of average humidity, rainfall, maximum and minimum temperatures, respectively. The relationships to regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit, to our polynomial regression analysis time series (PRATS). The correlation to multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rand correlation and Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, respectively which yielded homoscedasticity. We also employed Bartlett's test for homogeneity of variances on a five-year data of rainfall and humidity, respectively which showed that the variances in rainfall data were not homogenous while in case of humidity, were homogenous. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)
Energy Technology Data Exchange (ETDEWEB)
Lee, Hong Seok; Choi, Doo Ho; Park, Hee Chul; Park, Won; Yu, Jeong Il; Chung, Kwang Zoo [Dept. of Radiation Oncology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul (Korea, Republic of)
2016-09-15
To determine whether large rectal volume on planning computed tomography (CT) results in lower tumor regression grade (TRG) after neoadjuvant concurrent chemoradiotherapy (CCRT) in rectal cancer patients. We reviewed medical records of 113 patients treated with surgery following neoadjuvant CCRT for rectal cancer between January and December 2012. Rectal volume was contoured on axial images in which gross tumor volume was included. Average axial rectal area (ARA) was defined as rectal volume divided by longitudinal tumor length. The impact of rectal volume and ARA on TRG was assessed. Average rectal volume and ARA were 11.3 mL and 2.9 cm². After completion of neoadjuvant CCRT in 113 patients, pathologic results revealed total regression (TRG 4) in 28 patients (25%), good regression (TRG 3) in 25 patients (22%), moderate regression (TRG 2) in 34 patients (30%), minor regression (TRG 1) in 24 patients (21%), and no regression (TRG0) in 2 patients (2%). No difference of rectal volume and ARA was found between each TRG groups. Linear correlation existed between rectal volume and TRG (p = 0.036) but not between ARA and TRG (p = 0.058). Rectal volume on planning CT has no significance on TRG in patients receiving neoadjuvant CCRT for rectal cancer. These results indicate that maintaining minimal rectal volume before each treatment may not be necessary.
Spatial Regression and Prediction of Water Quality in a Watershed with Complex Pollution Sources.
Yang, Xiaoying; Liu, Qun; Luo, Xingzhang; Zheng, Zheng
2017-08-16
Fast economic development, burgeoning population growth, and rapid urbanization have led to complex pollution sources contributing to water quality deterioration simultaneously in many developing countries including China. This paper explored the use of spatial regression to evaluate the impacts of watershed characteristics on ambient total nitrogen (TN) concentration in a heavily polluted watershed and make predictions across the region. Regression results have confirmed the substantial impact on TN concentration by a variety of point and non-point pollution sources. In addition, spatial regression has yielded better performance than ordinary regression in predicting TN concentrations. Due to its best performance in cross-validation, the river distance based spatial regression model was used to predict TN concentrations across the watershed. The prediction results have revealed a distinct pattern in the spatial distribution of TN concentrations and identified three critical sub-regions in priority for reducing TN loads. Our study results have indicated that spatial regression could potentially serve as an effective tool to facilitate water pollution control in watersheds under diverse physical and socio-economical conditions.
International Nuclear Information System (INIS)
Lee, Hong Seok; Choi, Doo Ho; Park, Hee Chul; Park, Won; Yu, Jeong Il; Chung, Kwang Zoo
2016-01-01
To determine whether large rectal volume on planning computed tomography (CT) results in lower tumor regression grade (TRG) after neoadjuvant concurrent chemoradiotherapy (CCRT) in rectal cancer patients. We reviewed medical records of 113 patients treated with surgery following neoadjuvant CCRT for rectal cancer between January and December 2012. Rectal volume was contoured on axial images in which gross tumor volume was included. Average axial rectal area (ARA) was defined as rectal volume divided by longitudinal tumor length. The impact of rectal volume and ARA on TRG was assessed. Average rectal volume and ARA were 11.3 mL and 2.9 cm². After completion of neoadjuvant CCRT in 113 patients, pathologic results revealed total regression (TRG 4) in 28 patients (25%), good regression (TRG 3) in 25 patients (22%), moderate regression (TRG 2) in 34 patients (30%), minor regression (TRG 1) in 24 patients (21%), and no regression (TRG0) in 2 patients (2%). No difference of rectal volume and ARA was found between each TRG groups. Linear correlation existed between rectal volume and TRG (p = 0.036) but not between ARA and TRG (p = 0.058). Rectal volume on planning CT has no significance on TRG in patients receiving neoadjuvant CCRT for rectal cancer. These results indicate that maintaining minimal rectal volume before each treatment may not be necessary
Logistic regression applied to natural hazards: rare event logistic regression with replications
Guns, M.; Vanacker, Veerle
2012-01-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logisti...
Spontaneous regression of intracranial malignant lymphoma
International Nuclear Information System (INIS)
Kojo, Nobuto; Tokutomi, Takashi; Eguchi, Gihachirou; Takagi, Shigeyuki; Matsumoto, Tomie; Sasaguri, Yasuyuki; Shigemori, Minoru.
1988-01-01
In a 46-year-old female with a 1-month history of gait and speech disturbances, computed tomography (CT) demonstrated mass lesions of slightly high density in the left basal ganglia and left frontal lobe. The lesions were markedly enhanced by contrast medium. The patient received no specific treatment, but her clinical manifestations gradually abated and the lesions decreased in size. Five months after her initial examination, the lesions were absent on CT scans; only a small area of low density remained. Residual clinical symptoms included mild right hemiparesis and aphasia. After 14 months the patient again deteriorated, and a CT scan revealed mass lesions in the right frontal lobe and the pons. However, no enhancement was observed in the previously affected regions. A biopsy revealed malignant lymphoma. Despite treatment with steroids and radiation, the patient's clinical status progressively worsened and she died 27 months after initial presentation. Seven other cases of spontaneous regression of primary malignant lymphoma have been reported. In this case, the mechanism of the spontaneous regression was not clear, but changes in immunologic status may have been involved. (author)
Quantile Regression With Measurement Error
Wei, Ying
2009-08-27
Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. © 2009 American Statistical Association.
Multivariate and semiparametric kernel regression
Härdle, Wolfgang; Müller, Marlene
1997-01-01
The paper gives an introduction to theory and application of multivariate and semiparametric kernel smoothing. Multivariate nonparametric density estimation is an often used pilot tool for examining the structure of data. Regression smoothing helps in investigating the association between covariates and responses. We concentrate on kernel smoothing using local polynomial fitting which includes the Nadaraya-Watson estimator. Some theory on the asymptotic behavior and bandwidth selection is pro...
Regression algorithm for emotion detection
Berthelon , Franck; Sander , Peter
2013-01-01
International audience; We present here two components of a computational system for emotion detection. PEMs (Personalized Emotion Maps) store links between bodily expressions and emotion values, and are individually calibrated to capture each person's emotion profile. They are an implementation based on aspects of Scherer's theoretical complex system model of emotion~\\cite{scherer00, scherer09}. We also present a regression algorithm that determines a person's emotional feeling from sensor m...
Directional quantile regression in R
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2017-01-01
Roč. 53, č. 3 (2017), s. 480-492 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : multivariate quantile * regression quantile * halfspace depth * depth contour Subject RIV: BD - Theory of Information OBOR OECD: Applied mathematics Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2017/SI/bocek-0476587.pdf
Directory of Open Access Journals (Sweden)
Andrew J Parker
2014-04-01
Full Text Available The power and significance of artwork in shaping human cognition is self-evident. The starting point for our empirical investigations is the view that the task of neuroscience is to integrate itself with other forms of knowledge, rather than to seek to supplant them. In our recent work, we examined a particular aspect of the appreciation of artwork using present-day functional magnetic resonance imaging (fMRI. Our results emphasised the continuity between viewing artwork and other human cognitive activities. We also showed that appreciation of a particular aspect of artwork, namely authenticity, depends upon the co-ordinated activity between the brain regions involved in multiple decision making and those responsible for processing visual information. The findings about brain function probably have no specific consequences for understanding how people respond to the art of Rembrandt in comparison with their response to other artworks. However, the use of images of Rembrandt’s portraits, his most intimate and personal works, clearly had a significant impact upon our viewers, even though they have been spatially confined to the interior of an MRI scanner at the time of viewing. Neuroscientific studies of humans viewing artwork have the capacity to reveal the diversity of human cognitive responses that may be induced by external advice or context as people view artwork in a variety of frameworks and settings.
Gaussian Process Regression Model in Spatial Logistic Regression
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.
Few crystal balls are crystal clear : eyeballing regression
International Nuclear Information System (INIS)
Wittebrood, R.T.
1998-01-01
The theory of regression and statistical analysis as it applies to reservoir analysis was discussed. It was argued that regression lines are not always the final truth. It was suggested that regression lines and eyeballed lines are often equally accurate. The many conditions that must be fulfilled to calculate a proper regression were discussed. Mentioned among these conditions were the distribution of the data, hidden variables, knowledge of how the data was obtained, the need for causal correlation of the variables, and knowledge of the manner in which the regression results are going to be used. 1 tab., 13 figs
Piecewise linear regression splines with hyperbolic covariates
International Nuclear Information System (INIS)
Cologne, John B.; Sposto, Richard
1992-09-01
Consider the problem of fitting a curve to data that exhibit a multiphase linear response with smooth transitions between phases. We propose substituting hyperbolas as covariates in piecewise linear regression splines to obtain curves that are smoothly joined. The method provides an intuitive and easy way to extend the two-phase linear hyperbolic response model of Griffiths and Miller and Watts and Bacon to accommodate more than two linear segments. The resulting regression spline with hyperbolic covariates may be fit by nonlinear regression methods to estimate the degree of curvature between adjoining linear segments. The added complexity of fitting nonlinear, as opposed to linear, regression models is not great. The extra effort is particularly worthwhile when investigators are unwilling to assume that the slope of the response changes abruptly at the join points. We can also estimate the join points (the values of the abscissas where the linear segments would intersect if extrapolated) if their number and approximate locations may be presumed known. An example using data on changing age at menarche in a cohort of Japanese women illustrates the use of the method for exploratory data analysis. (author)
Targeting: Logistic Regression, Special Cases and Extensions
Directory of Open Access Journals (Sweden)
Helmut Schaeben
2014-12-01
Full Text Available Logistic regression is a classical linear model for logit-transformed conditional probabilities of a binary target variable. It recovers the true conditional probabilities if the joint distribution of predictors and the target is of log-linear form. Weights-of-evidence is an ordinary logistic regression with parameters equal to the differences of the weights of evidence if all predictor variables are discrete and conditionally independent given the target variable. The hypothesis of conditional independence can be tested in terms of log-linear models. If the assumption of conditional independence is violated, the application of weights-of-evidence does not only corrupt the predicted conditional probabilities, but also their rank transform. Logistic regression models, including the interaction terms, can account for the lack of conditional independence, appropriate interaction terms compensate exactly for violations of conditional independence. Multilayer artificial neural nets may be seen as nested regression-like models, with some sigmoidal activation function. Most often, the logistic function is used as the activation function. If the net topology, i.e., its control, is sufficiently versatile to mimic interaction terms, artificial neural nets are able to account for violations of conditional independence and yield very similar results. Weights-of-evidence cannot reasonably include interaction terms; subsequent modifications of the weights, as often suggested, cannot emulate the effect of interaction terms.
Vlasova, Tatiana
2010-05-01
Socially-oriented Observations (SOO) in the Russian North have been carried out within multidisciplinary IPY PPS Arctic project under the leadership of Norway and supported by the Research Council of Norway as well as Russian Academy of Sciences. The main objective of SOO is to increase knowledge and observation of changes in quality of life conditions (state of natural environment including climate and biota, safe drinking water and foods, well-being, employment, social relations, access to health care and high quality education, etc.) and - to reveal trends in human capital and capacities (health, demography, education, creativity, spiritual-cultural characteristics and diversity, participation in decision making, etc.). SOO have been carried out in industrial cities as well as sparsely populated rural and nature protection areas in observation sites situated in different bioms (from coastal tundra to southern taiga zone) of Murmansk, Arkhangelsk Oblast and Republic of Komi. SOO were conducted according to the international protocol included in PPS Arctic Manual. SOO approaches based both on local people's perceptions and statistics help to identify main issues and targets for life quality, human capital and environment improvement and thus to distinguish leading SOO indicators for further monitoring. SOO have revealed close interaction between human resources, quality of life and environmental changes. Negative changes in human capital (depopulation, increasing unemployment, aging, declining physical and mental health, quality of education, loss of traditional knowledge, marginalization etc.), despite peoples' high creativity and optimism are becoming the major driving force effecting both the quality of life and the state of environment and overall sustainability. Human induced disturbances such as uncontrolled forests cuttings and poaching are increasing. Observed rapid changes in climate and biota (ice and permafrost melting, tundra shrubs getting taller and
Song, Qifa; Yang, Yuanbin; Lin, Wenping; Yi, Bo; Xu, Guozhang
2017-09-25
We aimed to describe the molecular epidemiological characteristics and clinical treatment outcome of typhoid fever in Ningbo, China during 2005-2014. Eighty-eight Salmonella Typhi isolates were obtained from 307 hospitalized patients. Three prevalent pulsed-field gel electrophoresis (PFGE) patterns of 54 isolates from 3 outbreaks were identified. Overall, there were 64 (72.7%) isolates from clustered cases and 24 (27.3%) isolates from sporadic cases. Resistance to nalidixic acid (NAL) (n = 47; 53.4%) and ampicillin (AMP) (n = 40; 45.4%) and rare resistance to tetracycline (TET) (n = 2; 2.3%) and gentamicin (GEN) (n = 2; 2.3%) were observed. No isolates resistant to cefotaxime (CTX), chloramphenicol (CL), ciprofloxacin (CIP), and trimethoprim-sulfamethoxazole (SXT) were found. The occurrence of reduced sensitivity to CIP was 52.3% (n = 46). The medians of fever clearance time in cases with and without complications were 7 (interquartile range (IQR): 4-10) and 5 (IQR: 3-7) days (P = 0.001), respectively, when patients were treated with CIP or levofloxacin (LEV) and/or third-generation cephalosporins (CEP). Rates of serious complications were at low levels: peritonitis (2.3%), intestinal hemorrhage (6.8%), and intestinal perforation (1.1%). The present study revealed a long-term clustering trend with respect to PFGE patterns, occasional outbreaks, and the rapid spread of AMP resistance and decreased CIP susceptibility among S. Typhi isolates in recent years.
Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon
2015-01-01
Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.
Regression: The Apple Does Not Fall Far From the Tree.
Vetter, Thomas R; Schober, Patrick
2018-05-15
Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
No rationale for 1 variable per 10 events criterion for binary logistic regression analysis
Directory of Open Access Journals (Sweden)
Maarten van Smeden
2016-11-01
Full Text Available Abstract Background Ten events per variable (EPV is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. Methods The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth’s correction, are compared. Results The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect (‘separation’. We reveal that different approaches for identifying and handling separation leads to substantially different simulation results. We further show that Firth’s correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. Conclusions The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
Spontaneous regression of pulmonary bullae
International Nuclear Information System (INIS)
Satoh, H.; Ishikawa, H.; Ohtsuka, M.; Sekizawa, K.
2002-01-01
The natural history of pulmonary bullae is often characterized by gradual, progressive enlargement. Spontaneous regression of bullae is, however, very rare. We report a case in which complete resolution of pulmonary bullae in the left upper lung occurred spontaneously. The management of pulmonary bullae is occasionally made difficult because of gradual progressive enlargement associated with abnormal pulmonary function. Some patients have multiple bulla in both lungs and/or have a history of pulmonary emphysema. Others have a giant bulla without emphysematous change in the lungs. Our present case had treated lung cancer with no evidence of local recurrence. He had no emphysematous change in lung function test and had no complaints, although the high resolution CT scan shows evidence of underlying minimal changes of emphysema. Ortin and Gurney presented three cases of spontaneous reduction in size of bulla. Interestingly, one of them had a marked decrease in the size of a bulla in association with thickening of the wall of the bulla, which was observed in our patient. This case we describe is of interest, not only because of the rarity with which regression of pulmonary bulla has been reported in the literature, but also because of the spontaneous improvements in the radiological picture in the absence of overt infection or tumor. Copyright (2002) Blackwell Science Pty Ltd
Quantum algorithm for linear regression
Wang, Guoming
2017-07-01
We present a quantum algorithm for fitting a linear regression model to a given data set using the least-squares approach. Differently from previous algorithms which yield a quantum state encoding the optimal parameters, our algorithm outputs these numbers in the classical form. So by running it once, one completely determines the fitted model and then can use it to make predictions on new data at little cost. Moreover, our algorithm works in the standard oracle model, and can handle data sets with nonsparse design matrices. It runs in time poly( log2(N ) ,d ,κ ,1 /ɛ ) , where N is the size of the data set, d is the number of adjustable parameters, κ is the condition number of the design matrix, and ɛ is the desired precision in the output. We also show that the polynomial dependence on d and κ is necessary. Thus, our algorithm cannot be significantly improved. Furthermore, we also give a quantum algorithm that estimates the quality of the least-squares fit (without computing its parameters explicitly). This algorithm runs faster than the one for finding this fit, and can be used to check whether the given data set qualifies for linear regression in the first place.
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Linear regression and the normality assumption.
Schmidt, Amand F; Finan, Chris
2017-12-16
Researchers often perform arbitrary outcome transformations to fulfill the normality assumption of a linear regression model. This commentary explains and illustrates that in large data settings, such transformations are often unnecessary, and worse may bias model estimates. Linear regression assumptions are illustrated using simulated data and an empirical example on the relation between time since type 2 diabetes diagnosis and glycated hemoglobin levels. Simulation results were evaluated on coverage; i.e., the number of times the 95% confidence interval included the true slope coefficient. Although outcome transformations bias point estimates, violations of the normality assumption in linear regression analyses do not. The normality assumption is necessary to unbiasedly estimate standard errors, and hence confidence intervals and P-values. However, in large sample sizes (e.g., where the number of observations per variable is >10) violations of this normality assumption often do not noticeably impact results. Contrary to this, assumptions on, the parametric model, absence of extreme observations, homoscedasticity, and independency of the errors, remain influential even in large sample size settings. Given that modern healthcare research typically includes thousands of subjects focusing on the normality assumption is often unnecessary, does not guarantee valid results, and worse may bias estimates due to the practice of outcome transformations. Copyright © 2017 Elsevier Inc. All rights reserved.
Ridge regression estimator: combining unbiased and ordinary ridge regression methods of estimation
Directory of Open Access Journals (Sweden)
Sharad Damodar Gore
2009-10-01
Full Text Available Statistical literature has several methods for coping with multicollinearity. This paper introduces a new shrinkage estimator, called modified unbiased ridge (MUR. This estimator is obtained from unbiased ridge regression (URR in the same way that ordinary ridge regression (ORR is obtained from ordinary least squares (OLS. Properties of MUR are derived. Results on its matrix mean squared error (MMSE are obtained. MUR is compared with ORR and URR in terms of MMSE. These results are illustrated with an example based on data generated by Hoerl and Kennard (1975.
Prediction, Regression and Critical Realism
DEFF Research Database (Denmark)
Næss, Petter
2004-01-01
This paper considers the possibility of prediction in land use planning, and the use of statistical research methods in analyses of relationships between urban form and travel behaviour. Influential writers within the tradition of critical realism reject the possibility of predicting social...... phenomena. This position is fundamentally problematic to public planning. Without at least some ability to predict the likely consequences of different proposals, the justification for public sector intervention into market mechanisms will be frail. Statistical methods like regression analyses are commonly...... seen as necessary in order to identify aggregate level effects of policy measures, but are questioned by many advocates of critical realist ontology. Using research into the relationship between urban structure and travel as an example, the paper discusses relevant research methods and the kinds...
On Weighted Support Vector Regression
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... shrinks the coefficient of each observation in the estimated functions; thus, it is widely used for minimizing influence of outliers. We propose to additionally add weights to the slack variables in the constraints (CF‐weights) and call the combination of weights the doubly weighted SVR. We illustrate...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...
Duc, Pierre-Alain; Cuillandre, Jean-Charles; Serra, Paolo; Michel-Dansac, Leo; Ferriere, Etienne; Alatalo, Katherine; Blitz, Leo; Bois, Maxime; Bournaud, Frédéric; Bureau, Martin; Cappellari, Michele; Davies, Roger L.; Davis, Timothy A.; de Zeeuw, P. T.; Emsellem, Eric; Khochfar, Sadegh; Krajnović, Davor; Kuntschner, Harald; Lablanche, Pierre-Yves; McDermid, Richard M.; Morganti, Raffaella; Naab, Thorsten; Oosterloo, Tom; Sarzi, Marc; Scott, Nicholas; Weijmans, Anne-Marie; Young, Lisa M.
2011-10-01
The mass assembly of galaxies leaves imprints in their outskirts, such as shells and tidal tails. The frequency and properties of such fine structures depend on the main acting mechanisms - secular evolution, minor or major mergers - and on the age of the last substantial accretion event. We use this to constrain the mass assembly history of two apparently relaxed nearby early-type galaxies (ETGs) selected from the ATLAS3D sample, NGC 680 and 5557. Our ultra-deep optical images obtained with MegaCam on the Canada-France-Hawaii Telescope reach 29 mag arcsec-2 in the g band. They reveal very low surface brightness (LSB) filamentary structures around these ellipticals. Among them, a gigantic 160 kpc long, narrow, tail east of NGC 5557 hosts three gas-rich star-forming objects, previously detected in H I with the Westerbork Synthesis Radio Telescope and in UV with GALEX. NGC 680 exhibits two major diffuse plumes apparently connected to extended H I tails, as well as a series of arcs and shells. Comparing the outer stellar and gaseous morphology of the two ellipticals with that predicted from models of colliding galaxies, we argue that the LSB features are tidal debris and that each of these two ETGs was assembled during a relatively recent, major wet merger, which most likely occurred after the redshift z ≃ 0.5 epoch. Had these mergers been older, the tidal features should have already fallen back or be destroyed by more recent accretion events. However, the absence of molecular gas and of a prominent young stellar population in the core region of the galaxies indicates that the merger is at least 1-2 Gyr old: the memory of any merger-triggered nuclear starburst has indeed been lost. The star-forming objects found towards the collisional debris of NGC 5557 are then likely tidal dwarf galaxies. Such recycled galaxies here appear to be long-lived and continue to form stars while any star formation activity has stopped in their parent galaxy. The inner kinematics of NGC
No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.
van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B
2016-11-24
Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation leads to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
Modeling oil production based on symbolic regression
International Nuclear Information System (INIS)
Yang, Guangfei; Li, Xianneng; Wang, Jianliang; Lian, Lian; Ma, Tieju
2015-01-01
Numerous models have been proposed to forecast the future trends of oil production and almost all of them are based on some predefined assumptions with various uncertainties. In this study, we propose a novel data-driven approach that uses symbolic regression to model oil production. We validate our approach on both synthetic and real data, and the results prove that symbolic regression could effectively identify the true models beneath the oil production data and also make reliable predictions. Symbolic regression indicates that world oil production will peak in 2021, which broadly agrees with other techniques used by researchers. Our results also show that the rate of decline after the peak is almost half the rate of increase before the peak, and it takes nearly 12 years to drop 4% from the peak. These predictions are more optimistic than those in several other reports, and the smoother decline will provide the world, especially the developing countries, with more time to orchestrate mitigation plans. -- Highlights: •A data-driven approach has been shown to be effective at modeling the oil production. •The Hubbert model could be discovered automatically from data. •The peak of world oil production is predicted to appear in 2021. •The decline rate after peak is half of the increase rate before peak. •Oil production projected to decline 4% post-peak
Image superresolution using support vector regression.
Ni, Karl S; Nguyen, Truong Q
2007-06-01
A thorough investigation of the application of support vector regression (SVR) to the superresolution problem is conducted through various frameworks. Prior to the study, the SVR problem is enhanced by finding the optimal kernel. This is done by formulating the kernel learning problem in SVR form as a convex optimization problem, specifically a semi-definite programming (SDP) problem. An additional constraint is added to reduce the SDP to a quadratically constrained quadratic programming (QCQP) problem. After this optimization, investigation of the relevancy of SVR to superresolution proceeds with the possibility of using a single and general support vector regression for all image content, and the results are impressive for small training sets. This idea is improved upon by observing structural properties in the discrete cosine transform (DCT) domain to aid in learning the regression. Further improvement involves a combination of classification and SVR-based techniques, extending works in resolution synthesis. This method, termed kernel resolution synthesis, uses specific regressors for isolated image content to describe the domain through a partitioned look of the vector space, thereby yielding good results.
Directory of Open Access Journals (Sweden)
Horst Entorf
2015-07-01
Full Text Available Two alternative hypotheses – referred to as opportunity- and stigma-based behavior – suggest that the magnitude of the link between unemployment and crime also depends on preexisting local crime levels. In order to analyze conjectured nonlinearities between both variables, we use quantile regressions applied to German district panel data. While both conventional OLS and quantile regressions confirm the positive link between unemployment and crime for property crimes, results for assault differ with respect to the method of estimation. Whereas conventional mean regressions do not show any significant effect (which would confirm the usual result found for violent crimes in the literature, quantile regression reveals that size and importance of the relationship are conditional on the crime rate. The partial effect is significantly positive for moderately low and median quantiles of local assault rates.
Robust Regression and its Application in Financial Data Analysis
Mansoor Momeni; Mahmoud Dehghan Nayeri; Ali Faal Ghayoumi; Hoda Ghorbani
2010-01-01
This research is aimed to describe the application of robust regression and its advantages over the least square regression method in analyzing financial data. To do this, relationship between earning per share, book value of equity per share and share price as price model and earning per share, annual change of earning per share and return of stock as return model is discussed using both robust and least square regressions, and finally the outcomes are compared. Comparing the results from th...
Berman, Elizabeth
1979-01-01
Mathematics Revealed focuses on the principles, processes, operations, and exercises in mathematics.The book first offers information on whole numbers, fractions, and decimals and percents. Discussions focus on measuring length, percent, decimals, numbers as products, addition and subtraction of fractions, mixed numbers and ratios, division of fractions, addition, subtraction, multiplication, and division. The text then examines positive and negative numbers and powers and computation. Topics include division and averages, multiplication, ratios, and measurements, scientific notation and estim
Directory of Open Access Journals (Sweden)
Hong-Juan Li
2013-04-01
Full Text Available Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by strong non-linear learning capability of support vector regression (SVR, this paper presents a SVR model hybridized with the empirical mode decomposition (EMD method and auto regression (AR for electric load forecasting. The electric load data of the New South Wales (Australia market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
Gorgees, HazimMansoor; Mahdi, FatimahAssim
2018-05-01
This article concerns with comparing the performance of different types of ordinary ridge regression estimators that have been already proposed to estimate the regression parameters when the near exact linear relationships among the explanatory variables is presented. For this situations we employ the data obtained from tagi gas filling company during the period (2008-2010). The main result we reached is that the method based on the condition number performs better than other methods since it has smaller mean square error (MSE) than the other stated methods.
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Face Alignment via Regressing Local Binary Features.
Ren, Shaoqing; Cao, Xudong; Wei, Yichen; Sun, Jian
2016-03-01
This paper presents a highly efficient and accurate regression approach for face alignment. Our approach has two novel components: 1) a set of local binary features and 2) a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. This approach achieves the state-of-the-art results when tested on the most challenging benchmarks to date. Furthermore, because extracting and regressing local binary features are computationally very cheap, our system is much faster than previous methods. It achieves over 3000 frames per second (FPS) on a desktop or 300 FPS on a mobile phone for locating a few dozens of landmarks. We also study a key issue that is important but has received little attention in the previous research, which is the face detector used to initialize alignment. We investigate several face detectors and perform quantitative evaluation on how they affect alignment accuracy. We find that an alignment friendly detector can further greatly boost the accuracy of our alignment method, reducing the error up to 16% relatively. To facilitate practical usage of face detection/alignment methods, we also propose a convenient metric to measure how good a detector is for alignment initialization.
Geographically weighted regression model on poverty indicator
Slamet, I.; Nugroho, N. F. T. A.; Muslich
2017-12-01
In this research, we applied geographically weighted regression (GWR) for analyzing the poverty in Central Java. We consider Gaussian Kernel as weighted function. The GWR uses the diagonal matrix resulted from calculating kernel Gaussian function as a weighted function in the regression model. The kernel weights is used to handle spatial effects on the data so that a model can be obtained for each location. The purpose of this paper is to model of poverty percentage data in Central Java province using GWR with Gaussian kernel weighted function and to determine the influencing factors in each regency/city in Central Java province. Based on the research, we obtained geographically weighted regression model with Gaussian kernel weighted function on poverty percentage data in Central Java province. We found that percentage of population working as farmers, population growth rate, percentage of households with regular sanitation, and BPJS beneficiaries are the variables that affect the percentage of poverty in Central Java province. In this research, we found the determination coefficient R2 are 68.64%. There are two categories of district which are influenced by different of significance factors.
General regression and representation model for classification.
Directory of Open Access Journals (Sweden)
Jianjun Qian
Full Text Available Recently, the regularized coding-based classification methods (e.g. SRC and CRC show a great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR for classification. GRR not only has advantages of CRC, but also takes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients and the specific information (weight matrix of image pixels to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel weights of the test sample. With the proposed model as a platform, we design two classifiers: basic general regression and representation classifier (B-GRR and robust general regression and representation classifier (R-GRR. The experimental results demonstrate the performance advantages of proposed methods over state-of-the-art algorithms.
Credit Scoring Problem Based on Regression Analysis
Khassawneh, Bashar Suhil Jad Allah
2014-01-01
ABSTRACT: This thesis provides an explanatory introduction to the regression models of data mining and contains basic definitions of key terms in the linear, multiple and logistic regression models. Meanwhile, the aim of this study is to illustrate fitting models for the credit scoring problem using simple linear, multiple linear and logistic regression models and also to analyze the found model functions by statistical tools. Keywords: Data mining, linear regression, logistic regression....
International Nuclear Information System (INIS)
Hong, W.-C.
2009-01-01
Accurate forecasting of electric load has always been the most important issues in the electricity industry, particularly for developing countries. Due to the various influences, electric load forecasting reveals highly nonlinear characteristics. Recently, support vector regression (SVR), with nonlinear mapping capabilities of forecasting, has been successfully employed to solve nonlinear regression and time series problems. However, it is still lack of systematic approaches to determine appropriate parameter combination for a SVR model. This investigation elucidates the feasibility of applying chaotic particle swarm optimization (CPSO) algorithm to choose the suitable parameter combination for a SVR model. The empirical results reveal that the proposed model outperforms the other two models applying other algorithms, genetic algorithm (GA) and simulated annealing algorithm (SA). Finally, it also provides the theoretical exploration of the electric load forecasting support system (ELFSS)
Directory of Open Access Journals (Sweden)
Reyna Maya-García
2017-02-01
Full Text Available Mammillaria pectinifera is an endemic, short-globose cactus species, included in the IUCN list as a threatened species with only 18 remaining populations in the Tehuacán-Cuicatlán Valley in central Mexico. We evaluated the population genetic diversity and structure, connectivity, recent bottlenecks and population size, using nuclear microsatellites. M. pectinifera showed high genetic diversity but some evidence of heterozygote deficiency (FIS, recent bottlenecks in some populations and reductions in population size. Also, we found low population genetic differentiation and high values of connectivity for M. pectinifera, as the result of historical events of gene flow through pollen and seed dispersal. M. pectinifera occurs in sites with some degree of disturbance leading to the isolation of its populations and decreasing the levels of gene flow among them. Excessive deforestation also changes the original vegetation damaging the natural habitats. This species will become extinct if it is not properly preserved. Furthermore, this species has some ecological features that make them more vulnerable to disturbance such as a very low growth rates and long life cycles. We suggest in situ conservation to prevent the decrease of population sizes and loss of genetic diversity in the natural protected areas such as the Tehuacán-Cuicatlán Biosphere Reserve. In addition, a long-term ex situ conservation program is need to construct seed banks, and optimize seed germination and plant establishment protocols that restore disturbed habitats. Furthermore, creating a supply of living plants for trade is critical to avoid further extraction of plants from nature.
Mönch, Sabine; Netzel, Michael; Netzel, Gabriele; Ott, Undine; Frank, Thomas; Rychlik, Michael
2016-01-01
Different dietary sources of folate have differing bioavailabilities, which may affect their nutritional "value." In order to examine if these differences also occur within the same food products, a short-term human pilot study was undertaken as a follow-up study to a previously published human trial to evaluate the relative native folate bioavailabilities from low-fat Camembert cheese compared to pteroylmonoglutamic acid as the reference dose. Two healthy human subjects received the test foods in a randomized cross-over design separated by a 14-day equilibrium phase. Folate body pools were saturated with a pteroylmonoglutamic acid supplement before the first testing and between the testings. Folates in test foods and blood plasma were analyzed by stable isotope dilution assays. The biokinetic parameters C max, t max, and area under the curve (AUC) were determined in plasma within the interval of 0-12 h. When comparing the ratio estimates of AUC and C max for the different Camembert cheeses, a higher bioavailability was found for the low-fat Camembert assessed in the present study (≥64%) compared to a different brand in our previous investigation (8.8%). It is suggested that these differences may arise from the different folate distribution in the soft dough and firm rind as well as differing individual folate vitamer proportions. The results clearly underline the importance of the food matrix, even within the same type of food product, in terms of folate bioavailability. Moreover, our findings add to the increasing number of studies questioning the general assumption of 50% bioavailability as the rationale behind the definition of folate equivalents. However, more research is needed to better understand the interactions between individual folate vitamers and other food components and the potential impact on folate bioavailability and metabolism.
Directory of Open Access Journals (Sweden)
Sabine eMönch
2016-04-01
Full Text Available Different dietary sources of folate have differing bioavailabilities which may affect their nutritional value. In order to examine if these differences also occur within the same food products, a short term human pilot study was undertaken as a follow-up study to a previously published human trial to evaluate the relative native folate bioavailabilities from low-fat Camembert cheese compared to pteroylmonoglutamic acid as the reference dose. Two healthy human subjects received the test foods in a randomized cross-over design separated by a 14-day equilibrium phase. Folate body pools were saturated with a pteroylmonoglutamic acid supplement before the first testing and between the testings. Folates in test foods and blood plasma were analysed by stable isotope dilution assays. The biokinetic parameters Cmax, tmax and AUC were determined in plasma within the interval of 0 to 12 hours. When comparing the ratio estimates of AUC and Cmax for the different Camembert cheeses, a higher bioavailability was found for the low-fat Camembert assessed in the present study (≥64% compared to a different brand in our previous investigation (8.8%. It is suggested that these differences may arise from the different folate distribution in the soft dough and firm rind as well as differing individual folate vitamer proportions. The results clearly underline the importance of the food matrix, even within the same type of food product, in terms of folate bioavailability. Moreover, our findings add to the increasing number of studies questioning the general assumption of 50 % bioavailability as the rationale behind the definition of folate equivalents. However, more research is needed to better understand the interactions between individual folate vitamers and other food components and the potential impact on folate bioavailability and metabolism.
Logistic regression applied to natural hazards: rare event logistic regression with replications
Directory of Open Access Journals (Sweden)
M. Guns
2012-06-01
Full Text Available Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
Logistic regression applied to natural hazards: rare event logistic regression with replications
Guns, M.; Vanacker, V.
2012-06-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
Regularized Label Relaxation Linear Regression.
Fang, Xiaozhao; Xu, Yong; Li, Xuelong; Lai, Zhihui; Wong, Wai Keung; Fang, Bingwu
2018-04-01
Linear regression (LR) and some of its variants have been widely used for classification problems. Most of these methods assume that during the learning phase, the training samples can be exactly transformed into a strict binary label matrix, which has too little freedom to fit the labels adequately. To address this problem, in this paper, we propose a novel regularized label relaxation LR method, which has the following notable characteristics. First, the proposed method relaxes the strict binary label matrix into a slack variable matrix by introducing a nonnegative label relaxation matrix into LR, which provides more freedom to fit the labels and simultaneously enlarges the margins between different classes as much as possible. Second, the proposed method constructs the class compactness graph based on manifold learning and uses it as the regularization item to avoid the problem of overfitting. The class compactness graph is used to ensure that the samples sharing the same labels can be kept close after they are transformed. Two different algorithms, which are, respectively, based on -norm and -norm loss functions are devised. These two algorithms have compact closed-form solutions in each iteration so that they are easily implemented. Extensive experiments show that these two algorithms outperform the state-of-the-art algorithms in terms of the classification accuracy and running time.
Smith, Paul F; Ganesh, Siva; Liu, Ping
2013-10-30
Regression is a common statistical tool for prediction in neuroscience. However, linear regression is by far the most common form of regression used, with regression trees receiving comparatively little attention. In this study, the results of conventional multiple linear regression (MLR) were compared with those of random forest regression (RFR), in the prediction of the concentrations of 9 neurochemicals in the vestibular nucleus complex and cerebellum that are part of the l-arginine biochemical pathway (agmatine, putrescine, spermidine, spermine, l-arginine, l-ornithine, l-citrulline, glutamate and γ-aminobutyric acid (GABA)). The R(2) values for the MLRs were higher than the proportion of variance explained values for the RFRs: 6/9 of them were ≥ 0.70 compared to 4/9 for RFRs. Even the variables that had the lowest R(2) values for the MLRs, e.g. ornithine (0.50) and glutamate (0.61), had much lower proportion of variance explained values for the RFRs (0.27 and 0.49, respectively). The RSE values for the MLRs were lower than those for the RFRs in all but two cases. In general, MLRs seemed to be superior to the RFRs in terms of predictive value and error. In the case of this data set, MLR appeared to be superior to RFR in terms of its explanatory value and error. This result suggests that MLR may have advantages over RFR for prediction in neuroscience with this kind of data set, but that RFR can still have good predictive value in some cases. Copyright © 2013 Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
Débora Silva Oliveira
2013-03-01
Full Text Available Investigaram-se indicadores de regressão e crescimento do primogênito no processo de tornar-se irmão. Participaram três primogênitos pré-escolares no terceiro trimestre de gestação, aos 12 e 24 meses do irmão. Foi aplicado o Teste das Fábulas e realizada análise qualitativa de conteúdo. Os resultados revelaram regressão do primogênito na gestação materna e crescimento, aos 12 e aos 24 meses de idade do irmão. A regressão foi uma forma de enfrentar a chegada do irmão, enquanto que o crescimento revelou capacidade para conquistas ou custos de ser mais velho. Tanto a regressão quanto o crescimento oportunizaram um ir e vir saudável, fundamental para o desenvolvimento rumo à independência. Esses achados têm implicações para a pesquisa e para a clínica.Regression and growth indicators in the process of becoming a sibling were investigated. Three firstborns took part in the study during the first sibling's third trimester of pregnancy, and when the sibling was 12 and 24 months old, respectively. The Fables Test was used and a qualitative content analysis was carried out. Results revealed regression indicators during pregnancy. At 12 and 24 months there were growth indicators together with regression indicators. Regression was used by the firstborn for coping with the sibling's arrival while growth revealed the capacity for acquisitions or the costs of being an older sibling. Both regressive and growth manifestations enabled a healthy to and fro, which is fundamental for development towards independence. These findings have both research and clinical implications.
Meaney, Christopher; Moineddin, Rahim
2014-01-24
In biomedical research, response variables are often encountered which have bounded support on the open unit interval--(0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies may include: beta regression, variable-dispersion beta regression, and fractional logit regression models. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model to that of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models. In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided. If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25) linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed if the
Spontaneous regression of metastases from melanoma: review of the literature
DEFF Research Database (Denmark)
Kalialis, Louise Vennegaard; Drzewiecki, Krzysztof T; Klyver, Helle
2009-01-01
Regression of metastatic melanoma is a rare event, and review of the literature reveals a total of 76 reported cases since 1866. The proposed mechanisms include immunologic, endocrine, inflammatory and metastatic tumour nutritional factors. We conclude from this review that although the precise...
Bayesian logistic regression in detection of gene-steroid interaction for cancer at PDLIM5 locus.
Wang, Ke-Sheng; Owusu, Daniel; Pan, Yue; Xie, Changchun
2016-06-01
The PDZ and LIM domain 5 (PDLIM5) gene may play a role in cancer, bipolar disorder, major depression, alcohol dependence and schizophrenia; however, little is known about the interaction effect of steroid and PDLIM5 gene on cancer. This study examined 47 single-nucleotide polymorphisms (SNPs) within the PDLIM5 gene in the Marshfield sample with 716 cancer patients (any diagnosed cancer, excluding minor skin cancer) and 2848 noncancer controls. Multiple logistic regression model in PLINK software was used to examine the association of each SNP with cancer. Bayesian logistic regression in PROC GENMOD in SAS statistical software, ver. 9.4 was used to detect gene- steroid interactions influencing cancer. Single marker analysis using PLINK identified 12 SNPs associated with cancer (Plogistic regression in PROC GENMOD showed that both rs6532496 and rs951613 revealed strong gene-steroid interaction effects (OR=2.18, 95% CI=1.31-3.63 with P = 2.9 × 10⁻³ for rs6532496 and OR=2.07, 95% CI=1.24-3.45 with P = 5.43 × 10⁻³ for rs951613, respectively). Results from Bayesian logistic regression showed stronger interaction effects (OR=2.26, 95% CI=1.2-3.38 for rs6532496 and OR=2.14, 95% CI=1.14-3.2 for rs951613, respectively). All the 12 SNPs associated with cancer revealed significant gene-steroid interaction effects (P logistic regression and OR=2.59, 95% CI=1.4-3.97 from Bayesian logistic regression; respectively). This study provides evidence of common genetic variants within the PDLIM5 gene and interactions between PLDIM5 gene polymorphisms and steroid use influencing cancer.
Testing the equality of nonparametric regression curves based on ...
African Journals Online (AJOL)
Abstract. In this work we propose a new methodology for the comparison of two regression functions f1 and f2 in the case of homoscedastic error structure and a fixed design. Our approach is based on the empirical Fourier coefficients of the regression functions f1 and f2 respectively. As our main results we obtain the ...
A Methodology for Generating Placement Rules that Utilizes Logistic Regression
Wurtz, Keith
2008-01-01
The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…
Bayesian regression of piecewise homogeneous Poisson processes
Directory of Open Access Journals (Sweden)
Diego Sevilla
2015-12-01
Full Text Available In this paper, a Bayesian method for piecewise regression is adapted to handle counting processes data distributed as Poisson. A numerical code in Mathematica is developed and tested analyzing simulated data. The resulting method is valuable for detecting breaking points in the count rate of time series for Poisson processes. Received: 2 November 2015, Accepted: 27 November 2015; Edited by: R. Dickman; Reviewed by: M. Hutter, Australian National University, Canberra, Australia.; DOI: http://dx.doi.org/10.4279/PIP.070018 Cite as: D J R Sevilla, Papers in Physics 7, 070018 (2015
SPE dose prediction using locally weighted regression
International Nuclear Information System (INIS)
Hines, J. W.; Townsend, L. W.; Nichols, T. F.
2005-01-01
When astronauts are outside earth's protective magnetosphere, they are subject to large radiation doses resulting from solar particle events (SPEs). The total dose received from a major SPE in deep space could cause severe radiation poisoning. The dose is usually received over a 20-40 h time interval but the event's effects may be mitigated with an early warning system. This paper presents a method to predict the total dose early in the event. It uses a locally weighted regression model, which is easier to train and provides predictions as accurate as neural network models previously used. (authors)
SPE dose prediction using locally weighted regression
International Nuclear Information System (INIS)
Hines, J. W.; Townsend, L. W.; Nichols, T. F.
2005-01-01
When astronauts are outside Earth's protective magnetosphere, they are subject to large radiation doses resulting from solar particle events. The total dose received from a major solar particle event in deep space could cause severe radiation poisoning. The dose is usually received over a 20-40 h time interval but the event's effects may be reduced with an early warning system. This paper presents a method to predict the total dose early in the event. It uses a locally weighted regression model, which is easier to train, and provides predictions as accurate as the neural network models that were used previously. (authors)
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces all indices of multicollinearity diagnoses, the basic principle of principal component regression and determination of 'best' equation method. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0: including all calculating processes of the principal component regression and all operations of linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. The principal component regression analysis can be used to overcome disturbance of the multicollinearity. The simplified, speeded up and accurate statistical effect is reached through the principal component regression analysis with SPSS.
Levine, Matthew E; Albers, David J; Hripcsak, George
2016-01-01
Time series analysis methods have been shown to reveal clinical and biological associations in data collected in the electronic health record. We wish to develop reliable high-throughput methods for identifying adverse drug effects that are easy to implement and produce readily interpretable results. To move toward this goal, we used univariate and multivariate lagged regression models to investigate associations between twenty pairs of drug orders and laboratory measurements. Multivariate lagged regression models exhibited higher sensitivity and specificity than univariate lagged regression in the 20 examples, and incorporating autoregressive terms for labs and drugs produced more robust signals in cases of known associations among the 20 example pairings. Moreover, including inpatient admission terms in the model attenuated the signals for some cases of unlikely associations, demonstrating how multivariate lagged regression models' explicit handling of context-based variables can provide a simple way to probe for health-care processes that confound analyses of EHR data.
A flexible fuzzy regression algorithm for forecasting oil consumption estimation
International Nuclear Information System (INIS)
Azadeh, A.; Khakestani, M.; Saberi, M.
2009-01-01
Oil consumption plays a vital role in socio-economic development of most countries. This study presents a flexible fuzzy regression algorithm for forecasting oil consumption based on standard economic indicators. The standard indicators are annual population, cost of crude oil import, gross domestic production (GDP) and annual oil production in the last period. The proposed algorithm uses analysis of variance (ANOVA) to select either fuzzy regression or conventional regression for future demand estimation. The significance of the proposed algorithm is three fold. First, it is flexible and identifies the best model based on the results of ANOVA and minimum absolute percentage error (MAPE), whereas previous studies consider the best fitted fuzzy regression model based on MAPE or other relative error results. Second, the proposed model may identify conventional regression as the best model for future oil consumption forecasting because of its dynamic structure, whereas previous studies assume that fuzzy regression always provide the best solutions and estimation. Third, it utilizes the most standard independent variables for the regression models. To show the applicability and superiority of the proposed flexible fuzzy regression algorithm the data for oil consumption in Canada, United States, Japan and Australia from 1990 to 2005 are used. The results show that the flexible algorithm provides accurate solution for oil consumption estimation problem. The algorithm may be used by policy makers to accurately foresee the behavior of oil consumption in various regions.
Regression away from the mean: Theory and examples.
Schwarz, Wolf; Reike, Dennis
2018-02-01
Using a standard repeated measures model with arbitrary true score distribution and normal error variables, we present some fundamental closed-form results which explicitly indicate the conditions under which regression effects towards (RTM) and away from the mean are expected. Specifically, we show that for skewed and bimodal distributions many or even most cases will show a regression effect that is in expectation away from the mean, or that is not just towards but actually beyond the mean. We illustrate our results in quantitative detail with typical examples from experimental and biometric applications, which exhibit a clear regression away from the mean ('egression from the mean') signature. We aim not to repeal cautionary advice against potential RTM effects, but to present a balanced view of regression effects, based on a clear identification of the conditions governing the form that regression effects take in repeated measures designs. © 2017 The British Psychological Society.
Unbalanced Regressions and the Predictive Equation
DEFF Research Database (Denmark)
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoreti......Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness...
Semiparametric regression during 2003–2007
Ruppert, David; Wand, M.P.; Carroll, Raymond J.
2009-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.
Gaussian process regression analysis for functional data
Shi, Jian Qing
2011-01-01
Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically the methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables.Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high dime
Significance testing in ridge regression for genetic data
Directory of Open Access Journals (Sweden)
De Iorio Maria
2011-09-01
Full Text Available Abstract Background Technological developments have increased the feasibility of large scale genetic association studies. Densely typed genetic markers are obtained using SNP arrays, next-generation sequencing technologies and imputation. However, SNPs typed using these methods can be highly correlated due to linkage disequilibrium among them, and standard multiple regression techniques fail with these data sets due to their high dimensionality and correlation structure. There has been increasing interest in using penalised regression in the analysis of high dimensional data. Ridge regression is one such penalised regression technique which does not perform variable selection, instead estimating a regression coefficient for each predictor variable. It is therefore desirable to obtain an estimate of the significance of each ridge regression coefficient. Results We develop and evaluate a test of significance for ridge regression coefficients. Using simulation studies, we demonstrate that the performance of the test is comparable to that of a permutation test, with the advantage of a much-reduced computational cost. We introduce the p-value trace, a plot of the negative logarithm of the p-values of ridge regression coefficients with increasing shrinkage parameter, which enables the visualisation of the change in p-value of the regression coefficients with increasing penalisation. We apply the proposed method to a lung cancer case-control data set from EPIC, the European Prospective Investigation into Cancer and Nutrition. Conclusions The proposed test is a useful alternative to a permutation test for the estimation of the significance of ridge regression coefficients, at a much-reduced computational cost. The p-value trace is an informative graphical tool for evaluating the results of a test of significance of ridge regression coefficients as the shrinkage parameter increases, and the proposed test makes its production computationally feasible.
Standards for Standardized Logistic Regression Coefficients
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
A Seemingly Unrelated Poisson Regression Model
King, Gary
1989-01-01
This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.
Directory of Open Access Journals (Sweden)
Qiutong Jin
2016-06-01
Full Text Available Estimating the spatial distribution of precipitation is an important and challenging task in hydrology, climatology, ecology, and environmental science. In order to generate a highly accurate distribution map of average annual precipitation for the Loess Plateau in China, multiple linear regression Kriging (MLRK and geographically weighted regression Kriging (GWRK methods were employed using precipitation data from the period 1980–2010 from 435 meteorological stations. The predictors in regression Kriging were selected by stepwise regression analysis from many auxiliary environmental factors, such as elevation (DEM, normalized difference vegetation index (NDVI, solar radiation, slope, and aspect. All predictor distribution maps had a 500 m spatial resolution. Validation precipitation data from 130 hydrometeorological stations were used to assess the prediction accuracies of the MLRK and GWRK approaches. Results showed that both prediction maps with a 500 m spatial resolution interpolated by MLRK and GWRK had a high accuracy and captured detailed spatial distribution data; however, MLRK produced a lower prediction error and a higher variance explanation than GWRK, although the differences were small, in contrast to conclusions from similar studies.
Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan
2011-11-01
To Eexplore the application of negative binomial regression and modified Poisson regression analysis in analyzing the influential factors for injury frequency and the risk factors leading to the increase of injury frequency. 2917 primary and secondary school students were selected from Hefei by cluster random sampling method and surveyed by questionnaire. The data on the count event-based injuries used to fitted modified Poisson regression and negative binomial regression model. The risk factors incurring the increase of unintentional injury frequency for juvenile students was explored, so as to probe the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model existed over-dispersion (P Poisson regression and negative binomial regression model, was fitted better. respectively. Both showed that male gender, younger age, father working outside of the hometown, the level of the guardian being above junior high school and smoking might be the results of higher injury frequencies. On a tendency of clustered frequency data on injury event, both the modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better and this model could give a more accurate interpretation of relevant factors affecting the frequency of injury.
Moderation analysis using a two-level regression model.
Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott
2014-10-01
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
Regression with Sparse Approximations of Data
DEFF Research Database (Denmark)
Noorzad, Pardis; Sturm, Bob L.
2012-01-01
We propose sparse approximation weighted regression (SPARROW), a method for local estimation of the regression function that uses sparse approximation with a dictionary of measurements. SPARROW estimates the regression function at a point with a linear combination of a few regressands selected...... by a sparse approximation of the point in terms of the regressors. We show SPARROW can be considered a variant of \\(k\\)-nearest neighbors regression (\\(k\\)-NNR), and more generally, local polynomial kernel regression. Unlike \\(k\\)-NNR, however, SPARROW can adapt the number of regressors to use based...
Spontaneous regression of a congenital melanocytic nevus
Directory of Open Access Journals (Sweden)
Amiya Kumar Nath
2011-01-01
Full Text Available Congenital melanocytic nevus (CMN may rarely regress which may also be associated with a halo or vitiligo. We describe a 10-year-old girl who presented with CMN on the left leg since birth, which recently started to regress spontaneously with associated depigmentation in the lesion and at a distant site. Dermoscopy performed at different sites of the regressing lesion demonstrated loss of epidermal pigments first followed by loss of dermal pigments. Histopathology and Masson-Fontana stain demonstrated lymphocytic infiltration and loss of pigment production in the regressing area. Immunohistochemistry staining (S100 and HMB-45, however, showed that nevus cells were present in the regressing areas.
Mapping geogenic radon potential by regression kriging
Energy Technology Data Exchange (ETDEWEB)
Pásztor, László [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Szabó, Katalin Zsuzsanna, E-mail: sz_k_zs@yahoo.de [Department of Chemistry, Institute of Environmental Science, Szent István University, Páter Károly u. 1, Gödöllő 2100 (Hungary); Szatmári, Gábor; Laborczi, Annamária [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Horváth, Ákos [Department of Atomic Physics, Eötvös University, Pázmány Péter sétány 1/A, 1117 Budapest (Hungary)
2016-02-15
Radon ({sup 222}Rn) gas is produced in the radioactive decay chain of uranium ({sup 238}U) which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on the physical and meteorological parameters of the soil and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and is characterized by geogenic radon potential (GRP). Identification of areas with high health risks require spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors were taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. - Highlights: • A new method
Mapping geogenic radon potential by regression kriging
International Nuclear Information System (INIS)
Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos
2016-01-01
Radon ( 222 Rn) gas is produced in the radioactive decay chain of uranium ( 238 U) which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on the physical and meteorological parameters of the soil and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and is characterized by geogenic radon potential (GRP). Identification of areas with high health risks require spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors were taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. - Highlights: • A new method, regression
Regression calibration with more surrogates than mismeasured variables
Kipnis, Victor
2012-06-29
In a recent paper (Weller EA, Milton DK, Eisen EA, Spiegelman D. Regression calibration for logistic regression with multiple surrogates for one exposure. Journal of Statistical Planning and Inference 2007; 137: 449-461), the authors discussed fitting logistic regression models when a scalar main explanatory variable is measured with error by several surrogates, that is, a situation with more surrogates than variables measured with error. They compared two methods of adjusting for measurement error using a regression calibration approximate model as if it were exact. One is the standard regression calibration approach consisting of substituting an estimated conditional expectation of the true covariate given observed data in the logistic regression. The other is a novel two-stage approach when the logistic regression is fitted to multiple surrogates, and then a linear combination of estimated slopes is formed as the estimate of interest. Applying estimated asymptotic variances for both methods in a single data set with some sensitivity analysis, the authors asserted superiority of their two-stage approach. We investigate this claim in some detail. A troubling aspect of the proposed two-stage method is that, unlike standard regression calibration and a natural form of maximum likelihood, the resulting estimates are not invariant to reparameterization of nuisance parameters in the model. We show, however, that, under the regression calibration approximation, the two-stage method is asymptotically equivalent to a maximum likelihood formulation, and is therefore in theory superior to standard regression calibration. However, our extensive finite-sample simulations in the practically important parameter space where the regression calibration model provides a good approximation failed to uncover such superiority of the two-stage method. We also discuss extensions to different data structures.
Regression calibration with more surrogates than mismeasured variables
Kipnis, Victor; Midthune, Douglas; Freedman, Laurence S.; Carroll, Raymond J.
2012-01-01
In a recent paper (Weller EA, Milton DK, Eisen EA, Spiegelman D. Regression calibration for logistic regression with multiple surrogates for one exposure. Journal of Statistical Planning and Inference 2007; 137: 449-461), the authors discussed fitting logistic regression models when a scalar main explanatory variable is measured with error by several surrogates, that is, a situation with more surrogates than variables measured with error. They compared two methods of adjusting for measurement error using a regression calibration approximate model as if it were exact. One is the standard regression calibration approach consisting of substituting an estimated conditional expectation of the true covariate given observed data in the logistic regression. The other is a novel two-stage approach when the logistic regression is fitted to multiple surrogates, and then a linear combination of estimated slopes is formed as the estimate of interest. Applying estimated asymptotic variances for both methods in a single data set with some sensitivity analysis, the authors asserted superiority of their two-stage approach. We investigate this claim in some detail. A troubling aspect of the proposed two-stage method is that, unlike standard regression calibration and a natural form of maximum likelihood, the resulting estimates are not invariant to reparameterization of nuisance parameters in the model. We show, however, that, under the regression calibration approximation, the two-stage method is asymptotically equivalent to a maximum likelihood formulation, and is therefore in theory superior to standard regression calibration. However, our extensive finite-sample simulations in the practically important parameter space where the regression calibration model provides a good approximation failed to uncover such superiority of the two-stage method. We also discuss extensions to different data structures.
BANK FAILURE PREDICTION WITH LOGISTIC REGRESSION
Directory of Open Access Journals (Sweden)
Taha Zaghdoudi
2013-04-01
Full Text Available In recent years the economic and financial world is shaken by a wave of financial crisis and resulted in violent bank fairly huge losses. Several authors have focused on the study of the crises in order to develop an early warning model. It is in the same path that our work takes its inspiration. Indeed, we have tried to develop a predictive model of Tunisian bank failures with the contribution of the binary logistic regression method. The specificity of our prediction model is that it takes into account microeconomic indicators of bank failures. The results obtained using our provisional model show that a bank's ability to repay its debt, the coefficient of banking operations, bank profitability per employee and leverage financial ratio has a negative impact on the probability of failure.
Regression testing in the TOTEM DCS
International Nuclear Information System (INIS)
Rodríguez, F Lucas; Atanassov, I; Burkimsher, P; Frost, O; Taskinen, J; Tulimaki, V
2012-01-01
The Detector Control System of the TOTEM experiment at the LHC is built with the industrial product WinCC OA (PVSS). The TOTEM system is generated automatically through scripts using as input the detector Product Breakdown Structure (PBS) structure and its pinout connectivity, archiving and alarm metainformation, and some other heuristics based on the naming conventions. When those initial parameters and automation code are modified to include new features, the resulting PVSS system can also introduce side-effects. On a daily basis, a custom developed regression testing tool takes the most recent code from a Subversion (SVN) repository and builds a new control system from scratch. This system is exported in plain text format using the PVSS export tool, and compared with a system previously validated by a human. A report is sent to the developers with any differences highlighted, in readiness for validation and acceptance as a new stable version. This regression approach is not dependent on any development framework or methodology. This process has been satisfactory during several months, proving to be a very valuable tool before deploying new versions in the production systems.
Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model
Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami
2017-06-01
A regression model is the representation of relationship between independent variable and dependent variable. The dependent variable has categories used in the logistic regression model to calculate odds on. The logistic regression model for dependent variable has levels in the logistics regression model is ordinal. GWOLR model is an ordinal logistic regression model influenced the geographical location of the observation site. Parameters estimation in the model needed to determine the value of a population based on sample. The purpose of this research is to parameters estimation of GWOLR model using R software. Parameter estimation uses the data amount of dengue fever patients in Semarang City. Observation units used are 144 villages in Semarang City. The results of research get GWOLR model locally for each village and to know probability of number dengue fever patient categories.
Short-term load forecasting with increment regression tree
Energy Technology Data Exchange (ETDEWEB)
Yang, Jingfei; Stenzel, Juergen [Darmstadt University of Techonology, Darmstadt 64283 (Germany)
2006-06-15
This paper presents a new regression tree method for short-term load forecasting. Both increment and non-increment tree are built according to the historical data to provide the data space partition and input variable selection. Support vector machine is employed to the samples of regression tree nodes for further fine regression. Results of different tree nodes are integrated through weighted average method to obtain the comprehensive forecasting result. The effectiveness of the proposed method is demonstrated through its application to an actual system. (author)
International Nuclear Information System (INIS)
Lee, Seung-Yong; Lee, Sang-In; Hwang, Byoung-chul
2016-01-01
In this study, the effect of microstructural factors on the impact toughness of hypoeutectoid steels with ferrite-pearlite structure was quantitatively investigated using multiple regression analysis. Microstructural analysis results showed that the pearlite fraction increased with increasing austenitizing temperature and decreasing transformation temperature which substantially decreased the pearlite interlamellar spacing and cementite thickness depending on carbon content. The impact toughness of hypoeutectoid steels usually increased as interlamellar spacing or cementite thickness decreased, although the impact toughness was largely associated with pearlite fraction. Based on these results, multiple regression analysis was performed to understand the individual effect of pearlite fraction, interlamellar spacing, and cementite thickness on the impact toughness. The regression analysis results revealed that pearlite fraction significantly affected impact toughness at room temperature, while cementite thickness did at low temperature.
Poisson Mixture Regression Models for Heart Disease Prediction
Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model. PMID:27999611
Prenatal diagnosis of Caudal Regression Syndrome : a case report
Directory of Open Access Journals (Sweden)
Celikaslan Nurgul
2001-12-01
Full Text Available Abstract Background Caudal regression is a rare syndrome which has a spectrum of congenital malformations ranging from simple anal atresia to absence of sacral, lumbar and possibly lower thoracic vertebrae, to the most severe form which is known as sirenomelia. Maternal diabetes, genetic predisposition and vascular hypoperfusion have been suggested as possible causative factors. Case presentation We report a case of caudal regression syndrome diagnosed in utero at 22 weeks' of gestation. Prenatal ultrasound examination revealed a sudden interruption of the spine and "frog-like" position of lower limbs. Termination of pregnancy and autopsy findings confirmed the diagnosis. Conclusion Prenatal ultrasonographic diagnosis of caudal regression syndrome is possible at 22 weeks' of gestation by ultrasound examination.
Logistic regression against a divergent Bayesian network
Directory of Open Access Journals (Sweden)
Noel Antonio Sánchez Trujillo
2015-01-01
Full Text Available This article is a discussion about two statistical tools used for prediction and causality assessment: logistic regression and Bayesian networks. Using data of a simulated example from a study assessing factors that might predict pulmonary emphysema (where fingertip pigmentation and smoking are considered; we posed the following questions. Is pigmentation a confounding, causal or predictive factor? Is there perhaps another factor, like smoking, that confounds? Is there a synergy between pigmentation and smoking? The results, in terms of prediction, are similar with the two techniques; regarding causation, differences arise. We conclude that, in decision-making, the sum of both: a statistical tool, used with common sense, and previous evidence, taking years or even centuries to develop; is better than the automatic and exclusive use of statistical resources.
Entrepreneurial intention modeling using hierarchical multiple regression
Directory of Open Access Journals (Sweden)
Marina Jeger
2014-12-01
Full Text Available The goal of this study is to identify the contribution of effectuation dimensions to the predictive power of the entrepreneurial intention model over and above that which can be accounted for by other predictors selected and confirmed in previous studies. As is often the case in social and behavioral studies, some variables are likely to be highly correlated with each other. Therefore, the relative amount of variance in the criterion variable explained by each of the predictors depends on several factors such as the order of variable entry and sample specifics. The results show the modest predictive power of two dimensions of effectuation prior to the introduction of the theory of planned behavior elements. The article highlights the main advantages of applying hierarchical regression in social sciences as well as in the specific context of entrepreneurial intention formation, and addresses some of the potential pitfalls that this type of analysis entails.
Gaussian process regression for geometry optimization
Denzel, Alexander; Kästner, Johannes
2018-03-01
We implemented a geometry optimizer based on Gaussian process regression (GPR) to find minimum structures on potential energy surfaces. We tested both a two times differentiable form of the Matérn kernel and the squared exponential kernel. The Matérn kernel performs much better. We give a detailed description of the optimization procedures. These include overshooting the step resulting from GPR in order to obtain a higher degree of interpolation vs. extrapolation. In a benchmark against the Limited-memory Broyden-Fletcher-Goldfarb-Shanno optimizer of the DL-FIND library on 26 test systems, we found the new optimizer to generally reduce the number of required optimization steps.
Statistical learning from a regression perspective
Berk, Richard A
2016-01-01
This textbook considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. This fully revised new edition includes important developments over the past 8 years. Consistent with modern data analytics, it emphasizes that a proper statistical learning data analysis derives from sound data collection, intelligent data management, appropriate statistical procedures, and an accessible interpretation of results. A continued emphasis on the implications for practice runs through the text. Among the statistical learning procedures examined are bagging, random forests, boosting, support vector machines and neural networks. Response variables may be quantitative or categorical. As in the first edition, a unifying theme is supervised learning that can be trea...
Applied regression analysis a research tool
Pantula, Sastry; Dickey, David
1998-01-01
Least squares estimation, when used appropriately, is a powerful research tool. A deeper understanding of the regression concepts is essential for achieving optimal benefits from a least squares analysis. This book builds on the fundamentals of statistical methods and provides appropriate concepts that will allow a scientist to use least squares as an effective research tool. Applied Regression Analysis is aimed at the scientist who wishes to gain a working knowledge of regression analysis. The basic purpose of this book is to develop an understanding of least squares and related statistical methods without becoming excessively mathematical. It is the outgrowth of more than 30 years of consulting experience with scientists and many years of teaching an applied regression course to graduate students. Applied Regression Analysis serves as an excellent text for a service course on regression for non-statisticians and as a reference for researchers. It also provides a bridge between a two-semester introduction to...
Regression models of reactor diagnostic signals
International Nuclear Information System (INIS)
Vavrin, J.
1989-01-01
The application is described of an autoregression model as the simplest regression model of diagnostic signals in experimental analysis of diagnostic systems, in in-service monitoring of normal and anomalous conditions and their diagnostics. The method of diagnostics is described using a regression type diagnostic data base and regression spectral diagnostics. The diagnostics is described of neutron noise signals from anomalous modes in the experimental fuel assembly of a reactor. (author)
Bulcock, J. W.
The problem of model estimation when the data are collinear was examined. Though the ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…
Directory of Open Access Journals (Sweden)
Are Hugo Pripp
Full Text Available Chronic subdural hematoma (CSDH is characterized by an "old" encapsulated collection of blood and blood breakdown products between the brain and its outermost covering (the dura. Recognized risk factors for development of CSDH are head injury, old age and using anticoagulation medication, but its underlying pathophysiological processes are still unclear. It is assumed that a complex local process of interrelated mechanisms including inflammation, neomembrane formation, angiogenesis and fibrinolysis could be related to its development and propagation. However, the association between the biomarkers of inflammation and angiogenesis, and the clinical and radiological characteristics of CSDH patients, need further investigation. The high number of biomarkers compared to the number of observations, the correlation between biomarkers, missing data and skewed distributions may limit the usefulness of classical statistical methods. We therefore explored lasso regression to assess the association between 30 biomarkers of inflammation and angiogenesis at the site of lesions, and selected clinical and radiological characteristics in a cohort of 93 patients. Lasso regression performs both variable selection and regularization to improve the predictive accuracy and interpretability of the statistical model. The results from the lasso regression showed analysis exhibited lack of robust statistical association between the biomarkers in hematoma fluid with age, gender, brain infarct, neurological deficiencies and volume of hematoma. However, there were associations between several of the biomarkers with postoperative recurrence requiring reoperation. The statistical analysis with lasso regression supported previous findings that the immunological characteristics of CSDH are local. The relationship between biomarkers, the radiological appearance of lesions and recurrence requiring reoperation have been inclusive using classical statistical methods on these data
Multivariate Regression Analysis and Slaughter Livestock,
AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS , ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
[From clinical judgment to linear regression model.
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as it's first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and it is equivalent to "Y" value when "X" equals 0 and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases in one unit. In the regression line, "b" is called regression coefficient. The coefficient of determination (R 2 ) indicates the importance of independent variables in the outcome.
Regression modeling methods, theory, and computation with SAS
Panik, Michael
2009-01-01
Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs.The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,
Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki
2017-05-01
This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria which is an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
Collaborative regression-based anatomical landmark detection
International Nuclear Information System (INIS)
Gao, Yaozong; Shen, Dinggang
2015-01-01
Anatomical landmark detection plays an important role in medical image analysis, e.g. for registration, segmentation and quantitative analysis. Among the various existing methods for landmark detection, regression-based methods have recently attracted much attention due to their robustness and efficiency. In these methods, landmarks are localised through voting from all image voxels, which is completely different from the classification-based methods that use voxel-wise classification to detect landmarks. Despite their robustness, the accuracy of regression-based landmark detection methods is often limited due to (1) the inclusion of uninformative image voxels in the voting procedure, and (2) the lack of effective ways to incorporate inter-landmark spatial dependency into the detection step. In this paper, we propose a collaborative landmark detection framework to address these limitations. The concept of collaboration is reflected in two aspects. (1) Multi-resolution collaboration. A multi-resolution strategy is proposed to hierarchically localise landmarks by gradually excluding uninformative votes from faraway voxels. Moreover, for informative voxels near the landmark, a spherical sampling strategy is also designed at the training stage to improve their prediction accuracy. (2) Inter-landmark collaboration. A confidence-based landmark detection strategy is proposed to improve the detection accuracy of ‘difficult-to-detect’ landmarks by using spatial guidance from ‘easy-to-detect’ landmarks. To evaluate our method, we conducted experiments extensively on three datasets for detecting prostate landmarks and head and neck landmarks in computed tomography images, and also dental landmarks in cone beam computed tomography images. The results show the effectiveness of our collaborative landmark detection framework in improving landmark detection accuracy, compared to other state-of-the-art methods. (paper)
Impact of multicollinearity on small sample hydrologic regression models
Kroll, Charles N.; Song, Peter
2013-06-01
Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R2). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely on model predictions, is it recommended that OLS be employed since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
Use of probabilistic weights to enhance linear regression myoelectric control.
Smith, Lauren H; Kuiken, Todd A; Hargrove, Levi J
2015-12-01
Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts' law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p linear regression control. Use of probability weights can improve the ability to isolate individual during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
Simulation Experiments in Practice: Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. The goal of this article is to change these traditional, naïve methods of design and analysis, because statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic DOE and regression analysis assume a single simulation response that is normally and independen...
DEFF Research Database (Denmark)
Sharifzadeh, Sara; Skytte, Jacob Lercke; Nielsen, Otto Højager Attermann
2012-01-01
Statistical solutions find wide spread use in food and medicine quality control. We investigate the effect of different regression and sparse regression methods for a viscosity estimation problem using the spectro-temporal features from new Sub-Surface Laser Scattering (SLS) vision system. From...... with sparse LAR, lasso and Elastic Net (EN) sparse regression methods. Due to the inconsistent measurement condition, Locally Weighted Scatter plot Smoothing (Loess) has been employed to alleviate the undesired variation in the estimated viscosity. The experimental results of applying different methods show...
RAWS II: A MULTIPLE REGRESSION ANALYSIS PROGRAM,
This memorandum gives instructions for the use and operation of a revised version of RAWS, a multiple regression analysis program. The program...of preprocessed data, the directed retention of variable, listing of the matrix of the normal equations and its inverse, and the bypassing of the regression analysis to provide the input variable statistics only. (Author)
Hierarchical regression analysis in structural Equation Modeling
de Jong, P.F.
1999-01-01
In a hierarchical or fixed-order regression analysis, the independent variables are entered into the regression equation in a prespecified order. Such an analysis is often performed when the extra amount of variance accounted for in a dependent variable by a specific independent variable is the main
Categorical regression dose-response modeling
The goal of this training is to provide participants with training on the use of the U.S. EPA’s Categorical Regression soft¬ware (CatReg) and its application to risk assessment. Categorical regression fits mathematical models to toxicity data that have been assigned ord...
Variable importance in latent variable regression models
Kvalheim, O.M.; Arneberg, R.; Bleie, O.; Rajalahti, T.; Smilde, A.K.; Westerhuis, J.A.
2014-01-01
The quality and practical usefulness of a regression model are a function of both interpretability and prediction performance. This work presents some new graphical tools for improved interpretation of latent variable regression models that can also assist in improved algorithms for variable
Stepwise versus Hierarchical Regression: Pros and Cons
Lewis, Mitzi
2007-01-01
Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…
Gibrat’s law and quantile regressions
DEFF Research Database (Denmark)
Distante, Roberta; Petrella, Ivan; Santoro, Emiliano
2017-01-01
The nexus between firm growth, size and age in U.S. manufacturing is examined through the lens of quantile regression models. This methodology allows us to overcome serious shortcomings entailed by linear regression models employed by much of the existing literature, unveiling a number of important...
Regression Analysis and the Sociological Imagination
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Principles of Quantile Regression and an Application
Chen, Fang; Chalhoub-Deville, Micheline
2014-01-01
Newer statistical procedures are typically introduced to help address the limitations of those already in practice or to deal with emerging research needs. Quantile regression (QR) is introduced in this paper as a relatively new methodology, which is intended to overcome some of the limitations of least squares mean regression (LMR). QR is more…
ON REGRESSION REPRESENTATIONS OF STOCHASTIC-PROCESSES
RUSCHENDORF, L; DEVALK, [No Value
We construct a.s. nonlinear regression representations of general stochastic processes (X(n))n is-an-element-of N. As a consequence we obtain in particular special regression representations of Markov chains and of certain m-dependent sequences. For m-dependent sequences we obtain a constructive
Pathological assessment of liver fibrosis regression
Directory of Open Access Journals (Sweden)
WANG Bingqiong
2017-03-01
Full Text Available Hepatic fibrosis is the common pathological outcome of chronic hepatic diseases. An accurate assessment of fibrosis degree provides an important reference for a definite diagnosis of diseases, treatment decision-making, treatment outcome monitoring, and prognostic evaluation. At present, many clinical studies have proven that regression of hepatic fibrosis and early-stage liver cirrhosis can be achieved by effective treatment, and a correct evaluation of fibrosis regression has become a hot topic in clinical research. Liver biopsy has long been regarded as the gold standard for the assessment of hepatic fibrosis, and thus it plays an important role in the evaluation of fibrosis regression. This article reviews the clinical application of current pathological staging systems in the evaluation of fibrosis regression from the perspectives of semi-quantitative scoring system, quantitative approach, and qualitative approach, in order to propose a better pathological evaluation system for the assessment of fibrosis regression.
Neighborhood Effects in Wind Farm Performance: A Regression Approach
Directory of Open Access Journals (Sweden)
Matthias Ritter
2017-03-01
Full Text Available The optimization of turbine density in wind farms entails a trade-off between the usage of scarce, expensive land and power losses through turbine wake effects. A quantification and prediction of the wake effect, however, is challenging because of the complex aerodynamic nature of the interdependencies of turbines. In this paper, we propose a parsimonious data driven regression wake model that can be used to predict production losses of existing and potential wind farms. Motivated by simple engineering wake models, the predicting variables are wind speed, the turbine alignment angle, and distance. By utilizing data from two wind farms in Germany, we show that our models can compete with the standard Jensen model in predicting wake effect losses. A scenario analysis reveals that a distance between turbines can be reduced by up to three times the rotor size, without entailing substantial production losses. In contrast, an unfavorable configuration of turbines with respect to the main wind direction can result in production losses that are much higher than in an optimal case.
Li, Jiangtong; Luo, Yongdao; Dai, Honglin
2018-01-01
Water is the source of life and the essential foundation of all life. With the development of industrialization, the phenomenon of water pollution is becoming more and more frequent, which directly affects the survival and development of human. Water quality detection is one of the necessary measures to protect water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, which partial least squares regression (PLSR) analysis method is becoming predominant technology, however, in some special cases, PLSR's analysis produce considerable errors. In order to solve this problem, the traditional principal component regression (PCR) analysis method was improved by using the principle of PLSR in this paper. The experimental results show that for some special experimental data set, improved PCR analysis method performance is better than PLSR. The PCR and PLSR is the focus of this paper. Firstly, the principal component analysis (PCA) is performed by MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal component is extracted by using the principle of PLSR, which carries most of the original data information. Secondly, the linear regression analysis of the principal component is carried out with statistic package for social science (SPSS), which the coefficients and relations of principal components can be obtained. Finally, calculating a same water spectral data set by PLSR and improved PCR, analyzing and comparing two results, improved PCR and PLSR is similar for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in Ultraviolet spectral analysis of water, but for data near the detection limit, improved PCR's result better than PLSR.
Quantile regression and the gender wage gap: Is there a glass ceiling in the Turkish labor market?
Kaya, Ezgi
2017-01-01
Recent studies from different countries suggest that the gender gap is not constant across the wage distribution and the average wage gap provides limited information on women’s relative position in the labour market. Using micro level data from official statistics, this study explores the gender wage‐gap in Turkey across the wage distribution. The quantile regression and counterfactual decomposition analysis results reveal three striking features of the Turkish labour market. The first is th...
Research and analyze of physical health using multiple regression analysis
Directory of Open Access Journals (Sweden)
T. S. Kyi
2014-01-01
Full Text Available This paper represents the research which is trying to create a mathematical model of the "healthy people" using the method of regression analysis. The factors are the physical parameters of the person (such as heart rate, lung capacity, blood pressure, breath holding, weight height coefficient, flexibility of the spine, muscles of the shoulder belt, abdominal muscles, squatting, etc.., and the response variable is an indicator of physical working capacity. After performing multiple regression analysis, obtained useful multiple regression models that can predict the physical performance of boys the aged of fourteen to seventeen years. This paper represents the development of regression model for the sixteen year old boys and analyzed results.
Radiation regression patterns after cobalt plaque insertion for retinoblastoma
International Nuclear Information System (INIS)
Buys, R.J.; Abramson, D.H.; Ellsworth, R.M.; Haik, B.
1983-01-01
An analysis of 31 eyes of 30 patients who had been treated with cobalt plaques for retinoblastoma disclosed that a type I radiation regression pattern developed in 15 patients; type II, in one patient, and type III, in five patients. Nine patients had a regression pattern characterized by complete destruction of the tumor, the surrounding choroid, and all of the vessels in the area into which the plaque was inserted. This resulting white scar, corresponding to the sclerae only, was classified as a type IV radiation regression pattern. There was no evidence of tumor recurrence in patients with type IV regression patterns, with an average follow-up of 6.5 years, after receiving cobalt plaque therapy. Twenty-nine of these 30 patients had been unsuccessfully treated with at least one other modality (ie, light coagulation, cryotherapy, external beam radiation, or chemotherapy)
Radiation regression patterns after cobalt plaque insertion for retinoblastoma
Energy Technology Data Exchange (ETDEWEB)
Buys, R.J.; Abramson, D.H.; Ellsworth, R.M.; Haik, B.
1983-08-01
An analysis of 31 eyes of 30 patients who had been treated with cobalt plaques for retinoblastoma disclosed that a type I radiation regression pattern developed in 15 patients; type II, in one patient, and type III, in five patients. Nine patients had a regression pattern characterized by complete destruction of the tumor, the surrounding choroid, and all of the vessels in the area into which the plaque was inserted. This resulting white scar, corresponding to the sclerae only, was classified as a type IV radiation regression pattern. There was no evidence of tumor recurrence in patients with type IV regression patterns, with an average follow-up of 6.5 years, after receiving cobalt plaque therapy. Twenty-nine of these 30 patients had been unsuccessfully treated with at least one other modality (ie, light coagulation, cryotherapy, external beam radiation, or chemotherapy).
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Directory of Open Access Journals (Sweden)
Ebrahim Karimi Sangchini
2015-01-01
Full Text Available Landslides are amongst the most damaging natural hazards in mountainous regions. Every year, hundreds of people all over the world lose their lives in landslides; furthermore, there are large impacts on the local and global economy from these events. In this study, landslide hazard zonation in Babaheydar watershed using logistic regression was conducted to determine landslide hazard areas. At first, the landslide inventory map was prepared using aerial photograph interpretations and field surveys. The next step, ten landslide conditioning factors such as altitude, slope percentage, slope aspect, lithology, distance from faults, rivers, settlement and roads, land use, and precipitation were chosen as effective factors on landsliding in the study area. Subsequently, landslide susceptibility map was constructed using the logistic regression model in Geographic Information System (GIS. The ROC and Pseudo-R2 indexes were used for model assessment. Results showed that the logistic regression model provided slightly high prediction accuracy of landslide susceptibility maps in the Babaheydar Watershed with ROC equal to 0.876. Furthermore, the results revealed that about 44% of the watershed areas were located in high and very high hazard classes. The resultant landslide susceptibility maps can be useful in appropriate watershed management practices and for sustainable development in the region.
Applied Regression Modeling A Business Approach
Pardoe, Iain
2012-01-01
An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculusRegression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a
Analyzing hospitalization data: potential limitations of Poisson regression.
Weaver, Colin G; Ravani, Pietro; Oliver, Matthew J; Austin, Peter C; Quinn, Robert R
2015-08-01
Poisson regression is commonly used to analyze hospitalization data when outcomes are expressed as counts (e.g. number of days in hospital). However, data often violate the assumptions on which Poisson regression is based. More appropriate extensions of this model, while available, are rarely used. We compared hospitalization data between 206 patients treated with hemodialysis (HD) and 107 treated with peritoneal dialysis (PD) using Poisson regression and compared results from standard Poisson regression with those obtained using three other approaches for modeling count data: negative binomial (NB) regression, zero-inflated Poisson (ZIP) regression and zero-inflated negative binomial (ZINB) regression. We examined the appropriateness of each model and compared the results obtained with each approach. During a mean 1.9 years of follow-up, 183 of 313 patients (58%) were never hospitalized (indicating an excess of 'zeros'). The data also displayed overdispersion (variance greater than mean), violating another assumption of the Poisson model. Using four criteria, we determined that the NB and ZINB models performed best. According to these two models, patients treated with HD experienced similar hospitalization rates as those receiving PD {NB rate ratio (RR): 1.04 [bootstrapped 95% confidence interval (CI): 0.49-2.20]; ZINB summary RR: 1.21 (bootstrapped 95% CI 0.60-2.46)}. Poisson and ZIP models fit the data poorly and had much larger point estimates than the NB and ZINB models [Poisson RR: 1.93 (bootstrapped 95% CI 0.88-4.23); ZIP summary RR: 1.84 (bootstrapped 95% CI 0.88-3.84)]. We found substantially different results when modeling hospitalization data, depending on the approach used. Our results argue strongly for a sound model selection process and improved reporting around statistical methods used for modeling count data. © The Author 2015. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
Vectors, a tool in statistical regression theory
Corsten, L.C.A.
1958-01-01
Using linear algebra this thesis developed linear regression analysis including analysis of variance, covariance analysis, special experimental designs, linear and fertility adjustments, analysis of experiments at different places and times. The determination of the orthogonal projection, yielding
Genetics Home Reference: caudal regression syndrome
... umbilical artery: Further support for a caudal regression-sirenomelia spectrum. Am J Med Genet A. 2007 Dec ... AK, Dickinson JE, Bower C. Caudal dysgenesis and sirenomelia-single centre experience suggests common pathogenic basis. Am ...
Dynamic travel time estimation using regression trees.
2008-10-01
This report presents a methodology for travel time estimation by using regression trees. The dissemination of travel time information has become crucial for effective traffic management, especially under congested road conditions. In the absence of c...
Fuzzy multiple linear regression: A computational approach
Juang, C. H.; Huang, X. H.; Fleming, J. W.
1992-01-01
This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.
Computing multiple-output regression quantile regions
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2012-01-01
Roč. 56, č. 4 (2012), s. 840-853 ISSN 0167-9473 R&D Projects: GA MŠk(CZ) 1M06047 Institutional research plan: CEZ:AV0Z10750506 Keywords : halfspace depth * multiple-output regression * parametric linear programming * quantile regression Subject RIV: BA - General Mathematics Impact factor: 1.304, year: 2012 http://library.utia.cas.cz/separaty/2012/SI/siman-0376413.pdf
There is No Quantum Regression Theorem
International Nuclear Information System (INIS)
Ford, G.W.; OConnell, R.F.
1996-01-01
The Onsager regression hypothesis states that the regression of fluctuations is governed by macroscopic equations describing the approach to equilibrium. It is here asserted that this hypothesis fails in the quantum case. This is shown first by explicit calculation for the example of quantum Brownian motion of an oscillator and then in general from the fluctuation-dissipation theorem. It is asserted that the correct generalization of the Onsager hypothesis is the fluctuation-dissipation theorem. copyright 1996 The American Physical Society
Spontaneous regression of metastatic Merkel cell carcinoma.
LENUS (Irish Health Repository)
Hassan, S J
2010-01-01
Merkel cell carcinoma is a rare aggressive neuroendocrine carcinoma of the skin predominantly affecting elderly Caucasians. It has a high rate of local recurrence and regional lymph node metastases. It is associated with a poor prognosis. Complete spontaneous regression of Merkel cell carcinoma has been reported but is a poorly understood phenomenon. Here we present a case of complete spontaneous regression of metastatic Merkel cell carcinoma demonstrating a markedly different pattern of events from those previously published.
Forecasting exchange rates: a robust regression approach
Preminger, Arie; Franck, Raphael
2005-01-01
The least squares estimation method as well as other ordinary estimation method for regression models can be severely affected by a small number of outliers, thus providing poor out-of-sample forecasts. This paper suggests a robust regression approach, based on the S-estimation method, to construct forecasting models that are less sensitive to data contamination by outliers. A robust linear autoregressive (RAR) and a robust neural network (RNN) models are estimated to study the predictabil...
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.
2010-08-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been proposed, a relative simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.; Carroll, R.J.; Wand, M.P.
2010-01-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been proposed, a relative simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
bayesQR: A Bayesian Approach to Quantile Regression
Directory of Open Access Journals (Sweden)
Dries F. Benoit
2017-01-01
Full Text Available After its introduction by Koenker and Basset (1978, quantile regression has become an important and popular tool to investigate the conditional response distribution in regression. The R package bayesQR contains a number of routines to estimate quantile regression parameters using a Bayesian approach based on the asymmetric Laplace distribution. The package contains functions for the typical quantile regression with continuous dependent variable, but also supports quantile regression for binary dependent variables. For both types of dependent variables, an approach to variable selection using the adaptive lasso approach is provided. For the binary quantile regression model, the package also contains a routine that calculates the fitted probabilities for each vector of predictors. In addition, functions for summarizing the results, creating traceplots, posterior histograms and drawing quantile plots are included. This paper starts with a brief overview of the theoretical background of the models used in the bayesQR package. The main part of this paper discusses the computational problems that arise in the implementation of the procedure and illustrates the usefulness of the package through selected examples.
Real estate value prediction using multivariate regression models
Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav
2017-11-01
The real estate market is one of the most competitive in terms of pricing and the same tends to vary significantly based on a lot of factors, hence it becomes one of the prime fields to apply the concepts of machine learning to optimize and predict the prices with high accuracy. Therefore in this paper, we present various important features to use while predicting housing prices with good accuracy. We have described regression models, using various features to have lower Residual Sum of Squares error. While using features in a regression model some feature engineering is required for better prediction. Often a set of features (multiple regressions) or polynomial regression (applying a various set of powers in the features) is used for making better model fit. For these models are expected to be susceptible towards over fitting ridge regression is used to reduce it. This paper thus directs to the best application of regression models in addition to other techniques to optimize the result.
Learning a Nonnegative Sparse Graph for Linear Regression.
Fang, Xiaozhao; Xu, Yong; Li, Xuelong; Lai, Zhihui; Wong, Wai Keung
2015-09-01
Previous graph-based semisupervised learning (G-SSL) methods have the following drawbacks: 1) they usually predefine the graph structure and then use it to perform label prediction, which cannot guarantee an overall optimum and 2) they only focus on the label prediction or the graph structure construction but are not competent in handling new samples. To this end, a novel nonnegative sparse graph (NNSG) learning method was first proposed. Then, both the label prediction and projection learning were integrated into linear regression. Finally, the linear regression and graph structure learning were unified within the same framework to overcome these two drawbacks. Therefore, a novel method, named learning a NNSG for linear regression was presented, in which the linear regression and graph learning were simultaneously performed to guarantee an overall optimum. In the learning process, the label information can be accurately propagated via the graph structure so that the linear regression can learn a discriminative projection to better fit sample labels and accurately classify new samples. An effective algorithm was designed to solve the corresponding optimization problem with fast convergence. Furthermore, NNSG provides a unified perceptiveness for a number of graph-based learning methods and linear regression methods. The experimental results showed that NNSG can obtain very high classification accuracy and greatly outperforms conventional G-SSL methods, especially some conventional graph construction methods.
Descriptor Learning via Supervised Manifold Regularization for Multioutput Regression.
Zhen, Xiantong; Yu, Mengyang; Islam, Ali; Bhaduri, Mousumi; Chan, Ian; Li, Shuo
2017-09-01
Multioutput regression has recently shown great ability to solve challenging problems in both computer vision and medical image analysis. However, due to the huge image variability and ambiguity, it is fundamentally challenging to handle the highly complex input-target relationship of multioutput regression, especially with indiscriminate high-dimensional representations. In this paper, we propose a novel supervised descriptor learning (SDL) algorithm for multioutput regression, which can establish discriminative and compact feature representations to improve the multivariate estimation performance. The SDL is formulated as generalized low-rank approximations of matrices with a supervised manifold regularization. The SDL is able to simultaneously extract discriminative features closely related to multivariate targets and remove irrelevant and redundant information by transforming raw features into a new low-dimensional space aligned to targets. The achieved discriminative while compact descriptor largely reduces the variability and ambiguity for multioutput regression, which enables more accurate and efficient multivariate estimation. We conduct extensive evaluation of the proposed SDL on both synthetic data and real-world multioutput regression tasks for both computer vision and medical image analysis. Experimental results have shown that the proposed SDL can achieve high multivariate estimation accuracy on all tasks and largely outperforms the algorithms in the state of the arts. Our method establishes a novel SDL framework for multioutput regression, which can be widely used to boost the performance in different applications.
Determinants of LSIL Regression in Women from a Colombian Cohort
International Nuclear Information System (INIS)
Molano, Monica; Gonzalez, Mauricio; Gamboa, Oscar; Ortiz, Natasha; Luna, Joaquin; Hernandez, Gustavo; Posso, Hector; Murillo, Raul; Munoz, Nubia
2010-01-01
Objective: To analyze the role of Human Papillomavirus (HPV) and other risk factors in the regression of cervical lesions in women from the Bogota Cohort. Methods: 200 HPV positive women with abnormal cytology were included for regression analysis. The time of lesion regression was modeled using methods for interval censored survival time data. Median duration of total follow-up was 9 years. Results: 80 (40%) women were diagnosed with Atypical Squamous Cells of Undetermined Significance (ASCUS) or Atypical Glandular Cells of Undetermined Significance (AGUS) while 120 (60%) were diagnosed with Low Grade Squamous Intra-epithelial Lesions (LSIL). Globally, 40% of the lesions were still present at first year of follow up, while 1.5% was still present at 5 year check-up. The multivariate model showed similar regression rates for lesions in women with ASCUS/AGUS and women with LSIL (HR= 0.82, 95% CI 0.59-1.12). Women infected with HR HPV types and those with mixed infections had lower regression rates for lesions than did women infected with LR types (HR=0.526, 95% CI 0.33-0.84, for HR types and HR=0.378, 95% CI 0.20-0.69, for mixed infections). Furthermore, women over 30 years had a higher lesion regression rate than did women under 30 years (HR1.53, 95% CI 1.03-2.27). The study showed that the median time for lesion regression was 9 months while the median time for HPV clearance was 12 months. Conclusions: In the studied population, the type of infection and the age of the women are critical factors for the regression of cervical lesions.
Spontaneous regression of a large hepatocellular carcinoma: case report
Directory of Open Access Journals (Sweden)
Alqutub, Adel
2011-01-01
Full Text Available The prognosis of untreated advanced hepatocellular carcinoma (HCC is grim with a median survival of less than 6 months. Spontaneous regression of HCC has been defined as the disappearance of the hepatic lesions in the absence of any specific therapy. The spontaneous regression of a very large HCC is very rare and limited data is available in the English literature. We describe spontaneous regression of hepatocellular carcinoma in a 65-year-old male who presented to our clinic with vague abdominal pain and weight loss of two months duration. He was found to have multiple hepatic lesions with elevation of serum alpha-fetoprotein (AFP level to 6,500 µg/L (normal <20 µg/L. Computed tomography revealed advanced HCC replacing almost 80% of the right hepatic lobe. Without any intervention the patient showed gradual improvement over a period of few months. Follow-up CT scan revealed disappearance of hepatic lesions with progressive decline of AFP levels to normal. Various mechanisms have been postulated to explain this rare phenomenon, but the exact mechanism remains a mystery.
A rotor optimization using regression analysis
Giansante, N.
1984-01-01
The design and development of helicopter rotors is subject to the many design variables and their interactions that effect rotor operation. Until recently, selection of rotor design variables to achieve specified rotor operational qualities has been a costly, time consuming, repetitive task. For the past several years, Kaman Aerospace Corporation has successfully applied multiple linear regression analysis, coupled with optimization and sensitivity procedures, in the analytical design of rotor systems. It is concluded that approximating equations can be developed rapidly for a multiplicity of objective and constraint functions and optimizations can be performed in a rapid and cost effective manner; the number and/or range of design variables can be increased by expanding the data base and developing approximating functions to reflect the expanded design space; the order of the approximating equations can be expanded easily to improve correlation between analyzer results and the approximating equations; gradients of the approximating equations can be calculated easily and these gradients are smooth functions reducing the risk of numerical problems in the optimization; the use of approximating functions allows the problem to be started easily and rapidly from various initial designs to enhance the probability of finding a global optimum; and the approximating equations are independent of the analysis or optimization codes used.
Post-processing through linear regression
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrarily to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
Post-processing through linear regression
Directory of Open Access Journals (Sweden)
B. Van Schaeybroeck
2011-03-01
Full Text Available Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS method, a new time-dependent Tikhonov regularization (TDTR method, the total least-square method, a new geometric-mean regression (GM, a recently introduced error-in-variables (EVMOS method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified.
These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrarily to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise. At long lead times the regression schemes (EVMOS, TDTR which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.
Unbalanced Regressions and the Predictive Equation
DEFF Research Database (Denmark)
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoreti......Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness...... in the theoretical predictive equation by suggesting a data generating process, where returns are generated as linear functions of a lagged latent I(0) risk process. The observed predictor is a function of this latent I(0) process, but it is corrupted by a fractionally integrated noise. Such a process may arise due...... to aggregation or unexpected level shifts. In this setup, the practitioner estimates a misspecified, unbalanced, and endogenous predictive regression. We show that the OLS estimate of this regression is inconsistent, but standard inference is possible. To obtain a consistent slope estimate, we then suggest...
Multiple Response Regression for Gaussian Mixture Models with Known Labels.
Lee, Wonyul; Du, Ying; Sun, Wei; Hayes, D Neil; Liu, Yufeng
2012-12-01
Multiple response regression is a useful regression technique to model multiple response variables using the same set of predictor variables. Most existing methods for multiple response regression are designed for modeling homogeneous data. In many applications, however, one may have heterogeneous data where the samples are divided into multiple groups. Our motivating example is a cancer dataset where the samples belong to multiple cancer subtypes. In this paper, we consider modeling the data coming from a mixture of several Gaussian distributions with known group labels. A naive approach is to split the data into several groups according to the labels and model each group separately. Although it is simple, this approach ignores potential common structures across different groups. We propose new penalized methods to model all groups jointly in which the common and unique structures can be identified. The proposed methods estimate the regression coefficient matrix, as well as the conditional inverse covariance matrix of response variables. Asymptotic properties of the proposed methods are explored. Through numerical examples, we demonstrate that both estimation and prediction can be improved by modeling all groups jointly using the proposed methods. An application to a glioblastoma cancer dataset reveals some interesting common and unique gene relationships across different cancer subtypes.
Complete regression of myocardial involvement associated with lymphoma following chemotherapy.
Vinicki, Juan Pablo; Cianciulli, Tomás F; Farace, Gustavo A; Saccheri, María C; Lax, Jorge A; Kazelian, Lucía R; Wachs, Adolfo
2013-09-26
Cardiac involvement as an initial presentation of malignant lymphoma is a rare occurrence. We describe the case of a 26 year old man who had initially been diagnosed with myocardial infiltration on an echocardiogram, presenting with a testicular mass and unilateral peripheral facial paralysis. On admission, electrocardiograms (ECG) revealed negative T-waves in all leads and ST-segment elevation in the inferior leads. On two-dimensional echocardiography, there was infiltration of the pericardium with mild effusion, infiltrative thickening of the aortic walls, both atria and the interatrial septum and a mildly depressed systolic function of both ventricles. An axillary biopsy was performed and reported as a T-cell lymphoblastic lymphoma (T-LBL). Following the diagnosis and staging, chemotherapy was started. Twenty-two days after finishing the first cycle of chemotherapy, the ECG showed regression of T-wave changes in all leads and normalization of the ST-segment elevation in the inferior leads. A follow-up Two-dimensional echocardiography confirmed regression of the myocardial infiltration. This case report illustrates a lymphoma presenting with testicular mass, unilateral peripheral facial paralysis and myocardial involvement, and demonstrates that regression of infiltration can be achieved by intensive chemotherapy treatment. To our knowledge, there are no reported cases of T-LBL presenting as a testicular mass and unilateral peripheral facial paralysis, with complete regression of myocardial involvement.
Spatial vulnerability assessments by regression kriging
Pásztor, László; Laborczi, Annamária; Takács, Katalin; Szatmári, Gábor
2016-04-01
information representing IEW or GRP forming environmental factors were taken into account to support the spatial inference of the locally experienced IEW frequency and measured GRP values respectively. An efficient spatial prediction methodology was applied to construct reliable maps, namely regression kriging (RK) using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Application of RK also provides the possibility of inherent accuracy assessment. The resulting maps are characterized by global and local measures of its accuracy. Additionally the method enables interval estimation for spatial extension of the areas of predefined risk categories. All of these outputs provide useful contribution to spatial planning, action planning and decision making. Acknowledgement: Our work was partly supported by the Hungarian National Scientific Research Foundation (OTKA, Grant No. K105167).
National Research Council Canada - National Science Library
Bielecki, John
2003-01-01
.... Previous research has demonstrated the use of a two-step logistic and multiple regression methodology to predicting cost growth produces desirable results versus traditional single-step regression...
Morales, Esteban; de Leon, John Mark S; Abdollahi, Niloufar; Yu, Fei; Nouri-Mahdavi, Kouros; Caprioli, Joseph
2016-03-01
The study was conducted to evaluate threshold smoothing algorithms to enhance prediction of the rates of visual field (VF) worsening in glaucoma. We studied 798 patients with primary open-angle glaucoma and 6 or more years of follow-up who underwent 8 or more VF examinations. Thresholds at each VF location for the first 4 years or first half of the follow-up time (whichever was greater) were smoothed with clusters defined by the nearest neighbor (NN), Garway-Heath, Glaucoma Hemifield Test (GHT), and weighting by the correlation of rates at all other VF locations. Thresholds were regressed with a pointwise exponential regression (PER) model and a pointwise linear regression (PLR) model. Smaller root mean square error (RMSE) values of the differences between the observed and the predicted thresholds at last two follow-ups indicated better model predictions. The mean (SD) follow-up times for the smoothing and prediction phase were 5.3 (1.5) and 10.5 (3.9) years. The mean RMSE values for the PER and PLR models were unsmoothed data, 6.09 and 6.55; NN, 3.40 and 3.42; Garway-Heath, 3.47 and 3.48; GHT, 3.57 and 3.74; and correlation of rates, 3.59 and 3.64. Smoothed VF data predicted better than unsmoothed data. Nearest neighbor provided the best predictions; PER also predicted consistently more accurately than PLR. Smoothing algorithms should be used when forecasting VF results with PER or PLR. The application of smoothing algorithms on VF data can improve forecasting in VF points to assist in treatment decisions.
Use of probabilistic weights to enhance linear regression myoelectric control
Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.
2015-12-01
Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
Interpreting Multiple Linear Regression: A Guidebook of Variable Importance
Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim
2012-01-01
Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to typically reflect overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…
Application of Soft Computing Techniques and Multiple Regression Models for CBR prediction of Soils
Directory of Open Access Journals (Sweden)
Fatimah Khaleel Ibrahim
2017-08-01
Full Text Available The techniques of soft computing technique such as Artificial Neutral Network (ANN have improved the predicting capability and have actually discovered application in Geotechnical engineering. The aim of this research is to utilize the soft computing technique and Multiple Regression Models (MLR for forecasting the California bearing ratio CBR( of soil from its index properties. The indicator of CBR for soil could be predicted from various soils characterizing parameters with the assist of MLR and ANN methods. The data base that collected from the laboratory by conducting tests on 86 soil samples that gathered from different projects in Basrah districts. Data gained from the experimental result were used in the regression models and soft computing techniques by using artificial neural network. The liquid limit, plastic index , modified compaction test and the CBR test have been determined. In this work, different ANN and MLR models were formulated with the different collection of inputs to be able to recognize their significance in the prediction of CBR. The strengths of the models that were developed been examined in terms of regression coefficient (R2, relative error (RE% and mean square error (MSE values. From the results of this paper, it absolutely was noticed that all the proposed ANN models perform better than that of MLR model. In a specific ANN model with all input parameters reveals better outcomes than other ANN models.
Assessment of deforestation using regression; Hodnotenie odlesnenia s vyuzitim regresie
Energy Technology Data Exchange (ETDEWEB)
Juristova, J. [Univerzita Komenskeho, Prirodovedecka fakulta, Katedra kartografie, geoinformatiky a DPZ, 84215 Bratislava (Slovakia)
2013-04-16
This work is devoted to the evaluation of deforestation using regression methods through software Idrisi Taiga. Deforestation is evaluated by the method of logistic regression. The dependent variable has discrete values '0' and '1', indicating that the deforestation occurred or not. Independent variables have continuous values, expressing the distance from the edge of the deforested areas of forests from urban areas, the river and the road network. The results were also used in predicting the probability of deforestation in subsequent periods. The result is a map showing the output probability of deforestation for the periods 1990/2000 and 200/2006 in accordance with predetermined coefficients (values of independent variables). (authors)
Regression of conjunctival tumor during dietary treatment of celiac disease
Directory of Open Access Journals (Sweden)
Tuncer Samuray
2010-01-01
Full Text Available A 3-year-old girl presented with a hemorrhagic conjunctival lesion in the right eye. The medical history revealed premature cessation of breast feeding, intolerance to the ingestion of baby foods, anorexia, and abdominal distention. Prior to her referral, endoscopic small intestinal biopsy had been carried out under general anesthesia with a possible diagnosis of Celiac Disease (CD. Her parents did not want their child to undergo general anesthesia for the second time for the excisional biopsy. We decided to follow the patient until all systemic investigations were concluded. In evaluation, the case was diagnosed with CD and the conjunctival tumor showed complete regression during gluten-free dietary treatment. The clinical fleshy appearance of the lesion with spider-like vascular extensions and subconjunctival hemorrhagic spots, possible association with an acquired immune system dysfunction due to CD, and spontaneous regression by a gluten-free diet led us to make a presumed diagnosis of conjunctival Kaposi sarcoma.
DeGutis, Joseph; Wilmer, Jeremy; Mercado, Rogelio J.; Cohan, Sarah
2013-01-01
Although holistic processing is thought to underlie normal face recognition ability, widely discrepant reports have recently emerged about this link in an individual differences context. Progress in this domain may have been impeded by the widespread use of subtraction scores, which lack validity due to their contamination with control condition…
Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies.
Vatcheva, Kristina P; Lee, MinJae; McCormick, Joseph B; Rahbar, Mohammad H
2016-04-01
The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013, illustrated the need for a greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epidemiologic studies. We used simulated datasets and real life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in the regression analysis and encourage researchers to consider the diagnostic for multicollinearity as one of the steps in regression analysis.
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
Is past life regression therapy ethical?
Andrade, Gabriel
2017-01-01
Past life regression therapy is used by some physicians in cases with some mental diseases. Anxiety disorders, mood disorders, and gender dysphoria have all been treated using life regression therapy by some doctors on the assumption that they reflect problems in past lives. Although it is not supported by psychiatric associations, few medical associations have actually condemned it as unethical. In this article, I argue that past life regression therapy is unethical for two basic reasons. First, it is not evidence-based. Past life regression is based on the reincarnation hypothesis, but this hypothesis is not supported by evidence, and in fact, it faces some insurmountable conceptual problems. If patients are not fully informed about these problems, they cannot provide an informed consent, and hence, the principle of autonomy is violated. Second, past life regression therapy has the great risk of implanting false memories in patients, and thus, causing significant harm. This is a violation of the principle of non-malfeasance, which is surely the most important principle in medical ethics.
Ng, Kar Yong; Awang, Norhashidah
2018-01-06
Frequent haze occurrences in Malaysia have made the management of PM 10 (particulate matter with aerodynamic less than 10 μm) pollution a critical task. This requires knowledge on factors associating with PM 10 variation and good forecast of PM 10 concentrations. Hence, this paper demonstrates the prediction of 1-day-ahead daily average PM 10 concentrations based on predictor variables including meteorological parameters and gaseous pollutants. Three different models were built. They were multiple linear regression (MLR) model with lagged predictor variables (MLR1), MLR model with lagged predictor variables and PM 10 concentrations (MLR2) and regression with time series error (RTSE) model. The findings revealed that humidity, temperature, wind speed, wind direction, carbon monoxide and ozone were the main factors explaining the PM 10 variation in Peninsular Malaysia. Comparison among the three models showed that MLR2 model was on a same level with RTSE model in terms of forecasting accuracy, while MLR1 model was the worst.
On Solving Lq-Penalized Regressions
Directory of Open Access Journals (Sweden)
Tracy Zhou Wu
2007-01-01
Full Text Available Lq-penalized regression arises in multidimensional statistical modelling where all or part of the regression coefficients are penalized to achieve both accuracy and parsimony of statistical models. There is often substantial computational difficulty except for the quadratic penalty case. The difficulty is partly due to the nonsmoothness of the objective function inherited from the use of the absolute value. We propose a new solution method for the general Lq-penalized regression problem based on space transformation and thus efficient optimization algorithms. The new method has immediate applications in statistics, notably in penalized spline smoothing problems. In particular, the LASSO problem is shown to be polynomial time solvable. Numerical studies show promise of our approach.
Refractive regression after laser in situ keratomileusis.
Yan, Mabel K; Chang, John Sm; Chan, Tommy Cy
2018-04-26
Uncorrected refractive errors are a leading cause of visual impairment across the world. In today's society, laser in situ keratomileusis (LASIK) has become the most commonly performed surgical procedure to correct refractive errors. However, regression of the initially achieved refractive correction has been a widely observed phenomenon following LASIK since its inception more than two decades ago. Despite technological advances in laser refractive surgery and various proposed management strategies, post-LASIK regression is still frequently observed and has significant implications for the long-term visual performance and quality of life of patients. This review explores the mechanism of refractive regression after both myopic and hyperopic LASIK, predisposing risk factors and its clinical course. In addition, current preventative strategies and therapies are also reviewed. © 2018 Royal Australian and New Zealand College of Ophthalmologists.
Regression Models for Market-Shares
DEFF Research Database (Denmark)
Birch, Kristina; Olsen, Jørgen Kai; Tjur, Tue
2005-01-01
On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put on the interpretat......On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put...... on the interpretation of the parameters in relation to models for the total sales based on discrete choice models.Key words and phrases. MCI model, discrete choice model, market-shares, price elasitcity, regression model....
Alternative regression models to assess increase in childhood BMI
Directory of Open Access Journals (Sweden)
Mansmann Ulrich
2008-09-01
Full Text Available Abstract Background Body mass index (BMI data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Methods Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs, quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS. We analyzed data of 4967 children participating in the school entry health examination in Bavaria, Germany, from 2001 to 2002. TV watching, meal frequency, breastfeeding, smoking in pregnancy, maternal obesity, parental social class and weight gain in the first 2 years of life were considered as risk factors for obesity. Results GAMLSS showed a much better fit regarding the estimation of risk factors effects on transformed and untransformed BMI data than common GLMs with respect to the generalized Akaike information criterion. In comparison with GAMLSS, quantile regression allowed for additional interpretation of prespecified distribution quantiles, such as quantiles referring to overweight or obesity. The variables TV watching, maternal BMI and weight gain in the first 2 years were directly, and meal frequency was inversely significantly associated with body composition in any model type examined. In contrast, smoking in pregnancy was not directly, and breastfeeding and parental social class were not inversely significantly associated with body composition in GLM models, but in GAMLSS and partly in quantile regression models. Risk factor specific BMI percentile curves could be estimated from GAMLSS and quantile regression models. Conclusion GAMLSS and quantile regression seem to be more appropriate than common GLMs for risk factor modeling of BMI data.
Gaussian process regression for tool wear prediction
Kong, Dongdong; Chen, Yongjie; Li, Ning
2018-05-01
To realize and accelerate the pace of intelligent manufacturing, this paper presents a novel tool wear assessment technique based on the integrated radial basis function based kernel principal component analysis (KPCA_IRBF) and Gaussian process regression (GPR) for real-timely and accurately monitoring the in-process tool wear parameters (flank wear width). The KPCA_IRBF is a kind of new nonlinear dimension-increment technique and firstly proposed for feature fusion. The tool wear predictive value and the corresponding confidence interval are both provided by utilizing the GPR model. Besides, GPR performs better than artificial neural networks (ANN) and support vector machines (SVM) in prediction accuracy since the Gaussian noises can be modeled quantitatively in the GPR model. However, the existence of noises will affect the stability of the confidence interval seriously. In this work, the proposed KPCA_IRBF technique helps to remove the noises and weaken its negative effects so as to make the confidence interval compressed greatly and more smoothed, which is conducive for monitoring the tool wear accurately. Moreover, the selection of kernel parameter in KPCA_IRBF can be easily carried out in a much larger selectable region in comparison with the conventional KPCA_RBF technique, which helps to improve the efficiency of model construction. Ten sets of cutting tests are conducted to validate the effectiveness of the presented tool wear assessment technique. The experimental results show that the in-process flank wear width of tool inserts can be monitored accurately by utilizing the presented tool wear assessment technique which is robust under a variety of cutting conditions. This study lays the foundation for tool wear monitoring in real industrial settings.
Directory of Open Access Journals (Sweden)
Jibo Yue
2018-01-01
Full Text Available Above-ground biomass (AGB provides a vital link between solar energy consumption and yield, so its correct estimation is crucial to accurately monitor crop growth and predict yield. In this work, we estimate AGB by using 54 vegetation indexes (e.g., Normalized Difference Vegetation Index, Soil-Adjusted Vegetation Index and eight statistical regression techniques: artificial neural network (ANN, multivariable linear regression (MLR, decision-tree regression (DT, boosted binary regression tree (BBRT, partial least squares regression (PLSR, random forest regression (RF, support vector machine regression (SVM, and principal component regression (PCR, which are used to analyze hyperspectral data acquired by using a field spectrophotometer. The vegetation indexes (VIs determined from the spectra were first used to train regression techniques for modeling and validation to select the best VI input, and then summed with white Gaussian noise to study how remote sensing errors affect the regression techniques. Next, the VIs were divided into groups of different sizes by using various sampling methods for modeling and validation to test the stability of the techniques. Finally, the AGB was estimated by using a leave-one-out cross validation with these powerful techniques. The results of the study demonstrate that, of the eight techniques investigated, PLSR and MLR perform best in terms of stability and are most suitable when high-accuracy and stable estimates are required from relatively few samples. In addition, RF is extremely robust against noise and is best suited to deal with repeated observations involving remote-sensing data (i.e., data affected by atmosphere, clouds, observation times, and/or sensor noise. Finally, the leave-one-out cross-validation method indicates that PLSR provides the highest accuracy (R2 = 0.89, RMSE = 1.20 t/ha, MAE = 0.90 t/ha, NRMSE = 0.07, CV (RMSE = 0.18; thus, PLSR is best suited for works requiring high
On directional multiple-output quantile regression
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2011-01-01
Roč. 102, č. 2 (2011), s. 193-212 ISSN 0047-259X R&D Projects: GA MŠk(CZ) 1M06047 Grant - others:Commision EC(BE) Fonds National de la Recherche Scientifique Institutional research plan: CEZ:AV0Z10750506 Keywords : multivariate quantile * quantile regression * multiple-output regression * halfspace depth * portfolio optimization * value-at risk Subject RIV: BA - General Mathematics Impact factor: 0.879, year: 2011 http://library.utia.cas.cz/separaty/2011/SI/siman-0364128.pdf
Removing Malmquist bias from linear regressions
Verter, Frances
1993-01-01
Malmquist bias is present in all astronomical surveys where sources are observed above an apparent brightness threshold. Those sources which can be detected at progressively larger distances are progressively more limited to the intrinsically luminous portion of the true distribution. This bias does not distort any of the measurements, but distorts the sample composition. We have developed the first treatment to correct for Malmquist bias in linear regressions of astronomical data. A demonstration of the corrected linear regression that is computed in four steps is presented.
Robust median estimator in logisitc regression
Czech Academy of Sciences Publication Activity Database
Hobza, T.; Pardo, L.; Vajda, Igor
2008-01-01
Roč. 138, č. 12 (2008), s. 3822-3840 ISSN 0378-3758 R&D Projects: GA MŠk 1M0572 Grant - others:Instituto Nacional de Estadistica (ES) MPO FI - IM3/136; GA MŠk(CZ) MTM 2006-06872 Institutional research plan: CEZ:AV0Z10750506 Keywords : Logistic regression * Median * Robustness * Consistency and asymptotic normality * Morgenthaler * Bianco and Yohai * Croux and Hasellbroeck Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.679, year: 2008 http://library.utia.cas.cz/separaty/2008/SI/vajda-robust%20median%20estimator%20in%20logistic%20regression.pdf
Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.
Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo
2015-08-01
Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.
Spontaneous regression of transverse colon cancer: a case report.
Chida, Keigo; Nakanishi, Kazuaki; Shomura, Hiroki; Homma, Shigenori; Hattori, Atsuo; Kazui, Keizo; Taketomi, Akinobu
2017-12-01
Spontaneous regression (SR) of many malignant tumors has been well documented, with an approximate incidence of one per 60,000-100,000 cancer patients. However, SR of colorectal cancer (CRC) is very rare, accounting for less than 2% of such cases. We report a case of SR of transverse colon cancer in an 80-year-old man undergoing outpatient follow-up after surgical treatment of early gastric cancer. Colonoscopy (CS) revealed a Borrmann type II tumor in the transverse colon measuring 30 × 30 mm. Because the patient underwent anticoagulant therapy, we did not perform a biopsy at that time. A second CS was performed 1 week after the initial examination and revealed tumor shrinkage to a diameter of 20 mm and a shift to the Borrmann type III morphology. Biopsy revealed a poorly differentiated adenocarcinoma. One week after the second CS, we performed a partial resection of the transverse colon and D2 lymph node dissection. Histopathology revealed inflammatory cell infiltration and fibrosis from the submucosal to muscularis propria layers in the absence of cancer cells, leading to pathological staging of pStage 0 (T0N0). The patient had an uneventful recovery, and CS performed at 5 months postoperatively revealed the absence of a tumor in the colon and rectum. The patient continues to be followed up as an outpatient at 12 months postoperatively, and no recurrence has been observed.
A comparison of regression algorithms for wind speed forecasting at Alexander Bay
CSIR Research Space (South Africa)
Botha, Nicolene
2016-12-01
Full Text Available to forecast 1 to 24 hours ahead, in hourly intervals. Predictions are performed on a wind speed time series with three machine learning regression algorithms, namely support vector regression, ordinary least squares and Bayesian ridge regression. The resulting...
Demonstration of a Fiber Optic Regression Probe
Korman, Valentin; Polzin, Kurt A.
2010-01-01
The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics makes it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers changes, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for
Is adult gait less susceptible than paediatric gait to hip joint centre regression equation error?
Kiernan, D; Hosking, J; O'Brien, T
2016-03-01
Hip joint centre (HJC) regression equation error during paediatric gait has recently been shown to have clinical significance. In relation to adult gait, it has been inferred that comparable errors with children in absolute HJC position may in fact result in less significant kinematic and kinetic error. This study investigated the clinical agreement of three commonly used regression equation sets (Bell et al., Davis et al. and Orthotrak) for adult subjects against the equations of Harrington et al. The relationship between HJC position error and subject size was also investigated for the Davis et al. set. Full 3-dimensional gait analysis was performed on 12 healthy adult subjects with data for each set compared to Harrington et al. The Gait Profile Score, Gait Variable Score and GDI-kinetic were used to assess clinical significance while differences in HJC position between the Davis and Harrington sets were compared to leg length and subject height using regression analysis. A number of statistically significant differences were present in absolute HJC position. However, all sets fell below the clinically significant thresholds (GPS <1.6°, GDI-Kinetic <3.6 points). Linear regression revealed a statistically significant relationship for both increasing leg length and increasing subject height with decreasing error in anterior/posterior and superior/inferior directions. Results confirm a negligible clinical error for adult subjects suggesting that any of the examined sets could be used interchangeably. Decreasing error with both increasing leg length and increasing subject height suggests that the Davis set should be used cautiously on smaller subjects. Copyright © 2016 Elsevier B.V. All rights reserved.
Model-based Quantile Regression for Discrete Data
Padellini, Tullia
2018-04-10
Quantile regression is a class of methods voted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite the fact that this leads to a proper posterior for the regression coefficients, the resulting posterior variance is however affected by an unidentifiable parameter, hence any inferential procedure beside point estimation is unreliable. We propose a model-based approach for quantile regression that considers quantiles of the generating distribution directly, and thus allows for a proper uncertainty quantification. We then create a link between quantile regression and generalised linear models by mapping the quantiles to the parameter of the response variable, and we exploit it to fit the model with R-INLA. We extend it also in the case of discrete responses, where there is no 1-to-1 relationship between quantiles and distribution\\'s parameter, by introducing continuous generalisations of the most common discrete variables (Poisson, Binomial and Negative Binomial) to be exploited in the fitting.
Directory of Open Access Journals (Sweden)
Rachid Darnag
2017-02-01
Full Text Available Support vector machines (SVM represent one of the most promising Machine Learning (ML tools that can be applied to develop a predictive quantitative structure–activity relationship (QSAR models using molecular descriptors. Multiple linear regression (MLR and artificial neural networks (ANNs were also utilized to construct quantitative linear and non linear models to compare with the results obtained by SVM. The prediction results are in good agreement with the experimental value of HIV activity; also, the results reveal the superiority of the SVM over MLR and ANN model. The contribution of each descriptor to the structure–activity relationships was evaluated.
KELEŞ, Taliha; ALTUN, Murat
2016-01-01
Regression analysis is a statistical technique for investigating and modeling the relationship between variables. The purpose of this study was the trivial presentation of the equation for orthogonal regression (OR) and the comparison of classical linear regression (CLR) and OR techniques with respect to the sum of squared perpendicular distances. For that purpose, the analyses were shown by an example. It was found that the sum of squared perpendicular distances of OR is smaller. Thus, it wa...
Implementing fuzzy polynomial interpolation (FPI and fuzzy linear regression (LFR
Directory of Open Access Journals (Sweden)
Maria Cristina Floreno
1996-05-01
Full Text Available This paper presents some preliminary results arising within a general framework concerning the development of software tools for fuzzy arithmetic. The program is in a preliminary stage. What has been already implemented consists of a set of routines for elementary operations, optimized functions evaluation, interpolation and regression. Some of these have been applied to real problems.This paper describes a prototype of a library in C++ for polynomial interpolation of fuzzifying functions, a set of routines in FORTRAN for fuzzy linear regression and a program with graphical user interface allowing the use of such routines.
Determinants of Non-Performing Assets in India - Panel Regression
Directory of Open Access Journals (Sweden)
Saikat Ghosh Roy
2014-12-01
Full Text Available It is well known that level of banks‟ credit plays an important role in economic developments. Indian banking sector has played a seminal role in supporting economic growth in India. Recently, Indian banks are experiencing consistent increase in non-performing assets (NPA. In this perspective, this paper investigates the trends in NPA in Indian banks and its determinants. The panel regressions, fixed effect allows evaluating the impact of selected macroeconomic variables on the NPA. The Panel regression result indicates that the GDP growth, change in exchange rate and global volatility have major effects on the NPA level of Indian banking sector.
Testing overall and moderator effects meta-regression
Huizenga, H.M.; Visser, I.; Dolan, C.V.
2011-01-01
Random effects meta-regression is a technique to synthesize results of multiple studies. It allows for a test of an overall effect, as well as for tests of effects of study characteristics, that is, (discrete or continuous) moderator effects. We describe various procedures to test moderator effects:
Outlier detection algorithms for least squares time series regression
DEFF Research Database (Denmark)
Johansen, Søren; Nielsen, Bent
We review recent asymptotic results on some robust methods for multiple regression. The regressors include stationary and non-stationary time series as well as polynomial terms. The methods include the Huber-skip M-estimator, 1-step Huber-skip M-estimators, in particular the Impulse Indicator Sat...
A Demonstration of Regression False Positive Selection in Data Mining
Pinder, Jonathan P.
2014-01-01
Business analytics courses, such as marketing research, data mining, forecasting, and advanced financial modeling, have substantial predictive modeling components. The predictive modeling in these courses requires students to estimate and test many linear regressions. As a result, false positive variable selection ("type I errors") is…
Linearity and Misspecification Tests for Vector Smooth Transition Regression Models
DEFF Research Database (Denmark)
Teräsvirta, Timo; Yang, Yukai
The purpose of the paper is to derive Lagrange multiplier and Lagrange multiplier type specification and misspecification tests for vector smooth transition regression models. We report results from simulation studies in which the size and power properties of the proposed asymptotic tests in small...
Determination of gaussian peaks in gamma spectra by iterative regression
International Nuclear Information System (INIS)
Nordemann, D.J.R.
1987-05-01
The parameters of the peaks in gamma-ray spectra are determined by a simple iterative regression method. For each peak, the parameters are associated with a gaussian curve (3 parameters) located above a linear continuum (2 parameters). This method may produces the complete result of the calculation of statistical uncertainties and an accuracy higher than others methods. (author) [pt
The APT model as reduced-rank regression
Bekker, P.A.; Dobbelstein, P.; Wansbeek, T.J.
Integrating the two steps of an arbitrage pricing theory (APT) model leads to a reduced-rank regression (RRR) model. So the results on RRR can be used to estimate APT models, making estimation very simple. We give a succinct derivation of estimation of RRR, derive the asymptotic variance of RRR
Simulation Experiments in Practice : Statistical Design and Regression Analysis
Kleijnen, J.P.C.
2007-01-01
In practice, simulation analysts often change only one factor at a time, and use graphical analysis of the resulting Input/Output (I/O) data. Statistical theory proves that more information is obtained when applying Design Of Experiments (DOE) and linear regression analysis. Unfortunately, classic
Wavelet regression model in forecasting crude oil price
Hamid, Mohd Helmie; Shabri, Ani
2017-05-01
This study presents the performance of wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. WMLR model was developed by integrating the discrete wavelet transform (DWT) and multiple linear regression (MLR) model. The original time series was decomposed to sub-time series with different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series has been used in this study to test the prediction capability of the proposed model. The forecasting performance of WMLR model were also compared with regular multiple linear regression (MLR), Autoregressive Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) using root mean square errors (RMSE) and mean absolute errors (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting technique tested in this study.
DIABETES MELLITUS AND ITS ROLE IN CAUDAL REGRESSION SYNDROME
Directory of Open Access Journals (Sweden)
Sandeep
2016-03-01
Full Text Available BACKGROUND Caudal regression syndrome also called as sacral agenesis or hypoplasia of the sacrum is a congenital disorder in which there is abnormal development of the lower part of the vertebral column 1 due to which there is a plethora of abnormalities such as gross motor deficiencies and other genitor-urinary malformations which in deed depends on the extent of malformations that is seen. Caudal regression syndrome is rare, with an estimated incidence of 1:7500-100,000. The aim of the study is to find the frequency of manifestations and the manifestations itself. METHODS Fifty patients who were pregnant and were diagnosed with diabetes mellitus were identified and were referred to the Department of Medicine. RESULTS In the present study the frequency of manifestations of caudal regression syndrome is 8 in 100 diagnosed patients. CONCLUSION The malformations in the babies born to diabetic mothers are high in the population of costal Karnataka and Kerala.
Regression periods in infancy: a case study from Catalonia.
Sadurní, Marta; Rostan, Carlos
2002-05-01
Based on Rijt-Plooij and Plooij's (1992) research on emergence of regression periods in the first two years of life, the presence of such periods in a group of 18 babies (10 boys and 8 girls, aged between 3 weeks and 14 months) from a Catalonian population was analyzed. The measurements were a questionnaire filled in by the infants' mothers, a semi-structured weekly tape-recorded interview, and observations in their homes. The procedure and the instruments used in the project follow those proposed by Rijt-Plooij and Plooij. Our results confirm the existence of the regression periods in the first year of children's life. Inter-coder agreement for trained coders was 78.2% and within-coder agreement was 90.1%. In the discussion, the possible meaning and relevance of regression periods in order to understand development from a psychobiological and social framework is commented upon.
Method for nonlinear exponential regression analysis
Junkin, B. G.
1972-01-01
Two computer programs developed according to two general types of exponential models for conducting nonlinear exponential regression analysis are described. Least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. Program is written in FORTRAN 5 for the Univac 1108 computer.
Measurement Error in Education and Growth Regressions
Portela, Miguel; Alessie, Rob; Teulings, Coen
2010-01-01
The use of the perpetual inventory method for the construction of education data per country leads to systematic measurement error. This paper analyzes its effect on growth regressions. We suggest a methodology for correcting this error. The standard attenuation bias suggests that using these
The M Word: Multicollinearity in Multiple Regression.
Morrow-Howell, Nancy
1994-01-01
Notes that existence of substantial correlation between two or more independent variables creates problems of multicollinearity in multiple regression. Discusses multicollinearity problem in social work research in which independent variables are usually intercorrelated. Clarifies problems created by multicollinearity, explains detection of…
Regression Discontinuity Designs Based on Population Thresholds
DEFF Research Database (Denmark)
Eggers, Andrew C.; Freier, Ronny; Grembi, Veronica
In many countries, important features of municipal government (such as the electoral system, mayors' salaries, and the number of councillors) depend on whether the municipality is above or below arbitrary population thresholds. Several papers have used a regression discontinuity design (RDD...
Deriving the Regression Line with Algebra
Quintanilla, John A.
2017-01-01
Exploration with spreadsheets and reliance on previous skills can lead students to determine the line of best fit. To perform linear regression on a set of data, students in Algebra 2 (or, in principle, Algebra 1) do not have to settle for using the mysterious "black box" of their graphing calculators (or other classroom technologies).…
Functional data analysis of generalized regression quantiles
Guo, Mengmeng
2013-11-05
Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.
Regression testing Ajax applications : Coping with dynamism
Roest, D.; Mesbah, A.; Van Deursen, A.
2009-01-01
Note: This paper is a pre-print of: Danny Roest, Ali Mesbah and Arie van Deursen. Regression Testing AJAX Applications: Coping with Dynamism. In Proceedings of the 3rd International Conference on Software Testing, Verification and Validation (ICST’10), Paris, France. IEEE Computer Society, 2010.
Group-wise partial least square regression
Camacho, José; Saccenti, Edoardo
2018-01-01
This paper introduces the group-wise partial least squares (GPLS) regression. GPLS is a new sparse PLS technique where the sparsity structure is defined in terms of groups of correlated variables, similarly to what is done in the related group-wise principal component analysis. These groups are
Functional data analysis of generalized regression quantiles
Guo, Mengmeng; Zhou, Lan; Huang, Jianhua Z.; Hä rdle, Wolfgang Karl
2013-01-01
Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.
Finite Algorithms for Robust Linear Regression
DEFF Research Database (Denmark)
Madsen, Kaj; Nielsen, Hans Bruun
1990-01-01
The Huber M-estimator for robust linear regression is analyzed. Newton type methods for solution of the problem are defined and analyzed, and finite convergence is proved. Numerical experiments with a large number of test problems demonstrate efficiency and indicate that this kind of approach may...
Function approximation with polynomial regression slines
International Nuclear Information System (INIS)
Urbanski, P.
1996-01-01
Principles of the polynomial regression splines as well as algorithms and programs for their computation are presented. The programs prepared using software package MATLAB are generally intended for approximation of the X-ray spectra and can be applied in the multivariate calibration of radiometric gauges. (author)
Assessing risk factors for periodontitis using regression
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using regression, linear and logistic models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients along with respective p-values will be obtained as well as the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the correspondent 95% confidence intervals.
Predicting Social Trust with Binary Logistic Regression
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
Yet another look at MIDAS regression
Ph.H.B.F. Franses (Philip Hans)
2016-01-01
textabstractA MIDAS regression involves a dependent variable observed at a low frequency and independent variables observed at a higher frequency. This paper relates a true high frequency data generating process, where also the dependent variable is observed (hypothetically) at the high frequency,
Revisiting Regression in Autism: Heller's "Dementia Infantilis"
Westphal, Alexander; Schelinski, Stefanie; Volkmar, Fred; Pelphrey, Kevin
2013-01-01
Theodor Heller first described a severe regression of adaptive function in normally developing children, something he termed dementia infantilis, over one 100 years ago. Dementia infantilis is most closely related to the modern diagnosis, childhood disintegrative disorder. We translate Heller's paper, Uber Dementia Infantilis, and discuss…
Fast multi-output relevance vector regression
Ha, Youngmin
2017-01-01
This paper aims to decrease the time complexity of multi-output relevance vector regression from O(VM^3) to O(V^3+M^3), where V is the number of output dimensions, M is the number of basis functions, and V
Regression Equations for Birth Weight Estimation using ...
African Journals Online (AJOL)
In this study, Birth Weight has been estimated from anthropometric measurements of hand and foot. Linear regression equations were formed from each of the measured variables. These simple equations can be used to estimate Birth Weight of new born babies, in order to identify those with low birth weight and referred to ...
Superquantile Regression: Theory, Algorithms, and Applications
2014-12-01
Highway, Suite 1204, Arlington, Va 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188) Washington DC 20503. 1...Navy submariners, reliability engineering, uncertainty quantification, and financial risk management . Superquantile, superquantile regression...Royset Carlos F. Borges Associate Professor of Operations Research Dissertation Supervisor Professor of Applied Mathematics Lyn R. Whitaker Javier
Measurement Error in Education and Growth Regressions
Portela, M.; Teulings, C.N.; Alessie, R.
The perpetual inventory method used for the construction of education data per country leads to systematic measurement error. This paper analyses the effect of this measurement error on GDP regressions. There is a systematic difference in the education level between census data and observations
Measurement error in education and growth regressions
Portela, Miguel; Teulings, Coen; Alessie, R.
2004-01-01
The perpetual inventory method used for the construction of education data per country leads to systematic measurement error. This paper analyses the effect of this measurement error on GDP regressions. There is a systematic difference in the education level between census data and observations
transformation of independent variables in polynomial regression ...
African Journals Online (AJOL)
Ada
preferable when possible to work with a simple functional form in transformed variables rather than with a more complicated form in the original variables. In this paper, it is shown that linear transformations applied to independent variables in polynomial regression models affect the t ratio and hence the statistical ...
Multiple Linear Regression: A Realistic Reflector.
Nutt, A. T.; Batsell, R. R.
Examples of the use of Multiple Linear Regression (MLR) techniques are presented. This is done to show how MLR aids data processing and decision-making by providing the decision-maker with freedom in phrasing questions and by accurately reflecting the data on hand. A brief overview of the rationale underlying MLR is given, some basic definitions…
Spontaneous regression of multiple pulmonary metastatic nodules of hepatocarcinoma: a case report
Energy Technology Data Exchange (ETDEWEB)
Bahk, Yong Whee; Park, Seog Hee; Kim, Sun Moo [St. Mary' s Hospital, Catholic Medical College, Seoul (Korea, Republic of)
1981-09-15
Although are spontaneous regression of either primary or metastatic malignant tumor in the absence of or inadequate therapy has been well documented. Since the earliest day of this century various malignant tumors have been reported to spontaneously disappear or to be arrested of their growth, but the cases of hepatocarcinoma has been very rare. From the literature, we were able to find out 5 previously reported cases of hepatocarcinoma which showed spontaneous regression at the primary site. Recently we have seen a case of multiple pulmonary metastatic nodules of hepatocarcinoma which completely regressed spontaneously and this forms the basis of the present case report. The patient was 55-year-old male admitted to St. Mary's Hospital, Catholic Medical College because of a hard palpable mass in the epigastrium on April 26, 1978. The admission PA chest roentgenogram revealed multiple small nodular densities scattered throughout both lung field especially in lower zones and toward the peripheral portion. A hepatoscintigram revealed a large cold area involving the left lobe and inermediate zone of the liver. Alfa-fetoprotein and hepatitis B serum antigen test were positive whereas many other standard liver function tests turned out to be negative. A needle biopsy of the tumor revealed well differentiated hepatocellular carcinoma. The patient was put under chemotherapy which consisted of 5 FU 500 mg intravenously for 6 days from April 28 to May 3, 1978. The patient was discharged after this single course of 5 FU treatment and was on a herb medicine, the nature and quantity of which obscure. No other specific treatment was given. The second admission took place on Dec. 3, 1980 because of irregularity in bowel habits and dyspepsia. A follow up PA chest roentgenogram obtained on the second admission revealed complete disappearance of previously noted multiple pulmonary nodular lesions (Fig. 3). Follow up liver scan revealed persistence of the cold area in the left lobe
Spontaneous regression of multiple pulmonary metastatic nodules of hepatocarcinoma: a case report
International Nuclear Information System (INIS)
Bahk, Yong Whee; Park, Seog Hee; Kim, Sun Moo
1981-01-01
Although are spontaneous regression of either primary or metastatic malignant tumor in the absence of or inadequate therapy has been well documented. Since the earliest day of this century various malignant tumors have been reported to spontaneously disappear or to be arrested of their growth, but the cases of hepatocarcinoma has been very rare. From the literature, we were able to find out 5 previously reported cases of hepatocarcinoma which showed spontaneous regression at the primary site. Recently we have seen a case of multiple pulmonary metastatic nodules of hepatocarcinoma which completely regressed spontaneously and this forms the basis of the present case report. The patient was 55-year-old male admitted to St. Mary's Hospital, Catholic Medical College because of a hard palpable mass in the epigastrium on April 26, 1978. The admission PA chest roentgenogram revealed multiple small nodular densities scattered throughout both lung field especially in lower zones and toward the peripheral portion. A hepatoscintigram revealed a large cold area involving the left lobe and inermediate zone of the liver. Alfa-fetoprotein and hepatitis B serum antigen test were positive whereas many other standard liver function tests turned out to be negative. A needle biopsy of the tumor revealed well differentiated hepatocellular carcinoma. The patient was put under chemotherapy which consisted of 5 FU 500 mg intravenously for 6 days from April 28 to May 3, 1978. The patient was discharged after this single course of 5 FU treatment and was on a herb medicine, the nature and quantity of which obscure. No other specific treatment was given. The second admission took place on Dec. 3, 1980 because of irregularity in bowel habits and dyspepsia. A follow up PA chest roentgenogram obtained on the second admission revealed complete disappearance of previously noted multiple pulmonary nodular lesions (Fig. 3). Follow up liver scan revealed persistence of the cold area in the left lobe
On concurvity in nonlinear and nonparametric regression models
Directory of Open Access Journals (Sweden)
Sonia Amodio
2014-12-01
Full Text Available When data are affected by multicollinearity in the linear regression framework, then concurvity will be present in fitting a generalized additive model (GAM. The term concurvity describes nonlinear dependencies among the predictor variables. As collinearity results in inflated variance of the estimated regression coefficients in the linear regression model, the result of the presence of concurvity leads to instability of the estimated coefficients in GAMs. Even if the backfitting algorithm will always converge to a solution, in case of concurvity the final solution of the backfitting procedure in fitting a GAM is influenced by the starting functions. While exact concurvity is highly unlikely, approximate concurvity, the analogue of multicollinearity, is of practical concern as it can lead to upwardly biased estimates of the parameters and to underestimation of their standard errors, increasing the risk of committing type I error. We compare the existing approaches to detect concurvity, pointing out their advantages and drawbacks, using simulated and real data sets. As a result, this paper will provide a general criterion to detect concurvity in nonlinear and non parametric regression models.
International Nuclear Information System (INIS)
Janssen, I.; Stebbings, J.H.
1990-01-01
In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and ∼200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs
Enders, Felicity
2013-12-01
Although regression is widely used for reading and publishing in the medical literature, no instruments were previously available to assess students' understanding. The goal of this study was to design and assess such an instrument for graduate students in Clinical and Translational Science and Public Health. A 27-item REsearch on Global Regression Expectations in StatisticS (REGRESS) quiz was developed through an iterative process. Consenting students taking a course on linear regression in a Clinical and Translational Science program completed the quiz pre- and postcourse. Student results were compared to practicing statisticians with a master's or doctoral degree in statistics or a closely related field. Fifty-two students responded precourse, 59 postcourse , and 22 practicing statisticians completed the quiz. The mean (SD) score was 9.3 (4.3) for students precourse and 19.0 (3.5) postcourse (P REGRESS quiz was internally reliable (Cronbach's alpha 0.89). The initial validation is quite promising with statistically significant and meaningful differences across time and study populations. Further work is needed to validate the quiz across multiple institutions. © 2013 Wiley Periodicals, Inc.
International Nuclear Information System (INIS)
Coleman, D.J.; Lizzi, F.L.; Silverman, R.H.; Ellsworth, R.M.; Haik, B.G.; Abramson, D.H.; Smith, M.E.; Rondeau, M.J.
1985-01-01
Parameters derived from computer analysis of digital radio-frequency (rf) ultrasound scan data of untreated uveal malignant melanomas were examined for correlations with tumor regression following cobalt-60 plaque. Parameters included tumor height, normalized power spectrum and acoustic tissue type (ATT). Acoustic tissue type was based upon discriminant analysis of tumor power spectra, with spectra of tumors of known pathology serving as a model. Results showed ATT to be correlated with tumor regression during the first 18 months following treatment. Tumors with ATT associated with spindle cell malignant melanoma showed over twice the percentage reduction in height as those with ATT associated with mixed/epithelioid melanomas. Pre-treatment height was only weakly correlated with regression. Additionally, significant spectral changes were observed following treatment. Ultrasonic spectrum analysis thus provides a noninvasive tool for classification, prediction and monitoring of tumor response to cobalt-60 plaque
Spontaneous regression of intracranial malignant lymphoma. Case report
Energy Technology Data Exchange (ETDEWEB)
Kojo, Nobuto; Tokutomi, Takashi; Eguchi, Gihachirou; Takagi, Shigeyuki; Matsumoto, Tomie; Sasaguri, Yasuyuki; Shigemori, Minoru.
1988-05-01
In a 46-year-old female with a 1-month history of gait and speech disturbances, computed tomography (CT) demonstrated mass lesions of slightly high density in the left basal ganglia and left frontal lobe. The lesions were markedly enhanced by contrast medium. The patient received no specific treatment, but her clinical manifestations gradually abated and the lesions decreased in size. Five months after her initial examination, the lesions were absent on CT scans; only a small area of low density remained. Residual clinical symptoms included mild right hemiparesis and aphasia. After 14 months the patient again deteriorated, and a CT scan revealed mass lesions in the right frontal lobe and the pons. However, no enhancement was observed in the previously affected regions. A biopsy revealed malignant lymphoma. Despite treatment with steroids and radiation, the patient's clinical status progressively worsened and she died 27 months after initial presentation. Seven other cases of spontaneous regression of primary malignant lymphoma have been reported. In this case, the mechanism of the spontaneous regression was not clear, but changes in immunologic status may have been involved.
Quasi-experimental evidence on tobacco tax regressivity.
Koch, Steven F
2018-01-01
Tobacco taxes are known to reduce tobacco consumption and to be regressive, such that tobacco control policy may have the perverse effect of further harming the poor. However, if tobacco consumption falls faster amongst the poor than the rich, tobacco control policy can actually be progressive. We take advantage of persistent and committed tobacco control activities in South Africa to examine the household tobacco expenditure burden. For the analysis, we make use of two South African Income and Expenditure Surveys (2005/06 and 2010/11) that span a series of such tax increases and have been matched across the years, yielding 7806 matched pairs of tobacco consuming households and 4909 matched pairs of cigarette consuming households. By matching households across the surveys, we are able to examine both the regressivity of the household tobacco burden, and any change in that regressivity, and since tobacco taxes have been a consistent component of tobacco prices, our results also relate to the regressivity of tobacco taxes. Like previous research into cigarette and tobacco expenditures, we find that the tobacco burden is regressive; thus, so are tobacco taxes. However, we find that over the five-year period considered, the tobacco burden has decreased, and, most importantly, falls less heavily on the poor. Thus, the tobacco burden and the tobacco tax is less regressive in 2010/11 than in 2005/06. Thus, increased tobacco taxes can, in at least some circumstances, reduce the financial burden that tobacco places on households. Copyright © 2017 Elsevier Ltd. All rights reserved.
Sparse Regression by Projection and Sparse Discriminant Analysis
Qi, Xin
2015-04-03
© 2015, © American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America. Recent years have seen active developments of various penalized regression methods, such as LASSO and elastic net, to analyze high-dimensional data. In these approaches, the direction and length of the regression coefficients are determined simultaneously. Due to the introduction of penalties, the length of the estimates can be far from being optimal for accurate predictions. We introduce a new framework, regression by projection, and its sparse version to analyze high-dimensional data. The unique nature of this framework is that the directions of the regression coefficients are inferred first, and the lengths and the tuning parameters are determined by a cross-validation procedure to achieve the largest prediction accuracy. We provide a theoretical result for simultaneous model selection consistency and parameter estimation consistency of our method in high dimension. This new framework is then generalized such that it can be applied to principal components analysis, partial least squares, and canonical correlation analysis. We also adapt this framework for discriminant analysis. Compared with the existing methods, where there is relatively little control of the dependency among the sparse components, our method can control the relationships among the components. We present efficient algorithms and related theory for solving the sparse regression by projection problem. Based on extensive simulations and real data analysis, we demonstrate that our method achieves good predictive performance and variable selection in the regression setting, and the ability to control relationships between the sparse components leads to more accurate classification. In supplementary materials available online, the details of the algorithms and theoretical proofs, and R codes for all simulation studies are provided.
Regression analysis of radiological parameters in nuclear power plants
International Nuclear Information System (INIS)
Bhargava, Pradeep; Verma, R.K.; Joshi, M.L.
2003-01-01
Indian Pressurized Heavy Water Reactors (PHWRs) have now attained maturity in their operations. Indian PHWR operation started in the year 1972. At present there are 12 operating PHWRs collectively producing nearly 2400 MWe. Sufficient radiological data are available for analysis to draw inferences which may be utilised for better understanding of radiological parameters influencing the collective internal dose. Tritium is the main contributor to the occupational internal dose originating in PHWRs. An attempt has been made to establish the relationship between radiological parameters, which may be useful to draw inferences about the internal dose. Regression analysis have been done to find out the relationship, if it exist, among the following variables: A. Specific tritium activity of heavy water (Moderator and PHT) and tritium concentration in air at various work locations. B. Internal collective occupational dose and tritium release to environment through air route. C. Specific tritium activity of heavy water (Moderator and PHT) and collective internal occupational dose. For this purpose multivariate regression analysis has been carried out. D. Tritium concentration in air at various work location and tritium release to environment through air route. For this purpose multivariate regression analysis has been carried out. This analysis reveals that collective internal dose has got very good correlation with the tritium activity release to the environment through air route. Whereas no correlation has been found between specific tritium activity in the heavy water systems and collective internal occupational dose. The good correlation has been found in case D and F test reveals that it is not by chance. (author)
Jovanovic, Milos; Radovanovic, Sandro; Vukicevic, Milan; Van Poucke, Sven; Delibasic, Boris
2016-09-01
Quantification and early identification of unplanned readmission risk have the potential to improve the quality of care during hospitalization and after discharge. However, high dimensionality, sparsity, and class imbalance of electronic health data and the complexity of risk quantification, challenge the development of accurate predictive models. Predictive models require a certain level of interpretability in order to be applicable in real settings and create actionable insights. This paper aims to develop accurate and interpretable predictive models for readmission in a general pediatric patient population, by integrating a data-driven model (sparse logistic regression) and domain knowledge based on the international classification of diseases 9th-revision clinical modification (ICD-9-CM) hierarchy of diseases. Additionally, we propose a way to quantify the interpretability of a model and inspect the stability of alternative solutions. The analysis was conducted on >66,000 pediatric hospital discharge records from California, State Inpatient Databases, Healthcare Cost and Utilization Project between 2009 and 2011. We incorporated domain knowledge based on the ICD-9-CM hierarchy in a data driven, Tree-Lasso regularized logistic regression model, providing the framework for model interpretation. This approach was compared with traditional Lasso logistic regression resulting in models that are easier to interpret by fewer high-level diagnoses, with comparable prediction accuracy. The results revealed that the use of a Tree-Lasso model was as competitive in terms of accuracy (measured by area under the receiver operating characteristic curve-AUC) as the traditional Lasso logistic regression, but integration with the ICD-9-CM hierarchy of diseases provided more interpretable models in terms of high-level diagnoses. Additionally, interpretations of models are in accordance with existing medical understanding of pediatric readmission. Best performing models have
Controlling attribute effect in linear regression
Calders, Toon; Karim, Asim A.; Kamiran, Faisal; Ali, Wasif Mohammad; Zhang, Xiangliang
2013-01-01
In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then we analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or residuals of the models. Experiments with discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques are successful in controlling attribute effects in linear regression models. © 2013 IEEE.
Stochastic development regression using method of moments
DEFF Research Database (Denmark)
Kühnel, Line; Sommer, Stefan Horst
2017-01-01
This paper considers the estimation problem arising when inferring parameters in the stochastic development regression model for manifold valued non-linear data. Stochastic development regression captures the relation between manifold-valued response and Euclidean covariate variables using...... the stochastic development construction. It is thereby able to incorporate several covariate variables and random effects. The model is intrinsically defined using the connection of the manifold, and the use of stochastic development avoids linearizing the geometry. We propose to infer parameters using...... the Method of Moments procedure that matches known constraints on moments of the observations conditional on the latent variables. The performance of the model is investigated in a simulation example using data on finite dimensional landmark manifolds....
Beta-binomial regression and bimodal utilization.
Liu, Chuan-Fen; Burgess, James F; Manning, Willard G; Maciejewski, Matthew L
2013-10-01
To illustrate how the analysis of bimodal U-shaped distributed utilization can be modeled with beta-binomial regression, which is rarely used in health services research. Veterans Affairs (VA) administrative data and Medicare claims in 2001-2004 for 11,123 Medicare-eligible VA primary care users in 2000. We compared means and distributions of VA reliance (the proportion of all VA/Medicare primary care visits occurring in VA) predicted from beta-binomial, binomial, and ordinary least-squares (OLS) models. Beta-binomial model fits the bimodal distribution of VA reliance better than binomial and OLS models due to the nondependence on normality and the greater flexibility in shape parameters. Increased awareness of beta-binomial regression may help analysts apply appropriate methods to outcomes with bimodal or U-shaped distributions. © Health Research and Educational Trust.
Testing homogeneity in Weibull-regression models.
Bolfarine, Heleno; Valença, Dione M
2005-10-01
In survival studies with families or geographical units it may be of interest testing whether such groups are homogeneous for given explanatory variables. In this paper we consider score type tests for group homogeneity based on a mixing model in which the group effect is modelled as a random variable. As opposed to hazard-based frailty models, this model presents survival times that conditioned on the random effect, has an accelerated failure time representation. The test statistics requires only estimation of the conventional regression model without the random effect and does not require specifying the distribution of the random effect. The tests are derived for a Weibull regression model and in the uncensored situation, a closed form is obtained for the test statistic. A simulation study is used for comparing the power of the tests. The proposed tests are applied to real data sets with censored data.
Are increases in cigarette taxation regressive?
Borren, P; Sutton, M
1992-12-01
Using the latest published data from Tobacco Advisory Council surveys, this paper re-evaluates the question of whether or not increases in cigarette taxation are regressive in the United Kingdom. The extended data set shows no evidence of increasing price-elasticity by social class as found in a major previous study. To the contrary, there appears to be no clear pattern in the price responsiveness of smoking behaviour across different social classes. Increases in cigarette taxation, while reducing smoking levels in all groups, fall most heavily on men and women in the lowest social class. Men and women in social class five can expect to pay eight and eleven times more of a tax increase respectively, than their social class one counterparts. Taken as a proportion of relative incomes, the regressive nature of increases in cigarette taxation is even more pronounced.
Controlling attribute effect in linear regression
Calders, Toon
2013-12-01
In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then we analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or residuals of the models. Experiments with discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques are successful in controlling attribute effects in linear regression models. © 2013 IEEE.
Regression Models For Multivariate Count Data.
Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei
2017-01-01
Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.
Model selection in kernel ridge regression
DEFF Research Database (Denmark)
Exterkate, Peter
2013-01-01
Kernel ridge regression is a technique to perform ridge regression with a potentially infinite number of nonlinear transformations of the independent variables as regressors. This method is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts....... The influence of the choice of kernel and the setting of tuning parameters on forecast accuracy is investigated. Several popular kernels are reviewed, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. The latter two kernels are interpreted in terms of their smoothing properties......, and the tuning parameters associated to all these kernels are related to smoothness measures of the prediction function and to the signal-to-noise ratio. Based on these interpretations, guidelines are provided for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study...
Confidence bands for inverse regression models
International Nuclear Information System (INIS)
Birke, Melanie; Bissantz, Nicolai; Holzmann, Hajo
2010-01-01
We construct uniform confidence bands for the regression function in inverse, homoscedastic regression models with convolution-type operators. Here, the convolution is between two non-periodic functions on the whole real line rather than between two periodic functions on a compact interval, since the former situation arguably arises more often in applications. First, following Bickel and Rosenblatt (1973 Ann. Stat. 1 1071–95) we construct asymptotic confidence bands which are based on strong approximations and on a limit theorem for the supremum of a stationary Gaussian process. Further, we propose bootstrap confidence bands based on the residual bootstrap and prove consistency of the bootstrap procedure. A simulation study shows that the bootstrap confidence bands perform reasonably well for moderate sample sizes. Finally, we apply our method to data from a gel electrophoresis experiment with genetically engineered neuronal receptor subunits incubated with rat brain extract
Spatial stochastic regression modelling of urban land use
International Nuclear Information System (INIS)
Arshad, S H M; Jaafar, J; Abiden, M Z Z; Latif, Z A; Rasam, A R A
2014-01-01
Urbanization is very closely linked to industrialization, commercialization or overall economic growth and development. This results in innumerable benefits of the quantity and quality of the urban environment and lifestyle but on the other hand contributes to unbounded development, urban sprawl, overcrowding and decreasing standard of living. Regulation and observation of urban development activities is crucial. The understanding of urban systems that promotes urban growth are also essential for the purpose of policy making, formulating development strategies as well as development plan preparation. This study aims to compare two different stochastic regression modeling techniques for spatial structure models of urban growth in the same specific study area. Both techniques will utilize the same datasets and their results will be analyzed. The work starts by producing an urban growth model by using stochastic regression modeling techniques namely the Ordinary Least Square (OLS) and Geographically Weighted Regression (GWR). The two techniques are compared to and it is found that, GWR seems to be a more significant stochastic regression model compared to OLS, it gives a smaller AICc (Akaike's Information Corrected Criterion) value and its output is more spatially explainable
Song, Chao; Kwan, Mei-Po; Zhu, Jiping
2017-04-08
An increasing number of fires are occurring with the rapid development of cities, resulting in increased risk for human beings and the environment. This study compares geographically weighted regression-based models, including geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), which integrates spatial and temporal effects and global linear regression models (LM) for modeling fire risk at the city scale. The results show that the road density and the spatial distribution of enterprises have the strongest influences on fire risk, which implies that we should focus on areas where roads and enterprises are densely clustered. In addition, locations with a large number of enterprises have fewer fire ignition records, probably because of strict management and prevention measures. A changing number of significant variables across space indicate that heterogeneity mainly exists in the northern and eastern rural and suburban areas of Hefei city, where human-related facilities or road construction are only clustered in the city sub-centers. GTWR can capture small changes in the spatiotemporal heterogeneity of the variables while GWR and LM cannot. An approach that integrates space and time enables us to better understand the dynamic changes in fire risk. Thus governments can use the results to manage fire safety at the city scale.
Regressing Atherosclerosis by Resolving Plaque Inflammation
2017-07-01
regression requires the alteration of macrophages in the plaques to a tissue repair “alternatively” activated state. This switch in activation state... tissue repair “alternatively” activated state. This switch in activation state requires the action of TH2 cytokines interleukin (IL)-4 or IL-13. To...regulation of tissue macrophage and dendritic cell population dynamics by CSF-1. J Exp Med. 2011;208(9):1901–1916. 35. Xu H, Exner BG, Chilton PM
Determination of regression laws: Linear and nonlinear
International Nuclear Information System (INIS)
Onishchenko, A.M.
1994-01-01
A detailed mathematical determination of regression laws is presented in the article. Particular emphasis is place on determining the laws of X j on X l to account for source nuclei decay and detector errors in nuclear physics instrumentation. Both linear and nonlinear relations are presented. Linearization of 19 functions is tabulated, including graph, relation, variable substitution, obtained linear function, and remarks. 6 refs., 1 tab
Directional quantile regression in Octave (and MATLAB)
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2016-01-01
Roč. 52, č. 1 (2016), s. 28-51 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : quantile regression * multivariate quantile * depth contour * Matlab Subject RIV: IN - Informatics, Computer Science Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2016/SI/bocek-0458380.pdf
Logistic regression a self-learning text
Kleinbaum, David G
1994-01-01
This textbook provides students and professionals in the health sciences with a presentation of the use of logistic regression in research. The text is self-contained, and designed to be used both in class or as a tool for self-study. It arises from the author's many years of experience teaching this material and the notes on which it is based have been extensively used throughout the world.
Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies
Vatcheva, Kristina P.; Lee, MinJae; McCormick, Joseph B.; Rahbar, Mohammad H.
2016-01-01
The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013, illustrated the need for a greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epide...
Multitask Quantile Regression under the Transnormal Model.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2016-01-01
We consider estimating multi-task quantile regression under the transnormal model, with focus on high-dimensional setting. We derive a surprisingly simple closed-form solution through rank-based covariance regularization. In particular, we propose the rank-based ℓ 1 penalization with positive definite constraints for estimating sparse covariance matrices, and the rank-based banded Cholesky decomposition regularization for estimating banded precision matrices. By taking advantage of alternating direction method of multipliers, nearest correlation matrix projection is introduced that inherits sampling properties of the unprojected one. Our work combines strengths of quantile regression and rank-based covariance regularization to simultaneously deal with nonlinearity and nonnormality for high-dimensional regression. Furthermore, the proposed method strikes a good balance between robustness and efficiency, achieves the "oracle"-like convergence rate, and provides the provable prediction interval under the high-dimensional setting. The finite-sample performance of the proposed method is also examined. The performance of our proposed rank-based method is demonstrated in a real application to analyze the protein mass spectroscopy data.
Complex regression Doppler optical coherence tomography
Elahi, Sahar; Gu, Shi; Thrane, Lars; Rollins, Andrew M.; Jenkins, Michael W.
2018-04-01
We introduce a new method to measure Doppler shifts more accurately and extend the dynamic range of Doppler optical coherence tomography (OCT). The two-point estimate of the conventional Doppler method is replaced with a regression that is applied to high-density B-scans in polar coordinates. We built a high-speed OCT system using a 1.68-MHz Fourier domain mode locked laser to acquire high-density B-scans (16,000 A-lines) at high enough frame rates (˜100 fps) to accurately capture the dynamics of the beating embryonic heart. Flow phantom experiments confirm that the complex regression lowers the minimum detectable velocity from 12.25 mm / s to 374 μm / s, whereas the maximum velocity of 400 mm / s is measured without phase wrapping. Complex regression Doppler OCT also demonstrates higher accuracy and precision compared with the conventional method, particularly when signal-to-noise ratio is low. The extended dynamic range allows monitoring of blood flow over several stages of development in embryos without adjusting the imaging parameters. In addition, applying complex averaging recovers hidden features in structural images.
Satellite rainfall retrieval by logistic regression
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors.The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistical model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variant can be estimated. A logistical model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistical model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistical model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
Mixed-effects regression models in linguistics
Heylen, Kris; Geeraerts, Dirk
2018-01-01
When data consist of grouped observations or clusters, and there is a risk that measurements within the same group are not independent, group-specific random effects can be added to a regression model in order to account for such within-group associations. Regression models that contain such group-specific random effects are called mixed-effects regression models, or simply mixed models. Mixed models are a versatile tool that can handle both balanced and unbalanced datasets and that can also be applied when several layers of grouping are present in the data; these layers can either be nested or crossed. In linguistics, as in many other fields, the use of mixed models has gained ground rapidly over the last decade. This methodological evolution enables us to build more sophisticated and arguably more realistic models, but, due to its technical complexity, also introduces new challenges. This volume brings together a number of promising new evolutions in the use of mixed models in linguistics, but also addres...
On logistic regression analysis of dichotomized responses.
Lu, Kaifeng
2017-01-01
We study the properties of treatment effect estimate in terms of odds ratio at the study end point from logistic regression model adjusting for the baseline value when the underlying continuous repeated measurements follow a multivariate normal distribution. Compared with the analysis that does not adjust for the baseline value, the adjusted analysis produces a larger treatment effect as well as a larger standard error. However, the increase in standard error is more than offset by the increase in treatment effect so that the adjusted analysis is more powerful than the unadjusted analysis for detecting the treatment effect. On the other hand, the true adjusted odds ratio implied by the normal distribution of the underlying continuous variable is a function of the baseline value and hence is unlikely to be able to be adequately represented by a single value of adjusted odds ratio from the logistic regression model. In contrast, the risk difference function derived from the logistic regression model provides a reasonable approximation to the true risk difference function implied by the normal distribution of the underlying continuous variable over the range of the baseline distribution. We show that different metrics of treatment effect have similar statistical power when evaluated at the baseline mean. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Guo, Huey-Ming; Shyu, Yea-Ing Lotus; Chang, Her-Kun
2006-01-01
In this article, the authors provide an overview of a research method to predict quality of care in home health nursing data set. The results of this study can be visualized through classification an regression tree (CART) graphs. The analysis was more effective, and the results were more informative since the home health nursing dataset was analyzed with a combination of the logistic regression and CART, these two techniques complete each other. And the results more informative that more patients' characters were related to quality of care in home care. The results contributed to home health nurse predict patient outcome in case management. Improved prediction is needed for interventions to be appropriately targeted for improved patient outcome and quality of care.
SPLINE LINEAR REGRESSION USED FOR EVALUATING FINANCIAL ASSETS 1
Directory of Open Access Journals (Sweden)
Liviu GEAMBAŞU
2010-12-01
Full Text Available One of the most important preoccupations of financial markets participants was and still is the problem of determining more precise the trend of financial assets prices. For solving this problem there were written many scientific papers and were developed many mathematical and statistical models in order to better determine the financial assets price trend. If until recently the simple linear models were largely used due to their facile utilization, the financial crises that affected the world economy starting with 2008 highlight the necessity of adapting the mathematical models to variation of economy. A simple to use model but adapted to economic life realities is the spline linear regression. This type of regression keeps the continuity of regression function, but split the studied data in intervals with homogenous characteristics. The characteristics of each interval are highlighted and also the evolution of market over all the intervals, resulting reduced standard errors. The first objective of the article is the theoretical presentation of the spline linear regression, also referring to scientific national and international papers related to this subject. The second objective is applying the theoretical model to data from the Bucharest Stock Exchange
A Solution to Separation and Multicollinearity in Multiple Logistic Regression.
Shen, Jianzhao; Gao, Sujuan
2008-10-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study.
Arcuate Fasciculus in Autism Spectrum Disorder Toddlers with Language Regression
Directory of Open Access Journals (Sweden)
Zhang Lin
2018-03-01
Full Text Available Language regression is observed in a subset of toddlers with autism spectrum disorder (ASD as initial symptom. However, such a phenomenon has not been fully explored, partly due to the lack of definite diagnostic evaluation methods and criteria. Materials and Methods: Fifteen toddlers with ASD exhibiting language regression and fourteen age-matched typically developing (TD controls underwent diffusion tensor imaging (DTI. DTI parameters including fractional anisotropy (FA, average fiber length (AFL, tract volume (TV and number of voxels (NV were analyzed by Neuro 3D in Siemens syngo workstation. Subsequently, the data were analyzed by using IBM SPSS Statistics 22. Results: Compared with TD children, a significant reduction of FA along with an increase in TV and NV was observed in ASD children with language regression. Note that there were no significant differences between ASD and TD children in AFL of the arcuate fasciculus (AF. Conclusions: These DTI changes in the AF suggest that microstructural anomalies of the AF white matter may be associated with language deficits in ASD children exhibiting language regression starting from an early age.
A Powerful Test for Comparing Multiple Regression Functions.
Maity, Arnab
2012-09-01
In this article, we address the important problem of comparison of two or more population regression functions. Recently, Pardo-Fernández, Van Keilegom and González-Manteiga (2007) developed test statistics for simple nonparametric regression models: Y(ij) = θ(j)(Z(ij)) + σ(j)(Z(ij))∊(ij), based on empirical distributions of the errors in each population j = 1, … , J. In this paper, we propose a test for equality of the θ(j)(·) based on the concept of generalized likelihood ratio type statistics. We also generalize our test for other nonparametric regression setups, e.g, nonparametric logistic regression, where the loglikelihood for population j is any general smooth function [Formula: see text]. We describe a resampling procedure to obtain the critical values of the test. In addition, we present a simulation study to evaluate the performance of the proposed test and compare our results to those in Pardo-Fernández et al. (2007).
DYNA3D/ParaDyn Regression Test Suite Inventory
Energy Technology Data Exchange (ETDEWEB)
Lin, Jerry I. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
2016-09-01
The following table constitutes an initial assessment of feature coverage across the regression test suite used for DYNA3D and ParaDyn. It documents the regression test suite at the time of preliminary release 16.1 in September 2016. The columns of the table represent groupings of functionalities, e.g., material models. Each problem in the test suite is represented by a row in the table. All features exercised by the problem are denoted by a check mark (√) in the corresponding column. The definition of “feature” has not been subdivided to its smallest unit of user input, e.g., algorithmic parameters specific to a particular type of contact surface. This represents a judgment to provide code developers and users a reasonable impression of feature coverage without expanding the width of the table by several multiples. All regression testing is run in parallel, typically with eight processors, except problems involving features only available in serial mode. Many are strictly regression tests acting as a check that the codes continue to produce adequately repeatable results as development unfolds; compilers change and platforms are replaced. A subset of the tests represents true verification problems that have been checked against analytical or other benchmark solutions. Users are welcomed to submit documented problems for inclusion in the test suite, especially if they are heavily exercising, and dependent upon, features that are currently underrepresented.
Testing for marginal linear effects in quantile regression
Wang, Huixia Judy
2017-10-23
The paper develops a new marginal testing procedure to detect significant predictors that are associated with the conditional quantiles of a scalar response. The idea is to fit the marginal quantile regression on each predictor one at a time, and then to base the test on the t-statistics that are associated with the most predictive predictors. A resampling method is devised to calibrate this test statistic, which has non-regular limiting behaviour due to the selection of the most predictive variables. Asymptotic validity of the procedure is established in a general quantile regression setting in which the marginal quantile regression models can be misspecified. Even though a fixed dimension is assumed to derive the asymptotic results, the test proposed is applicable and computationally feasible for large dimensional predictors. The method is more flexible than existing marginal screening test methods based on mean regression and has the added advantage of being robust against outliers in the response. The approach is illustrated by using an application to a human immunodeficiency virus drug resistance data set.
Testing for marginal linear effects in quantile regression
Wang, Huixia Judy; McKeague, Ian W.; Qian, Min
2017-01-01
The paper develops a new marginal testing procedure to detect significant predictors that are associated with the conditional quantiles of a scalar response. The idea is to fit the marginal quantile regression on each predictor one at a time, and then to base the test on the t-statistics that are associated with the most predictive predictors. A resampling method is devised to calibrate this test statistic, which has non-regular limiting behaviour due to the selection of the most predictive variables. Asymptotic validity of the procedure is established in a general quantile regression setting in which the marginal quantile regression models can be misspecified. Even though a fixed dimension is assumed to derive the asymptotic results, the test proposed is applicable and computationally feasible for large dimensional predictors. The method is more flexible than existing marginal screening test methods based on mean regression and has the added advantage of being robust against outliers in the response. The approach is illustrated by using an application to a human immunodeficiency virus drug resistance data set.
Geographically Weighted Logistic Regression Applied to Credit Scoring Models
Directory of Open Access Journals (Sweden)
Pedro Henrique Melo Albuquerque
Full Text Available Abstract This study used real data from a Brazilian financial institution on transactions involving Consumer Direct Credit (CDC, granted to clients residing in the Distrito Federal (DF, to construct credit scoring models via Logistic Regression and Geographically Weighted Logistic Regression (GWLR techniques. The aims were: to verify whether the factors that influence credit risk differ according to the borrower’s geographic location; to compare the set of models estimated via GWLR with the global model estimated via Logistic Regression, in terms of predictive power and financial losses for the institution; and to verify the viability of using the GWLR technique to develop credit scoring models. The metrics used to compare the models developed via the two techniques were the AICc informational criterion, the accuracy of the models, the percentage of false positives, the sum of the value of false positive debt, and the expected monetary value of portfolio default compared with the monetary value of defaults observed. The models estimated for each region in the DF were distinct in their variables and coefficients (parameters, with it being concluded that credit risk was influenced differently in each region in the study. The Logistic Regression and GWLR methodologies presented very close results, in terms of predictive power and financial losses for the institution, and the study demonstrated viability in using the GWLR technique to develop credit scoring models for the target population in the study.
Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M
2007-09-01
Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.
A method for nonlinear exponential regression analysis
Junkin, B. G.
1971-01-01
A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
Multinomial logistic regression in workers' health
Grilo, Luís M.; Grilo, Helena L.; Gonçalves, Sónia P.; Junça, Ana
2017-11-01
In European countries, namely in Portugal, it is common to hear some people mentioning that they are exposed to excessive and continuous psychosocial stressors at work. This is increasing in diverse activity sectors, such as, the Services sector. A representative sample was collected from a Portuguese Services' organization, by applying a survey (internationally validated), which variables were measured in five ordered categories in Likert-type scale. A multinomial logistic regression model is used to estimate the probability of each category of the dependent variable general health perception where, among other independent variables, burnout appear as statistically significant.
Three Contributions to Robust Regression Diagnostics
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2015-01-01
Roč. 11, č. 2 (2015), s. 69-78 ISSN 1336-9180 Grant - others:GA ČR(CZ) GA13-01930S; Nadační fond na podporu vědy(CZ) Neuron Institutional support: RVO:67985807 Keywords : robust regression * robust econometrics * hypothesis test ing Subject RIV: BA - General Mathematics http://www.degruyter.com/view/j/jamsi.2015.11.issue-2/jamsi-2015-0013/jamsi-2015-0013.xml?format=INT
SDE based regression for random PDEs
Bayer, Christian
2016-01-01
A simulation based method for the numerical solution of PDE with random coefficients is presented. By the Feynman-Kac formula, the solution can be represented as conditional expectation of a functional of a corresponding stochastic differential equation driven by independent noise. A time discretization of the SDE for a set of points in the domain and a subsequent Monte Carlo regression lead to an approximation of the global solution of the random PDE. We provide an initial error and complexity analysis of the proposed method along with numerical examples illustrating its behaviour.
Selecting a Regression Saturated by Indicators
DEFF Research Database (Denmark)
Hendry, David F.; Johansen, Søren; Santos, Carlos
We consider selecting a regression model, using a variant of Gets, when there are more variables than observations, in the special case that the variables are impulse dummies (indicators) for every observation. We show that the setting is unproblematic if tackled appropriately, and obtain the fin...... the finite-sample distribution of estimators of the mean and variance in a simple location-scale model under the null that no impulses matter. A Monte Carlo simulation confirms the null distribution, and shows power against an alternative of interest....
Selecting a Regression Saturated by Indicators
DEFF Research Database (Denmark)
Hendry, David F.; Johansen, Søren; Santos, Carlos
We consider selecting a regression model, using a variant of Gets, when there are more variables than observations, in the special case that the variables are impulse dummies (indicators) for every observation. We show that the setting is unproblematic if tackled appropriately, and obtain the fin...... the finite-sample distribution of estimators of the mean and variance in a simple location-scale model under the null that no impulses matter. A Monte Carlo simulation confirms the null distribution, and shows power against an alternative of interest...
Fixed kernel regression for voltammogram feature extraction
International Nuclear Information System (INIS)
Acevedo Rodriguez, F J; López-Sastre, R J; Gil-Jiménez, P; Maldonado Bascón, S; Ruiz-Reyes, N
2009-01-01
Cyclic voltammetry is an electroanalytical technique for obtaining information about substances under analysis without the need for complex flow systems. However, classifying the information in voltammograms obtained using this technique is difficult. In this paper, we propose the use of fixed kernel regression as a method for extracting features from these voltammograms, reducing the information to a few coefficients. The proposed approach has been applied to a wine classification problem with accuracy rates of over 98%. Although the method is described here for extracting voltammogram information, it can be used for other types of signals
Regression analysis for the social sciences
Gordon, Rachel A
2010-01-01
The book provides graduate students in the social sciences with the basic skills that they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly-available social science data or drawn from the literature. thorough integration of teaching statistical theory with teaching data processing and analysis. teaching of both SAS and Stata "side-by-side" and use of chapter exercises in which students practice programming and interpretation on the same data set and course exercises in which students can choose their own research questions and data set.
SDE based regression for random PDEs
Bayer, Christian
2016-01-06
A simulation based method for the numerical solution of PDE with random coefficients is presented. By the Feynman-Kac formula, the solution can be represented as conditional expectation of a functional of a corresponding stochastic differential equation driven by independent noise. A time discretization of the SDE for a set of points in the domain and a subsequent Monte Carlo regression lead to an approximation of the global solution of the random PDE. We provide an initial error and complexity analysis of the proposed method along with numerical examples illustrating its behaviour.
Neutrosophic Correlation and Simple Linear Regression
Directory of Open Access Journals (Sweden)
A. A. Salama
2014-09-01
Full Text Available Since the world is full of indeterminacy, the neutrosophics found their place into contemporary research. The fundamental concepts of neutrosophic set, introduced by Smarandache. Recently, Salama et al., introduced the concept of correlation coefficient of neutrosophic data. In this paper, we introduce and study the concepts of correlation and correlation coefficient of neutrosophic data in probability spaces and study some of their properties. Also, we introduce and study the neutrosophic simple linear regression model. Possible applications to data processing are touched upon.
Spectral density regression for bivariate extremes
Castro Camilo, Daniela
2016-05-11
We introduce a density regression model for the spectral density of a bivariate extreme value distribution, that allows us to assess how extremal dependence can change over a covariate. Inference is performed through a double kernel estimator, which can be seen as an extension of the Nadaraya–Watson estimator where the usual scalar responses are replaced by mean constrained densities on the unit interval. Numerical experiments with the methods illustrate their resilience in a variety of contexts of practical interest. An extreme temperature dataset is used to illustrate our methods. © 2016 Springer-Verlag Berlin Heidelberg
AIRLINE ACTIVITY FORECASTING BY REGRESSION MODELS
Directory of Open Access Journals (Sweden)
Н. Білак
2012-04-01
Full Text Available Proposed linear and nonlinear regression models, which take into account the equation of trend and seasonality indices for the analysis and restore the volume of passenger traffic over the past period of time and its prediction for future years, as well as the algorithm of formation of these models based on statistical analysis over the years. The desired model is the first step for the synthesis of more complex models, which will enable forecasting of passenger (income level airline with the highest accuracy and time urgency.
International Nuclear Information System (INIS)
Xu, Bin; Lin, Boqiang
2017-01-01
China is currently the world's largest emitter of carbon dioxide. Considered as a large agricultural country, carbon emission in China’s agriculture sector keeps on growing rapidly. It is, therefore, of great importance to investigate the driving forces of carbon dioxide emissions in this sector. The traditional regression estimation can only get “average” and “global” parameter estimates; it excludes the “local” parameter estimates which vary across space in some spatial systems. Geographically weighted regression embeds the latitude and longitude of the sample data into the regression parameters, and uses the local weighted least squares method to estimate the parameters point–by–point. To reveal the nonstationary spatial effects of driving forces, geographically weighted regression model is employed in this paper. The results show that economic growth is positively correlated with emissions, with the impact in the western region being less than that in the central and eastern regions. Urbanization is positively related to emissions but produces opposite effects pattern. Energy intensity is also correlated with emissions, with a decreasing trend from the eastern region to the central and western regions. Therefore, policymakers should take full account of the spatial nonstationarity of driving forces in designing emission reduction policies. - Highlights: • We explore the driving forces of CO_2 emissions in the agriculture sector. • Urbanization is positively related to emissions but produces opposite effect pattern. • The effect of energy intensity declines from the eastern region to western region.
Management of Industrial Performance Indicators: Regression Analysis and Simulation
Directory of Open Access Journals (Sweden)
Walter Roberto Hernandez Vergara
2017-11-01
Full Text Available Stochastic methods can be used in problem solving and explanation of natural phenomena through the application of statistical procedures. The article aims to associate the regression analysis and systems simulation, in order to facilitate the practical understanding of data analysis. The algorithms were developed in Microsoft Office Excel software, using statistical techniques such as regression theory, ANOVA and Cholesky Factorization, which made it possible to create models of single and multiple systems with up to five independent variables. For the analysis of these models, the Monte Carlo simulation and analysis of industrial performance indicators were used, resulting in numerical indices that aim to improve the goals’ management for compliance indicators, by identifying systems’ instability, correlation and anomalies. The analytical models presented in the survey indicated satisfactory results with numerous possibilities for industrial and academic applications, as well as the potential for deployment in new analytical techniques.
Regression tree analysis for predicting body weight of Nigerian Muscovy duck (Cairina moschata
Directory of Open Access Journals (Sweden)
Oguntunji Abel Olusegun
2017-01-01
Full Text Available Morphometric parameters and their indices are central to the understanding of the type and function of livestock. The present study was conducted to predict body weight (BWT of adult Nigerian Muscovy ducks from nine (9 morphometric parameters and seven (7 body indices and also to identify the most important predictor of BWT among them using regression tree analysis (RTA. The experimental birds comprised of 1,020 adult male and female Nigerian Muscovy ducks randomly sampled in Rain Forest (203, Guinea Savanna (298 and Derived Savanna (519 agro-ecological zones. Result of RTA revealed that compactness; body girth and massiveness were the most important independent variables in predicting BWT and were used in constructing RT. The combined effect of the three predictors was very high and explained 91.00% of the observed variation of the target variable (BWT. The optimal regression tree suggested that Muscovy ducks with compactness >5.765 would be fleshy and have highest BWT. The result of the present study could be exploited by animal breeders and breeding companies in selection and improvement of BWT of Muscovy ducks.
International Nuclear Information System (INIS)
Hong, Wei-Chiang
2011-01-01
Support vector regression (SVR), with hybrid chaotic sequence and evolutionary algorithms to determine suitable values of its three parameters, not only can effectively avoid converging prematurely (i.e., trapping into a local optimum), but also reveals its superior forecasting performance. Electric load sometimes demonstrates a seasonal (cyclic) tendency due to economic activities or climate cyclic nature. The applications of SVR models to deal with seasonal (cyclic) electric load forecasting have not been widely explored. In addition, the concept of recurrent neural networks (RNNs), focused on using past information to capture detailed information, is helpful to be combined into an SVR model. This investigation presents an electric load forecasting model which combines the seasonal recurrent support vector regression model with chaotic artificial bee colony algorithm (namely SRSVRCABC) to improve the forecasting performance. The proposed SRSVRCABC employs the chaotic behavior of honey bees which is with better performance in function optimization to overcome premature local optimum. A numerical example from an existed reference is used to elucidate the forecasting performance of the proposed SRSVRCABC model. The forecasting results indicate that the proposed model yields more accurate forecasting results than ARIMA and TF-ε-SVR-SA models. Therefore, the SRSVRCABC model is a promising alternative for electric load forecasting. -- Highlights: → Hybridizing the seasonal adjustment and the recurrent mechanism into an SVR model. → Employing chaotic sequence to improve the premature convergence of artificial bee colony algorithm. → Successfully providing significant accurate monthly load demand forecasting.
Estimating the Impact of Urbanization on Air Quality in China Using Spatial Regression Models
Directory of Open Access Journals (Sweden)
Chuanglin Fang
2015-11-01
Full Text Available Urban air pollution is one of the most visible environmental problems to have accompanied China’s rapid urbanization. Based on emission inventory data from 2014, gathered from 289 cities, we used Global and Local Moran’s I to measure the spatial autorrelation of Air Quality Index (AQI values at the city level, and employed Ordinary Least Squares (OLS, Spatial Lag Model (SAR, and Geographically Weighted Regression (GWR to quantitatively estimate the comprehensive impact and spatial variations of China’s urbanization process on air quality. The results show that a significant spatial dependence and heterogeneity existed in AQI values. Regression models revealed urbanization has played an important negative role in determining air quality in Chinese cities. The population, urbanization rate, automobile density, and the proportion of secondary industry were all found to have had a significant influence over air quality. Per capita Gross Domestic Product (GDP and the scale of urban land use, however, failed the significance test at 10% level. The GWR model performed better than global models and the results of GWR modeling show that the relationship between urbanization and air quality was not constant in space. Further, the local parameter estimates suggest significant spatial variation in the impacts of various urbanization factors on air quality.
Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.
2017-12-01
The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard
and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences...... within a nonparametric panel data regression framework. The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric......This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...
Bayesian nonlinear regression for large small problems
Chakraborty, Sounak; Ghosh, Malay; Mallick, Bani K.
2012-01-01
Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in the near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
Supporting Regularized Logistic Regression Privately and Efficiently
Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei
2016-01-01
As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely-used statistical model while at the same time has not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributing computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc. PMID:27271738
Structural Break Tests Robust to Regression Misspecification
Directory of Open Access Journals (Sweden)
Alaa Abi Morshed
2018-05-01
Full Text Available Structural break tests for regression models are sensitive to model misspecification. We show—analytically and through simulations—that the sup Wald test for breaks in the conditional mean and variance of a time series process exhibits severe size distortions when the conditional mean dynamics are misspecified. We also show that the sup Wald test for breaks in the unconditional mean and variance does not have the same size distortions, yet benefits from similar power to its conditional counterpart in correctly specified models. Hence, we propose using it as an alternative and complementary test for breaks. We apply the unconditional and conditional mean and variance tests to three US series: unemployment, industrial production growth and interest rates. Both the unconditional and the conditional mean tests detect a break in the mean of interest rates. However, for the other two series, the unconditional mean test does not detect a break, while the conditional mean tests based on dynamic regression models occasionally detect a break, with the implied break-point estimator varying across different dynamic specifications. For all series, the unconditional variance does not detect a break while most tests for the conditional variance do detect a break which also varies across specifications.
Supporting Regularized Logistic Regression Privately and Efficiently.
Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei
2016-01-01
As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely-used statistical model while at the same time has not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributing computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.
Bayesian nonlinear regression for large small problems
Chakraborty, Sounak
2012-07-01
Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik\\'s ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in the near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
Hyperspectral Unmixing with Robust Collaborative Sparse Regression
Directory of Open Access Journals (Sweden)
Chang Li
2016-07-01
Full Text Available Recently, sparse unmixing (SU of hyperspectral data has received particular attention for analyzing remote sensing images. However, most SU methods are based on the commonly admitted linear mixing model (LMM, which ignores the possible nonlinear effects (i.e., nonlinearity. In this paper, we propose a new method named robust collaborative sparse regression (RCSR based on the robust LMM (rLMM for hyperspectral unmixing. The rLMM takes the nonlinearity into consideration, and the nonlinearity is merely treated as outlier, which has the underlying sparse property. The RCSR simultaneously takes the collaborative sparse property of the abundance and sparsely distributed additive property of the outlier into consideration, which can be formed as a robust joint sparse regression problem. The inexact augmented Lagrangian method (IALM is used to optimize the proposed RCSR. The qualitative and quantitative experiments on synthetic datasets and real hyperspectral images demonstrate that the proposed RCSR is efficient for solving the hyperspectral SU problem compared with the other four state-of-the-art algorithms.
Supporting Regularized Logistic Regression Privately and Efficiently.
Directory of Open Access Journals (Sweden)
Wenfa Li
Full Text Available As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely-used statistical model while at the same time has not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributing computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.
Tracking time-varying parameters with local regression
DEFF Research Database (Denmark)
Joensen, Alfred Karsten; Nielsen, Henrik Aalborg; Nielsen, Torben Skov
2000-01-01
This paper shows that the recursive least-squares (RLS) algorithm with forgetting factor is a special case of a varying-coe\\$cient model, and a model which can easily be estimated via simple local regression. This observation allows us to formulate a new method which retains the RLS algorithm, bu......, but extends the algorithm by including polynomial approximations. Simulation results are provided, which indicates that this new method is superior to the classical RLS method, if the parameter variations are smooth....
Simultaneous confidence bands for Cox regression from semiparametric random censorship.
Mondal, Shoubhik; Subramanian, Sundarraman
2016-01-01
Cox regression is combined with semiparametric random censorship models to construct simultaneous confidence bands (SCBs) for subject-specific survival curves. Simulation results are presented to compare the performance of the proposed SCBs with the SCBs that are based only on standard Cox. The new SCBs provide correct empirical coverage and are more informative. The proposed SCBs are illustrated with two real examples. An extension to handle missing censoring indicators is also outlined.
Interpreting parameters in the logistic regression model with random effects
DEFF Research Database (Denmark)
Larsen, Klaus; Petersen, Jørgen Holm; Budtz-Jørgensen, Esben
2000-01-01
interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects......interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects...
Yelland, Lisa N; Salter, Amy B; Ryan, Philip
2011-10-15
Modified Poisson regression, which combines a log Poisson regression model with robust variance estimation, is a useful alternative to log binomial regression for estimating relative risks. Previous studies have shown both analytically and by simulation that modified Poisson regression is appropriate for independent prospective data. This method is often applied to clustered prospective data, despite a lack of evidence to support its use in this setting. The purpose of this article is to evaluate the performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data, by using generalized estimating equations to account for clustering. A simulation study is conducted to compare log binomial regression and modified Poisson regression for analyzing clustered data from intervention and observational studies. Both methods generally perform well in terms of bias, type I error, and coverage. Unlike log binomial regression, modified Poisson regression is not prone to convergence problems. The methods are contrasted by using example data sets from 2 large studies. The results presented in this article support the use of modified Poisson regression as an alternative to log binomial regression for analyzing clustered prospective data when clustering is taken into account by using generalized estimating equations.
Regressive Progression: The Quest for Self-Transcendence in Western Tragedy
Directory of Open Access Journals (Sweden)
Bahee Hadaegh
2009-07-01
Full Text Available Regressive progression is a concept which interestingly describes the developmental process of Western tragedy based on the recurring motif of the quest for the higher self and Nietzsche’s understanding of Dionysian tragic hero. This motif reveals itself in three manifestations - action, imagination and inaction- respectively visible in the three major dramatic eras of the Renaissance tragedy, European nineteenth-century drama, and the Absurd Theatre. Although the approach of the quest regressively shifts from action to inaction, the degree of success of the tragic questers in approximating the wished-for higher self reveals a progressive line in the developmental process of Western tragedy.
Fault trend prediction of device based on support vector regression
International Nuclear Information System (INIS)
Song Meicun; Cai Qi
2011-01-01
The research condition of fault trend prediction and the basic theory of support vector regression (SVR) were introduced. SVR was applied to the fault trend prediction of roller bearing, and compared with other methods (BP neural network, gray model, and gray-AR model). The results show that BP network tends to overlearn and gets into local minimum so that the predictive result is unstable. It also shows that the predictive result of SVR is stabilization, and SVR is superior to BP neural network, gray model and gray-AR model in predictive precision. SVR is a kind of effective method of fault trend prediction. (authors)
Dimethyl phenyl piperazine iodide (DMPP) induces glioma regression by inhibiting angiogenesis
Energy Technology Data Exchange (ETDEWEB)
He, Yan-qing; Li, Yan; Wang, Xiao-yu [Key Laboratory for Regenerative Medicine of the Ministry of Education, Division of Histology and Embryology, Medical College, Jinan University, Guangzhou 510632 (China); He, Xiao-dong [Institute of Vascular Biological Sciences, Guangdong Pharmaceutical University, Guangzhou 510006 (China); Jun, Li [Guangdong Provincial Key Laboratory of Bioengineering Medicine, National Engineering Research Centre of Genetic Medicine, College of Life Science and Technology, Jinan University, Guangzhou 510632 (China); Chuai, Manli [Division of Cell and Developmental Biology, University of Dundee, Dundee, DD1 5EH (United Kingdom); Lee, Kenneth Ka Ho [Key Laboratory for Regenerative Medicine of the Ministry of Education, School of Biomedical Sciences, Chinese University of Hong Kong, Shatin (Hong Kong); Wang, Ju [Guangdong Provincial Key Laboratory of Bioengineering Medicine, National Engineering Research Centre of Genetic Medicine, College of Life Science and Technology, Jinan University, Guangzhou 510632 (China); Wang, Li-jing, E-mail: wanglijing62@163.com [Institute of Vascular Biological Sciences, Guangdong Pharmaceutical University, Guangzhou 510006 (China); Yang, Xuesong, E-mail: yang_xuesong@126.com [Key Laboratory for Regenerative Medicine of the Ministry of Education, Division of Histology and Embryology, Medical College, Jinan University, Guangzhou 510632 (China)
2014-01-15
1,1-Dimethyl-4-phenyl piperazine iodide (DMPP) is a synthetic nicotinic acetylcholine receptor (nAChR) agonist that could reduce airway inflammation. In this study, we demonstrated that DMPP could dramatically inhibit glioma size maintained on the chick embryonic chorioallantoic membrane (CAM). We first performed MTT and BrdU incorporation experiments on U87 glioma cells in vitro to understand the mechanism involved. We established that DMPP did not significantly affect U87 cell proliferation and survival. We speculated that DMPP directly caused the tumor to regress by affecting the vasculature in and around the implanted tumor on our chick CAM model. Hence, we conducted detailed analysis of DMPP's inhibitory effects on angiogenesis. Three vasculogenesis and angiogenesis in vivo models were used in the study which included (1) early chick blood islands formation, (2) chick yolk-sac membrane (YSW) and (3) CAM models. The results revealed that DMPP directly suppressed all developmental stages involved in vasculogenesis and angiogenesis – possibly by acting through Ang-1 and HIF-2α signaling. In sum, our results show that DMPP could induce glioma regression grown on CAM by inhibiting vasculogenesis and angiogenesis. - Highlights: ●We demonstrated that DMPP inhibited the growth of glioma cells on chick CAM. ●DMPP did not significantly affect the proliferation and survival of U87 cells. ●We revealed that DMPP suppressed vasculogenesis and angiogenesis in chick embryo. ●Angiogenesis in chick CAM was inhibited by DMPP via most probably Ang-1 and HIF-2α. ●DMPP could be potentially developed as an anti-tumor drug in the future.
Dimethyl phenyl piperazine iodide (DMPP) induces glioma regression by inhibiting angiogenesis
International Nuclear Information System (INIS)
He, Yan-qing; Li, Yan; Wang, Xiao-yu; He, Xiao-dong; Jun, Li; Chuai, Manli; Lee, Kenneth Ka Ho; Wang, Ju; Wang, Li-jing; Yang, Xuesong
2014-01-01
1,1-Dimethyl-4-phenyl piperazine iodide (DMPP) is a synthetic nicotinic acetylcholine receptor (nAChR) agonist that could reduce airway inflammation. In this study, we demonstrated that DMPP could dramatically inhibit glioma size maintained on the chick embryonic chorioallantoic membrane (CAM). We first performed MTT and BrdU incorporation experiments on U87 glioma cells in vitro to understand the mechanism involved. We established that DMPP did not significantly affect U87 cell proliferation and survival. We speculated that DMPP directly caused the tumor to regress by affecting the vasculature in and around the implanted tumor on our chick CAM model. Hence, we conducted detailed analysis of DMPP's inhibitory effects on angiogenesis. Three vasculogenesis and angiogenesis in vivo models were used in the study which included (1) early chick blood islands formation, (2) chick yolk-sac membrane (YSW) and (3) CAM models. The results revealed that DMPP directly suppressed all developmental stages involved in vasculogenesis and angiogenesis – possibly by acting through Ang-1 and HIF-2α signaling. In sum, our results show that DMPP could induce glioma regression grown on CAM by inhibiting vasculogenesis and angiogenesis. - Highlights: ●We demonstrated that DMPP inhibited the growth of glioma cells on chick CAM. ●DMPP did not significantly affect the proliferation and survival of U87 cells. ●We revealed that DMPP suppressed vasculogenesis and angiogenesis in chick embryo. ●Angiogenesis in chick CAM was inhibited by DMPP via most probably Ang-1 and HIF-2α. ●DMPP could be potentially developed as an anti-tumor drug in the future
Robust Mediation Analysis Based on Median Regression
Yuan, Ying; MacKinnon, David P.
2014-01-01
Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. PMID:24079925
ANYOLS, Least Square Fit by Stepwise Regression
International Nuclear Information System (INIS)
Atwoods, C.L.; Mathews, S.
1986-01-01
Description of program or function: ANYOLS is a stepwise program which fits data using ordinary or weighted least squares. Variables are selected for the model in a stepwise way based on a user- specified input criterion or a user-written subroutine. The order in which variables are entered can be influenced by user-defined forcing priorities. Instead of stepwise selection, ANYOLS can try all possible combinations of any desired subset of the variables. Automatic output for the final model in a stepwise search includes plots of the residuals, 'studentized' residuals, and leverages; if the model is not too large, the output also includes partial regression and partial leverage plots. A data set may be re-used so that several selection criteria can be tried. Flexibility is increased by allowing the substitution of user-written subroutines for several default subroutines
Nonparametric additive regression for repeatedly measured data
Carroll, R. J.
2009-05-20
We develop an easily computed smooth backfitting algorithm for additive model fitting in repeated measures problems. Our methodology easily copes with various settings, such as when some covariates are the same over repeated response measurements. We allow for a working covariance matrix for the regression errors, showing that our method is most efficient when the correct covariance matrix is used. The component functions achieve the known asymptotic variance lower bound for the scalar argument case. Smooth backfitting also leads directly to design-independent biases in the local linear case. Simulations show our estimator has smaller variance than the usual kernel estimator. This is also illustrated by an example from nutritional epidemiology. © 2009 Biometrika Trust.
Adaptive regression for modeling nonlinear relationships
Knafl, George J
2016-01-01
This book presents methods for investigating whether relationships are linear or nonlinear and for adaptively fitting appropriate models when they are nonlinear. Data analysts will learn how to incorporate nonlinearity in one or more predictor variables into regression models for different types of outcome variables. Such nonlinear dependence is often not considered in applied research, yet nonlinear relationships are common and so need to be addressed. A standard linear analysis can produce misleading conclusions, while a nonlinear analysis can provide novel insights into data, not otherwise possible. A variety of examples of the benefits of modeling nonlinear relationships are presented throughout the book. Methods are covered using what are called fractional polynomials based on real-valued power transformations of primary predictor variables combined with model selection based on likelihood cross-validation. The book covers how to formulate and conduct such adaptive fractional polynomial modeling in the s...
Crime Modeling using Spatial Regression Approach
Saleh Ahmar, Ansari; Adiatma; Kasim Aidid, M.
2018-01-01
Act of criminality in Indonesia increased both variety and quantity every year. As murder, rape, assault, vandalism, theft, fraud, fencing, and other cases that make people feel unsafe. Risk of society exposed to crime is the number of reported cases in the police institution. The higher of the number of reporter to the police institution then the number of crime in the region is increasing. In this research, modeling criminality in South Sulawesi, Indonesia with the dependent variable used is the society exposed to the risk of crime. Modelling done by area approach is the using Spatial Autoregressive (SAR) and Spatial Error Model (SEM) methods. The independent variable used is the population density, the number of poor population, GDP per capita, unemployment and the human development index (HDI). Based on the analysis using spatial regression can be shown that there are no dependencies spatial both lag or errors in South Sulawesi.
Regression analysis for the social sciences
Gordon, Rachel A
2015-01-01
Provides graduate students in the social sciences with the basic skills they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly-available social science data or drawn from the literature. thorough integration of teaching statistical theory with teaching data processing and analysis. teaching of Stata and use of chapter exercises in which students practice programming and interpretation on the same data set. A separate set of exercises allows students to select a data set to apply the concepts learned in each chapter to a research question of interest to them, all updated for this edition.
Least square regularized regression in sum space.
Xu, Yong-Li; Chen, Di-Rong; Li, Han-Xiong; Liu, Lu
2013-04-01
This paper proposes a least square regularized regression algorithm in sum space of reproducing kernel Hilbert spaces (RKHSs) for nonflat function approximation, and obtains the solution of the algorithm by solving a system of linear equations. This algorithm can approximate the low- and high-frequency component of the target function with large and small scale kernels, respectively. The convergence and learning rate are analyzed. We measure the complexity of the sum space by its covering number and demonstrate that the covering number can be bounded by the product of the covering numbers of basic RKHSs. For sum space of RKHSs with Gaussian kernels, by choosing appropriate parameters, we tradeoff the sample error and regularization error, and obtain a polynomial learning rate, which is better than that in any single RKHS. The utility of this method is illustrated with two simulated data sets and five real-life databases.
Model Selection in Kernel Ridge Regression
DEFF Research Database (Denmark)
Exterkate, Peter
Kernel ridge regression is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts. This paper investigates the influence of the choice of kernel and the setting of tuning parameters on forecast accuracy. We review several popular kernels......, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. We interpret the latter two kernels in terms of their smoothing properties, and we relate the tuning parameters associated to all these kernels to smoothness measures of the prediction function and to the signal-to-noise ratio. Based...... on these interpretations, we provide guidelines for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study confirms the practical usefulness of these rules of thumb. Finally, the flexible and smooth functional forms provided by the Gaussian and Sinc kernels makes them widely...
Learning Inverse Rig Mappings by Nonlinear Regression.
Holden, Daniel; Saito, Jun; Komura, Taku
2017-03-01
We present a framework to design inverse rig-functions-functions that map low level representations of a character's pose such as joint positions or surface geometry to the representation used by animators called the animation rig. Animators design scenes using an animation rig, a framework widely adopted in animation production which allows animators to design character poses and geometry via intuitive parameters and interfaces. Yet most state-of-the-art computer animation techniques control characters through raw, low level representations such as joint angles, joint positions, or vertex coordinates. This difference often stops the adoption of state-of-the-art techniques in animation production. Our framework solves this issue by learning a mapping between the low level representations of the pose and the animation rig. We use nonlinear regression techniques, learning from example animation sequences designed by the animators. When new motions are provided in the skeleton space, the learned mapping is used to estimate the rig controls that reproduce such a motion. We introduce two nonlinear functions for producing such a mapping: Gaussian process regression and feedforward neural networks. The appropriate solution depends on the nature of the rig and the amount of data available for training. We show our framework applied to various examples including articulated biped characters, quadruped characters, facial animation rigs, and deformable characters. With our system, animators have the freedom to apply any motion synthesis algorithm to arbitrary rigging and animation pipelines for immediate editing. This greatly improves the productivity of 3D animation, while retaining the flexibility and creativity of artistic input.
DRREP: deep ridge regressed epitope predictor.
Sher, Gene; Zhi, Degui; Zhang, Shaojie
2017-10-03
The ability to predict epitopes plays an enormous role in vaccine development in terms of our ability to zero in on where to do a more thorough in-vivo analysis of the protein in question. Though for the past decade there have been numerous advancements and improvements in epitope prediction, on average the best benchmark prediction accuracies are still only around 60%. New machine learning algorithms have arisen within the domain of deep learning, text mining, and convolutional networks. This paper presents a novel analytically trained and string kernel using deep neural network, which is tailored for continuous epitope prediction, called: Deep Ridge Regressed Epitope Predictor (DRREP). DRREP was tested on long protein sequences from the following datasets: SARS, Pellequer, HIV, AntiJen, and SEQ194. DRREP was compared to numerous state of the art epitope predictors, including the most recently published predictors called LBtope and DMNLBE. Using area under ROC curve (AUC), DRREP achieved a performance improvement over the best performing predictors on SARS (13.7%), HIV (8.9%), Pellequer (1.5%), and SEQ194 (3.1%), with its performance being matched only on the AntiJen dataset, by the LBtope predictor, where both DRREP and LBtope achieved an AUC of 0.702. DRREP is an analytically trained deep neural network, thus capable of learning in a single step through regression. By combining the features of deep learning, string kernels, and convolutional networks, the system is able to perform residue-by-residue prediction of continues epitopes with higher accuracy than the current state of the art predictors.
Classifying machinery condition using oil samples and binary logistic regression
Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.
2015-08-01
The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding and hence some trust in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) can be difficult to provide ease of interpretability. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight to the drivers of the human classification process and to the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real world oil analysis data set from engines on mining trucks is presented and using cross-validation we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.
Forecasting urban water demand: A meta-regression analysis.
Sebri, Maamar
2016-12-01
Water managers and planners require accurate water demand forecasts over the short-, medium- and long-term for many purposes. These range from assessing water supply needs over spatial and temporal patterns to optimizing future investments and planning future allocations across competing sectors. This study surveys the empirical literature on the urban water demand forecasting using the meta-analytical approach. Specifically, using more than 600 estimates, a meta-regression analysis is conducted to identify explanations of cross-studies variation in accuracy of urban water demand forecasting. Our study finds that accuracy depends significantly on study characteristics, including demand periodicity, modeling method, forecasting horizon, model specification and sample size. The meta-regression results remain robust to different estimators employed as well as to a series of sensitivity checks performed. The importance of these findings lies in the conclusions and implications drawn out for regulators and policymakers and for academics alike. Copyright © 2016. Published by Elsevier Ltd.
Climate reconstruction by regression - 32 variations on a theme
Energy Technology Data Exchange (ETDEWEB)
Buerger, Gerd; Fast, Irina; Cubasch, Ulrich [FU Berlin (Germany). Inst. fuer Meteorologie
2006-02-15
Regression-based methods fail to provide a sufficiently unique reconstruction of a given millennial history of Northern Hemisphere mean temperature. They instead offer a multitude of variants, depending on the specific data processing scheme. Using a simulated climate history with noise-disturbed pseudo-proxies, we systematically test a set of such configurations, each of which appears to be a priori reasonable, with existing applications elsewhere. This results in an entire spectrum between practically useless and almost perfect reconstructions. The reason lies in the fact that the training variations are not representative of the full millennium, and the regression equations have to be extrapolated. This creates an error that is proportional to both the model uncertainty and the proxy amplitudes. Estimation of that uncertainty is paramount for a useful millennial reconstruction, especially if it is of the parameter-loaded multiproxy type.
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article are further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
Modeling and prediction of flotation performance using support vector regression
Directory of Open Access Journals (Sweden)
Despotović Vladimir
2017-01-01
Full Text Available Continuous efforts have been made in recent year to improve the process of paper recycling, as it is of critical importance for saving the wood, water and energy resources. Flotation deinking is considered to be one of the key methods for separation of ink particles from the cellulose fibres. Attempts to model the flotation deinking process have often resulted in complex models that are difficult to implement and use. In this paper a model for prediction of flotation performance based on Support Vector Regression (SVR, is presented. Representative data samples were created in laboratory, under a variety of practical control variables for the flotation deinking process, including different reagents, pH values and flotation residence time. Predictive model was created that was trained on these data samples, and the flotation performance was assessed showing that Support Vector Regression is a promising method even when dataset used for training the model is limited.
Modeling the number of car theft using Poisson regression
Zulkifli, Malina; Ling, Agnes Beh Yen; Kasim, Maznah Mat; Ismail, Noriszura
2016-10-01
Regression analysis is the most popular statistical methods used to express the relationship between the variables of response with the covariates. The aim of this paper is to evaluate the factors that influence the number of car theft using Poisson regression model. This paper will focus on the number of car thefts that occurred in districts in Peninsular Malaysia. There are two groups of factor that have been considered, namely district descriptive factors and socio and demographic factors. The result of the study showed that Bumiputera composition, Chinese composition, Other ethnic composition, foreign migration, number of residence with the age between 25 to 64, number of employed person and number of unemployed person are the most influence factors that affect the car theft cases. These information are very useful for the law enforcement department, insurance company and car owners in order to reduce and limiting the car theft cases in Peninsular Malaysia.
Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression
Energy Technology Data Exchange (ETDEWEB)
Verdoolaege, G., E-mail: geert.verdoolaege@ugent.be [Department of Applied Physics, Ghent University, B-9000 Ghent (Belgium); Laboratory for Plasma Physics, Royal Military Academy, B-1000 Brussels (Belgium); Shabbir, A. [Department of Applied Physics, Ghent University, B-9000 Ghent (Belgium); Max Planck Institute for Plasma Physics, Boltzmannstr. 2, 85748 Garching (Germany); Hornung, G. [Department of Applied Physics, Ghent University, B-9000 Ghent (Belgium)
2016-11-15
Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression that is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of application of GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standard least squares.
Continuous validation of ASTEC containment models and regression testing
International Nuclear Information System (INIS)
Nowack, Holger; Reinke, Nils; Sonnenkalb, Martin
2014-01-01
The focus of the ASTEC (Accident Source Term Evaluation Code) development at GRS is primarily on the containment module CPA (Containment Part of ASTEC), whose modelling is to a large extent based on the GRS containment code COCOSYS (COntainment COde SYStem). Validation is usually understood as the approval of the modelling capabilities by calculations of appropriate experiments done by external users different from the code developers. During the development process of ASTEC CPA, bugs and unintended side effects may occur, which leads to changes in the results of the initially conducted validation. Due to the involvement of a considerable number of developers in the coding of ASTEC modules, validation of the code alone, even if executed repeatedly, is not sufficient. Therefore, a regression testing procedure has been implemented in order to ensure that the initially obtained validation results are still valid with succeeding code versions. Within the regression testing procedure, calculations of experiments and plant sequences are performed with the same input deck but applying two different code versions. For every test-case the up-to-date code version is compared to the preceding one on the basis of physical parameters deemed to be characteristic for the test-case under consideration. In the case of post-calculations of experiments also a comparison to experimental data is carried out. Three validation cases from the regression testing procedure are presented within this paper. The very good post-calculation of the HDR E11.1 experiment shows the high quality modelling of thermal-hydraulics in ASTEC CPA. Aerosol behaviour is validated on the BMC VANAM M3 experiment, and the results show also a very good agreement with experimental data. Finally, iodine behaviour is checked in the validation test-case of the THAI IOD-11 experiment. Within this test-case, the comparison of the ASTEC versions V2.0r1 and V2.0r2 shows how an error was detected by the regression testing
Luo, Lei; Yang, Jian; Qian, Jianjun; Tai, Ying; Lu, Gui-Fu
2017-09-01
Dealing with partial occlusion or illumination is one of the most challenging problems in image representation and classification. In this problem, the characterization of the representation error plays a crucial role. In most current approaches, the error matrix needs to be stretched into a vector and each element is assumed to be independently corrupted. This ignores the dependence between the elements of error. In this paper, it is assumed that the error image caused by partial occlusion or illumination changes is a random matrix variate and follows the extended matrix variate power exponential distribution. This has the heavy tailed regions and can be used to describe a matrix pattern of l×m dimensional observations that are not independent. This paper reveals the essence of the proposed distribution: it actually alleviates the correlations between pixels in an error matrix E and makes E approximately Gaussian. On the basis of this distribution, we derive a Schatten p -norm-based matrix regression model with L q regularization. Alternating direction method of multipliers is applied to solve this model. To get a closed-form solution in each step of the algorithm, two singular value function thresholding operators are introduced. In addition, the extended Schatten p -norm is utilized to characterize the distance between the test samples and classes in the design of the classifier. Extensive experimental results for image reconstruction and classification with structural noise demonstrate that the proposed algorithm works much more robustly than some existing regression-based methods.
International Nuclear Information System (INIS)
Driemel, Oliver; Ettl, Tobias; Reichert, Torsten E.; Koelbl, Oliver; Dresp, Bernd V.; Reuther, Juergen; Pistner, Hans
2009-01-01
Background and purpose: preoperative radiochemotherapy has been reported to enhance tumor response and to improve long-term survival in advanced squamous cell carcinoma of the head and neck. This retrospective study evaluates regression rate and long-term survival in 228 patients with primary oral squamous cell carcinoma treated by neoadjuvant radiochemotherapy and radical surgery. Patients and methods: all patients with biopsy-proven, resectable oral squamous cell carcinoma - TNM stages II-IV without distant metastasis - received preoperative treatment consisting of fractioned irradiation of the primary and the regional lymph nodes with a total dose of 40 Gy and additional cisplatin (n = 160) or carboplatin (n = 68) during the 1st week of treatment. Radical surgery and neck dissection followed after a delay of 10-14 days. The study only included cases with histologically negative resection margins. Results: after a median follow-up of 5.2 years, 53 patients (23.2%) had experienced local-regional recurrence. The median 2-year disease-specific survival (DSS) rate was 86.2%. 5-year DSS and 10-year DSS were 76.3% and 66.7%, respectively. Complete histological local tumor regression after surgery (ypTO) was observed in 50 patients (21.9%) and was independent of pretreatment tumor classification. Uni- and multivariate survival analysis revealed that ypT- and ypN-stage were the most decisive predictors for DSS. Conclusion: preoperative radiochemotherapy with cisplatin/carboplatin followed by radical surgery attains favorable long-term survival rates. This applies especially to cases with complete histological tumor regression after radiochemotherapy, which can be assumed for one of five patients. (orig.)
Šarić, Željko; Xu, Xuecai; Duan, Li; Babić, Darko
2018-06-20
This study intended to investigate the interactions between accident rate and traffic signs in state roads located in Croatia, and accommodate the heterogeneity attributed to unobserved factors. The data from 130 state roads between 2012 and 2016 were collected from Traffic Accident Database System maintained by the Republic of Croatia Ministry of the Interior. To address the heterogeneity, a panel quantile regression model was proposed, in which quantile regression model offers a more complete view and a highly comprehensive analysis of the relationship between accident rate and traffic signs, while the panel data model accommodates the heterogeneity attributed to unobserved factors. Results revealed that (1) low visibility of material damage (MD) and death or injured (DI) increased the accident rate; (2) the number of mandatory signs and the number of warning signs were more likely to reduce the accident rate; (3)average speed limit and the number of invalid traffic signs per km exhibited a high accident rate. To our knowledge, it's the first attempt to analyze the interactions between accident consequences and traffic signs by employing a panel quantile regression model; by involving the visibility, the present study demonstrates that the low visibility causes a relatively higher risk of MD and DI; It is noteworthy that average speed limit corresponds with accident rate positively; The number of mandatory signs and the number of warning signs are more likely to reduce the accident rate; The number of invalid traffic signs per km are significant for accident rate, thus regular maintenance should be kept for a safer roadway environment.
AN APPLICATION OF FUNCTIONAL MULTIVARIATE REGRESSION MODEL TO MULTICLASS CLASSIFICATION
Krzyśko, Mirosław; Smaga, Łukasz
2017-01-01
In this paper, the scale response functional multivariate regression model is considered. By using the basis functions representation of functional predictors and regression coefficients, this model is rewritten as a multivariate regression model. This representation of the functional multivariate regression model is used for multiclass classification for multivariate functional data. Computational experiments performed on real labelled data sets demonstrate the effectiveness of the proposed ...
Bakker, D.P.; Busscher, H.J.; Zanten, J. van; Vries, J. de; Klijnstra, J.W.; Mei, H.C. van der
2004-01-01
Many studies have shown relationships of substratum hydrophobicity, charge or roughness with bacterial adhesion, although bacterial adhesion is governed by interplay of different physico-chemical properties and multiple regression analysis would be more suitable to reveal mechanisms of bacterial
Bakker, Dewi P; Busscher, Henk J; van Zanten, Joyce; de Vries, Jacob; Klijnstra, Job W; van der Mei, Henny C
Many studies have shown relationships of substratum hydrophobicity, charge or roughness with bacterial adhesion, although bacterial adhesion is governed by interplay of different physico-chemical properties and multiple regression analysis would be more suitable to reveal mechanisms of bacterial
Introduction to the use of regression models in epidemiology.
Bender, Ralf
2009-01-01
Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
Poisson Regression Analysis of Illness and Injury Surveillance Data
Energy Technology Data Exchange (ETDEWEB)
Frome E.L., Watkins J.P., Ellis E.D.
2012-12-12
The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra
Automation of Flight Software Regression Testing
Tashakkor, Scott B.
2016-01-01
NASA is developing the Space Launch System (SLS) to be a heavy lift launch vehicle supporting human and scientific exploration beyond earth orbit. SLS will have a common core stage, an upper stage, and different permutations of boosters and fairings to perform various crewed or cargo missions. Marshall Space Flight Center (MSFC) is writing the Flight Software (FSW) that will operate the SLS launch vehicle. The FSW is developed in an incremental manner based on "Agile" software techniques. As the FSW is incrementally developed, testing the functionality of the code needs to be performed continually to ensure that the integrity of the software is maintained. Manually testing the functionality on an ever-growing set of requirements and features is not an efficient solution and therefore needs to be done automatically to ensure testing is comprehensive. To support test automation, a framework for a regression test harness has been developed and used on SLS FSW. The test harness provides a modular design approach that can compile or read in the required information specified by the developer of the test. The modularity provides independence between groups of tests and the ability to add and remove tests without disturbing others. This provides the SLS FSW team a time saving feature that is essential to meeting SLS Program technical and programmatic requirements. During development of SLS FSW, this technique has proved to be a useful tool to ensure all requirements have been tested, and that desired functionality is maintained, as changes occur. It also provides a mechanism for developers to check functionality of the code that they have developed. With this system, automation of regression testing is accomplished through a scheduling tool and/or commit hooks. Key advantages of this test harness capability includes execution support for multiple independent test cases, the ability for developers to specify precisely what they are testing and how, the ability to add
Laplacian embedded regression for scalable manifold regularization.
Chen, Lin; Tsang, Ivor W; Xu, Dong
2012-06-01
Semi-supervised learning (SSL), as a powerful tool to learn from a limited number of labeled data and a large number of unlabeled data, has been attracting increasing attention in the machine learning community. In particular, the manifold regularization framework has laid solid theoretical foundations for a large family of SSL algorithms, such as Laplacian support vector machine (LapSVM) and Laplacian regularized least squares (LapRLS). However, most of these algorithms are limited to small scale problems due to the high computational cost of the matrix inversion operation involved in the optimization problem. In this paper, we propose a novel framework called Laplacian embedded regression by introducing an intermediate decision variable into the manifold regularization framework. By using ∈-insensitive loss, we obtain the Laplacian embedded support vector regression (LapESVR) algorithm, which inherits the sparse solution from SVR. Also, we derive Laplacian embedded RLS (LapERLS) corresponding to RLS under the proposed framework. Both LapESVR and LapERLS possess a simpler form of a transformed kernel, which is the summation of the original kernel and a graph kernel that captures the manifold structure. The benefits of the transformed kernel are two-fold: (1) we can deal with the original kernel matrix and the graph Laplacian matrix in the graph kernel separately and (2) if the graph Laplacian matrix is sparse, we only need to perform the inverse operation for a sparse matrix, which is much more efficient when compared with that for a dense one. Inspired by kernel principal component analysis, we further propose to project the introduced decision variable into a subspace spanned by a few eigenvectors of the graph Laplacian matrix in order to better reflect the data manifold, as well as accelerate the calculation of the graph kernel, allowing our methods to efficiently and effectively cope with large scale SSL problems. Extensive experiments on both toy and real
On macroeconomic values investigation using fuzzy linear regression analysis
Directory of Open Access Journals (Sweden)
Richard Pospíšil
2017-06-01
Full Text Available The theoretical background for abstract formalization of the vague phenomenon of complex systems is the fuzzy set theory. In the paper, vague data is defined as specialized fuzzy sets - fuzzy numbers and there is described a fuzzy linear regression model as a fuzzy function with fuzzy numbers as vague parameters. To identify the fuzzy coefficients of the model, the genetic algorithm is used. The linear approximation of the vague function together with its possibility area is analytically and graphically expressed. A suitable application is performed in the tasks of the time series fuzzy regression analysis. The time-trend and seasonal cycles including their possibility areas are calculated and expressed. The examples are presented from the economy field, namely the time-development of unemployment, agricultural production and construction respectively between 2009 and 2011 in the Czech Republic. The results are shown in the form of the fuzzy regression models of variables of time series. For the period 2009-2011, the analysis assumptions about seasonal behaviour of variables and the relationship between them were confirmed; in 2010, the system behaved fuzzier and the relationships between the variables were vaguer, that has a lot of causes, from the different elasticity of demand, through state interventions to globalization and transnational impacts.
Predicting company growth using logistic regression and neural networks
Directory of Open Access Journals (Sweden)
Marijana Zekić-Sušac
2016-12-01
Full Text Available The paper aims to establish an efficient model for predicting company growth by leveraging the strengths of logistic regression and neural networks. A real dataset of Croatian companies was used which described the relevant industry sector, financial ratios, income, and assets in the input space, with a dependent binomial variable indicating whether a company had high-growth if it had annualized growth in assets by more than 20% a year over a three-year period. Due to a large number of input variables, factor analysis was performed in the pre -processing stage in order to extract the most important input components. Building an efficient model with a high classification rate and explanatory ability required application of two data mining methods: logistic regression as a parametric and neural networks as a non -parametric method. The methods were tested on the models with and without variable reduction. The classification accuracy of the models was compared using statistical tests and ROC curves. The results showed that neural networks produce a significantly higher classification accuracy in the model when incorporating all available variables. The paper further discusses the advantages and disadvantages of both approaches, i.e. logistic regression and neural networks in modelling company growth. The suggested model is potentially of benefit to investors and economic policy makers as it provides support for recognizing companies with growth potential, especially during times of economic downturn.
Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate
Directory of Open Access Journals (Sweden)
Minh Vu Trieu
2017-03-01
Full Text Available This paper presents statistical analyses of rock engineering properties and the measured penetration rate of tunnel boring machine (TBM based on the data of an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS, Brazilian tensile strength (BTS, rock brittleness index (BI, the distance between planes of weakness (DPW, and the alpha angle (Alpha between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP. Four (4 statistical regression models (two linear and two nonlinear are built to predict the ROP of TBM. Finally a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict the TBM performance. The R-squared value (R2 of the fuzzy logic model scores the highest value of 0.714 over the second runner-up of 0.667 from the multiple variables nonlinear regression model.
Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate
Minh, Vu Trieu; Katushin, Dmitri; Antonov, Maksim; Veinthal, Renno
2017-03-01
This paper presents statistical analyses of rock engineering properties and the measured penetration rate of tunnel boring machine (TBM) based on the data of an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), rock brittleness index (BI), the distance between planes of weakness (DPW), and the alpha angle (Alpha) between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP). Four (4) statistical regression models (two linear and two nonlinear) are built to predict the ROP of TBM. Finally a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict the TBM performance. The R-squared value (R2) of the fuzzy logic model scores the highest value of 0.714 over the second runner-up of 0.667 from the multiple variables nonlinear regression model.
Quality of life in breast cancer patients--a quantile regression analysis.
Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma
2008-01-01
Quality of life study has an important role in health care especially in chronic diseases, in clinical judgment and in medical resources supplying. Statistical tools like linear regression are widely used to assess the predictors of quality of life. But when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients, using quantile regression model and compare to linear regression. A cross-sectional study conducted on 119 breast cancer patients that admitted and treated in chemotherapy ward of Namazi hospital in Shiraz. We used QLQ-C30 questionnaire to assessment quality of life in these patients. A quantile regression was employed to assess the assocciated factors and the results were compared to linear regression. All analysis carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In spite of linear regression, financial difficulties were not significant in quantile regression analysis and dyspnea was only significant for first quartile. Also emotion functioning and duration of disease statistically predicted the QOL score in the third quartile. The results have demonstrated that using quantile regression leads to better interpretation and richer inference about predictors of the breast cancer patient quality of life.
DNA-Cytometry of Progressive and Regressive Cervical Intraepithelial Neoplasia
Directory of Open Access Journals (Sweden)
Antonius G. J. M. Hanselaar
1998-01-01
Full Text Available A retrospective analysis was performed on archival cervical smears from a group of 56 women with cervical intraepithelial neoplasia (CIN, who had received follow‐up by cytology only. Automated image cytometry of Feulgen‐stained DNA was used to determine the differences between progressive and regressive lesions. The first group of 30 smears was from women who had developed cancer after initial smears with dysplastic changes (progressive group. The second group of 26 smears with dysplastic changes had shown regression to normal (regressive group. The goal of the study was to determine if differences in cytometric features existed between the progressive and regressive groups. CIN categories I, II and III were represented in both groups, and measurements were pooled across diagnostic categories. Images of up to 700 intermediate cells were obtained from each slide, and cells were scanned exhaustively for the detection of diagnostic cells. Discriminant function analysis was performed for both intermediate and diagnostic cells. The most significant differences between the groups were found for diagnostic cells, with a cell classification accuracy of 82%. Intermediate cells could be classified with 60% accuracy. Cytometric features which afforded the best discrimination were characteristic of the chromatin organization in diagnostic cells (nuclear texture. Slide classification was performed by thresholding the number of cells which exhibited progression associated changes (PAC in chromatin configuration, with an accuracy of 93 and 73% for diagnostic and intermediate cells, respectively. These results indicate that regardless of the extent of nuclear atypia as reflected in the CIN category, features of chromatin organization can potentially be used to predict the malignant or progressive potential of CIN lesions.
DEFF Research Database (Denmark)
Vlachos, Evgenios; Schärfe, Henrik
2012-01-01
This work presents a method for designing facial interfaces for sociable android robots with respect to the fundamental rules of human affect expression. Extending the work of Paul Ekman towards a robotic direction, we follow the judgment-based approach for evaluating facial expressions to test...... findings are based on the results derived from a number of judgments, and suggest that before programming the facial expressions of a Geminoid, the Original should pass through the proposed procedure. According to our recommendations, the facial expressions of an android should be tested by judges, even...... in which case an android robot like the Geminoid|DK –a duplicate of an Original person- reveals emotions convincingly; when following an empirical perspective, or when following a theoretical one. The methodology includes the processes of acquiring the empirical data, and gathering feedback on them. Our...
Hecht, Jeffrey B.
The analysis of regression residuals and detection of outliers are discussed, with emphasis on determining how deviant an individual data point must be to be considered an outlier and the impact that multiple suspected outlier data points have on the process of outlier determination and treatment. Only bivariate (one dependent and one independent)…
Geographically weighted regression as a generalized Wombling to detect barriers to gene flow.
Diniz-Filho, José Alexandre Felizola; Soares, Thannya Nascimento; de Campos Telles, Mariana Pires
2016-08-01
Barriers to gene flow play an important role in structuring populations, especially in human-modified landscapes, and several methods have been proposed to detect such barriers. However, most applications of these methods require a relative large number of individuals or populations distributed in space, connected by vertices from Delaunay or Gabriel networks. Here we show, using both simulated and empirical data, a new application of geographically weighted regression (GWR) to detect such barriers, modeling the genetic variation as a "local" linear function of geographic coordinates (latitude and longitude). In the GWR, standard regression statistics, such as R(2) and slopes, are estimated for each sampling unit and thus are mapped. Peaks in these local statistics are then expected close to the barriers if genetic discontinuities exist, capturing a higher rate of population differentiation among neighboring populations. Isolation-by-Distance simulations on a longitudinally warped lattice revealed that higher local slopes from GWR coincide with the barrier detected with Monmonier algorithm. Even with a relatively small effect of the barrier, the power of local GWR in detecting the east-west barriers was higher than 95 %. We also analyzed empirical data of genetic differentiation among tree populations of Dipteryx alata and Eugenia dysenterica Brazilian Cerrado. GWR was applied to the principal coordinate of the pairwise FST matrix based on microsatellite loci. In both simulated and empirical data, the GWR results were consistent with discontinuities detected by Monmonier algorithm, as well as with previous explanations for the spatial patterns of genetic differentiation for the two species. Our analyses reveal how this new application of GWR can viewed as a generalized Wombling in a continuous space and be a useful approach to detect barriers and discontinuities to gene flow.
Regression analysis of sparse asynchronous longitudinal data.
Cao, Hongyuan; Zeng, Donglin; Fine, Jason P
2015-09-01
We consider estimation of regression models for sparse asynchronous longitudinal observations, where time-dependent responses and covariates are observed intermittently within subjects. Unlike with synchronous data, where the response and covariates are observed at the same time point, with asynchronous data, the observation times are mismatched. Simple kernel-weighted estimating equations are proposed for generalized linear models with either time invariant or time-dependent coefficients under smoothness assumptions for the covariate processes which are similar to those for synchronous data. For models with either time invariant or time-dependent coefficients, the estimators are consistent and asymptotically normal but converge at slower rates than those achieved with synchronous data. Simulation studies evidence that the methods perform well with realistic sample sizes and may be superior to a naive application of methods for synchronous data based on an ad hoc last value carried forward approach. The practical utility of the methods is illustrated on data from a study on human immunodeficiency virus.
Free Software Development. 1. Fitting Statistical Regressions
Directory of Open Access Journals (Sweden)
Lorentz JÄNTSCHI
2002-12-01
Full Text Available The present paper is focused on modeling of statistical data processing with applications in field of material science and engineering. A new method of data processing is presented and applied on a set of 10 Ni–Mn–Ga ferromagnetic ordered shape memory alloys that are known to exhibit phonon softening and soft mode condensation into a premartensitic phase prior to the martensitic transformation itself. The method allows to identify the correlations between data sets and to exploit them later in statistical study of alloys. An algorithm for computing data was implemented in preprocessed hypertext language (PHP, a hypertext markup language interface for them was also realized and put onto comp.east.utcluj.ro educational web server, and it is accessible via http protocol at the address http://vl.academicdirect.ro/applied_statistics/linear_regression/multiple/v1.5/. The program running for the set of alloys allow to identify groups of alloys properties and give qualitative measure of correlations between properties. Surfaces of property dependencies are also fitted.
Kepler AutoRegressive Planet Search (KARPS)
Caceres, Gabriel
2018-01-01
One of the main obstacles in detecting faint planetary transits is the intrinsic stellar variability of the host star. The Kepler AutoRegressive Planet Search (KARPS) project implements statistical methodology associated with autoregressive processes (in particular, ARIMA and ARFIMA) to model stellar lightcurves in order to improve exoplanet transit detection. We also develop a novel Transit Comb Filter (TCF) applied to the AR residuals which provides a periodogram analogous to the standard Box-fitting Least Squares (BLS) periodogram. We train a random forest classifier on known Kepler Objects of Interest (KOIs) using select features from different stages of this analysis, and then use ROC curves to define and calibrate the criteria to recover the KOI planet candidates with high fidelity. These statistical methods are detailed in a contributed poster (Feigelson et al., this meeting).These procedures are applied to the full DR25 dataset of NASA’s Kepler mission. Using the classification criteria, a vast majority of known KOIs are recovered and dozens of new KARPS Candidate Planets (KCPs) discovered, including ultra-short period exoplanets. The KCPs will be briefly presented and discussed.
DNBR Prediction Using a Support Vector Regression
International Nuclear Information System (INIS)
Yang, Heon Young; Na, Man Gyun
2008-01-01
PWRs (Pressurized Water Reactors) generally operate in the nucleate boiling state. However, the conversion of nucleate boiling into film boiling with conspicuously reduced heat transfer induces a boiling crisis that may cause the fuel clad melting in the long run. This type of boiling crisis is called Departure from Nucleate Boiling (DNB) phenomena. Because the prediction of minimum DNBR in a reactor core is very important to prevent the boiling crisis such as clad melting, a lot of research has been conducted to predict DNBR values. The object of this research is to predict minimum DNBR applying support vector regression (SVR) by using the measured signals of a reactor coolant system (RCS). The SVR has extensively and successfully been applied to nonlinear function approximation like the proposed problem for estimating DNBR values that will be a function of various input variables such as reactor power, reactor pressure, core mass flowrate, control rod positions and so on. The minimum DNBR in a reactor core is predicted using these various operating condition data as the inputs to the SVR. The minimum DBNR values predicted by the SVR confirm its correctness compared with COLSS values
Analysis of γ spectra in airborne radioactivity measurements using multiple linear regressions
International Nuclear Information System (INIS)
Bao Min; Shi Quanlin; Zhang Jiamei
2004-01-01
This paper describes the net peak counts calculating of nuclide 137 Cs at 662 keV of γ spectra in airborne radioactivity measurements using multiple linear regressions. Mathematic model is founded by analyzing every factor that has contribution to Cs peak counts in spectra, and multiple linear regression function is established. Calculating process adopts stepwise regression, and the indistinctive factors are eliminated by F check. The regression results and its uncertainty are calculated using Least Square Estimation, then the Cs peak net counts and its uncertainty can be gotten. The analysis results for experimental spectrum are displayed. The influence of energy shift and energy resolution on the analyzing result is discussed. In comparison with the stripping spectra method, multiple linear regression method needn't stripping radios, and the calculating result has relation with the counts in Cs peak only, and the calculating uncertainty is reduced. (authors)
Weighted SGD for ℓp Regression with Randomized Preconditioning*
Yang, Jiyan; Chow, Yin-Lam; Ré, Christopher; Mahoney, Michael W.
2018-01-01
prediction norm in 𝒪(log n·nnz(A)+poly(d) log(1/ε)/ε) time. We show that for unconstrained ℓ2 regression, this complexity is comparable to that of RLA and is asymptotically better over several state-of-the-art solvers in the regime where the desired accuracy ε, high dimension n and low dimension d satisfy d ≥ 1/ε and n ≥ d2/ε. We also provide lower bounds on the coreset complexity for more general regression problems, indicating that still new ideas will be needed to extend similar RLA preconditioning ideas to weighted SGD algorithms for more general regression problems. Finally, the effectiveness of such algorithms is illustrated numerically on both synthetic and real datasets, and the results are consistent with our theoretical findings and demonstrate that pwSGD converges to a medium-precision solution, e.g., ε = 10−3, more quickly. PMID:29782626
Penalized estimation for competing risks regression with applications to high-dimensional covariates
DEFF Research Database (Denmark)
Ambrogi, Federico; Scheike, Thomas H.
2016-01-01
of competing events. The direct binomial regression model of Scheike and others (2008. Predicting cumulative incidence probability by direct binomial regression. Biometrika 95: (1), 205-220) is reformulated in a penalized framework to possibly fit a sparse regression model. The developed approach is easily...... Research 19: (1), 29-51), the research regarding competing risks is less developed (Binder and others, 2009. Boosting for high-dimensional time-to-event data with competing risks. Bioinformatics 25: (7), 890-896). The aim of this work is to consider how to do penalized regression in the presence...... implementable using existing high-performance software to do penalized regression. Results from simulation studies are presented together with an application to genomic data when the endpoint is progression-free survival. An R function is provided to perform regularized competing risks regression according...
Hoeflinger, Jennifer L; Hoeflinger, Daniel E; Miller, Michael J
2017-01-01
Herein, an open-source method to generate quantitative bacterial growth data from high-throughput microplate assays is described. The bacterial lag time, maximum specific growth rate, doubling time and delta OD are reported. Our method was validated by carbohydrate utilization of lactobacilli, and visual inspection revealed 94% of regressions were deemed excellent. Copyright © 2016 Elsevier B.V. All rights reserved.
Declining Bias and Gender Wage Discrimination? A Meta-Regression Analysis
Jarrell, Stephen B.; Stanley, T. D.
2004-01-01
The meta-regression analysis reveals that there is a strong tendency for discrimination estimates to fall and wage discrimination exist against the woman. The biasing effect of researchers' gender of not correcting for selection bias has weakened and changes in labor market have made it less important.
International Nuclear Information System (INIS)
Che Jinxing; Wang Jianzhou
2010-01-01
In this paper, we present the use of different mathematical models to forecast electricity price under deregulated power. A successful prediction tool of electricity price can help both power producers and consumers plan their bidding strategies. Inspired by that the support vector regression (SVR) model, with the ε-insensitive loss function, admits of the residual within the boundary values of ε-tube, we propose a hybrid model that combines both SVR and Auto-regressive integrated moving average (ARIMA) models to take advantage of the unique strength of SVR and ARIMA models in nonlinear and linear modeling, which is called SVRARIMA. A nonlinear analysis of the time-series indicates the convenience of nonlinear modeling, the SVR is applied to capture the nonlinear patterns. ARIMA models have been successfully applied in solving the residuals regression estimation problems. The experimental results demonstrate that the model proposed outperforms the existing neural-network approaches, the traditional ARIMA models and other hybrid models based on the root mean square error and mean absolute percentage error.
Sirenomelia and severe caudal regression syndrome.
Seidahmed, Mohammed Z; Abdelbasit, Omer B; Alhussein, Khalid A; Miqdad, Abeer M; Khalil, Mohammed I; Salih, Mustafa A
2014-12-01
To describe cases of sirenomelia and severe caudal regression syndrome (CRS), to report the prevalence of sirenomelia, and compare our findings with the literature. Retrospective data was retrieved from the medical records of infants with the diagnosis of sirenomelia and CRS and their mothers from 1989 to 2010 (22 years) at the Security Forces Hospital, Riyadh, Saudi Arabia. A perinatologist, neonatologist, pediatric neurologist, and radiologist ascertained the diagnoses. The cases were identified as part of a study of neural tube defects during that period. A literature search was conducted using MEDLINE. During the 22-year study period, the total number of deliveries was 124,933 out of whom, 4 patients with sirenomelia, and 2 patients with severe forms of CRS were identified. All the patients with sirenomelia had single umbilical artery, and none were the infant of a diabetic mother. One patient was a twin, and another was one of triplets. The 2 patients with CRS were sisters, their mother suffered from type II diabetes mellitus and morbid obesity on insulin, and neither of them had a single umbilical artery. Other associated anomalies with sirenomelia included an absent radius, thumb, and index finger in one patient, Potter's syndrome, abnormal ribs, microphthalmia, congenital heart disease, hypoplastic lungs, and diaphragmatic hernia. The prevalence of sirenomelia (3.2 per 100,000) is high compared with the international prevalence of one per 100,000. Both cases of CRS were infants of type II diabetic mother with poor control, supporting the strong correlation of CRS and maternal diabetes.
The intermediate endpoint effect in logistic and probit regression
MacKinnon, DP; Lockwood, CM; Brown, CH; Wang, W; Hoffman, JM
2010-01-01
Background An intermediate endpoint is hypothesized to be in the middle of the causal sequence relating an independent variable to a dependent variable. The intermediate variable is also called a surrogate or mediating variable and the corresponding effect is called the mediated, surrogate endpoint, or intermediate endpoint effect. Clinical studies are often designed to change an intermediate or surrogate endpoint and through this intermediate change influence the ultimate endpoint. In many intermediate endpoint clinical studies the dependent variable is binary, and logistic or probit regression is used. Purpose The purpose of this study is to describe a limitation of a widely used approach to assessing intermediate endpoint effects and to propose an alternative method, based on products of coefficients, that yields more accurate results. Methods The intermediate endpoint model for a binary outcome is described for a true binary outcome and for a dichotomization of a latent continuous outcome. Plots of true values and a simulation study are used to evaluate the different methods. Results Distorted estimates of the intermediate endpoint effect and incorrect conclusions can result from the application of widely used methods to assess the intermediate endpoint effect. The same problem occurs for the proportion of an effect explained by an intermediate endpoint, which has been suggested as a useful measure for identifying intermediate endpoints. A solution to this problem is given based on the relationship between latent variable modeling and logistic or probit regression. Limitations More complicated intermediate variable models are not addressed in the study, although the methods described in the article can be extended to these more complicated models. Conclusions Researchers are encouraged to use an intermediate endpoint method based on the product of regression coefficients. A common method based on difference in coefficient methods can lead to distorted
Virtual machine consolidation enhancement using hybrid regression algorithms
Directory of Open Access Journals (Sweden)
Amany Abdelsamea
2017-11-01
Full Text Available Cloud computing data centers are growing rapidly in both number and capacity to meet the increasing demands for highly-responsive computing and massive storage. Such data centers consume enormous amounts of electrical energy resulting in high operating costs and carbon dioxide emissions. The reason for this extremely high energy consumption is not just the quantity of computing resources and the power inefficiency of hardware, but rather lies in the inefficient usage of these resources. VM consolidation involves live migration of VMs hence the capability of transferring a VM between physical servers with a close to zero down time. It is an effective way to improve the utilization of resources and increase energy efficiency in cloud data centers. VM consolidation consists of host overload/underload detection, VM selection and VM placement. Most of the current VM consolidation approaches apply either heuristic-based techniques, such as static utilization thresholds, decision-making based on statistical analysis of historical data; or simply periodic adaptation of the VM allocation. Most of those algorithms rely on CPU utilization only for host overload detection. In this paper we propose using hybrid factors to enhance VM consolidation. Specifically we developed a multiple regression algorithm that uses CPU utilization, memory utilization and bandwidth utilization for host overload detection. The proposed algorithm, Multiple Regression Host Overload Detection (MRHOD, significantly reduces energy consumption while ensuring a high level of adherence to Service Level Agreements (SLA since it gives a real indication of host utilization based on three parameters (CPU, Memory, Bandwidth utilizations instead of one parameter only (CPU utilization. Through simulations we show that our approach reduces power consumption by 6 times compared to single factor algorithms using random workload. Also using PlanetLab workload traces we show that MRHOD improves
Simultaneous Estimation of Regression Functions for Marine Corps Technical Training Specialties.
Dunbar, Stephen B.; And Others
This paper considers the application of Bayesian techniques for simultaneous estimation to the specification of regression weights for selection tests used in various technical training courses in the Marine Corps. Results of a method for m-group regression developed by Molenaar and Lewis (1979) suggest that common weights for training courses…
Age Regression in the Treatment of Anger in a Prison Setting.
Eisel, Harry E.
1988-01-01
Incorporated hypnotherapy with age regression into cognitive therapeutic approach with prisoners having history of anger. Technique involved age regression to establish first significant event causing current anger, catharsis of feelings for original event, and reorientation of event while under hypnosis. Results indicated decrease in acting-out…
Interpret with caution: multicollinearity in multiple regression of cognitive data.
Morrison, Catriona M
2003-08-01
Shibihara and Kondo in 2002 reported a reanalysis of the 1997 Kanji picture-naming data of Yamazaki, Ellis, Morrison, and Lambon-Ralph in which independent variables were highly correlated. Their addition of the variable visual familiarity altered the previously reported pattern of results, indicating that visual familiarity, but not age of acquisition, was important in predicting Kanji naming speed. The present paper argues that caution should be taken when drawing conclusions from multiple regression analyses in which the independent variables are so highly correlated, as such multicollinearity can lead to unreliable output.
Solving Dynamic Traveling Salesman Problem Using Dynamic Gaussian Process Regression
Directory of Open Access Journals (Sweden)
Stephen M. Akandwanaho
2014-01-01
Full Text Available This paper solves the dynamic traveling salesman problem (DTSP using dynamic Gaussian Process Regression (DGPR method. The problem of varying correlation tour is alleviated by the nonstationary covariance function interleaved with DGPR to generate a predictive distribution for DTSP tour. This approach is conjoined with Nearest Neighbor (NN method and the iterated local search to track dynamic optima. Experimental results were obtained on DTSP instances. The comparisons were performed with Genetic Algorithm and Simulated Annealing. The proposed approach demonstrates superiority in finding good traveling salesman problem (TSP tour and less computational time in nonstationary conditions.
Detecting nonsense for Chinese comments based on logistic regression
Zhuolin, Ren; Guang, Chen; Shu, Chen
2016-07-01
To understand cyber citizens' opinion accurately from Chinese news comments, the clear definition on nonsense is present, and a detection model based on logistic regression (LR) is proposed. The detection of nonsense can be treated as a binary-classification problem. Besides of traditional lexical features, we propose three kinds of features in terms of emotion, structure and relevance. By these features, we train an LR model and demonstrate its effect in understanding Chinese news comments. We find that each of proposed features can significantly promote the result. In our experiments, we achieve a prediction accuracy of 84.3% which improves the baseline 77.3% by 7%.
MENENTUKAN PROBABILITAS QUALITAS LULUSAN PROGRAM STUDI MENGGUNAKAN LOGISTIC REGRESSION
Directory of Open Access Journals (Sweden)
Maxsi Ary
2016-03-01
Full Text Available Abstract – Human resources (HR is one of the success factors in the economic field, namely how to create a human resources (HR qualified and have the skills and highly competitive in the global competition. Educational level of the labor force that is still relatively low. The structure of education of the workforce is still dominated Indonesian basic education which is about 63.2%. The issue raised is to determine the probability of a program of study (whether or not to see some of the ratio of the number of graduates by the number of students per class, the amount of quota size class (large or small using logistic regression models. Data were obtained from a search result based on the amount of data the study program students and graduates in 2010 Data processing using SPSS. The results of the analysis by assessing model fit and the results will be given for each model fit. Starting with the hypothesis for assessing model fit, statistical -2LogL, Cox and Snell's R Square, Hosmer and Lemeshow's Goodness of Fit Test, and the classification table. The results of the analysis using SPSS as a tool aimed at measuring quality of graduate courses at a university, college, or academy, whether or not based on the ratio of the number of graduates and class quotas. Keywords: Quota Class, Probability, Logistic Regression Abstrak – Sumberdaya manusia (SDM adalah salah satu faktor kesuksesan dalam bidang ekonomi, yaitu bagaimana menciptakan sumber daya manusia (SDM yang berkualitas dan memiliki keterampilan serta berdaya saing tinggi dalam persaingan global. Tingkat pendidikan angkatan kerja yang ada masih relatif rendah. Struktur pendidikan angkatan kerja Indonesia masih didominasi pendidikan dasar yaitu sekitar 63,2%. Persoalan yang dikemukakan adalah menentukan probabilitas sebuah program studi (baik atau tidak dengan melihat beberapa rasio jumlah lulusan dengan jumlah mahasiswa per angkatan, ukuran besarnya kuota kelas (besar atau kecil menggunakan