Directory of Open Access Journals (Sweden)
MILAD TAZIK
2017-11-01
Identifying cases in which road crashes result in fatality or injury of drivers may help improve their safety. In this study, datasets of crashes that occurred on the Tehran-Qom freeway, Iran, were examined using three models (multiple logistic regression, Bayesian logistic, and classification tree) to analyse the contribution of several variables to fatal accidents. For the multiple logistic regression and Bayesian logistic models, the odds ratio was calculated for each variable. The model best suited to identifying accident severity was determined based on the AIC and DIC criteria. Based on the results of these two models, rollover crashes (OR = 14.58, 95% CI: 6.8-28.6), not using a seat belt (OR = 5.79, 95% CI: 3.1-9.9), exceeding speed limits (OR = 4.02, 95% CI: 1.8-7.9), and being female (OR = 2.91, 95% CI: 1.1-6.1) were the most important factors in driver fatalities. In addition, the results of the classification tree model verified the findings of the other models.
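As a rough illustration of how an odds ratio and its confidence interval are usually obtained from a fitted logistic regression coefficient, the sketch below exponentiates the coefficient and the Wald interval around it. The function name `odds_ratio_ci` and the coefficient and standard error values are hypothetical and are not taken from the study.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    beta is the log-odds coefficient, se its standard error, and z the
    normal quantile for the desired coverage (1.96 for roughly 95%).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, chosen only for illustration.
or_, lo, hi = odds_ratio_ci(beta=1.2, se=0.3)
print(f"OR = {or_:.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1 corresponds to a coefficient that differs from zero at about the 5% level under the Wald approximation.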
Polynomial regression analysis and significance test of the regression function
International Nuclear Information System (INIS)
Gao Zhengming; Zhao Juan; He Shengping
2012-01-01
To analyze the decay heating power per kilogram of a certain radioactive isotope with the polynomial regression method, the paper first demonstrates the broad usage of the polynomial function and derives its parameters with the ordinary least squares estimate. A significance test for the polynomial regression function is then derived by considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and a significance test of the polynomial function are applied to the decay heating power of the isotope per kilogram, in accord with the authors' real work. (authors)
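The link between polynomial regression and multivariable linear regression can be sketched as follows: the powers of x are treated as separate predictors, the coefficients are estimated by ordinary least squares, and an overall F statistic is formed from the regression and residual sums of squares. This is a minimal sketch on synthetic data, not the authors' decay-heat data; the function name `poly_fit_f_test` is an assumption.

```python
import numpy as np

def poly_fit_f_test(x, y, degree):
    """Fit y = b0 + b1*x + ... + bd*x^d by OLS; return (coeffs, F statistic)."""
    X = np.vander(x, degree + 1, increasing=True)  # columns 1, x, ..., x^d
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    ss_res = float(np.sum((y - fitted) ** 2))        # residual sum of squares
    ss_reg = float(np.sum((fitted - y.mean()) ** 2))  # regression sum of squares
    n, p = len(y), degree
    f_stat = (ss_reg / p) / (ss_res / (n - p - 1))
    return beta, f_stat

# Synthetic data (an assumption): a quadratic trend plus small Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 40)
y = 2.0 - 1.5 * x + 0.4 * x**2 + rng.normal(scale=0.1, size=x.size)

beta, f_stat = poly_fit_f_test(x, y, degree=2)
print("coefficients:", np.round(beta, 3))
print("F statistic:", round(f_stat, 1))
```

The F statistic would be compared with the critical value of the F distribution with (degree, n - degree - 1) degrees of freedom.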
Significance testing in ridge regression for genetic data
Directory of Open Access Journals (Sweden)
De Iorio Maria
2011-09-01
Background: Technological developments have increased the feasibility of large scale genetic association studies. Densely typed genetic markers are obtained using SNP arrays, next-generation sequencing technologies and imputation. However, SNPs typed using these methods can be highly correlated due to linkage disequilibrium among them, and standard multiple regression techniques fail with these data sets due to their high dimensionality and correlation structure. There has been increasing interest in using penalised regression in the analysis of high dimensional data. Ridge regression is one such penalised regression technique which does not perform variable selection, instead estimating a regression coefficient for each predictor variable. It is therefore desirable to obtain an estimate of the significance of each ridge regression coefficient. Results: We develop and evaluate a test of significance for ridge regression coefficients. Using simulation studies, we demonstrate that the performance of the test is comparable to that of a permutation test, with the advantage of a much-reduced computational cost. We introduce the p-value trace, a plot of the negative logarithm of the p-values of ridge regression coefficients with increasing shrinkage parameter, which enables the visualisation of the change in p-value of the regression coefficients with increasing penalisation. We apply the proposed method to a lung cancer case-control data set from EPIC, the European Prospective Investigation into Cancer and Nutrition. Conclusions: The proposed test is a useful alternative to a permutation test for the estimation of the significance of ridge regression coefficients, at a much-reduced computational cost. The p-value trace is an informative graphical tool for evaluating the results of a test of significance of ridge regression coefficients as the shrinkage parameter increases, and the proposed test makes its production computationally feasible.
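The idea of a p-value trace can be sketched with a simple Wald-type test on the closed-form ridge estimator. This is not necessarily the test proposed in the paper: the residual variance estimate (using the trace of the hat matrix as effective degrees of freedom) and the normal reference distribution are simplifying assumptions, and the function name and synthetic data are hypothetical.

```python
import math
import numpy as np

def ridge_wald_pvalues(X, y, k):
    """Ridge coefficients and approximate two-sided p-values for shrinkage k."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise predictors
    yc = y - y.mean()                          # centre the response
    n, p = Xs.shape
    gram = Xs.T @ Xs
    W = np.linalg.inv(gram + k * np.eye(p))
    beta = W @ Xs.T @ yc                       # closed-form ridge estimate
    H = Xs @ W @ Xs.T                          # hat matrix
    resid = yc - Xs @ beta
    sigma2 = float(resid @ resid) / (n - np.trace(H))  # assumed variance estimate
    cov = sigma2 * W @ gram @ W                # covariance of the ridge estimator
    t = beta / np.sqrt(np.diag(cov))
    pvals = np.array([math.erfc(abs(ti) / math.sqrt(2.0)) for ti in t])
    return beta, pvals

# Synthetic data (an assumption): x2 is correlated with x1, only x1 affects y.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.5 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 + rng.normal(size=n)

# A numeric p-value trace: -log10(p) for each coefficient as k grows.
for k in (0.1, 1.0, 10.0, 100.0):
    beta, pvals = ridge_wald_pvalues(X, y, k)
    print(k, np.round(-np.log10(np.maximum(pvals, 1e-300)), 2))
```

Plotting the printed rows against k would give the graphical trace described in the abstract.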
Identifying Interacting Genetic Variations by Fish-Swarm Logic Regression
Yang, Aiyuan; Yan, Chunxia; Zhu, Feng; Zhao, Zhongmeng; Cao, Zhi
2013-01-01
Understanding associations between genotypes and complex traits is a fundamental problem in human genetics. A major open problem in mapping phenotypes is that of identifying a set of interacting genetic variants, which might contribute to complex traits. Logic regression (LR) is a powerful multivariant association tool. Several LR-based approaches have been successfully applied to different datasets. However, these approaches are not adequate with regard to accuracy and efficiency. In this paper, we propose a new LR-based approach, called fish-swarm logic regression (FSLR), which improves the logic regression process by incorporating swarm optimization. In our approach, a school of fish agents are conducted in parallel. Each fish agent holds a regression model, while the school searches for better models through various preset behaviors. A swarm algorithm improves the accuracy and the efficiency by speeding up the convergence and preventing it from dropping into local optimums. We apply our approach on a real screening dataset and a series of simulation scenarios. Compared to three existing LR-based approaches, our approach outperforms them by having lower type I and type II error rates, being able to identify more preset causal sites, and performing at faster speeds. PMID:23984382
Identifying Interacting Genetic Variations by Fish-Swarm Logic Regression
Directory of Open Access Journals (Sweden)
Xuanping Zhang
2013-01-01
Understanding associations between genotypes and complex traits is a fundamental problem in human genetics. A major open problem in mapping phenotypes is that of identifying a set of interacting genetic variants, which might contribute to complex traits. Logic regression (LR) is a powerful multivariant association tool. Several LR-based approaches have been successfully applied to different datasets. However, these approaches are not adequate with regard to accuracy and efficiency. In this paper, we propose a new LR-based approach, called fish-swarm logic regression (FSLR), which improves the logic regression process by incorporating swarm optimization. In our approach, a school of fish agents are conducted in parallel. Each fish agent holds a regression model, while the school searches for better models through various preset behaviors. A swarm algorithm improves the accuracy and the efficiency by speeding up the convergence and preventing it from dropping into local optimums. We apply our approach on a real screening dataset and a series of simulation scenarios. Compared to three existing LR-based approaches, our approach outperforms them by having lower type I and type II error rates, being able to identify more preset causal sites, and performing at faster speeds.
Identifying significant environmental features using feature recognition.
2015-10-01
The Department of Environmental Analysis at the Kentucky Transportation Cabinet has expressed an interest in feature-recognition capability because it may help analysts identify environmentally sensitive features in the landscape, including those r...
Efficient logistic regression designs under an imperfect population identifier.
Albert, Paul S; Liu, Aiyi; Nansel, Tonja
2014-03-01
Motivated by actual study designs, this article considers efficient logistic regression designs where the population is identified with a binary test that is subject to diagnostic error. We consider the case where the imperfect test is obtained on all participants, while the gold standard test is measured on a small chosen subsample. Under maximum-likelihood estimation, we evaluate the optimal design in terms of sample selection as well as verification. We show that there may be substantial efficiency gains by choosing a small percentage of individuals who test negative on the imperfect test for inclusion in the sample (e.g., verifying 90% test-positive cases). We also show that a two-stage design may be a good practical alternative to a fixed design in some situations. Under optimal and nearly optimal designs, we compare maximum-likelihood and semi-parametric efficient estimators under correct and misspecified models with simulations. The methodology is illustrated with an analysis from a diabetes behavioral intervention trial. © 2013, The International Biometric Society.
Identifying predictors of physics item difficulty: A linear regression approach
Mesic, Vanes; Muratovic, Hasnija
2011-06-01
Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal physics knowledge
Identifying predictors of physics item difficulty: A linear regression approach
Directory of Open Access Journals (Sweden)
Hasnija Muratovic
2011-06-01
Large-scale assessments of student achievement in physics are often approached with an intention to discriminate students based on the attained level of their physics competencies. Therefore, for purposes of test design, it is important that items display an acceptable discriminatory behavior. To that end, it is recommended to avoid extraordinarily difficult and very easy items. Knowing the factors that influence physics item difficulty makes it possible to model the item difficulty even before the first pilot study is conducted. Thus, by identifying predictors of physics item difficulty, we can improve the test-design process. Furthermore, we get additional qualitative feedback regarding the basic aspects of student cognitive achievement in physics that are directly responsible for the obtained, quantitative test results. In this study, we conducted a secondary analysis of data that came from two large-scale assessments of student physics achievement at the end of compulsory education in Bosnia and Herzegovina. Foremost, we explored the concept of “physics competence” and performed a content analysis of 123 physics items that were included within the above-mentioned assessments. Thereafter, an item database was created. Items were described by variables which reflect some basic cognitive aspects of physics competence. For each of the assessments, Rasch item difficulties were calculated in separate analyses. In order to make the item difficulties from different assessments comparable, a virtual test equating procedure had to be implemented. Finally, a regression model of physics item difficulty was created. It has been shown that 61.2% of item difficulty variance can be explained by factors which reflect the automaticity, complexity, and modality of the knowledge structure that is relevant for generating the most probable correct solution, as well as by the divergence of required thinking and interference effects between intuitive and formal
Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula
2011-01-01
Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets comes the inherent challenges of new methods of statistical analysis and modeling. Considering a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.
Feng, Yongjiu; Tong, Xiaohua
2017-09-22
Defining transition rules is an important issue in cellular automaton (CA)-based land use modeling because these models incorporate highly correlated driving factors. Multicollinearity among correlated driving factors may produce negative effects that must be eliminated from the modeling. Using exploratory regression under pre-defined criteria, we identified all possible combinations of factors from the candidate factors affecting land use change. Three combinations that incorporate five driving factors meeting pre-defined criteria were assessed. With the selected combinations of factors, three logistic regression-based CA models were built to simulate dynamic land use change in Shanghai, China, from 2000 to 2015. For comparative purposes, a CA model with all candidate factors was also applied to simulate the land use change. Simulations using three CA models with multicollinearity eliminated performed better (with accuracy improvements about 3.6%) than the model incorporating all candidate factors. Our results showed that not all candidate factors are necessary for accurate CA modeling and the simulations were not sensitive to changes in statistically non-significant driving factors. We conclude that exploratory regression is an effective method to search for the optimal combinations of driving factors, leading to better land use change models that are devoid of multicollinearity. We suggest identification of dominant factors and elimination of multicollinearity before building land change models, making it possible to simulate more realistic outcomes.
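One common way to screen candidate driving factors for multicollinearity is the variance inflation factor (VIF). Whether VIF is among the pre-defined criteria used in the study is an assumption; the sketch below only illustrates the general screening step on synthetic factors, and the function name is hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on the other columns."""
    n, p = X.shape
    vifs = []
    for j in range(p):
        # Design matrix: intercept plus every predictor except column j.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ coef
        r2 = 1.0 - resid.var() / X[:, j].var()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Synthetic factors (an assumption): b nearly duplicates a, c is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=300)
b = a + 0.1 * rng.normal(size=300)
c = rng.normal(size=300)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
print(np.round(vifs, 1))
```

Factors with a large VIF (a threshold such as 5 or 10 is a frequent rule of thumb) would be candidates for removal before fitting the logistic regression that drives the CA transition rules.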
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
Significance tests to determine the direction of effects in linear regression models.
Wiedermann, Wolfgang; Hagmann, Michael; von Eye, Alexander
2015-02-01
Previous studies have discussed asymmetric interpretations of the Pearson correlation coefficient and have shown that higher moments can be used to decide on the direction of dependence in the bivariate linear regression setting. The current study extends this approach by illustrating that the third moment of regression residuals may also be used to derive conclusions concerning the direction of effects. Assuming non-normally distributed variables, it is shown that the distribution of residuals of the correctly specified regression model (e.g., Y is regressed on X) is more symmetric than the distribution of residuals of the competing model (i.e., X is regressed on Y). Based on this result, 4 one-sample tests are discussed which can be used to decide which variable is more likely to be the response and which one is more likely to be the explanatory variable. A fifth significance test is proposed based on the differences of skewness estimates, which leads to a more direct test of a hypothesis that is compatible with direction of dependence. A Monte Carlo simulation study was performed to examine the behaviour of the procedures under various degrees of associations, sample sizes, and distributional properties of the underlying population. An empirical example is given which illustrates the application of the tests in practice. © 2014 The British Psychological Society.
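The core observation, that residuals of the correctly specified model are more symmetric than those of the reversed model when the predictor is non-normal, can be illustrated numerically. The sketch below only compares sample skewness of the two residual vectors on simulated data; it does not implement any of the significance tests discussed in the abstract, and the data-generating choices are assumptions.

```python
import numpy as np

def skewness(v):
    """Sample skewness: third central moment over variance to the power 1.5."""
    d = v - v.mean()
    return float(np.mean(d**3) / np.mean(d**2) ** 1.5)

def residuals(response, predictor):
    """Residuals from a simple linear regression of response on predictor."""
    slope = np.cov(predictor, response, bias=True)[0, 1] / predictor.var()
    intercept = response.mean() - slope * predictor.mean()
    return response - (intercept + slope * predictor)

# Simulated data (an assumption): skewed X causes Y, with normal errors.
rng = np.random.default_rng(1)
x = rng.exponential(size=5000)
y = 1.0 * x + rng.normal(size=5000)

skew_yx = skewness(residuals(y, x))  # correctly specified: Y regressed on X
skew_xy = skewness(residuals(x, y))  # competing model: X regressed on Y
print(f"skewness of residuals, Y on X: {skew_yx:.3f}")
print(f"skewness of residuals, X on Y: {skew_xy:.3f}")
```

Under this setup the residuals of Y on X are close to symmetric, while the residuals of X on Y inherit part of the skewness of X.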
International Nuclear Information System (INIS)
Zhang, Ning; Liu, Dong-Sheng; Chen, Yong; Liang, Shao-Bo; Deng, Yan-Ming; Lu, Rui-Liang; Chen, Hai-Yang; Zhao, Hai; Lv, Zhi-Qian; Liang, Shao-Qiang; Yang, Lin
2014-01-01
To observe the primary tumor (PT) regression speed after radiotherapy (RT) in nasopharyngeal carcinoma (NPC) and evaluate its prognostic significance. One hundred and eighty-eight consecutive newly diagnosed NPC patients were reviewed retrospectively. All patients underwent magnetic resonance imaging and fiberscope examination of the nasopharynx before RT, during RT when the accumulated dose was 46–50 Gy, at the end of RT, and 3–4 months after RT. Of 188 patients, 40.4% had complete response of PT (CRPT), 44.7% had partial response of PT (PRPT), and 14.9% had stable disease of PT (SDPT) at the end of RT. The 5-year overall survival (OS) rates for patients with CRPT, PRPT, and SDPT at the end of RT were 84.0%, 70.7%, and 44.3%, respectively (P < 0.001, hazard ratio [HR] = 2.177, 95% confidence interval [CI] = 1.480-3.202). The 5-year failure-free survival (FFS) and distant metastasis-free survival (DMFS) rates also differed significantly (87.8% vs. 74.3% vs. 52.7%, P = 0.001, HR = 2.148, 95% CI, 1.384-3.333; 91.7% vs. 84.7% vs. 66.1%, P = 0.004, HR = 2.252, 95% CI = 1.296-3.912). The 5-year local relapse–free survival (LRFS) rates were not significantly different (95.8% vs. 86.0% vs. 81.8%, P = 0.137, HR = 1.975, 95% CI, 0.976-3.995). By multivariate analyses, the PT regression speed at the end of RT was the only independent prognostic factor of OS, FFS, and DMFS (P < 0.001, P = 0.001, and P = 0.004, respectively). The 5-year FFS rates for patients with CRPT during RT and CRPT only at the end of RT were 80.2% and 97.1%, respectively (P = 0.033). For patients with persistent PT at the end of RT, the 5-year LRFS rates of patients without and with boost irradiation were 87.1% and 84.6%, respectively (P = 0.812). PT regression speed at the end of RT was an independent prognostic factor of OS, FFS, and DMFS in NPC patients. Immediate strengthening treatment may be provided to patients with poor tumor regression at the end of RT
Directory of Open Access Journals (Sweden)
Dayeon Shin
2018-01-01
Diet plays a crucial role in cognitive function. Few studies have examined the relationship between dietary patterns and cognitive functions of older adults in the Korean population. This study aimed to identify the effect of dietary patterns on the risk of mild cognitive impairment. A total of 239 participants, including 88 men and 151 women, aged 65 years and older were selected from health centers in the district of Seoul, Gyeonggi province, and Incheon, in Korea. Dietary patterns were determined using Reduced Rank Regression (RRR) methods with responses regarding vitamin B6, vitamin C, and iron intakes, based on both a one-day 24-h recall and a food frequency questionnaire. Cognitive function was assessed using the Korean-Mini Mental State Examination (K-MMSE). Multivariable logistic regression models were used to estimate the association between dietary pattern score and the risk of mild cognitive impairment. A total of 20 (8%) out of the 239 participants had mild cognitive impairment. Three dietary patterns were identified: seafood and vegetables, high meat, and bread, ham, and alcohol. Among the three dietary patterns, the older adult population who adhered to the seafood and vegetables pattern, characterized by high intake of seafood, vegetables, fruits, bread, snacks, soy products, beans, chicken, pork, ham, egg, and milk, had a decreased risk of mild cognitive impairment compared to those who did not (adjusted odds ratio 0.06, 95% confidence interval 0.01–0.72) after controlling for gender, supplementation, education, history of dementia, physical activity, body mass index (BMI), and duration of sleep. The other two dietary patterns were not significantly associated with the risk of mild cognitive impairment. In conclusion, high consumption of fruits, vegetables, seafood, and protein foods was significantly associated with reduced mild cognitive impairment in older Korean adults. These results can contribute to the establishment of
Directory of Open Access Journals (Sweden)
Nikita A. Moiseev
2017-01-01
The paper is devoted to a new randomization method that yields unbiased adjustments of p-values for predictors in linear regression models by incorporating the number of potential explanatory variables, their variance-covariance matrix and its uncertainty, based on the number of observations. This adjustment helps to control type I errors in scientific studies, significantly decreasing the number of publications that report false relations to be authentic ones. Comparative analysis with such existing methods as the Bonferroni correction and the Shehata and White adjustments explicitly shows their imperfections, especially when the number of observations and the number of potential explanatory variables are approximately equal. The comparative analysis also showed that when the variance-covariance matrix of a set of potential predictors is diagonal, i.e. the data are independent, the proposed simple correction is the best and easiest way to implement the method to obtain unbiased corrections of traditional p-values. However, in the presence of strongly correlated data, a simple correction overestimates the true p-values, which can lead to type II errors. It was also found that the corrected p-values depend on the number of observations, the number of potential explanatory variables and the sample variance-covariance matrix. For example, if there are only two potential explanatory variables competing for one position in the regression model, then if they are weakly correlated, the corrected p-value will be lower than when the number of observations is smaller and vice versa; if the data are highly correlated, the case with a larger number of observations will show a lower corrected p-value. With increasing correlation, all corrections, regardless of the number of observations, tend to the original p-value. This phenomenon is easy to explain: as correlation coefficient tends to one, two variables almost linearly depend on each
Observed to expected or logistic regression to identify hospitals with high or low 30-day mortality?
Helgeland, Jon; Clench-Aas, Jocelyne; Laake, Petter; Veierød, Marit B.
2018-01-01
Introduction: A common quality indicator for monitoring and comparing hospitals is based on death within 30 days of admission. An important use is to determine whether a hospital has higher or lower mortality than other hospitals. Thus, the ability to identify such outliers correctly is essential. Two approaches for detection are: 1) calculating the ratio of observed to expected number of deaths (OE) per hospital and 2) including all hospitals in a logistic regression (LR) comparing each hospital to a form of average over all hospitals. The aim of this study was to compare OE and LR with respect to correctly identifying 30-day mortality outliers. Modifications of the methods, i.e., variance corrected approach of OE (OE-Faris), bias corrected LR (LR-Firth), and trimmed mean variants of LR and LR-Firth were also studied. Materials and methods: To study the properties of OE and LR and their variants, we performed a simulation study by generating patient data from hospitals with known outlier status (low mortality, high mortality, non-outlier). Data from simulated scenarios with varying number of hospitals, hospital volume, and mortality outlier status, were analysed by the different methods and compared by level of significance (ability to falsely claim an outlier) and power (ability to reveal an outlier). Moreover, administrative data for patients with acute myocardial infarction (AMI), stroke, and hip fracture from Norwegian hospitals for 2012–2014 were analysed. Results: None of the methods achieved the nominal (test) level of significance for both low and high mortality outliers. For low mortality outliers, the levels of significance were increased four- to fivefold for OE and OE-Faris. For high mortality outliers, OE and OE-Faris, LR 25% trimmed and LR-Firth 10% and 25% trimmed maintained approximately the nominal level. The methods agreed with respect to outlier status for 94.1% of the AMI hospitals, 98.0% of the stroke, and 97.8% of the hip fracture hospitals
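The observed-to-expected (OE) approach can be sketched in a few lines: for each hospital, the observed number of deaths is divided by the sum of the patients' predicted death probabilities. The records and predicted risks below are made up for illustration, and the sketch omits the risk model that would produce them as well as any significance testing or the OE-Faris variance correction.

```python
from collections import defaultdict

def observed_expected(records):
    """Return {hospital: O/E ratio} from (hospital, died, predicted_risk) rows."""
    observed = defaultdict(int)
    expected = defaultdict(float)
    for hospital, died, risk in records:
        observed[hospital] += died   # died is 1 for death within 30 days, else 0
        expected[hospital] += risk   # predicted probability from a risk model
    return {h: observed[h] / expected[h] for h in observed}

# Hypothetical patient records; risks stand in for case-mix adjusted predictions.
records = [
    ("A", 1, 0.25), ("A", 1, 0.25), ("A", 1, 0.5), ("A", 0, 0.5),
    ("B", 0, 0.5), ("B", 1, 0.25), ("B", 0, 0.25),
]
ratios = observed_expected(records)
for hospital, ratio in sorted(ratios.items()):
    print(hospital, round(ratio, 2))
```

A ratio above 1 indicates more deaths than the case mix predicts and a ratio below 1 fewer; deciding whether a hospital is an outlier additionally requires a test against the sampling variability of the ratio.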
Directory of Open Access Journals (Sweden)
Anke Hüls
2017-05-01
Antimicrobial resistance in livestock is a matter of general concern. To develop hygiene measures and methods for resistance prevention and control, epidemiological studies on a population level are needed to detect factors associated with antimicrobial resistance in livestock holdings. In general, regression models are used to describe these relationships between environmental factors and resistance outcome. Besides the study design, the correlation structures of the different outcomes of antibiotic resistance and structural zero measurements on the resistance outcome as well as on the exposure side are challenges for the epidemiological model building process. The use of appropriate regression models that acknowledge these complexities is essential to assure valid epidemiological interpretations. The aims of this paper are (i) to explain the model building process comparing several competing models for count data (negative binomial model, quasi-Poisson model, zero-inflated model, and hurdle model) and (ii) to compare these models using data from a cross-sectional study on antibiotic resistance in animal husbandry. These goals are essential to evaluate which model is most suitable to identify potential prevention measures. The dataset used as an example in our analyses was generated initially to study the prevalence and associated factors for the appearance of cefotaxime-resistant Escherichia coli in 48 German fattening pig farms. For each farm, the outcome was the count of samples with resistant bacteria. There was almost no overdispersion and only moderate evidence of excess zeros in the data. Our analyses show that it is essential to evaluate regression models in studies analyzing the relationship between environmental factors and antibiotic resistances in livestock. After model comparison based on evaluation of model predictions, Akaike information criterion, and Pearson residuals, here the hurdle model was judged to be the most appropriate
Šarić, Željko; Xu, Xuecai; Duan, Li; Babić, Darko
2018-06-20
This study intended to investigate the interactions between accident rate and traffic signs on state roads located in Croatia, and to accommodate the heterogeneity attributed to unobserved factors. The data from 130 state roads between 2012 and 2016 were collected from the Traffic Accident Database System maintained by the Republic of Croatia Ministry of the Interior. To address the heterogeneity, a panel quantile regression model was proposed, in which the quantile regression model offers a more complete view and a highly comprehensive analysis of the relationship between accident rate and traffic signs, while the panel data model accommodates the heterogeneity attributed to unobserved factors. Results revealed that (1) low visibility of material damage (MD) and death or injured (DI) increased the accident rate; (2) the number of mandatory signs and the number of warning signs were more likely to reduce the accident rate; (3) average speed limit and the number of invalid traffic signs per km exhibited a high accident rate. To our knowledge, this is the first attempt to analyze the interactions between accident consequences and traffic signs by employing a panel quantile regression model. By involving visibility, the present study demonstrates that low visibility causes a relatively higher risk of MD and DI. It is noteworthy that average speed limit corresponds positively with accident rate. The number of mandatory signs and the number of warning signs are more likely to reduce the accident rate. The number of invalid traffic signs per km is significant for accident rate, so regular maintenance should be kept up for a safer roadway environment.
Liu, Shelley H; Bobb, Jennifer F; Lee, Kyu Ha; Gennings, Chris; Claus Henn, Birgit; Bellinger, David; Austin, Christine; Schnaas, Lourdes; Tellez-Rojo, Martha M; Hu, Howard; Wright, Robert O; Arora, Manish; Coull, Brent A
2018-07-01
The impact of neurotoxic chemical mixtures on children's health is a critical public health concern. It is well known that during early life, toxic exposures may impact cognitive function during critical time intervals of increased vulnerability, known as windows of susceptibility. Knowledge on time windows of susceptibility can help inform treatment and prevention strategies, as chemical mixtures may affect a developmental process that is operating at a specific life phase. There are several statistical challenges in estimating the health effects of time-varying exposures to multi-pollutant mixtures, such as: multi-collinearity among the exposures both within time points and across time points, and complex exposure-response relationships. To address these concerns, we develop a flexible statistical method, called lagged kernel machine regression (LKMR). LKMR identifies critical exposure windows of chemical mixtures, and accounts for complex non-linear and non-additive effects of the mixture at any given exposure window. Specifically, LKMR estimates how the effects of a mixture of exposures change with the exposure time window using a Bayesian formulation of a grouped, fused lasso penalty within a kernel machine regression (KMR) framework. A simulation study demonstrates the performance of LKMR under realistic exposure-response scenarios, and demonstrates large gains over approaches that consider each time window separately, particularly when serial correlation among the time-varying exposures is high. Furthermore, LKMR demonstrates gains over another approach that inputs all time-specific chemical concentrations together into a single KMR. We apply LKMR to estimate associations between neurodevelopment and metal mixtures in Early Life Exposures in Mexico and Neurotoxicology, a prospective cohort study of child health in Mexico City.
SU-E-J-212: Identifying Bones From MRI: A Dictionary Learning and Sparse Regression Approach
International Nuclear Information System (INIS)
Ruan, D; Yang, Y; Cao, M; Hu, P; Low, D
2014-01-01
Purpose: To develop an efficient and robust scheme to identify bony anatomy based on MRI-only simulation images. Methods: MRI offers important soft tissue contrast and functional information, yet its lack of correlation to electron-density has placed it as an auxiliary modality to CT in radiotherapy simulation and adaptation. An effective scheme to identify bony anatomy is an important first step towards an MR-only simulation/treatment paradigm and would satisfy most practical purposes. We utilize a UTE acquisition sequence to achieve visibility of the bone. In contrast to manual + bulk or registration-based bone identification, we propose a novel learning-based approach for improved robustness to MR artefacts and environmental changes. Specifically, local information is encoded with an MR image patch, and the corresponding label is extracted (during training) from simulation CT aligned to the UTE. Within each class (bone vs. nonbone), an overcomplete dictionary is learned so that typical patches within the proper class can be represented as a sparse combination of the dictionary entries. For testing, an acquired UTE-MRI is divided into patches using a sliding scheme, where each patch is sparsely regressed against both bone and nonbone dictionaries, and subsequently assigned to the class with the smaller residual. Results: The proposed method has been applied to the pilot site of brain imaging and has shown generally good performance, with a Dice similarity coefficient greater than 0.9 in a cross-validation study using 4 datasets. Importantly, it is robust to consistent foreign objects (e.g., headset) and to artefacts related to Gibbs and field heterogeneity. Conclusion: A learning perspective has been developed for inferring bone structures based on UTE MRI. The imaging setting is subject to minimal motion effects and the post-processing is efficient. The improved efficiency and robustness enable a first translation to MR-only routine. The scheme
Use of multilevel logistic regression to identify the causes of differential item functioning.
Balluerka, Nekane; Gorostiaga, Arantxa; Gómez-Benito, Juana; Hidalgo, María Dolores
2010-11-01
Given that a key function of tests is to serve as evaluation instruments and for decision making in the fields of psychology and education, the possibility that some of their items may show differential behaviour is a major concern for psychometricians. In recent decades, important progress has been made as regards the efficacy of techniques designed to detect this differential item functioning (DIF). However, the findings are scant when it comes to explaining its causes. The present study addresses this problem from the perspective of multilevel analysis. Starting from a case study in the area of transcultural comparisons, multilevel logistic regression is used: 1) to identify the item characteristics associated with the presence of DIF; 2) to estimate the proportion of variation in the DIF coefficients that is explained by these characteristics; and 3) to evaluate alternative explanations of the DIF by comparing the explanatory power or fit of different sequential models. The comparison of these models confirmed one of the two alternatives (familiarity with the stimulus) and rejected the other (the topic area) as being a cause of differential functioning with respect to the compared groups.
Eekhout, I.; Wiel, M.A. van de; Heymans, M.W.
2017-01-01
Background. Multiple imputation is a recommended method to handle missing data. For significance testing after multiple imputation, Rubin’s Rules (RR) are easily applied to pool parameter estimates. In a logistic regression model, to consider whether a categorical covariate with more than two levels
Directory of Open Access Journals (Sweden)
Charles K Fisher
Full Text Available Human-associated microbial communities exert tremendous influence over human health and disease. With modern metagenomic sequencing methods it is now possible to follow the relative abundance of microbes in a community over time. These microbial communities exhibit rich ecological dynamics and an important goal of microbial ecology is to infer the ecological interactions between species directly from sequence data. Any algorithm for inferring ecological interactions must overcome three major obstacles: (1) a correlation between the abundances of two species does not imply that those species are interacting, (2) the sum constraint on the relative abundances obtained from metagenomic studies makes it difficult to infer the parameters in time-series models, and (3) errors due to experimental uncertainty, or mis-assignment of sequencing reads into operational taxonomic units, bias inferences of species interactions due to a statistical problem called "errors-in-variables". Here we introduce an approach, Learning Interactions from MIcrobial Time Series (LIMITS), that overcomes these obstacles. LIMITS uses sparse linear regression with bootstrap aggregation to infer a discrete-time Lotka-Volterra model for microbial dynamics. We tested LIMITS on synthetic data and showed that it could reliably infer the topology of the inter-species ecological interactions. We then used LIMITS to characterize the species interactions in the gut microbiomes of two individuals and found that the interaction networks varied significantly between individuals. Furthermore, we found that the interaction networks of the two individuals are dominated by distinct "keystone species", Bacteroides fragilis and Bacteroides stercoris, that have a disproportionate influence on the structure of the gut microbiome even though they are only found in moderate abundance. Based on our results, we hypothesize that the abundances of certain keystone species may be responsible for individuality in
Buchner, Florian; Wasem, Jürgen; Schillo, Sonja
2017-01-01
Risk equalization formulas have been refined since their introduction about two decades ago. Because of the complexity and the abundance of possible interactions between the variables used, hardly any interactions are considered. A regression tree is used to systematically search for interactions, a methodologically new approach in risk equalization. Analyses are based on a data set of nearly 2.9 million individuals from a major German social health insurer. A two-step approach is applied: In the first step a regression tree is built on the basis of the learning data set. Terminal nodes characterized by more than one morbidity-group-split represent interaction effects of different morbidity groups. In the second step the 'traditional' weighted least squares regression equation is expanded by adding interaction terms for all interactions detected by the tree, and regression coefficients are recalculated. The resulting risk adjustment formula shows an improvement in the adjusted R² from 25.43% to 25.81% on the evaluation data set. Predictive ratios are calculated for subgroups affected by the interactions. The R² improvement detected is only marginal. According to the sample level performance measures used, omitting a considerable number of morbidity interactions causes no relevant loss in accuracy. Copyright © 2015 John Wiley & Sons, Ltd.
Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis
Camilleri, Liberato; Cefai, Carmel
2013-01-01
Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However, these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…
Gajewski, Byron J; Dunton, Nancy
2013-04-01
Almost a decade ago Morton and Torgerson indicated that perceived medical benefits could be due to "regression to the mean." Despite this caution, the regression to the mean "effects on the identification of changes in institutional performance do not seem to have been considered previously in any depth" (Jones and Spiegelhalter). As a response, Jones and Spiegelhalter provide a methodology to adjust for regression to the mean when modeling recent changes in institutional performance for one-variable quality indicators. Therefore, in our view, Jones and Spiegelhalter provide a breakthrough methodology for performance measures. At the same time, in the interests of parsimony, it is useful to aggregate individual quality indicators into a composite score. Our question is, can we develop and demonstrate a methodology that extends the "regression to the mean" literature to allow for composite quality indicators? Using a latent variable modeling approach, we extend the methodology to the composite indicator case. We demonstrate the approach on 4 indicators collected by the National Database of Nursing Quality Indicators. A simulation study further demonstrates its "proof of concept."
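The regression-to-the-mean effect that this abstract refers to can be illustrated with a small simulation. The sketch below is not the authors' methodology: it assumes hypothetical institutions whose observed indicator is a stable true level plus independent measurement noise, and all names and parameter values are illustrative.

```python
import random

rng = random.Random(42)
n = 5000  # hypothetical number of institutions

# Each institution has a stable true level; each measurement adds fresh noise.
true_level = [rng.gauss(0, 1) for _ in range(n)]
before = [t + rng.gauss(0, 1) for t in true_level]
after = [t + rng.gauss(0, 1) for t in true_level]

# Select the top 10% on the first measurement only.
cut = sorted(before)[int(0.9 * n)]
top = [i for i in range(n) if before[i] >= cut]

mean_before = sum(before[i] for i in top) / len(top)
mean_after = sum(after[i] for i in top) / len(top)

# With no real change in any institution, the selected group still
# scores closer to the overall mean on the second measurement.
print(mean_before, mean_after)
```

Because nothing changed between the two measurements, the apparent decline in the selected group is purely a selection artifact, which is why an adjustment of this kind is needed before reading such changes as real.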
2010-10-01
... and material audit exceptions identified regarding centralized financial and administrative functions... Tribes for Participation in Self-Governance Planning Phase § 137.22 May the Secretary consider uncorrected significant and material audit exceptions identified regarding centralized financial and...
Lei, Yang; Nollen, Nikki; Ahluwahlia, Jasjit S; Yu, Qing; Mayo, Matthew S
2015-04-09
Other forms of tobacco use are increasing in prevalence, yet most tobacco control efforts are aimed at cigarettes. In light of this, it is important to identify individuals who are using both cigarettes and alternative tobacco products (ATPs). Most previous studies have used regression models. We fitted a traditional logistic regression model and a classification and regression tree (CART) model to illustrate and discuss the added advantages of using CART in the setting of identifying high-risk subgroups of ATP users among cigarette smokers. The data were collected from an online cross-sectional survey administered by Survey Sampling International between July 5, 2012 and August 15, 2012. Eligible participants self-identified as current smokers, African American, White, or Latino (of any race), were English-speaking, and were at least 25 years old. The study sample included 2,376 participants and was divided into independent training and validation samples for a hold-out validation. Logistic regression and CART models were used to examine the important predictors of cigarettes + ATP users. The logistic regression model identified nine important factors: gender, age, race, nicotine dependence, buying cigarettes or borrowing, whether the price of cigarettes influences the brand purchased, whether the participants set limits on cigarettes per day, alcohol use scores, and discrimination frequencies. The c-index of the logistic regression model was 0.74, indicating good discriminatory capability. The model also performed well in the validation cohort, with good discrimination (c-index = 0.73) and excellent calibration (R-square = 0.96 in the calibration regression). The parsimonious CART model identified gender, age, alcohol use score, race, and discrimination frequencies to be the most important factors. It also revealed interesting partial interactions. The c-index was 0.70 for the training sample and 0.69 for the validation sample. The misclassification
Directory of Open Access Journals (Sweden)
Paul M. Graham, DO
2018-05-01
Full Text Available We report a case of histologically confirmed primary cutaneous diffuse large B-cell lymphoma, leg type (PCDLBCL-LT) that subsequently underwent spontaneous regression in the absence of systemic treatment. The case showed an atypical lymphoid infiltrate that was CD20+ and MUM-1+ and CD10–. A subsequent biopsy of the spontaneously regressed lesion showed fibrosis associated with a lymphocytic infiltrate comprising reactive T cells. PCDLBCL-LT is a cutaneous B-cell lymphoma with a poor prognosis, which is usually treated with chemotherapy. We describe a case of clinical and histologic spontaneous regression in a patient with PCDLBCL-LT who had a negative systemic workup but a recurrence over a year after his initial presentation. Key words: B cell, lymphoma, primary cutaneous diffuse large B-cell lymphoma, leg type, regression
Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.
2013-01-01
This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
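One way to picture the idea is the following sketch, which is not taken from the article. It relies on the standard result that, under the null hypothesis that all k slope coefficients are zero and with normal errors, R² from n observations follows a Beta(k/2, (n − k − 1)/2) distribution. The function name, the sample sizes and the observed R² values are hypothetical.

```python
import random

def r2_beta_pvalue(r2_obs, n, k, draws=100_000, seed=0):
    """Monte Carlo p-value for H0: all k slopes are zero.

    Under H0 with normal errors, R^2 ~ Beta(k/2, (n - k - 1)/2),
    so the p-value is the share of Beta draws at least as large as r2_obs.
    """
    rng = random.Random(seed)
    a, b = k / 2.0, (n - k - 1) / 2.0
    exceed = sum(rng.betavariate(a, b) >= r2_obs for _ in range(draws))
    return exceed / draws

# Hypothetical fits with n = 30 observations and k = 3 predictors.
p_small = r2_beta_pvalue(0.05, n=30, k=3)  # weak fit: large p-value expected
p_large = r2_beta_pvalue(0.60, n=30, k=3)  # strong fit: small p-value expected
print(p_small, p_large)
```

Sampling from the Beta distribution avoids the need for F tables, and the result agrees with the usual overall F test because the two statistics are monotone transformations of each other.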
A Matlab program for stepwise regression
Directory of Open Access Journals (Sweden)
Yanhong Qi
2016-03-01
Full Text Available Stepwise linear regression is a multi-variable regression method for identifying statistically significant variables in the linear regression equation. In the present study, we present a Matlab program for stepwise regression.
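The abstract does not reproduce the Matlab program, so the following is only a minimal Python sketch of forward stepwise selection under stated assumptions: predictors enter one at a time by largest partial F statistic, and selection stops when the best candidate falls below an F-to-enter threshold. The threshold of 4.0, the helper names and the synthetic data are all hypothetical, and no removal step is included.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def rss(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def forward_stepwise(X, y, f_enter=4.0):
    """Add, one at a time, the predictor with the largest partial F statistic."""
    n, selected = len(y), []
    remaining = list(range(len(X[0])))
    current = rss(X, y, selected)
    while remaining:
        scores = []
        for c in remaining:
            new = rss(X, y, selected + [c])
            df = n - len(selected) - 2  # residual df after adding column c
            scores.append(((current - new) / (new / df), c, new))
        f_best, c_best, rss_best = max(scores)
        if f_best < f_enter:  # assumed stopping rule
            break
        selected.append(c_best)
        remaining.remove(c_best)
        current = rss_best
    return selected

# Synthetic data: only columns 0 and 2 truly influence y.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(60)]
y = [2 * r[0] - 3 * r[2] + rng.gauss(0, 0.5) for r in X]
selected = forward_stepwise(X, y)
print(selected)
```

A full stepwise procedure would also re-test already selected variables for removal after each entry; that backward step is omitted here to keep the sketch short.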
Parodi, Stefano; Dosi, Corrado; Zambon, Antonella; Ferrari, Enrico; Muselli, Marco
2017-12-01
Identifying potential risk factors for problem gambling (PG) is of primary importance for planning preventive and therapeutic interventions. We illustrate a new approach based on the combination of standard logistic regression (LR) and an innovative method of supervised data mining (Logic Learning Machine or LLM). Data were taken from a pilot cross-sectional study to identify subjects with PG behaviour, assessed by two internationally validated scales (SOGS and Lie/Bet). Information was obtained from 251 gamblers recruited in six betting establishments. Data on socio-demographic characteristics, lifestyle and cognitive-related factors, and type, place and frequency of preferred gambling were obtained by a self-administered questionnaire. The following variables associated with PG were identified: instant gratification games, alcohol abuse, cognitive distortion, illegal behaviours and having started gambling with a relative or a friend. Furthermore, the combination of LLM and LR indicated the presence of two different types of PG, namely: (a) daily gamblers, more prone to illegal behaviour, with poor money management skills and who started gambling at an early age, and (b) non-daily gamblers, characterised by superstitious beliefs and a higher preference for immediate reward games. Finally, instant gratification games were strongly associated with the number of games usually played. Studies on gamblers habitually frequenting betting shops are rare. The finding of different types of PG among habitual gamblers deserves further analysis in larger studies. Advanced data mining algorithms, like LLM, are powerful tools and potentially useful in identifying risk factors for PG.
Chiang, Peggy Pei-Chia; Xie, Jing; Keeffe, Jill Elizabeth
2011-04-25
To identify the critical success factors (CSF) associated with coverage of low vision services. Data were collected from a survey distributed to Vision 2020 contacts, government, and non-government organizations (NGOs) in 195 countries. The Classification and Regression Tree Analysis (CART) was used to identify the critical success factors of low vision service coverage. Independent variables were sourced from the survey: policies, epidemiology, provision of services, equipment and infrastructure, barriers to services, human resources, and monitoring and evaluation. Socioeconomic and demographic independent variables: health expenditure, population statistics, development status, and human resources in general, were sourced from the World Health Organization (WHO), World Bank, and the United Nations (UN). The findings identified that having >50% of children obtaining devices when prescribed (χ² = 44; P 3 rehabilitation workers per 10 million of population (χ² = 4.50; P = 0.034), higher percentage of population urbanized (χ² = 14.54; P = 0.002), a level of private investment (χ² = 14.55; P = 0.015), and being fully funded by government (χ² = 6.02; P = 0.014), are critical success factors associated with coverage of low vision services. This study identified the most important predictors for countries with better low vision coverage. The CART is a useful and suitable methodology in survey research and is a novel way to simplify a complex global public health issue in eye care.
Active Learning with Rationales for Identifying Operationally Significant Anomalies in Aviation
Sharma, Manali; Das, Kamalika; Bilgic, Mustafa; Matthews, Bryan; Nielsen, David Lynn; Oza, Nikunj C.
2016-01-01
A major focus of the commercial aviation community is discovery of unknown safety events in flight operations data. Data-driven unsupervised anomaly detection methods are better at capturing unknown safety events compared to rule-based methods which only look for known violations. However, not all statistical anomalies that are discovered by these unsupervised anomaly detection methods are operationally significant (e.g., represent a safety concern). Subject Matter Experts (SMEs) have to spend significant time reviewing these statistical anomalies individually to identify a few operationally significant ones. In this paper we propose an active learning algorithm that incorporates SME feedback in the form of rationales to build a classifier that can distinguish between uninteresting and operationally significant anomalies. Experimental evaluation on real aviation data shows that our approach improves detection of operationally significant events by as much as 75% compared to the state-of-the-art. The learnt classifier also generalizes well to additional validation data sets.
Identifying Adult Dengue Patients at Low Risk for Clinically Significant Bleeding.
Directory of Open Access Journals (Sweden)
Joshua G X Wong
Full Text Available Clinically significant bleeding is important for subsequent optimal case management in dengue patients, but most studies have focused on dengue severity as an outcome. Our study objective was to identify differences in admission parameters between patients who developed clinically significant bleeding and those that did not. We sought to develop a model for discriminating between these patients. We conducted a retrospective study of 4,383 adults aged >18 years who were hospitalized with dengue infection at Tan Tock Seng Hospital, Singapore from 2005 to 2008. Patients were divided into those with clinically significant bleeding (n = 188) and those without (n = 4,195). Demographic, clinical, and laboratory variables on admission were compared between groups to determine factors associated with clinically significant bleeding during hospitalization. On admission, female gender (p38°C (p38°C (aOR 1.81; 95% CI: 1.27-2.61), nausea/vomiting (aOR 1.39; 95% CI: 0.94-2.12), ANC (aOR 1.3; 95% CI: 1.15-1.46), ALC (aOR 0.4; 95% CI: 0.25-0.64), hematocrit percentage (aOR 0.96; 95% CI: 0.92-1.002) and platelet count (aOR 0.993; 95% CI: 0.988-0.998). At the cutoff of -3.919, the model achieved an AUC of 0.758 (sensitivity: 0.87, specificity: 0.38, PPV: 0.06, NPV: 0.98). Clinical risk factors associated with clinically significant bleeding were identified. This model may be useful to complement clinical judgement in triaging adult dengue patients given the dynamic nature of acute dengue, particularly in pre-identifying those less likely to develop clinically significant bleeding.
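The predictive values quoted in this abstract follow arithmetically from the reported sensitivity, specificity and group sizes. The short check below is only a worked illustration; the function name is hypothetical, and it assumes the reported sensitivity and specificity apply to the full cohort of 188 bleeding and 4,195 non-bleeding patients.

```python
def predictive_values(sensitivity, specificity, n_pos, n_neg):
    """Return (PPV, NPV) implied by sensitivity, specificity and group sizes."""
    tp = sensitivity * n_pos   # bleeding patients flagged by the model
    fn = n_pos - tp            # bleeding patients missed
    tn = specificity * n_neg   # non-bleeding patients correctly cleared
    fp = n_neg - tn            # non-bleeding patients flagged
    return tp / (tp + fp), tn / (tn + fn)

ppv, npv = predictive_values(0.87, 0.38, n_pos=188, n_neg=4195)
print(round(ppv, 2), round(npv, 2))  # prints: 0.06 0.98
```

The low PPV and high NPV reflect the low prevalence of bleeding (about 4% of the cohort), which is consistent with the abstract's emphasis on identifying patients unlikely to bleed rather than those who will.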
Identifying significant temporal variation in time course microarray data without replicates
Directory of Open Access Journals (Sweden)
Porter Weston
2009-03-01
Full Text Available Background: An important component of time course microarray studies is the identification of genes that demonstrate significant time-dependent variation in their expression levels. Until recently, available methods for performing such significance tests required replicates of individual time points. This paper describes a replicate-free method that was developed as part of a study of the estrous cycle in the rat mammary gland in which no replicate data were collected. Results: A temporal test statistic is proposed that is based on the degree to which data are smoothed when fit by a spline function. An algorithm is presented that uses this test statistic together with a false discovery rate method to identify genes whose expression profiles exhibit significant temporal variation. The algorithm is tested on simulated data, and is compared with another recently published replicate-free method. The simulated data consist of both genes with known temporal dependencies and genes from a null distribution. The proposed algorithm identifies a larger percentage of the time-dependent genes for a given false discovery rate. Use of the algorithm in a study of the estrous cycle in the rat mammary gland resulted in the identification of genes exhibiting distinct circadian variation. These results were confirmed in follow-up laboratory experiments. Conclusion: The proposed algorithm provides a new approach for identifying expression profiles with significant temporal variation without relying on replicates. When compared with a recently published algorithm on simulated data, the proposed algorithm appears to identify a larger percentage of time-dependent genes for a given false discovery rate. The development of the algorithm was instrumental in revealing the presence of circadian variation in the virgin rat mammary gland during the estrous cycle.
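The abstract does not say which false discovery rate method is used. As one common possibility, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of per-gene p-values; the choice of procedure, the function name and the p-values themselves are assumptions made purely for illustration.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Return indices of hypotheses rejected at the given false discovery rate."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    # Find the largest rank whose p-value is below its step-up threshold.
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= fdr * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])

# Hypothetical p-values, one per gene, from some temporal test statistic.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(pvals, fdr=0.05)
print(significant)  # prints: [0, 1]
```

Only the two smallest p-values fall below their rank-specific thresholds (0.005 and 0.010), so only those two genes would be reported as showing significant temporal variation at a 5% false discovery rate.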
Singer, Meromit; Engström, Alexander; Schönhuth, Alexander; Pachter, Lior
2011-09-23
Recent experimental and computational work confirms that CpGs can be unmethylated inside coding exons, thereby showing that codons may be subjected to both genomic and epigenomic constraint. It is therefore of interest to identify coding CpG islands (CCGIs) that are regions inside exons enriched for CpGs. The difficulty in identifying such islands is that coding exons exhibit sequence biases determined by codon usage and constraints that must be taken into account. We present a method for finding CCGIs that showcases a novel approach we have developed for identifying regions of interest that are significant (with respect to a Markov chain) for the counts of any pattern. Our method begins with the exact computation of tail probabilities for the number of CpGs in all regions contained in coding exons, and then applies a greedy algorithm for selecting islands from among the regions. We show that the greedy algorithm provably optimizes a biologically motivated criterion for selecting islands while controlling the false discovery rate. We applied this approach to the human genome (hg18) and annotated CpG islands in coding exons. The statistical criterion we apply to evaluating islands reduces the number of false positives in existing annotations, while our approach to defining islands reveals significant numbers of undiscovered CCGIs in coding exons. Many of these appear to be examples of functional epigenetic specialization in coding exons.
Directory of Open Access Journals (Sweden)
David J. Purpura
2017-12-01
Full Text Available Many children struggle to successfully acquire early mathematics skills. Theoretical and empirical evidence has pointed to deficits in domain-specific skills (e.g., non-symbolic mathematics skills) or domain-general skills (e.g., executive functioning and language) as underlying low mathematical performance. In the current study, we assessed a sample of 113 three- to five-year-old preschool children on a battery of domain-specific and domain-general factors in the fall and spring of their preschool year to identify Time 1 (fall) factors associated with low performance in mathematics knowledge at Time 2 (spring). We used the exploratory approach of classification and regression tree analyses, a strategy that uses step-wise partitioning to create subgroups from a larger sample using multiple predictors, to identify the factors that were the strongest classifiers of low performance for younger and older preschool children. Results indicated that the most consistent classifier of low mathematics performance at Time 2 was children’s Time 1 mathematical language skills. Further, other distinct classifiers of low performance emerged for younger and older children. These findings suggest that risk classification for low mathematics performance may differ depending on children’s age.
Kuramitsu, Yasuhiro; Wang, Yufeng; Okada, Futoshi; Baron, Byron; Tokuda, Kazuhiro; Kitagawa, Takao; Akada, Junko; Nakamura, Kazuyuki
2013-09-01
QR-32 is a regressive murine fibrosarcoma cell clone that cannot grow when transplanted into mice; QRsP-11 is a progressive malignant tumor cell clone derived from QR-32 which shows strong tumorigenicity. A recent study showed there to be differentially expressed up-regulated and down-regulated proteins in these cells, which were identified by proteomic differential display analyses using two-dimensional gel electrophoresis and mass spectrometry. Cofilins are small proteins of less than 20 kDa. Their function is the regulation of actin assembly. Cofilin-1 is a small ubiquitous protein, and regulates actin dynamics by binding to actin filaments. Cofilin-1 plays roles in cell migration, proliferation and phagocytosis. Cofilin-2 is also a small protein, but it is mainly expressed in skeletal and cardiac muscles. There are many reports showing a positive correlation between the level of cofilin-1 and cancer progression. We have also reported an increased expression of cofilin-1 in pancreatic cancer tissues compared to adjacent paired normal tissues. On the other hand, cofilin-2 was significantly less expressed in pancreatic cancer tissues. Therefore, the present study compared the levels of cofilin-1 and cofilin-2 in regressive QR-32 and progressive QRsP-11 cells by western blotting. Cofilin-2 was significantly up-regulated in QRsP-11 compared to QR-32 cells (p<0.001). On the other hand, the difference in the intensities of the bands of cofilin-1 (18 kDa) between QR-32 and QRsP-11 was not significant. However, bands of 27 kDa showed a quite different intensity between QR-32 and QRsP-11, with much higher intensities in QRsP-11 compared to QR-32 (p<0.001). These results suggested that the 27-kDa protein recognized by the antibody against cofilin-1 is a possible biomarker for progressive tumor cells.
Directory of Open Access Journals (Sweden)
Ellen Kenchington
Full Text Available The United Nations General Assembly Resolution 61/105, concerning sustainable fisheries in the marine ecosystem, calls for the protection of vulnerable marine ecosystems (VME) from destructive fishing practices. Subsequently, the Food and Agriculture Organization (FAO) produced guidelines for identification of VME indicator species/taxa to assist in the implementation of the resolution, but recommended the development of case-specific operational definitions for their application. We applied kernel density estimation (KDE) to research vessel trawl survey data from inside the fishing footprint of the Northwest Atlantic Fisheries Organization (NAFO) Regulatory Area in the high seas of the northwest Atlantic to create biomass density surfaces for four VME indicator taxa: large-sized sponges, sea pens, small and large gorgonian corals. These VME indicator taxa were identified previously by NAFO using the fragility, life history characteristics and structural complexity criteria presented by FAO, along with an evaluation of their recovery trajectories. KDE, a non-parametric neighbour-based smoothing function, has been used previously in ecology to identify hotspots, that is, areas of relatively high biomass/abundance. We present a novel approach of examining relative changes in area under polygons created from encircling successive biomass categories on the KDE surface to identify "significant concentrations" of biomass, which we equate to VMEs. This allows identification of the VMEs from the broader distribution of the species in the study area. We provide independent assessments of the VMEs so identified using underwater images, benthic sampling with other gear types (dredges, cores), and/or published species distribution models of probability of occurrence, as available. For each VME indicator taxon we provide a brief review of their ecological function which will be important in future assessments of significant adverse impact on these habitats here
Directory of Open Access Journals (Sweden)
Matt Silver
2013-11-01
Full Text Available Standard approaches to data analysis in genome-wide association studies (GWAS) ignore any potential functional relationships between gene variants. In contrast gene pathways analysis uses prior information on functional structure within the genome to identify pathways associated with a trait of interest. In a second step, important single nucleotide polymorphisms (SNPs) or genes may be identified within associated pathways. The pathways approach is motivated by the fact that genes do not act alone, but instead have effects that are likely to be mediated through their interaction in gene pathways. Where this is the case, pathways approaches may reveal aspects of a trait's genetic architecture that would otherwise be missed when considering SNPs in isolation. Most pathways methods begin by testing SNPs one at a time, and so fail to capitalise on the potential advantages inherent in a multi-SNP, joint modelling approach. Here, we describe a dual-level, sparse regression model for the simultaneous identification of pathways and genes associated with a quantitative trait. Our method takes account of various factors specific to the joint modelling of pathways with genome-wide data, including widespread correlation between genetic predictors, and the fact that variants may overlap multiple pathways. We use a resampling strategy that exploits finite sample variability to provide robust rankings for pathways and genes. We test our method through simulation, and use it to perform pathways-driven gene selection in a search for pathways and genes associated with variation in serum high-density lipoprotein cholesterol levels in two separate GWAS cohorts of Asian adults. By comparing results from both cohorts we identify a number of candidate pathways including those associated with cardiomyopathy, and T cell receptor and PPAR signalling. Highlighted genes include those associated with the L-type calcium channel, adenylate cyclase, integrin, laminin, MAPK
Silver, Matt; Chen, Peng; Li, Ruoying; Cheng, Ching-Yu; Wong, Tien-Yin; Tai, E-Shyong; Teo, Yik-Ying; Montana, Giovanni
2013-01-01
Standard approaches to data analysis in genome-wide association studies (GWAS) ignore any potential functional relationships between gene variants. In contrast gene pathways analysis uses prior information on functional structure within the genome to identify pathways associated with a trait of interest. In a second step, important single nucleotide polymorphisms (SNPs) or genes may be identified within associated pathways. The pathways approach is motivated by the fact that genes do not act alone, but instead have effects that are likely to be mediated through their interaction in gene pathways. Where this is the case, pathways approaches may reveal aspects of a trait's genetic architecture that would otherwise be missed when considering SNPs in isolation. Most pathways methods begin by testing SNPs one at a time, and so fail to capitalise on the potential advantages inherent in a multi-SNP, joint modelling approach. Here, we describe a dual-level, sparse regression model for the simultaneous identification of pathways and genes associated with a quantitative trait. Our method takes account of various factors specific to the joint modelling of pathways with genome-wide data, including widespread correlation between genetic predictors, and the fact that variants may overlap multiple pathways. We use a resampling strategy that exploits finite sample variability to provide robust rankings for pathways and genes. We test our method through simulation, and use it to perform pathways-driven gene selection in a search for pathways and genes associated with variation in serum high-density lipoprotein cholesterol levels in two separate GWAS cohorts of Asian adults. By comparing results from both cohorts we identify a number of candidate pathways including those associated with cardiomyopathy, and T cell receptor and PPAR signalling. Highlighted genes include those associated with the L-type calcium channel, adenylate cyclase, integrin, laminin, MAPK signalling and immune
Directory of Open Access Journals (Sweden)
Liang Cheng
Full Text Available Diseases that are significantly related to a sequence can play an important role in understanding the function of that sequence. In this paper, we introduce BLAT2DOLite, an online system for annotating human genes and diseases and for identifying significant relationships between sequences and diseases. Currently, BLAT2DOLite integrates the Entrez Gene database and Disease Ontology Lite (DOLite), which contain gene loci and relationships between genes and diseases. It uses a hypergeometric test to calculate P-values between genes and diseases of DOLite. The system can be accessed from: http://123.59.132.21:8080/BLAT2DOLite. The corresponding web service is described in: http://123.59.132.21:8080/BLAT2DOLite/BLAT2DOLiteIDMappingPort?wsdl.
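The hypergeometric test mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation used by BLAT2DOLite, so the function name, the parameter names, and the choice of an upper-tail P-value below are assumptions made for illustration only.

```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, overlap):
    """Upper-tail P(X >= overlap) for a hypergeometric draw.

    population: total number of genes considered (hypothetical background)
    annotated:  genes in the background annotated to the disease
    sample:     genes matched by the query sequences
    overlap:    matched genes that are also annotated to the disease
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Count the ways to draw k annotated genes and (sample - k) others.
    hits = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return hits / total


# Illustrative numbers only; they do not come from the abstract.
p = hypergeom_pvalue(population=20000, annotated=200, sample=50, overlap=5)
print(f"P-value: {p:.3g}")
```

A small P-value here would suggest that the query genes overlap the disease annotation more often than expected by chance; with many diseases tested, a multiple-testing correction would normally follow.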
Smith, R.; Kasprzyk, J. R.; Balaji, R.
2017-12-01
In light of deeply uncertain factors like future climate change and population shifts, responsible resource management will require new types of information and strategies. For water utilities, this entails potential expansion and efficient management of water supply infrastructure systems for changes in overall supply; changes in frequency and severity of climate extremes such as droughts and floods; and variable demands, all while accounting for conflicting long and short term performance objectives. Multiobjective Evolutionary Algorithms (MOEAs) are emerging decision support tools that have been used by researchers and, more recently, water utilities to efficiently generate and evaluate thousands of planning portfolios. The tradeoffs between conflicting objectives are explored in an automated way to produce (often large) suites of portfolios that strike different balances of performance. Once generated, the sets of optimized portfolios are used to support relatively subjective assertions of priorities and human reasoning, leading to adoption of a plan. These large tradeoff sets contain information about complex relationships between decisions and between groups of decisions and performance that, until now, has not been quantitatively described. We present a novel use of Multivariate Regression Trees (MRTs) to analyze tradeoff sets to reveal these relationships and critical decisions. Additionally, when MRTs are applied to tradeoff sets developed for different realizations of an uncertain future, they can identify decisions that are robust across a wide range of conditions and produce fundamental insights about the system being optimized.
Knowles, Jacky; Kupka, Roland; Dumble, Sam; Garrett, Greg S.; Pandav, Chandrakant S.; Yadav, Kapil; Touré, Ndeye Khady; Foriwa Amoaful, Esi; Gorstein, Jonathan
2018-01-01
Single and multiple variable regression analyses were conducted using data from stratified, cluster sample design, iodine surveys in India, Ghana, and Senegal to identify factors associated with urinary iodine concentration (UIC) among women of reproductive age (WRA) at the national and sub-national level. Subjects were survey household respondents, typically WRA. For all three countries, UIC was significantly different (p < 0.05) by household salt iodine category. Other significant differences were by strata and by household vulnerability to poverty in India and Ghana. In multiple variable regression analysis, UIC was significantly associated with strata and household salt iodine category in India and Ghana (p < 0.001). Estimated UIC was 1.6 (95% confidence intervals (CI) 1.3, 2.0) times higher (India) and 1.4 (95% CI 1.2, 1.6) times higher (Ghana) among WRA from households using adequately iodised salt than among WRA from households using non-iodised salt. Other significant associations with UIC were found in India, with having heard of iodine deficiency (1.2 times higher; CI 1.1, 1.3; p < 0.001) and having improved dietary diversity (1.1 times higher, CI 1.0, 1.2; p = 0.015); and in Ghana, with the level of tomato paste consumption the previous week (p = 0.029) (UIC for highest consumption level was 1.2 times lowest level; CI 1.1, 1.4). No significant associations were found in Senegal. Sub-national data on iodine status are required to assess equity of access to optimal iodine intake and to develop strategic responses as needed. PMID:29690505
Ji, Jun; Ling, Jeffrey; Jiang, Helen; Wen, Qiaojun; Whitin, John C; Tian, Lu; Cohen, Harvey J; Ling, Xuefeng B
2013-03-23
Mass spectrometry (MS) has become the primary high-throughput tool for proteomics-based biomarker discovery. Multiple challenges in protein MS data analysis remain: management of large-scale, complex data sets; MS peak identification and indexing; and high-dimensional differential analysis of peaks with concurrent statistical tests based on false discovery rate (FDR). "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets and identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution that gives experimental biologists easy access to "cloud" computing capabilities for analyzing MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. The web application supports online uploading and analysis of large-scale MS data through a simple user interface. This bioinformatic tool will facilitate the discovery of potential protein biomarkers using MS.
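To make the idea of FDR-based peak selection concrete, here is a minimal sketch of the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR method the portal uses, so this choice, the function name, and the example P-values are all assumptions for illustration.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return one boolean per P-value: True where the peak is called significant.

    Implements the Benjamini-Hochberg step-up rule: find the largest rank k
    (1-based, in ascending P-value order) with p_(k) <= k / m * alpha, then
    reject the k smallest P-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Hypothetical per-peak P-values from a differential analysis.
peak_pvalues = [0.001, 0.04, 0.03, 0.5]
print(benjamini_hochberg(peak_pvalues))
```

The step-up rule matters when P-values are tightly grouped: a peak can be accepted even if its own rank-wise threshold is missed, as long as a larger P-value further down the sorted list meets its threshold.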
Directory of Open Access Journals (Sweden)
Brian A. Johnson
2018-01-01
Full Text Available The advent of very high resolution (VHR) satellite imagery and the development of Geographic Object-Based Image Analysis (GEOBIA) have led to many new opportunities for fine-scale land cover mapping, especially in urban areas. Image segmentation is an important step in the GEOBIA framework, so great time/effort is often spent to ensure that computer-generated image segments closely match real-world objects of interest. In the remote sensing community, segmentation is frequently performed using the multiresolution segmentation (MRS) algorithm, which is tuned through three user-defined parameters (the scale, shape/color, and compactness/smoothness parameters). The scale parameter (SP) is the most important parameter and governs the average size of generated image segments. Existing automatic methods to determine suitable SPs for segmentation are scene-specific and often computationally intensive, so an approach to estimating appropriate SPs that is generalizable (i.e., not scene-specific) could speed up the GEOBIA workflow considerably. In this study, we attempted to identify generalizable SPs for five common urban land cover types (buildings, vegetation, roads, bare soil, and water) through meta-analysis and nonlinear regression tree (RT) modeling. First, we performed a literature search of recent studies that employed GEOBIA for urban land cover mapping and extracted the MRS parameters used, the image properties (i.e., spatial and radiometric resolutions), and the land cover classes mapped. Using this data extracted from the literature, we constructed RT models for each land cover class to predict suitable SP values based on the: image spatial resolution, image radiometric resolution, shape/color parameter, and compactness/smoothness parameter. Based on a visual and quantitative analysis of results, we found that for all land cover classes except water, relatively accurate SPs could be identified using our RT modeling results. The main advantage of our
Strifler, Lisa; Cardoso, Roberta; McGowan, Jessie; Cogo, Elise; Nincic, Vera; Khan, Paul A; Scott, Alistair; Ghassemi, Marco; MacDonald, Heather; Lai, Yonda; Treister, Victoria; Tricco, Andrea C; Straus, Sharon E
2018-04-13
To conduct a scoping review of knowledge translation (KT) theories, models and frameworks that have been used to guide dissemination or implementation of evidence-based interventions targeted to prevention and/or management of cancer or other chronic diseases. We used a comprehensive multistage search process from 2000-2016, which included traditional bibliographic database searching, searching using names of theories, models and frameworks, and cited reference searching. Two reviewers independently screened the literature and abstracted data. We found 596 studies reporting on the use of 159 KT theories, models or frameworks. A majority (87%) of the identified theories, models or frameworks were used in five or fewer studies, with 60% used once. The theories, models and frameworks were most commonly used to inform planning/design, implementation and evaluation activities, and least commonly used to inform dissemination and sustainability/scalability activities. Twenty-six were used across the full implementation spectrum (from planning/design to sustainability/scalability) either within or across studies. All were used for at least individual-level behavior change, while 48% were used for organization-level, 33% for community-level and 17% for system-level change. We found a significant number of KT theories, models and frameworks with a limited evidence base describing their use. Copyright © 2018. Published by Elsevier Inc.
Kanodia, Shreya; Da Silva, Diane M.; Karamanukyan, Tigran; Bogaert, Lies; Fu, Yang-Xin; Kast, W. Martin
2010-01-01
LIGHT, a ligand for the lymphotoxin-beta receptor, establishes lymphoid-like tissues inside tumor sites and recruits naïve T-cells into the tumor. However, whether these infiltrating T-cells are specific for tumor antigens is not known. We hypothesized that therapy with LIGHT can expand functional tumor-specific CD8+ T-cells that can be boosted using HPV16E6E7-Venezuelan Equine Encephalitis Virus Replicon Particles (HPV16-VRP) and that this combined therapy can eradicate HPV16-induced tumors. Our data show that forced expression of LIGHT in tumors results in an increase in expression of interferon gamma (IFNg) and chemoattractant cytokines such as IL-1a, MIG and MIP-2 within the tumor and that this tumor microenvironment correlates with an increase in frequency of tumor-infiltrating CD8+ T-cells. Forced expression of LIGHT also results in the expansion of functional T-cells that recognize multiple tumor-antigens, including HPV16 E7, and these T-cells prevent the outgrowth of tumors upon secondary challenge. Subsequent boosting of E7-specific T-cells by vaccination with HPV16-VRP significantly increases their frequency in both the periphery and the tumor, and leads to the eradication of large well-established tumors, for which either treatment alone is not successful. These data establish the safety of Ad-LIGHT as a therapeutic intervention in pre-clinical studies and suggest that patients with HPV16+ tumors may benefit from combined immunotherapy with LIGHT and antigen-specific vaccination. PMID:20460520
Directory of Open Access Journals (Sweden)
Jacky Knowles
2018-04-01
Full Text Available Single and multiple variable regression analyses were conducted using data from stratified, cluster sample design, iodine surveys in India, Ghana, and Senegal to identify factors associated with urinary iodine concentration (UIC) among women of reproductive age (WRA) at the national and sub-national level. Subjects were survey household respondents, typically WRA. For all three countries, UIC was significantly different (p < 0.05) by household salt iodine category. Other significant differences were by strata and by household vulnerability to poverty in India and Ghana. In multiple variable regression analysis, UIC was significantly associated with strata and household salt iodine category in India and Ghana (p < 0.001). Estimated UIC was 1.6 (95% confidence intervals (CI) 1.3, 2.0) times higher (India) and 1.4 (95% CI 1.2, 1.6) times higher (Ghana) among WRA from households using adequately iodised salt than among WRA from households using non-iodised salt. Other significant associations with UIC were found in India, with having heard of iodine deficiency (1.2 times higher; CI 1.1, 1.3; p < 0.001) and having improved dietary diversity (1.1 times higher, CI 1.0, 1.2; p = 0.015); and in Ghana, with the level of tomato paste consumption the previous week (p = 0.029) (UIC for highest consumption level was 1.2 times lowest level; CI 1.1, 1.4). No significant associations were found in Senegal. Sub-national data on iodine status are required to assess equity of access to optimal iodine intake and to develop strategic responses as needed.
Identifying the most significant indicators of the total road safety performance index.
Tešić, Milan; Hermans, Elke; Lipovac, Krsto; Pešić, Dalibor
2018-04-01
The review of the national and international literature dealing with the assessment of the road safety level has shown great efforts of the authors who tried to define the methodology for calculating the composite road safety index on a territory (region, state, etc.). The procedure for obtaining a road safety composite index of an area has been largely harmonized. The question that has not been fully resolved yet concerns the selection of indicators. There is a wide range of road safety indicators used to show a road safety situation on a territory. A road safety performance index (RSPI) obtained on the basis of a larger number of safety performance indicators (SPIs) enables decision makers to more precisely define the earlier goal-oriented actions. However, recording a broader comprehensive set of SPIs helps identify the strengths and weaknesses of a country's road safety system. Providing high quality national and international databases that would include comparable SPIs seems to be difficult since a larger number of countries have only a small number of identical indicators available for use. Therefore, there is a need for calculating a road safety performance index with a limited number of indicators (RSPI ln n ) which will provide a comparison of sufficient quality for as many countries as possible. The application of the Data Envelopment Analysis (DEA) method and correlative analysis has helped to check if the RSPI ln n is likely to be of sufficient quality. A strong correlation between the RSPI ln n and the RSPI has been identified using the proposed methodology. Based on this, the most contributing indicators and methodologies for gradual monitoring of SPIs have been defined for each country analyzed. The indicator monitoring phases in the analyzed countries have been defined in the following way: Phase 1- the indicators relating to alcohol, speed and protective systems; Phase 2- the indicators relating to roads and Phase 3- the indicators relating to
49 CFR 520.5 - Guidelines for identifying major actions significantly affecting the environment.
2010-10-01
... significantly affecting the environment. 520.5 Section 520.5 Transportation Other Regulations Relating to... significantly affecting the environment. (a) General guidelines. The phrase, “major Federal actions significantly affecting the quality of the human environment,” as used in this part, shall be construed with a...
Zunker, Norma D.; Pearce, Daniel L.
2012-01-01
The first part of this study explored the significant works pertaining to the understanding of reading comprehension using a Modified Delphi Method. A panel of reading comprehension experts identified 19 works they considered to be significant to the understanding of reading comprehension. The panel of experts identified the reasons they…
DEFF Research Database (Denmark)
Mola, Gylli; Wenger, Therese Ramstad; Salomonsson, Petra
2017-01-01
AIM: We investigated the consequences of applying different imaging guidelines for urological anomalies after first pyelonephritis in children with normal routine antenatal ultrasounds. METHODS: The cohort comprised 472 children treated for their first culture-positive pyelonephritis and investig...... identified all patients initially treated with surgery and avoided 65 scintigraphies. CONCLUSION: Dilated VUR was the dominant anomaly in a cohort with first time pyelonephritis and normal antenatal ultrasound. The optimal imaging strategy after pyelonephritis must be identified.
Methodology to identify risk-significant components for inservice inspection and testing
International Nuclear Information System (INIS)
Anderson, M.T.; Hartley, R.S.; Jones, J.L. Jr.; Kido, C.; Phillips, J.H.
1992-08-01
Periodic inspection and testing of vital system components should be performed to ensure the safe and reliable operation of Department of Energy (DOE) nuclear processing facilities. Probabilistic techniques may be used to help identify and rank components by their relative risk. A risk-based ranking would allow varied DOE sites to implement inspection and testing programs in an effective and cost-efficient manner. This report describes a methodology that can be used to rank components, while addressing multiple risk issues
2010-07-01
... deficiencies identified in sanitary surveys performed by EPA. 141.723 Section 141.723 Protection of Environment... performed by EPA, systems must respond in writing to significant deficiencies identified in sanitary survey... will address significant deficiencies noted in the survey. (d) Systems must correct significant...
Ukil, Sanchaita; Sinha, Meenakshee; Varshney, Lavneesh; Agrawal, Shipra
Type 2 Diabetes is a complex multifactorial disease, which alters several signaling cascades giving rise to serious complications. It is one of the major risk factors for cardiovascular diseases. The present research work describes an integrated functional network biology approach to identify pathways that get transcriptionally altered and lead to complex complications thereby amplifying the phenotypic effect of the impaired disease state. We have identified two sub-network modules, which could be activated under abnormal circumstances in diabetes. Present work describes key proteins such as P85A and SRC serving as important nodes to mediate alternate signaling routes during diseased condition. P85A has been shown to be an important link between stress responsive MAPK and CVD markers involved in fibrosis. MAPK8 has been shown to interact with P85A and further activate CTGF through VEGF signaling. We have traced a novel and unique route correlating inflammation and fibrosis by considering P85A as a key mediator of signals. The next sub-network module shows SRC as a junction for various signaling processes, which results in interaction between NF-kB and beta catenin to cause cell death. The powerful interaction between these important genes in response to transcriptionally altered lipid metabolism and impaired inflammatory response via SRC causes apoptosis of cells. The crosstalk between inflammation, lipid homeostasis and stress, and their serious effects downstream have been explained in the present analyses.
DEFF Research Database (Denmark)
Bergman, Gunnar; Hærskjold, Ann; Stensballe, Lone Graff
2015-01-01
BACKGROUND: Epidemiological research is facilitated in Sweden by a history of national health care registers, making large unselected national cohort studies possible. However, for complex clinical populations, such as children with congenital heart disease (CHD), register-based studies...... are challenged by registration limitations. For example, the diagnostic code system International Classification of Diseases, 10th version (ICD-10) does not indicate the clinical significance of abnormalities, therefore may be of limited use if used as the sole parameter in epidemiological research. Palivizumab...
DEFF Research Database (Denmark)
Jansen, Christian; Bogs, Christopher; Verlinden, Wim
2017-01-01
BACKGROUND & AIMS: Clinically significant portal hypertension (CSPH) is associated with severe complications and decompensation of cirrhosis. Liver stiffness measured either by transient elastography (TE) or Shear-wave elastography (SWE) and spleen stiffness by TE might be helpful in the diagnosis...... correlate with portal pressure and can both be used as a non-invasive method to investigate CSPH. Even though external validation is still missing, these algorithms to rule-out and rule-in CSPH using sequential SWE of liver and spleen might change the clinical practice....
Batson, Sarah; Sutton, Alex; Abrams, Keith
2016-01-01
Patients with atrial fibrillation are at a greater risk of stroke and therefore the main goal for treatment of patients with atrial fibrillation is to prevent stroke from occurring. There are a number of different stroke prevention treatments available, including warfarin and novel oral anticoagulants. Previous network meta-analyses of novel oral anticoagulants for stroke prevention in atrial fibrillation acknowledge the limitation of heterogeneity across the included trials but have not explored the impact of potentially important treatment modifying covariates. To explore potentially important treatment modifying covariates using network meta-regression analyses for stroke prevention in atrial fibrillation. We performed a network meta-analysis for the outcome of ischaemic stroke and conducted an exploratory regression analysis considering potentially important treatment modifying covariates. These covariates included the proportion of patients with a previous stroke, proportion of males, mean age, the duration of study follow-up and the patients' underlying risk of ischaemic stroke. None of the covariates explored impacted relative treatment effects relative to placebo. Notably, the exploration of 'study follow-up' as a covariate supported the assumption that difference in trial durations is unimportant in this indication despite the variation across trials in the network. This study is limited by the quantity of data available. Further investigation is warranted, and, as justifying further trials may be difficult, it would be desirable to obtain individual patient level data (IPD) to facilitate an effort to relate treatment effects to IPD covariates in order to investigate heterogeneity. Observational data could also be examined to establish if there are potential trends elsewhere. The approach and methods presented have potentially wide applications within any indication as to highlight the potential benefit of extending decision problems to include additional
Directory of Open Access Journals (Sweden)
Sarah Batson
Full Text Available Patients with atrial fibrillation are at a greater risk of stroke and therefore the main goal for treatment of patients with atrial fibrillation is to prevent stroke from occurring. There are a number of different stroke prevention treatments available, including warfarin and novel oral anticoagulants. Previous network meta-analyses of novel oral anticoagulants for stroke prevention in atrial fibrillation acknowledge the limitation of heterogeneity across the included trials but have not explored the impact of potentially important treatment modifying covariates. To explore potentially important treatment modifying covariates using network meta-regression analyses for stroke prevention in atrial fibrillation. We performed a network meta-analysis for the outcome of ischaemic stroke and conducted an exploratory regression analysis considering potentially important treatment modifying covariates. These covariates included the proportion of patients with a previous stroke, proportion of males, mean age, the duration of study follow-up and the patients' underlying risk of ischaemic stroke. None of the covariates explored impacted relative treatment effects relative to placebo. Notably, the exploration of 'study follow-up' as a covariate supported the assumption that difference in trial durations is unimportant in this indication despite the variation across trials in the network. This study is limited by the quantity of data available. Further investigation is warranted, and, as justifying further trials may be difficult, it would be desirable to obtain individual patient level data (IPD) to facilitate an effort to relate treatment effects to IPD covariates in order to investigate heterogeneity. Observational data could also be examined to establish if there are potential trends elsewhere. The approach and methods presented have potentially wide applications within any indication as to highlight the potential benefit of extending decision problems to
Energy Technology Data Exchange (ETDEWEB)
Sorensen, Anette; Ahring, Birgitte K.; Lubeck, Mette; Ubhayasekera, Wimal; Bruno, Kenneth S.; Culley, David E.; Lubeck, Peter S.
2012-08-20
A newly discovered fungal species, Aspergillus saccharolyticus, was found to produce a culture broth rich in beta-glucosidase activity. In this present work, the main beta-glucosidase of A. saccharolyticus responsible for the efficient hydrolytic activity was identified, isolated, and characterized. Ion exchange chromatography was used to fractionate the culture broth, yielding fractions with high beta-glucosidase activity and only one visible band on an SDS-PAGE gel. Mass spectrometry analysis of this band gave peptide matches to beta-glucosidases from aspergilli. Through a PCR approach using degenerate primers and genome walking, a 2919 base pair sequence encoding the 860 amino acid BGL1 polypeptide was determined. BGL1 of A. saccharolyticus has 91% and 82% identity with BGL1 from Aspergillus aculeatus and BGL1 from Aspergillus niger, respectively, both belonging to Glycoside hydrolase family 3. Homology modeling studies suggested beta-glucosidase activity with preserved retaining mechanism and a wider catalytic pocket compared to other beta-glucosidases. The bgl1 gene was heterologously expressed in Trichoderma reesei QM6a, purified, and characterized by enzyme kinetics studies. The enzyme can hydrolyze cellobiose, pNPG, and cellodextrins. The enzyme showed good thermostability, was stable at 50°C, and at 60°C it had a half-life of approximately 6 hours.
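As a rough illustration of what a half-life of approximately 6 hours at 60°C implies, the sketch below assumes simple first-order (exponential) inactivation. The abstract reports only the half-life, so the decay model and the function name are assumptions, not part of the reported work.

```python
def residual_activity(hours, half_life_hours=6.0):
    """Fraction of initial enzyme activity left after `hours`.

    Assumes first-order inactivation, so activity halves every
    `half_life_hours`. The 6-hour default is the approximate value
    reported for 60 degrees C.
    """
    return 0.5 ** (hours / half_life_hours)


for t in (0, 3, 6, 12, 24):
    print(f"{t:>2} h: {residual_activity(t):.3f} of initial activity")
```

Under this assumption, about a quarter of the initial activity would remain after 12 hours at 60°C; real inactivation kinetics may deviate from a single exponential.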
Directory of Open Access Journals (Sweden)
Ettore Mosca
2017-09-01
Full Text Available Autism spectrum disorder (ASD) is marked by a strong genetic heterogeneity, which is underlined by the low overlap between ASD risk gene lists proposed in different studies. In this context, molecular networks can be used to analyze the results of several genome-wide studies in order to underline those network regions harboring genetic variations associated with ASD, the so-called “disease modules.” In this work, we used a recent network diffusion-based approach to jointly analyze multiple ASD risk gene lists. We defined genome-scale prioritizations of human genes in relation to ASD genes from multiple studies, found significantly connected gene modules associated with ASD and predicted genes functionally related to ASD risk genes. Most of them play a role in synapsis and neuronal development and function; many are related to syndromes that can be in comorbidity with ASD and the remaining are involved in epigenetics, cell cycle, cell adhesion and cancer.
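The abstract does not specify which network diffusion method was used. As one common example of the general idea, the sketch below runs a random walk with restart from a set of seed genes on a toy network; the function name, the graph, the restart probability, and the gene labels are all hypothetical.

```python
def diffuse(adjacency, seeds, restart=0.5, iterations=100):
    """Random walk with restart on an undirected graph.

    adjacency: dict mapping each node to a list of its neighbours
    seeds:     set of nodes where the walk restarts (e.g. risk genes)
    Returns a dict of diffusion scores; higher means closer to the seeds.
    Nodes without neighbours simply drop their walking mass.
    """
    nodes = sorted(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(iterations):
        nxt = {n: restart * start[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                continue
            # Spread the non-restarting mass evenly over the neighbours.
            share = (1.0 - restart) * scores[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        scores = nxt
    return scores


# Toy path network geneA - geneB - geneC with geneA as the only seed.
toy = {"geneA": ["geneB"], "geneB": ["geneA", "geneC"], "geneC": ["geneB"]}
ranking = diffuse(toy, seeds={"geneA"})
print(sorted(ranking.items(), key=lambda kv: -kv[1]))
```

Genes that are not seeds but receive high scores are the kind of candidates a diffusion-based prioritization would flag as functionally close to known risk genes.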
Knowles, Jacky; Kupka, Roland; Dumble, Sam; Garrett, Greg S.; Pandav, Chandrakant S.; Yadav, Kapil; Nahar, Baitun; Touré, Ndeye Khady; Amoaful, Esi Foriwa; Gorstein, Jonathan
2018-01-01
Regression analyses of data from stratified, cluster sample, household iodine surveys in Bangladesh, India, Ghana and Senegal were conducted to identify factors associated with household access to adequately iodised salt. For all countries, in single variable analyses, household salt iodine was significantly different (p < 0.05) between strata (geographic areas with representative data, defined by survey design), and significantly higher (p < 0.05) among households: with better living standard scores, where the respondent knew about iodised salt and/or looked for iodised salt at purchase, using salt bought in a sealed package, or using refined grain salt. Other country-level associations were also found. Multiple variable analyses showed a significant association between salt iodine and strata (p < 0.001) in India, Ghana and Senegal and that salt grain type was significantly associated with estimated iodine content in all countries (p < 0.001). Salt iodine relative to the reference (coarse salt) ranged from 1.3 (95% CI 1.2, 1.5) times higher for fine salt in Senegal to 3.6 (95% CI 2.6, 4.9) times higher for washed and 6.5 (95% CI 4.9, 8.8) times higher for refined salt in India. Sub-national data are required to monitor equity of access to adequately iodised salt. Improving household access to refined iodised salt in sealed packaging, would improve iodine intake from household salt in all four countries in this analysis, particularly in areas where there is significant small-scale salt production. PMID:29671774
Directory of Open Access Journals (Sweden)
Jacky Knowles
2018-04-01
Full Text Available Regression analyses of data from stratified, cluster sample, household iodine surveys in Bangladesh, India, Ghana and Senegal were conducted to identify factors associated with household access to adequately iodised salt. For all countries, in single variable analyses, household salt iodine was significantly different (p < 0.05) between strata (geographic areas with representative data, defined by survey design), and significantly higher (p < 0.05) among households: with better living standard scores, where the respondent knew about iodised salt and/or looked for iodised salt at purchase, using salt bought in a sealed package, or using refined grain salt. Other country-level associations were also found. Multiple variable analyses showed a significant association between salt iodine and strata (p < 0.001) in India, Ghana and Senegal and that salt grain type was significantly associated with estimated iodine content in all countries (p < 0.001). Salt iodine relative to the reference (coarse salt) ranged from 1.3 (95% CI 1.2, 1.5) times higher for fine salt in Senegal to 3.6 (95% CI 2.6, 4.9) times higher for washed and 6.5 (95% CI 4.9, 8.8) times higher for refined salt in India. Sub-national data are required to monitor equity of access to adequately iodised salt. Improving household access to refined iodised salt in sealed packaging, would improve iodine intake from household salt in all four countries in this analysis, particularly in areas where there is significant small-scale salt production.
Clinical significance of pontine high signals identified on magnetic resonance imaging
International Nuclear Information System (INIS)
Watanabe, Masaki; Takahashi, Akira; Arahata, Yutaka; Motegi, Yoshimasa; Furuse, Masahiro.
1993-01-01
Spin-echo magnetic resonance imaging (MRI) was evaluated in 530 cases in order to investigate the clinical significance of pontine high signals. The subjects comprised 109 cases of pontine infarction with high signal on T2-weighted image and low signal on T1-weighted image (PI group), 145 of pontine high signal with high signal on T2-weighted image but normal signal on T1-weighted image (PH group) and 276 of age-matched control without abnormality either on T1 or T2-weighted images (AC group). Subjective complaints such as vertigo-dizziness were more frequent in the PH group than in the PI group. In both the PI and PH groups, periventricular hyperintensity as well as subcortical high signals in the supratentorium were more severe than in the AC group. These degrees were higher in the PI group than in the PH group. In conclusion, PH as well as PI may result from diffuse arteriosclerosis and PH is considered to be an early finding of pontine ischemia. (author)
Clinical significance of pontine high signals identified on magnetic resonance imaging
Energy Technology Data Exchange (ETDEWEB)
Watanabe, Masaki; Takahashi, Akira (Nagoya Univ. (Japan). Faculty of Medicine); Arahata, Yutaka; Motegi, Yoshimasa; Furuse, Masahiro
1993-07-01
Spin-echo magnetic resonance imaging (MRI) was evaluated in 530 cases in order to investigate the clinical significance of pontine high signals. The subjects comprised 109 cases of pontine infarction with high signal on T2-weighted image and low signal on T1-weighted image (PI group), 145 of pontine high signal with high signal on T2-weighted image but normal signal on T1-weighted image (PH group) and 276 of age-matched control without abnormality either on T1 or T2-weighted images (AC group). Subjective complaints such as vertigo-dizziness were more frequent in the PH group than in the PI group. In both the PI and PH groups, periventricular hyperintensity as well as subcortical high signals in the supratentorium were more severe than in the AC group. These degrees were higher in the PI group than in the PH group. In conclusion, PH as well as PI may result from diffuse arteriosclerosis and PH is considered to be an early finding of pontine ischemia. (author)
Clinical Significance of Focal Breast Lesions Incidentally Identified by 18F-FDG PET/CT
International Nuclear Information System (INIS)
Cho, Young Seok; Choi, Joon Young; Lee, Su Jin; Hyun, Seung Hyup; Lee, Ji Young; Choi, Yong; Choe, Yearn Seong; Lee, Kyung Han; Kim, Byung Tae
2008-01-01
We evaluated the incidence and malignant risk of focal breast lesions incidentally detected by 18F-FDG PET/CT. Various PET/CT findings of the breast lesions were also analyzed to improve the differentiation of benign from malignant focal breast lesions. The subjects were 3,768 consecutive 18F-FDG PET/CT exams performed in adult females without a history of breast cancer. A focal breast lesion was defined as a focal 18F-FDG uptake or a focal nodular lesion on CT image irrespective of 18F-FDG uptake in the breasts. The maximum SUV and CT pattern of focal breast lesions were evaluated, and were compared with final diagnosis. The incidence of focal breast lesions on PET/CT in adult female subjects was 1.4% (58 lesions in 53 subjects). In finally confirmed 53 lesions of 48 subjects, 11 lesions of 8 subjects (20.8%) were proven to be malignant. When the PET/CT patterns suggesting benignancy (maximum attenuation value > 75 HU or 20) were added as diagnostic criteria of PET/CT to differentiate benign from malignant breast lesions along with maximum SUV, the area under ROC curve of PET/CT was significantly increased compared with maximum SUV alone (0.680±0.093 vs. 0.786±0.076, p 18F-FDG PET/CT is not low, deserving further diagnostic confirmation. Image interpretation considering both 18F-FDG uptake and PET/CT pattern may be helpful to improve the differentiation of malignant from benign focal breast lesions.
Glass, Edmund R; Dozmorov, Mikhail G
2016-10-06
The goal of many human disease-oriented studies is to detect molecular mechanisms different between healthy controls and patients. Yet, commonly used gene expression measurements from blood samples suffer from variability of cell composition. This variability hinders the detection of differentially expressed genes and is often ignored. Combined with cell counts, heterogeneous gene expression may provide deeper insights into the gene expression differences on the cell type-specific level. Published computational methods use linear regression to estimate cell type-specific differential expression, and a global cutoff to judge significance, such as False Discovery Rate (FDR). Yet, they do not consider many artifacts hidden in high-dimensional gene expression data that may negatively affect linear regression. In this paper we quantify the parameter space affecting the performance of linear regression (sensitivity of cell type-specific differential expression detection) on a per-gene basis. We evaluated the effect of sample sizes, cell type-specific proportion variability, and mean squared error on sensitivity of cell type-specific differential expression detection using linear regression. Each parameter affected variability of cell type-specific expression estimates and, subsequently, the sensitivity of differential expression detection. We provide the R package, LRCDE, which performs linear regression-based cell type-specific differential expression (deconvolution) detection on a gene-by-gene basis. Accounting for variability around cell type-specific gene expression estimates, it computes per-gene t-statistics of differential detection, p-values, t-statistic-based sensitivity, group-specific mean squared error, and several gene-specific diagnostic metrics. The sensitivity of linear regression-based cell type-specific differential expression detection differed for each gene as a function of mean squared error, per group sample sizes, and variability of the proportions
Kabeshova, A; Annweiler, C; Fantino, B; Philip, T; Gromov, V A; Launay, C P; Beauchet, O
2014-06-01
Regression tree (RT) analyses are particularly adapted to explore the risk of recurrent falling according to various combinations of fall risk factors compared to logistic regression models. The aims of this study were (1) to determine which combinations of fall risk factors were associated with the occurrence of recurrent falls in older community-dwellers, and (2) to compare the efficacy of RT and multiple logistic regression model for the identification of recurrent falls. A total of 1,760 community-dwelling volunteers (mean age ± standard deviation, 71.0 ± 5.1 years; 49.4 % female) were recruited prospectively in this cross-sectional study. Age, gender, polypharmacy, use of psychoactive drugs, fear of falling (FOF), cognitive disorders and sad mood were recorded. In addition, the history of falls within the past year was recorded using a standardized questionnaire. Among 1,760 participants, 19.7 % (n = 346) were recurrent fallers. The RT identified 14 nodes groups and 8 end nodes with FOF as the first major split. Among participants with FOF, those who had sad mood and polypharmacy formed the end node with the greatest OR for recurrent falls (OR = 6.06 with p falls (OR = 0.25 with p factors for recurrent falls, the combination most associated with recurrent falls involving FOF, sad mood and polypharmacy. The FOF emerged as the risk factor strongly associated with recurrent falls. In addition, RT and multiple logistic regression were not sensitive enough to identify the majority of recurrent fallers but appeared efficient in detecting individuals not at risk of recurrent falls.
Beurskens, Niek E G; Gorter, Thomas M; Pieper, Petronella G; Hoendermis, Elke S; Bartelds, Beatrijs; Ebels, Tjark; Berger, Rolf M F; Willems, Tineke P; van Melle, Joost P
2017-11-01
Quantification of pulmonary regurgitation (PR) is essential in the management of patients with repaired tetralogy of Fallot (TOF). We sought to evaluate the accuracy of first-line Doppler echocardiography in comparison with cardiac magnetic resonance imaging (MRI) to identify hemodynamic significant PR. Paired cardiac MRI and echocardiographic studies (n = 97) in patients with repaired TOF were retrospectively analyzed. Pressure half time (PHT) and pulmonary regurgitation index (PRi) were measured using continuous wave Doppler. The ratio of the color flow Doppler regurgitation jet width to pulmonary valve (PV) annulus (jet/annulus ratio) and diastolic to systolic time velocity integral (DSTVI; pulsed wave Doppler) were assessed. Accuracy of echocardiographic measurements was tested to identify significant PR as determined by phase-contrast MRI (PR fraction [PRF] ≥ 20%). Mean PRF was 29.4 ± 15.7%. PHT < 100 ms had a sensitivity of 93%, specificity 75%, positive predictive value (PPV) 92% and negative predictive value (NPV) 78% for identifying significant PR (C-statistic 0.82). PRi < 0.77 had sensitivity and specificity of 66% and 54%, respectively (C-statistic 0.63). Jet/annulus ratio ≥1/3 had sensitivity 96%, specificity 75%, PPV 92% and NPV 82% (C-statistic 0.87). DSTVI had sensitivity 84%, specificity 33%, PPV 84% and NPV 40%, (C-statistic 0.56). Combined jet/annulus ratio ≥1/3 and PHT < 100 ms was highly accurate in identifying PRF ≥ 20%, with sensitivity 97% and specificity 100%. PHT and jet/annulus ratio on Doppler echocardiography, especially when combined, are highly accurate in identifying significant PR and therefore seem useful in the follow-up of patients with repaired TOF.
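The accuracy figures quoted in the abstract above (sensitivity, specificity, PPV, NPV) all derive from a 2×2 table of test result against reference standard. A minimal Python sketch of that arithmetic follows; the function name and the counts in the usage note are hypothetical illustrations, not data from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return accuracy measures for a binary test against a reference standard.

    tp/fn: reference-positive cases the test called positive/negative.
    fp/tn: reference-negative cases the test called positive/negative.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true disease detected
        "specificity": tn / (tn + fp),  # share of non-disease correctly cleared
        "ppv": tp / (tp + fp),          # probability of disease given a positive test
        "npv": tn / (tn + fn),          # probability of no disease given a negative test
    }
```

For example, `diagnostic_metrics(tp=90, fp=5, fn=10, tn=15)` uses made-up counts in which 100 patients are reference-positive and 20 are reference-negative. PPV and NPV depend on how common the condition is in the sample, so they do not transfer directly to populations with a different prevalence.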
Auerbach, Raymond K; Chen, Bin; Butte, Atul J
2013-08-01
Biological analysis has shifted from identifying genes and transcripts to mapping these genes and transcripts to biological functions. The ENCODE Project has generated hundreds of ChIP-Seq experiments spanning multiple transcription factors and cell lines for public use, but tools for a biomedical scientist to analyze these data are either non-existent or tailored to narrow biological questions. We present the ENCODE ChIP-Seq Significance Tool, a flexible web application leveraging public ENCODE data to identify enriched transcription factors in a gene or transcript list for comparative analyses. The ENCODE ChIP-Seq Significance Tool is written in JavaScript on the client side and has been tested on Google Chrome, Apple Safari and Mozilla Firefox browsers. Server-side scripts are written in PHP and leverage R and a MySQL database. The tool is available at http://encodeqt.stanford.edu. Contact: abutte@stanford.edu. Supplementary material is available at Bioinformatics online.
Gruner, Christiane; Chan, Raymond H.; Crean, Andrew; Rakowski, Harry; Rowin, Ethan J.; Care, Melanie; Deva, Djeven; Williams, Lynne; Appelbaum, Evan; Gibson, C. Michael; Lesser, John R.; Haas, Tammy S.; Udelson, James E.; Manning, Warren J.; Siminovitch, Katherine
2017-01-01
Aims Cardiovascular magnetic resonance (CMR) has improved diagnostic and management strategies in hypertrophic cardiomyopathy (HCM) by expanding our appreciation for the diverse phenotypic expression. We sought to characterize the prevalence and clinical significance of a recently identified accessory left ventricular (LV) muscle bundle extending from the apex to the basal septum or anterior wall (i.e. apical-basal). Methods and results CMR was performed in 230 genotyped HCM patients (48 ± 15...
DEFF Research Database (Denmark)
Hagger, Virginia; Hendrieckx, Christel; Cameron, Fergus
2017-01-01
OBJECTIVE To establish cut point(s) for the Problem Areas in Diabetes-teen version (PAID-T) scale to identify adolescents with clinically meaningful, elevated diabetes distress. RESEARCH DESIGN AND METHODS Data were available from the Diabetes Management and Impact for Long-term Empowerment...... variables were examined to identify a clinically meaningful threshold for elevated diabetes distress. ANOVA was used to test whether these variables differed by levels of distress. RESULTS Two cut points distinguished none-to-mild (90) diabetes distress....... Moderate distress was experienced by 18% of adolescents and high distress by 36%. Mean depressive symptoms, self-reported HbA1c, and SMBG differed significantly across the three levels of diabetes distress (all P defined two...
Directory of Open Access Journals (Sweden)
Ian Roberts
2012-01-01
Reliable identification of copy number aberrations (CNA) from comparative genomic hybridization data would be improved by the availability of a generalised method for processing large datasets. To this end, we developed swatCGH, a data analysis framework and region detection heuristic for computational grids. swatCGH analyses sequentially displaced (sliding) windows of neighbouring probes and applies adaptive thresholds of varying stringency to identify the 10% of each chromosome that contains the most frequently occurring CNAs. We used the method to analyse a published dataset, comparing data preprocessed using four different DNA segmentation algorithms, and two methods for prioritising the detected CNAs. The consolidated list of the most commonly detected aberrations confirmed the value of swatCGH as a simplified high-throughput method for identifying biologically significant CNA regions of interest.
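The sliding-window idea described above can be illustrated with a short Python sketch. This is not the swatCGH algorithm itself: it assumes a hypothetical input of per-probe aberration frequencies along one chromosome, uses a single fixed window width rather than adaptive thresholds, and simply keeps the top-scoring fraction of windows.

```python
def sliding_window_means(values, width):
    """Mean of each run of `width` neighbouring probes, moved one probe at a time."""
    if width <= 0 or width > len(values):
        raise ValueError("width must be between 1 and len(values)")
    total = sum(values[:width])
    means = [total / width]
    for i in range(width, len(values)):
        # Slide by one probe: add the entering value, drop the leaving one.
        total += values[i] - values[i - width]
        means.append(total / width)
    return means


def top_fraction_windows(values, width, fraction=0.10):
    """Start indices of the highest-scoring windows (at least one is returned)."""
    means = sliding_window_means(values, width)
    keep = max(1, int(len(means) * fraction))
    ranked = sorted(range(len(means)), key=lambda i: means[i], reverse=True)
    return sorted(ranked[:keep])
```

With `values` set to the fraction of samples showing an aberration at each probe, `top_fraction_windows(values, width=3)` returns the window start positions covering roughly the most frequently aberrant 10% of windows. Overlapping windows would still need to be merged into regions, a step this sketch leaves out.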
Huybrechts, Inge; Lioret, Sandrine; Mouratidou, Theodora; Gunter, Marc J; Manios, Yannis; Kersting, Mathilde; Gottrand, Frederic; Kafatos, Anthony; De Henauw, Stefaan; Cuenca-García, Magdalena; Widhalm, Kurt; Gonzales-Gross, Marcela; Molnar, Denes; Moreno, Luis A; McNaughton, Sarah A
2017-01-01
This study aims to examine repeatability of reduced rank regression (RRR) methods in calculating dietary patterns (DP) and cross-sectional associations with overweight (OW)/obesity across European and Australian samples of adolescents. Data from two cross-sectional surveys in Europe (2006/2007 Healthy Lifestyle in Europe by Nutrition in Adolescence study, including 1954 adolescents, 12-17 years) and Australia (2007 National Children's Nutrition and Physical Activity Survey, including 1498 adolescents, 12-16 years) were used. Dietary intake was measured using two non-consecutive, 24-h recalls. RRR was used to identify DP using dietary energy density, fibre density and percentage of energy intake from fat as the intermediate variables. Associations between DP scores and body mass/fat were examined using multivariable linear and logistic regression as appropriate, stratified by sex. The first DP extracted (labelled 'energy dense, high fat, low fibre') explained 47 and 31 % of the response variation in Australian and European adolescents, respectively. It was similar for European and Australian adolescents and characterised by higher consumption of biscuits/cakes, chocolate/confectionery, crisps/savoury snacks, sugar-sweetened beverages, and lower consumption of yogurt, high-fibre bread, vegetables and fresh fruit. DP scores were inversely associated with BMI z-scores in Australian adolescent boys and borderline inverse in European adolescent boys (so as with %BF). Similarly, a lower likelihood for OW in boys was observed with higher DP scores in both surveys. No such relationships were observed in adolescent girls. In conclusion, the DP identified in this cross-country study was comparable for European and Australian adolescents, demonstrating robustness of the RRR method in calculating DP among populations. However, longitudinal designs are more relevant when studying diet-obesity associations, to prevent reverse causality.
Knowles, Emma E M; Carless, Melanie A; de Almeida, Marcio A A; Curran, Joanne E; McKay, D Reese; Sprooten, Emma; Dyer, Thomas D; Göring, Harald H; Olvera, Rene; Fox, Peter; Almasy, Laura; Duggirala, Ravi; Kent, Jack W; Blangero, John; Glahn, David C
2014-01-01
It is well established that risk for developing psychosis is largely mediated by the influence of genes, but identifying precisely which genes underlie that risk has been problematic. Focusing on endophenotypes, rather than illness risk, is one solution to this problem. Impaired cognition is a well-established endophenotype of psychosis. Here we aimed to characterize the genetic architecture of cognition using phenotypically detailed models as opposed to relying on general IQ or individual neuropsychological measures. In so doing we hoped to identify genes that mediate cognitive ability, which might also contribute to psychosis risk. Hierarchical factor models of genetically clustered cognitive traits were subjected to linkage analysis followed by QTL region-specific association analyses in a sample of 1,269 Mexican American individuals from extended pedigrees. We identified four genome wide significant QTLs, two for working and two for spatial memory, and a number of plausible and interesting candidate genes. The creation of detailed models of cognition seemingly enhanced the power to detect genetic effects on cognition and provided a number of possible candidate genes for psychosis. © 2013 Wiley Periodicals, Inc.
Sarig Bahat, Hilla; Chen, Xiaoqi; Reznik, David; Kodesh, Einat; Treleaven, Julia
2015-04-01
Chronic neck pain has been consistently shown to be associated with impaired kinematic control including reduced range, velocity and smoothness of cervical motion, that seem relevant to daily function as in quick neck motion in response to surrounding stimuli. The objectives of this study were: to compare interactive cervical kinematics in patients with neck pain and controls; to explore the new measures of cervical motion accuracy; and to find the sensitivity, specificity, and optimal cutoff values for defining impaired kinematics in those with neck pain. In this cross-sectional study, 33 patients with chronic neck pain and 22 asymptomatic controls were assessed for their cervical kinematic control using interactive virtual reality hardware and customized software utilizing a head mounted display with built-in head tracking. Outcome measures included peak and mean velocity, smoothness (represented by number of velocity peaks (NVP)), symmetry (represented by time to peak velocity percentage (TTPP)), and accuracy of cervical motion. Results demonstrated significant and strong effect-size differences in peak and mean velocities, NVP and TTPP in all directions excluding TTPP in left rotation, and good effect-size group differences in 5/8 accuracy measures. Regression results emphasized the high clinical value of neck motion velocity, with very high sensitivity and specificity (85%-100%), followed by motion smoothness, symmetry and accuracy. These findings suggest cervical kinematics should be evaluated clinically, and screened by the provided cutoff values for identification of relevant impairments in those with neck pain. Such identification of presence or absence of kinematic impairments may direct treatment strategies and additional evaluation when needed. Copyright © 2014 Elsevier Ltd. All rights reserved.
Spady, Richard; Stouli, Sami
2012-01-01
We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while avoiding the need for repairing the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach introduces a mathematical programming characterization of conditional distribution f...
de Tayrac, Marie; Roth, Marie-Paule; Jouanolle, Anne-Marie; Coppin, Hélène; le Gac, Gérald; Piperno, Alberto; Férec, Claude; Pelucchi, Sara; Scotet, Virginie; Bardou-Jacquet, Edouard; Ropert, Martine; Bouvet, Régis; Génin, Emmanuelle; Mosser, Jean; Deugnier, Yves
2015-03-01
Hereditary hemochromatosis (HH) is the most common form of genetic iron loading disease. It is mainly related to the homozygous C282Y/C282Y mutation in the HFE gene that is, however, a necessary but not a sufficient condition to develop clinical and even biochemical HH. This suggests that modifier genes are likely involved in the expressivity of the disease. Our aim was to identify such modifier genes. We performed a genome-wide association study (GWAS) using DNA collected from 474 unrelated C282Y homozygotes. Associations were examined for both quantitative iron burden indices and clinical outcomes with 534,213 single nucleotide polymorphisms (SNP) genotypes, with replication analyses in an independent sample of 748 C282Y homozygotes from four different European centres. One SNP met genome-wide statistical significance for association with transferrin concentration (rs3811647, GWAS p value of 7×10(-9) and replication p value of 5×10(-13)). This SNP, located within intron 11 of the TF gene, had a pleiotropic effect on serum iron (GWAS p value of 4.9×10(-6) and replication p value of 3.2×10(-6)). Both serum transferrin and iron levels were associated with serum ferritin levels, amount of iron removed and global clinical stage (pHFE-associated HH (HFE-HH) patients, identified the rs3811647 polymorphism in the TF gene as the only SNP significantly associated with iron metabolism through serum transferrin and iron levels. Because these two outcomes were clearly associated with the biochemical and clinical expression of the disease, an indirect link between the rs3811647 polymorphism and the phenotypic presentation of HFE-HH is likely. Copyright © 2014 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.
Shariat-Mohaymany, Afshin; Tavakoli-Kashani, Ali; Nosrati, Hadi; Ranjbari, Andisheh
2011-12-01
To identify the significant factors that influence head-on conflicts resulting from dangerous overtaking maneuvers on 2-lane rural roads in Iran. A traffic conflict technique was applied to 12 two-lane rural roads in order to investigate the potential situations for accidents to occur and thus to identify the geometric and traffic factors affecting traffic conflicts. Traffic data were collected via the inductive loop detectors installed on these roads, and geometric characteristics were obtained through field observations. Two groups of data were then analyzed independently by Pearson's chi-square test to evaluate their relationship to traffic conflicts. The independent variables were percentage of time spent following (PTSF), percentage of heavy vehicles, directional distribution of traffic (DDT), mean speed, speed standard deviation, section type, road width, longitudinal slope, holiday or workday, and lighting condition. It was indicated that increasing the PTSF, decreasing the percentage of heavy vehicles, increasing the mean speed (up to 75 km/h), increasing DDT in the range of 0 to 60 percent, and decreasing the standard deviation of speed significantly increased the occurrence of traffic conflicts. It was also revealed that traffic conflicts occur more frequently on curve sections and on workdays. The variables road width, slope, and lighting condition were found to have a minor effect on conflict occurrence. To reduce the number of head-on conflicts on the aforementioned roads, some remedial measures are suggested, such as not constructing long "No Passing" zones and constructing passing lanes where necessary; keeping road width at the standard value; constructing roads with horizontal curves and a high radius and using appropriate road markings and overtaking-forbidden signs where it is impossible to modify the radius; providing enough light and installing caution signs/devices on the roads; and intensifying police control and supervision on workdays
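Pearson's chi-square test, as used in the abstract above to relate each factor to conflict occurrence, compares observed counts in a contingency table with the counts expected under independence. A minimal sketch of the statistic follows; the function name is illustrative and the table in the usage note is invented, not taken from the study.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    `table` is a list of rows of observed counts, e.g. rows = section type
    (curve, straight) and columns = conflict observed (yes, no).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof
```

For a made-up table `[[10, 20], [20, 10]]`, every expected count is 15, so the statistic is 4 × 25/15 with one degree of freedom. Turning the statistic into a p-value requires the chi-square distribution, which is omitted here.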
Differentiating regressed melanoma from regressed lichenoid keratosis.
Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A
2017-04-01
Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
DEFF Research Database (Denmark)
Jørgensen, Lasse Vigel; Huss, Hans Henrik; Dalgaard, Paw
2001-01-01
alcohols, which were produced by microbial activity. Partial least- squares regression of volatile compounds and sensory results allowed for a multiple compound quality index to be developed. This index was based on volatile bacterial metabolites, 1- propanol and 2-butanone, and 2-furan......, 1- penten-3-ol, and 1-propanol. The potency and importance of these compounds was confirmed by gas chromatography- olfactometry. The present study provides valuable information on the bacterial reactions responsible for spoilage off-flavors of cold-smoked salmon, which can be used to develop...
Directory of Open Access Journals (Sweden)
Yeh Cheng-Yu
2009-12-01
Background: Prostate cancer is a leading cancer worldwide and it is characterized by its aggressive metastasis. According to the clinical heterogeneity, prostate cancer displays different stages and grades related to the aggressive metastasis disease. Although numerous studies used microarray analysis and traditional clustering method to identify the individual genes during the disease processes, the important gene regulations remain unclear. We present a computational method for inferring genetic regulatory networks from microarray data automatically with transcription factor analysis and conditional independence testing to explore the potential significant gene regulatory networks that are correlated with cancer, tumor grade and stage in the prostate cancer. Results: To deal with missing values in microarray data, we used a K-nearest-neighbors (KNN) algorithm to determine the precise expression values. We applied web services technology to wrap the bioinformatics toolkits and databases to automatically extract the promoter regions of DNA sequences and predicted the transcription factors that regulate the gene expressions. We adopted microarray datasets consisting of 62 primary tumors and 41 normal prostate tissues from the Stanford Microarray Database (SMD) as a target dataset to evaluate our method. The predicted results showed that the possible biomarker genes related to cancer and denoted the androgen functions and processes may be in the development of the prostate cancer and promote the cell death in cell cycle. Our predicted results showed that sub-networks of genes SREBF1, STAT6 and PBX1 are strongly related to a high extent while ETS transcription factors ELK1, JUN and EGR2 are related to a low extent. Gene SLC22A3 may explain clinically the differentiation associated with the high grade cancer compared with low grade cancer. Enhancer of Zeste Homolog 2 (EZH2) regulated by RUNX1 and STAT3 is correlated to the pathological stage
Yeh, Hsiang-Yuan; Cheng, Shih-Wu; Lin, Yu-Chun; Yeh, Cheng-Yu; Lin, Shih-Fang; Soo, Von-Wun
2009-12-21
Prostate cancer is a leading cancer worldwide and it is characterized by its aggressive metastasis. According to the clinical heterogeneity, prostate cancer displays different stages and grades related to the aggressive metastasis disease. Although numerous studies used microarray analysis and traditional clustering method to identify the individual genes during the disease processes, the important gene regulations remain unclear. We present a computational method for inferring genetic regulatory networks from microarray data automatically with transcription factor analysis and conditional independence testing to explore the potential significant gene regulatory networks that are correlated with cancer, tumor grade and stage in the prostate cancer. To deal with missing values in microarray data, we used a K-nearest-neighbors (KNN) algorithm to determine the precise expression values. We applied web services technology to wrap the bioinformatics toolkits and databases to automatically extract the promoter regions of DNA sequences and predicted the transcription factors that regulate the gene expressions. We adopted microarray datasets consisting of 62 primary tumors and 41 normal prostate tissues from the Stanford Microarray Database (SMD) as a target dataset to evaluate our method. The predicted results showed that the possible biomarker genes related to cancer and denoted the androgen functions and processes may be in the development of the prostate cancer and promote the cell death in cell cycle. Our predicted results showed that sub-networks of genes SREBF1, STAT6 and PBX1 are strongly related to a high extent while ETS transcription factors ELK1, JUN and EGR2 are related to a low extent. Gene SLC22A3 may explain clinically the differentiation associated with the high grade cancer compared with low grade cancer. Enhancer of Zeste Homolog 2 (EZH2) regulated by RUNX1 and STAT3 is correlated to the pathological stage. We provide a computational framework to reconstruct
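The K-nearest-neighbors imputation step mentioned above can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the authors' implementation: rows are taken to be genes, columns samples, missing values are marked with `None`, and distance is the root-mean-square difference over the samples both genes have observed.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries with the mean of the k most similar rows' values."""

    def distance(a, b):
        # Compare only the columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours must have an observed value in column j.
            candidates = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(matrix)
                if m != i and other[j] is not None
            )
            neighbours = [v for d, v in candidates[:k] if d != math.inf]
            if neighbours:
                filled[i][j] = sum(neighbours) / len(neighbours)
    return filled
```

Calling `knn_impute(expression_rows, k=2)` on a small list of lists returns a copy with gaps filled where neighbours exist; entries with no usable neighbour stay `None`. Published variants often weight neighbours by distance, which this sketch does not do.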
McDonald, Shelby Elaine; Shin, Sunny; Corona, Rosalie; Maternick, Anna; Graham-Bermann, Sandra A; Ascione, Frank R; Herbert Williams, James
2016-08-01
The majority of analytic approaches aimed at understanding the influence of environmental context on children's socioemotional adjustment assume comparable effects of contextual risk and protective factors for all children. Using self-reported data from 289 maternal caregiver-child dyads, we examined the degree to which there are differential effects of severity of intimate partner violence (IPV) exposure, yearly household income, and number of children in the family on posttraumatic stress symptoms (PTS) and psychopathology symptoms (i.e., internalizing and externalizing problems) among school-age children between the ages of 7-12 years. A regression mixture model identified three latent classes that were primarily distinguished by differential effects of IPV exposure severity on PTS and psychopathology symptoms: (1) asymptomatic with low sensitivity to environmental factors (66% of children), (2) maladjusted with moderate sensitivity (24%), and (3) highly maladjusted with high sensitivity (10%). Children with mothers who had higher levels of education were more likely to be in the maladjusted with moderate sensitivity group than the asymptomatic with low sensitivity group. Latino children were less likely to be in both maladjusted groups compared to the asymptomatic group. Overall, the findings suggest differential effects of family environmental factors on PTS and psychopathology symptoms among children exposed to IPV. Implications for research and practice are discussed. Copyright © 2016 Elsevier Ltd. All rights reserved.
Angus, Simon D; Piotrowska, Monika Joanna
2014-01-01
Multi-dose radiotherapy protocols (fraction dose and timing) currently used in the clinic are the product of human selection based on habit, received wisdom, physician experience and intra-day patient timetabling. However, due to combinatorial considerations, the potential treatment protocol space for a given total dose or treatment length is enormous, even for relatively coarse search; well beyond the capacity of traditional in-vitro methods. In contrast, high fidelity numerical simulation of tumor development is well suited to the challenge. Building on our previous single-dose numerical simulation model of EMT6/Ro spheroids, a multi-dose irradiation response module is added and calibrated to the effective dose arising from 18 independent multi-dose treatment programs available in the experimental literature. With the developed model a constrained, non-linear, search for better performing candidate protocols is conducted within the vicinity of two benchmarks by genetic algorithm (GA) techniques. After evaluating less than 0.01% of the potential benchmark protocol space, candidate protocols were identified by the GA which conferred an average of 9.4% (max benefit 16.5%) and 7.1% (13.3%) improvement (reduction) on tumour cell count compared to the two benchmarks, respectively. Noticing that a convergent phenomenon of the top performing protocols was their temporal synchronicity, a further series of numerical experiments was conducted with periodic time-gap protocols (10 h to 23 h), leading to the discovery that the performance of the GA search candidates could be replicated by 17-18 h periodic candidates. Further dynamic irradiation-response cell-phase analysis revealed that such periodicity cohered with latent EMT6/Ro cell-phase temporal patterning. Taken together, this study provides powerful evidence towards the hypothesis that even simple inter-fraction timing variations for a given fractional dose program may present a facile, and highly cost-effective means
Zhang, Hongyang; Welch, William J.; Zamar, Ruben H.
2017-01-01
Tomal et al. (2015) introduced the notion of "phalanxes" in the context of rare-class detection in two-class classification problems. A phalanx is a subset of features that work well for classification tasks. In this paper, we propose a different class of phalanxes for application in regression settings. We define a "Regression Phalanx" - a subset of features that work well together for prediction. We propose a novel algorithm which automatically chooses Regression Phalanxes from high-dimensi...
Stevens, John R.; Jones, Todd R.; Lefevre, Michael; Ganesan, Balasubramanian; Weimer, Bart C.
2017-01-01
Microbial community analysis experiments to assess the effect of a treatment intervention (or environmental change) on the relative abundance levels of multiple related microbial species (or operational taxonomic units) simultaneously using high throughput genomics are becoming increasingly common. Within the framework of the evolutionary phylogeny of all species considered in the experiment, this translates to a statistical need to identify the phylogenetic branches that exhibit a significan...
Sasi, Sharath P.; Bae, Sanggyu; Song, Jin; Perepletchikov, Aleksandr; Schneider, Douglas; Carrozza, Joseph; Yan, Xinhua; Kishore, Raj; Enderling, Heiko; Goukassian, David A.
2014-01-01
Tumor necrosis factor-alpha (TNF) binds to two receptors: TNFR1/p55-cytotoxic and TNFR2/p75-pro-survival. We have shown that tumor growth in p75 knockout (KO) mice was decreased more than 2-fold in Lewis lung carcinoma (LLCs). We hypothesized that selective blocking of TNFR2/p75 LLCs may sensitize them to TNF-induced apoptosis and affect the tumor growth. We implanted intact and p75 knockdown (KD)-LLCs (>90%, using shRNA) into wild type (WT) mice flanks. On day 8 post-inoculation, recombinant murine (rm) TNF-α (12.5 ng/gr of body weight) or saline was injected twice daily for 6 days. Tumor volumes (tV) were measured daily and tumor weights (tW) on day 15, when study was terminated due to large tumors in LLC+TNF group. Tubular bones, spleens and peripheral blood (PB) were examined to determine possible TNF toxicity. There was no significant difference in tV or tW between LLC minus (-) TNF and p75KD/LLC-TNF tumors. Compared to 3 control groups, p75KD/LLC+TNF showed >2-5-fold decreases in tV (ptumors were 100% necrotic, the remaining revealed 40-60% necrosis. No toxicity was detected in bone marrow, spleen and peripheral blood. We concluded that blocking TNFR2/p75 in LLCs combined with intra-tumoral rmTNF injections inhibit LLC tumor growth. This could represent a novel and effective therapy against lung neoplasms and a new paradigm in cancer therapeutics. PMID:24664144
Energy Technology Data Exchange (ETDEWEB)
Bitencourt, Almir G.V.; Lima, Eduardo N.P.; Macedo, Bruna R.C.; Conrado, Jorge L.F.A.; Marques, Elvira F.; Chojniak, Rubens [A C Camargo Cancer Center-Department of Imaging, Sao Paulo, SP (Brazil)
2017-05-15
To evaluate the diagnostic accuracy of positron emission mammography (PEM) for identifying malignant lesions in patients with suspicious microcalcifications detected on mammography. A prospective, single-centre study that evaluated 40 patients with suspicious calcifications at mammography and indication for percutaneous or surgical biopsy, with mean age of 56.4 years (range: 28-81 years). Patients who agreed to participate in the study underwent PEM with 18F-fluorodeoxyglucose before the final histological evaluation. PEM findings were compared with mammography and histological findings. Most calcifications (n = 34; 85.0 %) were classified as BIRADS 4. On histology, there were 25 (62.5 %) benign and 15 (37.5 %) malignant lesions, including 11 (27.5 %) ductal carcinoma in situ (DCIS) and 4 (10 %) invasive carcinomas. On subjective analysis, PEM was positive in 15 cases (37.5 %) and most of these cases (n = 14; 93.3 %) were confirmed as malignant on histology. There was one false-positive result, which corresponded to a fibroadenoma, and one false negative, which corresponded to an intermediate-grade DCIS. PEM had a sensitivity of 93.3 %, specificity of 96.0 % and accuracy of 95 %. PEM was able to identify all invasive carcinomas and high-grade DCIS (nuclear grade 3) in the presented sample, suggesting that this method may be useful for further evaluation of patients with suspected microcalcifications. (orig.)
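The accuracy figures reported in the preceding abstract can be checked against a 2x2 table. The sketch below is illustrative only: the function name `diagnostic_metrics` is hypothetical, and the cell counts (14 true positives, 1 false positive, 1 false negative, 24 true negatives) are inferred from the abstract's totals of 15 malignant and 25 benign lesions rather than stated there directly.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return basic accuracy measures for a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),            # detected among truly malignant
        "specificity": tn / (tn + fp),            # cleared among truly benign
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
    }


# Counts inferred (an assumption) from the abstract: 15 malignant, 25 benign,
# one false positive (fibroadenoma), one false negative (intermediate-grade DCIS).
metrics = diagnostic_metrics(tp=14, fp=1, fn=1, tn=24)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f} %")
```

Under these inferred counts the sensitivity, specificity and accuracy come out at 93.3 %, 96.0 % and 95.0 %, matching the values quoted in the abstract.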
Sakthivel, Srinivasan; Zatkova, Andrea; Nemethova, Martina; Surovy, Milan; Kadasi, Ludevit; Saravanan, Madurai P
2014-05-01
Alkaptonuria (AKU) is an autosomal recessive disorder; caused by the mutations in the homogentisate 1, 2-dioxygenase (HGD) gene located on Chromosome 3q13.33. AKU is a rare disorder with an incidence of 1: 250,000 to 1: 1,000,000, but Slovakia and the Dominican Republic have a relatively higher incidence of 1: 19,000. Our study focused on studying the frequency of AKU and identification of HGD gene mutations in nomads. HGD gene sequencing was used to identify the mutations in alkaptonurics. For the past four years, from subjects suspected to be clinically affected, we found 16 positive cases among a randomly selected cohort of 41 Indian nomads (Narikuravar) settled in the specific area of Tamil Nadu, India. HGD gene mutation analysis showed that 11 of these patients carry the same homozygous splicing mutation c.87 + 1G > A; in five cases, this mutation was found to be heterozygous, while the second AKU-causing mutation was not identified in these patients. This result indicates that the founder effect and high degree of consanguineous marriages have contributed to AKU among nomads. Eleven positive samples were homozygous for a novel mutation c.87 + 1G > A, that abolishes an intron 2 donor splice site and most likely causes skipping of exon 2. The prevalence of AKU observed earlier seems to be highly increased in people of nomadic origin. © 2014 John Wiley & Sons Ltd/University College London.
Yang, Zheng Rong; Bullifent, Helen L.; Moore, Karen; Paszkiewicz, Konrad; Saint, Richard J.; Southern, Stephanie J.; Champion, Olivia L.; Senior, Nicola J.; Sarkar-Tyson, Mitali; Oyston, Petra C. F.; Atkins, Timothy P.; Titball, Richard W.
2017-02-01
Massively parallel sequencing technology coupled with saturation mutagenesis has provided new and global insights into gene functions and roles. At a simplistic level, the frequency of mutations within genes can indicate the degree of essentiality. However, this approach neglects to take account of the positional significance of mutations - the function of a gene is less likely to be disrupted by a mutation close to the distal ends. Therefore, a systematic bioinformatics approach to improve the reliability of essential gene identification is desirable. We report here a parametric model which introduces a novel mutation feature together with a noise trimming approach to predict the biological significance of Tn5 mutations. We show improved performance of essential gene prediction in the bacterium Yersinia pestis, the causative agent of plague. This method would have broad applicability to other organisms and to the identification of genes which are essential for competitiveness or survival under a broad range of stresses.
Yayan, Josef
2012-01-01
Patients with unstable angina or myocardial infarction are at risk of acute kidney injury, which may be aggravated by the iodine-containing contrast agent used during coronary angiography; however, the relationship between these two conditions remains unclear. The current study investigated the relationship between acute kidney injury and coronary heart disease prior to coronary angiography. All patients were evaluated after undergoing coronary angiography in the cardiac catheterization laboratory of the Vinzentius Hospital in Landau, Germany, in 2011. The study group included patients with both acute coronary heart disease and acute kidney injury (as defined according to the classification of the Acute Kidney Injury Group); the control group included patients without acute coronary heart disease. Serum creatinine profiles were evaluated in all patients, as were a variety of demographic and health characteristics. Of the 303 patients examined, 201 (66.34%) had coronary artery disease. Of these, 38 (18.91%) also had both acute kidney injury and acute coronary heart disease prior to and after coronary angiography, and of which in turn 34 (16.91%) had both acute kidney injury and acute coronary heart disease only prior to the coronary angiography. However, the occurrence of acute kidney injury was not significantly related to the presence of coronary heart disease (P = 0.95, Chi-square test). The results of this study indicate that acute kidney injury is not linked to acute coronary heart disease. However, physicians should be aware that many coronary heart patients may develop kidney injury while hospitalized for angiography.
Matson, Johnny L.; Kozlowski, Alison M.
2010-01-01
Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…
Yu, Huibin; Song, Yonghui; Liu, Ruixia; Pan, Hongwei; Xiang, Liancheng; Qian, Feng
2014-10-01
The stabilization of latent tracers of dissolved organic matter (DOM) of wastewater was analyzed by three-dimensional excitation-emission matrix (EEM) fluorescence spectroscopy coupled with self-organizing map and classification and regression tree analysis (CART) in wastewater treatment performance. DOM of water samples collected from primary sedimentation, anaerobic, anoxic, oxic and secondary sedimentation tanks in a large-scale wastewater treatment plant contained four fluorescence components: tryptophan-like (C1), tyrosine-like (C2), microbial humic-like (C3) and fulvic-like (C4) materials extracted by self-organizing map. These components showed good positive linear correlations with dissolved organic carbon of DOM. C1 and C2 were representative components in the wastewater, and they were removed to a higher extent than those of C3 and C4 in the treatment process. C2 was a latent parameter determined by CART to differentiate water samples of oxic and secondary sedimentation tanks from the successive treatment units, indirectly proving that most of tyrosine-like material was degraded by anaerobic microorganisms. C1 was an accurate parameter to comprehensively separate the samples of the five treatment units from each other, indirectly indicating that tryptophan-like material was decomposed by anaerobic and aerobic bacteria. EEM fluorescence spectroscopy in combination with self-organizing map and CART analysis can be a nondestructive effective method for characterizing structural component of DOM fractions and monitoring organic matter removal in wastewater treatment process. Copyright © 2014 Elsevier Ltd. All rights reserved.
Zewude, Bereket Tessema; Ashine, Kidus Meskele
2016-01-01
An attempt has been made to assess and identify the major variables that influence student academic achievement at college of natural and computational science of Wolaita Sodo University in Ethiopia. Study time, peer influence, securing first choice of department, arranging study time outside class, amount of money received from family, good life…
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Engvall, Karin; Hult, M; Corner, R; Lampa, E; Norbäck, D; Emenius, G
2010-01-01
The aim was to develop a new model to identify residential buildings with higher frequencies of "SBS" than expected, "risk buildings". In 2005, 481 multi-family buildings with 10,506 dwellings in Stockholm were studied by a new stratified random sampling. A standardised self-administered questionnaire was used to assess "SBS", atopy and personal factors. The response rate was 73%. Statistical analysis was performed by multiple logistic regression. Dwellers owning their building reported less "SBS" than those renting. There was a strong relationship between socio-economic factors and ownership. The regression model ended up with high explanatory values for age, gender, atopy and ownership. Applying our model, 9% of all residential buildings in Stockholm were classified as "risk buildings", with the highest proportion in houses built 1961-1975 (26%) and the lowest in houses built 1985-1990 (4%). To identify "risk buildings", it is necessary to adjust for ownership and population characteristics.
Nils Breilid; Eva Dyrnes
2017-01-01
Introduction: This study deals with young people receiving special needs education in schools and their transition to lasting employment in the private or public sector. Through a qualitative approach, the article aims at “identifying significant factors which can contribute to successful transitions from school to lasting employment affiliation for pupils in vocational training programs”. Theoretical approach: The theoretical approach of this article is descriptions and interpretation of the Norw...
Manz, Judith; Rodríguez, Elke; ElSharawy, Abdou; Oesau, Eva-Maria; Petersen, Britt-Sabina; Baurecht, Hansjörg; Mayr, Gabriele; Weber, Susanne; Harder, Jürgen; Reischl, Eva; Schwarz, Agatha; Novak, Natalija; Franke, Andre; Weidinger, Stephan
2016-12-01
Gene-mapping studies have consistently identified a susceptibility locus for atopic dermatitis and other inflammatory diseases on chromosome band 11q13.5, with the strongest association observed for a common variant located in an intergenic region between the two annotated genes C11orf30 and LRRC32. Using a targeted resequencing approach we identified low-frequency and rare missense mutations within the LRRC32 gene encoding the protein GARP, a receptor on activated regulatory T cells that binds latent transforming growth factor-β. Subsequent association testing in more than 2,000 atopic dermatitis patients and 2,000 control subjects showed a significant excess of these LRRC32 variants in individuals with atopic dermatitis. Structural protein modeling and bioinformatic analysis predicted a disruption of protein transport upon these variants, and overexpression assays in CD4 + CD25 - T cells showed a significant reduction in surface expression of the mutated protein. Consistently, flow cytometric (FACS) analyses of different T-cell subtypes obtained from atopic dermatitis patients showed a significantly reduced surface expression of GARP and a reduced conversion of CD4 + CD25 - T cells into regulatory T cells, along with lower expression of latency-associated protein upon stimulation in carriers of the LRRC32 A407T variant. These results link inherited disturbances of transforming growth factor-β signaling with atopic dermatitis risk. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Hu, Liyuan; Gu, Qiufang; Zhu, Zhen; Yang, Chenhao; Chen, Chao; Cao, Yun; Zhou, Wenhao
2014-08-01
Hypoglycaemia is a significant problem in high-risk neonates and predominant parieto-occipital lobe involvement has been observed after severe hypoglycaemic insult. We explored the use of flash visual evoked potentials (FVEP) in detecting parieto-occipital lobe involvement after significant hypoglycaemia. Full-term neonates (n = 15) who underwent FVEP from January 2008 to May 2013 were compared with infants (n = 11) without hypoglycaemia or parietal-occipital lobe injury. Significant hypoglycaemia was defined as being symptomatic or needing steroids, glucagon or a glucose infusion rate of ≥12 mg/kg/min. The hypoglycaemia group exhibited delayed latency of the first positive waveform on FVEP. The initial detected time for hypoglycaemia was later in the eight subjects with seizures (median 51-h-old) than those without (median 22-h-old) (P = 0.003). Magnetic resonance imaging showed that 80% of the hypoglycaemia group exhibited occipital-lobe injuries, and they were more likely to exhibit abnormal FVEP morphology (P = 0.007) than the controls. FVEP exhibited 100% sensitivity, but only 25% specificity, for detecting injuries to the parieto-occipital lobes. Flash visual evoked potential (FVEP) was sensitive, but not sufficiently specific, in identifying parieto-occipital lobe injuries among term neonates exposed to significant hypoglycaemia. Larger studies exploring the potential role of FVEP in neonatal hypoglycaemia are required. ©2014 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.
Directory of Open Access Journals (Sweden)
Guo Rong
2012-06-01
Background: Fragmented QRS (fQRS) complexes are novel electrocardiographic signals, which reflect myocardial conduction delays in patients with coronary artery disease (CAD). The importance of fQRS complexes in identifying culprit vessels was evaluated in this retrospective study. Methods: A 12-lead surface electrocardiogram was obtained in 183 patients who had non-ST-elevation myocardial infarction (NSTEMI) and subsequently underwent coronary angiography (CAG). On the basis of the frequency of fQRS complexes, indices such as sensitivity, specificity, positive and negative predictive values, and likelihood ratio were evaluated to determine the ability of fQRS complexes to identify the culprit vessels. Results: Among the patients studied, elderly patients (age ≥ 65 years) and those with diabetes had a significantly higher frequency of fQRS complexes (p = 0.005, p = 0.003, respectively). The fQRS complexes recorded in the 4 precordial leads had the highest specificity (81.8%) for identifying the culprit vessel (left anterior descending artery). However, the specificity of fQRS complexes to identify lesions in the left circumflex and right coronary arteries was lower for the inferior and lateral leads than for the limb leads (65.5% versus 71.7%); however, the limb leads had higher sensitivity (92.3% versus 89.4%). The total sensitivity and specificity of fQRS (77.1% and 71.5%) were higher than those values for ischemic T-waves. Conclusions: The frequency of fQRS complexes was higher in elderly and diabetic patients with NSTEMI. The frequency of fQRS complexes recorded in each of the ECG leads can be used to identify culprit vessels in patients with NSTEMI.
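The likelihood ratio named among the indices in the preceding abstract follows directly from sensitivity and specificity. As a hedged illustration, the sketch below applies the standard definitions to the total fQRS sensitivity and specificity quoted in the abstract; the function name `likelihood_ratios` is hypothetical, and the resulting ratios are computed here, not reported by the study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    lr_positive = sensitivity / (1.0 - specificity)   # how much a positive test raises the odds
    lr_negative = (1.0 - sensitivity) / specificity   # how much a negative test lowers the odds
    return lr_positive, lr_negative


# Total fQRS sensitivity and specificity as quoted in the abstract.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.771, specificity=0.715)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

With these inputs the positive likelihood ratio is roughly 2.7 and the negative likelihood ratio roughly 0.32.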
Directory of Open Access Journals (Sweden)
Christiani David C
2010-12-01
Background: Air pollution is associated with adverse human health, but mechanisms through which pollution exerts effects remain to be clarified. One suggested pathway is that pollution causes oxidative stress. If so, oxidative stress-related genotypes may modify the oxidative response defenses to pollution exposure. Methods: We explored the potential pathway by examining whether an array of oxidative stress-related genes (twenty single nucleotide polymorphisms, SNPs, in nine genes) modified associations of pollutants (organic carbon (OC), ozone and sulfate) with urinary 8-hydroxy-2'-deoxyguanosine (8-OHdG), a biomarker of oxidative stress, among 320 aging men. We used a Multiple Testing Procedure in R modified by our team to identify the significance of the candidate genes adjusting for a priori covariates. Results: We found that glutathione S-transferase P1 (GSTP1, rs1799811), M1 (GSTM1) and catalase (rs2284367) and group-specific component (GC, rs2282679, rs1155563) significantly or marginally significantly modified effects of OC and/or sulfate, with larger effects among those carrying the wild type of GSTP1, catalase, non-wild type of GC and the non-null of GSTM1. Conclusions: Polymorphisms of oxidative stress-related genes modified effects of OC and/or sulfate on 8-OHdG, suggesting that effects of OC or sulfate on 8-OHdG and other endpoints may be through the oxidative stress pathway.
Directory of Open Access Journals (Sweden)
Nils Breilid
2017-06-01
Introduction: This study deals with young people receiving special needs education in schools and their transition to lasting employment in the private or public sector. Through a qualitative approach, the article aims at “identifying significant factors which can contribute to successful transitions from school to lasting employment affiliation for pupils in vocational training programs”. Theoretical approach: The theoretical approach of this article comprises description and interpretation of the Norwegian educational legislation and the theory of «empowerment». These theoretical perspectives will be included in the empirical discussion. Method: The methodological approach is qualitative, based on four semi-structured interviews with young informants who have completed upper secondary school in a vocational education program and have had a minimum of one year of training in an enterprise. Thematic analysis of the data was conducted using NVivo 11, a computer program suitable for qualitative data analysis and mixed research methods. Results and discussion: Through thematic analysis of the data, we found three significant factors contributing to successful transitions from school to lasting employment: (a) application and development of the pupil’s competence (mastery and meaning); (b) the significance of relations, communication and well-functioning socio-ecological networks; (c) the importance of pupil participation and involvement in decision making.
Multicollinearity and Regression Analysis
Daoud, Jamal I.
2017-12-01
In regression analysis, correlation between the response and the predictor(s) is expected, but correlation among the predictors themselves is undesirable. The number of predictors included in a regression model depends on many factors, among them historical data and experience. In the end, the selection of the most important predictors is a judgment left to the researcher. Multicollinearity is a phenomenon in which two or more predictors are correlated; when this happens, the standard errors of the coefficients increase [8]. Increased standard errors mean that the coefficients for some or all independent variables may be found not to be significantly different from zero. In other words, by inflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on multicollinearity, its causes and its consequences for the reliability of the regression model.
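The inflation of standard errors described in the preceding abstract is commonly quantified with the variance inflation factor (VIF), a measure the abstract itself does not name. The sketch below is a minimal illustration for the special case of exactly two predictors, where the VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between them. The function names and the toy data are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a model with exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Toy data (hypothetical): the two predictors have correlation r = 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_predictors(x1, x2)
print(f"VIF = {vif:.2f}, standard errors inflated by a factor of {math.sqrt(vif):.2f}")
```

For the toy data the VIF is about 2.78, so each coefficient's standard error is about 1.67 times what it would be with uncorrelated predictors. With more than two predictors the VIF of each predictor is obtained by regressing it on all the others, which this sketch does not cover.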
Bekhouche, Mourad; Leduc, Cedric; Dupont, Laura; Janssen, Lauriane; Delolme, Frederic; Vadon-Le Goff, Sandrine; Smargiasso, Nicolas; Baiwir, Dominique; Mazzucchelli, Gabriel; Zanella-Cleon, Isabelle; Dubail, Johanne; De Pauw, Edwin; Nusgens, Betty; Hulmes, David J S; Moali, Catherine; Colige, Alain
2016-05-01
A disintegrin and metalloproteinase with thrombospondin type I motif (ADAMTS)2, 3, and 14 are collectively named procollagen N-proteinases (pNPs) because of their specific ability to cleave the aminopropeptide of fibrillar procollagens. Several reports also indicate that they could be involved in other biological processes, such as blood coagulation, development, and male fertility, but the potential substrates associated with these activities remain unknown. Using the recently described N-terminal amine isotopic labeling of substrate approach, we analyzed the secretomes of human fibroblasts and identified 8, 17, and 22 candidate substrates for ADAMTS2, 3, and 14, respectively. Among these newly identified substrates, many are components of the extracellular matrix and/or proteins related to cell signaling such as latent TGF-β binding protein 1, TGF-β RIII, and dickkopf-related protein 3. Candidate substrates for the 3 ADAMTS have been biochemically validated in different contexts, and the implication of ADAMTS2 in the control of TGF-β activity has been further demonstrated in human fibroblasts. Finally, the cleavage site specificity was assessed showing a clear and unique preference for nonpolar or slightly hydrophobic amino acids. This work shows that the activities of the pNPs extend far beyond the classically reported processing of the aminopropeptide of fibrillar collagens and that they should now be considered as multilevel regulators of matrix deposition and remodeling.-Bekhouche, M., Leduc, C., Dupont, L., Janssen, L., Delolme, F., Vadon-Le Goff, S., Smargiasso, N., Baiwir, D., Mazzucchelli, G., Zanella-Cleon, I., Dubail, J., De Pauw, E., Nusgens, B., Hulmes, D. J. S., Moali, C., Colige, A. Determination of the substrate repertoire of ADAMTS2, 3, and 14 significantly broadens their functions and identifies extracellular matrix organization and TGF-β signaling as primary targets. © FASEB.
Reid, Jennifer; Cormack, Donna; Crowe, Marie
2016-03-01
Despite increased focus in New Zealand on reducing health inequities between Māori and New Zealand European ethnic groups, research on barriers and facilitators to primary healthcare access for Māori remains limited. In particular, there has been little interrogation of the significance of social-assignment of ethnicity for Māori in relation to engagement with predominantly non-Māori primary healthcare services and providers. A qualitative study was undertaken with a subsample (n = 40) of the broader Hauora Manawa Study to examine experiences of accessing and engaging with primary healthcare among adult urban Māori. Thematic analysis of in-depth interviews identified that participants perceived social-assignment as New Zealand European as an efficacious form of capital when interacting with predominantly non-Māori health professionals. Skin colour that was 'white' or was perceived to identify Māori as belonging to the 'dominant' New Zealand European ethnic group was reported as broadly advantageous and protective. In contrast, social-assignment as Māori was seen to be associated with risk of exposure to differential and discriminatory healthcare. Reducing the negative impacts of racialisation in a (neo)colonial society where 'White' cultural capital dominates requires increased recognition of the health-protective advantages of 'White' privilege and concomitant risks associated with socially-assigned categorisation of ethnicity as non-'White'. © The Author(s) 2015.
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Pirastu, Nicola; Kooyman, Maarten; Traglia, Michela; Robino, Antonietta; Willems, Sara M; Pistis, Giorgio; d'Adamo, Pio; Amin, Najaf; d'Eustacchio, Angela; Navarini, Luciano; Sala, Cinzia; Karssen, Lennart C; van Duijn, Cornelia; Toniolo, Daniela; Gasparini, Paolo
2014-01-01
Coffee, one of the most popular beverages in the world, contains many different physiologically active compounds with a potential impact on people's health. Despite the recent attention given to the genetic basis of its consumption, very little has been done to understand the genes influencing coffee preference among different individuals. Given its markedly bitter taste, we tested whether variants in bitter receptor genes (TAS2Rs) affect coffee liking. To this end, 4066 people from different parts of Europe and Central Asia completed a field questionnaire on coffee liking and were then recruited and included in the study. Eighty-eight SNPs covering the 25 TAS2R genes were selected from the available imputed ones and used to run association analysis for coffee liking. A significant association was detected with three SNPs: one synonymous and two functional variants (W35S and H212R) on the TAS2R43 gene. Both variants have been shown to greatly reduce in vitro protein activity. Surprisingly, the wild type allele, which corresponds to the functional form of the protein, is associated with higher liking of coffee. Since the hTAS2R43 receptor is sensitive to caffeine, we tested whether the detected variants produced differences in caffeine bitter perception in a subsample of people from the FVG cohort. We found a significant association between differences in caffeine perception and the H212R variant but not with W35S, which suggests that the effect of the TAS2R43 gene on coffee liking is mediated by caffeine and in particular by the H212R variant. No other significant association was found with other TAS2R genes. In conclusion, the present study opens new perspectives in the understanding of coffee liking. Further studies are needed to clarify the role of the TAS2R43 gene in coffee hedonics and to identify which other genes and pathways are involved in its genetics.
Mushtaq, Ammara; Chen, Derrick J; Strand, Gregory J; Dylla, Brenda L; Cole, Nicolynn C; Mandrekar, Jayawant; Patel, Robin
2016-07-01
With the advent of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS), most Gram-positive rods (GPRs) are readily identified; however, their clinical relevance in blood cultures remains unclear. Herein, we assessed the clinical significance of GPRs isolated from blood and identified in the era of MALDI-TOF MS. A retrospective chart review of patients presenting to the Mayo Clinic, Rochester, MN, from January 1, 2013, to October 13, 2015, was performed. Any episode of a positive blood culture for a GPR was included. We assessed the number of bottles positive for a given isolate, time to positivity of blood cultures, patient age, medical history, interpretation of culture results by the healthcare team and whether infectious diseases consultation was obtained. We also evaluated the susceptibility profiles of a larger collection of GPRs tested in the clinical microbiology laboratory of the Mayo Clinic, Rochester, MN from January 1, 2013, to October 31, 2015. There were a total of 246 GPRs isolated from the blood of 181 patients during the study period. 56% (n = 101) were deemed contaminants by the healthcare team and were not treated; 33% (n = 59) were clinically determined to represent true bacteremia and were treated; and 8% (n = 14) were considered of uncertain significance, with patients prescribed treatment regardless. Patient characteristics associated with an isolate being treated on univariate analysis included younger age (P = 0.02), identification to the species level (P = 0.02), higher number of positive blood culture sets (P < 0.0001), lower time to positivity (P < 0.0001), immunosuppression (P = 0.03), and recommendation made by an infectious disease consultant (P = 0.0005). On multivariable analysis, infectious diseases consultation (P = 0.03), higher number of positive blood culture sets (P = 0.0005) and lower time to positivity (P = 0.03) were associated with an isolate being treated. 100, 83, 48 and 34% of GPRs
Directory of Open Access Journals (Sweden)
Mok Tik
2014-06-01
This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable) is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables) and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables) also as vectors; hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to represent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
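To make the idea of encoding 2-D vectors as complex numbers concrete, the following minimal Python sketch fits a single complex coefficient by least squares. It is a simplified stand-in and not the estimator described in the abstract: the one-predictor model y = beta * x, the function name `fit_complex_coefficient`, and the toy data are all assumptions made for illustration.

```python
def fit_complex_coefficient(xs, ys):
    """Least-squares estimate of beta in y = beta * x for complex data.

    Each 2-D vector (east, north) is encoded as east + 1j * north.
    Minimizing sum(|y - beta * x|**2) over complex beta gives
    beta = sum(conj(x) * y) / sum(|x|**2).
    """
    numerator = sum(x.conjugate() * y for x, y in zip(xs, ys))
    denominator = sum(abs(x) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical predictor vectors, written as complex numbers.
xs = [1 + 2j, -3 + 1j, 2 - 1j, 0.5 + 0.5j]

# Hypothetical responses: every predictor rotated by 45 degrees and
# scaled by about 0.707, which is what multiplying by 0.5 + 0.5j does.
true_beta = 0.5 + 0.5j
ys = [true_beta * x for x in xs]

beta_hat = fit_complex_coefficient(xs, ys)
print("estimated coefficient:", beta_hat)
print("scale:", abs(beta_hat))
```

Because the coefficient is itself a complex number, it carries both a scale (its modulus) and a rotation (its argument), which is one way to read the abstract's statement that the coefficients are solved "also as vectors".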
Mechanisms of neuroblastoma regression
Brodeur, Garrett M.; Bagatell, Rochelle
2014-01-01
Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
Morgan, Matthew J; Hunter, David; Pietsch, Rod; Osborne, William; Keogh, J Scott
2008-08-01
The iconic and brightly coloured Australian northern corroboree frog, Pseudophryne pengilleyi, and the southern corroboree frog, Pseudophryne corroboree, are critically endangered and may be extinct in the wild within 3 years. We have assembled samples that cover the current range of both species and applied hypervariable microsatellite markers and mitochondrial DNA sequences to assess the levels and patterns of genetic variation. The four loci used in the study were highly variable: the total number of alleles observed ranged from 13 to 30 and the average number of alleles per locus was 19. Expected heterozygosity of the four microsatellite loci across all populations was high and varied between 0.830 and 0.935. Bayesian clustering analyses in STRUCTURE strongly supported four genetically distinct populations, which correspond exactly to the four main allopatric geographical regions in which the frogs are currently found. Individual analyses performed on the separate regions showed that breeding sites within these four regions could not be separated into distinct populations. Twelve mtND2 haplotypes were identified from 66 individuals from throughout the four geographical regions. A statistical parsimony network of mtDNA haplotypes shows two distinct groups, which correspond to the two species of corroboree frog, but with most of the haplotype diversity distributed in P. pengilleyi. These results demonstrate an unexpectedly high level of genetic diversity in both species. Our data have important implications for how the genetic diversity is managed in the future. The four evolutionarily significant units must be protected and maintained in captive breeding programmes for as long as it is possible to do so.
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Regression in autistic spectrum disorders.
Stefanatos, Gerry A
2008-12-01
A significant proportion of children diagnosed with Autistic Spectrum Disorder experience a developmental regression characterized by a loss of previously acquired skills. This may involve a loss of speech or social responsivity, but often entails both. This paper critically reviews the phenomenon of regression in autistic spectrum disorders, highlighting the characteristics of regression, age of onset, temporal course, and long-term outcome. Important considerations for diagnosis are discussed and multiple etiological factors currently hypothesized to underlie the phenomenon are reviewed. It is argued that regressive autistic spectrum disorders can be conceptualized on a spectrum with other regressive disorders that may share common pathophysiological features. The implications of this viewpoint are discussed.
Regression: The Apple Does Not Fall Far From the Tree.
Vetter, Thomas R; Schober, Patrick
2018-05-15
Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
Linear regression in astronomy. I
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
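The five slope estimators listed above can be sketched in a few lines of Python. The formulas below are the ones commonly quoted for these estimators (written in terms of the two OLS slopes) and should be checked against the paper before serious use; the function name `five_slopes` and the toy data are assumptions for illustration, and the uncertainty formulas from the paper are not reproduced.

```python
import math


def five_slopes(xs, ys):
    """Return slope estimates for five symmetric/asymmetric line fits.

    Assumes the sample covariance of x and y is non-zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sign = math.copysign(1.0, sxy)

    b1 = sxy / sxx  # OLS regression of Y on X
    b2 = syy / sxy  # OLS regression of X on Y, expressed as dy/dx
    # Line bisecting the angle between the two OLS lines.
    bisector = (b1 * b2 - 1.0
                + math.sqrt((1.0 + b1 ** 2) * (1.0 + b2 ** 2))) / (b1 + b2)
    # Orthogonal regression: minimizes perpendicular distances.
    d = b2 - 1.0 / b1
    orthogonal = 0.5 * (d + sign * math.sqrt(4.0 + d * d))
    # Reduced major axis: geometric mean of the two OLS slopes.
    rma = sign * math.sqrt(b1 * b2)
    return {
        "ols_y_on_x": b1,
        "ols_x_on_y": b2,
        "bisector": bisector,
        "orthogonal": orthogonal,
        "reduced_major_axis": rma,
    }


# Hypothetical noisy data scattered around y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
slopes = five_slopes(xs, ys)
for name, slope in slopes.items():
    intercept = sum(ys) / len(ys) - slope * sum(xs) / len(xs)
    print(f"{name}: slope={slope:.4f}, intercept={intercept:.4f}")
```

For perfectly collinear data all five estimates coincide; they separate as scatter grows, with the two OLS slopes bracketing the symmetric estimates.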
DEFF Research Database (Denmark)
Sung, Yun J; Winkler, Thomas W; de Las Fuentes, Lisa
2018-01-01
Genome-wide association analysis advanced understanding of blood pressure (BP), a major risk factor for vascular conditions such as coronary heart disease and stroke. Accounting for smoking behavior may help identify BP loci and extend our knowledge of its genetic architecture. We performed genom...
Sung, Yun J.; Winkler, Thomas W.; de las Fuentes, Lisa; Bentley, Amy R.; Brown, Michael R.; Kraja, Aldi T.; Schwander, Karen; Ntalla, Ioanna; Guo, Xiuqing; Franceschini, Nora; Lu, Yingchang; Cheng, Ching-Yu; Sim, Xueling; Vojinovic, Dina; Marten, Jonathan; Musani, Solomon K.; Li, Changwei; Feitosa, Mary F.; Kilpelainen, Tuomas O.; Richard, Melissa A.; Noordam, Raymond; Aslibekyan, Stella; Aschard, Hugues; Bartz, Traci M.; Dorajoo, Rajkumar; Liu, Yongmei; Manning, Alisa K.; Rankinen, Tuomo; Smith, Albert Vernon; Tajuddin, Salman M.; Tayo, Bamidele O.; Warren, Helen R.; Zhao, Wei; Zhou, Yanhua; Matoba, Nana; Sofer, Tamar; Alver, Maris; Amini, Marzyeh; Boissel, Mathilde; Chai, Jin Fang; Chen, Xu; Divers, Jasmin; Gandin, Ilaria; Gao, Chuan; Giulianini, Franco; Goel, Anuj; Harris, Sarah E.; Hartwig, Fernando Pires; Horimoto, Andrea R. V. R.; Hsu, Fang-Chi; Jackson, Anne U.; Kahonen, Mika; Kasturiratne, Anuradhani; Kuhnel, Brigitte; Leander, Karin; Lee, Wen-Jane; Lin, Keng-Hung; Luan, Jian'an; McKenzie, Colin A.; He, Meian; Nelson, Christopher P.; Rauramaa, Rainer; Schupf, Nicole; Scott, Robert A.; Sheu, Wayne H. H.; Stancakova, Alena; Takeuchi, Fumihiko; van der Most, Peter J.; Varga, Tibor V.; Wang, Heming; Wang, Yajuan; Ware, Erin B.; Weiss, Stefan; Wen, Wanqing; Yanek, Lisa R.; Zhang, Weihua; Zhao, Jing Hua; Afaq, Saima; Alfred, Tamuno; Amin, Najaf; Arking, Dan; Aung, Tin; Barr, R. Graham; Bielak, Lawrence F.; Boerwinkle, Eric; Bottinger, Erwin P.; Braund, Peter S.; Brody, Jennifer A.; Broeckel, Ulrich; Cabrera, Claudia P.; Cade, Brian; Yu, Caizheng; Campbell, Archie; Canouil, Mickael; Chakravarti, Aravinda; Chauhan, Ganesh; Christensen, Kaare; Cocca, Massimiliano; Collins, Francis S.; Connell, John M.; de Mutsert, Renee; de Silva, H. Janaka; Debette, Stephanie; Dorr, Marcus; Duan, Qing; Eaton, Charles B.; Ehret, Georg; Evangelou, Evangelos; Faul, Jessica D.; Fisher, Virginia A.; Forouhi, Nita G.; Franco, Oscar H.; Friedlander, Yechiel; Gao, He; Gigante, Bruna; Graff, Misa; Gu, C. Charles; Gu, Dongfeng; Gupta, Preeti; Hagenaars, Saskia P.; Harris, Tamara B.; He, Jiang; Heikkinen, Sami; Heng, Chew-Kiat; Hirata, Makoto; Hofman, Albert; Howard, Barbara V.; Hunt, Steven; Irvin, Marguerite R.; Jia, Yucheng; Joehanes, Roby; Justice, Anne E.; Katsuya, Tomohiro; Kaufman, Joel; Kerrison, Nicola D.; Khor, Chiea Chuen; Koh, Woon-Puay; Koistinen, Heikki A.; Komulainen, Pirjo; Kooperberg, Charles; Krieger, Jose E.; Kubo, Michiaki; Kuusisto, Johanna; Langefeld, Carl D.; Langenberg, Claudia; Launer, Lenore J.; Lehne, Benjamin; Lewis, Cora E.; Li, Yize; Lim, Sing Hui; Lin, Shiow; Liu, Ching-Ti; Liu, Jianjun; Liu, Jingmin; Liu, Kiang; Liu, Yeheng; Loh, Marie; Lohman, Kurt K.; Long, Jirong; Louie, Tin; Magi, Reedik; Mahajan, Anubha; Meitinger, Thomas; Metspalu, Andres; Milani, Lili; Momozawa, Yukihide; Morris, Andrew P.; Mosley, Thomas H.; Munson, Peter; Murray, Alison D.; Nalls, Mike A.; Nasri, Ubaydah; Norris, Jill M.; North, Kari; Ogunniyi, Adesola; Padmanabhan, Sandosh; Palmas, Walter R.; Palmer, Nicholette D.; Pankow, James S.; Pedersen, Nancy L.; Peters, Annette; Peyser, Patricia A.; Polasek, Ozren; Raitakari, Olli T.; Renstrom, Frida; Rice, Treva K.; Ridker, Paul M.; Robino, Antonietta; Robinson, Jennifer G.; Rose, Lynda M.; Rudan, Igor; Sabanayagam, Charumathi; Salako, Babatunde L.; Sandow, Kevin; Schmidt, Carsten O.; Schreiner, Pamela J.; Scott, William R.; Seshadri, Sudha; Sever, Peter; Sitlani, Colleen M.; Smith, Jennifer A.; Snieder, Harold; Starr, John M.; Strauch, Konstantin; Tang, Hua; Taylor, Kent D.; Teo, Yik Ying; Tham, Yih Chung; Uitterlinden, Andre G.; Waldenberger, Melanie; Wang, Lihua; Wang, Ya X.; Bin Wei, Wen; Williams, Christine; Wilson, Gregory; Wojczynski, Mary K.; Yao, Jie; Yuan, Jian-Min; Zonderman, Alan B.; Becker, Diane M.; Boehnke, Michael; Bowden, Donald W.; Chambers, John C.; Chen, Yii-Der Ida; de Faire, Ulf; Deary, Ian J.; Esko, Tonu; Farrall, Martin; Forrester, Terrence; Franks, Paul W.; Freedman, Barry I.; Froguel, Philippe; Gasparini, Paolo; Gieger, Christian; Horta, Bernardo Lessa; Hung, Yi-Jen; Jonas, Jost B.; Kato, Norihiro; Kooner, Jaspal S.; Laakso, Markku; Lehtimaki, Terho; Liang, Kae-Woei; Magnusson, Patrik K. E.; Newman, Anne B.; Oldehinkel, Albertine J.; Pereira, Alexandre C.; Redline, Susan; Rettig, Rainer; Samani, Nilesh J.; Scott, James; Shu, Xiao-Ou; van der Harst, Pim; Wagenknecht, Lynne E.; Wareham, Nicholas J.; Watkins, Hugh; Weir, David R.; Wickremasinghe, Ananda R.; Wu, Tangchun; Zheng, Wei; Kamatani, Yoichiro; Laurie, Cathy C.; Bouchard, Claude; Cooper, Richard S.; Evans, Michele K.; Gudnason, Vilmundur; Kardia, Sharon L. R.; Kritchevsky, Stephen B.; Levy, Daniel; O'Connell, Jeff R.; Psaty, Bruce M.; van Dam, Rob M.; Sims, Mario; Arnett, Donna K.; Mook-Kanamori, Dennis O.; Kelly, Tanika N.; Fox, Ervin R.; Hayward, Caroline; Fornage, Myriam; Rotimi, Charles N.; Province, Michael A.; van Duijn, Cornelia M.; Tai, E. Shyong; Wong, Tien Yin; Loos, Ruth J. F.; Reiner, Alex P.; Rotter, Jerome I.; Zhu, Xiaofeng; Bierut, Laura J.; Gauderman, W. James; Caulfield, Mark J.; Elliott, Paul; Rice, Kenneth; Munroe, Patricia B.; Morrison, Alanna C.; Cupples, L. Adrienne; Rao, Dabeeru C.; Chasman, Daniel I.; Lifelines Cohort Study
2018-01-01
Genome-wide association analysis advanced understanding of blood pressure (BP), a major risk factor for vascular conditions such as coronary heart disease and stroke. Accounting for smoking behavior may help identify BP loci and extend our knowledge of its genetic architecture. We performed
Subset selection in regression
Miller, Alan
2002-01-01
Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition: a separate chapter on Bayesian methods; complete revision of the chapter on estimation; a major example from the field of near infrared spectroscopy; more emphasis on cross-validation; greater focus on bootstrapping; stochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible; software available on the Internet for implementing many of the algorithms presented; more examples. Subset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...
Better Autologistic Regression
Directory of Open Access Journals (Sweden)
Mark A. Wolters
2017-11-01
Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine) to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding (the two numbers used to represent the two possible states of the variables) might differ. Common coding choices are (zero, one) and (minus one, plus one). Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
Seyoum, Awoke; Ndlovu, Principal; Zewotir, Temesgen
2016-01-01
CD4 cells are a type of white blood cell that plays a significant role in protecting humans from infectious diseases. Lack of information on factors associated with CD4 cell count reduction is an obstacle to improving CD4 cell counts in HIV-positive adults. Therefore, the main objective of this study was to investigate baseline factors that could affect initial CD4 cell count change after highly active antiretroviral therapy had been given to adult patients in North West Ethiopia. A retrospective cross-sectional study was conducted among 792 HIV-positive adult patients who had already received antiretroviral therapy for 1 month. A chi-square test of association was used to assess the association of predictor covariates with the variable of interest. The data came from a secondary source and were modeled using generalized linear models, specifically quasi-Poisson regression. The change in patients' CD4 cell count within a month ranged from 0 to 109 cells/mm³, with a mean of 15.9 cells/mm³ and a standard deviation of 18.44 cells/mm³. The first-month CD4 cell count change was significantly affected by poor adherence to highly active antiretroviral therapy (aRR = 0.506, P value = 2e-16), fair adherence (aRR = 0.592, P value = 0.0120), initial CD4 cell count (aRR = 1.0212, P value = 1.54e-15), low household income (aRR = 0.63, P value = 0.671e-14), middle income (aRR = 0.74, P value = 0.629e-12), patients without cell phone (aRR = 0.67, P value = 0.615e-16), WHO stage 2 (aRR = 0.91, P value = 0.0078), WHO stage 3 (aRR = 0.91, P value = 0.0058), WHO stage 4 (aRR = 0.876, P value = 0.0214), age (aRR = 0.987, P value = 0.000) and weight (aRR = 1.0216, P value = 3.98e-14). Adherence to antiretroviral therapy, initial CD4 cell count, household income, WHO stage, age, weight and cell phone ownership played a major role in the variation of CD4 cell count in our data. Hence, we recommend a close follow-up of patients to adhere to the prescribed medication for
Directory of Open Access Journals (Sweden)
Halliday A Idikio
Cancer biomarkers are sought to support cancer diagnosis and to predict cancer patients' response to treatment and survival. Identifying reliable biomarkers for predicting cancer treatment response requires understanding of all aspects of cancer cell death and survival. Galectin-3 and Beclin1 are involved in two coordinated pathways of programmed cell death, apoptosis and autophagy, and are linked to necroptosis/necrosis. The aim of the study was to quantify galectin-3 and Beclin1 mRNA in human cancer tissue cDNA panels and determine their utility as biomarkers of cancer cell survival. A panel of 96 cDNAs from eight (8) different normal and cancer tissue types was used for quantitative real-time polymerase chain reaction (qRT-PCR) using ABI7900HT. Miner2.0, a web-based 4- and 3-parameter logistic regression software, was used to derive individual well polymerase chain reaction efficiencies (E) and cycle threshold (Ct) values. A formula derived from the Miner software was used to calculate mRNA levels and then fold changes. The ratios of cancer to normal tissue levels of galectin-3 and Beclin1 were calculated (using the mean for each tissue type). Relative mRNA expressions for galectin-3 were higher than for Beclin1 in all tissue (normal and cancer) types. In cancer tissues, breast, kidney, thyroid and prostate had the highest galectin-3 mRNA levels compared to normal tissues. High Beclin1 mRNA levels were found in liver and prostate cancers when compared to normal tissues. Breast, kidney and thyroid cancers had high galectin-3 levels and low Beclin1 levels. Galectin-3 expression patterns in normal and cancer tissues support its reported roles in human cancer. The Beclin1 expression pattern supports its roles in cancer cell survival and in treatment response. The qRT-PCR analysis method used may enable high throughput studies to generate molecular biomarker sets for diagnosis and predicting cancer treatment response.
Tumor regression patterns in retinoblastoma
International Nuclear Information System (INIS)
Zafar, S.N.; Siddique, S.N.; Zaheer, N.
2016-01-01
To observe the types of tumor regression after treatment, and identify the common pattern of regression in our patients. Study Design: Descriptive study. Place and Duration of Study: Department of Pediatric Ophthalmology and Strabismus, Al-Shifa Trust Eye Hospital, Rawalpindi, Pakistan, from October 2011 to October 2014. Methodology: Children with unilateral and bilateral retinoblastoma were included in the study. Patients were referred to Pakistan Institute of Medical Sciences, Islamabad, for chemotherapy. After every cycle of chemotherapy, dilated fundus examination under anesthesia was performed to record the response to treatment. Regression patterns were recorded on RetCam II. Results: Seventy-four tumors were included in the study. Out of 74 tumors, 3 were ICRB group A tumors, 43 were ICRB group B tumors, 14 belonged to ICRB group C, and the remaining 14 were ICRB group D tumors. Type IV regression was seen in 39.1% (n=29) tumors, type II in 29.7% (n=22), type III in 25.6% (n=19), and type I in 5.4% (n=4). All group A tumors (100%) showed type IV regression. Seventeen (39.5%) group B tumors showed type IV regression. In group C, 5 tumors (35.7%) showed type II regression and 5 tumors (35.7%) showed type IV regression. In group D, 6 tumors (42.9%) regressed to type II non-calcified remnants. Conclusion: The response and success of the focal and systemic treatment, as judged by the appearance of different patterns of tumor regression, vary with the ICRB grouping of the tumor. (author)
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: "This book is ... an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association. Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based...
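The contrast between modeling the mean and modeling quantiles can be illustrated with the check (or "pinball") loss that quantile regression minimizes. The Python sketch below covers only the intercept-only case, where the minimizer is a sample quantile; it is not the linear quantile regression model of the chapter, and the helper names and toy data are assumptions for illustration.

```python
def check_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]) for one residual."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant(ys, tau):
    """Return the data value c that minimizes sum(check_loss(y - c, tau)).

    The total loss is convex and piecewise linear in c, so a minimizer
    can always be found among the observed values.
    """
    return min(ys, key=lambda c: sum(check_loss(y - c, tau) for y in ys))


# Hypothetical sample with one large outlier.
ys = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = best_constant(ys, 0.5)  # tau = 0.5 targets the median
upper_fit = best_constant(ys, 0.9)   # tau = 0.9 targets the upper tail
mean_fit = sum(ys) / len(ys)         # least squares targets the mean

print("tau=0.5 fit:", median_fit)
print("tau=0.9 fit:", upper_fit)
print("mean fit:", mean_fit)
```

The outlier pulls the mean far from the bulk of the data while the tau = 0.5 fit stays at the median, which is the sense in which different quantiles can tell a different story from the conditional mean.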
A flexible fuzzy regression algorithm for forecasting oil consumption estimation
International Nuclear Information System (INIS)
Azadeh, A.; Khakestani, M.; Saberi, M.
2009-01-01
Oil consumption plays a vital role in socio-economic development of most countries. This study presents a flexible fuzzy regression algorithm for forecasting oil consumption based on standard economic indicators. The standard indicators are annual population, cost of crude oil import, gross domestic production (GDP) and annual oil production in the last period. The proposed algorithm uses analysis of variance (ANOVA) to select either fuzzy regression or conventional regression for future demand estimation. The significance of the proposed algorithm is threefold. First, it is flexible and identifies the best model based on the results of ANOVA and minimum absolute percentage error (MAPE), whereas previous studies consider the best fitted fuzzy regression model based on MAPE or other relative error results. Second, the proposed model may identify conventional regression as the best model for future oil consumption forecasting because of its dynamic structure, whereas previous studies assume that fuzzy regression always provides the best solutions and estimation. Third, it utilizes the most standard independent variables for the regression models. To show the applicability and superiority of the proposed flexible fuzzy regression algorithm, the data for oil consumption in Canada, United States, Japan and Australia from 1990 to 2005 are used. The results show that the flexible algorithm provides an accurate solution for the oil consumption estimation problem. The algorithm may be used by policy makers to accurately foresee the behavior of oil consumption in various regions.
Introduction to regression graphics
Cook, R Dennis
2009-01-01
Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts "...an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models...highly recommend[ed]...for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Directory of Open Access Journals (Sweden)
Matthias Schmid
Full Text Available Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
Understanding logistic regression analysis.
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
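To make the link between logistic regression coefficients and odds ratios concrete, here is a minimal sketch for the simplest case of one binary explanatory variable, where the maximum likelihood estimates have a closed form. The counts and function names are hypothetical; with several explanatory variables, as in the article, the coefficients must be fitted iteratively and each exponentiated coefficient is an adjusted odds ratio.

```python
import math


def odds_ratio_2x2(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: exposed with event,   b: exposed without event
    c: unexposed with event, d: unexposed without event
    """
    return (a * d) / (b * c)


def logistic_coefficients_binary(a, b, c, d):
    """Closed-form ML estimates for logit(p) = b0 + b1 * exposed.

    With a single binary predictor the fitted probabilities equal the
    observed proportions in each group, so no iteration is needed.
    """
    b0 = math.log(c / d)       # log-odds of the event in the unexposed group
    b1 = math.log(a / b) - b0  # difference in log-odds, i.e. log(odds ratio)
    return b0, b1


# Made-up counts, for illustration only.
b0, b1 = logistic_coefficients_binary(30, 70, 10, 90)
print(math.exp(b1), odds_ratio_2x2(30, 70, 10, 90))
```

Exponentiating the slope reproduces the odds ratio computed directly from the table.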
Weisberg, Sanford
2013-01-01
Praise for the Third Edition "...this is an excellent book which could easily be used as a course text..." -International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
Hosmer, David W; Sturdivant, Rodney X
2013-01-01
A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
Understanding poisson regression.
Hayat, Matthew J; Higgins, Melinda
2014-04-01
Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.
Moderation analysis using a two-level regression model.
Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott
2014-10-01
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
DEFF Research Database (Denmark)
Bache, Stefan Holst
A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is, however, a different and therefore new estimator. It allows for both linear and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question.
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface for predicting the covariate specific absolute risks, their confidence intervals, and their confidence bands based on right censored time to event data. We provide explicit formulas for our implementation of the estimator of the (stratified) baseline hazard function in the presence of tied event times. As a by-product we obtain fast access to the baseline hazards (compared to survival::basehaz()) and predictions of survival probabilities, their confidence intervals and confidence bands. Confidence intervals and confidence bands are based on point-wise asymptotic expansions of the corresponding statistical functionals. The software presented here is implemented in the riskRegression package.
Bayesian logistic regression analysis
Van Erp, H.R.N.; Van Gelder, P.H.A.J.M.
2012-01-01
In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuisance parameters, the Jacobian transformation is an
Seber, George A F
2012-01-01
Concise, mathematically clear, and comprehensive treatment of the subject. * Expanded coverage of diagnostics and methods of model fitting. * Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models. * More than 200 problems throughout the book plus outline solutions for the exercises. * This revision has been extensively class-tested.
Ritz, Christian; Parmigiani, Giovanni
2009-01-01
R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. This book provides a coherent treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology.
Bayesian ARTMAP for regression.
Sasu, L M; Andonie, R
2013-10-01
Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. Copyright © 2013 Elsevier Ltd. All rights reserved.
Bounded Gaussian process regression
DEFF Research Database (Denmark)
Jensen, Bjørn Sand; Nielsen, Jens Brehm; Larsen, Jan
2013-01-01
We extend the Gaussian process (GP) framework for bounded regression by introducing two bounded likelihood functions that model the noise on the dependent variable explicitly. This is fundamentally different from the implicit noise assumption in the previously suggested warped GP framework. We ... with the proposed explicit noise-model extension.
and Multinomial Logistic Regression
African Journals Online (AJOL)
This work presented the results of an experimental comparison of two models: Multinomial Logistic Regression (MLR) and Artificial Neural Network (ANN) for classifying students based on their academic performance. The predictive accuracy for each model was measured by their average Classification Correct Rate (CCR).
Logistic Regression Modeling of Diminishing Manufacturing Sources for Integrated Circuits
National Research Council Canada - National Science Library
Gravier, Michael
1999-01-01
.... The research identified logistic regression as a powerful tool for analysis of DMSMS and further developed twenty models attempting to identify the "best" way to model and predict DMSMS using logistic regression...
Stepwise versus Hierarchical Regression: Pros and Cons
Lewis, Mitzi
2007-01-01
Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…
Ridge Regression Signal Processing
Kuhl, Mark R.
1990-01-01
The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
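For orientation only, the sketch below shows the basic batch ridge estimate, $(X^\top X + \lambda I)^{-1} X^\top y$, for two predictors without an intercept. It is not the linearized recursive ridge estimator derived in the report, and the data, the penalty value, and the function name are hypothetical. Nearly collinear columns stand in for the "poor geometry" case, where the unpenalised solution becomes unstable.

```python
def ridge_2d(X, y, lam):
    """Ridge estimate (X'X + lam*I)^{-1} X'y for two predictors, no intercept."""
    s11 = sum(r[0] * r[0] for r in X) + lam
    s22 = sum(r[1] * r[1] for r in X) + lam
    s12 = sum(r[0] * r[1] for r in X)
    t1 = sum(r[0] * v for r, v in zip(X, y))
    t2 = sum(r[1] * v for r, v in zip(X, y))
    det = s11 * s22 - s12 * s12  # determinant of the penalised 2x2 system
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Nearly collinear columns (made-up numbers): X'X is close to singular.
X = [(1.0, 1.01), (2.0, 1.98), (3.0, 3.02)]
y = [2.0, 4.1, 5.9]
print(ridge_2d(X, y, 0.0))  # ordinary least squares, sensitive to noise
print(ridge_2d(X, y, 0.5))  # ridge, coefficients pulled toward zero
```

The penalty `lam` trades a small bias for a large reduction in variance when the design is ill-conditioned.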
Regression in organizational leadership.
Kernberg, O F
1979-02-01
The choice of good leaders is a major task for all organizations. Information regarding the prospective administrator's personality should complement questions regarding his previous experience, his general conceptual skills, his technical knowledge, and the specific skills in the area for which he is being selected. The growing psychoanalytic knowledge about the crucial importance of internal, in contrast to external, object relations, and about the mutual relationships of regression in individuals and in groups, constitutes an important practical tool for the selection of leaders.
Classification and regression trees
Breiman, Leo; Olshen, Richard A; Stone, Charles J
1984-01-01
The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
Hilbe, Joseph M
2009-01-01
This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect: great clarity. The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … . -Annette J. Dobson, Biometric...
DEFF Research Database (Denmark)
Gravgaard, Karina Hedelund; Terp, Mikkel G; Lund, Rikke R
2015-01-01
To gain insight into miRNA regulation in metastasis formation, we used a metastasis cell line model that allows investigation of extravasation and colonization of circulating cancer cells to lungs in mice. Using global miRNA profiling, 28 miRNAs were found to exhibit significantly altered ... proliferation or apoptosis in established lung tumors. To identify proteins regulated by miR-155 and thus delineate its function in our cell model, we compared the proteome of xenograft tumors derived from miR-155-overexpressing CL16 cells and CL16 control cells using mass spectrometry-based proteomics. >4,000 proteins were identified, of which 92 were consistently differentially expressed. Network analysis revealed that the altered proteins were associated with cellular functions such as movement, growth and survival as well as cell-to-cell signaling and interaction. Downregulation of the three metastasis...
Steganalysis using logistic regression
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-the-art 686-dimensional SPAM feature set, in three image sets.
SEPARATION PHENOMENA LOGISTIC REGRESSION
Directory of Open Access Journals (Sweden)
Ikaro Daniel de Carvalho Barreto
2014-03-01
Full Text Available This paper applies concepts from maximum likelihood estimation of the binomial logistic regression model to the separation phenomenon. Separation generates bias in the estimation, leads to different interpretations of the estimates under the different statistical tests (Wald, Likelihood Ratio and Score), and yields different estimates under the different iterative methods (Newton-Raphson and Fisher Score). The paper also presents an example that demonstrates the direct implications for the validation of the model and of variables, and for estimates of odds ratios and confidence intervals generated from the Wald statistics. Furthermore, we briefly present the Firth correction to circumvent the phenomenon of separation.
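A small sketch of what separation means in the simplest setting may help: with one continuous predictor, complete separation occurs when a single threshold splits the two outcome classes perfectly, in which case the maximum likelihood slope diverges. The function name and data are hypothetical and the check below covers only the one-predictor case; it is not the paper's procedure.

```python
def is_completely_separated(xs, ys):
    """True if some threshold on x splits the y == 0 and y == 1 cases perfectly.

    Only the single-predictor case is handled; both outcome classes must occur.
    """
    x0 = [x for x, y in zip(xs, ys) if y == 0]
    x1 = [x for x, y in zip(xs, ys) if y == 1]
    if not x0 or not x1:
        raise ValueError("both outcome classes must be present")
    # Separated if every class-0 value lies strictly on one side of class 1.
    return max(x0) < min(x1) or max(x1) < min(x0)


print(is_completely_separated([1, 2, 3, 4], [0, 0, 1, 1]))  # threshold between 2 and 3
print(is_completely_separated([1, 3, 2, 4], [0, 0, 1, 1]))  # classes overlap
```

When the check is positive, Wald-based intervals are unreliable and a penalised fit such as the Firth correction mentioned in the abstract is one common remedy.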
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface ... As a by-product we obtain fast access to the baseline hazards (compared to survival::basehaz()) and predictions of survival probabilities, their confidence intervals and confidence bands. Confidence intervals and confidence bands are based on point-wise asymptotic expansions of the corresponding statistical...
Adaptive metric kernel regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
2000-01-01
Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate regression by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Adaptive Metric Kernel Regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
1998-01-01
Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...
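The idea of tuning a kernel smoother by minimising a cross-validation error can be sketched in one dimension as follows. This is a deliberate simplification under stated assumptions: a single scalar bandwidth is chosen by grid search with leave-one-out cross-validation, whereas the paper adapts a full input metric across several dimensions. All names and data are hypothetical.

```python
import math


def nw_predict(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel of bandwidth h."""
    num = den = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:  # leave this training point out
            continue
        w = math.exp(-0.5 * ((x0 - x) / h) ** 2)
        num += w * y
        den += w
    return num / den


def loo_cv_error(xs, ys, h):
    """Leave-one-out mean squared prediction error for bandwidth h."""
    errs = [(ys[i] - nw_predict(xs[i], xs, ys, h, skip=i)) ** 2
            for i in range(len(xs))]
    return sum(errs) / len(errs)


def pick_bandwidth(xs, ys, grid):
    """Return the grid value with the smallest leave-one-out error."""
    return min(grid, key=lambda h: loo_cv_error(xs, ys, h))


xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [math.sin(x) for x in xs]  # noiseless toy target
print(pick_bandwidth(xs, ys, [0.2, 0.5, 1.0, 2.0]))
```

With one bandwidth per input dimension, a large fitted bandwidth effectively switches that dimension off, which is how this family of methods can perform variable selection.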
Energy Technology Data Exchange (ETDEWEB)
Kyle, Jennifer E. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Casey, Cameron P. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Stratton, Kelly G. [National Security Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Zink, Erika M. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Kim, Young-Mo [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Zheng, Xueyun [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Monroe, Matthew E. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Weitz, Karl K. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Bloodsworth, Kent J. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Orton, Daniel J. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Ibrahim, Yehia M. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Moore, Ronald J. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Lee, Christine G. [Department of Medicine, Bone and Mineral Unit, Oregon Health and Science University, Portland OR USA; Research Service, Portland Veterans Affairs Medical Center, Portland OR USA]; Pedersen, Catherine [Department of Medicine, Bone and Mineral Unit, Oregon Health and Science University, Portland OR USA]; Orwoll, Eric [Department of Medicine, Bone and Mineral Unit, Oregon Health and Science University, Portland OR USA]; Smith, Richard D. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Burnum-Johnson, Kristin E. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]; Baker, Erin S. [Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland WA USA]
2017-02-05
The use of dried blood spots (DBS) has many advantages over traditional plasma and serum samples, such as the smaller blood volume required, storage at room temperature, and the ability to sample in remote locations. However, understanding the robustness of different analytes in DBS samples is essential, especially in older samples collected for longitudinal studies. Here we analyzed DBS samples collected in 2000-2001 and stored at room temperature and compared them to matched serum samples stored at -80°C to determine if they could be effectively used as specific time points in a longitudinal study following metabolic disease. Four hundred small molecules were identified in both the serum and DBS samples using gas chromatography-mass spectrometry (GC-MS), liquid chromatography-MS (LC-MS) and LC-ion mobility spectrometry-MS (LC-IMS-MS). The identified polar metabolites overlapped well between the sample types, though only one statistically significant polar metabolite in a case-control study was conserved, indicating that degradation occurs in the DBS samples and affects quantitation. Differences in the lipid identifications indicated that some oxidation occurs in the DBS samples. However, thirty-six statistically significant lipids correlated in both sample types, indicating that lipid quantitation was more stable across the sample types.
DEFF Research Database (Denmark)
Hansen, Henrik; Tarp, Finn
2001-01-01
This paper examines the relationship between foreign aid and growth in real GDP per capita as it emerges from simple augmentations of popular cross country growth specifications. It is shown that aid in all likelihood increases the growth rate, and this result is not conditional on ‘good’ policy. ... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regression is used for policy purposes.
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study considers measures of predictive power for the Generalized Linear Model (GLM), which are widely used but often subject to restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables with multicollinearity among them. The results show that the proposed regression correlation coefficient is better than the traditional one in terms of bias and root mean square error (RMSE).
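The traditional regression correlation coefficient referred to here is the correlation between Y and E(Y|X). A minimal sketch for a Poisson model with a log link and one predictor is given below; the coefficients `b0` and `b1` are assumed to have been estimated elsewhere, the data are made up, and the modified coefficient proposed in the study is not reproduced because the abstract does not define it.

```python
import math


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def regression_correlation(y, x, b0, b1):
    """corr(Y, E(Y|X)) for a Poisson model with E(Y|X) = exp(b0 + b1 * x)."""
    fitted = [math.exp(b0 + b1 * xi) for xi in x]
    return pearson(y, fitted)


# Made-up counts and hypothetical coefficient values.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1, 1, 3, 4, 9]
print(regression_correlation(y, x, 0.0, 0.5))
```

Values near 1 indicate that the fitted conditional means track the observed counts closely.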
Detection of epistatic effects with logic regression and a classical linear regression model.
Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata
2014-02-01
To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, methods such as Cockerham's model are usually applied. Within this framework, interactions are understood as the part of the joint effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (such as disease) is caused by Boolean combinations of genotypes of several QTLs, Cockerham's approach is often not capable of identifying them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though with the logic regression approach a larger number of models has to be considered (requiring more stringent multiple testing correction), the efficient representation of higher order logic interactions in logic regression models leads to a significant increase of power to detect such interactions as compared to Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with a simulation study and real data analysis.
Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun
2016-07-01
In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Prediction, Regression and Critical Realism
DEFF Research Database (Denmark)
Næss, Petter
2004-01-01
This paper considers the possibility of prediction in land use planning, and the use of statistical research methods in analyses of relationships between urban form and travel behaviour. Influential writers within the tradition of critical realism reject the possibility of predicting social phenomena. This position is fundamentally problematic to public planning. Without at least some ability to predict the likely consequences of different proposals, the justification for public sector intervention into market mechanisms will be frail. Statistical methods like regression analyses are commonly seen as necessary in order to identify aggregate level effects of policy measures, but are questioned by many advocates of critical realist ontology. Using research into the relationship between urban structure and travel as an example, the paper discusses relevant research methods and the kinds...
Quantum algorithm for linear regression
Wang, Guoming
2017-07-01
We present a quantum algorithm for fitting a linear regression model to a given data set using the least-squares approach. Differently from previous algorithms which yield a quantum state encoding the optimal parameters, our algorithm outputs these numbers in the classical form. So by running it once, one completely determines the fitted model and then can use it to make predictions on new data at little cost. Moreover, our algorithm works in the standard oracle model, and can handle data sets with nonsparse design matrices. It runs in time poly(log2(N), d, κ, 1/ε), where N is the size of the data set, d is the number of adjustable parameters, κ is the condition number of the design matrix, and ε is the desired precision in the output. We also show that the polynomial dependence on d and κ is necessary. Thus, our algorithm cannot be significantly improved. Furthermore, we also give a quantum algorithm that estimates the quality of the least-squares fit (without computing its parameters explicitly). This algorithm runs faster than the one for finding this fit, and can be used to check whether the given data set qualifies for linear regression in the first place.
Recursive Algorithm For Linear Regression
Varanasi, S. V.
1988-01-01
Order of model determined easily. Linear-regression algorithm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactorily.
Combining Alphas via Bounded Regression
Directory of Open Access Journals (Sweden)
Zura Kakushadze
2015-11-01
We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted) regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
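The method of least squares described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the article's own material: the function name and the sample data are hypothetical, and the closed-form estimates used are the standard ones (slope = Sxy / Sxx, intercept = mean(y) - slope * mean(x)).

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of the predictor.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    # Sum of cross-products of predictor and outcome deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical small dataset, for illustration only.
    dose = [1.0, 2.0, 3.0, 4.0, 5.0]
    response = [2.1, 3.9, 6.2, 7.8, 10.1]
    a, b = fit_simple_linear_regression(dose, response)
    print(f"fitted line: y = {a:.3f} + {b:.3f} * x")
```

The sketch covers only the point estimates; the assumptions, leverage, and inference topics mentioned in the abstract are outside its scope.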
Nasution, Inggrita Gusti Sari; Muchtar, Yasmin Chairunnisa
2013-01-01
This research is to study the factors which influence the business success of small business ‘processed rotan’. The data employed in the study are primary data within the period of July to August 2013, 30 research observations through census method. Method of analysis used in the study is multiple linear regressions. The results of analysis showed that the factors of labor, innovation and promotion have positive and significant influence on the business success of small busine...
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
Regression Equations for Birth Weight Estimation using ...
African Journals Online (AJOL)
In this study, Birth Weight has been estimated from anthropometric measurements of hand and foot. Linear regression equations were formed from each of the measured variables. These simple equations can be used to estimate Birth Weight of new born babies, in order to identify those with low birth weight and referred to ...
Linear regression in astronomy. II
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Time-adaptive quantile regression
DEFF Research Database (Denmark)
Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik
2008-01-01
An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered.
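The objective that quantile regression minimizes can be illustrated with the check (pinball) loss. The sketch below is not the simplex-based, time-adaptive algorithm of this record; it only shows, under the assumption of an intercept-only model and with hypothetical function names, that minimizing the check loss recovers a sample quantile.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau for positive residuals, (1 - tau) for negative ones."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def total_loss(q, data, tau):
    """Sum of check losses when q is used as the predicted quantile."""
    return sum(pinball_loss(y - q, tau) for y in data)


def quantile_by_check_loss(data, tau):
    """Intercept-only quantile regression by direct search.

    The total loss is convex and piecewise linear in q, so a minimizer
    can always be found among the observed data values.
    """
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    return min(sorted(data), key=lambda q: total_loss(q, data, tau))


if __name__ == "__main__":
    # Hypothetical observations, for illustration only.
    observations = [3, 1, 4, 1, 5, 9, 2, 6, 5]
    for tau in (0.1, 0.5, 0.9):
        print(tau, quantile_by_check_loss(observations, tau))
```

With covariates, the same loss is minimized over regression coefficients, which is the linear optimization problem the abstract refers to.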
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of Quantile Regression models. This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensive description of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and
Panel Smooth Transition Regression Models
DEFF Research Database (Denmark)
González, Andrés; Terasvirta, Timo; Dijk, Dick van
We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13] (H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337, doi: 10.1214/aos/1018031100) ...
Testing discontinuities in nonparametric regression
Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun
2017-01-01
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13] (H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337, doi: 10.1214/aos/1018031100) ...
Logistic Regression: Concept and Application
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
International Nuclear Information System (INIS)
Leng Ling; Zhang Tianyi; Kleinman, Lawrence; Zhu Wei
2007-01-01
Regression analysis, especially the ordinary least squares method which assumes that errors are confined to the dependent variable, has seen a fair share of its applications in aerosol science. The ordinary least squares approach, however, could be problematic due to the fact that atmospheric data often does not lend itself to calling one variable independent and the other dependent. Errors often exist for both measurements. In this work, we examine two regression approaches available to accommodate this situation. They are orthogonal regression and geometric mean regression. Comparisons are made theoretically as well as numerically through an aerosol study examining whether the ratio of organic aerosol to CO would change with age
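The two errors-in-both-variables approaches named in this abstract have simple closed-form slopes. The sketch below uses the standard textbook formulas, not necessarily the exact formulation of the paper: orthogonal regression here assumes equal error variances in x and y, geometric mean regression takes the slope as sign(Sxy) * sqrt(Syy / Sxx), and the function name and data are hypothetical.

```python
from math import copysign, sqrt
from statistics import mean


def regression_slopes(x, y):
    """Return OLS, orthogonal, and geometric mean regression slopes of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxx == 0 or sxy == 0:
        raise ValueError("slopes are undefined for constant or uncorrelated data")
    # Ordinary least squares: all error attributed to y.
    ols = sxy / sxx
    # Orthogonal regression: minimizes perpendicular distances to the line.
    orthogonal = (syy - sxx + sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    # Geometric mean regression: geometric mean of the y-on-x slope and the
    # reciprocal of the x-on-y slope, carrying the sign of the covariance.
    geometric = copysign(sqrt(syy / sxx), sxy)
    return {"ols": ols, "orthogonal": orthogonal, "geometric_mean": geometric}


if __name__ == "__main__":
    # Hypothetical paired measurements, for illustration only.
    co = [1.0, 2.0, 3.0, 4.0]
    organic_aerosol = [1.0, 3.0, 2.0, 4.0]
    print(regression_slopes(co, organic_aerosol))
```

When the data are not perfectly correlated, the OLS slope is smaller in magnitude than the geometric mean slope, which is one reason the choice of method matters when both variables carry measurement error.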
Forecasting urban water demand: A meta-regression analysis.
Sebri, Maamar
2016-12-01
Water managers and planners require accurate water demand forecasts over the short-, medium- and long-term for many purposes. These range from assessing water supply needs over spatial and temporal patterns to optimizing future investments and planning future allocations across competing sectors. This study surveys the empirical literature on the urban water demand forecasting using the meta-analytical approach. Specifically, using more than 600 estimates, a meta-regression analysis is conducted to identify explanations of cross-studies variation in accuracy of urban water demand forecasting. Our study finds that accuracy depends significantly on study characteristics, including demand periodicity, modeling method, forecasting horizon, model specification and sample size. The meta-regression results remain robust to different estimators employed as well as to a series of sensitivity checks performed. The importance of these findings lies in the conclusions and implications drawn out for regulators and policymakers and for academics alike. Copyright © 2016. Published by Elsevier Ltd.
Regression to Causality : Regression-style presentation influences causal attribution
DEFF Research Database (Denmark)
Bordacconi, Mats Joe; Larsen, Martin Vinæs
2014-01-01
...models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results more likely. Our experiment drew on a sample of 235 university students from three different social science degree programs (political science, sociology and economics), all of whom had received substantial training in statistics. The subjects were asked to compare and evaluate the validity of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...
Regression analysis with categorized regression calibrated exposure: some interesting findings
Directory of Open Access Journals (Sweden)
Hjartåker Anette
2006-07-01
Background: Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile) scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods: We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC). Results: In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion: Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
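The least-squares solution for several predictors, and the way exact multicollinearity breaks its uniqueness, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the article's method: it solves the normal equations (X'X) b = X'y by Gaussian elimination, which is adequate for small examples but less numerically stable than the QR-based solvers in statistical software, and the function name and data are hypothetical.

```python
def fit_multiple_linear_regression(rows, y):
    """Return [intercept, b1, ..., bk] minimizing the sum of squared errors."""
    # Design matrix with a leading column of ones for the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Normal equations: A b = c with A = X'X and c = X'y.
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    c = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    # Augmented matrix, reduced by Gaussian elimination with partial pivoting.
    M = [A[i] + [c[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; coefficients are not unique")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, p):
            factor = M[r][col] / M[col][col]
            for k in range(col, p + 1):
                M[r][k] -= factor * M[col][k]
    # Back substitution.
    coef = [0.0] * p
    for i in range(p - 1, -1, -1):
        tail = sum(M[i][k] * coef[k] for k in range(i + 1, p))
        coef[i] = (M[i][p] - tail) / M[i][i]
    return coef


if __name__ == "__main__":
    # Hypothetical data generated from y = 1 + 2*x1 - 3*x2, for illustration.
    predictors = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
    outcome = [1, 3, -2, 0, 2]
    print(fit_multiple_linear_regression(predictors, outcome))
```

Passing predictors that are exact multiples of each other raises the `ValueError`, a simple demonstration of why the relationships among predictor variables matter for the fitted coefficients.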
Logic regression and its extensions.
Schwender, Holger; Ruczinski, Ingo
2010-01-01
Logic regression is an adaptive classification and regression procedure, initially developed to reveal interacting single nucleotide polymorphisms (SNPs) in genetic association studies. In general, this approach can be used in any setting with binary predictors, when the interaction of these covariates is of primary interest. Logic regression searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome variable, and thus, reveals variables and interactions that are associated with the response and/or have predictive capabilities. The logic expressions are embedded in a generalized linear regression framework, and thus, logic regression can handle a variety of outcome types, such as binary responses in case-control studies, numeric responses, and time-to-event data. In this chapter, we provide an introduction to the logic regression methodology, list some applications in public health and medicine, and summarize some of the direct extensions and modifications of logic regression that have been proposed in the literature. Copyright © 2010 Elsevier Inc. All rights reserved.
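The idea of searching for Boolean combinations of binary predictors can be illustrated with a toy example. The sketch below is an assumption-laden simplification: it exhaustively scores every two-variable AND/OR rule (with optional negation) by misclassification count, whereas logic regression as described in the abstract embeds logic expressions in a generalized linear model and uses an adaptive search. The function name and data are hypothetical.

```python
from itertools import combinations, product


def best_pairwise_logic_rule(X, y):
    """Find the two-variable Boolean rule with the fewest misclassifications.

    Returns (errors, i, negate_i, op, j, negate_j), where the rule is
    (x_i, possibly negated) op (x_j, possibly negated), op in {"and", "or"}.
    """
    n_features = len(X[0])
    best = None
    for i, j in combinations(range(n_features), 2):
        for neg_i, neg_j, op in product((False, True), (False, True), ("and", "or")):
            errors = 0
            for row, target in zip(X, y):
                a = bool(row[i]) != neg_i  # apply optional negation
                b = bool(row[j]) != neg_j
                pred = (a and b) if op == "and" else (a or b)
                errors += pred != bool(target)
            candidate = (errors, i, neg_i, op, j, neg_j)
            if best is None or candidate < best:
                best = candidate
    return best


if __name__ == "__main__":
    # Hypothetical binary predictors (e.g. SNP indicators); the outcome is
    # constructed as x0 AND (NOT x2), for illustration only.
    X = [list(bits) for bits in product((0, 1), repeat=3)]
    y = [int(r[0] == 1 and r[2] == 0) for r in X]
    print(best_pairwise_logic_rule(X, y))
```

An additive model with main effects only would not represent such a rule directly, which is the motivation for treating the Boolean expression itself as the predictor.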
Poisson Mixture Regression Models for Heart Disease Prediction.
Mufudza, Chipo; Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model.
Abstract Expression Grammar Symbolic Regression
Korns, Michael F.
This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, and which is faster and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, aged layered populations, plus discrete and continuous differential evolution are used to produce an improved symbolic regression system. Nine base test cases, from the literature, are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial strength symbolic regression systems.
Quantile Regression With Measurement Error
Wei, Ying; Carroll, Raymond J.
2009-01-01
. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a
Logistic regression models of factors influencing the location of bioenergy and biofuels plants
T.M. Young; R.L. Zaretzki; J.H. Perdue; F.M. Guess; X. Liu
2011-01-01
Logistic regression models were developed to identify significant factors that influence the location of existing wood-using bioenergy/biofuels plants and traditional wood-using facilities. Logistic models provided quantitative insight for variables influencing the location of woody biomass-using facilities. Availability of "thinnings to a basal area of 31.7m2/ha...
From Rasch scores to regression
DEFF Research Database (Denmark)
Christensen, Karl Bang
2006-01-01
Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale properties. This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score.
Testing Heteroscedasticity in Robust Regression
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2011-01-01
Vol. 1, No. 4 (2011), pp. 25-28. ISSN 2045-3345. Grant - others: GA ČR(CZ) GA402/09/0557. Institutional research plan: CEZ:AV0Z10300504. Keywords: robust regression; heteroscedasticity; regression quantiles; diagnostics. Subject RIV: BB - Applied Statistics, Operational Research. http://www.researchjournals.co.uk/documents/Vol4/06%20Kalina.pdf
Regression methods for medical research
Tai, Bee Choo
2013-01-01
Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demand an understanding of more complex and sophisticated analytic procedures. The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
Forecasting with Dynamic Regression Models
Pankratz, Alan
2012-01-01
Single equation regression models, one of the most widely used tools in statistical forecasting, are examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the autocorrelation patterns of regression disturbance. It also includes six case studies.
Refractive regression after laser in situ keratomileusis.
Yan, Mabel K; Chang, John Sm; Chan, Tommy Cy
2018-04-26
Uncorrected refractive errors are a leading cause of visual impairment across the world. In today's society, laser in situ keratomileusis (LASIK) has become the most commonly performed surgical procedure to correct refractive errors. However, regression of the initially achieved refractive correction has been a widely observed phenomenon following LASIK since its inception more than two decades ago. Despite technological advances in laser refractive surgery and various proposed management strategies, post-LASIK regression is still frequently observed and has significant implications for the long-term visual performance and quality of life of patients. This review explores the mechanism of refractive regression after both myopic and hyperopic LASIK, predisposing risk factors and its clinical course. In addition, current preventative strategies and therapies are also reviewed. © 2018 Royal Australian and New Zealand College of Ophthalmologists.
Principal component regression for crop yield estimation
Suryanarayana, T M V
2016-01-01
This book highlights the estimation of crop yield in Central Gujarat, especially with regard to the development of Multiple Regression Models and Principal Component Regression (PCR) models using climatological parameters as independent variables and crop yield as a dependent variable. It subsequently compares the multiple linear regression (MLR) and PCR results, and discusses the significance of PCR for crop yield estimation. In this context, the book also covers Principal Component Analysis (PCA), a statistical procedure used to reduce a number of correlated variables into a smaller number of uncorrelated variables called principal components (PC). This book will be helpful to the students and researchers, starting their works on climate and agriculture, mainly focussing on estimation models. The flow of chapters takes the readers in a smooth path, in understanding climate and weather and impact of climate change, and gradually proceeds towards downscaling techniques and then finally towards development of ...
Is past life regression therapy ethical?
Andrade, Gabriel
2017-01-01
Past life regression therapy is used by some physicians in cases with some mental diseases. Anxiety disorders, mood disorders, and gender dysphoria have all been treated using life regression therapy by some doctors on the assumption that they reflect problems in past lives. Although it is not supported by psychiatric associations, few medical associations have actually condemned it as unethical. In this article, I argue that past life regression therapy is unethical for two basic reasons. First, it is not evidence-based. Past life regression is based on the reincarnation hypothesis, but this hypothesis is not supported by evidence, and in fact, it faces some insurmountable conceptual problems. If patients are not fully informed about these problems, they cannot provide an informed consent, and hence, the principle of autonomy is violated. Second, past life regression therapy has the great risk of implanting false memories in patients, and thus, causing significant harm. This is a violation of the principle of non-malfeasance, which is surely the most important principle in medical ethics.
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Producing The New Regressive Left
DEFF Research Database (Denmark)
Crone, Christine
members, this thesis investigates a growing political trend and ideological discourse in the Arab world that I have called The New Regressive Left. On the premise that a media outlet can function as a forum for ideology production, the thesis argues that an analysis of this material can help to trace...... the contexture of The New Regressive Left. If the first part of the thesis lays out the theoretical approach and draws the contextual framework, through an exploration of the surrounding Arab media-and ideoscapes, the second part is an analytical investigation of the discourse that permeates the programmes aired...... becomes clear from the analytical chapters is the emergence of the new cross-ideological alliance of The New Regressive Left. This emerging coalition between Shia Muslims, religious minorities, parts of the Arab Left, secular cultural producers, and the remnants of the political,strategic resistance...
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
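The three quantities reviewed in this tutorial can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' code: the function names and the toy data are assumptions, and the Spearman coefficient is computed here as the Pearson coefficient of average ranks, which makes it sensitive to monotone (not necessarily linear) association.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def simple_linear_regression(x, y):
    """Least-squares (intercept, slope) for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

# Hypothetical toy data: monotone but clearly nonlinear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(pearson(x, y), spearman(x, y))
print(simple_linear_regression(x, [2 * v + 1 for v in x]))
```

On the toy data the rank-based coefficient reaches 1 while the Pearson coefficient stays below 1, which is the contrast between the two measures that the abstract describes.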
Regression filter for signal resolution
International Nuclear Information System (INIS)
Matthes, W.
1975-01-01
The problem considered is that of resolving a measured pulse height spectrum of a material mixture, e.g. a gamma ray spectrum or a Raman spectrum, into a weighted sum of the spectra of the individual constituents. The model on which the analytical formulation is based is described. The problem reduces to that of a multiple linear regression. A stepwise linear regression procedure was constructed. The efficiency of this method was then tested by turning the procedure into a computer programme, which was used to unfold test spectra obtained by mixing some spectra from a library of arbitrarily chosen spectra and adding a noise component. (U.K.)
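The reduction to multiple linear regression can be illustrated with a minimal sketch for the special case of two constituents. It solves the 2x2 normal equations directly and is plain least squares, not the stepwise procedure of the report; the function name and the toy spectra are assumptions made for illustration.

```python
def unmix_two(measured, s1, s2):
    """Least-squares weights (w1, w2) such that measured ~ w1*s1 + w2*s2.

    Solves the normal equations (S^T S) w = S^T m for two library spectra.
    """
    a11 = sum(u * u for u in s1)
    a12 = sum(u * v for u, v in zip(s1, s2))
    a22 = sum(v * v for v in s2)
    b1 = sum(u * m for u, m in zip(s1, measured))
    b2 = sum(v * m for v, m in zip(s2, measured))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("library spectra are collinear; weights not unique")
    w1 = (b1 * a22 - b2 * a12) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return w1, w2

# Hypothetical library spectra (counts per channel) and a noise-free mixture.
s1 = [1.0, 0.0, 2.0, 0.0]
s2 = [0.0, 3.0, 1.0, 1.0]
mixture = [2.0 * u + 0.5 * v for u, v in zip(s1, s2)]
print(unmix_two(mixture, s1, s2))
```

With measurement noise added to the mixture, the recovered weights would only approximate the true ones; a stepwise procedure would additionally decide which library spectra to include.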
Predicting significant torso trauma.
Nirula, Ram; Talmor, Daniel; Brasel, Karen
2005-07-01
Identification of motor vehicle crash (MVC) characteristics associated with thoracoabdominal injury would advance the development of automatic crash notification systems (ACNS) by improving triage and response times. Our objective was to determine the relationships between MVC characteristics and thoracoabdominal trauma to develop a torso injury probability model. Drivers involved in crashes from 1993 to 2001 within the National Automotive Sampling System were reviewed. Relationships between torso injury and MVC characteristics were assessed using multivariate logistic regression. Receiver operating characteristic curves were used to compare the model to current ACNS models. There were a total of 56,466 drivers. Age, ejection, braking, avoidance, velocity, restraints, passenger-side impact, rollover, and vehicle weight and type were associated with injury (p < 0.05). The area under the receiver operating characteristic curve (83.9) was significantly greater than current ACNS models. We have developed a thoracoabdominal injury probability model that may improve patient triage when used with ACNS.
Cactus: An Introduction to Regression
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
Regression Models for Repairable Systems
Czech Academy of Sciences Publication Activity Database
Novák, Petr
2015-01-01
Roč. 17, č. 4 (2015), s. 963-972 ISSN 1387-5841 Institutional support: RVO:67985556 Keywords : Reliability analysis * Repair models * Regression Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.782, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/novak-0450902.pdf
Survival analysis II: Cox regression
Stel, Vianda S.; Dekker, Friedo W.; Tripepi, Giovanni; Zoccali, Carmine; Jager, Kitty J.
2011-01-01
In contrast to the Kaplan-Meier method, Cox proportional hazards regression can provide an effect estimate by quantifying the difference in survival between patient groups and can adjust for confounding effects of other variables. The purpose of this article is to explain the basic concepts of the
Kernel regression with functional response
Ferraty, Frédéric; Laksaci, Ali; Tadj, Amel; Vieu, Philippe
2011-01-01
We consider kernel regression estimate when both the response variable and the explanatory one are functional. The rates of uniform almost complete convergence are stated as function of the small ball probability of the predictor and as function of the entropy of the set on which uniformity is obtained.
The prostate health index selectively identifies clinically significant prostate cancer
S. Loeb (Stacy); M.G. Sanda (Martin G.); D.L. Broyles (Dennis L.); S.S. Shin (Sanghyuk S.); C.H. Bangma (Chris); J.T. Wei (John T.); A.W. Partin (Alan W.); G.G. Klee (George); K.M. Slawin (Kevin M.); L.S. Marks (Leonard S.); R.H.N. van Schaik (Ron); D.W. Chan (Daniel); L. Sokoll (Lori); A.B. Cruz (Amabelle B.); I.A. Mizrahi (Isaac A.); W.J. Catalona (William)
2015-01-01
Purpose: The Prostate Health Index (phi) is a new test combining total, free and [-2]proPSA into a single score. It was recently approved by the FDA and is now commercially available in the U.S., Europe and Australia. We investigate whether phi improves specificity for detecting
Testing hypotheses for differences between linear regression lines
Stanley J. Zarnoch
2009-01-01
Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...
Functional data analysis of generalized regression quantiles
Guo, Mengmeng; Zhou, Lan; Huang, Jianhua Z.; Härdle, Wolfgang Karl
2013-11-05
Generalized regression quantiles, including the conditional quantiles and expectiles as special cases, are useful alternatives to the conditional means for characterizing a conditional distribution, especially when the interest lies in the tails. We develop a functional data analysis approach to jointly estimate a family of generalized regression quantiles. Our approach assumes that the generalized regression quantiles share some common features that can be summarized by a small number of principal component functions. The principal component functions are modeled as splines and are estimated by minimizing a penalized asymmetric loss measure. An iterative least asymmetrically weighted squares algorithm is developed for computation. While separate estimation of individual generalized regression quantiles usually suffers from large variability due to lack of sufficient data, by borrowing strength across data sets, our joint estimation approach significantly improves the estimation efficiency, which is demonstrated in a simulation study. The proposed method is applied to data from 159 weather stations in China to obtain the generalized quantile curves of the volatility of the temperature at these stations. © 2013 Springer Science+Business Media New York.
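The idea of iterative least asymmetrically weighted squares can be shown in its simplest setting, the expectile of a single sample with no covariates. The sketch below is an assumption-laden illustration rather than the paper's algorithm (which estimates spline-based principal component functions jointly): the function name, tolerance, and data are hypothetical.

```python
def expectile(values, tau, tol=1e-12, max_iter=1000):
    """Sample tau-expectile by iteratively reweighted averaging.

    Observations above the current estimate get weight tau, the others
    weight 1 - tau; the estimate is then the weighted mean, repeated
    until it stops changing.
    """
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    m = sum(values) / len(values)
    for _ in range(max_iter):
        weights = [tau if v > m else 1 - tau for v in values]
        new_m = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m

data = [0.0, 1.0, 2.0, 3.0, 10.0]  # hypothetical sample
for tau in (0.1, 0.5, 0.9):
    print(tau, expectile(data, tau))
```

At tau = 0.5 the weights are symmetric and the expectile coincides with the sample mean; moving tau toward 0 or 1 shifts the estimate into the lower or upper tail.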
Assessing risk factors for periodontitis using regression
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we can assess the relevance, as risk factors for periodontitis, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking Status and Plaque Index. The multiple linear regression model was built to evaluate the influence of the IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues, defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
Quantile Regression With Measurement Error
Wei, Ying
2009-08-27
Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. © 2009 American Statistical Association.
Multivariate and semiparametric kernel regression
Härdle, Wolfgang; Müller, Marlene
1997-01-01
The paper gives an introduction to theory and application of multivariate and semiparametric kernel smoothing. Multivariate nonparametric density estimation is an often used pilot tool for examining the structure of data. Regression smoothing helps in investigating the association between covariates and responses. We concentrate on kernel smoothing using local polynomial fitting which includes the Nadaraya-Watson estimator. Some theory on the asymptotic behavior and bandwidth selection is pro...
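The Nadaraya-Watson estimator mentioned here is the local constant case of local polynomial fitting: the prediction at a point is a kernel-weighted average of the observed responses. A minimal univariate sketch follows; the Gaussian kernel, the function name, and the toy data are assumptions chosen for illustration.

```python
import math

def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys at x0, using a Gaussian kernel.

    Note: if x0 lies very far from every xs value relative to the
    bandwidth, all weights underflow to zero and the ratio is undefined.
    """
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 2.0]  # hypothetical observations
print(nadaraya_watson(1.0, xs, ys, bandwidth=0.5))
print(nadaraya_watson(0.0, xs, ys, bandwidth=0.1))
```

A small bandwidth makes the estimate follow the nearest observations closely, while a large one pulls it toward the overall mean, which is why bandwidth selection receives separate attention in the paper.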
Regression algorithm for emotion detection
Berthelon, Franck; Sander, Peter
2013-01-01
We present here two components of a computational system for emotion detection. PEMs (Personalized Emotion Maps) store links between bodily expressions and emotion values, and are individually calibrated to capture each person's emotion profile. They are an implementation based on aspects of Scherer's theoretical complex system model of emotion [scherer00, scherer09]. We also present a regression algorithm that determines a person's emotional feeling from sensor m...
Directional quantile regression in R
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2017-01-01
Roč. 53, č. 3 (2017), s. 480-492 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : multivariate quantile * regression quantile * halfspace depth * depth contour Subject RIV: BD - Theory of Information OBOR OECD: Applied mathematics Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2017/SI/bocek-0476587.pdf
Polylinear regression analysis in radiochemistry
International Nuclear Information System (INIS)
Kopyrin, A.A.; Terent'eva, T.N.; Khramov, N.N.
1995-01-01
A number of radiochemical problems have been formulated in the framework of polylinear regression analysis, which permits the use of conventional mathematical methods for their solution. The authors have considered features of the use of polylinear regression analysis for estimating the contributions of various sources to atmospheric pollution, for studying irradiated nuclear fuel, for estimating concentrations from spectral data, for measuring neutron fields of a nuclear reactor, for estimating crystal lattice parameters from X-ray diffraction patterns, for interpreting data of X-ray fluorescence analysis, for estimating complex formation constants, and for analyzing results of radiometric measurements. The problem of estimating the target parameters can be ill-posed for certain properties of the system under study. The authors showed the possibility of regularization by adding a fictitious set of data "obtained" from the orthogonal design. To estimate only a part of the parameters under consideration, the authors used incomplete rank models. In this case, it is necessary to take into account the possibility of confounding estimates. An algorithm for evaluating the degree of confounding is presented, which is realized using standard software for regression analysis.
Computing group cardinality constraint solutions for logistic regression problems.
Zhang, Yong; Kwon, Dongjin; Pohl, Kilian M
2017-01-01
We derive an algorithm to directly solve logistic regression under a group cardinality (group sparsity) constraint and use it to classify intra-subject MRI sequences (e.g. cine MRIs) of healthy versus diseased subjects. Group cardinality constraint models are often applied to medical images in order to avoid overfitting of the classifier to the training data. Solutions within these models are generally determined by relaxing the cardinality constraint to a weighted feature selection scheme. However, these solutions relate to the original sparse problem only under specific assumptions, which generally do not hold for medical image applications. In addition, inferring clinical meaning from features weighted by a classifier is an ongoing topic of discussion. Avoiding weighting features, we propose to directly solve the group cardinality constrained logistic regression problem by generalizing the Penalty Decomposition method. To do so, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. We model this assumption by combining the series of measurements created by a feature across time into a single group. Our algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcing the group sparsity constraint. The minimum of the smooth and convex logistic regression problem is determined via gradient descent, while we derive a closed-form solution for finding a sparse approximation of that minimum. We apply our method to cine MRI of 38 healthy controls and 44 adult patients who received reconstructive surgery for Tetralogy of Fallot (TOF) during infancy. Our method correctly identifies regions impacted by TOF and generally obtains statistically significantly higher classification accuracy than alternative solutions to this model, i.e., ones relaxing group cardinality constraints. Copyright © 2016 Elsevier B.V. All rights reserved.
Gaussian Process Regression Model in Spatial Logistic Regression
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.
Modeling oil production based on symbolic regression
International Nuclear Information System (INIS)
Yang, Guangfei; Li, Xianneng; Wang, Jianliang; Lian, Lian; Ma, Tieju
2015-01-01
Numerous models have been proposed to forecast the future trends of oil production and almost all of them are based on some predefined assumptions with various uncertainties. In this study, we propose a novel data-driven approach that uses symbolic regression to model oil production. We validate our approach on both synthetic and real data, and the results prove that symbolic regression could effectively identify the true models beneath the oil production data and also make reliable predictions. Symbolic regression indicates that world oil production will peak in 2021, which broadly agrees with other techniques used by researchers. Our results also show that the rate of decline after the peak is almost half the rate of increase before the peak, and it takes nearly 12 years to drop 4% from the peak. These predictions are more optimistic than those in several other reports, and the smoother decline will provide the world, especially the developing countries, with more time to orchestrate mitigation plans. Highlights: • A data-driven approach has been shown to be effective at modeling oil production. • The Hubbert model could be discovered automatically from data. • The peak of world oil production is predicted to appear in 2021. • The decline rate after the peak is half of the increase rate before the peak. • Oil production is projected to decline 4% post-peak.
Spontaneous regression of pulmonary bullae
International Nuclear Information System (INIS)
Satoh, H.; Ishikawa, H.; Ohtsuka, M.; Sekizawa, K.
2002-01-01
The natural history of pulmonary bullae is often characterized by gradual, progressive enlargement. Spontaneous regression of bullae is, however, very rare. We report a case in which complete resolution of pulmonary bullae in the left upper lung occurred spontaneously. The management of pulmonary bullae is occasionally made difficult because of gradual progressive enlargement associated with abnormal pulmonary function. Some patients have multiple bullae in both lungs and/or have a history of pulmonary emphysema. Others have a giant bulla without emphysematous change in the lungs. Our present patient had been treated for lung cancer, with no evidence of local recurrence. He had no emphysematous change on lung function testing and had no complaints, although the high resolution CT scan showed evidence of underlying minimal changes of emphysema. Ortin and Gurney presented three cases of spontaneous reduction in the size of bullae. Interestingly, one of them had a marked decrease in the size of a bulla in association with thickening of the wall of the bulla, which was also observed in our patient. The case we describe is of interest, not only because of the rarity with which regression of pulmonary bullae has been reported in the literature, but also because of the spontaneous improvement in the radiological picture in the absence of overt infection or tumor. Copyright (2002) Blackwell Science Pty Ltd
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Parameter identifiability and redundancy: theoretical considerations.
Directory of Open Access Journals (Sweden)
Mark P Little
Full Text Available BACKGROUND: Models for complex biological systems may involve a large number of parameters. It may well be that some of these parameters cannot be derived from observed data via regression techniques. Such parameters are said to be unidentifiable, the remaining parameters being identifiable. Closely related to this idea is that of redundancy, that a set of parameters can be expressed in terms of some smaller set. Before data is analysed it is critical to determine which model parameters are identifiable or redundant to avoid ill-defined and poorly convergent regression. METHODOLOGY/PRINCIPAL FINDINGS: In this paper we outline general considerations on parameter identifiability, and introduce the notion of weak local identifiability and gradient weak local identifiability. These are based on local properties of the likelihood, in particular the rank of the Hessian matrix. We relate these to the notions of parameter identifiability and redundancy previously introduced by Rothenberg (Econometrica 39 (1971) 577-591) and Catchpole and Morgan (Biometrika 84 (1997) 187-196). Within the widely used exponential family, parameter irredundancy, local identifiability, gradient weak local identifiability and weak local identifiability are shown to be largely equivalent. We consider applications to a recently developed class of cancer models of Little and Wright (Math Biosciences 183 (2003) 111-134) and Little et al. (J Theoret Biol 254 (2008) 229-238) that generalize a large number of other recently used quasi-biological cancer models. CONCLUSIONS/SIGNIFICANCE: We have shown that the previously developed concepts of parameter local identifiability and redundancy are closely related to the apparently weaker properties of weak local identifiability and gradient weak local identifiability; within the widely used exponential family these concepts largely coincide.
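The role of the Hessian rank can be illustrated on a toy regression in which only the product of two parameters enters the mean. The sketch below is a simplification under stated assumptions, not the paper's procedure: it assumes Gaussian errors with known variance, so that the expected Hessian of the negative log-likelihood is proportional to J^T J (J being the Jacobian of the mean with respect to the parameters), and all names and values are hypothetical.

```python
def gram(jacobian):
    """J^T J for a Jacobian given as a list of rows."""
    p = len(jacobian[0])
    return [[sum(row[i] * row[j] for row in jacobian) for j in range(p)]
            for i in range(p)]

def matrix_rank(matrix, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    rows, cols = len(a), len(a[0])
    rank = 0
    for c in range(cols):
        if rank == rows:
            break
        pivot = max(range(rank, rows), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < tol:
            continue  # no usable pivot in this column
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, rows):
            factor = a[r][c] / a[rank][c]
            for k in range(c, cols):
                a[r][k] -= factor * a[rank][k]
        rank += 1
    return rank

x = [1.0, 2.0, 3.0]
a_par, b_par = 2.0, 3.0  # hypothetical parameter values

# Model 1: mean = a * b * x, so d/da = b*x and d/db = a*x (columns collinear).
jac_product = [[b_par * xi, a_par * xi] for xi in x]
# Model 2: mean = a + b * x, so d/da = 1 and d/db = x.
jac_linear = [[1.0, xi] for xi in x]

print(matrix_rank(gram(jac_product)))  # rank deficient: only a*b is identifiable
print(matrix_rank(gram(jac_linear)))   # full rank: both parameters identifiable
```

A rank below the number of parameters signals that some combination of parameters cannot be recovered from the data, which is the situation the abstract warns leads to ill-defined and poorly convergent regression.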
Satellite rainfall retrieval by logistic regression
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistic model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistic model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistic model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistic model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
Geographically weighted regression model on poverty indicator
Slamet, I.; Nugroho, N. F. T. A.; Muslich
2017-12-01
In this research, we applied geographically weighted regression (GWR) to analyze poverty in Central Java, using a Gaussian kernel as the weighting function. GWR uses the diagonal matrix obtained from the Gaussian kernel function as the weight matrix in the regression model. The kernel weights are used to handle spatial effects in the data so that a model can be obtained for each location. The purpose of this paper is to model poverty percentage data in Central Java province using GWR with a Gaussian kernel weighting function and to determine the influencing factors in each regency/city in the province. Based on the research, we obtained a GWR model with a Gaussian kernel weighting function for the poverty percentage data in Central Java province. We found that the percentage of the population working as farmers, the population growth rate, the percentage of households with regular sanitation, and BPJS beneficiaries are the variables that affect the percentage of poverty in Central Java province. The coefficient of determination R2 is 68.64%. The districts fall into two categories, each influenced by a different set of significant factors.
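The mechanics of GWR with a Gaussian kernel can be sketched for a single predictor: at each target location, observations are weighted by exp(-0.5 (d/h)^2), where d is the distance to the target and h the bandwidth, and a weighted least-squares line is fitted. The function name, coordinates, and data below are hypothetical, and the study itself uses several predictors and real regency-level data.

```python
import math

def gwr_fit_at(target, coords, x, y, bandwidth):
    """Local (intercept, slope) at `target` by Gaussian-weighted least squares."""
    weights = [
        math.exp(-0.5 * (math.hypot(target[0] - cx, target[1] - cy) / bandwidth) ** 2)
        for cx, cy in coords
    ]
    total = sum(weights)
    mx = sum(w * xi for w, xi in zip(weights, x)) / total
    my = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - mx) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mx) * (yi - my) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data: a western cluster where y = x and an eastern one where y = 3x.
coords = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 6.0, 9.0]

print(gwr_fit_at((0, 0), coords, x, y, bandwidth=1.0))
print(gwr_fit_at((10, 0), coords, x, y, bandwidth=1.0))
```

Because distant observations receive almost no weight, each location gets its own coefficients, which is how GWR can report different significant factors for different districts.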
On Weighted Support Vector Regression
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... shrinks the coefficient of each observation in the estimated functions; thus, it is widely used for minimizing influence of outliers. We propose to additionally add weights to the slack variables in the constraints (CF‐weights) and call the combination of weights the doubly weighted SVR. We illustrate...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...
Multinomial logistic regression in workers' health
Grilo, Luís M.; Grilo, Helena L.; Gonçalves, Sónia P.; Junça, Ana
2017-11-01
In European countries, namely in Portugal, it is common to hear some people mentioning that they are exposed to excessive and continuous psychosocial stressors at work. This is increasing in diverse activity sectors, such as the Services sector. A representative sample was collected from a Portuguese Services organization by applying an internationally validated survey, whose variables were measured in five ordered categories on a Likert-type scale. A multinomial logistic regression model is used to estimate the probability of each category of the dependent variable, general health perception, where, among other independent variables, burnout appears as statistically significant.
Regression Model to Predict Global Solar Irradiance in Malaysia
Directory of Open Access Journals (Sweden)
Hairuniza Ahmed Kutty
2015-01-01
Full Text Available A novel regression model is developed to estimate the monthly global solar irradiance in Malaysia. The model is developed based on different available meteorological parameters, including temperature, cloud cover, rain precipitate, relative humidity, wind speed, pressure, and gust speed, by implementing regression analysis. This paper reports on the details of the analysis of the effect of each prediction parameter to identify the parameters that are relevant to estimating global solar irradiance. In addition, the proposed model is compared in terms of the root mean square error (RMSE), mean bias error (MBE), and the coefficient of determination (R2) with other models available from literature studies. Seven models based on single parameters (PM1 to PM7) and five multiple-parameter models (PM7 to PM12) are proposed. The new models perform well, with RMSE ranging from 0.429% to 1.774%, R2 ranging from 0.942 to 0.992, and MBE ranging from −0.1571% to 0.6025%. In general, cloud cover significantly affects the estimation of global solar irradiance. However, cloud cover in Malaysia lacks sufficient influence when included into multiple-parameter models although it performs fairly well in single-parameter prediction models.
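The three comparison metrics named in the abstract can be written down in a few lines. This is a hedged sketch: the abstract reports RMSE and MBE as percentages, whereas the functions below return absolute values, the sign convention for MBE (predicted minus observed) is an assumption, and the sample numbers are invented.

```python
import math

def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

def mbe(observed, predicted):
    """Mean bias error; positive values mean overestimation on average."""
    n = len(observed)
    return sum(p - o for o, p in zip(observed, predicted)) / n

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

observed = [1.0, 2.0, 3.0, 4.0]   # hypothetical measurements
predicted = [1.5, 2.5, 3.5, 4.5]  # hypothetical model output
print(rmse(observed, predicted), mbe(observed, predicted),
      r_squared(observed, predicted))
```

A model with a constant offset, as in this toy case, has equal RMSE and MBE magnitudes; a gap between the two indicates scatter rather than systematic bias.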
Credit Scoring Problem Based on Regression Analysis
Khassawneh, Bashar Suhil Jad Allah
2014-01-01
ABSTRACT: This thesis provides an explanatory introduction to the regression models of data mining and contains basic definitions of key terms in the linear, multiple and logistic regression models. Meanwhile, the aim of this study is to illustrate fitting models for the credit scoring problem using simple linear, multiple linear and logistic regression models and also to analyze the found model functions by statistical tools. Keywords: Data mining, linear regression, logistic regression....
Regularized Label Relaxation Linear Regression.
Fang, Xiaozhao; Xu, Yong; Li, Xuelong; Lai, Zhihui; Wong, Wai Keung; Fang, Bingwu
2018-04-01
Linear regression (LR) and some of its variants have been widely used for classification problems. Most of these methods assume that during the learning phase, the training samples can be exactly transformed into a strict binary label matrix, which has too little freedom to fit the labels adequately. To address this problem, in this paper, we propose a novel regularized label relaxation LR method, which has the following notable characteristics. First, the proposed method relaxes the strict binary label matrix into a slack variable matrix by introducing a nonnegative label relaxation matrix into LR, which provides more freedom to fit the labels and simultaneously enlarges the margins between different classes as much as possible. Second, the proposed method constructs the class compactness graph based on manifold learning and uses it as the regularization item to avoid the problem of overfitting. The class compactness graph is used to ensure that the samples sharing the same labels can be kept close after they are transformed. Two different algorithms, which are, respectively, based on -norm and -norm loss functions are devised. These two algorithms have compact closed-form solutions in each iteration so that they are easily implemented. Extensive experiments show that these two algorithms outperform the state-of-the-art algorithms in terms of the classification accuracy and running time.
Poisson Mixture Regression Models for Heart Disease Prediction
Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model, due to its low Bayesian Information Criterion value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart disease prediction over all models, as it both clusters individuals into a high or low risk category and predicts the rate of heart disease componentwise given the clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using a Poisson mixture regression model. PMID:27999611
Mapping geogenic radon potential by regression kriging
Energy Technology Data Exchange (ETDEWEB)
Pásztor, László [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Szabó, Katalin Zsuzsanna, E-mail: sz_k_zs@yahoo.de [Department of Chemistry, Institute of Environmental Science, Szent István University, Páter Károly u. 1, Gödöllő 2100 (Hungary); Szatmári, Gábor; Laborczi, Annamária [Institute for Soil Sciences and Agricultural Chemistry, Centre for Agricultural Research, Hungarian Academy of Sciences, Department of Environmental Informatics, Herman Ottó út 15, 1022 Budapest (Hungary); Horváth, Ákos [Department of Atomic Physics, Eötvös University, Pázmány Péter sétány 1/A, 1117 Budapest (Hungary)
2016-02-15
Radon (²²²Rn) gas is produced in the radioactive decay chain of uranium (²³⁸U), an element that is naturally present in soils. Radon is transported through the soil mainly by diffusion and convection, depending mainly on the physical and meteorological parameters of the soil, and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by geogenic radon potential (GRP). Identification of areas with high health risks requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central Hungary, spatial auxiliary information representing GRP-forming environmental factors was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were sought to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, and they are interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore, the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly.
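The two-part structure of regression kriging described above (a regression trend plus kriged residuals, summed) can be sketched as follows. This is a minimal illustration and not the authors' workflow: it assumes a single auxiliary covariate, simple kriging of the residuals, and an exponential covariance with a fixed sill and range chosen by hand rather than fitted to a variogram. All function names, parameters and data values are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_trend(cov, z):
    """Part 1: least-squares trend z = a + b * cov (the deterministic component)."""
    n = len(z)
    mc, mz = sum(cov) / n, sum(z) / n
    b = sum((c - mc) * (v - mz) for c, v in zip(cov, z)) / sum((c - mc) ** 2 for c in cov)
    return mz - b * mc, b

def exp_cov(h, sill=1.0, corr_range=2.0):
    """Assumed exponential covariance model (sill and range are not fitted here)."""
    return sill * math.exp(-h / corr_range)

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def regression_kriging(points, cov, z, new_point, new_cov):
    a, b = fit_trend(cov, z)
    residuals = [v - (a + b * c) for c, v in zip(cov, z)]
    # Part 2: simple kriging of the regression residuals (the stochastic component).
    k_matrix = [[exp_cov(distance(p, q)) for q in points] for p in points]
    k_vector = [exp_cov(distance(p, new_point)) for p in points]
    weights = solve(k_matrix, k_vector)
    kriged_residual = sum(w * r for w, r in zip(weights, residuals))
    # Final prediction: sum of the two component predictions.
    return a + b * new_cov + kriged_residual

# Hypothetical measurement sites (x, y), covariate values and measured target values.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
covariate = [1.0, 2.0, 1.5, 3.0]
measured = [10.0, 14.0, 11.0, 18.0]
prediction = regression_kriging(sites, covariate, measured, (1.0, 1.0), 2.2)
```

Because no nugget effect is included in this sketch, the predictor reproduces the measured value when asked to predict at a sampled site; a real application would fit the variogram to the residuals and would typically add the cross-validation step mentioned in the abstract.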
Entrepreneurial intention modeling using hierarchical multiple regression
Directory of Open Access Journals (Sweden)
Marina Jeger
2014-12-01
The goal of this study is to identify the contribution of effectuation dimensions to the predictive power of the entrepreneurial intention model over and above that which can be accounted for by other predictors selected and confirmed in previous studies. As is often the case in social and behavioral studies, some variables are likely to be highly correlated with each other. Therefore, the relative amount of variance in the criterion variable explained by each of the predictors depends on several factors such as the order of variable entry and sample specifics. The results show the modest predictive power of two dimensions of effectuation prior to the introduction of the theory of planned behavior elements. The article highlights the main advantages of applying hierarchical regression in social sciences as well as in the specific context of entrepreneurial intention formation, and addresses some of the potential pitfalls that this type of analysis entails.
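The core quantity in hierarchical multiple regression is the change in R² when a further block of predictors is entered after an earlier block. The sketch below illustrates that computation only; the variable names, the order of entry and the data are hypothetical and are not taken from the study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(predictors, y):
    """R² of an ordinary least-squares fit of y on the predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + predictors
    p = len(cols)
    xtx = [[sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(p)] for i in range(p)]
    xty = [sum(u * v for u, v in zip(cols[i], y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[j] * cols[j][i] for j in range(p)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical survey scores for eight respondents.
attitude = [3.0, 4.0, 2.0, 5.0, 4.0, 3.0, 5.0, 2.0]
control = [2.0, 4.0, 3.0, 5.0, 3.0, 2.0, 4.0, 3.0]
effectuation = [1.0, 3.0, 2.0, 4.0, 5.0, 2.0, 3.0, 1.0]
intention = [2.5, 4.0, 2.4, 5.1, 4.4, 2.6, 4.6, 2.1]

r2_step1 = r_squared([attitude, control], intention)                # block 1 only
r2_step2 = r_squared([attitude, control, effectuation], intention)  # block 1 + block 2
delta_r2 = r2_step2 - r2_step1  # variance explained over and above block 1
```

Reversing the order of entry generally changes how much variance is credited to each block when the predictors are correlated, which is the order-of-entry dependence the abstract points out.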
Multiple Regression Analysis of mRNA-miRNA Associations in Colorectal Cancer Pathway
Wang, Fengfeng; Wong, S. C. Cesar; Chan, Lawrence W. C.; Cho, William C. S.; Yip, S. P.; Yung, Benjamin Y. M.
2014-01-01
Background. MicroRNA (miRNA) is a short and endogenous RNA molecule that regulates posttranscriptional gene expression. It is an important factor for tumorigenesis of colorectal cancer (CRC), and a potential biomarker for diagnosis, prognosis, and therapy of CRC. Our objective is to identify the related miRNAs and their associations with genes frequently involved in CRC microsatellite instability (MSI) and chromosomal instability (CIN) signaling pathways. Results. A regression model was adopted to identify the significantly associated miRNAs targeting a set of candidate genes frequently involved in colorectal cancer MSI and CIN pathways. Multiple linear regression analysis was used to construct the model and find the significant mRNA-miRNA associations. We identified three significantly associated mRNA-miRNA pairs: BCL2 was positively associated with miR-16 and SMAD4 was positively associated with miR-567 in the CRC tissue, while MSH6 was positively associated with miR-142-5p in the normal tissue. As for the whole model, BCL2 and SMAD4 models were not significant, and MSH6 model was significant. The significant associations were different in the normal and the CRC tissues. Conclusion. Our results have laid down a solid foundation in exploration of novel CRC mechanisms, and identification of miRNA roles as oncomirs or tumor suppressor mirs in CRC. PMID:24895601
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces the indices used to diagnose multicollinearity, the basic principle of principal component regression, and the method for determining the 'best' equation. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0, including all calculating processes of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. Carrying out the principal component regression analysis with SPSS makes the statistical work simpler, faster and accurate.
Spontaneous regression of retinopathy of prematurity:incidence and predictive factors
Directory of Open Access Journals (Sweden)
Rui-Hong Ju
2013-08-01
AIM: To evaluate the incidence of spontaneous regression of changes in the retina and vitreous in the active stage of retinopathy of prematurity (ROP) and to identify possible related factors during the regression. METHODS: This was a retrospective, hospital-based study. The study consisted of 39 premature infants with mild ROP who showed spontaneous regression (Group A) and 17 with severe ROP who had been treated before naturally involuting (Group B), from August 2008 through May 2011. Data on gender, single or multiple pregnancy, gestational age (GA), birth weight, weight gain from birth to the sixth week of life, use of oxygen in mechanical ventilation, total duration of oxygen inhalation, whether surfactant was given, need for and number of blood transfusions, 1-, 5- and 10-min Apgar scores, presence of bacterial, fungal or combined infection, hyaline membrane disease (HMD), patent ductus arteriosus (PDA), duration of stay in the neonatal intensive care unit (NICU) and duration of ROP were recorded. RESULTS: The incidence of spontaneous regression of ROP was 86.7% for stage 1, 57.1% for stage 2 and 5.9% for stage 3. Regression was detected in 100% of cases with changes in zone III, 46.2% in zone II and 0% in zone I. The mean duration of ROP in the spontaneous regression group was 5.65±3.14 weeks, shorter than that of the treated ROP group (7.34±4.33 weeks), but this difference was not statistically significant (P=0.201). GA, 1-min Apgar score, 5-min Apgar score, duration of NICU stay, postnatal age at initial screening and oxygen therapy longer than 10 days were significant predictive factors for the spontaneous regression of ROP (P<0.05). Retinal hemorrhage was the only independent predictive factor for the spontaneous regression of ROP (OR 0.030, 95% CI 0.001-0.775, P=0.035). CONCLUSION: This study showed that most stage 1 and 2 ROP and changes in zone III can regress spontaneously in the end. Retinal hemorrhage is weakly inversely associated with the spontaneous regression.
Unbalanced Regressions and the Predictive Equation
DEFF Research Database (Denmark)
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoreti...
Semiparametric regression during 2003–2007
Ruppert, David; Wand, M.P.; Carroll, Raymond J.
2009-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.
Gaussian process regression analysis for functional data
Shi, Jian Qing
2011-01-01
Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically the methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables. Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high dime
Regression Analysis by Example. 5th Edition
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Standards for Standardized Logistic Regression Coefficients
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
A Seemingly Unrelated Poisson Regression Model
King, Gary
1989-01-01
This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.
International Nuclear Information System (INIS)
Janssen, I.; Stebbings, J.H.
1990-01-01
In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and ∼200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs
Free Software Development. 1. Fitting Statistical Regressions
Directory of Open Access Journals (Sweden)
Lorentz JÄNTSCHI
2002-12-01
The present paper focuses on modeling of statistical data processing with applications in the field of materials science and engineering. A new method of data processing is presented and applied to a set of 10 Ni–Mn–Ga ferromagnetic ordered shape memory alloys that are known to exhibit phonon softening and soft mode condensation into a premartensitic phase prior to the martensitic transformation itself. The method makes it possible to identify the correlations between data sets and to exploit them later in the statistical study of the alloys. An algorithm for computing the data was implemented in a preprocessed hypertext language (PHP), a hypertext markup language interface for it was also realized and placed on the comp.east.utcluj.ro educational web server, and it is accessible via the HTTP protocol at the address http://vl.academicdirect.ro/applied_statistics/linear_regression/multiple/v1.5/. Running the program for the set of alloys makes it possible to identify groups of alloy properties and gives a qualitative measure of the correlations between properties. Surfaces of property dependencies are also fitted.
Regression with Sparse Approximations of Data
DEFF Research Database (Denmark)
Noorzad, Pardis; Sturm, Bob L.
2012-01-01
We propose sparse approximation weighted regression (SPARROW), a method for local estimation of the regression function that uses sparse approximation with a dictionary of measurements. SPARROW estimates the regression function at a point with a linear combination of a few regressands selected by a sparse approximation of the point in terms of the regressors. We show SPARROW can be considered a variant of k-nearest neighbors regression (k-NNR), and more generally, local polynomial kernel regression. Unlike k-NNR, however, SPARROW can adapt the number of regressors to use based...
Spontaneous regression of a congenital melanocytic nevus
Directory of Open Access Journals (Sweden)
Amiya Kumar Nath
2011-01-01
Congenital melanocytic nevus (CMN) may rarely regress, and the regression may also be associated with a halo or vitiligo. We describe a 10-year-old girl who presented with a CMN on the left leg since birth, which recently started to regress spontaneously with associated depigmentation in the lesion and at a distant site. Dermoscopy performed at different sites of the regressing lesion demonstrated loss of epidermal pigments first, followed by loss of dermal pigments. Histopathology and Masson-Fontana stain demonstrated lymphocytic infiltration and loss of pigment production in the regressing area. Immunohistochemistry staining (S100 and HMB-45), however, showed that nevus cells were present in the regressing areas.
Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald
2006-11-01
We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies.
Vatcheva, Kristina P; Lee, MinJae; McCormick, Joseph B; Rahbar, Mohammad H
2016-04-01
The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013, illustrated the need for a greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epidemiologic studies. We used simulated datasets and real life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in the regression analysis and encourage researchers to consider the diagnostic for multicollinearity as one of the steps in regression analysis.
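One widely used diagnostic for multicollinearity is the variance inflation factor (VIF). The abstract does not say which diagnostic the authors applied, so the sketch below is only an illustration of the idea: it uses the identity that the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix. The function names and data are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long predictor columns."""
    n = len(columns[0])
    centered = [[v - sum(c) / n for v in c] for c in columns]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    p = len(columns)
    return [[sum(u * v for u, v in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]

def vif(columns):
    """VIF of each predictor: the j-th diagonal entry of the inverse correlation matrix."""
    corr = correlation_matrix(columns)
    p = len(columns)
    result = []
    for j in range(p):
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        result.append(solve(corr, unit)[j])
    return result

# Hypothetical predictors: x2 nearly duplicates x1, x3 is uncorrelated with both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
x3 = [4.0, 1.0, 3.0, 5.0, 2.0]
vifs = vif([x1, x2, x3])
```

A VIF close to 1 indicates that a predictor is nearly uncorrelated with the others, while values above roughly 10 are a common rule-of-thumb signal of problematic collinearity; the threshold itself is a convention, not a result from this paper.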
A Bayesian Nonparametric Causal Model for Regression Discontinuity Designs
Karabatsos, George; Walker, Stephen G.
2013-01-01
The regression discontinuity (RD) design (Thistlewaite & Campbell, 1960; Cook, 2008) provides a framework to identify and estimate causal effects from a non-randomized design. Each subject of a RD design is assigned to the treatment (versus assignment to a non-treatment) whenever her/his observed value of the assignment variable equals or…
Directory of Open Access Journals (Sweden)
Filip Kokotovic
2016-06-01
The study of the relevance of human capital to economic growth is becoming increasingly important, taking into account its relevance in many of the Sustainable Development Goals proposed by the UN. This paper conducted a panel regression analysis of selected SE European countries and Scandinavian countries using the Granger causality test and pooled panel regression. In order to test the relevance of human capital to economic growth, several human capital proxy variables were identified. Aside from the human capital proxy variables, other explanatory variables were selected using stepwise regression, while the dependent variable was GDP. This paper concludes that there are significant structural differences in the economies of the two observed panels. Of the human capital proxy variables observed, for the panel of SE European countries only life expectancy was statistically significant and it had a negative impact on economic growth, while in the panel of Scandinavian countries total public expenditure on education had a statistically significant positive effect on economic growth. Based upon these results and existing studies, this paper concludes that human capital has a far more significant impact on economic growth in more developed economies.
Applied regression analysis a research tool
Pantula, Sastry; Dickey, David
1998-01-01
Least squares estimation, when used appropriately, is a powerful research tool. A deeper understanding of the regression concepts is essential for achieving optimal benefits from a least squares analysis. This book builds on the fundamentals of statistical methods and provides appropriate concepts that will allow a scientist to use least squares as an effective research tool. Applied Regression Analysis is aimed at the scientist who wishes to gain a working knowledge of regression analysis. The basic purpose of this book is to develop an understanding of least squares and related statistical methods without becoming excessively mathematical. It is the outgrowth of more than 30 years of consulting experience with scientists and many years of teaching an applied regression course to graduate students. Applied Regression Analysis serves as an excellent text for a service course on regression for non-statisticians and as a reference for researchers. It also provides a bridge between a two-semester introduction to...
Regression models of reactor diagnostic signals
International Nuclear Information System (INIS)
Vavrin, J.
1989-01-01
The application of an autoregression model, as the simplest regression model of diagnostic signals, is described for the experimental analysis of diagnostic systems, for in-service monitoring of normal and anomalous conditions, and for their diagnostics. A diagnostic method using a regression-type diagnostic database and regression spectral diagnostics is described. The diagnostics of neutron noise signals from anomalous modes in the experimental fuel assembly of a reactor are described. (author)
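As a rough illustration of how an autoregression model can serve as a baseline for a diagnostic signal, the sketch below fits a first-order model x[t] = phi * x[t-1] + e[t] by least squares and measures the residual power of a later signal against that baseline. The model order, the zero-mean assumption, the function names and the data are all assumptions made for this example; the abstract does not give these details.

```python
def fit_ar1(signal):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + e[t] for a zero-mean signal."""
    num = sum(signal[t] * signal[t - 1] for t in range(1, len(signal)))
    den = sum(signal[t - 1] ** 2 for t in range(1, len(signal)))
    phi = num / den
    residuals = [signal[t] - phi * signal[t - 1] for t in range(1, len(signal))]
    return phi, residuals

def residual_power(signal, phi):
    """Mean squared one-step prediction error of a signal under a given AR(1) coefficient."""
    errors = [signal[t] - phi * signal[t - 1] for t in range(1, len(signal))]
    return sum(e * e for e in errors) / len(errors)

# Hypothetical zero-mean fluctuation record taken under normal conditions.
baseline = [0.5, -0.2, 0.1, 0.3, -0.4, 0.2, -0.1, 0.0]
phi_hat, _ = fit_ar1(baseline)
baseline_power = residual_power(baseline, phi_hat)

# A later record is compared against the baseline model; a marked rise in
# residual power would suggest that the signal no longer follows the normal pattern.
monitored = [0.4, 1.5, -1.8, 2.0, -1.6, 1.9, -2.1, 1.7]
monitored_power = residual_power(monitored, phi_hat)
```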
Significant Radionuclides Determination
Energy Technology Data Exchange (ETDEWEB)
Jo A. Ziegler
2001-07-31
The purpose of this calculation is to identify radionuclides that are significant to offsite doses from potential preclosure events for spent nuclear fuel (SNF) and high-level radioactive waste expected to be received at the potential Monitored Geologic Repository (MGR). In this calculation, high-level radioactive waste is included in references to DOE SNF. A previous document, "DOE SNF DBE Offsite Dose Calculations" (CRWMS M&O 1999b), calculated the source terms and offsite doses for Department of Energy (DOE) and Naval SNF for use in design basis event analyses. This calculation reproduces only DOE SNF work (i.e., no naval SNF work is included in this calculation) created in "DOE SNF DBE Offsite Dose Calculations" and expands the calculation to include DOE SNF expected to produce a high dose consequence (even though the quantity of the SNF is expected to be small) and SNF owned by commercial nuclear power producers. The calculation does not address any specific off-normal/DBE event scenarios for receiving, handling, or packaging of SNF. The results of this calculation are developed for comparative analysis to establish the important radionuclides and do not represent the final source terms to be used for license application. This calculation will be used as input to preclosure safety analyses and is performed in accordance with procedure AP-3.12Q, "Calculations", and is subject to the requirements of DOE/RW-0333P, "Quality Assurance Requirements and Description" (DOE 2000) as determined by the activity evaluation contained in "Technical Work Plan for: Preclosure Safety Analysis, TWP-MGR-SE-000010" (CRWMS M&O 2000b) in accordance with procedure AP-2.21Q, "Quality Determinations and Planning for Scientific, Engineering, and Regulatory Compliance Activities".
Bulcock, J. W.
The problem of model estimation when the data are collinear was examined. Though the ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…
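For readers unfamiliar with the comparison, ridge regression replaces the OLS solution of (X'X) b = X'y with the solution of (X'X + kI) b = X'y for a ridge constant k > 0. The sketch below shows only that closed form on collinear data; the value of k, the absence of an intercept, the function names and the data are assumptions made for this example, and it does not reproduce the procedure examined in the report.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ridge(x_rows, y, k):
    """Coefficients solving (X'X + k I) b = X'y; k = 0 gives ordinary least squares."""
    p = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) + (k if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical data with two nearly collinear predictors; y = x1 + x2 exactly.
x_rows = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]]
y = [r[0] + r[1] for r in x_rows]

ols = ridge(x_rows, y, 0.0)     # ordinary least squares
shrunk = ridge(x_rows, y, 1.0)  # ridge estimate with an arbitrarily chosen k
```

The ridge estimate trades some bias for reduced variance, and its usefulness depends on how k is chosen; choosing k from the data is what makes the procedure stochastic.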
Multivariate Regression Analysis and Slaughter Livestock,
AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS , ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
[From clinical judgment to linear regression model].
Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2013-01-01
When we think about mathematical models, such as the linear regression model, we assume that these terms are used only by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful for predicting or showing the relationship between two or more variables as long as the dependent variable is quantitative and normally distributed. Stated another way, regression is used to predict a measure based on knowledge of at least one other variable. The first objective of linear regression is to determine the slope or inclination of the regression line Y = a + bX, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when "X" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of the independent variables in the outcome.
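The quantities named in this abstract (the intercept "a", the slope "b" and the coefficient of determination) can be computed directly from their least-squares formulas. The sketch below is a minimal illustration with hypothetical data; the function and variable names are not from the article.

```python
def fit_line(x, y):
    """Least-squares fit of Y = a + b*X; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope: change in Y per one-unit change in X
    a = mean_y - b * mean_x    # intercept: value of Y when X equals 0
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot  # share of the variance in Y explained by X
    return a, b, r_squared

# Hypothetical measurements: a predictor X and a quantitative outcome Y.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [3.1, 4.9, 7.2, 8.8, 11.0]
a, b, r2 = fit_line(x_values, y_values)

# Predicting the outcome for a new value of X with the fitted line.
predicted = a + b * 6.0
```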
Sirenomelia and severe caudal regression syndrome.
Seidahmed, Mohammed Z; Abdelbasit, Omer B; Alhussein, Khalid A; Miqdad, Abeer M; Khalil, Mohammed I; Salih, Mustafa A
2014-12-01
To describe cases of sirenomelia and severe caudal regression syndrome (CRS), to report the prevalence of sirenomelia, and compare our findings with the literature. Retrospective data was retrieved from the medical records of infants with the diagnosis of sirenomelia and CRS and their mothers from 1989 to 2010 (22 years) at the Security Forces Hospital, Riyadh, Saudi Arabia. A perinatologist, neonatologist, pediatric neurologist, and radiologist ascertained the diagnoses. The cases were identified as part of a study of neural tube defects during that period. A literature search was conducted using MEDLINE. During the 22-year study period, the total number of deliveries was 124,933 out of whom, 4 patients with sirenomelia, and 2 patients with severe forms of CRS were identified. All the patients with sirenomelia had single umbilical artery, and none were the infant of a diabetic mother. One patient was a twin, and another was one of triplets. The 2 patients with CRS were sisters, their mother suffered from type II diabetes mellitus and morbid obesity on insulin, and neither of them had a single umbilical artery. Other associated anomalies with sirenomelia included an absent radius, thumb, and index finger in one patient, Potter's syndrome, abnormal ribs, microphthalmia, congenital heart disease, hypoplastic lungs, and diaphragmatic hernia. The prevalence of sirenomelia (3.2 per 100,000) is high compared with the international prevalence of one per 100,000. Both cases of CRS were infants of type II diabetic mother with poor control, supporting the strong correlation of CRS and maternal diabetes.
Determinants of LSIL Regression in Women from a Colombian Cohort
International Nuclear Information System (INIS)
Molano, Monica; Gonzalez, Mauricio; Gamboa, Oscar; Ortiz, Natasha; Luna, Joaquin; Hernandez, Gustavo; Posso, Hector; Murillo, Raul; Munoz, Nubia
2010-01-01
Objective: To analyze the role of Human Papillomavirus (HPV) and other risk factors in the regression of cervical lesions in women from the Bogota Cohort. Methods: 200 HPV-positive women with abnormal cytology were included for regression analysis. The time of lesion regression was modeled using methods for interval-censored survival time data. Median duration of total follow-up was 9 years. Results: 80 (40%) women were diagnosed with Atypical Squamous Cells of Undetermined Significance (ASCUS) or Atypical Glandular Cells of Undetermined Significance (AGUS), while 120 (60%) were diagnosed with Low Grade Squamous Intra-epithelial Lesions (LSIL). Globally, 40% of the lesions were still present at the first year of follow-up, while 1.5% were still present at the 5-year check-up. The multivariate model showed similar regression rates for lesions in women with ASCUS/AGUS and women with LSIL (HR=0.82, 95% CI 0.59-1.12). Women infected with HR HPV types and those with mixed infections had lower regression rates for lesions than did women infected with LR types (HR=0.526, 95% CI 0.33-0.84, for HR types and HR=0.378, 95% CI 0.20-0.69, for mixed infections). Furthermore, women over 30 years had a higher lesion regression rate than did women under 30 years (HR=1.53, 95% CI 1.03-2.27). The study showed that the median time for lesion regression was 9 months, while the median time for HPV clearance was 12 months. Conclusions: In the studied population, the type of infection and the age of the women are critical factors for the regression of cervical lesions.
Regression modeling methods, theory, and computation with SAS
Panik, Michael
2009-01-01
Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs. The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,
QRank: a novel quantile regression tool for eQTL discovery.
Song, Xiaoyu; Li, Gen; Zhou, Zhenwei; Wang, Xianling; Ionita-Laza, Iuliana; Wei, Ying
2017-07-15
Over the past decade, there has been a remarkable improvement in our understanding of the role of genetic variation in complex human diseases, especially via genome-wide association studies. However, the underlying molecular mechanisms are still poorly characterized, impeding the development of therapeutic interventions. Identifying genetic variants that influence the expression level of a gene, i.e. expression quantitative trait loci (eQTLs), can help us understand how genetic variants influence traits at the molecular level. While most eQTL studies focus on identifying mean effects on gene expression using linear regression, evidence suggests that genetic variation can impact the entire distribution of the expression level. Motivated by the potential higher order associations, several studies investigated variance eQTLs. In this paper, we develop a Quantile Rank-score based test (QRank), which provides an easy way to identify eQTLs that are associated with the conditional quantile functions of gene expression. We have applied the proposed QRank to the Genotype-Tissue Expression project, an international tissue bank for studying the relationship between genetic variation and gene expression in human tissues, and found that the proposed QRank complements the existing methods, and identifies new eQTLs with heterogeneous effects across different quantile levels. Notably, we show that the eQTLs identified by QRank but missed by linear regression are associated with greater enrichment in genome-wide significant SNPs from the GWAS catalog, and are also more likely to be tissue specific than eQTLs identified by linear regression. An R package is available on R CRAN at https://cran.r-project.org/web/packages/QRank . Contact: xs2148@cumc.columbia.edu. Supplementary data are available at Bioinformatics online.
Bayesian logistic regression in detection of gene-steroid interaction for cancer at PDLIM5 locus.
Wang, Ke-Sheng; Owusu, Daniel; Pan, Yue; Xie, Changchun
2016-06-01
The PDZ and LIM domain 5 (PDLIM5) gene may play a role in cancer, bipolar disorder, major depression, alcohol dependence and schizophrenia; however, little is known about the interaction effect of steroid and PDLIM5 gene on cancer. This study examined 47 single-nucleotide polymorphisms (SNPs) within the PDLIM5 gene in the Marshfield sample with 716 cancer patients (any diagnosed cancer, excluding minor skin cancer) and 2848 noncancer controls. Multiple logistic regression model in PLINK software was used to examine the association of each SNP with cancer. Bayesian logistic regression in PROC GENMOD in SAS statistical software, ver. 9.4 was used to detect gene-steroid interactions influencing cancer. Single marker analysis using PLINK identified 12 SNPs associated with cancer (Plogistic regression in PROC GENMOD showed that both rs6532496 and rs951613 revealed strong gene-steroid interaction effects (OR=2.18, 95% CI=1.31-3.63 with P = 2.9 × 10⁻³ for rs6532496 and OR=2.07, 95% CI=1.24-3.45 with P = 5.43 × 10⁻³ for rs951613, respectively). Results from Bayesian logistic regression showed stronger interaction effects (OR=2.26, 95% CI=1.2-3.38 for rs6532496 and OR=2.14, 95% CI=1.14-3.2 for rs951613, respectively). All the 12 SNPs associated with cancer revealed significant gene-steroid interaction effects (P logistic regression and OR=2.59, 95% CI=1.4-3.97 from Bayesian logistic regression; respectively). This study provides evidence of common genetic variants within the PDLIM5 gene and interactions between PDLIM5 gene polymorphisms and steroid use influencing cancer.
RAWS II: A MULTIPLE REGRESSION ANALYSIS PROGRAM,
This memorandum gives instructions for the use and operation of a revised version of RAWS, a multiple regression analysis program. The program...of preprocessed data, the directed retention of variable, listing of the matrix of the normal equations and its inverse, and the bypassing of the regression analysis to provide the input variable statistics only. (Author)
A Simulation Investigation of Principal Component Regression.
Allen, David E.
Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…
Hierarchical regression analysis in structural Equation Modeling
de Jong, P.F.
1999-01-01
In a hierarchical or fixed-order regression analysis, the independent variables are entered into the regression equation in a prespecified order. Such an analysis is often performed when the extra amount of variance accounted for in a dependent variable by a specific independent variable is the main
Categorical regression dose-response modeling
The goal of this training is to provide participants with training on the use of the U.S. EPA’s Categorical Regression software (CatReg) and its application to risk assessment. Categorical regression fits mathematical models to toxicity data that have been assigned ord...
Variable importance in latent variable regression models
Kvalheim, O.M.; Arneberg, R.; Bleie, O.; Rajalahti, T.; Smilde, A.K.; Westerhuis, J.A.
2014-01-01
The quality and practical usefulness of a regression model are a function of both interpretability and prediction performance. This work presents some new graphical tools for improved interpretation of latent variable regression models that can also assist in improved algorithms for variable
Suppression Situations in Multiple Linear Regression
Shieh, Gwowen
2006-01-01
This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…
Gibrat’s law and quantile regressions
DEFF Research Database (Denmark)
Distante, Roberta; Petrella, Ivan; Santoro, Emiliano
2017-01-01
The nexus between firm growth, size and age in U.S. manufacturing is examined through the lens of quantile regression models. This methodology allows us to overcome serious shortcomings entailed by linear regression models employed by much of the existing literature, unveiling a number of important...
Regression Analysis and the Sociological Imagination
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Repeated Results Analysis for Middleware Regression Benchmarking
Czech Academy of Sciences Publication Activity Database
Bulej, Lubomír; Kalibera, T.; Tůma, P.
2005-01-01
Vol. 60 (2005), pp. 345-358. ISSN 0166-5316. R&D Projects: GA ČR GA102/03/0672. Institutional research plan: CEZ:AV0Z10300504. Keywords: middleware benchmarking * regression benchmarking * regression testing. Subject RIV: JD - Computer Applications, Robotics. Impact factor: 0.756, year: 2005
Principles of Quantile Regression and an Application
Chen, Fang; Chalhoub-Deville, Micheline
2014-01-01
Newer statistical procedures are typically introduced to help address the limitations of those already in practice or to deal with emerging research needs. Quantile regression (QR) is introduced in this paper as a relatively new methodology, which is intended to overcome some of the limitations of least squares mean regression (LMR). QR is more…
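The contrast drawn above between quantile regression and least squares mean regression can be illustrated with the simplest possible model, a single constant. The sketch below is not taken from the paper: it assumes the standard check (pinball) loss for quantile level `tau`, uses made-up data, and the helper names `pinball_loss` and `best_constant` are hypothetical. With `tau = 0.5` the minimizer is the median, which is far less affected by the outlier than the least squares solution (the mean).

```python
def pinball_loss(residual: float, tau: float) -> float:
    """Check loss: weight tau on positive residuals, (1 - tau) on negative ones."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def best_constant(values: list[float], tau: float) -> float:
    """Return the data point that minimizes the summed pinball loss.

    For a constant-only model a minimizer is always attained at one of the
    observations, so searching over the data points is sufficient.
    """
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Made-up sample with one large outlier (illustrative assumption).
data = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = best_constant(data, tau=0.5)  # quantile regression at tau = 0.5
mean_fit = sum(data) / len(data)           # least squares fit of a constant

print(median_fit, mean_fit)  # 3.0 22.0
```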
ON REGRESSION REPRESENTATIONS OF STOCHASTIC-PROCESSES
RUSCHENDORF, L; DEVALK, [No Value
We construct a.s. nonlinear regression representations of general stochastic processes (X_n)_{n ∈ N}. As a consequence we obtain in particular special regression representations of Markov chains and of certain m-dependent sequences. For m-dependent sequences we obtain a constructive
Psychological distress of female caregivers of significant others with cancer
Directory of Open Access Journals (Sweden)
Tony Cassidy
2015-12-01
Full Text Available This study explored the role of time since diagnosis and whether the care recipient was a child, a parent, or a spouse, on caregiver’s perceptions of the caring role, with a group of 269 female cancer caregivers. Questionnaire measures were used to explore psychological and social resources and psychological distress. Analysis of variance and hierarchical multiple regression were used and identified significant effects of time since diagnosis and care recipient. This study concludes that a more tailored approach to understanding the needs of caregivers is required particularly in terms of time since diagnosis and care recipient, in order to provide more effective support.
Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki
2017-05-01
This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria, an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes that early surgical intervention portends the most favorable prognosis.
International Nuclear Information System (INIS)
Arana, E.; Marti-Bonmati, L.; Bautista, D.; Paredes, R.
1998-01-01
To study the utility of logistic regression and a neural network in the diagnosis of cranial hemangiomas. Fifteen patients presenting hemangiomas were selected from a total of 167 patients with cranial lesions. All were evaluated by plain radiography and computed tomography (CT). Nineteen variables in their medical records were reviewed. Logistic regression and neural network models were constructed and validated by the jackknife (leave-one-out) approach. The yields of the two models were compared by means of ROC curves, using the area under the curve as parameter. Seven men and 8 women presented hemangiomas. The mean age of these patients was 38.4 ± 15.4 years (mean ± standard deviation). Logistic regression identified as significant variables the shape, soft tissue mass and periosteal reaction. The neural network lent more importance to the existence of ossified matrix, ruptured cortical vein and the mixed calcified-blastic (trabeculated) pattern. The neural network showed a greater yield than logistic regression (Az, 0.9409 ± 0.004 versus 0.7211 ± 0.075; p<0.001). The neural network discloses hidden interactions among the variables, providing a higher yield in the characterization of cranial hemangiomas and constituting a medical diagnostic aid. (Author) 29 refs
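The comparison above rests on the area under the ROC curve (Az). As a rough illustration of what that number measures, the sketch below computes it as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case (the Mann-Whitney formulation). The scores are invented for illustration, the function name `roc_auc` is hypothetical, and this is not the authors' procedure.

```python
def roc_auc(pos_scores: list[float], neg_scores: list[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    Each (positive, negative) pair counts 1 if the positive case scores
    higher, 0.5 for a tie, and 0 otherwise.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Invented model scores: higher means "more likely hemangioma".
hemangioma_scores = [0.9, 0.8, 0.4]
other_lesion_scores = [0.7, 0.3, 0.2, 0.1]

auc = roc_auc(hemangioma_scores, other_lesion_scores)  # 11 of 12 pairs ranked correctly
```

An area of 1.0 would mean every positive case outranks every negative case, while 0.5 corresponds to chance-level ranking.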
Is Posidonia oceanica regression a general feature in the Mediterranean Sea?
Directory of Open Access Journals (Sweden)
M. BONACORSI
2013-03-01
Full Text Available Over the last few years, a widespread regression of Posidonia oceanica meadows has been noticed in the Mediterranean Sea. However, the magnitude of this decline is still debated. The objectives of this study are (i) to assess the spatio-temporal evolution of Posidonia oceanica around Cap Corse (Corsica) over time comparing available ancient maps (from 1960) with a new (2011) detailed map realized combining different techniques (aerial photographs, SSS, ROV, scuba diving); (ii) evaluate the reliability of ancient maps; (iii) discuss observed regression of the meadows in relation to human pressure along the 110 km of coast. Thus, the comparison with previous data shows that, apart from sites clearly identified with the actual evolution, there is a relative stability of the surfaces occupied by the seagrass Posidonia oceanica. The recorded differences seem more related to changes in mapping techniques. These results confirm that in areas characterized by a moderate anthropogenic impact, the Posidonia oceanica meadow has no significant regression and that the changes due to the evolution of mapping techniques are not negligible. However, other facts should be taken into account before extrapolating to the Mediterranean Sea (e.g. actually mapped surfaces) and assessing the amplitude of the actual regression.
Regression of environmental noise in LIGO data
International Nuclear Information System (INIS)
Tiwari, V; Klimenko, S; Mitselmakher, G; Necula, V; Drago, M; Prodi, G; Frolov, V; Yakushin, I; Re, V; Salemi, F; Vedovato, G
2015-01-01
We address the problem of noise regression in the output of gravitational-wave (GW) interferometers, using data from the physical environmental monitors (PEM). The objective of the regression analysis is to predict environmental noise in the GW channel from the PEM measurements. One of the most promising regression methods is based on the construction of Wiener–Kolmogorov (WK) filters. Using this method, the seismic noise cancellation from the LIGO GW channel has already been performed. In the presented approach the WK method has been extended, incorporating banks of Wiener filters in the time–frequency domain, multi-channel analysis and regulation schemes, which greatly enhance the versatility of the regression analysis. Also we present the first results on regression of the bi-coherent noise in the LIGO data. (paper)
Pathological assessment of liver fibrosis regression
Directory of Open Access Journals (Sweden)
WANG Bingqiong
2017-03-01
Full Text Available Hepatic fibrosis is the common pathological outcome of chronic hepatic diseases. An accurate assessment of fibrosis degree provides an important reference for a definite diagnosis of diseases, treatment decision-making, treatment outcome monitoring, and prognostic evaluation. At present, many clinical studies have proven that regression of hepatic fibrosis and early-stage liver cirrhosis can be achieved by effective treatment, and a correct evaluation of fibrosis regression has become a hot topic in clinical research. Liver biopsy has long been regarded as the gold standard for the assessment of hepatic fibrosis, and thus it plays an important role in the evaluation of fibrosis regression. This article reviews the clinical application of current pathological staging systems in the evaluation of fibrosis regression from the perspectives of semi-quantitative scoring system, quantitative approach, and qualitative approach, in order to propose a better pathological evaluation system for the assessment of fibrosis regression.
Should metacognition be measured by logistic regression?
Rausch, Manuel; Zehetleitner, Michael
2017-03-01
Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent of rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as a measure of metacognitive sensitivity need to control the primary task criterion and rating criteria.
Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz
2012-01-01
From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was initially used, for the first time, to select significant sequence parameters in the identification of β-turns by a re-substitution test procedure. The sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were the percentages of Gly and Ser and the occurrence of Asn in position i+2, respectively, in sequence. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed with the parameters selected by binary logistic regression to build a hybrid predictor. The networks have been trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test on the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with results of the other β-turn prediction methods. In conclusion, this study proves that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks leads to the development of more precise models for identifying β-turns in proteins.
Spontaneous regression of residual low-grade cerebellar pilocytic astrocytomas in children
International Nuclear Information System (INIS)
Gunny, Roxana S.; Saunders, Dawn E.; Hayward, Richard D.; Phipps, Kim P.; Harding, Brian N.
2005-01-01
Cerebellar low-grade astrocytomas (CLGAs) of childhood are benign tumours and are usually curable by surgical resection alone or combined with adjuvant radiotherapy. To undertake a retrospective study of our children with CLGA to determine the optimum schedule for surveillance imaging following initial surgery. In this report we describe the phenomenon of spontaneous regression of residual tumour and discuss its prognostic significance regarding future imaging. A retrospective review was conducted of children treated for histologically proven CLGA at Great Ormond Street Hospital from 1988 to 1998. Of 83 children with CLGA identified, 13 (15.7%) had incomplete resections. Two children with large residual tumours associated with persistent symptoms underwent additional treatment. Eleven children were followed by surveillance imaging alone for a mean of 6.83 years (range 2-13.25 years). Spontaneous tumour regression was seen in 5 (45.5%) of the 11 children. There were no differences in age, gender, symptomatology, histological grade or Ki-67 fractions between those with spontaneous tumour regression and those with progression. There was a non-significant trend that larger volume residual tumours progressed. Residual tumour followed by surveillance imaging may either regress or progress. For children with residual disease we recommend surveillance imaging every 6 months for the first 2 years, every year for years 3, 4 and 5, then every second year if residual tumour is still present 5 years after initial surgery. This would detect not only progressive or recurrent disease, but also spontaneous regression which can occur later than disease progression. (orig.)
Real estate value prediction using multivariate regression models
Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav
2017-11-01
The real estate market is one of the most competitive in terms of pricing, and prices tend to vary significantly based on many factors; hence it is a prime field in which to apply machine learning concepts to optimize and predict prices with high accuracy. In this paper, we therefore present various important features to use when predicting housing prices with good accuracy. We describe regression models that use various features to achieve a lower Residual Sum of Squares error. When using features in a regression model, some feature engineering is required for better prediction. Often a set of features (multiple regression) or polynomial regression (applying various powers of the features) is used to obtain a better model fit. Because these models are expected to be susceptible to overfitting, ridge regression is used to reduce it. This paper thus points to the best application of regression models, in addition to other techniques, to optimize the result.
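The role of ridge regression in limiting overfitting can be sketched with a one-feature model. The example below is an illustration only, not the authors' model: it assumes a single predictor, centered data, an unpenalized intercept, and made-up numbers, and the function name `ridge_slope` is hypothetical. In this setting the ridge slope has the closed form Sxy / (Sxx + lambda), so increasing the penalty `lam` shrinks the slope toward zero.

```python
def ridge_slope(x: list[float], y: list[float], lam: float) -> float:
    """Closed-form ridge slope for one predictor with an unpenalized intercept.

    After centering x and y, the penalized least squares slope is
    Sxy / (Sxx + lam); lam = 0 recovers ordinary least squares.
    """
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / (sxx + lam)


# Made-up data lying exactly on y = 2x (illustrative assumption).
area = [1.0, 2.0, 3.0, 4.0]
price = [2.0, 4.0, 6.0, 8.0]

ols_slope = ridge_slope(area, price, lam=0.0)     # no penalty: slope 2.0
shrunk_slope = ridge_slope(area, price, lam=5.0)  # penalty halves the slope here
```

With several features or polynomial terms the same penalty is applied to the whole coefficient vector, which is what damps the large coefficients typical of an overfitted polynomial.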
DIABETES MELLITUS AND ITS ROLE IN CAUDAL REGRESSION SYNDROME
Directory of Open Access Journals (Sweden)
Sandeep
2016-03-01
Full Text Available BACKGROUND Caudal regression syndrome, also called sacral agenesis or hypoplasia of the sacrum, is a congenital disorder in which there is abnormal development of the lower part of the vertebral column, leading to a plethora of abnormalities such as gross motor deficiencies and genito-urinary malformations, which indeed depend on the extent of the malformations seen. Caudal regression syndrome is rare, with an estimated incidence of 1:7500-100,000. The aim of the study is to find the frequency of manifestations and the manifestations themselves. METHODS Fifty patients who were pregnant and were diagnosed with diabetes mellitus were identified and were referred to the Department of Medicine. RESULTS In the present study the frequency of manifestations of caudal regression syndrome is 8 in 100 diagnosed patients. CONCLUSION The malformations in the babies born to diabetic mothers are high in the population of coastal Karnataka and Kerala.
Variable selection and model choice in geoadditive regression models.
Kneib, Thomas; Hothorn, Torsten; Tutz, Gerhard
2009-06-01
Model choice and variable selection are issues of major concern in practical regression analyses, arising in many biometric applications such as habitat suitability analyses, where the aim is to identify the influence of potentially many environmental conditions on certain species. We describe regression models for breeding bird communities that facilitate both model choice and variable selection, by a boosting algorithm that works within a class of geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, and varying coefficients. The major modeling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a smooth component with one degree of freedom to obtain a fair comparison between the model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that automatically performs model choice and variable selection.
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Alternative regression models to assess increase in childhood BMI
Directory of Open Access Journals (Sweden)
Mansmann Ulrich
2008-09-01
Full Text Available Abstract Background Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Methods Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 children participating in the school entry health examination in Bavaria, Germany, from 2001 to 2002. TV watching, meal frequency, breastfeeding, smoking in pregnancy, maternal obesity, parental social class and weight gain in the first 2 years of life were considered as risk factors for obesity. Results GAMLSS showed a much better fit regarding the estimation of risk factors effects on transformed and untransformed BMI data than common GLMs with respect to the generalized Akaike information criterion. In comparison with GAMLSS, quantile regression allowed for additional interpretation of prespecified distribution quantiles, such as quantiles referring to overweight or obesity. The variables TV watching, maternal BMI and weight gain in the first 2 years were directly, and meal frequency was inversely significantly associated with body composition in any model type examined. In contrast, smoking in pregnancy was not directly, and breastfeeding and parental social class were not inversely significantly associated with body composition in GLM models, but in GAMLSS and partly in quantile regression models. Risk factor specific BMI percentile curves could be estimated from GAMLSS and quantile regression models. Conclusion GAMLSS and quantile regression seem to be more appropriate than common GLMs for risk factor modeling of BMI data.
Alternative regression models to assess increase in childhood BMI.
Beyerlein, Andreas; Fahrmeir, Ludwig; Mansmann, Ulrich; Toschke, André M
2008-09-08
Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 children participating in the school entry health examination in Bavaria, Germany, from 2001 to 2002. TV watching, meal frequency, breastfeeding, smoking in pregnancy, maternal obesity, parental social class and weight gain in the first 2 years of life were considered as risk factors for obesity. GAMLSS showed a much better fit regarding the estimation of risk factors effects on transformed and untransformed BMI data than common GLMs with respect to the generalized Akaike information criterion. In comparison with GAMLSS, quantile regression allowed for additional interpretation of prespecified distribution quantiles, such as quantiles referring to overweight or obesity. The variables TV watching, maternal BMI and weight gain in the first 2 years were directly, and meal frequency was inversely significantly associated with body composition in any model type examined. In contrast, smoking in pregnancy was not directly, and breastfeeding and parental social class were not inversely significantly associated with body composition in GLM models, but in GAMLSS and partly in quantile regression models. Risk factor specific BMI percentile curves could be estimated from GAMLSS and quantile regression models. GAMLSS and quantile regression seem to be more appropriate than common GLMs for risk factor modeling of BMI data.
Detecting Novelty and Significance
Ferrari, Vera; Bradley, Margaret M.; Codispoti, Maurizio; Lang, Peter J.
2013-01-01
Studies of cognition often use an “oddball” paradigm to study effects of stimulus novelty and significance on information processing. However, an oddball tends to be perceptually more novel than the standard, repeated stimulus as well as more relevant to the ongoing task, making it difficult to disentangle effects due to perceptual novelty and stimulus significance. In the current study, effects of perceptual novelty and significance on ERPs were assessed in a passive viewing context by presenting repeated and novel pictures (natural scenes) that either signaled significant information regarding the current context or not. A fronto-central N2 component was primarily affected by perceptual novelty, whereas a centro-parietal P3 component was modulated by both stimulus significance and novelty. The data support an interpretation that the N2 reflects perceptual fluency and is attenuated when a current stimulus matches an active memory representation and that the amplitude of the P3 reflects stimulus meaning and significance. PMID:19400680
Spatial vulnerability assessments by regression kriging
Pásztor, László; Laborczi, Annamária; Takács, Katalin; Szatmári, Gábor
2016-04-01
Two fairly different complex environmental phenomena causing natural hazards were mapped based on a combined spatial inference approach. The behaviour is related to various environmental factors and the applied approach enables the inclusion of several, spatially exhaustive auxiliary variables that are available for mapping. Inland excess water (IEW) is an interrelated natural and human-induced phenomenon that causes several problems in the flat-land regions of Hungary, which cover nearly half of the country. The term 'inland excess water' refers to the occurrence of inundations outside the flood levee that originate from sources differing from flood overflow; it is surplus surface water forming due to the lack of runoff, insufficient absorption capability of soil or the upwelling of groundwater. There is a multiplicity of definitions, which indicate the complexity of processes that govern this phenomenon. Most of the definitions have a common part, namely, that inland excess water is temporary water inundation that occurs in flat-lands due to both precipitation and groundwater emerging on the surface as substantial sources. Radon gas is produced in the radioactive decay chain of uranium, which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on soil physical and meteorological parameters and can enter and accumulate in buildings. Health risk originating from indoor radon concentration attributed to natural factors is characterized by geogenic radon potential (GRP). In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. Identification of areas with high risk requires spatial modelling, that is, mapping of specific natural hazards. In both cases external environmental factors determine the behaviour of the target process (occurrence/frequency of IEW and grade of GRP respectively). Spatial auxiliary
Variable and subset selection in PLS regression
DEFF Research Database (Denmark)
Høskuldsson, Agnar
2001-01-01
The purpose of this paper is to present some useful methods for introductory analysis of variables and subsets in relation to PLS regression. We present here methods that are efficient in finding the appropriate variables or subset to use in the PLS regression. The general conclusion is that variable selection is important for successful analysis of chemometric data. An important aspect of the results presented is that lack of variable selection can spoil the PLS regression, and that cross-validation measures using a test set can show larger variation, when we use different subsets of X, than...
Applied Regression Modeling A Business Approach
Pardoe, Iain
2012-01-01
An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculus. Regression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a
Significant NRC Enforcement Actions
Nuclear Regulatory Commission — This dataset provides a list of significant enforcement actions issued by the Nuclear Regulatory Commission (NRC). These actions, referred to as "escalated", are issued by...
Time series regression-based pairs trading in the Korean equities market
Kim, Saejoon; Heo, Jun
2017-07-01
Pairs trading is an instance of statistical arbitrage that relies on heavy quantitative data analysis to profit by capitalising on low-risk trading opportunities provided by anomalies of related assets. A key element in pairs trading is the rule by which open and close trading triggers are defined. This paper investigates the use of time series regression to define the rule, which has previously been identified with fixed threshold-based approaches. Empirical results indicate that our approach may yield significantly increased excess returns compared to those obtained by previous approaches on large capitalisation stocks in the Korean equities market.
Determinants of orphan drugs prices in France: a regression analysis.
Korchagina, Daria; Millier, Aurelie; Vataire, Anne-Lise; Aballea, Samuel; Falissard, Bruno; Toumi, Mondher
2017-04-21
The introduction of the orphan drug legislation led to an increase in the number of available orphan drugs, but access to them is often limited due to their high price. Social preferences regarding funding orphan drugs as well as the criteria taken into consideration while setting the price remain unclear. The study aimed at identifying the determinants of orphan drug prices in France using a regression analysis. All drugs with a valid orphan designation at the moment of launch for which the price was available in France were included in the analysis. The selection of covariates was based on a literature review and included drug characteristics (Anatomical Therapeutic Chemical (ATC) class, treatment line, age of target population), disease characteristics (severity, prevalence, availability of alternative therapeutic options), health technology assessment (HTA) details (actual benefit (AB) and improvement in actual benefit (IAB) scores, delay between the HTA and commercialisation), and study characteristics (type of study, comparator, type of endpoint). The main data sources were European public assessment reports, HTA reports, summaries of opinion on orphan designation of the European Medicines Agency, and the French insurance database of drugs and tariffs. A generalized regression model was developed to test the association between the annual treatment cost and selected covariates. A total of 68 drugs were included. The mean annual treatment cost was €96,518. In the univariate analysis, the ATC class (p = 0.01), availability of alternative treatment options (p = 0.02) and the prevalence (p = 0.02) showed a significant correlation with the annual cost. The multivariate analysis demonstrated a significant association between the annual cost and availability of alternative treatment options, ATC class, IAB score, type of comparator in the pivotal clinical trial, as well as commercialisation date and delay between the HTA and commercialisation. The
MULGRES: a computer program for stepwise multiple regression analysis
A. Jeff Martin
1971-01-01
MULGRES is a computer program source deck that is designed for multiple regression analysis employing the technique of stepwise deletion in the search for most significant variables. The features of the program, along with inputs and outputs, are briefly described, with a note on machine compatibility.
Using multiple linear regression techniques to quantify carbon ...
African Journals Online (AJOL)
Fallow ecosystems provide a significant carbon stock that can be quantified for inclusion in the accounts of global carbon budgets. Process and statistical models of productivity, though useful, are often technically rigid as the conditions for their application are not easy to satisfy. Multiple regression techniques have been ...
An Excel Solver Exercise to Introduce Nonlinear Regression
Pinder, Jonathan P.
2013-01-01
Business students taking business analytics courses that have significant predictive modeling components, such as marketing research, data mining, forecasting, and advanced financial modeling, are introduced to nonlinear regression using application software that is a "black box" to the students. Thus, although correct models are…
Brennan, Angela K.; Cross, Paul C.; Creely, Scott
2015-01-01
Summary Animal group size distributions are often right-skewed, whereby most groups are small, but most individuals occur in larger groups that may also disproportionately affect ecology and policy. In this case, examining covariates associated with upper quantiles of the group size distribution could facilitate better understanding and management of large animal groups.
Identifying multiple outliers in linear regression: robust fit and clustering approach
International Nuclear Information System (INIS)
Robiah Adnan; Mohd Nor Mohamad; Halim Setan
2001-01-01
This research provides a clustering-based approach for determining potential candidates for outliers. It is a modification of the method proposed by Serbert et al. (1988). It is based on using the single linkage clustering algorithm to group the standardized predicted and residual values of a data set fitted by least trimmed squares (LTS). (Author)
Vectors, a tool in statistical regression theory
Corsten, L.C.A.
1958-01-01
Using linear algebra this thesis developed linear regression analysis including analysis of variance, covariance analysis, special experimental designs, linear and fertility adjustments, analysis of experiments at different places and times. The determination of the orthogonal projection, yielding
Genetics Home Reference: caudal regression syndrome
... umbilical artery: Further support for a caudal regression-sirenomelia spectrum. Am J Med Genet A. 2007 Dec ... AK, Dickinson JE, Bower C. Caudal dysgenesis and sirenomelia-single centre experience suggests common pathogenic basis. Am ...
Dynamic travel time estimation using regression trees.
2008-10-01
This report presents a methodology for travel time estimation by using regression trees. The dissemination of travel time information has become crucial for effective traffic management, especially under congested road conditions. In the absence of c...
Two Paradoxes in Linear Regression Analysis
FENG, Ge; PENG, Jing; TU, Dongke; ZHENG, Julia Z.; FENG, Changyong
2016-01-01
Summary Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection. PMID:28638214
Discriminative Elastic-Net Regularized Linear Regression.
Zhang, Zheng; Lai, Zhihui; Xu, Yong; Shao, Ling; Wu, Jian; Xie, Guo-Sen
2017-03-01
In this paper, we aim at learning compact and discriminative linear regression models. Linear regression has been widely used in different problems. However, most of the existing linear regression methods exploit the conventional zero-one matrix as the regression targets, which greatly narrows the flexibility of the regression model. Another major limitation of these methods is that the learned projection matrix fails to precisely project the image features to the target space due to their weak discriminative capability. To this end, we present an elastic-net regularized linear regression (ENLR) framework, and develop two robust linear regression models which possess the following special characteristics. First, our methods exploit two particular strategies to enlarge the margins of different classes by relaxing the strict binary targets into a more feasible variable matrix. Second, a robust elastic-net regularization of singular values is introduced to enhance the compactness and effectiveness of the learned projection matrix. Third, the resulting optimization problem of ENLR has a closed-form solution in each iteration, which can be solved efficiently. Finally, rather than directly exploiting the projection matrix for recognition, our methods employ the transformed features as the new discriminative representations to make the final image classification. Compared with the traditional linear regression model and some of its variants, our method is much more accurate in image classification. Extensive experiments conducted on publicly available data sets demonstrate that the proposed framework can outperform the state-of-the-art methods. The MATLAB codes of our methods are available at http://www.yongxu.org/lunwen.html.
Fuzzy multiple linear regression: A computational approach
Juang, C. H.; Huang, X. H.; Fleming, J. W.
1992-01-01
This paper presents a new computational approach for performing fuzzy regression. In contrast to Bardossy's approach, the new approach, while dealing with fuzzy variables, closely follows the conventional regression technique. In this approach, treatment of fuzzy input is more 'computational' than 'symbolic.' The following sections first outline the formulation of the new approach, then deal with the implementation and computational scheme, and this is followed by examples to illustrate the new procedure.
Computing multiple-output regression quantile regions
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2012-01-01
Vol. 56, No. 4 (2012), pp. 840-853 ISSN 0167-9473 R&D Projects: GA MŠk(CZ) 1M06047 Institutional research plan: CEZ:AV0Z10750506 Keywords : halfspace depth * multiple-output regression * parametric linear programming * quantile regression Subject RIV: BA - General Mathematics Impact factor: 1.304, year: 2012 http://library.utia.cas.cz/separaty/2012/SI/siman-0376413.pdf
There is No Quantum Regression Theorem
International Nuclear Information System (INIS)
Ford, G.W.; OConnell, R.F.
1996-01-01
The Onsager regression hypothesis states that the regression of fluctuations is governed by macroscopic equations describing the approach to equilibrium. It is here asserted that this hypothesis fails in the quantum case. This is shown first by explicit calculation for the example of quantum Brownian motion of an oscillator and then in general from the fluctuation-dissipation theorem. It is asserted that the correct generalization of the Onsager hypothesis is the fluctuation-dissipation theorem. copyright 1996 The American Physical Society
Caudal regression syndrome : a case report
International Nuclear Information System (INIS)
Lee, Eun Joo; Kim, Hi Hye; Kim, Hyung Sik; Park, So Young; Han, Hye Young; Lee, Kwang Hun
1998-01-01
Caudal regression syndrome is a rare congenital anomaly, which results from a developmental failure of the caudal mesoderm during the fetal period. We present a case of caudal regression syndrome composed of a spectrum of anomalies including sirenomelia, dysplasia of the lower lumbar vertebrae, sacrum, coccyx and pelvic bones, genitourinary and anorectal anomalies, and dysplasia of the lung, as seen during infantography and MR imaging.
Caudal regression syndrome : a case report
Energy Technology Data Exchange (ETDEWEB)
Lee, Eun Joo; Kim, Hi Hye; Kim, Hyung Sik; Park, So Young; Han, Hye Young; Lee, Kwang Hun [Chungang Gil Hospital, Incheon (Korea, Republic of)
1998-07-01
Caudal regression syndrome is a rare congenital anomaly, which results from a developmental failure of the caudal mesoderm during the fetal period. We present a case of caudal regression syndrome composed of a spectrum of anomalies including sirenomelia, dysplasia of the lower lumbar vertebrae, sacrum, coccyx and pelvic bones, genitourinary and anorectal anomalies, and dysplasia of the lung, as seen during infantography and MR imaging.
Spontaneous regression of metastatic Merkel cell carcinoma.
LENUS (Irish Health Repository)
Hassan, S J
2010-01-01
Merkel cell carcinoma is a rare aggressive neuroendocrine carcinoma of the skin predominantly affecting elderly Caucasians. It has a high rate of local recurrence and regional lymph node metastases. It is associated with a poor prognosis. Complete spontaneous regression of Merkel cell carcinoma has been reported but is a poorly understood phenomenon. Here we present a case of complete spontaneous regression of metastatic Merkel cell carcinoma demonstrating a markedly different pattern of events from those previously published.
Forecasting exchange rates: a robust regression approach
Preminger, Arie; Franck, Raphael
2005-01-01
The least squares estimation method as well as other ordinary estimation method for regression models can be severely affected by a small number of outliers, thus providing poor out-of-sample forecasts. This paper suggests a robust regression approach, based on the S-estimation method, to construct forecasting models that are less sensitive to data contamination by outliers. A robust linear autoregressive (RAR) and a robust neural network (RNN) models are estimated to study the predictabil...
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.
2010-08-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been made, a relatively simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
Marginal longitudinal semiparametric regression via penalized splines
Al Kadiri, M.; Carroll, R.J.; Wand, M.P.
2010-01-01
We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been made, a relatively simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models.
Post-processing through linear regression
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread should be preferred.
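As a rough illustration of two of the schemes named above, the following Python sketch corrects a forecast with an ordinary least-squares line and with a geometric-mean line. The abstract gives no formulas, so the single-predictor setting, the function names and the toy data are assumptions made for illustration; the geometric-mean slope used here is the textbook definition, sign(r) times the ratio of standard deviations, not necessarily the multi-predictor form studied in the paper.

```python
import numpy as np

def fit_ols(forecast, obs):
    """Return (intercept, slope) of the least-squares line obs ~ forecast."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(obs, dtype=float)
    slope = np.cov(f, o, bias=True)[0, 1] / np.var(f)
    intercept = o.mean() - slope * f.mean()
    return intercept, slope

def fit_geometric_mean(forecast, obs):
    """Return (intercept, slope) with slope = sign(r) * std(obs) / std(forecast)."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(obs, dtype=float)
    r = np.corrcoef(f, o)[0, 1]
    slope = np.sign(r) * o.std() / f.std()
    intercept = o.mean() - slope * f.mean()
    return intercept, slope

def correct(forecast, intercept, slope):
    """Apply a fitted linear correction to raw forecast values."""
    return intercept + slope * np.asarray(forecast, dtype=float)

# Hypothetical toy data: four past forecasts and the matching observations.
past_forecast = [0.0, 1.0, 2.0, 3.0]
past_obs = [0.0, 2.0, 1.0, 3.0]

ols_line = fit_ols(past_forecast, past_obs)
gm_line = fit_geometric_mean(past_forecast, past_obs)
print("OLS-corrected variance:", np.var(correct(past_forecast, *ols_line)))
print("GM-corrected variance: ", np.var(correct(past_forecast, *gm_line)))
print("Observed variance:     ", np.var(past_obs))
```

On these four points the OLS slope is 0.8 and the geometric-mean slope is 1.0, so the OLS-corrected forecast has less variance than the observations while the geometric-mean correction reproduces the observed variance. That is one way to read the "variability of the corrected forecast" criterion in a single-predictor setting.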
Post-processing through linear regression
Directory of Open Access Journals (Sweden)
B. Van Schaeybroeck
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified.
These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread should be preferred.
Unbalanced Regressions and the Predictive Equation
DEFF Research Database (Denmark)
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoretical predictive equation by suggesting a data generating process, where returns are generated as linear functions of a lagged latent I(0) risk process. The observed predictor is a function of this latent I(0) process, but it is corrupted by a fractionally integrated noise. Such a process may arise due to aggregation or unexpected level shifts. In this setup, the practitioner estimates a misspecified, unbalanced, and endogenous predictive regression. We show that the OLS estimate of this regression is inconsistent, but standard inference is possible. To obtain a consistent slope estimate, we then suggest...
Duesterhoeft, Sara M; Ernst, Linda M; Siebert, Joseph R; Kapur, Raj P
2007-12-15
Sirenomelia and caudal regression have sparked centuries of interest and recent debate regarding their classification and pathogenetic relationship. Specific anomalies are common to both conditions, but aside from fusion of the lower extremities, an aberrant abdominal umbilical artery ("persistent vitelline artery") has been invoked as the chief anatomic finding that distinguishes sirenomelia from caudal regression. This observation is important from a pathogenetic viewpoint, in that diversion of blood away from the caudal portion of the embryo through the abdominal umbilical artery ("vascular steal") has been proposed as the primary mechanism leading to sirenomelia. In contrast, caudal regression is hypothesized to arise from primary deficiency of caudal mesoderm. We present five cases of caudal regression that exhibit an aberrant abdominal umbilical artery similar to that typically associated with sirenomelia. Review of the literature identified four similar cases. Collectively, the series lends support for a caudal regression-sirenomelia spectrum with a common pathogenetic basis and suggests that abnormal umbilical arterial anatomy may be the consequence, rather than the cause, of deficient caudal mesoderm. (c) 2007 Wiley-Liss, Inc.
Length bias correction in gene ontology enrichment analysis using logistic regression.
Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H
2012-01-01
When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
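The confounding argument above can be sketched with a small logistic regression fitted by Newton's method. Everything in the block is an illustrative assumption rather than the authors' implementation: the toy counts are invented, transcript length is reduced to a short/long indicator, and the response is taken to be whether a gene is called differentially expressed (the abstract does not say which variable the authors use as the response).

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Maximum-likelihood logistic regression via Newton's method (IRLS)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))      # fitted probabilities
        w = p * (1.0 - p)                        # IRLS weights
        grad = X.T @ (y - p)                     # score vector
        hess = X.T @ (X * w[:, None])            # observed information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def make_cell(in_category, is_long, n_genes, n_de):
    """Rows (in_category, is_long, de) for one cell of the toy table."""
    de = [1] * n_de + [0] * (n_genes - n_de)
    return [(in_category, is_long, d) for d in de]

# Hypothetical genes: long transcripts are called DE more often (3/4 vs 1/4)
# and are over-represented in the category, but within each length group the
# DE rate does not depend on category membership.
rows = (make_cell(0, 0, 8, 2) + make_cell(1, 0, 4, 1)
        + make_cell(0, 1, 4, 3) + make_cell(1, 1, 8, 6))
data = np.array(rows, dtype=float)
in_cat, is_long, de = data[:, 0], data[:, 1], data[:, 2]
ones = np.ones(len(de))

unadjusted = fit_logistic(np.column_stack([ones, in_cat]), de)
adjusted = fit_logistic(np.column_stack([ones, in_cat, is_long]), de)
print("membership coefficient, no length term:  ", unadjusted[1])
print("membership coefficient, with length term:", adjusted[1])
```

In this toy table the membership coefficient is positive when length is ignored and drops to zero once the length term is included, which is the pattern the abstract describes as length acting as a confounder. With real data the length covariate would more plausibly be continuous (for example log transcript length).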
Energy Technology Data Exchange (ETDEWEB)
Yoon, Hee Mang; Jung, Ah Young; Cho, Young Ah; Yoon, Chong Hyun; Lee, Jin Seong [Asan Medical Center Children's Hospital, University of Ulsan College of Medicine, Department of Radiology and Research Institute of Radiology, Songpa-gu, Seoul (Korea, Republic of); Kim, Ellen Ai-Rhan [University of Ulsan College of Medicine, Division of Neonatology, Asan Medical Center Children's Hospital, Seoul (Korea, Republic of); Chung, Sung-Hoon [Kyung Hee University School of Medicine, Department of Pediatrics, Seoul (Korea, Republic of); Kim, Seon-Ok [Asan Medical Center, Department of Clinical Epidemiology and Biostatistics, Seoul (Korea, Republic of)
2017-06-15
To describe the natural course of extralobar pulmonary sequestration (EPS) and identify factors associated with spontaneous regression of EPS. We retrospectively searched for patients diagnosed with EPS on initial contrast CT scan within 1 month after birth and had a follow-up CT scan without treatment. Spontaneous regression of EPS was assessed by percentage decrease in volume (PDV) and percentage decrease in sum of the diameter of systemic feeding arteries (PDD) by comparing initial and follow-up CT scans. Clinical and CT features were analysed to determine factors associated with PDV and PDD rates. Fifty-one neonates were included. The cumulative proportions of patients reaching PDV > 50 % and PDD > 50 % were 93.0 % and 73.3 % at 4 years, respectively. Tissue attenuation was significantly associated with PDV rate (B = -21.78, P <.001). The tissue attenuation (B = -22.62, P =.001) and diameter of the largest systemic feeding arteries (B = -48.31, P =.011) were significant factors associated with PDD rate. The volume and diameter of systemic feeding arteries of EPS spontaneously decreased within 4 years without treatment. EPSs showing a low tissue attenuation and small diameter of the largest systemic feeding arteries on initial contrast-enhanced CT scans were likely to regress spontaneously. (orig.)
Traditional Indian spices and their health significance.
Krishnaswamy, Kamala
2008-01-01
India has been recognized all over the world for spices and medicinal plants. Both exhibit a wide range of physiological and pharmacological properties. Current biomedical efforts are focused on their scientific merits, to provide science-based evidence for the traditional uses and to develop either functional foods or nutraceuticals. The Indian traditional medical systems use turmeric for wound healing, rheumatic disorders, gastrointestinal symptoms, deworming, rhinitis and as a cosmetic. Studies in India have explored its anti-inflammatory, cholekinetic and anti-oxidant potentials with the recent investigations focusing on its preventive effect on precarcinogenic, anti-inflammatory and anti-atherosclerotic effects in biological systems both under in vitro and in vivo conditions in animals and humans. Both turmeric and curcumin were found to increase detoxifying enzymes, prevent DNA damage, improve DNA repair, decrease mutations and tumour formation and exhibit antioxidative potential in animals. Limited clinical studies suggest that turmeric can significantly impact excretion of mutagens in urine in smokers and regress precancerous palatal lesions. It reduces DNA adducts and micronuclei in oral epithelial cells. It prevents formation of nitroso compounds both in vivo and in vitro. It delays induced cataract in diabetes and reduces hyperlipidemia in obese rats. Recently several molecular targets have been identified for therapeutic / preventive effects of turmeric. Fenugreek seeds, a rich source of soluble fiber used in Indian cuisine, reduce blood glucose and lipids and can be used as a food adjuvant in diabetes. Similarly garlic, onions, and ginger have been found to modulate favourably the process of carcinogenesis.
Use of probabilistic weights to enhance linear regression myoelectric control
Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.
2015-12-01
Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
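The weighting step in the Approach can be sketched for a single DOF as follows. This is a minimal illustration under stated assumptions, not the authors' controller: the class order (negative, none, positive), the rule of multiplying the regression output by the posterior probability of the class matching its sign, and all parameter values are hypothetical. Only the use of Gaussian class models with a shared covariance comes from the abstract.

```python
import numpy as np

def class_posteriors(x, means, cov, priors):
    """Posterior class probabilities under Gaussians sharing one covariance."""
    x = np.asarray(x, dtype=float)
    means = np.asarray(means, dtype=float)
    priors = np.asarray(priors, dtype=float)
    cov_inv = np.linalg.inv(np.asarray(cov, dtype=float))
    diffs = means - x                                  # (n_classes, n_features)
    # Log-density up to a constant that is identical for every class.
    log_lik = -0.5 * np.einsum("ij,jk,ik->i", diffs, cov_inv, diffs)
    log_post = log_lik + np.log(priors)
    log_post -= log_post.max()                         # numerical stability
    post = np.exp(log_post)
    return post / post.sum()

def weighted_output(x, weights, bias, means, cov, priors):
    """Scale one DOF's regression output by the probability of its direction.

    Assumed class order: index 0 = negative movement, 1 = no movement,
    2 = positive movement.
    """
    raw = float(np.dot(weights, x) + bias)
    post = class_posteriors(x, means, cov, priors)
    if raw > 0:
        return raw * post[2]
    if raw < 0:
        return raw * post[0]
    return 0.0

# Hypothetical one-feature example with equal priors and unit variance.
means = [[-1.0], [0.0], [1.0]]
cov = [[1.0]]
priors = [1 / 3, 1 / 3, 1 / 3]
print(weighted_output([1.0], weights=[2.0], bias=0.0,
                      means=means, cov=cov, priors=priors))
```

Because the posterior is at most 1, the weighted command never exceeds the raw regression output in magnitude, and it shrinks toward zero when the features look more like "no movement". That is the mechanism the abstract credits with suppressing extraneous movement at other DOFs.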
Regression of oral lichenoid lesions after replacement of dental restorations.
Mårell, L; Tillberg, A; Widman, L; Bergdahl, J; Berglund, A
2014-05-01
The aim of the study was to determine the prognosis and to evaluate the regression of lichenoid contact reactions (LCR) and oral lichen planus (OLP) after replacement of dental restorative materials suspected as causing the lesions. Forty-four referred patients with oral lesions participated in a follow-up study that was initiated an average of 6 years after the first examination at the Department of Odontology, i.e. the baseline examination. The patients underwent odontological clinical examination and answered a questionnaire with questions regarding dental health, medical and psychological health, and treatments undertaken from baseline to follow-up. After exchange of dental materials, regression of oral lesions was significantly higher among patients with LCR than with OLP. As no cases with OLP regressed after an exchange of materials, a proper diagnosis has to be made to avoid unnecessary exchanges of intact restorations on patients with OLP.
Linear regression and sensitivity analysis in nuclear reactor design
International Nuclear Information System (INIS)
Kumar, Akansha; Tsvetkov, Pavel V.; McClarren, Ryan G.
2015-01-01
Highlights: • Presented a benchmark for the applicability of linear regression to complex systems. • Applied linear regression to a nuclear reactor power system. • Performed neutronics, thermal–hydraulics, and energy conversion using Brayton’s cycle for the design of a GCFBR. • Performed detailed sensitivity analysis to a set of parameters in a nuclear reactor power system. • Modeled and developed reactor design using MCNP, regression using R, and thermal–hydraulics in Java. - Abstract: The paper presents a general strategy applicable to sensitivity analysis (SA) and uncertainty quantification analysis (UA) of parameters related to a nuclear reactor design. This work also validates the use of linear regression (LR) for predictive analysis in a nuclear reactor design. The analysis helps to determine the parameters on which an LR model can be fit for predictive analysis. For those parameters, a regression surface is created based on trial data and predictions are made using this surface. A general strategy of SA to determine and identify the influential parameters that affect the operation of the reactor is described. Identification of design parameters and validation of the linearity assumption for the application of LR to reactor design is performed based on a set of tests. The testing methods used to determine the behavior of the parameters can be used as a general strategy for UA and SA of nuclear reactor models and thermal–hydraulics calculations. A design of a gas cooled fast breeder reactor (GCFBR), with thermal–hydraulics and energy transfer, has been used for the demonstration of this method. MCNP6 is used to simulate the GCFBR design and perform the necessary criticality calculations. Java is used to build and run input samples and to extract data from the output files of MCNP6, and R is used to perform regression analysis and other multivariate variance, and analysis of the collinearity of data
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
International Nuclear Information System (INIS)
Coleman, D.J.; Lizzi, F.L.; Silverman, R.H.; Ellsworth, R.M.; Haik, B.G.; Abramson, D.H.; Smith, M.E.; Rondeau, M.J.
1985-01-01
Parameters derived from computer analysis of digital radio-frequency (rf) ultrasound scan data of untreated uveal malignant melanomas were examined for correlations with tumor regression following cobalt-60 plaque. Parameters included tumor height, normalized power spectrum and acoustic tissue type (ATT). Acoustic tissue type was based upon discriminant analysis of tumor power spectra, with spectra of tumors of known pathology serving as a model. Results showed ATT to be correlated with tumor regression during the first 18 months following treatment. Tumors with ATT associated with spindle cell malignant melanoma showed over twice the percentage reduction in height as those with ATT associated with mixed/epithelioid melanomas. Pre-treatment height was only weakly correlated with regression. Additionally, significant spectral changes were observed following treatment. Ultrasonic spectrum analysis thus provides a noninvasive tool for classification, prediction and monitoring of tumor response to cobalt-60 plaque
On Solving Lq-Penalized Regressions
Directory of Open Access Journals (Sweden)
Tracy Zhou Wu
2007-01-01
Full Text Available Lq-penalized regression arises in multidimensional statistical modelling where all or part of the regression coefficients are penalized to achieve both accuracy and parsimony of statistical models. There is often substantial computational difficulty except for the quadratic penalty case. The difficulty is partly due to the nonsmoothness of the objective function inherited from the use of the absolute value. We propose a new solution method for the general Lq-penalized regression problem based on space transformation and thus efficient optimization algorithms. The new method has immediate applications in statistics, notably in penalized spline smoothing problems. In particular, the LASSO problem is shown to be polynomial time solvable. Numerical studies show promise of our approach.
Influence diagnostics in meta-regression model.
Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua
2017-09-01
This paper studies influence diagnostics in the meta-regression model, including case deletion diagnostics and local influence analysis. We derive the subset deletion formulae for the estimation of the regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measures are defined. Local influence analysis based on the case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme is explored. We introduce a method that simultaneously perturbs responses, covariate, and within-variance to obtain the local influence measure, which has the advantage of being able to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.
Regression Models for Market-Shares
DEFF Research Database (Denmark)
Birch, Kristina; Olsen, Jørgen Kai; Tjur, Tue
2005-01-01
On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put on the interpretation of the parameters in relation to models for the total sales based on discrete choice models. Key words and phrases: MCI model, discrete choice model, market-shares, price elasticity, regression model.
Enders, Felicity
2013-12-01
Although regression is widely used for reading and publishing in the medical literature, no instruments were previously available to assess students' understanding. The goal of this study was to design and assess such an instrument for graduate students in Clinical and Translational Science and Public Health. A 27-item REsearch on Global Regression Expectations in StatisticS (REGRESS) quiz was developed through an iterative process. Consenting students taking a course on linear regression in a Clinical and Translational Science program completed the quiz pre- and postcourse. Student results were compared to practicing statisticians with a master's or doctoral degree in statistics or a closely related field. Fifty-two students responded precourse, 59 postcourse, and 22 practicing statisticians completed the quiz. The mean (SD) score was 9.3 (4.3) for students precourse and 19.0 (3.5) postcourse (P REGRESS quiz was internally reliable (Cronbach's alpha 0.89). The initial validation is quite promising with statistically significant and meaningful differences across time and study populations. Further work is needed to validate the quiz across multiple institutions. © 2013 Wiley Periodicals, Inc.
Synthetic definition of biological significance
International Nuclear Information System (INIS)
Buffington, J.D.
1975-01-01
The central theme of the workshop is recounted and the views of the authors are summarized. Areas of broad agreement or disagreement, unifying principles, and research needs are identified. Authors' views are consolidated into concepts that have practical utility for the scientist making impact assessments. The need for decision-makers and managers to be cognizant of the recommendations made herein is discussed. Finally, bringing together the diverse views of the workshop participants, a conceptual definition of biological significance is synthesized
Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P
2015-01-01
This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. We explore the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model. Furthermore, the developed CLR models tend to achieve better sensitivity of
Dunbar, P. K.; Furtney, M.; McLean, S. J.; Sweeney, A. D.
2014-12-01
Tsunamis have inflicted death and destruction on the coastlines of the world throughout history. The occurrence of tsunamis and the resulting effects have been collected and studied as far back as the second millennium B.C. The knowledge gained from cataloging and examining these events has led to significant changes in our understanding of tsunamis, tsunami sources, and methods to mitigate the effects of tsunamis. The most significant, not surprisingly, are often the most devastating, such as the 2011 Tohoku, Japan earthquake and tsunami. The goal of this poster is to give a brief overview of the occurrence of tsunamis and then focus specifically on several significant tsunamis. There are various criteria to determine the most significant tsunamis: the number of deaths, amount of damage, maximum runup height, impact on tsunami science or policy, etc. As a result, descriptions will include some of the most costly (2011 Tohoku, Japan), the most deadly (2004 Sumatra, 1883 Krakatau), and the highest runup ever observed (1958 Lituya Bay, Alaska). The discovery of the Cascadia subduction zone as the source of the 1700 Japanese "Orphan" tsunami and a future tsunami threat to the U.S. northwest coast contributed to the decision to form the U.S. National Tsunami Hazard Mitigation Program. The great Lisbon earthquake of 1755 marked the beginning of the modern era of seismology. Knowledge gained from the 1964 Alaska earthquake and tsunami helped confirm the theory of plate tectonics. The 1946 Alaska, 1952 Kuril Islands, 1960 Chile, 1964 Alaska, and the 2004 Banda Aceh tsunamis all resulted in warning centers or systems being established. The data descriptions on this poster were extracted from NOAA's National Geophysical Data Center (NGDC) global historical tsunami database. Additional information about these tsunamis, as well as water level data, can be found by accessing the NGDC website www.ngdc.noaa.gov/hazard/
Asymptomatic proteinuria. Clinical significance.
Papper, S
1977-09-01
Patients with asymptomatic proteinuria have varied reasons for the proteinuria and travel diverse courses. In the individual with normal renal function and no systemic cause, ie, idiopathic asymptomatic proteinuria, the outlook is generally favorable. Microscopic hematuria probably raises some degree of question about prognosis. The kidney shows normal glomeruli, subtle changes, or an identifiable lesion. The initial approach includes a clinical and laboratory search for systemic disease, repeated urinalyses, quantitative measurements of proteinuria, determination of creatinine clearance, protein electrophoresis where indicated, and intravenous pyelography. The need for regularly scheduled follow-up evaluation is emphasized. Although the initial approach need not include renal biopsy, a decline in creatinine clearance, an increase in proteinuria, or both are indications for biopsy and consideration of drug therapy.
Arcuate Fasciculus in Autism Spectrum Disorder Toddlers with Language Regression
Directory of Open Access Journals (Sweden)
Zhang Lin
2018-03-01
Full Text Available Language regression is observed in a subset of toddlers with autism spectrum disorder (ASD) as an initial symptom. However, such a phenomenon has not been fully explored, partly due to the lack of definite diagnostic evaluation methods and criteria. Materials and Methods: Fifteen toddlers with ASD exhibiting language regression and fourteen age-matched typically developing (TD) controls underwent diffusion tensor imaging (DTI). DTI parameters including fractional anisotropy (FA), average fiber length (AFL), tract volume (TV) and number of voxels (NV) were analyzed by Neuro 3D in Siemens syngo workstation. Subsequently, the data were analyzed by using IBM SPSS Statistics 22. Results: Compared with TD children, a significant reduction of FA along with an increase in TV and NV was observed in ASD children with language regression. Note that there were no significant differences between ASD and TD children in AFL of the arcuate fasciculus (AF). Conclusions: These DTI changes in the AF suggest that microstructural anomalies of the AF white matter may be associated with language deficits in ASD children exhibiting language regression starting from an early age.
On directional multiple-output quantile regression
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2011-01-01
Vol. 102, No. 2 (2011), pp. 193-212 ISSN 0047-259X R&D Projects: GA MŠk(CZ) 1M06047 Grant - others: Commission EC(BE) Fonds National de la Recherche Scientifique Institutional research plan: CEZ:AV0Z10750506 Keywords: multivariate quantile * quantile regression * multiple-output regression * halfspace depth * portfolio optimization * value-at-risk Subject RIV: BA - General Mathematics Impact factor: 0.879, year: 2011 http://library.utia.cas.cz/separaty/2011/SI/siman-0364128.pdf
Removing Malmquist bias from linear regressions
Verter, Frances
1993-01-01
Malmquist bias is present in all astronomical surveys where sources are observed above an apparent brightness threshold. Those sources which can be detected at progressively larger distances are progressively more limited to the intrinsically luminous portion of the true distribution. This bias does not distort any of the measurements, but distorts the sample composition. We have developed the first treatment to correct for Malmquist bias in linear regressions of astronomical data. A demonstration of the corrected linear regression that is computed in four steps is presented.
Robust median estimator in logistic regression
Czech Academy of Sciences Publication Activity Database
Hobza, T.; Pardo, L.; Vajda, Igor
2008-01-01
Vol. 138, No. 12 (2008), pp. 3822-3840 ISSN 0378-3758 R&D Projects: GA MŠk 1M0572 Grant - others: Instituto Nacional de Estadistica (ES) MPO FI - IM3/136; GA MŠk(CZ) MTM 2006-06872 Institutional research plan: CEZ:AV0Z10750506 Keywords: Logistic regression * Median * Robustness * Consistency and asymptotic normality * Morgenthaler * Bianco and Yohai * Croux and Haesbroeck Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.679, year: 2008 http://library.utia.cas.cz/separaty/2008/SI/vajda-robust%20median%20estimator%20in%20logistic%20regression.pdf
Multiple Response Regression for Gaussian Mixture Models with Known Labels.
Lee, Wonyul; Du, Ying; Sun, Wei; Hayes, D Neil; Liu, Yufeng
2012-12-01
Multiple response regression is a useful regression technique to model multiple response variables using the same set of predictor variables. Most existing methods for multiple response regression are designed for modeling homogeneous data. In many applications, however, one may have heterogeneous data where the samples are divided into multiple groups. Our motivating example is a cancer dataset where the samples belong to multiple cancer subtypes. In this paper, we consider modeling the data coming from a mixture of several Gaussian distributions with known group labels. A naive approach is to split the data into several groups according to the labels and model each group separately. Although it is simple, this approach ignores potential common structures across different groups. We propose new penalized methods to model all groups jointly in which the common and unique structures can be identified. The proposed methods estimate the regression coefficient matrix, as well as the conditional inverse covariance matrix of response variables. Asymptotic properties of the proposed methods are explored. Through numerical examples, we demonstrate that both estimation and prediction can be improved by modeling all groups jointly using the proposed methods. An application to a glioblastoma cancer dataset reveals some interesting common and unique gene relationships across different cancer subtypes.
Demonstration of a Fiber Optic Regression Probe
Korman, Valentin; Polzin, Kurt A.
2010-01-01
The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics make it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers change, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications. An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for
Integrative analysis of multiple diverse omics datasets by sparse group multitask regression
Directory of Open Access Journals (Sweden)
Dongdong Lin
2014-10-01
Full Text Available A variety of high throughput genome-wide assays enable the exploration of genetic risk factors underlying complex traits. Although these studies have remarkable impact on identifying susceptible biomarkers, they suffer from issues such as limited sample size and low reproducibility. Combining individual studies of different genetic levels/platforms has the promise to improve the power and consistency of biomarker identification. In this paper, we propose a novel integrative method, namely sparse group multitask regression, for integrating diverse omics datasets, platforms and populations to identify risk genes/factors of complex diseases. This method combines multitask learning with sparse group regularization, which will: (1) treat the biomarker identification in each single study as a task and then combine them by multitask learning; (2) group variables from all studies for identifying significant genes; (3) enforce sparse constraint on groups of variables to overcome the ‘small sample, but large variables’ problem. We introduce two sparse group penalties, sparse group lasso and sparse group ridge, in our multitask model, and provide an effective algorithm for each model. In addition, we propose a significance test for the identification of potential risk genes. Two simulation studies are performed to evaluate the performance of our integrative method by comparing it with the conventional meta-analysis method. The results show that our sparse group multitask method outperforms the meta-analysis method significantly. In an application to our osteoporosis studies, 7 genes are identified as significant genes by our method and are found to have significant effects in three other independent studies for validation. The most significant gene SOD2 has been identified in our previous osteoporosis study involving the same expression dataset. Several other genes such as TREML2, HTR1E and GLO1 are shown to be novel susceptible genes for osteoporosis, as confirmed
KELEŞ, Taliha; ALTUN, Murat
2016-01-01
Regression analysis is a statistical technique for investigating and modeling the relationship between variables. The purpose of this study was the trivial presentation of the equation for orthogonal regression (OR) and the comparison of classical linear regression (CLR) and OR techniques with respect to the sum of squared perpendicular distances. For that purpose, the analyses were shown by an example. It was found that the sum of squared perpendicular distances of OR is smaller. Thus, it wa...
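The comparison described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' own computation: the data points are made up, the function names are hypothetical, and the orthogonal slope uses the standard closed form for equal error variances (it assumes the cross-product sum is nonzero).

```python
import math

def _centered_sums(xs, ys):
    # Means and centered sums of squares / cross-products.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy

def classical_line(xs, ys):
    # Classical linear regression: minimizes vertical squared distances.
    mx, my, sxx, _, sxy = _centered_sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx

def orthogonal_line(xs, ys):
    # Orthogonal regression: minimizes perpendicular squared distances.
    # Assumes sxy != 0 (illustrative simplification).
    mx, my, sxx, syy, sxy = _centered_sums(xs, ys)
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx

def perpendicular_ss(xs, ys, slope, intercept):
    # Sum of squared perpendicular distances from the points to the line.
    vertical = sum((y - slope * x - intercept) ** 2 for x, y in zip(xs, ys))
    return vertical / (1 + slope ** 2)

# Hypothetical example data.
xs = [1, 2, 3, 4, 5]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]
clr_ss = perpendicular_ss(xs, ys, *classical_line(xs, ys))
or_ss = perpendicular_ss(xs, ys, *orthogonal_line(xs, ys))
print(clr_ss, or_ss)
```

By construction the orthogonal fit cannot have a larger perpendicular sum of squares than the classical fit, which is consistent with the finding reported above.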
Dawson, Terence P.; Curran, Paul J.; Kupiec, John A.
1995-01-01
A major goal of airborne imaging spectrometry is to estimate the biochemical composition of vegetation canopies from reflectance spectra. Remotely-sensed estimates of foliar biochemical concentrations of forests would provide valuable indicators of ecosystem function at regional and eventually global scales. Empirical research has shown a relationship exists between the amount of radiation reflected from absorption features and the concentration of given biochemicals in leaves and canopies (Matson et al., 1994, Johnson et al., 1994). A technique commonly used to determine which wavelengths have the strongest correlation with the biochemical of interest is unguided (stepwise) multiple regression. Wavelengths are entered into a multivariate regression equation, in their order of importance, each contributing to the reduction of the variance in the measured biochemical concentration. A significant problem with the use of stepwise regression for determining the correlation between biochemical concentration and spectra is that of 'overfitting' as there are significantly more wavebands than biochemical measurements. This could result in the selection of wavebands which may be more accurately attributable to noise or canopy effects. In addition, there is a real problem of collinearity in that the individual biochemical concentrations may covary. A strong correlation between the reflectance at a given wavelength and the concentration of a biochemical of interest, therefore, may be due to the effect of another biochemical which is closely related. Furthermore, it is not always possible to account for potentially suitable waveband omissions in the stepwise selection procedure. This concern about the suitability of stepwise regression has been identified and acknowledged in a number of recent studies (Wessman et al., 1988, Curran, 1989, Curran et al., 1992, Peterson and Hubbard, 1992, Martine and Aber, 1994, Kupiec, 1994). These studies have pointed to the lack of a physical
Method for nonlinear exponential regression analysis
Junkin, B. G.
1972-01-01
Two computer programs, developed according to two general types of exponential models for conducting nonlinear exponential regression analysis, are described. A least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. The program is written in FORTRAN 5 for the Univac 1108 computer.
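Linearizing a nonlinear least squares problem with a first-order Taylor expansion is the idea behind the Gauss-Newton iteration. The sketch below is an illustration in Python, not a reconstruction of the FORTRAN programs: it assumes the simple model y = a*exp(b*x), a hypothetical starting guess, and adds a step-halving safeguard that the abstract does not mention.

```python
import math

def sse(xs, ys, a, b):
    # Sum of squared residuals for the model y = a * exp(b * x).
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

def gauss_newton_exp(xs, ys, a, b, iterations=50):
    for _ in range(iterations):
        # First-order Taylor expansion: partial derivatives of the model.
        j1 = [math.exp(b * x) for x in xs]           # d(model)/da
        j2 = [a * x * e for x, e in zip(xs, j1)]     # d(model)/db
        r = [y - a * e for y, e in zip(ys, j1)]      # current residuals
        # Normal equations (J^T J) delta = J^T r for the linearized problem.
        a11 = sum(u * u for u in j1)
        a12 = sum(u * v for u, v in zip(j1, j2))
        a22 = sum(v * v for v in j2)
        g1 = sum(u * ri for u, ri in zip(j1, r))
        g2 = sum(v * ri for v, ri in zip(j2, r))
        det = a11 * a22 - a12 * a12
        if det == 0:
            break
        da = (a22 * g1 - a12 * g2) / det
        db = (a11 * g2 - a12 * g1) / det
        # Step halving (an added safeguard): shrink until the fit does not worsen.
        step, current = 1.0, sse(xs, ys, a, b)
        while step > 1e-8 and sse(xs, ys, a + step * da, b + step * db) > current:
            step /= 2
        a, b = a + step * da, b + step * db
        if abs(step * da) + abs(step * db) < 1e-12:
            break
    return a, b

# Noise-free hypothetical data generated from a = 2, b = 0.5.
xs = [0, 1, 2, 3, 4]
ys = [2 * math.exp(0.5 * x) for x in xs]
a_hat, b_hat = gauss_newton_exp(xs, ys, a=1.5, b=0.6)
print(a_hat, b_hat)
```

On noise-free data the iteration should recover the generating parameters; with real data the result depends on the starting guess.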
Measurement Error in Education and Growth Regressions
Portela, Miguel; Alessie, Rob; Teulings, Coen
2010-01-01
The use of the perpetual inventory method for the construction of education data per country leads to systematic measurement error. This paper analyzes its effect on growth regressions. We suggest a methodology for correcting this error. The standard attenuation bias suggests that using these
The M Word: Multicollinearity in Multiple Regression.
Morrow-Howell, Nancy
1994-01-01
Notes that existence of substantial correlation between two or more independent variables creates problems of multicollinearity in multiple regression. Discusses multicollinearity problem in social work research in which independent variables are usually intercorrelated. Clarifies problems created by multicollinearity, explains detection of…
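One common way to detect the problem described here is the variance inflation factor (VIF). The sketch below is an assumption about what such a check might look like, not the article's own procedure: it covers only the two-predictor case, where VIF = 1 / (1 - r^2) with r the correlation between the predictors, and the data are invented.

```python
import math

def pearson_r(xs, ys):
    # Pearson correlation between two equally long sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    # With exactly two predictors, each has the same VIF: 1 / (1 - r^2).
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical predictors: x2 is nearly a multiple of x1, x3 is uncorrelated with x1.
x1 = [1, 2, 3, 4, 5]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]
x3 = [1, -1, 0, -1, 1]
vif_high = vif_two_predictors(x1, x2)
vif_none = vif_two_predictors(x1, x3)
print(vif_high, vif_none)
```

A VIF near 1 indicates no inflation of the coefficient variance; values above roughly 10 are often treated as a warning sign, although that cutoff is a rule of thumb.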
Regression Discontinuity Designs Based on Population Thresholds
DEFF Research Database (Denmark)
Eggers, Andrew C.; Freier, Ronny; Grembi, Veronica
In many countries, important features of municipal government (such as the electoral system, mayors' salaries, and the number of councillors) depend on whether the municipality is above or below arbitrary population thresholds. Several papers have used a regression discontinuity design (RDD...
Deriving the Regression Line with Algebra
Quintanilla, John A.
2017-01-01
Exploration with spreadsheets and reliance on previous skills can lead students to determine the line of best fit. To perform linear regression on a set of data, students in Algebra 2 (or, in principle, Algebra 1) do not have to settle for using the mysterious "black box" of their graphing calculators (or other classroom technologies).…
Piecewise linear regression splines with hyperbolic covariates
International Nuclear Information System (INIS)
Cologne, John B.; Sposto, Richard
1992-09-01
Consider the problem of fitting a curve to data that exhibit a multiphase linear response with smooth transitions between phases. We propose substituting hyperbolas as covariates in piecewise linear regression splines to obtain curves that are smoothly joined. The method provides an intuitive and easy way to extend the two-phase linear hyperbolic response model of Griffiths and Miller and Watts and Bacon to accommodate more than two linear segments. The resulting regression spline with hyperbolic covariates may be fit by nonlinear regression methods to estimate the degree of curvature between adjoining linear segments. The added complexity of fitting nonlinear, as opposed to linear, regression models is not great. The extra effort is particularly worthwhile when investigators are unwilling to assume that the slope of the response changes abruptly at the join points. We can also estimate the join points (the values of the abscissas where the linear segments would intersect if extrapolated) if their number and approximate locations may be presumed known. An example using data on changing age at menarche in a cohort of Japanese women illustrates the use of the method for exploratory data analysis. (author)
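The role of a hyperbolic covariate can be illustrated with a two-phase curve. The parameterization below is an assumption in the spirit of the hyperbolic models cited in the abstract, which does not give the formula: y = b0 + b1*(x - knot) + b2*sqrt((x - knot)^2 + gamma^2), where gamma controls the curvature at the join point and all names are hypothetical.

```python
import math

def hyperbolic_covariate(x, knot, gamma):
    # Smooth stand-in for |x - knot|; gamma = 0 gives the sharp corner.
    return math.sqrt((x - knot) ** 2 + gamma ** 2)

def smooth_two_phase(x, b0, b1, b2, knot, gamma):
    # Far left of the knot the slope tends to b1 - b2, far right to b1 + b2.
    return b0 + b1 * (x - knot) + b2 * hyperbolic_covariate(x, knot, gamma)

# Hypothetical coefficients: slopes 1 - 0.5 = 0.5 (left) and 1 + 0.5 = 1.5 (right).
b0, b1, b2, knot, gamma = 2.0, 1.0, 0.5, 0.0, 0.5
left_slope = smooth_two_phase(-100, b0, b1, b2, knot, gamma) - smooth_two_phase(-101, b0, b1, b2, knot, gamma)
right_slope = smooth_two_phase(101, b0, b1, b2, knot, gamma) - smooth_two_phase(100, b0, b1, b2, knot, gamma)
print(left_slope, right_slope)
```

Adding one such covariate per join point extends the same idea to more than two linear segments. Because the knots and gamma enter nonlinearly, estimating them calls for nonlinear regression, as the abstract notes.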
Targeting: Logistic Regression, Special Cases and Extensions
Directory of Open Access Journals (Sweden)
Helmut Schaeben
2014-12-01
Full Text Available Logistic regression is a classical linear model for logit-transformed conditional probabilities of a binary target variable. It recovers the true conditional probabilities if the joint distribution of predictors and the target is of log-linear form. Weights-of-evidence is an ordinary logistic regression with parameters equal to the differences of the weights of evidence if all predictor variables are discrete and conditionally independent given the target variable. The hypothesis of conditional independence can be tested in terms of log-linear models. If the assumption of conditional independence is violated, the application of weights-of-evidence corrupts not only the predicted conditional probabilities but also their rank transform. Logistic regression models that include interaction terms can account for the lack of conditional independence; appropriate interaction terms compensate exactly for violations of conditional independence. Multilayer artificial neural nets may be seen as nested regression-like models with some sigmoidal activation function. Most often, the logistic function is used as the activation function. If the net topology, i.e., its control, is sufficiently versatile to mimic interaction terms, artificial neural nets are able to account for violations of conditional independence and yield very similar results. Weights-of-evidence cannot reasonably include interaction terms; subsequent modifications of the weights, as often suggested, cannot emulate the effect of interaction terms.
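The link between weights of evidence and logistic regression parameters can be checked by hand for a single binary predictor, where conditional independence is trivially satisfied. The sketch below uses an invented 2x2 table of counts and hypothetical names. It relies on the fact that the maximum likelihood logistic fit with one binary predictor reproduces the observed log-odds in each predictor group, so its slope is the log odds ratio.

```python
import math

# Hypothetical counts: n[b][t] = number of cases with predictor B = b and target T = t.
n = {1: {1: 30, 0: 20}, 0: {1: 10, 0: 40}}

targets = n[1][1] + n[0][1]        # cases with T = 1
non_targets = n[1][0] + n[0][0]    # cases with T = 0

# Weights of evidence for predictor present (W+) and absent (W-).
w_plus = math.log((n[1][1] / targets) / (n[1][0] / non_targets))
w_minus = math.log((n[0][1] / targets) / (n[0][0] / non_targets))
contrast = w_plus - w_minus

# Logistic regression on one binary predictor (saturated model):
# intercept = log-odds of T when B = 0, slope = log odds ratio.
logit_intercept = math.log(n[0][1] / n[0][0])
logit_slope = math.log((n[1][1] * n[0][0]) / (n[1][0] * n[0][1]))

prior_logit = math.log(targets / non_targets)
print(contrast, logit_slope)
print(prior_logit + w_minus, logit_intercept)
```

The slope equals the contrast W+ - W-, and the intercept equals the prior logit plus W-, matching the statement that the logistic parameters are differences of the weights of evidence in this setting.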
Regression testing Ajax applications : Coping with dynamism
Roest, D.; Mesbah, A.; Van Deursen, A.
2009-01-01
Note: This paper is a pre-print of: Danny Roest, Ali Mesbah and Arie van Deursen. Regression Testing AJAX Applications: Coping with Dynamism. In Proceedings of the 3rd International Conference on Software Testing, Verification and Validation (ICST’10), Paris, France. IEEE Computer Society, 2010.
Group-wise partial least square regression
Camacho, José; Saccenti, Edoardo
2018-01-01
This paper introduces the group-wise partial least squares (GPLS) regression. GPLS is a new sparse PLS technique where the sparsity structure is defined in terms of groups of correlated variables, similarly to what is done in the related group-wise principal component analysis. These groups are
Finite Algorithms for Robust Linear Regression
DEFF Research Database (Denmark)
Madsen, Kaj; Nielsen, Hans Bruun
1990-01-01
The Huber M-estimator for robust linear regression is analyzed. Newton type methods for solution of the problem are defined and analyzed, and finite convergence is proved. Numerical experiments with a large number of test problems demonstrate efficiency and indicate that this kind of approach may...
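For readers unfamiliar with the Huber M-estimator, the sketch below fits a straight line using iteratively reweighted least squares (IRLS). This is a swapped-in, simpler technique, not the Newton-type finite algorithm analyzed in the paper. The tuning constant k is given in data units under the assumption that the residual scale is about 1, and the data are invented.

```python
def weighted_line(xs, ys, ws):
    # Weighted least squares fit of y = slope * x + intercept.
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def huber_line(xs, ys, k=1.0, iterations=50):
    # IRLS: residuals within k keep weight 1, larger ones are down-weighted to k/|r|.
    ws = [1.0] * len(xs)
    for _ in range(iterations):
        slope, intercept = weighted_line(xs, ys, ws)
        residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
        ws = [1.0 if abs(r) <= k else k / abs(r) for r in residuals]
    return weighted_line(xs, ys, ws)

# Hypothetical data on the line y = 2x + 1 with one gross outlier at x = 5.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
ys[5] = 50
ols_slope, ols_intercept = weighted_line(xs, ys, [1.0] * len(xs))
huber_slope, huber_intercept = huber_line(xs, ys)
print(ols_slope, ols_intercept)
print(huber_slope, huber_intercept)
```

The ordinary least squares fit is pulled toward the outlier, while the Huber fit stays close to the underlying line because the outlier's influence is capped at k.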
Function approximation with polynomial regression splines
International Nuclear Information System (INIS)
Urbanski, P.
1996-01-01
Principles of the polynomial regression splines as well as algorithms and programs for their computation are presented. The programs prepared using software package MATLAB are generally intended for approximation of the X-ray spectra and can be applied in the multivariate calibration of radiometric gauges. (author)
Predicting Social Trust with Binary Logistic Regression
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
Yet another look at MIDAS regression
Ph.H.B.F. Franses (Philip Hans)
2016-01-01
A MIDAS regression involves a dependent variable observed at a low frequency and independent variables observed at a higher frequency. This paper relates a true high frequency data generating process, where also the dependent variable is observed (hypothetically) at the high frequency,
Revisiting Regression in Autism: Heller's "Dementia Infantilis"
Westphal, Alexander; Schelinski, Stefanie; Volkmar, Fred; Pelphrey, Kevin
2013-01-01
Theodor Heller first described a severe regression of adaptive function in normally developing children, something he termed dementia infantilis, over 100 years ago. Dementia infantilis is most closely related to the modern diagnosis, childhood disintegrative disorder. We translate Heller's paper, Über Dementia Infantilis, and discuss…
Fast multi-output relevance vector regression
Ha, Youngmin
2017-01-01
This paper aims to decrease the time complexity of multi-output relevance vector regression from O(VM^3) to O(V^3+M^3), where V is the number of output dimensions, M is the number of basis functions, and V
Superquantile Regression: Theory, Algorithms, and Applications
2014-12-01
...Navy submariners, reliability engineering, uncertainty quantification, and financial risk management. Superquantile, superquantile regression...Royset Carlos F. Borges Associate Professor of Operations Research Dissertation Supervisor Professor of Applied Mathematics Lyn R. Whitaker Javier
Measurement error in education and growth regressions
Portela, Miguel; Teulings, Coen; Alessie, R.
2004-01-01
The perpetual inventory method used for the construction of education data per country leads to systematic measurement error. This paper analyses the effect of this measurement error on GDP regressions. There is a systematic difference in the education level between census data and observations
Panel data specifications in nonparametric kernel regression
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...
transformation of independent variables in polynomial regression ...
African Journals Online (AJOL)
Ada
preferable when possible to work with a simple functional form in transformed variables rather than with a more complicated form in the original variables. In this paper, it is shown that linear transformations applied to independent variables in polynomial regression models affect the t ratio and hence the statistical ...
Multiple Linear Regression: A Realistic Reflector.
Nutt, A. T.; Batsell, R. R.
Examples of the use of Multiple Linear Regression (MLR) techniques are presented. This is done to show how MLR aids data processing and decision-making by providing the decision-maker with freedom in phrasing questions and by accurately reflecting the data on hand. A brief overview of the rationale underlying MLR is given, some basic definitions…
Cawyer, Chase R; Anderson, Sarah B; Szychowski, Jeff M; Neely, Cherry; Owen, John
2018-03-01
To compare the accuracy of a new regression-derived formula developed from the National Fetal Growth Studies data to the common alternative method that uses the average of the gestational ages (GAs) calculated for each fetal biometric measurement (biparietal diameter, head circumference, abdominal circumference, and femur length). This retrospective cross-sectional study identified nonanomalous singleton pregnancies that had a crown-rump length plus at least 1 additional sonographic examination with complete fetal biometric measurements. With the use of the crown-rump length to establish the referent estimated date of delivery, each method's error (National Institute of Child Health and Human Development regression versus Hadlock average [Radiology 1984; 152:497-501]) at every examination was computed. Error, defined as the difference between the crown-rump length-derived GA and each method's predicted GA (weeks), was compared in 3 GA intervals: 1 (14 weeks-20 weeks 6 days), 2 (21 weeks-28 weeks 6 days), and 3 (≥29 weeks). In addition, the proportion of each method's examinations that had errors outside prespecified (±) day ranges was computed by using odds ratios. A total of 16,904 sonograms were identified. The overall and prespecified GA range subset mean errors were significantly smaller for the regression compared to the average (P < .01), and the regression had significantly lower odds of observing examinations outside the specified range of error in GA intervals 2 (odds ratio, 1.15; 95% confidence interval, 1.01-1.31) and 3 (odds ratio, 1.24; 95% confidence interval, 1.17-1.32) than the average method. In a contemporary unselected population of women dated by a crown-rump length-derived GA, the National Institute of Child Health and Human Development regression formula produced fewer estimates outside a prespecified margin of error than the commonly used Hadlock average; the differences were most pronounced for GA estimates at 29 weeks and later.
Ren, Yilong; Wang, Yunpeng; Wu, Xinkai; Yu, Guizhen; Ding, Chuan
2016-10-01
Red light running (RLR) has become a major safety concern at signalized intersections. To prevent RLR-related crashes, it is critical to identify the factors that significantly affect drivers' RLR behavior and to predict potential RLR in real time. In this research, 9 months of RLR events extracted from high-resolution traffic data collected by loop detectors at three signalized intersections were used to identify the factors that significantly affect RLR behavior. The data analysis indicated that occupancy time, time gap, used yellow time, time left to yellow start, whether the preceding vehicle runs through the intersection during yellow, and whether there is a vehicle passing through the intersection on the adjacent lane were significant factors for RLR behavior. Furthermore, due to the rare-event nature of RLR, a modified rare events logistic regression model was developed for RLR prediction. The rare events logistic regression method has been applied in many fields for rare events studies and shows impressive performance, but so far no previous research has applied this method to study RLR. The results showed that the rare events logistic regression model performed significantly better than the standard logistic regression model. More importantly, the proposed RLR prediction method is based purely on loop detector data collected from a single advance loop detector located 400 feet from the stop bar. This brings great potential for future field applications of the proposed method, since loops have been widely implemented at many intersections and can collect data in real time. This research is expected to contribute significantly to the improvement of intersection safety. Copyright © 2016 Elsevier Ltd. All rights reserved.
Gong, Xu; Cui, Jianli; Jiang, Ziping; Lu, Laijin; Li, Xiucun
2018-03-01
Few clinical retrospective studies have reported the risk factors of pedicled flap necrosis in hand soft tissue reconstruction. The aim of this study was to identify non-technical risk factors associated with pedicled flap perioperative necrosis in hand soft tissue reconstruction via a multivariate logistic regression analysis. For patients with hand soft tissue reconstruction, we carefully reviewed hospital records and identified 163 patients who met the inclusion criteria. The characteristics of these patients, flap transfer procedures and postoperative complications were recorded. Eleven predictors were identified. The correlations between pedicled flap necrosis and risk factors were analysed using a logistic regression model. Of 163 skin flaps, 125 flaps survived completely without any complications. The pedicled flap necrosis rate in hands was 11.04%, which included partial flap necrosis (7.36%) and total flap necrosis (3.68%). Soft tissue defects in fingers were noted in 68.10% of all cases. The logistic regression analysis indicated that the soft tissue defect site (P = 0.046, odds ratio (OR) = 0.079, confidence interval (CI) (0.006, 0.959)), flap size (P = 0.020, OR = 1.024, CI (1.004, 1.045)) and postoperative wound infection (P < 0.001, OR = 17.407, CI (3.821, 79.303)) were statistically significant risk factors for pedicled flap necrosis of the hand. Soft tissue defect site, flap size and postoperative wound infection were risk factors associated with pedicled flap necrosis in hand soft tissue defect reconstruction. © 2017 Royal Australasian College of Surgeons.
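The odds ratios and intervals quoted in this abstract are related to the underlying logistic regression coefficients by exponentiation. The sketch below shows that relationship under an assumption the abstract does not state: that the intervals are 95% Wald intervals, so that the bounds are exp(beta ± 1.96·SE). The function names are hypothetical.

```python
import math


def coefficient_from_or(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the log-odds coefficient and its standard error from a
    reported odds ratio and a Wald interval (assumed here to be 95%)."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


def or_with_ci(beta, se, z=1.96):
    """Odds ratio and Wald interval implied by a coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Figures reported above for postoperative wound infection.
beta, se = coefficient_from_or(17.407, 3.821, 79.303)
odds_ratio, low, high = or_with_ci(beta, se)
```

Round-tripping the reported values reproduces the published bounds closely, which is consistent with (though it does not prove) a Wald interval on the log-odds scale. The width of the interval reflects a large standard error on the coefficient.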
Directory of Open Access Journals (Sweden)
Joachim I. Krueger
2018-04-01
The practice of Significance Testing (ST) remains widespread in psychological science despite continual criticism of its flaws and abuses. Using simulation experiments, we address four concerns about ST and for two of these we compare ST’s performance with prominent alternatives. We find the following: First, the 'p' values delivered by ST predict the posterior probability of the tested hypothesis well under many research conditions. Second, low 'p' values support inductive inferences because they are most likely to occur when the tested hypothesis is false. Third, 'p' values track likelihood ratios without raising the uncertainties of relative inference. Fourth, 'p' values predict the replicability of research findings better than confidence intervals do. Given these results, we conclude that 'p' values may be used judiciously as a heuristic tool for inductive inference. Yet, 'p' values cannot bear the full burden of inference. We encourage researchers to be flexible in their selection and use of statistical methods.
Safety significance evaluation system
International Nuclear Information System (INIS)
Lew, B.S.; Yee, D.; Brewer, W.K.; Quattro, P.J.; Kirby, K.D.
1991-01-01
This paper reports that the Pacific Gas and Electric Company (PG and E), in cooperation with ABZ, Incorporated and Science Applications International Corporation (SAIC), investigated the use of artificial intelligence-based programming techniques to assist utility personnel in regulatory compliance problems. The result of this investigation is that artificial intelligence-based programming techniques can successfully be applied to this problem. To demonstrate this, a general methodology was developed and several prototype systems based on this methodology were developed. The prototypes address U.S. Nuclear Regulatory Commission (NRC) event reportability requirements, technical specification compliance based on plant equipment status, and quality assurance assistance. This collection of prototype modules is named the safety significance evaluation system
Gas revenue increasingly significant
International Nuclear Information System (INIS)
Megill, R.E.
1991-01-01
This paper briefly describes the wellhead prices of natural gas compared to crude oil over the past 70 years. Although natural gas prices have never reached price parity with crude oil, the relative value of a gas BTU has been increasing. It is one of the reasons that the total amount of money coming from natural gas wells is becoming more significant. From 1920 to 1955 the revenue at the wellhead for natural gas was only about 10% of the money received by producers. Most of the money needed for exploration, development, and production came from crude oil. At present, however, over 40% of the money from the upstream portion of the petroleum industry is from natural gas. As a result, in a few short years natural gas may account for 50% of the revenue generated from wellhead production facilities.
International Nuclear Information System (INIS)
Bhowmik, K.R.; Islam, S.
2016-01-01
Logistic regression (LR) analysis is the most common statistical methodology to find out the determinants of childhood mortality. However, the significant predictors cannot be ranked according to their influence on the response variable. Multiple classification (MC) analysis can be applied to identify the significant predictors with a priority index which helps to rank the predictors. The main objective of the study is to find the socio-demographic determinants of childhood mortality in the neonatal, post-neonatal, and post-infant periods by fitting an LR model, as well as to rank them through MC analysis. The study is conducted using the data of the Bangladesh Demographic and Health Survey 2007, where birth and death information of children was collected from their mothers. Three dichotomous response variables are constructed from children's age at death to fit the LR and MC models. Socio-economic and demographic variables that are each significantly associated with the response variables are considered in the LR and MC analyses. Both the LR and MC models identified the same significant predictors for specific childhood mortality. For both neonatal and child mortality, biological factors of children, regional settings, and parents' socio-economic status are found to be the 1st, 2nd, and 3rd significant groups of predictors respectively. Mother's education and household environment are detected as major significant predictors of post-neonatal mortality. This study shows that MC analysis, with or without LR analysis, can be applied to detect determinants with a rank, which helps policy makers take initiatives on a priority basis. (author)
REGSTEP - stepwise multivariate polynomial regression with singular extensions
International Nuclear Information System (INIS)
Davierwalla, D.M.
1977-09-01
The program REGSTEP determines a polynomial approximation, in the least squares sense, to tabulated data. The polynomial may be univariate or multivariate. The computational method is that of stepwise regression. A variable is inserted into the regression basis if it is significant with respect to an appropriate F-test at a preselected risk level. In addition, should a variable already in the basis become nonsignificant (again with respect to an appropriate F-test) after the entry of a new variable, it is expelled from the model. Thus only significant variables are retained in the model. Although written expressly to be incorporated into CORCOD, a code for predicting nuclear cross sections for given values of power, temperature, void fractions, boron content, etc., there is nothing to limit the use of REGSTEP to nuclear applications, as the examples demonstrate. A separate version has been incorporated into RSYST for the general user. (Auth.)
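The enter-and-expel logic described above can be sketched in a few lines. This is not REGSTEP itself: as a simplifying assumption it uses fixed F-to-enter and F-to-remove thresholds (`f_enter`, `f_remove`) in place of critical values looked up at a preselected risk level, and every function name and the sample data are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(columns, y, subset):
    """Residual sum of squares for y fitted on an intercept plus `subset`."""
    design = [[1.0] + [columns[j][i] for j in subset] for i in range(len(y))]
    k = len(subset) + 1
    xtx = [[sum(r[p] * r[q] for r in design) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(design, y)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bc * xc for bc, xc in zip(beta, r)) for r in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def partial_f(rss_small, rss_big, n, k_big):
    """F statistic for the one extra term in the larger model, which has
    k_big predictors besides the intercept."""
    if rss_big <= 0.0:
        return float("inf")
    return (rss_small - rss_big) / (rss_big / (n - k_big - 1))


def stepwise(columns, y, f_enter=4.0, f_remove=3.9, max_steps=50):
    """Alternate forward entry and backward removal until nothing changes."""
    n, selected = len(y), []
    for _ in range(max_steps):
        changed = False
        # Forward step: enter the candidate with the largest F above f_enter.
        current = rss(columns, y, selected)
        best, best_f = None, f_enter
        for j in range(len(columns)):
            if j not in selected:
                f = partial_f(current, rss(columns, y, selected + [j]),
                              n, len(selected) + 1)
                if f > best_f:
                    best, best_f = j, f
        if best is not None:
            selected.append(best)
            changed = True
        # Backward step: expel the term with the smallest F below f_remove.
        if selected:
            full = rss(columns, y, selected)
            worst, worst_f = None, f_remove
            for j in selected:
                reduced = [s for s in selected if s != j]
                f = partial_f(rss(columns, y, reduced), full, n, len(selected))
                if f < worst_f:
                    worst, worst_f = j, f
            if worst is not None:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return sorted(selected)


# Hypothetical tabulated data: candidate terms x, x^2 and x^3.
xs = [i * 0.5 for i in range(-6, 7)]
columns = [list(xs), [x ** 2 for x in xs], [x ** 3 for x in xs]]
y = [1.0 + 2.0 * x ** 2 + 0.1 * math.cos(3.0 * x) for x in xs]
chosen = stepwise(columns, y)
```

Here `chosen` holds the indices of the retained columns. Keeping `f_remove` slightly below `f_enter` is a common precaution against a variable being entered and expelled in an endless cycle; a production code would replace both thresholds with critical values of the F distribution for the current degrees of freedom.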
Bayesian median regression for temporal gene expression data
Yu, Keming; Vinciotti, Veronica; Liu, Xiaohui; 't Hoen, Peter A. C.
2007-09-01
Most of the existing methods for the identification of biologically interesting genes in a temporal expression profiling dataset do not fully exploit the temporal ordering in the dataset and are based on normality assumptions for the gene expression. In this paper, we introduce a Bayesian median regression model to detect genes whose temporal profile is significantly different across a number of biological conditions. The regression model is defined by a polynomial function where both time and condition effects as well as interactions between the two are included. MCMC-based inference returns the posterior distribution of the polynomial coefficients. From this a simple Bayes factor test is proposed to test for significance. The estimation of the median rather than the mean, and within a Bayesian framework, increases the robustness of the method compared to a Hotelling T2-test previously suggested. This is shown on simulated data and on muscular dystrophy gene expression data.
Asgary, S; Dinani, N Jafari; Madani, H; Mahzouni, P
2008-05-01
Artemisia aucheri is a native-growing plant which is widely used in Iranian traditional medicine. This study was designed to evaluate the effects of A. aucheri on regression of atherosclerosis in hypercholesterolemic rabbits. Twenty-five rabbits were randomly divided into five groups of five each and treated for 3 months as follows: 1: normal diet, 2: hypercholesterolemic diet (HCD), 3 and 4: HCD for 60 days and then normal diet and normal diet + A. aucheri (100 mg x kg(-1) x day(-1)) respectively for an additional 30 days (regression period). In the regression period dietary use of A. aucheri in group 4 significantly decreased total cholesterol, triglyceride and LDL-cholesterol, while HDL-cholesterol was significantly increased. The atherosclerotic area was significantly decreased in this group. Animals that received only normal diet in the regression period showed no regression but rather progression of atherosclerosis. These findings suggest that A. aucheri may cause regression of atherosclerotic lesions.
International Nuclear Information System (INIS)
Supe, S.J.; Nagalaxmi, K.V.; Meenakshi, L.
1983-01-01
In the practice of radiotherapy, various concepts like NSD, CRE, TDF, and BIR are being used to evaluate the biological effectiveness of the treatment schedules on the normal tissues. This has been accepted as the tolerance of the normal tissue is the limiting factor in the treatment of cancers. At present when various schedules are tried, attention is therefore paid to the biological damage of the normal tissues only and it is expected that the damage to the cancerous tissues would be extensive enough to control the cancer. An attempt is made in the present work to evaluate the concept of tumor significant dose (TSD), which will represent the damage to the cancerous tissue. Strandquist, in the analysis of a large number of cases of squamous cell carcinoma, found that for the 5 fraction/week treatment, the total dose required to bring about the same damage for the cancerous tissue is proportional to T^-0.22, where T is the overall time over which the dose is delivered. Using this finding the TSD was defined as D x N^-p x T^-q, where D is the total dose, N the number of fractions, T the overall time, and p and q are the exponents to be suitably chosen. The values of p and q are adjusted such that p + q ≤ 0.24, and p varies from 0.0 to 0.24 and q varies from 0.0 to 0.22. Cases of cancer of cervix uteri treated between 1978 and 1980 in the V. N. Cancer Centre, Kuppuswamy Naidu Memorial Hospital, Coimbatore, India were analyzed on the basis of these formulations. These data, coupled with the clinical experience, were used for choice of a formula for the TSD. Further, the dose schedules used in the British Institute of Radiology fractionation studies were also used to propose that the tumor significant dose is represented by D x N^-0.18 x T^-0.06
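The proposed formula is simple enough to evaluate directly. The sketch below computes it with the exponents p = 0.18 and q = 0.06 given above; the function name, the input validation and the example schedule are hypothetical, and the abstract does not state the units of dose or time, so the inputs are treated as plain numbers.

```python
def tumor_significant_dose(total_dose, n_fractions, overall_time,
                           p=0.18, q=0.06):
    """TSD = D * N**(-p) * T**(-q), with the exponents from the abstract."""
    if n_fractions <= 0 or overall_time <= 0:
        raise ValueError("number of fractions and overall time must be positive")
    return total_dose * n_fractions ** (-p) * overall_time ** (-q)


# Hypothetical schedule: total dose 60 in 30 fractions over an overall time of 42.
tsd = tumor_significant_dose(60.0, 30, 42.0)

# The same total dose in fewer fractions gives a larger TSD.
tsd_fewer_fractions = tumor_significant_dose(60.0, 15, 42.0)
```

Because both exponents are negative, spreading a fixed total dose over more fractions or a longer overall time lowers the TSD, which is the behaviour the formula is meant to capture.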
Management of Industrial Performance Indicators: Regression Analysis and Simulation
Directory of Open Access Journals (Sweden)
Walter Roberto Hernandez Vergara
2017-11-01
Stochastic methods can be used in problem solving and explanation of natural phenomena through the application of statistical procedures. The article aims to associate regression analysis with systems simulation, in order to facilitate the practical understanding of data analysis. The algorithms were developed in Microsoft Office Excel software, using statistical techniques such as regression theory, ANOVA and Cholesky Factorization, which made it possible to create models of single and multiple systems with up to five independent variables. For the analysis of these models, the Monte Carlo simulation and analysis of industrial performance indicators were used, resulting in numerical indices that aim to improve the goals’ management for compliance indicators, by identifying systems’ instability, correlation and anomalies. The analytical models presented in the survey indicated satisfactory results with numerous possibilities for industrial and academic applications, as well as the potential for deployment in new analytical techniques.
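The abstract does not say how the Cholesky factorization was used. One common role for it in Monte Carlo simulation is turning independent random draws into draws with a prescribed correlation, and the sketch below illustrates that use under this assumption. The function names, the correlation value 0.8 and the seed are hypothetical.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular factor L of a symmetric positive definite matrix,
    so that L times its transpose reproduces the matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def correlated_normals(corr, n_draws, seed=0):
    """Monte Carlo draws of standard normals with correlation matrix corr."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    dim = len(corr)
    draws = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Multiplying independent draws by L imposes the target correlation.
        draws.append([sum(lower[i][k] * z[k] for k in range(i + 1))
                      for i in range(dim)])
    return draws


def sample_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Two hypothetical performance indicators with a target correlation of 0.8.
target = [[1.0, 0.8], [0.8, 1.0]]
factor = cholesky(target)
draws = correlated_normals(target, 20000)
r = sample_correlation([d[0] for d in draws], [d[1] for d in draws])
```

The sample correlation `r` should land close to the target. The same factor can be reused for up to five variables, the largest model size mentioned in the abstract, by passing a 5 x 5 correlation matrix.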
ANALYSIS OF THE FINANCIAL PERFORMANCES OF THE FIRM, BY USING THE MULTIPLE REGRESSION MODEL
Directory of Open Access Journals (Sweden)
Constantin Anghelache
2011-11-01
The information obtained through simple linear regression is not always enough to characterize the evolution of an economic phenomenon and, furthermore, to identify its possible future evolution. To remedy these drawbacks, the specialized literature includes multiple regression models, in which the evolution of the dependent variable is defined in terms of two or more factorial variables.
Uranium chemistry: significant advances
International Nuclear Information System (INIS)
Mazzanti, M.
2011-01-01
The author reviews recent progress in uranium chemistry achieved in CEA laboratories. Like its neighbors in the Mendeleev chart, uranium undergoes hydrolysis, oxidation and disproportionation reactions which make the chemistry of these species in water highly complex. The study of the chemistry of uranium in an anhydrous medium has made it possible to correlate the structural and electronic differences observed in the interaction of uranium(III) and the lanthanides(III) with nitrogen or sulfur molecules and the effectiveness of these molecules in An(III)/Ln(III) separation via liquid-liquid extraction. Recent work on the redox reactivity of trivalent uranium U(III) in an organic medium with molecules such as water or an azide ion (N3-) in stoichiometric quantities has led to extremely interesting uranium aggregates, particularly those involved in actinide migration in the environment or in aggregation problems in the fuel processing cycle. Another significant advance was the discovery of a compound containing the uranyl ion in oxidation state (V), UO2+, obtained by oxidation of uranium(III). Recently chemists have succeeded in blocking the disproportionation reaction of uranyl(V) and in stabilizing polymetallic complexes of uranyl(V), opening the way to a systematic study of the reactivity and the electronic and magnetic properties of uranyl(V) compounds. (A.C.)
Directory of Open Access Journals (Sweden)
Ph D Student Roman Mihaela
2011-05-01
The concept of "public accountability" is a challenge for political science, as a new concept in this area that is still being debated and developed, both in theory and in practice. This paper is a theoretical approach presenting some definitions, relevant meanings and the significance of the concept in political science. The importance of this concept is that although it was originally used as a tool to improve the effectiveness and efficiency of public governance, it has gradually become a purpose in itself. "Accountability" has become an image of good governance, first in the United States of America and then in the European Union. Nevertheless, the concept is vaguely defined and provides ambiguous images of good governance. This paper begins with the presentation of some general meanings of the concept as they emerge from specialized dictionaries and encyclopaedias and continues with the meanings developed in political science. The concept of "public accountability" is rooted in the economics and management literature, becoming increasingly relevant in today's political science both in theory and discourse as well as in practice in formulating and evaluating public policies. A first conclusion that emerges from the analysis of the evolution of this term is that it requires a conceptual clarification in political science. A clear definition will then enable an appropriate model of proving the system of public accountability in formulating and assessing public policies, in order to implement a system of assessment and monitoring thereof.
The Regression Analysis of Individual Financial Performance: Evidence from Croatia
Bahovec, Vlasta; Barbić, Dajana; Palić, Irena
2017-01-01
Background: A large body of empirical literature indicates that gender and financial literacy are significant determinants of individual financial performance. Objectives: The purpose of this paper is to recognize the impact of the variable financial literacy and the variable gender on the variation of the financial performance using the regression analysis. Methods/Approach: The survey was conducted using the systematically chosen random sample of Croatian financial consumers. The cross sect...
Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun
2014-12-01
Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.
Spatial stochastic regression modelling of urban land use
International Nuclear Information System (INIS)
Arshad, S H M; Jaafar, J; Abiden, M Z Z; Latif, Z A; Rasam, A R A
2014-01-01
Urbanization is very closely linked to industrialization, commercialization or overall economic growth and development. This brings innumerable benefits to the quantity and quality of the urban environment and lifestyle, but on the other hand contributes to unbounded development, urban sprawl, overcrowding and a decreasing standard of living. Regulation and observation of urban development activities are crucial. An understanding of the urban systems that promote urban growth is also essential for policy making, formulating development strategies and preparing development plans. This study aims to compare two different stochastic regression modeling techniques for spatial structure models of urban growth in the same study area. Both techniques will utilize the same datasets and their results will be analyzed. The work starts by producing an urban growth model using two stochastic regression modeling techniques, namely Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR). When the two techniques are compared, GWR appears to be the better stochastic regression model: it gives a smaller AICc (corrected Akaike Information Criterion) value than OLS and its output is more spatially explainable.
Leadership and regressive group processes: a pilot study.
Rudden, Marie G; Twemlow, Stuart; Ackerman, Steven
2008-10-01
Various perspectives on leadership within the psychoanalytic, organizational and sociobiological literature are reviewed, with particular attention to research studies in these areas. Hypotheses are offered about what makes an effective leader: her ability to structure tasks well in order to avoid destructive regressions, to make constructive use of the omnipresent regressive energies in group life, and to redirect regressions when they occur. Systematic qualitative observations of three videotaped sessions each from N = 18 medical staff work groups at an urban medical center are discussed, as is the utility of a scale, the Leadership and Group Regressions Scale (LGRS), that attempts to operationalize the hypotheses. Analyzing the tapes qualitatively, it was noteworthy that at times (in N = 6 groups), the nominal leader of the group did not prove to be the actual, working leader. Quantitatively, a significant correlation was seen between leaders' LGRS scores and the group's satisfactory completion of their quantitative goals (p = 0.007) and ability to sustain the goals (p = 0.04), when the score of the person who met criteria for group leadership was used.
Testing for marginal linear effects in quantile regression
Wang, Huixia Judy
2017-10-23
The paper develops a new marginal testing procedure to detect significant predictors that are associated with the conditional quantiles of a scalar response. The idea is to fit the marginal quantile regression on each predictor one at a time, and then to base the test on the t-statistics that are associated with the most predictive predictors. A resampling method is devised to calibrate this test statistic, which has non-regular limiting behaviour due to the selection of the most predictive variables. Asymptotic validity of the procedure is established in a general quantile regression setting in which the marginal quantile regression models can be misspecified. Even though a fixed dimension is assumed to derive the asymptotic results, the test proposed is applicable and computationally feasible for large dimensional predictors. The method is more flexible than existing marginal screening test methods based on mean regression and has the added advantage of being robust against outliers in the response. The approach is illustrated by using an application to a human immunodeficiency virus drug resistance data set.
Testing for marginal linear effects in quantile regression
Wang, Huixia Judy; McKeague, Ian W.; Qian, Min
2017-01-01
The paper develops a new marginal testing procedure to detect significant predictors that are associated with the conditional quantiles of a scalar response. The idea is to fit the marginal quantile regression on each predictor one at a time, and then to base the test on the t-statistics that are associated with the most predictive predictors. A resampling method is devised to calibrate this test statistic, which has non-regular limiting behaviour due to the selection of the most predictive variables. Asymptotic validity of the procedure is established in a general quantile regression setting in which the marginal quantile regression models can be misspecified. Even though a fixed dimension is assumed to derive the asymptotic results, the test proposed is applicable and computationally feasible for large dimensional predictors. The method is more flexible than existing marginal screening test methods based on mean regression and has the added advantage of being robust against outliers in the response. The approach is illustrated by using an application to a human immunodeficiency virus drug resistance data set.
International Nuclear Information System (INIS)
Staskiewicz, Grzegorz; Czekajska-Chehab, Elżbieta; Uhlig, Sebastian; Przegalinski, Jerzy; Maciejewski, Ryszard; Drop, Andrzej
2013-01-01
Purpose: Diagnosis of right ventricular dysfunction in patients with acute pulmonary embolism (PE) is known to be associated with increased risk of mortality. The aim of the study was to calculate a logistic regression model for reliable identification of right ventricular dysfunction (RVD) in patients diagnosed with computed tomography pulmonary angiography. Material and methods: Ninety-seven consecutive patients with acute pulmonary embolism were divided into groups with and without RVD based on echocardiographic measurement of pulmonary artery systolic pressure (PASP). PE severity was graded with the pulmonary obstruction score. CT measurements of heart chambers and mediastinal vessels were performed; position of interventricular septum and presence of contrast reflux into the inferior vena cava were also recorded. The logistic regression model was prepared by means of stepwise logistic regression. Results: Among the parameters used, the final model consisted of pulmonary obstruction score, short axis diameter of right ventricle and diameter of inferior vena cava. The calculated model is characterized by 79% sensitivity and 81% specificity, and its performance was significantly better than that of single CT-based measurements. Conclusion: The logistic regression model identifies RVD significantly better than single CT-based measurements.
Controlling attribute effect in linear regression
Calders, Toon; Karim, Asim A.; Kamiran, Faisal; Ali, Wasif Mohammad; Zhang, Xiangliang
2013-01-01
In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then we analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or residuals of the models. Experiments with discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques are successful in controlling attribute effects in linear regression models. © 2013 IEEE.
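As a rough illustration of the idea of minimizing squared error under a constraint on the mean outcome, the sketch below forces the mean prediction to be equal in two groups. It is one possible formulation under stated assumptions, not the authors' derivation: the synthetic data, the variable names (`s`, `x1`, `x2`) and the equality-constrained least-squares (KKT) solution are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
s = rng.integers(0, 2, size=n)             # hypothetical binary attribute to control
x1 = s + rng.standard_normal(n)            # feature correlated with the attribute
x2 = rng.standard_normal(n)
y = 2.0 * x1 + 0.5 * x2 + rng.standard_normal(n)

X = np.column_stack([np.ones(n), x1, x2])  # the attribute itself is not a regressor

# Unconstrained ordinary least squares for comparison.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Constraint c @ beta = 0: equal mean prediction in both groups.
c = X[s == 1].mean(axis=0) - X[s == 0].mean(axis=0)

# KKT system for: minimise ||y - X beta||^2 subject to c @ beta = 0.
p = X.shape[1]
kkt = np.zeros((p + 1, p + 1))
kkt[:p, :p] = X.T @ X
kkt[:p, p] = c
kkt[p, :p] = c
rhs = np.append(X.T @ y, 0.0)
beta_con = np.linalg.solve(kkt, rhs)[:p]

def mean_gap(beta):
    """Difference in mean prediction between the two groups."""
    pred = X @ beta
    return pred[s == 1].mean() - pred[s == 0].mean()

gap_ols = mean_gap(beta_ols)
gap_con = mean_gap(beta_con)
print(f"mean prediction gap: OLS={gap_ols:.3f}, constrained={gap_con:.2e}")
```

The constrained fit trades some squared error for a zero gap in mean predictions; whether that trade is appropriate depends on the application.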
Stochastic development regression using method of moments
DEFF Research Database (Denmark)
Kühnel, Line; Sommer, Stefan Horst
2017-01-01
This paper considers the estimation problem arising when inferring parameters in the stochastic development regression model for manifold valued non-linear data. Stochastic development regression captures the relation between manifold-valued response and Euclidean covariate variables using...... the stochastic development construction. It is thereby able to incorporate several covariate variables and random effects. The model is intrinsically defined using the connection of the manifold, and the use of stochastic development avoids linearizing the geometry. We propose to infer parameters using...... the Method of Moments procedure that matches known constraints on moments of the observations conditional on the latent variables. The performance of the model is investigated in a simulation example using data on finite dimensional landmark manifolds....
Beta-binomial regression and bimodal utilization.
Liu, Chuan-Fen; Burgess, James F; Manning, Willard G; Maciejewski, Matthew L
2013-10-01
To illustrate how the analysis of bimodal U-shaped distributed utilization can be modeled with beta-binomial regression, which is rarely used in health services research. Veterans Affairs (VA) administrative data and Medicare claims in 2001-2004 for 11,123 Medicare-eligible VA primary care users in 2000. We compared means and distributions of VA reliance (the proportion of all VA/Medicare primary care visits occurring in VA) predicted from beta-binomial, binomial, and ordinary least-squares (OLS) models. Beta-binomial model fits the bimodal distribution of VA reliance better than binomial and OLS models due to the nondependence on normality and the greater flexibility in shape parameters. Increased awareness of beta-binomial regression may help analysts apply appropriate methods to outcomes with bimodal or U-shaped distributions. © Health Research and Educational Trust.
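To make the point about shape flexibility concrete, the following sketch compares a beta-binomial probability mass function with a binomial one of the same mean. The shape parameters (`a = b = 0.3`) and the number of trials are hypothetical values chosen only to produce a U shape; they are not taken from the study.

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(k, n, a, b):
    """P(K = k) for a beta-binomial with n trials and shape parameters a, b."""
    return comb(n, k) * exp(log_beta(k + a, n - k + b) - log_beta(a, b))

def binomial_pmf(k, n, p):
    """P(K = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)

n = 10
a, b = 0.3, 0.3                      # hypothetical shapes; both below 1 give a U shape
bb = [beta_binomial_pmf(k, n, a, b) for k in range(n + 1)]
bi = [binomial_pmf(k, n, a / (a + b)) for k in range(n + 1)]

for k in range(n + 1):
    print(f"k={k:2d}  beta-binomial={bb[k]:.3f}  binomial={bi[k]:.3f}")
```

With both shape parameters below 1 the beta-binomial puts most mass at the extremes, whereas the binomial with the same mean is unimodal around the centre.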
Testing homogeneity in Weibull-regression models.
Bolfarine, Heleno; Valença, Dione M
2005-10-01
In survival studies with families or geographical units it may be of interest to test whether such groups are homogeneous for given explanatory variables. In this paper we consider score-type tests for group homogeneity based on a mixing model in which the group effect is modelled as a random variable. As opposed to hazard-based frailty models, this model presents survival times that, conditioned on the random effect, have an accelerated failure time representation. The test statistic requires only estimation of the conventional regression model without the random effect and does not require specifying the distribution of the random effect. The tests are derived for a Weibull regression model and, in the uncensored situation, a closed form is obtained for the test statistic. A simulation study is used for comparing the power of the tests. The proposed tests are applied to real data sets with censored data.
Are increases in cigarette taxation regressive?
Borren, P; Sutton, M
1992-12-01
Using the latest published data from Tobacco Advisory Council surveys, this paper re-evaluates the question of whether or not increases in cigarette taxation are regressive in the United Kingdom. The extended data set shows no evidence of increasing price-elasticity by social class as found in a major previous study. To the contrary, there appears to be no clear pattern in the price responsiveness of smoking behaviour across different social classes. Increases in cigarette taxation, while reducing smoking levels in all groups, fall most heavily on men and women in the lowest social class. Men and women in social class five can expect to pay eight and eleven times more of a tax increase respectively, than their social class one counterparts. Taken as a proportion of relative incomes, the regressive nature of increases in cigarette taxation is even more pronounced.
Controlling attribute effect in linear regression
Calders, Toon
2013-12-01
In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then we analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or residuals of the models. Experiments with discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques are successful in controlling attribute effects in linear regression models. © 2013 IEEE.
Regression Models For Multivariate Count Data.
Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei
2017-01-01
Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.
Model selection in kernel ridge regression
DEFF Research Database (Denmark)
Exterkate, Peter
2013-01-01
Kernel ridge regression is a technique to perform ridge regression with a potentially infinite number of nonlinear transformations of the independent variables as regressors. This method is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts....... The influence of the choice of kernel and the setting of tuning parameters on forecast accuracy is investigated. Several popular kernels are reviewed, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. The latter two kernels are interpreted in terms of their smoothing properties......, and the tuning parameters associated to all these kernels are related to smoothness measures of the prediction function and to the signal-to-noise ratio. Based on these interpretations, guidelines are provided for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study...
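A minimal sketch of kernel ridge regression with a Gaussian kernel, with tuning parameters picked from a small grid by cross-validation, might look as follows. The synthetic sine data, the grid values and the helper names (`krr_fit`, `krr_predict`, `cv_mse`) are assumptions made for illustration and do not reproduce the paper's Monte Carlo design.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def krr_fit(x, y, sigma, lam):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    k = gaussian_kernel(x, x, sigma)
    return np.linalg.solve(k + lam * np.eye(len(x)), y)

def krr_predict(x_train, alpha, x_new, sigma):
    return gaussian_kernel(x_new, x_train, sigma) @ alpha

def cv_mse(x, y, sigma, lam, n_folds=5):
    """Mean squared prediction error over interleaved folds."""
    idx = np.arange(len(x))
    errs = []
    for f in range(n_folds):
        test = idx[f::n_folds]
        train = np.setdiff1d(idx, test)
        alpha = krr_fit(x[train], y[train], sigma, lam)
        pred = krr_predict(x[train], alpha, x[test], sigma)
        errs.append(np.mean((pred - y[test]) ** 2))
    return float(np.mean(errs))

rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 60).reshape(-1, 1)
y = np.sin(x[:, 0]) + 0.1 * rng.standard_normal(60)   # synthetic noisy signal

# Small illustrative grid of (kernel width, ridge penalty) pairs.
grid = [(s, l) for s in (0.1, 0.5, 1.0, 2.0) for l in (1e-3, 1e-2, 1e-1, 1.0)]
scores = {p: cv_mse(x, y, *p) for p in grid}
best_sigma, best_lam = min(scores, key=scores.get)
print("selected sigma, lambda:", best_sigma, best_lam)
```

Here the kernel width controls the smoothness of the fitted function and the ridge penalty controls how strongly noise is damped.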
Confidence bands for inverse regression models
International Nuclear Information System (INIS)
Birke, Melanie; Bissantz, Nicolai; Holzmann, Hajo
2010-01-01
We construct uniform confidence bands for the regression function in inverse, homoscedastic regression models with convolution-type operators. Here, the convolution is between two non-periodic functions on the whole real line rather than between two periodic functions on a compact interval, since the former situation arguably arises more often in applications. First, following Bickel and Rosenblatt (1973 Ann. Stat. 1 1071–95) we construct asymptotic confidence bands which are based on strong approximations and on a limit theorem for the supremum of a stationary Gaussian process. Further, we propose bootstrap confidence bands based on the residual bootstrap and prove consistency of the bootstrap procedure. A simulation study shows that the bootstrap confidence bands perform reasonably well for moderate sample sizes. Finally, we apply our method to data from a gel electrophoresis experiment with genetically engineered neuronal receptor subunits incubated with rat brain extract
Regressing Atherosclerosis by Resolving Plaque Inflammation
2017-07-01
...regression requires the alteration of macrophages in the plaques to a tissue repair “alternatively” activated state. This switch in activation state requires the action of TH2 cytokines interleukin (IL)-4 or IL-13. To...
Determination of regression laws: Linear and nonlinear
International Nuclear Information System (INIS)
Onishchenko, A.M.
1994-01-01
A detailed mathematical determination of regression laws is presented in the article. Particular emphasis is placed on determining the laws of X_j on X_l to account for source nuclei decay and detector errors in nuclear physics instrumentation. Both linear and nonlinear relations are presented. Linearization of 19 functions is tabulated, including graph, relation, variable substitution, obtained linear function, and remarks. 6 refs., 1 tab.
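As an example of linearization by variable substitution, an exponential decay law y = a·exp(b·x) becomes linear after taking logarithms: ln y = ln a + b·x. The sketch below fits that line by ordinary least squares. The exponential form is chosen here only for illustration (the abstract does not list the 19 tabulated functions), and the parameter values are hypothetical.

```python
import math

def fit_line(u, v):
    """Ordinary least-squares intercept and slope of v on u."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    sxx = sum((ui - mu) ** 2 for ui in u)
    sxy = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    slope = sxy / sxx
    return mv - slope * mu, slope

a_true, b_true = 5.0, -0.3            # hypothetical amplitude and decay constant
x = [0.5 * i for i in range(20)]
y = [a_true * math.exp(b_true * xi) for xi in x]   # noise-free decay curve

# Substitution v = ln(y) turns the exponential law into a straight line in x.
intercept, slope = fit_line(x, [math.log(yi) for yi in y])
a_hat, b_hat = math.exp(intercept), slope
print(f"a = {a_hat:.4f}, b = {b_hat:.4f}")
```

With noisy data the log transform also changes the error structure, so the back-transformed estimates are no longer least-squares estimates on the original scale.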
Directional quantile regression in Octave (and MATLAB)
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2016-01-01
Vol. 52, No. 1 (2016), pp. 28-51. ISSN 0023-5954. R&D Projects: GA ČR GA14-07234S. Institutional support: RVO:67985556. Keywords: quantile regression * multivariate quantile * depth contour * Matlab. Subject RIV: IN - Informatics, Computer Science. Impact factor: 0.379, year: 2016. http://library.utia.cas.cz/separaty/2016/SI/bocek-0458380.pdf
Logistic regression a self-learning text
Kleinbaum, David G
1994-01-01
This textbook provides students and professionals in the health sciences with a presentation of the use of logistic regression in research. The text is self-contained, and designed to be used either in class or as a tool for self-study. It arises from the author's many years of experience teaching this material, and the notes on which it is based have been extensively used throughout the world.
Multitask Quantile Regression under the Transnormal Model.
Fan, Jianqing; Xue, Lingzhou; Zou, Hui
2016-01-01
We consider estimating multi-task quantile regression under the transnormal model, with a focus on the high-dimensional setting. We derive a surprisingly simple closed-form solution through rank-based covariance regularization. In particular, we propose the rank-based ℓ1 penalization with positive definite constraints for estimating sparse covariance matrices, and the rank-based banded Cholesky decomposition regularization for estimating banded precision matrices. By taking advantage of the alternating direction method of multipliers, a nearest correlation matrix projection is introduced that inherits the sampling properties of the unprojected one. Our work combines the strengths of quantile regression and rank-based covariance regularization to simultaneously deal with nonlinearity and nonnormality in high-dimensional regression. Furthermore, the proposed method strikes a good balance between robustness and efficiency, achieves the "oracle"-like convergence rate, and provides a provable prediction interval in the high-dimensional setting. The finite-sample performance of the proposed method is also examined. The performance of our proposed rank-based method is demonstrated in a real application analyzing protein mass spectroscopy data.
Complex regression Doppler optical coherence tomography
Elahi, Sahar; Gu, Shi; Thrane, Lars; Rollins, Andrew M.; Jenkins, Michael W.
2018-04-01
We introduce a new method to measure Doppler shifts more accurately and extend the dynamic range of Doppler optical coherence tomography (OCT). The two-point estimate of the conventional Doppler method is replaced with a regression that is applied to high-density B-scans in polar coordinates. We built a high-speed OCT system using a 1.68-MHz Fourier domain mode locked laser to acquire high-density B-scans (16,000 A-lines) at high enough frame rates (~100 fps) to accurately capture the dynamics of the beating embryonic heart. Flow phantom experiments confirm that the complex regression lowers the minimum detectable velocity from 12.25 mm/s to 374 μm/s, whereas the maximum velocity of 400 mm/s is measured without phase wrapping. Complex regression Doppler OCT also demonstrates higher accuracy and precision compared with the conventional method, particularly when signal-to-noise ratio is low. The extended dynamic range allows monitoring of blood flow over several stages of development in embryos without adjusting the imaging parameters. In addition, applying complex averaging recovers hidden features in structural images.
Linear regression and the normality assumption.
Schmidt, Amand F; Finan, Chris
2017-12-16
Researchers often perform arbitrary outcome transformations to fulfill the normality assumption of a linear regression model. This commentary explains and illustrates that in large data settings, such transformations are often unnecessary and, worse, may bias model estimates. Linear regression assumptions are illustrated using simulated data and an empirical example on the relation between time since type 2 diabetes diagnosis and glycated hemoglobin levels. Simulation results were evaluated on coverage; i.e., the number of times the 95% confidence interval included the true slope coefficient. Although outcome transformations bias point estimates, violations of the normality assumption in linear regression analyses do not. The normality assumption is necessary to unbiasedly estimate standard errors, and hence confidence intervals and P-values. However, in large sample sizes (e.g., where the number of observations per variable is >10) violations of this normality assumption often do not noticeably impact results. In contrast, assumptions on the parametric model, absence of extreme observations, homoscedasticity, and independence of the errors remain influential even in large sample size settings. Given that modern healthcare research typically includes thousands of subjects, focusing on the normality assumption is often unnecessary, does not guarantee valid results, and, worse, may bias estimates due to the practice of outcome transformations. Copyright © 2017 Elsevier Inc. All rights reserved.
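The coverage criterion described above can be sketched with a small simulation: fit a simple linear regression to data with skewed, non-normal errors and count how often the 95% confidence interval contains the true slope. Everything in the sketch is an assumption made for illustration (the exponential error distribution, the sample size, the true slope, and the large-sample normal quantile 1.96 in place of a t quantile); it does not reproduce the commentary's own simulations.

```python
import math
import random

def slope_ci(x, y, z=1.96):
    """Approximate 95% confidence interval for the slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return b - z * se, b + z * se

def coverage(n, reps, true_slope=0.5, seed=1):
    """Fraction of simulated data sets whose interval contains the true slope."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Skewed, non-normal errors: exponential shifted to have mean zero.
        y = [1.0 + true_slope * xi + rng.expovariate(1.0) - 1.0 for xi in x]
        lo, hi = slope_ci(x, y)
        hits += lo <= true_slope <= hi
    return hits / reps

cov = coverage(n=200, reps=1000)
print(f"empirical coverage with skewed errors: {cov:.3f}")
```

Rerunning `coverage` with a much smaller `n` is a quick way to see where the large-sample argument starts to weaken.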
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior: a multivariate normal distribution for the regression coefficients and an inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Face Alignment via Regressing Local Binary Features.
Ren, Shaoqing; Cao, Xudong; Wei, Yichen; Sun, Jian
2016-03-01
This paper presents a highly efficient and accurate regression approach for face alignment. Our approach has two novel components: 1) a set of local binary features and 2) a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. This approach achieves state-of-the-art results when tested on the most challenging benchmarks to date. Furthermore, because extracting and regressing local binary features are computationally very cheap, our system is much faster than previous methods. It achieves over 3000 frames per second (FPS) on a desktop or 300 FPS on a mobile phone for locating a few dozen landmarks. We also study a key issue that is important but has received little attention in previous research: the face detector used to initialize alignment. We investigate several face detectors and perform a quantitative evaluation of how they affect alignment accuracy. We find that an alignment-friendly detector can further greatly boost the accuracy of our alignment method, reducing the error by up to 16% in relative terms. To facilitate practical usage of face detection/alignment methods, we also propose a convenient metric to measure how good a detector is for alignment initialization.
Mixed-effects regression models in linguistics
Heylen, Kris; Geeraerts, Dirk
2018-01-01
When data consist of grouped observations or clusters, and there is a risk that measurements within the same group are not independent, group-specific random effects can be added to a regression model in order to account for such within-group associations. Regression models that contain such group-specific random effects are called mixed-effects regression models, or simply mixed models. Mixed models are a versatile tool that can handle both balanced and unbalanced datasets and that can also be applied when several layers of grouping are present in the data; these layers can either be nested or crossed. In linguistics, as in many other fields, the use of mixed models has gained ground rapidly over the last decade. This methodological evolution enables us to build more sophisticated and arguably more realistic models, but, due to its technical complexity, also introduces new challenges. This volume brings together a number of promising new evolutions in the use of mixed models in linguistics, but also addres...
On logistic regression analysis of dichotomized responses.
Lu, Kaifeng
2017-01-01
We study the properties of the treatment effect estimate in terms of the odds ratio at the study end point from a logistic regression model adjusting for the baseline value when the underlying continuous repeated measurements follow a multivariate normal distribution. Compared with the analysis that does not adjust for the baseline value, the adjusted analysis produces a larger treatment effect as well as a larger standard error. However, the increase in standard error is more than offset by the increase in treatment effect so that the adjusted analysis is more powerful than the unadjusted analysis for detecting the treatment effect. On the other hand, the true adjusted odds ratio implied by the normal distribution of the underlying continuous variable is a function of the baseline value and hence is unlikely to be adequately represented by a single value of the adjusted odds ratio from the logistic regression model. In contrast, the risk difference function derived from the logistic regression model provides a reasonable approximation to the true risk difference function implied by the normal distribution of the underlying continuous variable over the range of the baseline distribution. We show that different metrics of treatment effect have similar statistical power when evaluated at the baseline mean. Copyright © 2016 John Wiley & Sons, Ltd.
General regression and representation model for classification.
Directory of Open Access Journals (Sweden)
Jianjun Qian
Recently, the regularized coding-based classification methods (e.g. SRC and CRC) have shown great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has the advantages of CRC, but also makes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients) and the specific information (weight matrix of image pixels) to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: the basic general regression and representation classifier (B-GRR) and the robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of the proposed methods over state-of-the-art algorithms.
Image superresolution using support vector regression.
Ni, Karl S; Nguyen, Truong Q
2007-06-01
A thorough investigation of the application of support vector regression (SVR) to the superresolution problem is conducted through various frameworks. Prior to the study, the SVR problem is enhanced by finding the optimal kernel. This is done by formulating the kernel learning problem in SVR form as a convex optimization problem, specifically a semi-definite programming (SDP) problem. An additional constraint is added to reduce the SDP to a quadratically constrained quadratic programming (QCQP) problem. After this optimization, investigation of the relevancy of SVR to superresolution proceeds with the possibility of using a single and general support vector regression for all image content, and the results are impressive for small training sets. This idea is improved upon by observing structural properties in the discrete cosine transform (DCT) domain to aid in learning the regression. Further improvement involves a combination of classification and SVR-based techniques, extending works in resolution synthesis. This method, termed kernel resolution synthesis, uses specific regressors for isolated image content to describe the domain through a partitioned look of the vector space, thereby yielding good results.
Song, Chao; Kwan, Mei-Po; Zhu, Jiping
2017-04-08
An increasing number of fires are occurring with the rapid development of cities, resulting in increased risk for human beings and the environment. This study compares geographically weighted regression-based models, including geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), which integrates spatial and temporal effects, with global linear regression models (LM) for modeling fire risk at the city scale. The results show that the road density and the spatial distribution of enterprises have the strongest influences on fire risk, which implies that we should focus on areas where roads and enterprises are densely clustered. In addition, locations with a large number of enterprises have fewer fire ignition records, probably because of strict management and prevention measures. A changing number of significant variables across space indicates that heterogeneity mainly exists in the northern and eastern rural and suburban areas of Hefei city, where human-related facilities or road construction are only clustered in the city sub-centers. GTWR can capture small changes in the spatiotemporal heterogeneity of the variables while GWR and LM cannot. An approach that integrates space and time enables us to better understand the dynamic changes in fire risk. Thus governments can use the results to manage fire safety at the city scale.
International Nuclear Information System (INIS)
Jafri, Y.Z.; Kamal, L.
2007-01-01
Various statistical techniques were used on five-year data (1998-2002) of average humidity, rainfall, and maximum and minimum temperatures. The relationships to regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters, on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit for our polynomial regression analysis time series (PRATS). The correlations to multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, and indicated homoscedasticity in both. We also employed Bartlett's test for homogeneity of variances on the five-year rainfall and humidity data, which showed that the variances in the rainfall data were not homogeneous whereas those in the humidity data were. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)
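The use of the coefficient of determination as a goodness-of-fit measure for polynomial trend fits can be sketched as below. The monthly series is synthetic (a made-up quadratic trend plus noise), not the Quetta climate data, and the choice of polynomial degrees is an assumption for illustration.

```python
import numpy as np

def r_squared(y, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(42)
t = np.arange(60, dtype=float)                   # 60 monthly time steps (five years)
y = 15.0 + 0.4 * t - 0.005 * t ** 2 + rng.normal(0.0, 1.0, 60)   # synthetic series

r2 = {}
for degree in (1, 2, 3):
    coeffs = np.polyfit(t, y, degree)            # least-squares polynomial fit
    r2[degree] = r_squared(y, np.polyval(coeffs, t))
    print(f"degree {degree}: R^2 = {r2[degree]:.3f}")
```

Because the polynomial models are nested, the in-sample coefficient of determination cannot decrease as the degree grows, so it should be read alongside residual diagnostics rather than on its own.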
van Veen, S H C M; van Kleef, R C; van de Ven, W P M M; van Vliet, R C J A
2018-02-01
This study explores the predictive power of interaction terms between the risk adjusters in the Dutch risk equalization (RE) model of 2014. Due to the sophistication of this RE-model and the complexity of the associations in the dataset (N = ~16.7 million), there are theoretically more than a million interaction terms. We used regression tree modelling, which has been applied rarely within the field of RE, to identify interaction terms that statistically significantly explain variation in observed expenses that is not already explained by the risk adjusters in this RE-model. The interaction terms identified were used as additional risk adjusters in the RE-model. We found evidence that interaction terms can improve the prediction of expenses overall and for specific groups in the population. However, the prediction of expenses for some other selective groups may deteriorate. Thus, interactions can reduce financial incentives for risk selection for some groups but may increase them for others. Furthermore, because regression trees are not robust, additional criteria are needed to decide which interaction terms should be used in practice. These criteria could be the right incentive structure for risk selection and efficiency or the opinion of medical experts. Copyright © 2017 John Wiley & Sons, Ltd.
Identifying the Prognosis Factors in Death after Liver Transplantation via Adaptive LASSO in Iran
Directory of Open Access Journals (Sweden)
Hadi Raeisi Shahraki
2016-01-01
Despite the widespread use of liver transplantation as a routine therapy in liver diseases, the factors affecting its outcomes are still controversial. This study attempted to identify the factors most strongly associated with death after liver transplantation. For this purpose, a modified least absolute shrinkage and selection operator (LASSO), called Adaptive LASSO, was utilized. One of the main advantages of this method is that it can handle a large number of factors. Therefore, in a historical cohort study from 2008 to 2013, the clinical findings of 680 patients undergoing liver transplant surgery were considered. Ridge and Adaptive LASSO regression methods were then implemented to identify the most influential factors on death. To compare the performance of these two models, the receiver operating characteristic (ROC) curve was used. According to the results, 12 factors in Ridge regression and 9 in Adaptive LASSO regression were significant. The area under the ROC curve (AUC) of Adaptive LASSO was 89% (95% CI: 86%–91%), which was significantly greater than that of Ridge regression (64%, 95% CI: 61%–68%) (p<0.001). In conclusion, the significant factors and the performance criteria revealed the superiority of the Adaptive LASSO method as a penalized model over the traditional regression model in the present study.
Ribaroff, G A; Wastnedge, E; Drake, A J; Sharpe, R M; Chambers, T J G
2017-06-01
Animal models of maternal high fat diet (HFD) demonstrate perturbed offspring metabolism although the effects differ markedly between models. We assessed studies investigating metabolic parameters in the offspring of HFD fed mothers to identify factors explaining these inter-study differences. A total of 171 papers were identified, which provided data from 6047 offspring. Data were extracted regarding body weight, adiposity, glucose homeostasis and lipidaemia. Information regarding the macronutrient content of diet, species, time point of exposure and gestational weight gain was collected and utilized in meta-regression models to explore predictive factors. Publication bias was assessed using Egger's regression test. Maternal HFD exposure did not affect offspring birthweight but increased weaning weight, final bodyweight, adiposity, triglyceridaemia, cholesterolaemia and insulinaemia in both female and male offspring. Hyperglycaemia was found in female offspring only. Meta-regression analysis identified lactational HFD exposure as a key moderator. The fat content of the diet did not correlate with any outcomes. There was evidence of significant publication bias for all outcomes except birthweight. Maternal HFD exposure was associated with perturbed metabolism in offspring, but the variation between studies was not accounted for by dietary constituents, species, strain or maternal gestational weight gain. Specific weaknesses in experimental design predispose many of the results to bias. © 2017 The Authors. Obesity Reviews published by John Wiley & Sons Ltd on behalf of World Obesity Federation.
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.
2008-01-01
Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of
Maximum Entropy Discrimination Poisson Regression for Software Reliability Modeling.
Chatzis, Sotirios P; Andreou, Andreas S
2015-11-01
Reliably predicting software defects is one of the most significant tasks in software engineering. Two of the major components of modern software reliability modeling approaches are: 1) extraction of salient features for software system representation, based on appropriately designed software metrics and 2) development of intricate regression models for count data, to allow effective software reliability data modeling and prediction. Surprisingly, research in the latter frontier of count data regression modeling has been rather limited. More specifically, a lack of simple and efficient algorithms for posterior computation has made the Bayesian approaches appear unattractive, and thus underdeveloped in the context of software reliability modeling. In this paper, we try to address these issues by introducing a novel Bayesian regression model for count data, based on the concept of max-margin data modeling, effected in the context of a fully Bayesian model treatment with simple and efficient posterior distribution updates. Our novel approach yields a more discriminative learning technique, making more effective use of our training data during model inference. In addition, it allows better handling of uncertainty in the modeled data, which can be a significant problem when the training data are limited. We derive elegant inference algorithms for our model under the mean-field paradigm and exhibit its effectiveness using publicly available benchmark data sets.
On macroeconomic values investigation using fuzzy linear regression analysis
Directory of Open Access Journals (Sweden)
Richard Pospíšil
2017-06-01
The theoretical background for abstract formalization of the vague phenomenon of complex systems is fuzzy set theory. In the paper, vague data are defined as specialized fuzzy sets (fuzzy numbers), and a fuzzy linear regression model is described as a fuzzy function with fuzzy numbers as vague parameters. To identify the fuzzy coefficients of the model, a genetic algorithm is used. The linear approximation of the vague function together with its possibility area is expressed analytically and graphically. A suitable application is performed in the tasks of time series fuzzy regression analysis. The time trend and seasonal cycles, including their possibility areas, are calculated and expressed. The examples are taken from the field of economics, namely the time development of unemployment, agricultural production and construction between 2009 and 2011 in the Czech Republic. The results are shown in the form of fuzzy regression models of the time series variables. For the period 2009-2011, the analysis assumptions about the seasonal behaviour of the variables and the relationships between them were confirmed; in 2010, the system behaved more fuzzily and the relationships between the variables were vaguer, which has many causes, from the different elasticity of demand, through state interventions, to globalization and transnational impacts.
Approximate median regression for complex survey data with skewed response.
Fraser, Raphael André; Lipsitz, Stuart R; Sinha, Debajyoti; Fitzmaurice, Garrett M; Pan, Yi
2016-12-01
The ready availability of public-use data from various large national complex surveys has immense potential for the assessment of population characteristics using regression models. Complex surveys can be used to identify risk factors for important diseases such as cancer. Existing statistical methods based on estimating equations and/or utilizing resampling methods are often not valid with survey data due to complex survey design features, that is, stratification, multistage sampling, and weighting. In this article, we accommodate these design features in the analysis of highly skewed response variables arising from large complex surveys. Specifically, we propose a double-transform-both-sides (DTBS)-based estimating equations approach to estimate the median regression parameters of the highly skewed response; the DTBS approach applies the same Box-Cox type transformation twice to both the outcome and regression function. The usual sandwich variance estimate can be used in our approach, whereas a resampling approach would be needed for a pseudo-likelihood based on minimizing absolute deviations (MAD). Furthermore, the approach is relatively robust to the true underlying distribution, and has much smaller mean square error than a MAD approach. The method is motivated by an analysis of laboratory data on urinary iodine (UI) concentration from the National Health and Nutrition Examination Survey. © 2016, The International Biometric Society.
Significant Factors Determining E-government Adoption in Selangor, Malaysia
Directory of Open Access Journals (Sweden)
Siti Hajar Mohd Idris
2016-06-01
Studies have shown that a low adoption rate among citizens has been hindering the optimization of e-Government services, especially in developing countries. Hence, one of the critical measures that has to be undertaken is to identify and overcome possible barriers in order to facilitate a higher rate of adoption. Multistage stratified sampling was used in this study to collect data from 1000 respondents, both users and non-users, residing in the state of Selangor, Malaysia. This state was chosen to provide a better understanding of low adoption when issues of basic facilities have been successfully overcome. An exploratory factor analysis was performed to identify latent constructs and seven key factors were identified. A multiple regression model was subsequently used to analyze significant factors in determining the willingness to use e-Government services. The determinants are language barrier, educational level, secure, format, easy to use, enjoyable, reliable, visual appeal and infrastructure. The results show that the significant variables that act as barriers to adoption are reliable, enjoyable, easy to use, secure, and language used. The constraints pointed out in the open-ended questions mainly focus on the issues of accessibility, ease of use and awareness. Overcoming these obstacles is therefore crucial in order to enhance the usage of e-Government services, which consequently will improve the quality of public administration in Malaysia.
Model building strategy for logistic regression: purposeful selection.
Zhang, Zhongheng
2016-03-01
Logistic regression is one of the most commonly used models to account for confounders in the medical literature. The article introduces how to perform the purposeful selection model-building strategy with R. I stress the use of the likelihood ratio test to see whether deleting a variable will have a significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment of the remaining covariates. Interactions should be checked to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness-of-fit (GOF), in other words, how well the fitted model reflects the real data. The Hosmer-Lemeshow GOF test is the most widely used for logistic regression models.
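The likelihood ratio test mentioned in this abstract can be sketched as follows. The article itself works in R; this is only a minimal Python illustration for the case where a single variable is deleted (one degree of freedom), and the function name and the log-likelihood values in the usage note are hypothetical.

```python
import math

def lr_test_one_df(loglik_full, loglik_reduced):
    """Likelihood ratio test for dropping ONE parameter from a nested model.

    Returns (statistic, p_value). The statistic 2*(ll_full - ll_reduced) is
    compared with a chi-square distribution with 1 degree of freedom, whose
    survival function equals erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods of a full and a reduced logistic model.
stat, p = lr_test_one_df(-100.0, -101.92)
```

A small p-value suggests that deleting the variable noticeably worsens the fit, so the variable would be kept; what counts as "small" is the analyst's choice.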
Testing the Perturbation Sensitivity of Abortion-Crime Regressions
Directory of Open Access Journals (Sweden)
Michał Brzeziński
2012-06-01
The hypothesis that the legalisation of abortion contributed significantly to the reduction of crime in the United States in the 1990s is one of the most prominent ideas from the recent “economics-made-fun” movement sparked by the book Freakonomics. This paper expands on the existing literature about the computational stability of abortion-crime regressions by testing the sensitivity of coefficient estimates to small amounts of data perturbation. In contrast to previous studies, we use a new data set on crime correlates for each of the US states, the original model specification and estimation methodology, and an improved data perturbation algorithm. We find that the coefficient estimates in abortion-crime regressions are not computationally stable and, therefore, are unreliable.
Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M
2007-09-01
Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.
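The remission criterion stated in this abstract can be written as a small predicate. This sketch assumes that "a reduction of seven points" means a drop of at least seven points from baseline; the function and parameter names are illustrative, not taken from the study.

```python
def is_remission(baseline, current, cutoff=13, min_drop=7):
    """Return True if a Y-BOCS measurement counts as remission.

    Assumed reading of the abstract: the score must be below `cutoff`
    AND at least `min_drop` points lower than the baseline score.
    """
    return current < cutoff and (baseline - current) >= min_drop
```

Under this reading, a patient going from 25 to 12 is in remission, while a patient going from 18 to 12 is not, because the drop is only six points.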
A method for nonlinear exponential regression analysis
Junkin, B. G.
1971-01-01
A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
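The procedure described in this abstract resembles a Gauss-Newton iteration. The following is a minimal sketch for the single-exponential model y = a·exp(b·t), an assumed model form since the abstract does not give the exact one; all function names are illustrative.

```python
import math

def log_linear_start(t, y):
    """Initial estimates from a straight-line fit of ln(y) on t (needs y > 0)."""
    n = len(t)
    ly = [math.log(v) for v in y]
    t_mean, l_mean = sum(t) / n, sum(ly) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (li - l_mean) for ti, li in zip(t, ly))
    b = sxy / sxx
    a = math.exp(l_mean - b * t_mean)
    return a, b

def gauss_newton_exp(t, y, a, b, tol=1e-12, max_iter=50):
    """Refine (a, b) by repeatedly solving the linearized least squares problem."""
    for _ in range(max_iter):
        jaa = jab = jbb = ga = gb = 0.0
        for ti, yi in zip(t, y):
            e = math.exp(b * ti)
            r = yi - a * e              # residual at the current estimate
            da, db = e, a * ti * e      # partial derivatives w.r.t. a and b
            jaa += da * da
            jab += da * db
            jbb += db * db
            ga += da * r
            gb += db * r
        det = jaa * jbb - jab * jab     # determinant of the 2x2 normal matrix
        step_a = (jbb * ga - jab * gb) / det
        step_b = (jaa * gb - jab * ga) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < tol:   # stopping criterion
            break
    return a, b
```

The first-order Taylor expansion turns each cycle into a 2x2 linear solve, and the correction is added to the nominal estimate until the step size falls below the tolerance, mirroring the "repeat until some predetermined criterion is satisfied" step.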
Three Contributions to Robust Regression Diagnostics
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2015-01-01
Vol. 11, No. 2 (2015), pp. 69-78. ISSN 1336-9180. Grant - others: GA ČR (CZ) GA13-01930S; Nadační fond na podporu vědy (CZ) Neuron. Institutional support: RVO:67985807. Keywords: robust regression * robust econometrics * hypothesis testing. Subject RIV: BA - General Mathematics. http://www.degruyter.com/view/j/jamsi.2015.11.issue-2/jamsi-2015-0013/jamsi-2015-0013.xml?format=INT
SDE based regression for random PDEs
Bayer, Christian
2016-01-01
A simulation based method for the numerical solution of PDE with random coefficients is presented. By the Feynman-Kac formula, the solution can be represented as conditional expectation of a functional of a corresponding stochastic differential equation driven by independent noise. A time discretization of the SDE for a set of points in the domain and a subsequent Monte Carlo regression lead to an approximation of the global solution of the random PDE. We provide an initial error and complexity analysis of the proposed method along with numerical examples illustrating its behaviour.
Bayesian regression of piecewise homogeneous Poisson processes
Directory of Open Access Journals (Sweden)
Diego Sevilla
2015-12-01
In this paper, a Bayesian method for piecewise regression is adapted to handle counting processes data distributed as Poisson. A numerical code in Mathematica is developed and tested analyzing simulated data. The resulting method is valuable for detecting breaking points in the count rate of time series for Poisson processes. Received: 2 November 2015, Accepted: 27 November 2015; Edited by: R. Dickman; Reviewed by: M. Hutter, Australian National University, Canberra, Australia; DOI: http://dx.doi.org/10.4279/PIP.070018 Cite as: D J R Sevilla, Papers in Physics 7, 070018 (2015)
Selecting a Regression Saturated by Indicators
DEFF Research Database (Denmark)
Hendry, David F.; Johansen, Søren; Santos, Carlos
We consider selecting a regression model, using a variant of Gets, when there are more variables than observations, in the special case that the variables are impulse dummies (indicators) for every observation. We show that the setting is unproblematic if tackled appropriately, and obtain the finite-sample distribution of estimators of the mean and variance in a simple location-scale model under the null that no impulses matter. A Monte Carlo simulation confirms the null distribution, and shows power against an alternative of interest.
Fixed kernel regression for voltammogram feature extraction
International Nuclear Information System (INIS)
Acevedo Rodriguez, F J; López-Sastre, R J; Gil-Jiménez, P; Maldonado Bascón, S; Ruiz-Reyes, N
2009-01-01
Cyclic voltammetry is an electroanalytical technique for obtaining information about substances under analysis without the need for complex flow systems. However, classifying the information in voltammograms obtained using this technique is difficult. In this paper, we propose the use of fixed kernel regression as a method for extracting features from these voltammograms, reducing the information to a few coefficients. The proposed approach has been applied to a wine classification problem with accuracy rates of over 98%. Although the method is described here for extracting voltammogram information, it can be used for other types of signals
Regression analysis for the social sciences
Gordon, Rachel A
2010-01-01
The book provides graduate students in the social sciences with the basic skills that they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly-available social science data or drawn from the literature. thorough integration of teaching statistical theory with teaching data processing and analysis. teaching of both SAS and Stata "side-by-side" and use of chapter exercises in which students practice programming and interpretation on the same data set and course exercises in which students can choose their own research questions and data set.
Neutrosophic Correlation and Simple Linear Regression
Directory of Open Access Journals (Sweden)
A. A. Salama
2014-09-01
Since the world is full of indeterminacy, neutrosophics have found their place in contemporary research. The fundamental concepts of the neutrosophic set were introduced by Smarandache. Recently, Salama et al. introduced the concept of the correlation coefficient of neutrosophic data. In this paper, we introduce and study the concepts of correlation and correlation coefficient of neutrosophic data in probability spaces and study some of their properties. We also introduce and study the neutrosophic simple linear regression model. Possible applications to data processing are touched upon.
Spectral density regression for bivariate extremes
Castro Camilo, Daniela
2016-05-11
We introduce a density regression model for the spectral density of a bivariate extreme value distribution, that allows us to assess how extremal dependence can change over a covariate. Inference is performed through a double kernel estimator, which can be seen as an extension of the Nadaraya–Watson estimator where the usual scalar responses are replaced by mean constrained densities on the unit interval. Numerical experiments with the methods illustrate their resilience in a variety of contexts of practical interest. An extreme temperature dataset is used to illustrate our methods. © 2016 Springer-Verlag Berlin Heidelberg
SPE dose prediction using locally weighted regression
International Nuclear Information System (INIS)
Hines, J. W.; Townsend, L. W.; Nichols, T. F.
2005-01-01
When astronauts are outside earth's protective magnetosphere, they are subject to large radiation doses resulting from solar particle events (SPEs). The total dose received from a major SPE in deep space could cause severe radiation poisoning. The dose is usually received over a 20-40 h time interval but the event's effects may be mitigated with an early warning system. This paper presents a method to predict the total dose early in the event. It uses a locally weighted regression model, which is easier to train and provides predictions as accurate as neural network models previously used. (authors)
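As a rough illustration of the locally weighted regression idea named in this abstract, the sketch below fits a weighted straight line around each query point. The Gaussian kernel, the single input variable and the `bandwidth` parameter are assumptions made for the example; the paper's actual inputs and kernel are not stated in the abstract.

```python
import math

def lwr_predict(xs, ys, x0, bandwidth=1.0):
    """Predict y at x0 with a locally weighted linear fit.

    Each training point gets a Gaussian weight that decays with its
    distance from x0; a weighted least squares line is then evaluated at x0.
    """
    w = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    sw = sum(w)
    x_mean = sum(wi * x for wi, x in zip(w, xs)) / sw
    y_mean = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - x_mean) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - x_mean) * (y - y_mean) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return y_mean + slope * (x0 - x_mean)
```

There is no separate training phase: the stored data are reweighted for every query, which is one reason such models are often described as easy to train.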
SPE dose prediction using locally weighted regression
International Nuclear Information System (INIS)
Hines, J. W.; Townsend, L. W.; Nichols, T. F.
2005-01-01
When astronauts are outside Earth's protective magnetosphere, they are subject to large radiation doses resulting from solar particle events. The total dose received from a major solar particle event in deep space could cause severe radiation poisoning. The dose is usually received over a 20-40 h time interval but the event's effects may be reduced with an early warning system. This paper presents a method to predict the total dose early in the event. It uses a locally weighted regression model, which is easier to train, and provides predictions as accurate as the neural network models that were used previously. (authors)
AIRLINE ACTIVITY FORECASTING BY REGRESSION MODELS
Directory of Open Access Journals (Sweden)
Н. Білак
2012-04-01
Linear and nonlinear regression models are proposed that take into account the trend equation and seasonality indices, for analysing and reconstructing the volume of passenger traffic over past periods and predicting it for future years, together with an algorithm for building these models based on statistical analysis over the years. The desired model is a first step towards the synthesis of more complex models, which will enable forecasting of airline passenger traffic (income level) with the highest accuracy and timeliness.
Logistic regression applied to natural hazards: rare event logistic regression with replications
Guns, M.; Vanacker, Veerle
2012-01-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logisti...
A robust regression based on weighted LSSVM and penalized trimmed squares
International Nuclear Information System (INIS)
Liu, Jianyong; Wang, Yong; Fu, Chengqun; Guo, Jie; Yu, Qin
2016-01-01
Least squares support vector machine (LS-SVM) for nonlinear regression is sensitive to outliers in the field of machine learning. Weighted LS-SVM (WLS-SVM) overcomes this drawback by adding a weight to each training sample. However, as the number of outliers increases, the accuracy of WLS-SVM may decrease. In order to improve the robustness of WLS-SVM, a new robust regression method based on WLS-SVM and penalized trimmed squares (WLSSVM–PTS) has been proposed. The algorithm comprises three main stages. First, the initial parameters are obtained by least trimmed squares. Then, the significant outliers are identified and eliminated by the Fast-PTS algorithm. Finally, the remaining samples, which contain few outliers, are estimated by WLS-SVM. Statistical tests of experimental results on numerical datasets and real-world datasets show that the proposed WLSSVM–PTS is significantly more robust than LS-SVM, WLS-SVM and LSSVM–LTS.
Hill, Benjamin David; Womble, Melissa N; Rohling, Martin L
2015-01-01
This study utilized logistic regression to determine whether performance patterns on Concussion Vital Signs (CVS) could differentiate known groups with either genuine or feigned performance. For the embedded measure development group (n = 174), clinical patients and undergraduate students categorized as feigning obtained significantly lower scores on the overall test battery mean for the CVS, Shipley-2 composite score, and California Verbal Learning Test-Second Edition subtests than did genuinely performing individuals. The final full model of 3 predictor variables (Verbal Memory immediate hits, Verbal Memory immediate correct passes, and Stroop Test complex reaction time correct) was significant and correctly classified individuals in their known group 83% of the time (sensitivity = .65; specificity = .97) in a mixed sample of young-adult clinical cases and simulators. The CVS logistic regression function was applied to a separate undergraduate college group (n = 378) that was asked to perform genuinely and identified 5% as having possibly feigned performance indicating a low false-positive rate. The failure rate was 11% and 16% at baseline cognitive testing in samples of high school and college athletes, respectively. These findings have particular relevance given the increasing use of computerized test batteries for baseline cognitive testing and return-to-play decisions after concussion.
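The sensitivity and specificity figures quoted in this abstract are standard confusion-matrix ratios. A minimal sketch of how they are computed from known group labels and model classifications follows; the data in the usage note are made up, not the study's.

```python
def sensitivity_specificity(actual, predicted):
    """Compute (sensitivity, specificity) from boolean labels.

    `actual[i]` is True for a known feigner, `predicted[i]` is True when the
    model flags the case as feigned.
    """
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up example: 3 feigners (2 detected), 4 genuine performers (1 wrongly flagged).
sens, spec = sensitivity_specificity(
    [True, True, True, False, False, False, False],
    [True, True, False, False, False, False, True],
)
```

A high specificity with a moderate sensitivity, as reported in the abstract, means few genuine performers are wrongly flagged while some feigners go undetected.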
Directory of Open Access Journals (Sweden)
Guan Lian
2018-01-01
Accurate prediction of taxi-out time is a significant precondition for improving the operation of the departure process at an airport, as well as for reducing long taxi-out times, congestion, and excessive emission of greenhouse gases. Unfortunately, several of the traditional methods of predicting taxi-out time perform unsatisfactorily at congested airports. This paper describes and tests three of those conventional methods, namely the Generalized Linear Model, the Softmax Regression Model, and the Artificial Neural Network method, and two improved Support Vector Regression (SVR) approaches based on swarm intelligence algorithm optimization, namely Particle Swarm Optimization (PSO) and the Firefly Algorithm. In order to improve the global searching ability of the Firefly Algorithm, an adaptive step factor and Lévy flight are implemented simultaneously when updating the location function. Six factors are analysed, of which delay is identified as one significant factor in congested airports. Through a series of specific dynamic analyses, a case study of Beijing International Airport (PEK) is tested with historical data. The performance measures show that the two proposed SVR approaches, especially the Improved Firefly Algorithm (IFA) optimization-based SVR method, not only achieve the best modelling measures and accuracy rate compared with the representative forecast models, but also achieve better predictive performance when dealing with abnormal taxi-out time states.
Quality of life in breast cancer patients--a quantile regression analysis.
Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma
2008-01-01
Quality of life studies have an important role in health care, especially in chronic diseases, in clinical judgment and in the supply of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare the results with linear regression. A cross-sectional study was conducted on 119 breast cancer patients who were admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. A quantile regression was employed to assess the associated factors and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92 ± 11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis and dyspnea was significant only for the first quartile. Also, emotional functioning and duration of disease statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about predictors of breast cancer patients' quality of life.
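Quantile regression replaces the squared-error criterion of linear regression with the asymmetric "pinball" loss. The sketch below is not the study's SAS analysis and uses made-up numbers; it only shows that minimizing this loss over a constant prediction recovers a sample quantile, which is why the method remains informative for skewed responses.

```python
def pinball_loss(y, pred, tau):
    """Mean pinball (check) loss at quantile level tau, with 0 < tau < 1."""
    total = 0.0
    for yi, pi in zip(y, pred):
        d = yi - pi
        total += tau * d if d >= 0 else (tau - 1.0) * d
    return total / len(y)

def best_constant(y, tau):
    """Observed value that minimizes the pinball loss as a constant prediction."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda c: pinball_loss(y, [c] * len(y), tau))

# Made-up skewed sample: the outlier 100 does not move the fitted quantiles.
scores = [1, 2, 3, 4, 100]
median_fit = best_constant(scores, 0.5)
lower_fit = best_constant(scores, 0.25)
```

In a full quantile regression the constant is replaced by a linear function of the covariates, and a separate fit is made for each quantile of interest, such as the first and third quartiles discussed in the abstract.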
Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon
2015-01-01
Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.
Hahn, Andrew D; Rowe, Daniel B
2012-02-01
As more evidence is presented suggesting that the phase, as well as the magnitude, of functional MRI (fMRI) time series may contain important information and that there are theoretical drawbacks to modeling functional response in the magnitude alone, removing noise in the phase is becoming more important. Previous studies have shown that retrospective correction of noise from physiologic sources can remove significant phase variance and that dynamic main magnetic field correction and regression of estimated motion parameters also remove significant phase fluctuations. In this work, we investigate the performance of physiologic noise regression in a framework along with correction for dynamic main field fluctuations and motion regression. Our findings suggest that including physiologic regressors provides some benefit in terms of reduction in phase noise power, but it is small compared to the benefit of dynamic field corrections and use of estimated motion parameters as nuisance regressors. Additionally, we show that the use of all three techniques reduces phase variance substantially, removes undesirable spatial phase correlations and improves detection of the functional response in magnitude and phase. Copyright © 2011 Elsevier Inc. All rights reserved.
Bayesian nonlinear regression for large small problems
Chakraborty, Sounak; Ghosh, Malay; Mallick, Bani K.
2012-01-01
Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of the support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM, which relies on type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use a Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
Spontaneous regression of intracranial malignant lymphoma
International Nuclear Information System (INIS)
Kojo, Nobuto; Tokutomi, Takashi; Eguchi, Gihachirou; Takagi, Shigeyuki; Matsumoto, Tomie; Sasaguri, Yasuyuki; Shigemori, Minoru.
1988-01-01
In a 46-year-old female with a 1-month history of gait and speech disturbances, computed tomography (CT) demonstrated mass lesions of slightly high density in the left basal ganglia and left frontal lobe. The lesions were markedly enhanced by contrast medium. The patient received no specific treatment, but her clinical manifestations gradually abated and the lesions decreased in size. Five months after her initial examination, the lesions were absent on CT scans; only a small area of low density remained. Residual clinical symptoms included mild right hemiparesis and aphasia. After 14 months the patient again deteriorated, and a CT scan revealed mass lesions in the right frontal lobe and the pons. However, no enhancement was observed in the previously affected regions. A biopsy revealed malignant lymphoma. Despite treatment with steroids and radiation, the patient's clinical status progressively worsened and she died 27 months after initial presentation. Seven other cases of spontaneous regression of primary malignant lymphoma have been reported. In this case, the mechanism of the spontaneous regression was not clear, but changes in immunologic status may have been involved. (author)
Regression testing in the TOTEM DCS
International Nuclear Information System (INIS)
Rodríguez, F Lucas; Atanassov, I; Burkimsher, P; Frost, O; Taskinen, J; Tulimaki, V
2012-01-01
The Detector Control System of the TOTEM experiment at the LHC is built with the industrial product WinCC OA (PVSS). The TOTEM system is generated automatically through scripts using as input the detector Product Breakdown Structure (PBS) and its pinout connectivity, archiving and alarm metainformation, and some other heuristics based on the naming conventions. When those initial parameters and the automation code are modified to include new features, the resulting PVSS system can also introduce side-effects. On a daily basis, a custom-developed regression testing tool takes the most recent code from a Subversion (SVN) repository and builds a new control system from scratch. This system is exported in plain text format using the PVSS export tool, and compared with a system previously validated by a human. A report is sent to the developers with any differences highlighted, in readiness for validation and acceptance as a new stable version. This regression approach is not dependent on any development framework or methodology. The process has run satisfactorily for several months, proving to be a very valuable tool before deploying new versions in the production systems.
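The compare-against-a-validated-export step described in this abstract can be illustrated with a plain text diff. This is a generic sketch using Python's standard `difflib` module, not the TOTEM tool itself; the function name and the sample export strings are hypothetical.

```python
import difflib

def regression_report(validated_export, candidate_export):
    """Return unified-diff lines between a validated and a freshly built export.

    An empty list means the new build reproduces the validated system exactly.
    """
    return list(
        difflib.unified_diff(
            validated_export.splitlines(),
            candidate_export.splitlines(),
            fromfile="validated",
            tofile="candidate",
            lineterm="",
        )
    )

# Hypothetical plain-text exports of two builds.
report = regression_report("alarm.limit=10\narchive=on", "alarm.limit=12\narchive=on")
```

Because the comparison works on exported text only, it does not depend on how the system was generated, which matches the abstract's point that the approach is independent of any development framework.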
Supporting Regularized Logistic Regression Privately and Efficiently
Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei
2016-01-01
As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc. PMID:27271738
Structural Break Tests Robust to Regression Misspecification
Directory of Open Access Journals (Sweden)
Alaa Abi Morshed
2018-05-01
Full Text Available Structural break tests for regression models are sensitive to model misspecification. We show, analytically and through simulations, that the sup Wald test for breaks in the conditional mean and variance of a time series process exhibits severe size distortions when the conditional mean dynamics are misspecified. We also show that the sup Wald test for breaks in the unconditional mean and variance does not have the same size distortions, yet benefits from similar power to its conditional counterpart in correctly specified models. Hence, we propose using it as an alternative and complementary test for breaks. We apply the unconditional and conditional mean and variance tests to three US series: unemployment, industrial production growth and interest rates. Both the unconditional and the conditional mean tests detect a break in the mean of interest rates. However, for the other two series, the unconditional mean test does not detect a break, while the conditional mean tests based on dynamic regression models occasionally detect a break, with the implied break-point estimator varying across different dynamic specifications. For all series, the unconditional variance test does not detect a break, while most tests for the conditional variance do detect a break, which also varies across specifications.
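To make the idea of a sup Wald test for a break in the unconditional mean concrete, here is a minimal sketch. It is not the authors' procedure: it assumes serially uncorrelated, homoscedastic errors (a time-series application would normally need a long-run variance estimator), the 15% trimming is an assumed default, and the simulated series is invented. The statistic has a nonstandard limiting distribution, so it must be compared with sup-Wald critical values rather than chi-square ones.

```python
import numpy as np

def sup_wald_mean_break(y, trim=0.15):
    """Sup-Wald statistic for a single break in the mean.

    Returns (statistic, estimated break index). Assumes i.i.d. errors.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    lo = int(np.floor(trim * n))
    hi = int(np.ceil((1.0 - trim) * n))
    best_stat, best_k = -np.inf, None
    for k in range(lo, hi):
        left, right = y[:k], y[k:]
        # Residual variance under the two-regime alternative.
        resid = np.concatenate([left - left.mean(), right - right.mean()])
        sigma2 = resid @ resid / (n - 2)
        # Wald statistic for equality of the two regime means.
        wald = (left.mean() - right.mean()) ** 2 / (sigma2 * (1.0 / k + 1.0 / (n - k)))
        if wald > best_stat:
            best_stat, best_k = wald, k
    return best_stat, best_k

# Simulated series with a mean shift at observation 100 (invented data).
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(3.0, 1.0, 100)])
stat, k_hat = sup_wald_mean_break(series)
print(stat, k_hat)
```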
Bayesian nonlinear regression for large p small n problems
Chakraborty, Sounak
2012-07-01
Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in the near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
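For readers unfamiliar with Vapnik's ε-insensitive loss mentioned above, a minimal sketch follows. It shows only the loss itself, not the Bayesian model of the paper; the function name and the example values are illustrative.

```python
import numpy as np

def eps_insensitive_loss(y, f, eps=0.1):
    """Vapnik's epsilon-insensitive loss: zero whenever |y - f| <= eps."""
    y = np.asarray(y, dtype=float)
    f = np.asarray(f, dtype=float)
    return np.maximum(0.0, np.abs(y - f) - eps)

# Errors of 0.05, 0.3 and 0.5 against a tube of half-width 0.1.
losses = eps_insensitive_loss([1.0, 1.0, 1.0], [1.05, 1.3, 0.5], eps=0.1)
print(losses)
```

Predictions inside the ε-tube incur no penalty, and errors outside it are penalized linearly.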
Hyperspectral Unmixing with Robust Collaborative Sparse Regression
Directory of Open Access Journals (Sweden)
Chang Li
2016-07-01
Full Text Available Recently, sparse unmixing (SU) of hyperspectral data has received particular attention for analyzing remote sensing images. However, most SU methods are based on the commonly admitted linear mixing model (LMM), which ignores the possible nonlinear effects (i.e., nonlinearity). In this paper, we propose a new method named robust collaborative sparse regression (RCSR) based on the robust LMM (rLMM) for hyperspectral unmixing. The rLMM takes the nonlinearity into consideration, and the nonlinearity is simply treated as an outlier, which has an underlying sparse property. The RCSR simultaneously takes the collaborative sparse property of the abundance and the sparsely distributed additive property of the outlier into consideration, which can be formulated as a robust joint sparse regression problem. The inexact augmented Lagrangian method (IALM) is used to optimize the proposed RCSR. The qualitative and quantitative experiments on synthetic datasets and real hyperspectral images demonstrate that the proposed RCSR is efficient for solving the hyperspectral SU problem compared with four other state-of-the-art algorithms.
Supporting Regularized Logistic Regression Privately and Efficiently.
Directory of Open Access Journals (Sweden)
Wenfa Li
Full Text Available As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.
Directory of Open Access Journals (Sweden)
Menon Carlo
2011-09-01
Full Text Available Background: Several regression models have been proposed for estimation of isometric joint torque using surface electromyography (SEMG) signals. Common issues related to torque estimation models are degradation of model accuracy with passage of time, electrode displacement, and alteration of limb posture. This work compares the performance of the most commonly used regression models under these circumstances, in order to assist researchers with identifying the most appropriate model for a specific biomedical application. Methods: Eleven healthy volunteers participated in this study. A custom-built rig, equipped with a torque sensor, was used to measure isometric torque as each volunteer flexed and extended his wrist. SEMG signals from eight forearm muscles, in addition to wrist joint torque data, were gathered during the experiment. Additional data were gathered one hour and twenty-four hours following the completion of the first data gathering session, for the purpose of evaluating the effects of passage of time and electrode displacement on accuracy of models. Acquired SEMG signals were filtered, rectified, normalized and then fed to models for training. Results: Mean adjusted coefficient of determination (R_a^2) values decreased by 20% to 35% for different models after one hour, while altering arm posture decreased mean R_a^2 values by 64% to 74% for different models. Conclusions: Model estimation accuracy drops significantly with passage of time, electrode displacement, and alteration of limb posture. Therefore model retraining is crucial for preserving estimation accuracy. Data resampling can significantly reduce model training time without losing estimation accuracy. Among the models compared, the ordinary least squares linear regression model (OLS) was shown to have high isometric torque estimation accuracy combined with very short training times.
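The accuracy measure used in this abstract, the adjusted coefficient of determination of an ordinary least squares fit, can be sketched as follows. This is a generic illustration, not the study's pipeline: the function name is hypothetical and the eight-column feature matrix is random data standing in for processed SEMG channels.

```python
import numpy as np

def ols_adjusted_r2(X, y):
    """Fit OLS with an intercept and return (coefficients, adjusted R^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])          # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot
    # Penalize R^2 for the number of predictors p.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return beta, adj_r2

# Invented data: 200 samples of 8 features and a noisy linear response.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))
torque = features @ np.arange(1.0, 9.0) + rng.normal(0.0, 1.0, 200)
coef, adj = ols_adjusted_r2(features, torque)
print(adj)
```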
Interpreting parameters in the logistic regression model with random effects
DEFF Research Database (Denmark)
Larsen, Klaus; Petersen, Jørgen Holm; Budtz-Jørgensen, Esben
2000-01-01
interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects
Bonellie, Sandra R
2012-10-01
To illustrate the use of regression and logistic regression models to investigate changes over time in size of babies, particularly in relation to social deprivation, age of the mother and smoking. Mean birthweight has been found to be increasing in many countries in recent years, but there is still a group of babies who are born with low birthweights. Population-based retrospective cohort study. Multiple linear regression and logistic regression models are used to analyse data on term 'singleton births' from Scottish hospitals between 1994-2003. Mothers who smoke are shown to give birth to lighter babies on average, a difference of approximately 0.57 standard deviations lower (95% confidence interval 0.55-0.58) when adjusted for sex and parity. These mothers are also more likely to have babies that are low birthweight (odds ratio 3.46, 95% confidence interval 3.30-3.63) compared with non-smokers. Low birthweight is 30% more likely where the mother lives in the most deprived areas compared with the least deprived (odds ratio 1.30, 95% confidence interval 1.21-1.40). Smoking during pregnancy is shown to have a detrimental effect on the size of infants at birth. This effect explains some, though not all, of the observed socioeconomic birthweight differences. It also explains much of the observed birthweight differences by the age of the mother. Identifying mothers at greater risk of having a low birthweight baby has important implications for the care and advice this group receives. © 2012 Blackwell Publishing Ltd.
Age Regression in the Treatment of Anger in a Prison Setting.
Eisel, Harry E.
1988-01-01
Incorporated hypnotherapy with age regression into cognitive therapeutic approach with prisoners having history of anger. Technique involved age regression to establish first significant event causing current anger, catharsis of feelings for original event, and reorientation of event while under hypnosis. Results indicated decrease in acting-out…
Regression of an atlantoaxial rheumatoid pannus following posterior instrumented fusion.
Bydon, Mohamad; Macki, Mohamed; Qadi, Mohamud; De la Garza-Ramos, Rafael; Kosztowski, Thomas A; Sciubba, Daniel M; Wolinsky, Jean-Paul; Witham, Timothy F; Gokaslan, Ziya L; Bydon, Ali
2015-10-01
Rheumatoid patients may develop a retrodental lesion (atlantoaxial rheumatoid pannus) that may cause cervical instability and/or neurological compromise. The objective is to characterize clinical and radiographic outcomes after posterior instrumented fusion for atlantoaxial rheumatoid pannus. We retrospectively reviewed all patients who underwent posterior fusions for an atlantoaxial rheumatoid pannus at a single institution. Both preoperative and postoperative imaging was available for all patients. Anterior or circumferential operations, non-atlantoaxial panni, or prior C1-C2 operations were excluded. Primary outcome measures included Nurick score, Ranawat score (neurologic status in patients with rheumatoid arthritis), pannus regression, and reoperation. Pannus volume was determined with axial and sagittal views on both preoperative and postoperative radiological images. Thirty patients surgically managed for an atlantoaxial rheumatoid pannus were followed for a mean of 24.43 months. Nine patients underwent posterior instrumented fusion alone, while 21 patients underwent posterior decompression and instrumented fusion. Following a posterior instrumented fusion in all 30 patients, the pannus statistically significantly regressed by 44.44%, from a mean volume of 1.26 cm³ to 0.70 cm³ (p ... pannus radiographically regressed by 44.44% over a mean of 8.02 months, and patients clinically improved per the Nurick score. The Ranawat score did not improve, and 20% of patients required reoperation over a mean of 13.18 months. The annualized reoperation rate was approximately 13.62%. Copyright © 2015 Elsevier B.V. All rights reserved.
Predicting company growth using logistic regression and neural networks
Directory of Open Access Journals (Sweden)
Marijana Zekić-Sušac
2016-12-01
Full Text Available The paper aims to establish an efficient model for predicting company growth by leveraging the strengths of logistic regression and neural networks. A real dataset of Croatian companies was used which described the relevant industry sector, financial ratios, income, and assets in the input space, with a dependent binomial variable indicating whether a company was high-growth, defined as annualized growth in assets of more than 20% a year over a three-year period. Due to a large number of input variables, factor analysis was performed in the pre-processing stage in order to extract the most important input components. Building an efficient model with a high classification rate and explanatory ability required application of two data mining methods: logistic regression as a parametric and neural networks as a non-parametric method. The methods were tested on the models with and without variable reduction. The classification accuracy of the models was compared using statistical tests and ROC curves. The results showed that neural networks produce a significantly higher classification accuracy in the model when incorporating all available variables. The paper further discusses the advantages and disadvantages of both approaches, i.e. logistic regression and neural networks in modelling company growth. The suggested model is potentially of benefit to investors and economic policy makers as it provides support for recognizing companies with growth potential, especially during times of economic downturn.
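The binary target described above can be illustrated with a short sketch. It assumes that "annualized growth" means the compound annual growth rate of assets, which the abstract does not state; the function names and the asset figures are hypothetical.

```python
def annualized_growth(start_assets: float, end_assets: float, years: int = 3) -> float:
    """Compound annual growth rate of assets over the given number of years."""
    return (end_assets / start_assets) ** (1.0 / years) - 1.0

def is_high_growth(start_assets: float, end_assets: float,
                   years: int = 3, threshold: float = 0.20) -> bool:
    """Label a company high-growth if annualized asset growth exceeds the threshold."""
    return annualized_growth(start_assets, end_assets, years) > threshold

# Invented figures: assets growing from 100 to 180 versus 100 to 170 in three years.
print(is_high_growth(100.0, 180.0))  # about 21.6% a year
print(is_high_growth(100.0, 170.0))  # about 19.3% a year
```

Under this compounding assumption, a company needs its assets to grow by more than 72.8% in total over three years to be labelled high-growth.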
Determining Semantically Related Significant Genes.
Taha, Kamal
2014-01-01
GO relation embodies some aspects of existence dependency. If GO term x is existence-dependent on GO term y, the presence of y implies the presence of x. Therefore, the genes annotated with the function of the GO term y are usually functionally and semantically related to the genes annotated with the function of the GO term x. A large number of gene set enrichment analysis methods have been developed in recent years for analyzing gene set enrichment. However, most of these methods overlook the structural dependencies between GO terms in the GO graph by not considering the concept of existence dependency. We propose in this paper a biological search engine called RSGSearch that identifies enriched sets of genes annotated with different functions using the concept of existence dependency. We observe that GO term x cannot be existence-dependent on GO term y if x and y have the same specificity (biological characteristics). After encoding into a numeric format the contributions of GO terms annotating target genes to the semantics of their lowest common ancestors (LCAs), RSGSearch uses a microarray experiment to identify the most significant LCA that annotates the result genes. We evaluated RSGSearch experimentally and compared it with five gene set enrichment systems. Results showed marked improvement.
Burggraaff, Jessica; Knol, Dirk L; Uitdehaag, Bernard M J
2017-01-01
Appropriate and timely screening instruments that sensitively capture the cognitive functioning of multiple sclerosis (MS) patients are urgently needed. We evaluated newly derived regression-based norms for the Symbol Digit Modalities Test (SDMT) in a Dutch-speaking sample, as an indicator of the cognitive state of MS patients. Regression-based norms for the SDMT were created from a healthy control sample (n = 96) and used to convert MS patients' (n = 157) raw scores to demographically adjusted Z-scores, correcting for the effects of age, age², gender, and education. Conventional and regression-based norms were compared on their impairment-classification rates and related to other neuropsychological measures. The regression analyses revealed that age was the only significantly influencing demographic in our healthy sample. Regression-based norms for the SDMT more readily detected impairment in MS patients than conventional normalization methods (32 patients instead of 15). Patients changing from an SDMT-preserved to -impaired status (n = 17) were also impaired on other cognitive domains (p < 0.05), except for visuospatial memory (p = 0.34). Regression-based norms for the SDMT more readily detect abnormal performance in MS patients than conventional norms, identifying those patients at highest risk for cognitive impairment, which was supported by a worse performance on other neuropsychological measures. © 2017 S. Karger AG, Basel.
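The conversion of raw scores to demographically adjusted Z-scores can be sketched as follows: fit a regression on healthy controls, then express a patient's raw score as the residual from the predicted score divided by the residual standard deviation. This is a generic illustration under assumptions, not the study's analysis. The control data are simulated, gender is coded 0/1, and the function names are hypothetical.

```python
import numpy as np

def design(age, education, gender):
    """Design matrix with intercept, age, age squared, gender and education."""
    age = np.atleast_1d(np.asarray(age, dtype=float))
    education = np.atleast_1d(np.asarray(education, dtype=float))
    gender = np.atleast_1d(np.asarray(gender, dtype=float))
    return np.column_stack([np.ones_like(age), age, age ** 2, gender, education])

def fit_norms(age, education, gender, score):
    """Fit the normative regression on healthy controls."""
    A = design(age, education, gender)
    score = np.asarray(score, dtype=float)
    beta, *_ = np.linalg.lstsq(A, score, rcond=None)
    resid = score - A @ beta
    sd = np.sqrt(resid @ resid / (len(score) - A.shape[1]))  # residual SD
    return beta, sd

def z_score(beta, sd, age, education, gender, raw):
    """Demographically adjusted Z-score of a raw test score."""
    expected = design(age, education, gender) @ beta
    return (np.atleast_1d(raw) - expected) / sd

# Simulated control sample (invented numbers, n = 96 as in the abstract).
rng = np.random.default_rng(1)
n = 96
age = rng.uniform(20, 70, n)
education = rng.integers(8, 19, n)
gender = rng.integers(0, 2, n)
score = 70 - 0.3 * age + rng.normal(0, 8, n)

beta, sd = fit_norms(age, education, gender, score)
print(z_score(beta, sd, age=55, education=12, gender=1, raw=40))
```

A negative Z-score means the patient scored below what the control regression predicts for someone of the same age, gender and education; the cut-off for impairment is a separate choice not shown here.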
BANK FAILURE PREDICTION WITH LOGISTIC REGRESSION
Directory of Open Access Journals (Sweden)
Taha Zaghdoudi
2013-04-01
Full Text Available In recent years the economic and financial world has been shaken by a wave of financial crises that has inflicted heavy losses on banks. Several authors have studied these crises in order to develop an early warning model, and our work follows the same path. We develop a predictive model of Tunisian bank failures using binary logistic regression. The specificity of our prediction model is that it takes into account microeconomic indicators of bank failures. The results obtained using our model show that a bank's ability to repay its debt, the coefficient of banking operations, bank profitability per employee and the financial leverage ratio have a negative impact on the probability of failure.
Robust Mediation Analysis Based on Median Regression
Yuan, Ying; MacKinnon, David P.
2014-01-01
Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. PMID:24079925
ANYOLS, Least Square Fit by Stepwise Regression
International Nuclear Information System (INIS)
Atwoods, C.L.; Mathews, S.
1986-01-01
Description of program or function: ANYOLS is a stepwise program which fits data using ordinary or weighted least squares. Variables are selected for the model in a stepwise way based on a user-specified input criterion or a user-written subroutine. The order in which variables are entered can be influenced by user-defined forcing priorities. Instead of stepwise selection, ANYOLS can try all possible combinations of any desired subset of the variables. Automatic output for the final model in a stepwise search includes plots of the residuals, 'studentized' residuals, and leverages; if the model is not too large, the output also includes partial regression and partial leverage plots. A data set may be re-used so that several selection criteria can be tried. Flexibility is increased by allowing the substitution of user-written subroutines for several default subroutines.
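The diagnostics named in this description (leverages and 'studentized' residuals from an ordinary or weighted least squares fit) can be sketched in a few lines. This is not ANYOLS code: the function name is hypothetical, the residuals are internally studentized (an assumption, since the description does not say which definition ANYOLS uses), and the data are invented.

```python
import numpy as np

def ols_diagnostics(X, y, weights=None):
    """Fit (weighted) least squares; return coefficients, leverages, studentized residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    sw = np.sqrt(w)
    A = np.column_stack([np.ones(n), X]) * sw[:, None]   # weighted design with intercept
    b = y * sw
    beta, *_ = np.linalg.lstsq(A, b, rcond=None)
    Q, _ = np.linalg.qr(A)                               # reduced QR factorization
    leverage = (Q ** 2).sum(axis=1)                      # diagonal of the hat matrix
    resid = b - A @ beta
    s2 = resid @ resid / (n - A.shape[1])
    studentized = resid / np.sqrt(s2 * (1.0 - leverage))
    return beta, leverage, studentized

# Invented data: a far-out x value (high leverage) and one perturbed response.
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 30], dtype=float)
y = 1.0 + 2.0 * x
y[3] += 5.0
beta, leverage, studentized = ols_diagnostics(x[:, None], y)
print(leverage.round(3))
print(studentized.round(2))
```

In this example the point at x = 30 has the largest leverage, while the perturbed fourth observation has the largest studentized residual, which is the kind of distinction the diagnostic plots are meant to reveal.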
Nonparametric additive regression for repeatedly measured data
Carroll, R. J.
2009-05-20
We develop an easily computed smooth backfitting algorithm for additive model fitting in repeated measures problems. Our methodology easily copes with various settings, such as when some covariates are the same over repeated response measurements. We allow for a working covariance matrix for the regression errors, showing that our method is most efficient when the correct covariance matrix is used. The component functions achieve the known asymptotic variance lower bound for the scalar argument case. Smooth backfitting also leads directly to design-independent biases in the local linear case. Simulations show our estimator has smaller variance than the usual kernel estimator. This is also illustrated by an example from nutritional epidemiology. © 2009 Biometrika Trust.
Conjoined legs: Sirenomelia or caudal regression syndrome?
Directory of Open Access Journals (Sweden)
Sakti Prasad Das
2013-01-01
Full Text Available Presence of a single umbilical persistent vitelline artery distinguishes sirenomelia from caudal regression syndrome. We report a case of a 12-year-old boy with bilateral umbilical arteries who presented with fusion of both legs in the lower third of the leg. Both feet were rudimentary. The right foot had a valgus rocker-bottom deformity. All toes were present but rudimentary. The left foot showed absence of all toes. Physical examination showed left tibia vara. Chest evaluation in the sitting position revealed pigeon chest and an elevated right shoulder. Posterior examination of the trunk showed thoracic scoliosis with convexity to the right. The patient was operated on, and at 1-year follow-up the boy had two separate legs with good aesthetic and functional results.
Conjoined legs: Sirenomelia or caudal regression syndrome?
Das, Sakti Prasad; Ojha, Niranjan; Ganesh, G Shankar; Mohanty, Ram Narayan
2013-07-01
Presence of a single umbilical persistent vitelline artery distinguishes sirenomelia from caudal regression syndrome. We report a case of a 12-year-old boy with bilateral umbilical arteries who presented with fusion of both legs in the lower third of the leg. Both feet were rudimentary. The right foot had a valgus rocker-bottom deformity. All toes were present but rudimentary. The left foot showed absence of all toes. Physical examination showed left tibia vara. Chest evaluation in the sitting position revealed pigeon chest and an elevated right shoulder. Posterior examination of the trunk showed thoracic scoliosis with convexity to the right. The patient was operated on, and at 1-year follow-up the boy had two separate legs with good aesthetic and functional results.
Logistic regression against a divergent Bayesian network
Directory of Open Access Journals (Sweden)
Noel Antonio Sánchez Trujillo
2015-01-01
Full Text Available This article is a discussion about two statistical tools used for prediction and causality assessment: logistic regression and Bayesian networks. Using data from a simulated example of a study assessing factors that might predict pulmonary emphysema (where fingertip pigmentation and smoking are considered), we posed the following questions. Is pigmentation a confounding, causal or predictive factor? Is there perhaps another factor, like smoking, that confounds? Is there a synergy between pigmentation and smoking? The results, in terms of prediction, are similar with the two techniques; regarding causation, differences arise. We conclude that, in decision-making, the combination of a statistical tool used with common sense and previous evidence, which may take years or even centuries to develop, is better than the automatic and exclusive use of statistical resources.
Adaptive regression for modeling nonlinear relationships
Knafl, George J
2016-01-01
This book presents methods for investigating whether relationships are linear or nonlinear and for adaptively fitting appropriate models when they are nonlinear. Data analysts will learn how to incorporate nonlinearity in one or more predictor variables into regression models for different types of outcome variables. Such nonlinear dependence is often not considered in applied research, yet nonlinear relationships are common and so need to be addressed. A standard linear analysis can produce misleading conclusions, while a nonlinear analysis can provide novel insights into data, not otherwise possible. A variety of examples of the benefits of modeling nonlinear relationships are presented throughout the book. Methods are covered using what are called fractional polynomials based on real-valued power transformations of primary predictor variables combined with model selection based on likelihood cross-validation. The book covers how to formulate and conduct such adaptive fractional polynomial modeling in the s...
Crime Modeling using Spatial Regression Approach
Saleh Ahmar, Ansari; Adiatma; Kasim Aidid, M.
2018-01-01
Criminal acts in Indonesia increase in both variety and quantity every year. Murder, rape, assault, vandalism, theft, fraud, fencing, and other offences make people feel unsafe. The risk of society being exposed to crime is measured by the number of cases reported to the police: the more reports the police receive, the more crime there is in the region. This research models criminality in South Sulawesi, Indonesia, with society's exposure to the risk of crime as the dependent variable. Modelling follows an area approach using the Spatial Autoregressive (SAR) and Spatial Error Model (SEM) methods. The independent variables are population density, the number of poor people, GDP per capita, unemployment and the human development index (HDI). The spatial regression analysis shows that there is no spatial dependence, in either lag or error, in South Sulawesi.
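For reference, the two specifications named in this abstract are usually written as below. This is the standard textbook form, not notation taken from the paper: W denotes a spatial weights matrix, ρ the spatial lag coefficient and λ the spatial error coefficient, all introduced here for illustration.

```latex
% Spatial Autoregressive (SAR, spatial lag) model
y = \rho W y + X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)

% Spatial Error Model (SEM)
y = X\beta + u, \qquad u = \lambda W u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I)
```

A finding of no spatial dependence in lag or error corresponds to ρ = 0 and λ = 0, in which case both models reduce to an ordinary linear regression of y on X.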
Regression analysis for the social sciences
Gordon, Rachel A
2015-01-01
Provides graduate students in the social sciences with the basic skills they need to estimate, interpret, present, and publish basic regression models using contemporary standards. Key features of the book include: interweaving the teaching of statistical concepts with examples developed for the course from publicly-available social science data or drawn from the literature; thorough integration of teaching statistical theory with teaching data processing and analysis; and teaching of Stata and use of chapter exercises in which students practice programming and interpretation on the same data set. A separate set of exercises allows students to select a data set to apply the concepts learned in each chapter to a research question of interest to them, all updated for this edition.
Gaussian process regression for geometry optimization
Denzel, Alexander; Kästner, Johannes
2018-03-01
We implemented a geometry optimizer based on Gaussian process regression (GPR) to find minimum structures on potential energy surfaces. We tested both a twice-differentiable form of the Matérn kernel and the squared exponential kernel. The Matérn kernel performs much better. We give a detailed description of the optimization procedures. These include overshooting the step resulting from GPR in order to obtain a higher degree of interpolation vs. extrapolation. In a benchmark against the Limited-memory Broyden-Fletcher-Goldfarb-Shanno optimizer of the DL-FIND library on 26 test systems, we found the new optimizer to generally reduce the number of required optimization steps.
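The two covariance functions compared in this abstract can be written down in a few lines. The sketch assumes that the differentiable Matérn form referred to is the ν = 5/2 member of the family, which the abstract does not state explicitly; the function names and the unit length scale are illustrative.

```python
import numpy as np

def squared_exponential(r, length=1.0):
    """Squared exponential kernel as a function of distance r."""
    r = np.asarray(r, dtype=float)
    return np.exp(-0.5 * (r / length) ** 2)

def matern52(r, length=1.0):
    """Matern kernel with smoothness nu = 5/2 (assumed form) as a function of distance r."""
    r = np.abs(np.asarray(r, dtype=float))
    s = np.sqrt(5.0) * r / length
    return (1.0 + s + s ** 2 / 3.0) * np.exp(-s)

distances = np.array([0.0, 0.5, 1.0, 2.0])
print(squared_exponential(distances))
print(matern52(distances))
```

Both kernels equal 1 at zero distance and decay with separation; the Matérn form has heavier tails and imposes less smoothness on the modelled surface than the squared exponential.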
Least square regularized regression in sum space.
Xu, Yong-Li; Chen, Di-Rong; Li, Han-Xiong; Liu, Lu
2013-04-01
This paper proposes a least square regularized regression algorithm in sum space of reproducing kernel Hilbert spaces (RKHSs) for nonflat function approximation, and obtains the solution of the algorithm by solving a system of linear equations. This algorithm can approximate the low- and high-frequency components of the target function with large and small scale kernels, respectively. The convergence and learning rate are analyzed. We measure the complexity of the sum space by its covering number and demonstrate that the covering number can be bounded by the product of the covering numbers of basic RKHSs. For sum space of RKHSs with Gaussian kernels, by choosing appropriate parameters, we trade off the sample error and regularization error, and obtain a polynomial learning rate, which is better than that in any single RKHS. The utility of this method is illustrated with two simulated data sets and five real-life databases.
Statistical learning from a regression perspective
Berk, Richard A
2016-01-01
This textbook considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. This fully revised new edition includes important developments over the past 8 years. Consistent with modern data analytics, it emphasizes that a proper statistical learning data analysis derives from sound data collection, intelligent data management, appropriate statistical procedures, and an accessible interpretation of results. A continued emphasis on the implications for practice runs through the text. Among the statistical learning procedures examined are bagging, random forests, boosting, support vector machines and neural networks. Response variables may be quantitative or categorical. As in the first edition, a unifying theme is supervised learning that can be trea...
Model Selection in Kernel Ridge Regression
DEFF Research Database (Denmark)
Exterkate, Peter
Kernel ridge regression is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts. This paper investigates the influence of the choice of kernel and the setting of tuning parameters on forecast accuracy. We review several popular kernels, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. We interpret the latter two kernels in terms of their smoothing properties, and we relate the tuning parameters associated with all these kernels to smoothness measures of the prediction function and to the signal-to-noise ratio. Based on these interpretations, we provide guidelines for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study confirms the practical usefulness of these rules of thumb. Finally, the flexible and smooth functional forms provided by the Gaussian and Sinc kernels make them widely...
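The tuning procedure described in this abstract (choosing kernel and regularization parameters from a small grid by cross-validation) can be sketched as follows. This is a minimal illustration for one-dimensional inputs with a Gaussian kernel, not the paper's implementation; the function names, the fold assignment, and any grid values passed in are assumptions made for the example.

```python
import math

def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def krr_fit(xs, ys, sigma, lam):
    """Dual coefficients alpha solving (K + lam * I) alpha = y."""
    n = len(xs)
    K = [[gaussian_kernel(xs[i], xs[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, ys)

def krr_predict(xs, alpha, sigma, x_new):
    """Prediction f(x_new) = sum_i alpha_i * k(x_i, x_new)."""
    return sum(a * gaussian_kernel(x, x_new, sigma) for a, x in zip(alpha, xs))

def cv_error(xs, ys, sigma, lam, folds=5):
    """Mean squared prediction error from k-fold cross-validation."""
    n = len(xs)
    total = 0.0
    for f in range(folds):
        train = [i for i in range(n) if i % folds != f]
        test = [i for i in range(n) if i % folds == f]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        alpha = krr_fit(tx, ty, sigma, lam)
        total += sum((krr_predict(tx, alpha, sigma, xs[i]) - ys[i]) ** 2
                     for i in test)
    return total / n

def select_parameters(xs, ys, sigmas, lams):
    """Pick the (sigma, lam) pair on the grid with the lowest CV error."""
    grid = [(s, l) for s in sigmas for l in lams]
    return min(grid, key=lambda p: cv_error(xs, ys, p[0], p[1]))
```

Calling `select_parameters(xs, ys, [0.1, 0.5, 2.0], [1e-3, 1e-1])` returns the grid point with the smallest cross-validated error. A small grid keeps the cost manageable, since each candidate requires solving one linear system per fold.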
DNA-Cytometry of Progressive and Regressive Cervical Intraepithelial Neoplasia
Directory of Open Access Journals (Sweden)
Antonius G. J. M. Hanselaar
1998-01-01
Full Text Available A retrospective analysis was performed on archival cervical smears from a group of 56 women with cervical intraepithelial neoplasia (CIN), who had received follow‐up by cytology only. Automated image cytometry of Feulgen‐stained DNA was used to determine the differences between progressive and regressive lesions. The first group of 30 smears was from women who had developed cancer after initial smears with dysplastic changes (progressive group). The second group of 26 smears with dysplastic changes had shown regression to normal (regressive group). The goal of the study was to determine if differences in cytometric features existed between the progressive and regressive groups. CIN categories I, II and III were represented in both groups, and measurements were pooled across diagnostic categories. Images of up to 700 intermediate cells were obtained from each slide, and cells were scanned exhaustively for the detection of diagnostic cells. Discriminant function analysis was performed for both intermediate and diagnostic cells. The most significant differences between the groups were found for diagnostic cells, with a cell classification accuracy of 82%. Intermediate cells could be classified with 60% accuracy. Cytometric features which afforded the best discrimination were characteristic of the chromatin organization in diagnostic cells (nuclear texture). Slide classification was performed by thresholding the number of cells which exhibited progression-associated changes (PAC) in chromatin configuration, with an accuracy of 93 and 73% for diagnostic and intermediate cells, respectively. These results indicate that regardless of the extent of nuclear atypia as reflected in the CIN category, features of chromatin organization can potentially be used to predict the malignant or progressive potential of CIN lesions.
Learning Inverse Rig Mappings by Nonlinear Regression.
Holden, Daniel; Saito, Jun; Komura, Taku
2017-03-01
We present a framework to design inverse rig functions: functions that map low level representations of a character's pose such as joint positions or surface geometry to the representation used by animators called the animation rig. Animators design scenes using an animation rig, a framework widely adopted in animation production which allows animators to design character poses and geometry via intuitive parameters and interfaces. Yet most state-of-the-art computer animation techniques control characters through raw, low level representations such as joint angles, joint positions, or vertex coordinates. This difference often stops the adoption of state-of-the-art techniques in animation production. Our framework solves this issue by learning a mapping between the low level representations of the pose and the animation rig. We use nonlinear regression techniques, learning from example animation sequences designed by the animators. When new motions are provided in the skeleton space, the learned mapping is used to estimate the rig controls that reproduce such a motion. We introduce two nonlinear functions for producing such a mapping: Gaussian process regression and feedforward neural networks. The appropriate solution depends on the nature of the rig and the amount of data available for training. We show our framework applied to various examples including articulated biped characters, quadruped characters, facial animation rigs, and deformable characters. With our system, animators have the freedom to apply any motion synthesis algorithm to arbitrary rigging and animation pipelines for immediate editing. This greatly improves the productivity of 3D animation, while retaining the flexibility and creativity of artistic input.
DRREP: deep ridge regressed epitope predictor.
Sher, Gene; Zhi, Degui; Zhang, Shaojie
2017-10-03
The ability to predict epitopes plays an enormous role in vaccine development in terms of our ability to zero in on where to do a more thorough in-vivo analysis of the protein in question. Though for the past decade there have been numerous advancements and improvements in epitope prediction, on average the best benchmark prediction accuracies are still only around 60%. New machine learning algorithms have arisen within the domain of deep learning, text mining, and convolutional networks. This paper presents a novel analytically trained deep neural network that uses string kernels and is tailored for continuous epitope prediction, called Deep Ridge Regressed Epitope Predictor (DRREP). DRREP was tested on long protein sequences from the following datasets: SARS, Pellequer, HIV, AntiJen, and SEQ194. DRREP was compared to numerous state-of-the-art epitope predictors, including the most recently published predictors, LBtope and DMNLBE. Using area under the ROC curve (AUC), DRREP achieved a performance improvement over the best performing predictors on SARS (13.7%), HIV (8.9%), Pellequer (1.5%), and SEQ194 (3.1%), with its performance being matched only on the AntiJen dataset, by the LBtope predictor, where both DRREP and LBtope achieved an AUC of 0.702. DRREP is an analytically trained deep neural network, and is thus capable of learning in a single step through regression. By combining the features of deep learning, string kernels, and convolutional networks, the system is able to perform residue-by-residue prediction of continuous epitopes with higher accuracy than the current state-of-the-art predictors.
Collaborative regression-based anatomical landmark detection
International Nuclear Information System (INIS)
Gao, Yaozong; Shen, Dinggang
2015-01-01
Anatomical landmark detection plays an important role in medical image analysis, e.g. for registration, segmentation and quantitative analysis. Among the various existing methods for landmark detection, regression-based methods have recently attracted much attention due to their robustness and efficiency. In these methods, landmarks are localised through voting from all image voxels, which is completely different from the classification-based methods that use voxel-wise classification to detect landmarks. Despite their robustness, the accuracy of regression-based landmark detection methods is often limited due to (1) the inclusion of uninformative image voxels in the voting procedure, and (2) the lack of effective ways to incorporate inter-landmark spatial dependency into the detection step. In this paper, we propose a collaborative landmark detection framework to address these limitations. The concept of collaboration is reflected in two aspects. (1) Multi-resolution collaboration. A multi-resolution strategy is proposed to hierarchically localise landmarks by gradually excluding uninformative votes from faraway voxels. Moreover, for informative voxels near the landmark, a spherical sampling strategy is also designed at the training stage to improve their prediction accuracy. (2) Inter-landmark collaboration. A confidence-based landmark detection strategy is proposed to improve the detection accuracy of ‘difficult-to-detect’ landmarks by using spatial guidance from ‘easy-to-detect’ landmarks. To evaluate our method, we conducted experiments extensively on three datasets for detecting prostate landmarks and head and neck landmarks in computed tomography images, and also dental landmarks in cone beam computed tomography images. The results show the effectiveness of our collaborative landmark detection framework in improving landmark detection accuracy, compared to other state-of-the-art methods. (paper)
Poisson Regression Analysis of Illness and Injury Surveillance Data
Energy Technology Data Exchange (ETDEWEB)
Frome E.L., Watkins J.P., Ellis E.D.
2012-12-12
The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra
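The log-linear rate model and the over-dispersion check described in this abstract can be sketched as follows. This is a minimal illustration with a single explanatory variable, person-time entering as an offset, and a Pearson-based dispersion estimate; it is not the authors' procedure, and the function and variable names are assumptions made for the example.

```python
import math

def fit_poisson(counts, person_time, x, iters=25):
    """Fit log(mu_i) = log(t_i) + b0 + b1 * x_i by Newton-Raphson."""
    # Start from the overall event rate and no covariate effect.
    b0 = math.log(sum(counts) / sum(person_time))
    b1 = 0.0
    for _ in range(iters):
        mu = [t * math.exp(b0 + b1 * xi) for t, xi in zip(person_time, x)]
        # Score vector U = X'(y - mu)
        u0 = sum(y - m for y, m in zip(counts, mu))
        u1 = sum((y - m) * xi for y, m, xi in zip(counts, mu, x))
        # Fisher information X' diag(mu) X (2 x 2)
        i00 = sum(mu)
        i01 = sum(m * xi for m, xi in zip(mu, x))
        i11 = sum(m * xi * xi for m, xi in zip(mu, x))
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + I^{-1} U
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1

def pearson_dispersion(counts, person_time, x, b0, b1, n_params=2):
    """Pearson chi-square divided by residual degrees of freedom.

    Values well above 1 suggest extra-Poisson variation; a quasi-likelihood
    adjustment would multiply the standard errors by the square root of
    this estimate.
    """
    mu = [t * math.exp(b0 + b1 * xi) for t, xi in zip(person_time, x)]
    chi2 = sum((y - m) ** 2 / m for y, m in zip(counts, mu))
    return chi2 / (len(counts) - n_params)
```

Here `exp(b1)` is the rate ratio associated with a one-unit change in `x`, which is how the parameters of a log-linear main effects model are usually interpreted.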
Lewis, Kristin Nicole; Heckman, Bernadette Davantes; Himawan, Lina
2011-08-01
Growth mixture modeling (GMM) identified latent groups based on treatment outcome trajectories of headache disability measures in patients in headache subspecialty treatment clinics. Using a longitudinal design, 219 patients in headache subspecialty clinics in 4 large cities throughout Ohio provided data on their headache disability at pretreatment and 3 follow-up assessments. GMM identified 3 treatment outcome trajectory groups: (1) patients who initiated treatment with elevated disability levels and who reported statistically significant reductions in headache disability (high-disability improvers; 11%); (2) patients who initiated treatment with elevated disability but who reported no reductions in disability (high-disability nonimprovers; 34%); and (3) patients who initiated treatment with moderate disability and who reported statistically significant reductions in headache disability (moderate-disability improvers; 55%). Based on the final multinomial logistic regression model, a dichotomized treatment appointment attendance variable was a statistically significant predictor for differentiating high-disability improvers from high-disability nonimprovers. Three-fourths of patients who initiated treatment with elevated disability levels did not report reductions in disability after 5 months of treatment with new preventive pharmacotherapies. Preventive headache agents may be most efficacious for patients with moderate levels of disability and for patients with high disability levels who attend all treatment appointments. Copyright © 2011 International Association for the Study of Pain. Published by Elsevier B.V. All rights reserved.
Regression of Cardiac Rhabdomyomas in a Neonate after Everolimus Treatment
Directory of Open Access Journals (Sweden)
Helen Bornaun
2016-01-01
Full Text Available Cardiac rhabdomyoma often shows spontaneous regression and usually requires only close follow-up. However, patients with symptomatic inoperable rhabdomyomas may be candidates for everolimus treatment. Our patient had multiple inoperable cardiac rhabdomyomas causing serious left ventricular outflow-tract obstruction; the masses showed a dramatic reduction in size after therapy with everolimus, a mammalian target of rapamycin (mTOR) inhibitor. After discontinuation of therapy, the diameter of the masses increased and everolimus was restarted. After 6 months of treatment, the rhabdomyomas decreased in size and therapy was stopped. In conclusion, everolimus could be a possible novel therapy for neonates with clinically significant rhabdomyomas.
Detecting nonsense for Chinese comments based on logistic regression
Zhuolin, Ren; Guang, Chen; Shu, Chen
2016-07-01
To understand cyber citizens' opinions accurately from Chinese news comments, a clear definition of nonsense is presented, and a detection model based on logistic regression (LR) is proposed. The detection of nonsense can be treated as a binary classification problem. Besides traditional lexical features, we propose three kinds of features in terms of emotion, structure and relevance. Using these features, we train an LR model and demonstrate its effect in understanding Chinese news comments. We find that each of the proposed features can significantly improve the result. In our experiments, we achieve a prediction accuracy of 84.3%, which improves on the baseline of 77.3% by 7 percentage points.
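Treating nonsense detection as binary classification with logistic regression can be sketched as follows. This is a minimal illustration trained by batch gradient descent; the feature vectors stand in for whatever lexical, emotion, structure and relevance scores are extracted from a comment, and all names and hyperparameters here are assumptions rather than the paper's setup.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(features)
    n_feat = len(features[0])
    w = [0.0] * n_feat
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_feat
        grad_b = 0.0
        for row, y in zip(features, labels):
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, row)))
            err = p - y  # derivative of the log-loss w.r.t. the linear score
            for j in range(n_feat):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, row):
    """Return 1 (nonsense) if the estimated probability is at least 0.5."""
    score = b + sum(wi * xi for wi, xi in zip(w, row))
    return 1 if sigmoid(score) >= 0.5 else 0
```

Adding a new feature family only lengthens each feature vector, so the contribution of each family can be checked by retraining with and without it.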
Logistic regression applied to natural hazards: rare event logistic regression with replications
Directory of Open Access Journals (Sweden)
Guns, M.; Vanacker, V.
2012-06-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
Smith, Paul F; Ganesh, Siva; Liu, Ping
2013-10-30
Regression is a common statistical tool for prediction in neuroscience. However, linear regression is by far the most common form of regression used, with regression trees receiving comparatively little attention. In this study, the results of conventional multiple linear regression (MLR) were compared with those of random forest regression (RFR), in the prediction of the concentrations of 9 neurochemicals in the vestibular nucleus complex and cerebellum that are part of the l-arginine biochemical pathway (agmatine, putrescine, spermidine, spermine, l-arginine, l-ornithine, l-citrulline, glutamate and γ-aminobutyric acid (GABA)). The R² values for the MLRs were higher than the proportion of variance explained values for the RFRs: 6/9 of them were ≥ 0.70 compared to 4/9 for RFRs. Even the variables that had the lowest R² values for the MLRs, e.g. ornithine (0.50) and glutamate (0.61), had much lower proportion of variance explained values for the RFRs (0.27 and 0.49, respectively). The RSE values for the MLRs were lower than those for the RFRs in all but two cases. In general, MLRs seemed to be superior to the RFRs in terms of predictive value and error. In the case of this data set, MLR appeared to be superior to RFR in terms of its explanatory value and error. This result suggests that MLR may have advantages over RFR for prediction in neuroscience with this kind of data set, but that RFR can still have good predictive value in some cases. Copyright © 2013 Elsevier B.V. All rights reserved.
Impact significance determination-Back to basics
International Nuclear Information System (INIS)
Lawrence, David P.
2007-01-01
Impact significance determination is widely recognized as a vital and critical EIA activity. But impact significance related concepts are poorly understood. And the quality of approaches for impact significance determination in EIA practice remains highly variable. This article seeks to help establish a sound and practical conceptual foundation for formulating and evaluating impact significance determination approaches. It addresses the nature (what is impact significance?), the core characteristics (what are the major properties of significance determination?), the rationale (why are impact significance determinations necessary?), the procedural and substantive objectives (what do impact significance determinations seek to achieve?), and the process for making impact significance judgments (how is impact significance determination conducted?). By identifying fundamental attributes and key distinctions associated with impact significance determinations, a basis is provided for designing and evaluating impact significance determination procedures at both the regulatory and applied levels
Ridge regression estimator: combining unbiased and ordinary ridge regression methods of estimation
Directory of Open Access Journals (Sweden)
Sharad Damodar Gore
2009-10-01
Full Text Available Statistical literature has several methods for coping with multicollinearity. This paper introduces a new shrinkage estimator, called modified unbiased ridge (MUR). This estimator is obtained from unbiased ridge regression (URR) in the same way that ordinary ridge regression (ORR) is obtained from ordinary least squares (OLS). Properties of MUR are derived. Results on its matrix mean squared error (MMSE) are obtained. MUR is compared with ORR and URR in terms of MMSE. These results are illustrated with an example based on data generated by Hoerl and Kennard (1975).
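The relationship the abstract relies on can be written out explicitly. The first two lines are the standard OLS and ORR estimators for the model y = Xβ + ε with ridge parameter k > 0. The last line is only a literal reading of "obtained from URR in the same way that ORR is obtained from OLS"; it is an assumption made for illustration and should be checked against the paper itself.

```latex
\begin{align}
  \hat{\beta}_{\mathrm{OLS}} &= (X'X)^{-1} X'y, \\
  \hat{\beta}_{\mathrm{ORR}}(k) &= (X'X + kI)^{-1} X'y
    = \bigl[ I - k (X'X + kI)^{-1} \bigr] \hat{\beta}_{\mathrm{OLS}}, \\
  \hat{\beta}_{\mathrm{MUR}}(k) &= \bigl[ I - k (X'X + kI)^{-1} \bigr] \hat{\beta}_{\mathrm{URR}}
    \quad \text{(assumed form)}.
\end{align}
```

The second equality in the ORR line follows from writing X'X = (X'X + kI) - kI, which shows that ORR applies the shrinkage operator I - k(X'X + kI)^{-1} to the OLS estimate.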
Directory of Open Access Journals (Sweden)
Hong-Juan Li
2013-04-01
Full Text Available Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by the strong non-linear learning capability of support vector regression (SVR), this paper presents an SVR model hybridized with the empirical mode decomposition (EMD) method and auto regression (AR) for electric load forecasting. The electric load data of the New South Wales (Australia) market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.
Using the Ridge Regression Procedures to Estimate the Multiple Linear Regression Coefficients
Gorgees, HazimMansoor; Mahdi, FatimahAssim
2018-05-01
This article is concerned with comparing the performance of different types of ordinary ridge regression estimators that have already been proposed to estimate the regression parameters when near-exact linear relationships among the explanatory variables are present. For this situation we employ data obtained from the tagi gas filling company during the period 2008-2010. The main result is that the method based on the condition number performs better than the other methods, since it has a smaller mean square error (MSE).
AN APPLICATION OF FUNCTIONAL MULTIVARIATE REGRESSION MODEL TO MULTICLASS CLASSIFICATION
Krzyśko, Mirosław; Smaga, Łukasz
2017-01-01
In this paper, the scale response functional multivariate regression model is considered. By using the basis functions representation of functional predictors and regression coefficients, this model is rewritten as a multivariate regression model. This representation of the functional multivariate regression model is used for multiclass classification for multivariate functional data. Computational experiments performed on real labelled data sets demonstrate the effectiveness of the proposed ...
Hunter, Paul R
2009-12-01
Household water treatment (HWT) is being widely promoted as an appropriate intervention for reducing the burden of waterborne disease in poor communities in developing countries. A recent study has raised concerns about the effectiveness of HWT, in part because of concerns over the lack of blinding and in part because of considerable heterogeneity in the reported effectiveness of randomized controlled trials. This study set out to attempt to investigate the causes of this heterogeneity and so identify factors associated with good health gains. Studies identified in an earlier systematic review and meta-analysis were supplemented with more recently published randomized controlled trials. A total of 28 separate studies of randomized controlled trials of HWT with 39 intervention arms were included in the analysis. Heterogeneity was studied using the "metareg" command in Stata. Initial analyses with single candidate predictors were undertaken and all variables significant at the P Risk and the parameter estimates from the final regression model. The overall effect size of all unblinded studies was relative risk = 0.56 (95% confidence intervals 0.51-0.63), but after adjusting for bias due to lack of blinding the effect size was much lower (RR = 0.85, 95% CI = 0.76-0.97). Four main variables were significant predictors of effectiveness of intervention in a multipredictor meta regression model: Log duration of study follow-up (regression coefficient of log effect size = 0.186, standard error (SE) = 0.072), whether or not the study was blinded (coefficient 0.251, SE 0.066) and being conducted in an emergency setting (coefficient -0.351, SE 0.076) were all significant predictors of effect size in the final model. Compared to the ceramic filter all other interventions were much less effective (Biosand 0.247, 0.073; chlorine and safe waste storage 0.295, 0.061; combined coagulant-chlorine 0.2349, 0.067; SODIS 0.302, 0.068). A Monte Carlo model predicted that over 12 months
Introduction to the use of regression models in epidemiology.
Bender, Ralf
2009-01-01
Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
Automation of Flight Software Regression Testing
Tashakkor, Scott B.
2016-01-01
NASA is developing the Space Launch System (SLS) to be a heavy lift launch vehicle supporting human and scientific exploration beyond earth orbit. SLS will have a common core stage, an upper stage, and different permutations of boosters and fairings to perform various crewed or cargo missions. Marshall Space Flight Center (MSFC) is writing the Flight Software (FSW) that will operate the SLS launch vehicle. The FSW is developed in an incremental manner based on "Agile" software techniques. As the FSW is incrementally developed, testing the functionality of the code needs to be performed continually to ensure that the integrity of the software is maintained. Manually testing the functionality on an ever-growing set of requirements and features is not an efficient solution and therefore needs to be done automatically to ensure testing is comprehensive. To support test automation, a framework for a regression test harness has been developed and used on SLS FSW. The test harness provides a modular design approach that can compile or read in the required information specified by the developer of the test. The modularity provides independence between groups of tests and the ability to add and remove tests without disturbing others. This provides the SLS FSW team a time-saving feature that is essential to meeting SLS Program technical and programmatic requirements. During development of SLS FSW, this technique has proved to be a useful tool to ensure all requirements have been tested, and that desired functionality is maintained, as changes occur. It also provides a mechanism for developers to check functionality of the code that they have developed. With this system, automation of regression testing is accomplished through a scheduling tool and/or commit hooks. Key advantages of this test harness capability include execution support for multiple independent test cases, the ability for developers to specify precisely what they are testing and how, the ability to add
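The modular harness idea described above (independent groups of tests that can be added or removed without disturbing the others) can be sketched in a few lines. This is a hypothetical illustration only: the class, the group names and the test bodies are invented for the example and do not describe the SLS FSW harness.

```python
from collections import defaultdict

class RegressionHarness:
    """Registry of independent test cases organised by group."""

    def __init__(self):
        self._tests = defaultdict(dict)

    def register(self, group, name):
        """Decorator that adds a test function to a group."""
        def decorator(func):
            self._tests[group][name] = func
            return func
        return decorator

    def remove(self, group, name):
        """Remove one test without touching any other test."""
        self._tests[group].pop(name, None)

    def run(self, groups=None):
        """Run the selected groups (default: all) and collect results."""
        results = {}
        for group in (groups or sorted(self._tests)):
            for name, func in sorted(self._tests[group].items()):
                try:
                    func()
                    results[(group, name)] = "PASS"
                except AssertionError as exc:
                    results[(group, name)] = f"FAIL: {exc}"
                except Exception as exc:  # a crash in one test must not stop the rest
                    results[(group, name)] = f"ERROR: {exc!r}"
        return results

harness = RegressionHarness()

@harness.register("guidance", "clamp_upper")   # hypothetical group and test name
def test_clamp_upper():
    assert min(5, 3) == 3

@harness.register("telemetry", "packet_length")  # hypothetical group and test name
def test_packet_length():
    assert len(bytes(8)) == 8
```

A scheduler or a commit hook would then only need to call `harness.run()` and inspect the returned dictionary, for example failing the commit if any value is not `"PASS"`.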
Laplacian embedded regression for scalable manifold regularization.
Chen, Lin; Tsang, Ivor W; Xu, Dong
2012-06-01
Semi-supervised learning (SSL), as a powerful tool to learn from a limited number of labeled data and a large number of unlabeled data, has been attracting increasing attention in the machine learning community. In particular, the manifold regularization framework has laid solid theoretical foundations for a large family of SSL algorithms, such as Laplacian support vector machine (LapSVM) and Laplacian regularized least squares (LapRLS). However, most of these algorithms are limited to small scale problems due to the high computational cost of the matrix inversion operation involved in the optimization problem. In this paper, we propose a novel framework called Laplacian embedded regression by introducing an intermediate decision variable into the manifold regularization framework. By using ε-insensitive loss, we obtain the Laplacian embedded support vector regression (LapESVR) algorithm, which inherits the sparse solution from SVR. Also, we derive Laplacian embedded RLS (LapERLS) corresponding to RLS under the proposed framework. Both LapESVR and LapERLS possess a simpler form of a transformed kernel, which is the summation of the original kernel and a graph kernel that captures the manifold structure. The benefits of the transformed kernel are two-fold: (1) we can deal with the original kernel matrix and the graph Laplacian matrix in the graph kernel separately and (2) if the graph Laplacian matrix is sparse, we only need to perform the inverse operation for a sparse matrix, which is much more efficient when compared with that for a dense one. Inspired by kernel principal component analysis, we further propose to project the introduced decision variable into a subspace spanned by a few eigenvectors of the graph Laplacian matrix in order to better reflect the data manifold, as well as accelerate the calculation of the graph kernel, allowing our methods to efficiently and effectively cope with large scale SSL problems. Extensive experiments on both toy and real
Directory of Open Access Journals (Sweden)
Edmunds, K. J.; Árnadóttir, Í.; Gíslason, M. K.; Carraro, U.
2016-01-01
Muscle degeneration has been consistently identified as an independent risk factor for high mortality in both aging populations and individuals suffering from neuromuscular pathology or injury. While there is much extant literature on its quantification and correlation to comorbidities, a quantitative gold standard for analyses in this regard remains undefined. Herein, we hypothesize that rigorously quantifying entire radiodensitometric distributions elicits more muscle quality information than average values reported in extant methods. This study reports the development and utility of a nonlinear trimodal regression analysis method utilized on radiodensitometric distributions of upper leg muscles from CT scans of a healthy young adult, a healthy elderly subject, and a spinal cord injury patient. The method was then employed with a THA cohort to assess pre- and postsurgical differences in their healthy and operative legs. Results from the initial representative models elicited high degrees of correlation to HU distributions, and regression parameters highlighted physiologically evident differences between subjects. Furthermore, results from the THA cohort echoed physiological justification and indicated significant improvements in muscle quality in both legs following surgery. Altogether, these results highlight the utility of novel parameters from entire HU distributions that could provide insight into the optimal quantification of muscle degeneration. PMID:28115982