WorldWideScience

Sample records for regression analyses based

  1. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.

    Science.gov (United States)

    Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg

    2009-11-01

    G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
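
    For readers without access to G*Power, the power of the basic single-sample correlation test can be approximated by hand via Fisher's z-transformation; a minimal Python sketch of that textbook approximation (not G*Power's own routine):

      import numpy as np
      from scipy.stats import norm

      def correlation_power(rho, n, alpha=0.05):
          """Approximate power of a two-sided test of H0: rho = 0,
          using Fisher's z-transformation with SE = 1/sqrt(n - 3)."""
          z_effect = np.arctanh(rho) * np.sqrt(n - 3)
          z_crit = norm.ppf(1 - alpha / 2)
          return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)

      print(correlation_power(rho=0.3, n=84))  # ~0.80, a classic benchmark case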

  2. Personal, social, and game-related correlates of active and non-active gaming among Dutch gaming adolescents: survey-based multivariable, multilevel logistic regression analyses.

    Science.gov (United States)

    Simons, Monique; de Vet, Emely; Chinapaw, Mai Jm; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-04-04

    Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games—active games—seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; P<.001), a less positive attitude toward non-active games (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008), having brothers/sisters who play active games (OR 6.7, CI 2.6-17.1; P<.001), and a lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P<.001), and having friends who play non-active games. Different correlates were thus found for active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non-active gaming is most strongly associated with observed non-active game behavior of friends, habit strength regarding gaming and attitude toward non-active gaming (positive associations). Habit strength was a correlate of both active and non-active gaming.
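
    The sketch below reproduces the reporting style of such an analysis (odds ratios with 95% CIs) with statsmodels. It is a simplified, single-level analogue of the multilevel models used in the study, and all data and variable names are hypothetical:

      import numpy as np
      import pandas as pd
      import statsmodels.api as sm

      rng = np.random.default_rng(0)
      df = pd.DataFrame({
          "active_1h": rng.integers(0, 2, 300),      # outcome: active gaming >=1 h/wk
          "attitude_active": rng.normal(0, 1, 300),  # standardized attitude score
          "habit_strength": rng.normal(0, 1, 300),
          "age": rng.integers(12, 17, 300),
      })
      X = sm.add_constant(df[["attitude_active", "habit_strength", "age"]])
      res = sm.Logit(df["active_1h"], X).fit(disp=0)

      # Exponentiated coefficients give the odds ratios reported in the abstract.
      ors = pd.concat([np.exp(res.params), np.exp(res.conf_int())], axis=1)
      ors.columns = ["OR", "2.5%", "97.5%"]
      print(ors)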

  3. Personal, Social, and Game-Related Correlates of Active and Non-Active Gaming Among Dutch Gaming Adolescents: Survey-Based Multivariable, Multilevel Logistic Regression Analyses

    Science.gov (United States)

    de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-01-01

    Background Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games—active games—seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; P<.001), a less positive attitude toward non-active games (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008), having brothers/sisters who play active games (OR 6.7, CI 2.6-17.1; P<.001), and a lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P<.001), and having friends who play non-active games. Different correlates were thus found for active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non-active gaming is most strongly associated with observed non-active game behavior of friends, habit strength regarding gaming and attitude toward non-active gaming (positive associations). Habit strength was a correlate of both active and non-active gaming.

  4. Applications of MIDAS regression in analysing trends in water quality

    Science.gov (United States)

    Penev, Spiridon; Leonte, Daniela; Lazarov, Zdravetz; Mann, Rob A.

    2014-04-01

    We discuss novel statistical methods for analysing trends in water quality. Such analysis uses complex data sets comprising different classes of variables, including water quality, hydrological, and meteorological measurements. We analyse the effect of rainfall and flow on trends in water quality using a flexible model called Mixed Data Sampling (MIDAS). This model arises because of the mixed frequencies in the data collection: typically, water quality variables are sampled fortnightly, whereas rainfall data are sampled daily. The advantage of MIDAS regression is its flexible and parsimonious modelling of the influence of rainfall and flow on trends in water quality variables. We discuss the model and its implementation on a data set from the Shoalhaven Supply System and Catchments in the state of New South Wales, Australia. Information criteria indicate that MIDAS modelling improves upon simplistic approaches that do not utilise the mixed-frequency nature of the data.
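
    A minimal sketch of the core MIDAS idea, assuming each fortnightly observation is paired with the preceding 14 daily rainfall totals and using the common exponential Almon weighting (the authors' exact specification is not given in the abstract):

      import numpy as np
      from scipy.optimize import least_squares

      def almon_weights(theta1, theta2, n_lags):
          """Exponential Almon lag weights, normalized to sum to one."""
          j = np.arange(1, n_lags + 1)
          w = np.exp(theta1 * j + theta2 * j**2)
          return w / w.sum()

      def residuals(params, y, X_daily):
          beta0, beta1, theta1, theta2 = params
          w = almon_weights(theta1, theta2, X_daily.shape[1])
          return y - (beta0 + beta1 * X_daily @ w)

      # Hypothetical alignment: y[t] is a fortnightly water quality reading,
      # X_daily[t] holds the 14 daily rainfall totals preceding that sample.
      rng = np.random.default_rng(1)
      X_daily = rng.gamma(2.0, 3.0, size=(120, 14))
      y = 1.0 + 0.5 * X_daily @ almon_weights(0.1, -0.05, 14) \
          + rng.normal(0, 0.1, 120)

      fit = least_squares(residuals, x0=[0.0, 0.1, 0.0, 0.0], args=(y, X_daily))
      print(fit.x)  # recovers beta0, beta1 and the Almon shape parameters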

  5. Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies.

    Science.gov (United States)

    Vatcheva, Kristina P; Lee, MinJae; McCormick, Joseph B; Rahbar, Mohammad H

    2016-04-01

    The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013 illustrated the need for greater attention to identifying and minimizing the effect of multicollinearity in analyses of data from epidemiologic studies. We used simulated datasets and real-life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in regression analysis and to encourage researchers to consider diagnostics for multicollinearity as one of the steps in regression analysis.
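
    One standard diagnostic is the variance inflation factor (VIF); a small statsmodels illustration on simulated collinear data:

      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      from statsmodels.stats.outliers_influence import variance_inflation_factor

      rng = np.random.default_rng(2)
      x1 = rng.normal(size=200)
      x2 = x1 + rng.normal(scale=0.1, size=200)   # nearly collinear with x1
      x3 = rng.normal(size=200)
      X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

      # A VIF above 10 (some authors use 5) is a common rule of thumb for
      # problematic collinearity; the constant's VIF can be ignored.
      for i, name in enumerate(X.columns):
          print(name, round(variance_inflation_factor(X.values, i), 1))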

  6. Statistical and regression analyses of detected extrasolar systems

    Czech Academy of Sciences Publication Activity Database

    Pintr, Pavel; Peřinová, V.; Lukš, A.; Pathak, A.

    2013-01-01

    Roč. 75, č. 1 (2013), s. 37-45 ISSN 0032-0633 Institutional support: RVO:61389021 Keywords : Exoplanets * Kepler candidates * Regression analysis Subject RIV: BN - Astronomy, Celestial Mechanics, Astrophysics Impact factor: 1.630, year: 2013 http://www.sciencedirect.com/science/article/pii/S0032063312003066

  7. Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies

    OpenAIRE

    Vatcheva, Kristina P.; Lee, MinJae; McCormick, Joseph B.; Rahbar, Mohammad H.

    2016-01-01

    The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013 illustrated the need for greater attention to identifying and minimizing the effect of multicollinearity in analyses of data from epide...

  8. Analysing inequalities in Germany: a structured additive distributional regression approach

    CERN Document Server

    Silbersdorff, Alexander

    2017-01-01

    This book seeks new perspectives on the growing inequalities that our societies face, putting forward Structured Additive Distributional Regression as a means of statistical analysis that circumvents the common problem of analytical reduction to simple point estimators. This new approach allows the observed discrepancy between the individuals' realities and the abstract representation of those realities to be explicitly taken into consideration, rather than reducing the analysis to the arithmetic mean alone. In turn, the method is applied to the question of economic inequality in Germany.

  9. Pathway-based analyses.

    Science.gov (United States)

    Kent, Jack W

    2016-02-03

    New technologies for acquisition of genomic data, while offering unprecedented opportunities for genetic discovery, also impose severe burdens of interpretation and penalties for multiple testing. The Pathway-based Analyses Group of the Genetic Analysis Workshop 19 (GAW19) sought reduction of multiple-testing burden through various approaches to aggregation of high-dimensional data in pathways informed by prior biological knowledge. Experimental methods tested included the use of "synthetic pathways" (random sets of genes) to estimate power and false-positive error rate of methods applied to simulated data; data reduction via independent components analysis, single-nucleotide polymorphism (SNP)-SNP interaction, and use of gene sets to estimate genetic similarity; and general assessment of the efficacy of prior biological knowledge to reduce the dimensionality of complex genomic data. The work of this group explored several promising approaches to managing high-dimensional data, with the caveat that these methods are necessarily constrained by the quality of external bioinformatic annotation.

  10. How to deal with continuous and dichotomic outcomes in epidemiological research: linear and logistic regression analyses

    NARCIS (Netherlands)

    Tripepi, Giovanni; Jager, Kitty J.; Stel, Vianda S.; Dekker, Friedo W.; Zoccali, Carmine

    2011-01-01

    Because of some limitations of stratification methods, epidemiologists frequently use multiple linear and logistic regression analyses to address specific epidemiological questions. If the dependent variable is a continuous one (for example, systolic pressure or serum creatinine), the researcher uses linear regression analysis, whereas logistic regression is the appropriate technique when the outcome is dichotomous.
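
    The paper's central distinction takes a few lines to illustrate with statsmodels (simulated data; the outcome type dictates the model family):

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(3)
      age = rng.uniform(20, 80, 500)
      X = sm.add_constant(age)

      # Continuous outcome (e.g., systolic pressure): linear regression.
      sbp = 100 + 0.5 * age + rng.normal(0, 10, 500)
      print(sm.OLS(sbp, X).fit().params)                   # slope in mmHg per year

      # Dichotomous outcome (e.g., hypertension yes/no): logistic regression.
      p = 1 / (1 + np.exp(-(-6 + 0.08 * age)))
      htn = rng.binomial(1, p)
      print(np.exp(sm.Logit(htn, X).fit(disp=0).params))   # odds ratio per year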

  11. Modeling oil production based on symbolic regression

    International Nuclear Information System (INIS)

    Yang, Guangfei; Li, Xianneng; Wang, Jianliang; Lian, Lian; Ma, Tieju

    2015-01-01

    Numerous models have been proposed to forecast the future trends of oil production and almost all of them are based on some predefined assumptions with various uncertainties. In this study, we propose a novel data-driven approach that uses symbolic regression to model oil production. We validate our approach on both synthetic and real data, and the results prove that symbolic regression could effectively identify the true models beneath the oil production data and also make reliable predictions. Symbolic regression indicates that world oil production will peak in 2021, which broadly agrees with other techniques used by researchers. Our results also show that the rate of decline after the peak is almost half the rate of increase before the peak, and it takes nearly 12 years to drop 4% from the peak. These predictions are more optimistic than those in several other reports, and the smoother decline will provide the world, especially the developing countries, with more time to orchestrate mitigation plans. -- Highlights: •A data-driven approach has been shown to be effective at modeling oil production. •The Hubbert model could be discovered automatically from data. •The peak of world oil production is predicted to appear in 2021. •The decline rate after the peak is half of the increase rate before the peak. •Oil production is projected to decline 4% post-peak.
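
    Symbolic regression is typically implemented with genetic programming; the sketch below uses the open-source gplearn library (an assumed stand-in, not the authors' code) on a toy peaked "production" curve:

      import numpy as np
      from gplearn.genetic import SymbolicRegressor  # assumes gplearn is installed

      # Toy stand-in for a production curve: a rational, Hubbert-like peak.
      rng = np.random.default_rng(0)
      t = np.linspace(0, 10, 200).reshape(-1, 1)
      y = 10.0 / (1.0 + (t.ravel() - 5.0) ** 2) + rng.normal(0, 0.05, 200)

      est = SymbolicRegressor(population_size=1000, generations=20,
                              function_set=("add", "sub", "mul", "div"),
                              parsimony_coefficient=0.01, random_state=0)
      est.fit(t, y)
      print(est._program)  # the evolved closed-form expression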

  12. SDE based regression for random PDEs

    KAUST Repository

    Bayer, Christian

    2016-01-01

    A simulation based method for the numerical solution of PDE with random coefficients is presented. By the Feynman-Kac formula, the solution can be represented as conditional expectation of a functional of a corresponding stochastic differential equation driven by independent noise. A time discretization of the SDE for a set of points in the domain and a subsequent Monte Carlo regression lead to an approximation of the global solution of the random PDE. We provide an initial error and complexity analysis of the proposed method along with numerical examples illustrating its behaviour.
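
    A minimal sketch of the method for a constant-coefficient toy case, where Feynman-Kac gives u(0, x) = E[g(x + sigma * W_T)] for the heat equation: Monte Carlo at scattered points, then a global least-squares polynomial regression:

      import numpy as np

      sigma, T = 1.0, 1.0
      g = lambda x: np.maximum(1.0 - x**2, 0.0)   # terminal condition u(T, x)

      rng = np.random.default_rng(4)
      xs = np.linspace(-3, 3, 40)                 # points in the spatial domain
      M = 20000                                   # Monte Carlo paths per point

      # One exact step suffices here because the coefficients are constant;
      # a general SDE would need an Euler-Maruyama time grid.
      W_T = rng.normal(0.0, np.sqrt(T), size=M)
      u_mc = np.array([g(x + sigma * W_T).mean() for x in xs])

      # Global approximation of the solution by polynomial regression.
      coeffs = np.polyfit(xs, u_mc, deg=8)
      print(np.polyval(coeffs, 0.0))              # approximate u(0, 0)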

  14. Credit Scoring Problem Based on Regression Analysis

    OpenAIRE

    Khassawneh, Bashar Suhil Jad Allah

    2014-01-01

    ABSTRACT: This thesis provides an explanatory introduction to the regression models of data mining and contains basic definitions of key terms in the linear, multiple and logistic regression models. Meanwhile, the aim of this study is to illustrate fitting models for the credit scoring problem using simple linear, multiple linear and logistic regression models and also to analyze the found model functions by statistical tools. Keywords: Data mining, linear regression, logistic regression....

  15. Collaborative regression-based anatomical landmark detection

    International Nuclear Information System (INIS)

    Gao, Yaozong; Shen, Dinggang

    2015-01-01

    Anatomical landmark detection plays an important role in medical image analysis, e.g. for registration, segmentation and quantitative analysis. Among the various existing methods for landmark detection, regression-based methods have recently attracted much attention due to their robustness and efficiency. In these methods, landmarks are localised through voting from all image voxels, which is completely different from the classification-based methods that use voxel-wise classification to detect landmarks. Despite their robustness, the accuracy of regression-based landmark detection methods is often limited due to (1) the inclusion of uninformative image voxels in the voting procedure, and (2) the lack of effective ways to incorporate inter-landmark spatial dependency into the detection step. In this paper, we propose a collaborative landmark detection framework to address these limitations. The concept of collaboration is reflected in two aspects. (1) Multi-resolution collaboration. A multi-resolution strategy is proposed to hierarchically localise landmarks by gradually excluding uninformative votes from faraway voxels. Moreover, for informative voxels near the landmark, a spherical sampling strategy is also designed at the training stage to improve their prediction accuracy. (2) Inter-landmark collaboration. A confidence-based landmark detection strategy is proposed to improve the detection accuracy of ‘difficult-to-detect’ landmarks by using spatial guidance from ‘easy-to-detect’ landmarks. To evaluate our method, we conducted experiments extensively on three datasets for detecting prostate landmarks and head and neck landmarks in computed tomography images, and also dental landmarks in cone beam computed tomography images. The results show the effectiveness of our collaborative landmark detection framework in improving landmark detection accuracy, compared to other state-of-the-art methods. (paper)

  16. Robust Mediation Analysis Based on Median Regression

    Science.gov (United States)

    Yuan, Ying; MacKinnon, David P.

    2014-01-01

    Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. PMID:24079925
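
    A sketch of the core idea: the two estimation steps of the product-of-coefficients method are run as median (q = 0.5) regressions with statsmodels, so heavy-tailed errors do not distort the mediated effect (all data simulated):

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      rng = np.random.default_rng(5)
      n = 500
      x = rng.normal(size=n)                        # e.g., treatment exposure
      m = 0.5 * x + rng.standard_t(df=3, size=n)    # mediator, heavy-tailed errors
      y = 0.4 * m + 0.1 * x + rng.standard_t(df=3, size=n)
      df = pd.DataFrame({"x": x, "m": m, "y": y})

      a = smf.quantreg("m ~ x", df).fit(q=0.5).params["x"]
      b = smf.quantreg("y ~ m + x", df).fit(q=0.5).params["m"]
      print("indirect (mediated) effect a*b:", a * b)   # ~0.5 * 0.4 = 0.2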

  17. USE OF THE SIMPLE LINEAR REGRESSION MODEL IN MACRO-ECONOMICAL ANALYSES

    Directory of Open Access Journals (Sweden)

    Constantin ANGHELACHE

    2011-10-01

    The article presents the fundamental aspects of linear regression as a toolbox which can be used in macroeconomic analyses. The article describes the estimation of the parameters, the statistical tests used, and homoscedasticity and heteroskedasticity. The use of econometric instruments in macroeconomics is an important factor that guarantees the quality of the models, analyses, results and possible interpretations that can be drawn at this level.

  18. Regression Discontinuity Designs Based on Population Thresholds

    DEFF Research Database (Denmark)

    Eggers, Andrew C.; Freier, Ronny; Grembi, Veronica

    In many countries, important features of municipal government (such as the electoral system, mayors' salaries, and the number of councillors) depend on whether the municipality is above or below arbitrary population thresholds. Several papers have used a regression discontinuity design (RDD...

  19. Linear Regression Based Real-Time Filtering

    Directory of Open Access Journals (Sweden)

    Misel Batmend

    2013-01-01

    This paper introduces a real-time filtering method based on a linear least-squares fitted line. The method can be used when the filtered signal is linear; this constraint narrows the band of potential applications. Its advantage over the Kalman filter is that it is computationally less expensive. The paper further deals with the application of the introduced method to filtering data used to evaluate the position of engraved material with respect to an engraving machine. The filter was implemented in the CNC engraving machine control system. Experiments showing its performance are included.
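
    A minimal sketch of such a filter: keep the last k samples, fit a least-squares line, and output the fitted value at the newest time point:

      from collections import deque
      import numpy as np

      class LinearFitFilter:
          """Real-time filter that fits a least-squares line to the last k
          samples and returns the fitted value at the newest time point."""
          def __init__(self, k=10):
              self.t = deque(maxlen=k)
              self.y = deque(maxlen=k)

          def update(self, t_new, y_new):
              self.t.append(t_new)
              self.y.append(y_new)
              if len(self.y) < 2:
                  return y_new
              slope, intercept = np.polyfit(self.t, self.y, 1)
              return slope * t_new + intercept

      f = LinearFitFilter(k=10)
      signal = np.linspace(0, 1, 50) + np.random.default_rng(6).normal(0, 0.05, 50)
      smoothed = [f.update(i, y) for i, y in enumerate(signal)]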

  20. Genetic analyses of partial egg production in Japanese quail using multi-trait random regression models.

    Science.gov (United States)

    Karami, K; Zerehdaran, S; Barzanooni, B; Lotfi, E

    2017-12-01

    1. The aim of the present study was to estimate genetic parameters for average egg weight (EW) and egg number (EN) at different ages in Japanese quail using multi-trait random regression (MTRR) models. 2. A total of 8534 records from 900 quail, hatched between 2014 and 2015, were used in the study. Average weekly egg weights and egg numbers were measured from the second until the sixth week of egg production. 3. Nine random regression models were compared to identify the best order of the Legendre polynomials (LP). The optimal model was identified by the Bayesian Information Criterion. A model with second order of LP for fixed effects, second order of LP for additive genetic effects and third order of LP for permanent environmental effects (MTRR23) was found to be the best. 4. According to the MTRR23 model, direct heritability for EW increased from 0.26 in the second week to 0.53 in the sixth week of egg production, whereas the ratio of permanent environment to phenotypic variance decreased from 0.48 to 0.1. Direct heritability for EN was low, whereas the ratio of permanent environment to phenotypic variance decreased from 0.57 to 0.15 during the production period. 5. For each trait, estimated genetic correlations among weeks of egg production were high (from 0.85 to 0.98). Genetic correlations between EW and EN were low and negative for the first two weeks, but they were low and positive for the rest of the egg production period. 6. In conclusion, random regression models can be used effectively for analysing egg production traits in Japanese quail. Response to selection for increased egg weight would be higher at older ages because of its higher heritability, and such a breeding program would have no negative genetic impact on egg production.
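
    The fixed-effect part of such models rests on a Legendre-polynomial design matrix over the trajectory; a small numpy sketch (the additive genetic and permanent environmental random effects require a dedicated mixed-model package, as used by the authors):

      import numpy as np
      from numpy.polynomial import legendre

      def legendre_design(age, deg):
          """Design matrix of Legendre polynomials of order 0..deg, with age
          standardized to [-1, 1] as is usual in random regression models."""
          a = np.asarray(age, dtype=float)
          x = 2 * (a - a.min()) / (a.max() - a.min()) - 1
          return legendre.legvander(x, deg)

      weeks = np.array([2, 3, 4, 5, 6])    # weeks of egg production
      print(legendre_design(weeks, deg=2)) # second-order LP, as in model MTRR23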

  1. Reducing Inter-Laboratory Differences between Semen Analyses Using Z Score and Regression Transformations

    Directory of Open Access Journals (Sweden)

    Esther Leushuis

    2016-12-01

    Background: Standardization of the semen analysis may improve reproducibility. We assessed variability between laboratories in semen analyses and evaluated whether a transformation using Z scores and regression statistics was able to reduce this variability. Materials and Methods: We performed a retrospective cohort study. We calculated between-laboratory coefficients of variation (CVB) for sperm concentration and for morphology. Subsequently, we standardized the semen analysis results by calculating laboratory-specific Z scores, and by using regression. We used analysis of variance for four semen parameters to assess systematic differences between laboratories before and after the transformations, both in the circulation samples and in the samples obtained in the prospective cohort study in the Netherlands between January 2002 and February 2004. Results: The mean CVB was 7% for sperm concentration (range 3 to 13%) and 32% for sperm morphology (range 18 to 51%). The differences between the laboratories were statistically significant for all semen parameters (all P<0.001). Standardization using Z scores did not reduce the differences in semen analysis results between the laboratories (all P<0.001). Conclusion: There exists large between-laboratory variability for sperm morphology and small, but statistically significant, between-laboratory variation for sperm concentration. Standardization using Z scores does not eliminate between-laboratory variability.

  2. Analyses of Developmental Rate Isomorphy in Ectotherms: Introducing the Dirichlet Regression.

    Directory of Open Access Journals (Sweden)

    David S Boukal

    Temperature drives development in insects and other ectotherms because their metabolic rate and growth depend directly on thermal conditions. However, relative durations of successive ontogenetic stages often remain nearly constant across a substantial range of temperatures. This pattern, termed 'developmental rate isomorphy' (DRI) in insects, appears to be widespread, and reported departures from DRI are generally very small. We show that these conclusions may be due to the caveats hidden in the statistical methods currently used to study DRI. Because the DRI concept is inherently based on proportional data, we propose that Dirichlet regression applied to individual-level data is an appropriate statistical method to critically assess DRI. As a case study we analyze data on five aquatic and four terrestrial insect species. We find that results obtained by Dirichlet regression are consistent with DRI violation in at least eight of the studied species, although standard analysis detects significant departure from DRI in only four of them. Moreover, the departures from DRI detected by Dirichlet regression are consistently much larger than previously reported. The proposed framework can also be used to infer whether observed departures from DRI reflect life history adaptations to size- or stage-dependent effects of varying temperature. Our results indicate that the concept of DRI in insects and other ectotherms should be critically re-evaluated and put in a wider context, including the concept of 'equiproportional development' developed for copepods.
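
    Dirichlet regression is usually run in R (e.g., the DirichletReg package); a rough Python sketch of the underlying idea maximizes the Dirichlet log-likelihood with a log-link for the concentration parameters (all data hypothetical):

      import numpy as np
      from scipy.optimize import minimize
      from scipy.special import gammaln

      def neg_loglik(beta_flat, X, P):
          """Dirichlet log-likelihood with alpha_ik = exp(x_i' beta_k)."""
          beta = beta_flat.reshape(X.shape[1], P.shape[1])
          alpha = np.exp(X @ beta)                  # (n, K), all positive
          ll = (gammaln(alpha.sum(axis=1)) - gammaln(alpha).sum(axis=1)
                + ((alpha - 1) * np.log(P)).sum(axis=1))
          return -ll.sum()

      # Hypothetical data: P[i] holds the proportions of total development time
      # spent in each of 3 stages (rows sum to 1); X = intercept + temperature.
      rng = np.random.default_rng(6)
      temp = rng.uniform(10, 30, 150)
      X = np.column_stack([np.ones(150), (temp - 20) / 10])
      P = rng.dirichlet(alpha=[5, 3, 2], size=150)

      res = minimize(neg_loglik, x0=np.zeros(2 * 3), args=(X, P), method="BFGS")
      # Under DRI the temperature row of coefficients is ~0: stage proportions
      # do not shift with temperature.
      print(res.x.reshape(2, 3))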

  3. Logistic regression and multiple classification analyses to explore risk factors of under-5 mortality in Bangladesh

    International Nuclear Information System (INIS)

    Bhowmik, K.R.; Islam, S.

    2016-01-01

    Logistic regression (LR) analysis is the most common statistical methodology for finding the determinants of childhood mortality. However, the significant predictors cannot be ranked according to their influence on the response variable. Multiple classification (MC) analysis can be applied to identify the significant predictors with a priority index which helps to rank the predictors. The main objective of the study is to find the socio-demographic determinants of childhood mortality at the neonatal, post-neonatal, and post-infant periods by fitting an LR model, and to rank them through MC analysis. The study is conducted using the data of the Bangladesh Demographic and Health Survey 2007, in which birth and death information of children was collected from their mothers. Three dichotomous response variables were constructed from children's age at death to fit the LR and MC models. Socio-economic and demographic variables significantly associated with the response variables were considered separately in the LR and MC analyses. Both the LR and MC models identified the same significant predictors for each type of childhood mortality. For both neonatal and child mortality, biological factors of children, regional settings, and parents' socio-economic status were found to be the first, second, and third most important groups of predictors, respectively. Mother's education and household environment were detected as major significant predictors of post-neonatal mortality. This study shows that MC analysis, with or without LR analysis, can be applied to detect determinants with rank, which helps policy makers take initiatives on a priority basis. (author)

  4. Testing the equality of nonparametric regression curves based on ...

    African Journals Online (AJOL)

    Abstract. In this work we propose a new methodology for the comparison of two regression functions f1 and f2 in the case of homoscedastic error structure and a fixed design. Our approach is based on the empirical Fourier coefficients of the regression functions f1 and f2 respectively. As our main results we obtain the ...

  5. The number of subjects per variable required in linear regression analyses

    NARCIS (Netherlands)

    P.C. Austin (Peter); E.W. Steyerberg (Ewout)

    2015-01-01

    Objectives To determine the number of independent variables that can be included in a linear regression model. Study Design and Setting We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression coefficients and standard errors.

  6. PC based uranium enrichment analyser

    International Nuclear Information System (INIS)

    Madan, V.K.; Gopalakrishana, K.R.; Bairi, B.R.

    1991-01-01

    It is important to measure enrichment of unirradiated nuclear fuel elements during production as a quality control measure. An IBM PC based system has recently been tested for enrichment measurements for Nuclear Fuel Complex (NFC), Hyderabad. As required by NFC, the system has ease of calibration. It is easy to switch the system from measuring enrichment of fuel elements to pellets and also automatically store the data and the results. The system uses an IBM PC plug-in card to acquire data. The card incorporates programmable interval timers (8253-5). The counter/timer devices are accessed through I/O-mapped I/O. A novel algorithm has been incorporated to make the system more reliable. The application software has been written in BASIC. (author). 9 refs., 1 fig

  7. The number of subjects per variable required in linear regression analyses.

    Science.gov (United States)

    Austin, Peter C; Steyerberg, Ewout W

    2015-06-01

    To determine the number of independent variables that can be included in a linear regression model. We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression coefficients and standard errors, on the empirical coverage of estimated confidence intervals, and on the accuracy of the estimated R(2) of the fitted model. A minimum of approximately two SPV tended to result in estimation of regression coefficients with relative bias of less than 10%. Furthermore, with this minimum number of SPV, the standard errors of the regression coefficients were accurately estimated and estimated confidence intervals had approximately the advertised coverage rates. A much higher number of SPV were necessary to minimize bias in estimating the model R(2), although adjusted R(2) estimates behaved well. The bias in estimating the model R(2) statistic was inversely proportional to the magnitude of the proportion of variation explained by the population regression model. Linear regression models require only two SPV for adequate estimation of regression coefficients, standard errors, and confidence intervals. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
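
    The simulation design can be reproduced in miniature; this sketch (a simplified version of the authors' setup) estimates the relative bias of OLS coefficients for a given number of subjects per variable:

      import numpy as np

      def spv_relative_bias(spv, p=10, reps=2000, seed=0):
          """Mean |relative bias| (%) of OLS coefficients with n = spv * p."""
          rng = np.random.default_rng(seed)
          n, beta = spv * p, np.ones(p)
          est = np.empty((reps, p))
          for r in range(reps):
              X = rng.normal(size=(n, p))
              y = X @ beta + rng.normal(size=n)
              est[r] = np.linalg.lstsq(X, y, rcond=None)[0]
          return np.abs((est.mean(axis=0) - beta) / beta * 100).mean()

      for spv in (2, 5, 20):
          # Stays small even at 2 SPV, in line with the paper's conclusion.
          print(spv, "SPV ->", round(spv_relative_bias(spv), 2), "%")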

  8. Regression-based Multi-View Facial Expression Recognition

    NARCIS (Netherlands)

    Rudovic, Ognjen; Patras, Ioannis; Pantic, Maja

    2010-01-01

    We present a regression-based scheme for multi-view facial expression recognition based on 2D geometric features. We address the problem by mapping facial points (e.g. mouth corners) from non-frontal to frontal view where further recognition of the expressions can be performed using a

  9. Improved Dietary Guidelines for Vitamin D: Application of Individual Participant Data (IPD-Level Meta-Regression Analyses

    Directory of Open Access Journals (Sweden)

    Kevin D. Cashman

    2017-05-01

    Dietary Reference Values (DRVs) for vitamin D have a key role in the prevention of vitamin D deficiency. However, despite adopting similar risk assessment protocols, estimates from authoritative agencies over the last 6 years have been diverse. This may have arisen from diverse approaches to data analysis. Modelling strategies for pooling of individual subject data from cognate vitamin D randomized controlled trials (RCTs) are likely to provide the most appropriate DRV estimates. Thus, the objective of the present work was to undertake the first-ever individual participant data (IPD)-level meta-regression, which is increasingly recognized as best practice, from seven winter-based RCTs (with 882 participants ranging in age from 4 to 90 years) of the vitamin D intake–serum 25-hydroxyvitamin D (25(OH)D) dose-response. Our IPD-derived estimates of vitamin D intakes required to maintain 97.5% of 25(OH)D concentrations >25, 30, and 50 nmol/L across the population are 10, 13, and 26 µg/day, respectively. In contrast, standard meta-regression analyses with aggregate data (as used by several agencies in recent years) from the same RCTs estimated that a vitamin D intake requirement of 14 µg/day would maintain 97.5% of 25(OH)D >50 nmol/L. These first IPD-derived estimates offer improved dietary recommendations for vitamin D because the underpinning modeling captures the between-person variability in response of serum 25(OH)D to vitamin D intake.
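
    A one-stage IPD analysis of this kind can be sketched as a mixed model with a random study intercept; the statsmodels example below is a simplification of the authors' modelling, with simulated data and a square-root term standing in for the curvilinear intake-response:

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      # Hypothetical pooled IPD: serum 25(OH)D (nmol/L) and total vitamin D
      # intake (ug/day) for 882 participants nested in 7 RCTs.
      rng = np.random.default_rng(7)
      study = rng.integers(0, 7, 882)
      intake = rng.uniform(0, 30, 882)
      ohd = 20 + 5 * np.sqrt(intake) + rng.normal(0, 3, 7)[study] \
            + rng.normal(0, 6, 882)
      df = pd.DataFrame({"ohd": ohd, "sqrt_intake": np.sqrt(intake),
                         "study": study})

      fit = smf.mixedlm("ohd ~ sqrt_intake", df, groups=df["study"]).fit()
      print(fit.summary())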

  10. Correcting for multivariate measurement error by regression calibration in meta-analyses of epidemiological studies.

    NARCIS (Netherlands)

    Kromhout, D.

    2009-01-01

    Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements of the risk factors are observed on a subsample.

  11. Testing Mediation Using Multiple Regression and Structural Equation Modeling Analyses in Secondary Data

    Science.gov (United States)

    Li, Spencer D.

    2011-01-01

    Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…

  12. Regression Analyses on the Butterfly Ballot Effect: A Statistical Perspective of the US 2000 Election

    Science.gov (United States)

    Wu, Dane W.

    2002-01-01

    The year 2000 US presidential election between Al Gore and George Bush has been the most intriguing and controversial one in American history. The state of Florida was the trigger for the controversy, mainly, due to the use of the misleading "butterfly ballot". Using prediction (or confidence) intervals for least squares regression lines…

  13. Alpins and Thibos vectorial astigmatism analyses: proposal of a linear regression model between methods

    Directory of Open Access Journals (Sweden)

    Giuliano de Oliveira Freitas

    2013-10-01

    PURPOSE: To determine linear regression models between Alpins descriptive indices and Thibos astigmatic power vectors (APV), assessing the validity and strength of such correlations. METHODS: This case series prospectively assessed 62 eyes of 31 consecutive cataract patients with preoperative corneal astigmatism between 0.75 and 2.50 diopters in both eyes. Patients were randomly allocated to two phacoemulsification groups: one assigned to receive an AcrySof® Toric intraocular lens (IOL) in both eyes and another assigned to have an AcrySof Natural IOL associated with limbal relaxing incisions, also in both eyes. All patients were reevaluated postoperatively at 6 months, when refractive astigmatism analysis was performed using both the Alpins and Thibos methods. The ratio between Thibos postoperative APV and preoperative APV (APVratio) and its linear regression to the Alpins percentage of success of astigmatic surgery, percentage of astigmatism corrected and percentage of astigmatism reduction at the intended axis were assessed. RESULTS: A significant negative correlation between the APVratio and the Alpins percentage of success (%Success) was found (Spearman's ρ=-0.93); the linear regression is given by the following equation: %Success = (-APVratio + 1.00) × 100. CONCLUSION: The linear regression we found between the APVratio and %Success permits a validated mathematical inference concerning the overall success of astigmatic surgery.

  14. Check-all-that-apply data analysed by Partial Least Squares regression

    DEFF Research Database (Denmark)

    Rinnan, Åsmund; Giacalone, Davide; Frøst, Michael Bom

    2015-01-01

    are analysed by multivariate techniques. CATA data can be analysed both by setting the CATA as the X and the Y. The former is the PLS-Discriminant Analysis (PLS-DA) version, while the latter is the ANOVA-PLS (A-PLS) version. We investigated the difference between these two approaches, concluding...

  15. Differential item functioning (DIF) analyses of health-related quality of life instruments using logistic regression

    DEFF Research Database (Denmark)

    Scott, Neil W; Fayers, Peter M; Aaronson, Neil K

    2010-01-01

    Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application.
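
    A minimal sketch of the logistic regression DIF test: compare a model with the scale score alone against one adding group and score-by-group terms, using a likelihood-ratio test (simulated item data with uniform DIF built in):

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf
      from scipy.stats import chi2

      rng = np.random.default_rng(8)
      n = 1000
      total = rng.normal(0, 1, n)            # scale (rest) score
      group = rng.integers(0, 2, n)          # subgroup indicator, e.g. sex
      logit = 0.8 * total + 0.5 * group      # the group term is the DIF
      item = rng.binomial(1, 1 / (1 + np.exp(-logit)))
      df = pd.DataFrame({"item": item, "total": total, "group": group})

      m0 = smf.logit("item ~ total", df).fit(disp=0)
      m1 = smf.logit("item ~ total + group + total:group", df).fit(disp=0)

      # 2-df likelihood-ratio test for uniform plus non-uniform DIF jointly.
      lr = 2 * (m1.llf - m0.llf)
      print("LR =", round(lr, 2), " p =", chi2.sf(lr, df=2))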

  16. Determinants of Inequality in Cameroon: A Regression-Based ...

    African Journals Online (AJOL)

    This paper applies the regression-based inequality decomposition approach to explore determinants of income inequality in Cameroon using the 2007 Cameroon household consumption survey. The contribution of each source to measured income inequality is the sum of its weighted marginal contributions in all possible ...

  17. Correlation and regression analyses of genetic effects for different types of cells in mammals under radiation and chemical treatment

    International Nuclear Information System (INIS)

    Slutskaya, N.G.; Mosseh, I.B.

    2006-01-01

    Data on genetic mutations under radiation and chemical treatment for different types of cells have been analyzed with correlation and regression analyses. Linear correlations between different genetic effects in sex cells and somatic cells have been found. The results may be extrapolated to the sex cells of humans and other mammals. (authors)

  18. Model-based Quantile Regression for Discrete Data

    KAUST Repository

    Padellini, Tullia

    2018-04-10

    Quantile regression is a class of methods devoted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite the fact that this leads to a proper posterior for the regression coefficients, the resulting posterior variance is however affected by an unidentifiable parameter, hence any inferential procedure beside point estimation is unreliable. We propose a model-based approach for quantile regression that considers quantiles of the generating distribution directly, and thus allows for a proper uncertainty quantification. We then create a link between quantile regression and generalised linear models by mapping the quantiles to the parameter of the response variable, and we exploit it to fit the model with R-INLA. We extend it also to the case of discrete responses, where there is no 1-to-1 relationship between quantiles and the distribution's parameter, by introducing continuous generalisations of the most common discrete variables (Poisson, Binomial and Negative Binomial) to be exploited in the fitting.

  19. Improved Dietary Guidelines for Vitamin D: Application of Individual Participant Data (IPD)-Level Meta-Regression Analyses

    Science.gov (United States)

    Cashman, Kevin D.; Ritz, Christian; Kiely, Mairead

    2017-01-01

    Dietary Reference Values (DRVs) for vitamin D have a key role in the prevention of vitamin D deficiency. However, despite adopting similar risk assessment protocols, estimates from authoritative agencies over the last 6 years have been diverse. This may have arisen from diverse approaches to data analysis. Modelling strategies for pooling of individual subject data from cognate vitamin D randomized controlled trials (RCTs) are likely to provide the most appropriate DRV estimates. Thus, the objective of the present work was to undertake the first-ever individual participant data (IPD)-level meta-regression, which is increasingly recognized as best practice, from seven winter-based RCTs (with 882 participants ranging in age from 4 to 90 years) of the vitamin D intake–serum 25-hydroxyvitamin D (25(OH)D) dose-response. Our IPD-derived estimates of vitamin D intakes required to maintain 97.5% of 25(OH)D concentrations >25, 30, and 50 nmol/L across the population are 10, 13, and 26 µg/day, respectively. In contrast, standard meta-regression analyses with aggregate data (as used by several agencies in recent years) from the same RCTs estimated that a vitamin D intake requirement of 14 µg/day would maintain 97.5% of 25(OH)D >50 nmol/L. These first IPD-derived estimates offer improved dietary recommendations for vitamin D because the underpinning modeling captures the between-person variability in response of serum 25(OH)D to vitamin D intake. PMID:28481259

  20. Correcting for multivariate measurement error by regression calibration in meta-analyses of epidemiological studies

    DEFF Research Database (Denmark)

    Tybjærg-Hansen, Anne

    2009-01-01

    Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements of the risk factors are observed on a subsample. We extend the multivariate RC techniques to a meta-analysis framework where multiple studies provide independent repeat measurements and information on disease outcome. We consider the cases where some or all studies have repeat measurements, and compare study-specific, averaged and empirical Bayes estimates of RC parameters. Additionally, we allow for binary covariates (e.g. smoking status) and for uncertainty and time trends in the measurement error corrections. Our methods are illustrated using a subset of individual participant data from prospective long-term studies...

  1. Longitudinal changes in telomere length and associated genetic parameters in dairy cattle analysed using random regression models.

    Directory of Open Access Journals (Sweden)

    Luise A Seeker

    Telomeres cap the ends of linear chromosomes and shorten with age in many organisms. In humans short telomeres have been linked to morbidity and mortality. With the accumulation of longitudinal datasets the focus shifts from investigating telomere length (TL) to exploring TL change within individuals over time. Some studies indicate that the speed of telomere attrition is predictive of future disease. The objectives of the present study were to 1) characterize the change in bovine relative leukocyte TL (RLTL) across the lifetime in Holstein Friesian dairy cattle, 2) estimate genetic parameters of RLTL over time and 3) investigate the association of differences in individual RLTL profiles with productive lifespan. RLTL measurements were analysed using Legendre polynomials in a random regression model to describe TL profiles and genetic variance over age. The analyses were based on 1,328 repeated RLTL measurements of 308 female Holstein Friesian dairy cattle. A quadratic Legendre polynomial was fitted to the fixed effect of age in months and to the random effect of the animal identity. Changes in RLTL, heritability and within-trait genetic correlation along the age trajectory were calculated and illustrated. At a population level, the relationship between RLTL and age was described by a positive quadratic function. Individuals varied significantly regarding the direction and amount of RLTL change over life. The heritability of RLTL ranged from 0.36 to 0.47 (SE = 0.05-0.08) and remained statistically unchanged over time. The genetic correlation of RLTL at birth with measurements later in life decreased with the time interval between samplings from near unity to 0.69, indicating that TL later in life might be regulated by different genes than TL early in life. Even though animals differed in their RLTL profiles significantly, those differences were not correlated with productive lifespan (p = 0.954).

  2. Correlation, Regression and Path Analyses of Seed Yield Components in Crambe abyssinica, a Promising Industrial Oil Crop

    OpenAIRE

    Huang, Banglian; Yang, Yiming; Luo, Tingting; Wu, S.; Du, Xuezhu; Cai, Detian; Loo, van, E.N.; Huang Bangquan

    2013-01-01

    In the present study correlation, regression and path analyses were carried out to determine correlations among the agronomic traits and their contributions to seed yield per plant in Crambe abyssinica. Partial correlation analysis indicated that plant height (X1) was significantly correlated with branching height and the number of first branches (P <0.01); branching height (X2) was significantly correlated with pod number of primary inflorescence (P <0.01) and number of secondary branch...

  3. Support Vector Regression Model Based on Empirical Mode Decomposition and Auto Regression for Electric Load Forecasting

    Directory of Open Access Journals (Sweden)

    Hong-Juan Li

    2013-04-01

    Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by the strong non-linear learning capability of support vector regression (SVR), this paper presents an SVR model hybridized with the empirical mode decomposition (EMD) method and auto regression (AR) for electric load forecasting. The electric load data of the New South Wales (Australia) market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.

  4. Detecting nonsense for Chinese comments based on logistic regression

    Science.gov (United States)

    Zhuolin, Ren; Guang, Chen; Shu, Chen

    2016-07-01

    To understand cyber citizens' opinions accurately in Chinese news comments, a clear definition of nonsense is presented, and a detection model based on logistic regression (LR) is proposed. The detection of nonsense can be treated as a binary classification problem. In addition to traditional lexical features, we propose three kinds of features in terms of emotion, structure and relevance. With these features, we train an LR model and demonstrate its effectiveness in understanding Chinese news comments. We find that each of the proposed features significantly improves the result. In our experiments, we achieve a prediction accuracy of 84.3%, which improves on the baseline of 77.3% by 7 percentage points.
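
    The paper's emotion, structure and relevance features are not public; as a baseline sketch, an LR classifier over purely lexical features can be assembled with scikit-learn (toy English stand-ins below; real use would need a labelled Chinese corpus and a tokenizer such as jieba):

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LogisticRegression
      from sklearn.pipeline import make_pipeline

      comments = ["good point, the policy helps farmers",
                  "first!!!", "asdf asdf asdf",
                  "I disagree because costs will rise",
                  "hahaha", "the data in the article seem outdated"]
      labels = [0, 1, 1, 0, 1, 0]   # 1 = nonsense, 0 = meaningful

      clf = make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(1, 2)),
                          LogisticRegression())
      clf.fit(comments, labels)
      print(clf.predict(["third!!!", "the tax change affects exports"]))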

  5. Model-based Quantile Regression for Discrete Data

    KAUST Repository

    Padellini, Tullia; Rue, Haavard

    2018-01-01

    Quantile regression is a class of methods voted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite

  6. Econometric analysis of realized covariation: high frequency based covariance, regression, and correlation in financial economics

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Shephard, N.

    2004-01-01

    This paper analyses multivariate high frequency financial data using realized covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis, and covariance. It will be based on a fixed interval of time (e.g., a day or week), allowing...... the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions, and covariances change through time. In particular we provide confidence intervals for each of these quantities....
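
    The core estimator is short: over a fixed interval, the realized covariance matrix is the sum of outer products of the high-frequency return vectors, from which realized regressions and correlations follow (simulated 5-minute returns below):

      import numpy as np

      # r: one day of 5-minute returns for d = 2 assets, shape (m, d).
      rng = np.random.default_rng(9)
      m = 288
      common = rng.normal(0, 1e-3, m)
      r = np.column_stack([common + rng.normal(0, 1e-3, m),
                           0.5 * common + rng.normal(0, 1e-3, m)])

      rc = r.T @ r                    # realized covariance matrix
      beta = rc[0, 1] / rc[0, 0]      # realized regression of asset 2 on asset 1
      corr = rc[0, 1] / np.sqrt(rc[0, 0] * rc[1, 1])   # realized correlation
      print(rc, beta, corr)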

  7. Gradient descent for robust kernel-based regression

    Science.gov (United States)

    Guo, Zheng-Chu; Hu, Ting; Shi, Lei

    2018-06-01

    In this paper, we study the gradient descent algorithm generated by a robust loss function over a reproducing kernel Hilbert space (RKHS). The loss function is defined by a windowing function G and a scale parameter σ, which can include a wide range of commonly used robust losses for regression. There is still a gap between the theoretical analysis and the optimization process of empirical risk minimization based on this loss: the estimator needs to be globally optimal in the theoretical analysis, while the optimization method cannot ensure the global optimality of its solutions. In this paper, we aim to fill this gap by developing a novel theoretical analysis of the performance of estimators generated by the gradient descent algorithm. We demonstrate that with an appropriately chosen scale parameter σ, the gradient update with early stopping rules can approximate the regression function. Our error analysis leads to convergence in the standard L2 norm and the strong RKHS norm, both of which are optimal in the minimax sense. We show that the scale parameter σ plays an important role in providing robustness as well as fast convergence. Numerical experiments on synthetic examples and a real data set also support our theoretical results.
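
    A minimal sketch of such an estimator: Gaussian-kernel regression trained by (functional) gradient descent under a Welsch-type loss with windowing function G(u) = exp(-u), so that large residuals are downweighted; the iteration count acts as the early-stopping rule (all tuning values illustrative):

      import numpy as np

      def gaussian_kernel(X, Y, h=0.5):
          d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
          return np.exp(-d2 / (2 * h**2))

      rng = np.random.default_rng(10)
      X = rng.uniform(-2, 2, size=(200, 1))
      y = np.sin(2 * X[:, 0]) + rng.normal(0, 0.1, 200)
      y[rng.choice(200, 15, replace=False)] += rng.normal(0, 3, 15)  # outliers

      K = gaussian_kernel(X, X)
      alpha, eta, sigma, T = np.zeros(200), 0.5, 1.0, 300
      for _ in range(T):                      # T plays the early-stopping role
          r = y - K @ alpha
          w = np.exp(-(r / sigma) ** 2)       # robust weights from G
          alpha += eta * (w * r) / len(y)     # functional gradient step

      f_hat = K @ alpha                       # fitted values, robust to outliers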

  8. Image Jacobian Matrix Estimation Based on Online Support Vector Regression

    Directory of Open Access Journals (Sweden)

    Shangqin Mao

    2012-10-01

    Research into robotics visual servoing is an important area in the field of robotics. It has proven difficult to achieve successful results for machine vision and robotics in unstructured environments without using any a priori camera or kinematic models. In uncalibrated visual servoing, image Jacobian matrix estimation methods can be divided into two groups: the online method and the offline method. The offline method is not appropriate for most natural environments. The online method is robust but rough. Moreover, if the image feature configuration changes, it needs to restart the approximating procedure. A novel approach based on an online support vector regression (OL-SVR) algorithm is proposed which overcomes these drawbacks and combines the virtues just mentioned.

  9. Analyses of non-fatal accidents in an opencast mine by logistic regression model - a case study.

    Science.gov (United States)

    Onder, Seyhan; Mutlu, Mert

    2017-09-01

    Accidents cause major damage for both workers and enterprises in the mining industry. To reduce the number of occupational accidents, these incidents should be properly registered and carefully analysed. This study examines occupational accident records from the opencast coal mine of the Aegean Lignite Enterprise (ELI) of Turkish Coal Enterprises (TKI) in Soma, collected between 2006 and 2011, for statistical analyses. A total of 231 occupational accidents were analysed for this study. The accident records were categorized into seven groups: area, reason, occupation, part of body, age, shift hour and lost days. The SPSS package program was used in this study for logistic regression analyses, which predicted the probability of accidents resulting in greater or less than 3 lost workdays for non-fatal injuries. Social facilities-area of surface installations, workshops and opencast mining areas are the areas with the highest probability for accidents with greater than 3 lost workdays for non-fatal injuries, while the reasons with the highest probability for these types of accidents are transporting and manual handling. Additionally, the model was tested on accidents reported in 2012 for the ELI in Soma and correctly estimated the probability of exposure to accidents with lost workdays 70% of the time.

  10. A Novel Multiobjective Evolutionary Algorithm Based on Regression Analysis

    Directory of Open Access Journals (Sweden)

    Zhiming Song

    2015-01-01

    As is known, the Pareto set of a continuous multiobjective optimization problem with m objective functions is a piecewise continuous (m-1)-dimensional manifold in the decision space under some mild conditions. However, how to utilize this regularity to design multiobjective optimization algorithms has become a research focus. In this paper, based on this regularity, a model-based multiobjective evolutionary algorithm with regression analysis (MMEA-RA) is put forward to solve continuous multiobjective optimization problems with variable linkages. In the algorithm, the optimization problem is modelled as a promising area in the decision space by a probability distribution, and the centroid of the probability distribution is an (m-1)-dimensional piecewise continuous manifold. The least squares method is used to construct such a model. A selection strategy based on nondominated sorting is used to choose the individuals for the next generation. The new algorithm is tested and compared with NSGA-II and RM-MEDA. The results show that MMEA-RA outperforms RM-MEDA and NSGA-II on the test instances with variable linkages. At the same time, MMEA-RA has higher efficiency than the other two algorithms. A few shortcomings of MMEA-RA have also been identified and discussed in this paper.

  11. Ordinary Least Squares and Quantile Regression: An Inquiry-Based Learning Approach to a Comparison of Regression Methods

    Science.gov (United States)

    Helmreich, James E.; Krog, K. Peter

    2018-01-01

    We present a short, inquiry-based learning course on concepts and methods underlying ordinary least squares (OLS), least absolute deviation (LAD), and quantile regression (QR). Students investigate squared, absolute, and weighted absolute distance functions (metrics) as location measures. Using differential calculus and properties of convex…

  12. Automatic incrementalization of Prolog based static analyses

    DEFF Research Database (Denmark)

    Eichberg, Michael; Kahl, Matthias; Saha, Diptikalyan

    2007-01-01

    Modern development environments integrate various static analyses into the build process. Analyses that analyze the whole project whenever the project changes are impractical in this context. We present an approach to automatic incrementalization of analyses that are specified as tabled logic programs.

  13. A regression-based Kansei engineering system based on form feature lines for product form design

    Directory of Open Access Journals (Sweden)

    Yan Xiong

    2016-06-01

    When developing new products, it is important for a designer to understand users' perceptions and develop the product form accordingly. In order to establish the mapping between users' perceptions and product design features effectively, in this study we present a regression-based Kansei engineering system based on form feature lines for product form design. First, according to the characteristics of design concept representation, product form features (product form feature lines) were defined. Second, Kansei words were chosen to describe image perceptions toward product samples. Then, multiple linear regression and support vector regression were each used to construct models that predict users' image perceptions. Using mobile phones as experimental samples, Kansei prediction models were established based on the front-view form feature lines of the samples. The experimental results showed that both prediction models had good adaptability, but the predictive performance of the support vector regression model was better, making support vector regression more suitable for form regression prediction. The results of the case showed that the proposed method provides an effective means for designers to manipulate product features as a whole, and that it can optimize the Kansei model and improve practical values.

  14. On-line mixture-based alternative to logistic regression

    Czech Academy of Sciences Publication Activity Database

    Nagy, Ivan; Suzdaleva, Evgenia

    2016-01-01

    Roč. 26, č. 5 (2016), s. 417-437 ISSN 1210-0552 R&D Projects: GA ČR GA15-03564S Institutional support: RVO:67985556 Keywords : on-line modeling * on-line logistic regression * recursive mixture estimation * data dependent pointer Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.394, year: 2016 http://library.utia.cas.cz/separaty/2016/ZS/suzdaleva-0464463.pdf

  15. Automatic incrementalization of Prolog based static analyses

    DEFF Research Database (Denmark)

    Eichberg, Michael; Kahl, Matthias; Saha, Diptikalyan

    2007-01-01

    Modern development environments integrate various static analyses into the build process. Analyses that analyze the whole project whenever the project changes are impractical in this context. We present an approach to automatic incrementalization of analyses that are specified as tabled logic programs and evaluated using incremental tabled evaluation, a technique for efficiently updating memo tables in response to changes in facts and rules. The approach has been implemented and integrated into the Eclipse IDE. Our measurements show that this technique is effective for automatically...

  16. Fault trend prediction of device based on support vector regression

    International Nuclear Information System (INIS)

    Song Meicun; Cai Qi

    2011-01-01

    The state of research on fault trend prediction and the basic theory of support vector regression (SVR) are introduced. SVR was applied to the fault trend prediction of roller bearings and compared with other methods (BP neural network, gray model, and gray-AR model). The results show that the BP network tends to overfit and become trapped in local minima, so its predictions are unstable, whereas the predictions of SVR are stable, and SVR is superior to the BP neural network, the gray model, and the gray-AR model in predictive precision. SVR is an effective method for fault trend prediction. (authors)

  17. Short-term electricity prices forecasting based on support vector regression and Auto-regressive integrated moving average modeling

    International Nuclear Information System (INIS)

    Che Jinxing; Wang Jianzhou

    2010-01-01

    In this paper, we present the use of different mathematical models to forecast electricity prices in deregulated power markets. A successful electricity price prediction tool can help both power producers and consumers plan their bidding strategies. Inspired by the fact that the support vector regression (SVR) model, with its ε-insensitive loss function, tolerates residuals within the boundaries of the ε-tube, we propose a hybrid model, called SVRARIMA, that combines SVR and Auto-regressive integrated moving average (ARIMA) models to take advantage of their unique strengths in nonlinear and linear modeling, respectively. A nonlinear analysis of the time series indicates the suitability of nonlinear modeling, so SVR is applied to capture the nonlinear patterns, while ARIMA models, which have been successfully applied to residual regression estimation problems, model the remaining linear structure. The experimental results demonstrate that the proposed model outperforms existing neural-network approaches, traditional ARIMA models, and other hybrid models in terms of root mean square error and mean absolute percentage error.
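
    A minimal sketch of a hybrid of this kind, assuming one plausible decomposition: SVR fitted on lagged values for the nonlinear part, and ARIMA fitted on the SVR residuals for the linear part. The lag length, model orders and data are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.svm import SVR
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
t = np.arange(400)
price = 50 + 10 * np.sin(2 * np.pi * t / 24) ** 3 + rng.normal(0, 1, 400)

# Lag-embed the series: predict price[t] from the previous 24 observations.
lags = 24
X = np.column_stack([price[i:len(price) - lags + i] for i in range(lags)])
y = price[lags:]
X_train, X_test, y_train, y_test = X[:-48], X[-48:], y[:-48], y[-48:]

svr = SVR(kernel="rbf", C=100.0, epsilon=0.1).fit(X_train, y_train)
resid = y_train - svr.predict(X_train)

# ARIMA on the SVR residuals picks up the remaining linear structure.
arima = ARIMA(resid, order=(2, 0, 1)).fit()
resid_fc = arima.forecast(steps=len(y_test))

hybrid_fc = svr.predict(X_test) + resid_fc
rmse = np.sqrt(np.mean((hybrid_fc - y_test) ** 2))
print(f"hybrid RMSE: {rmse:.3f}")
```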

  18. Determinants of Birthweight Outcomes: Quantile Regressions Based on Panel Data

    DEFF Research Database (Denmark)

    Bache, Stefan Holst; Dahl, Christian Møller; Kristensen, Johannes Tang

    Low birthweight outcomes are associated with large social and economic costs, and therefore the possible determinants of low birthweight are of great interest. One such determinant which has received considerable attention is maternal smoking. From an economic perspective this is in part due to the possibility that smoking habits can be influenced through policy conduct. It is widely believed that maternal smoking reduces birthweight; however, the crucial difficulty in estimating such effects is the unobserved heterogeneity among mothers. We consider extensions of three panel data models to a quantile regression framework in order to control for heterogeneity and to infer conclusions about causality across the entire birthweight distribution. We obtain estimation results for maternal smoking and other interesting determinants, applying these to data obtained from Aarhus University Hospital, Skejby...

  19. Forecast Model of Urban Stagnant Water Based on Logistic Regression

    Directory of Open Access Journals (Sweden)

    Liu Pan

    2017-01-01

    Full Text Available With the development of information technology, the construction of water resource systems has been gradually carried out. In the era of big data, work on water information needs to turn quantitative data into qualitative insight. Analyzing correlations in the data and exploring its deeper value are the keys to research on water information. Building on research into water big data and the traditional data warehouse architecture, we try to find the connections between different data sources. Based on the temporal and spatial correlation of stagnant water and rainfall, we use spatial interpolation to integrate stagnant-water and rainfall data that come from different sources and different sensors, and then use logistic regression to find the relationship between them.

  20. Predicting Taxi-Out Time at Congested Airports with Optimization-Based Support Vector Regression Methods

    Directory of Open Access Journals (Sweden)

    Guan Lian

    2018-01-01

    Full Text Available Accurate prediction of taxi-out time is a significant precondition for improving the operational efficiency of the departure process at an airport, as well as for reducing long taxi-out times, congestion, and excessive greenhouse gas emissions. Unfortunately, several of the traditional methods of predicting taxi-out time perform unsatisfactorily at congested airports. This paper describes and tests three such conventional methods (a Generalized Linear Model, a Softmax Regression Model, and an Artificial Neural Network) and two improved Support Vector Regression (SVR) approaches based on swarm intelligence optimization: Particle Swarm Optimization (PSO) and the Firefly Algorithm. In order to improve the global searching ability of the Firefly Algorithm, an adaptive step factor and Lévy flight are implemented simultaneously when updating the location function. Six factors are analysed, of which delay is identified as a significant factor at congested airports. Through a series of dynamic analyses, a case study of Beijing International Airport (PEK) is tested with historical data. The performance measures show that the two proposed SVR approaches, especially the Improved Firefly Algorithm (IFA) optimization-based SVR method, not only achieve the best fit and accuracy among the representative forecast models, but also achieve better predictive performance when dealing with abnormal taxi-out time states.
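
    The tuning idea, searching the SVR hyperparameter space for the configuration that minimizes prediction error, can be sketched without the swarm optimizers; here plain random search stands in for PSO/Firefly, and the features and data are invented for illustration.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.svm import SVR
from sklearn.model_selection import RandomizedSearchCV

rng = np.random.default_rng(0)
# Hypothetical features: queue length, runway load, delay, weather index, ...
X = rng.uniform(0, 1, (300, 6))
y = 8 + 20 * X[:, 0] * X[:, 2] + 5 * X[:, 4] + rng.normal(0, 1, 300)  # taxi-out minutes

# Random search over C, gamma and epsilon as a stand-in for PSO/Firefly.
search = RandomizedSearchCV(
    SVR(kernel="rbf"),
    param_distributions={
        "C": loguniform(1e0, 1e3),
        "gamma": loguniform(1e-3, 1e1),
        "epsilon": loguniform(1e-2, 1e0),
    },
    n_iter=40,
    cv=5,
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, -search.best_score_)
```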

  1. Clinical evaluation of a novel population-based regression analysis for detecting glaucomatous visual field progression.

    Science.gov (United States)

    Kovalska, M P; Bürki, E; Schoetzau, A; Orguel, S F; Orguel, S; Grieshaber, M C

    2011-04-01

    The distinction of real progression from test variability in visual field (VF) series may be based on clinical judgment, on trend analysis of test parameters over time, or on identification of a significant change relative to the mean of baseline exams (event analysis). The aim of this study was to compare a new population-based method (Octopus field analysis, OFA) with classic regression analyses and clinical judgment for detecting glaucomatous VF changes. 240 VF series of 240 patients with at least 9 consecutive examinations available were included in this study. They were independently classified by two experienced investigators. The results of this classification served as the reference for the following statistical tests: (a) global t-test, (b) global r-test, (c) regression analysis of 10 VF clusters and (d) point-wise linear regression analysis. 32.5 % of the VF series were classified as progressive by the investigators. The sensitivity and specificity were 89.7 % and 92.0 % for the r-test, and 73.1 % and 93.8 % for the t-test, respectively. In the point-wise linear regression analysis, the specificity was comparable (89.5 % versus 92 %), but the sensitivity was clearly lower than in the r-test (22.4 % versus 89.7 %) at a significance level of p = 0.01. A regression analysis for the 10 VF clusters showed a markedly higher sensitivity for the r-test (37.7 %) than the t-test (14.1 %) at a similar specificity (88.3 % versus 93.8 %) for a significant trend (p = 0.005). In regard to the cluster distribution, the paracentral clusters and the superior nasal hemifield progressed most frequently. The population-based regression analysis seems to be superior to the trend analysis in detecting VF progression in glaucoma, and may eliminate the drawbacks of the event analysis. Further, it may assist the clinician in the evaluation of VF series and may allow better visualization of the correlation between function and structure owing to VF
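
    Point-wise linear regression of the kind used as a comparator above can be sketched directly: regress each test point's sensitivity on exam number and flag points with a significantly negative slope. A minimal sketch on synthetic data; the threshold and data are illustrative.

```python
import numpy as np
from scipy.stats import linregress

def pointwise_progression(series, alpha=0.01):
    """series: (n_exams, n_points) array of VF sensitivities (dB).
    A test point 'progresses' if its sensitivity declines over time,
    i.e. the regression slope is significantly below zero."""
    n_exams, n_points = series.shape
    exams = np.arange(n_exams)
    flags = np.zeros(n_points, dtype=bool)
    for j in range(n_points):
        fit = linregress(exams, series[:, j])
        # linregress reports a two-sided p-value; halve it for a one-sided test.
        flags[j] = (fit.slope < 0) and (fit.pvalue / 2 < alpha)
    return flags

rng = np.random.default_rng(0)
vf = rng.normal(28, 1.5, (9, 54))            # 9 exams, 54 test points
vf[:, :5] -= 0.8 * np.arange(9)[:, None]     # 5 points lose ~0.8 dB per exam
print(pointwise_progression(vf).sum(), "of 54 points flagged")
```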

  2. SPECIFICS OF THE APPLICATIONS OF MULTIPLE REGRESSION MODEL IN THE ANALYSES OF THE EFFECTS OF GLOBAL FINANCIAL CRISES

    Directory of Open Access Journals (Sweden)

    Željko V. Račić

    2010-12-01

    Full Text Available This paper presents the specifics of applying the multiple linear regression model. The economic (financial) crisis is analyzed in terms of gross domestic product, which is modelled as a function of the foreign trade balance on the one hand and of credit-card indebtedness of the population on the other, in the USA from 1999 to 2008. We used the extended application model, which shows how the analyst should run the whole development process of a regression model. This process begins with simple statistical features and the application of regression procedures, and ends with residual analysis, intended to study the compatibility of the data and the model settings. This paper also analyzes the values of some standard statistics used in the selection of an appropriate regression model. Testing of the model is carried out with the Statistics PASW 17 program.

  3. Analysing organic transistors based on interface approximation

    International Nuclear Information System (INIS)

    Akiyama, Yuto; Mori, Takehiko

    2014-01-01

    Temperature-dependent characteristics of organic transistors are analysed thoroughly using interface approximation. In contrast to amorphous silicon transistors, it is characteristic of organic transistors that the accumulation layer is concentrated on the first monolayer, and it is appropriate to consider interface charge rather than band bending. On the basis of this model, observed characteristics of hexamethylenetetrathiafulvalene (HMTTF) and dibenzotetrathiafulvalene (DBTTF) transistors with various surface treatments are analysed, and the trap distribution is extracted. In turn, starting from a simple exponential distribution, we can reproduce the temperature-dependent transistor characteristics as well as the gate voltage dependence of the activation energy, so we can investigate various aspects of organic transistors self-consistently under the interface approximation. Small deviation from such an ideal transistor operation is discussed assuming the presence of an energetically discrete trap level, which leads to a hump in the transfer characteristics. The contact resistance is estimated by measuring the transfer characteristics up to the linear region

  4. Regularized Regression and Density Estimation based on Optimal Transport

    KAUST Repository

    Burger, M.

    2012-03-11

    The aim of this paper is to investigate a novel nonparametric approach for estimating and smoothing density functions as well as probability densities from discrete samples based on a variational regularization method with the Wasserstein metric as a data fidelity. The approach allows a unified treatment of discrete and continuous probability measures and is hence attractive for various tasks. In particular, the variational model for special regularization functionals yields a natural method for estimating densities and for preserving edges in the case of total variation regularization. In order to compute solutions of the variational problems, a regularized optimal transport problem needs to be solved, for which we discuss several formulations and provide a detailed analysis. Moreover, we compute special self-similar solutions for standard regularization functionals and we discuss several computational approaches and results. © 2012 The Author(s).

  5. Theoretical analyses of superconductivity in iron based ...

    African Journals Online (AJOL)

    This paper focuses on the theoretical analysis of superconductivity in the iron-based superconductor Ba1−xKxFe2As2. After reviewing the current findings on this system, we suggest that a combined phonon-exciton mechanism gives the right order of magnitude for the superconducting transition temperature (TC) of Ba1−xKxFe2As2. By developing ...

  6. Using synthetic data to evaluate multiple regression and principal component analyses for statistical modeling of daily building energy consumption

    Energy Technology Data Exchange (ETDEWEB)

    Reddy, T.A. (Energy Systems Lab., Texas A and M Univ., College Station, TX (United States)); Claridge, D.E. (Energy Systems Lab., Texas A and M Univ., College Station, TX (United States))

    1994-01-01

    Multiple regression modeling of monitored building energy use data is often faulted as a means of predicting energy use on the grounds that multicollinearity between the regressor variables can lead both to improper interpretation of the relative importance of the various physical regressor parameters and to a model with unstable regressor coefficients. Principal component analysis (PCA) has the potential to overcome such drawbacks. While a few case studies have already attempted to apply this technique to building energy data, the objectives of this study were to make a broader evaluation of PCA and multiple regression analysis (MRA) and to establish guidelines under which one approach is preferable to the other. Four geographic locations in the US with different climatic conditions were selected, and synthetic data sequences representative of daily energy use in large institutional buildings were generated in each location using a linear model with outdoor temperature, outdoor specific humidity and solar radiation as the three regression variables. The MRA and PCA approaches were then applied to these data sets and their relative performances were compared. Conditions under which PCA seems to perform better than MRA were identified and preliminary recommendations on the use of either modeling approach were formulated. (orig.)
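
    The comparison can be sketched with scikit-learn: ordinary multiple regression fitted on collinear regressors versus principal component regression, which regresses on a reduced set of orthogonal components. All data below are synthetic stand-ins for the paper's generated sequences.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n = 365
temp = rng.normal(20, 8, n)
humidity = 0.6 * temp + rng.normal(0, 1, n)   # strongly collinear with temperature
solar = rng.normal(500, 150, n)
X = np.column_stack([temp, humidity, solar])
energy = 120 + 4 * temp + 10 * humidity + 0.05 * solar + rng.normal(0, 5, n)

mra = LinearRegression().fit(X, energy)
# Principal component regression: project onto 2 orthogonal components first.
pcr = make_pipeline(PCA(n_components=2), LinearRegression()).fit(X, energy)

print("MRA coefficients:", mra.coef_.round(2))   # individually unstable under collinearity
print("MRA R^2:", round(mra.score(X, energy), 3))
print("PCR R^2:", round(pcr.score(X, energy), 3))
```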

  7. Parameter Selection Method for Support Vector Regression Based on Adaptive Fusion of the Mixed Kernel Function

    Directory of Open Access Journals (Sweden)

    Hailun Wang

    2017-01-01

    Full Text Available The support vector regression algorithm is widely used in the fault diagnosis of rolling bearings. A new model parameter selection method for support vector regression, based on adaptive fusion of the mixed kernel function, is proposed in this paper. We choose the mixed kernel function as the kernel function of support vector regression. The fusion coefficients of the mixed kernel function, the kernel function parameters, and the regression parameters are combined into the state vector, so that the model selection problem is transformed into a nonlinear system state estimation problem. We use a 5th-degree cubature Kalman filter to estimate these parameters. In this way, we realize the adaptive selection of the mixed kernel weighting coefficients, the kernel parameters, and the regression parameters. Compared with a single kernel function, with unscented Kalman filter (UKF) support vector regression algorithms, and with genetic algorithms, the decision regression function obtained by the proposed method has better generalization ability and higher prediction accuracy.

  8. Exploring reasons for the observed inconsistent trial reports on intra-articular injections with hyaluronic acid in the treatment of osteoarthritis: Meta-regression analyses of randomized trials.

    Science.gov (United States)

    Johansen, Mette; Bahrt, Henriette; Altman, Roy D; Bartels, Else M; Juhl, Carsten B; Bliddal, Henning; Lund, Hans; Christensen, Robin

    2016-08-01

    The aim was to identify factors explaining inconsistent observations concerning the efficacy of intra-articular hyaluronic acid compared to intra-articular sham/control, or non-intervention control, in patients with symptomatic osteoarthritis, based on randomized clinical trials (RCTs). A systematic review and meta-regression analyses of available randomized trials were conducted. The outcome, pain, was assessed according to a pre-specified hierarchy of potentially available outcomes. Hedges's standardized mean difference [SMD (95% CI)] served as the effect size. Restricted Maximum Likelihood (REML) mixed-effects models were used to combine study results, and heterogeneity was calculated and interpreted as Tau-squared and I-squared, respectively. Overall, 99 studies (14,804 patients) met the inclusion criteria; of these, only 71 studies (72%), including 85 comparisons (11,216 patients), had adequate data available for inclusion in the primary meta-analysis. Overall, compared with placebo, intra-articular hyaluronic acid reduced pain with an effect size of -0.39 [-0.47 to -0.31] in favour of hyaluronic acid. Based on available trial data, intra-articular hyaluronic acid showed a better effect than intra-articular saline on pain reduction in osteoarthritis. Publication bias and the risk of selective outcome reporting suggest only a small clinical effect compared to saline. Copyright © 2016 Elsevier Inc. All rights reserved.
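
    A random-effects pooling step of the kind used above can be sketched in plain numpy; for brevity this uses the DerSimonian-Laird moment estimator rather than the paper's REML, and the effect sizes below are invented for illustration.

```python
import numpy as np

def random_effects_meta(smd, se):
    """DerSimonian-Laird random-effects pooling of standardized mean
    differences (a simpler moment estimator than the REML used above)."""
    smd, se = np.asarray(smd), np.asarray(se)
    w = 1.0 / se**2                                   # fixed-effect weights
    fixed = np.sum(w * smd) / np.sum(w)
    q = np.sum(w * (smd - fixed) ** 2)                # Cochran's Q
    k = len(smd)
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                # between-study variance
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    w_star = 1.0 / (se**2 + tau2)                     # random-effects weights
    pooled = np.sum(w_star * smd) / np.sum(w_star)
    ci = pooled + np.array([-1.96, 1.96]) / np.sqrt(np.sum(w_star))
    return pooled, ci, tau2, i2

smd = [-0.55, -0.30, -0.42, -0.10, -0.38]   # illustrative trial effect sizes
se = [0.12, 0.10, 0.15, 0.09, 0.11]
pooled, ci, tau2, i2 = random_effects_meta(smd, se)
print(f"pooled SMD {pooled:.2f} [{ci[0]:.2f}, {ci[1]:.2f}], tau^2={tau2:.3f}, I^2={i2:.1%}")
```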

  9. Estimation of pyrethroid pesticide intake using regression modeling of food groups based on composite dietary samples

    Data.gov (United States)

    U.S. Environmental Protection Agency — Population-based estimates of pesticide intake are needed to characterize exposure for particular demographic groups based on their dietary behaviors. Regression...

  10. The N400 as a snapshot of interactive processing: evidence from regression analyses of orthographic neighbor and lexical associate effects

    Science.gov (United States)

    Laszlo, Sarah; Federmeier, Kara D.

    2010-01-01

    Linking print with meaning tends to be divided into subprocesses, such as recognition of an input's lexical entry and subsequent access of semantics. However, recent results suggest that the set of semantic features activated by an input is broader than implied by a view wherein access serially follows recognition. EEG was collected from participants who viewed items varying in number and frequency of both orthographic neighbors and lexical associates. Regression analysis of single item ERPs replicated past findings, showing that N400 amplitudes are greater for items with more neighbors, and further revealed that N400 amplitudes increase for items with more lexical associates and with higher frequency neighbors or associates. Together, the data suggest that in the N400 time window semantic features of items broadly related to inputs are active, consistent with models in which semantic access takes place in parallel with stimulus recognition. PMID:20624252

  11. Semiparametric Mixtures of Regressions with Single-index for Model Based Clustering

    OpenAIRE

    Xiang, Sijia; Yao, Weixin

    2017-01-01

    In this article, we propose two classes of semiparametric mixture regression models with single-index for model based clustering. Unlike many semiparametric/nonparametric mixture regression models that can only be applied to low dimensional predictors, the new semiparametric models can easily incorporate high dimensional predictors into the nonparametric components. The proposed models are very general, and many of the recently proposed semiparametric/nonparametric mixture regression models a...

  12. An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression

    Science.gov (United States)

    Weiss, Brandi A.; Dardick, William

    2016-01-01

    This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify…

  13. Modeling the potential risk factors of bovine viral diarrhea prevalence in Egypt using univariable and multivariable logistic regression analyses

    Directory of Open Access Journals (Sweden)

    Abdelfattah M. Selim

    2018-03-01

    Full Text Available Aim: The present cross-sectional study was conducted to determine the seroprevalence of and potential risk factors associated with Bovine viral diarrhea virus (BVDV) disease in cattle and buffaloes in Egypt, to model the potential risk factors associated with the disease using logistic regression (LR) models, and to fit the best predictive model for the current data. Materials and Methods: A total of 740 blood samples were collected from November 2012 to March 2013 from animals aged between 6 months and 3 years. The potential risk factors studied were species, age, sex, and herd location. All serum samples were examined with an indirect ELISA test for antibody detection. Data were analyzed with different statistical approaches such as the Chi-square test, odds ratios (OR), and univariable and multivariable LR models. Results: The results revealed a non-significant association between BVDV seropositivity and all risk factors except species. Seroprevalence was 40% in cattle and 23% in buffaloes. ORs for all categories were close to one, with the highest OR, 2.237, for cattle relative to buffaloes. Likelihood ratio tests showed a significant drop in the -2LL from the univariable to the multivariable LR models. Conclusion: There was evidence of higher seroprevalence of BVDV among cattle compared with buffaloes, with the possibility of infection in different age groups of animals. In addition, the multivariable LR model was shown to provide more information for association and prediction purposes than univariable LR models and Chi-square tests when more than one predictor is available.
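
    The univariable/multivariable LR workflow, fitting a logistic model and reading odds ratios with confidence intervals off the exponentiated coefficients, can be sketched with statsmodels; the data below are simulated stand-ins, not the study's records.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 740
df = pd.DataFrame({
    "cattle": rng.integers(0, 2, n),       # 1 = cattle, 0 = buffalo
    "age_years": rng.uniform(0.5, 3.0, n),
    "male": rng.integers(0, 2, n),
})
# Simulate a species effect with OR ~ 2.2, as in the study's headline result.
logit_p = -1.2 + np.log(2.2) * df["cattle"]
df["seropositive"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

X = sm.add_constant(df[["cattle", "age_years", "male"]])
fit = sm.Logit(df["seropositive"], X).fit(disp=0)

# Exponentiated coefficients are odds ratios; exponentiate the CI likewise.
or_table = pd.DataFrame({
    "OR": np.exp(fit.params),
    "CI 2.5%": np.exp(fit.conf_int()[0]),
    "CI 97.5%": np.exp(fit.conf_int()[1]),
})
print(or_table.round(3))
```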

  14. Structural vascular disease in Africans: performance of ethnic-specific waist circumference cut points using logistic regression and neural network analyses: the SABPA study

    OpenAIRE

    Botha, J.; De Ridder, J.H.; Potgieter, J.C.; Steyn, H.S.; Malan, L.

    2013-01-01

    A recently proposed model for waist circumference cut points (RPWC), driven by increased blood pressure, was demonstrated in an African population. We therefore aimed to validate the RPWC by comparing the RPWC and the Joint Statement Consensus (JSC) models via Logistic Regression (LR) and Neural Networks (NN) analyses. Urban African gender groups (N=171) were stratified according to the JSC and RPWC cut point models. Ultrasound carotid intima media thickness (CIMT), blood pressure (BP) and fa...

  15. Bootstrap-based procedures for inference in nonparametric receiver-operating characteristic curve regression analysis.

    Science.gov (United States)

    Rodríguez-Álvarez, María Xosé; Roca-Pardiñas, Javier; Cadarso-Suárez, Carmen; Tahoces, Pablo G

    2018-03-01

    Prior to using a diagnostic test in a routine clinical setting, the rigorous evaluation of its diagnostic accuracy is essential. The receiver-operating characteristic curve is the measure of accuracy most widely used for continuous diagnostic tests. However, the possible impact of extra information about the patient (or even the environment) on diagnostic accuracy also needs to be assessed. In this paper, we focus on an estimator for the covariate-specific receiver-operating characteristic curve based on direct regression modelling and nonparametric smoothing techniques. This approach defines the class of generalised additive models for the receiver-operating characteristic curve. The main aim of the paper is to offer new inferential procedures for testing the effect of covariates on the conditional receiver-operating characteristic curve within the above-mentioned class. Specifically, two different bootstrap-based tests are suggested to check (a) the possible effect of continuous covariates on the receiver-operating characteristic curve and (b) the presence of factor-by-curve interaction terms. The validity of the proposed bootstrap-based procedures is supported by simulations. To facilitate the application of these new procedures in practice, an R-package, known as npROCRegression, is provided and briefly described. Finally, data derived from a computer-aided diagnostic system for the automatic detection of tumour masses in breast cancer is analysed.

  16. Noninvasive spectral imaging of skin chromophores based on multiple regression analysis aided by Monte Carlo simulation

    Science.gov (United States)

    Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa

    2011-08-01

    In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580 and 600 nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation of diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments with skin of the human hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.

  17. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing.

    Science.gov (United States)

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-02-01

    A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
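
    Both statistics are short enough to sketch directly: Bland-Altman reports the mean difference and its limits of agreement, and Deming regression estimates a slope (proportional error) and intercept (constant error) while allowing for error in both assays. A minimal sketch on synthetic paired measurements, assuming an error-variance ratio of 1.

```python
import numpy as np

def bland_altman(x, y):
    """Mean bias and 95% limits of agreement between two assays."""
    diff = y - x
    bias, sd = diff.mean(), diff.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

def deming(x, y, lam=1.0):
    """Deming regression; lam is the ratio of the two error variances."""
    mx, my = x.mean(), y.mean()
    sxx = np.sum((x - mx) ** 2)
    syy = np.sum((y - my) ** 2)
    sxy = np.sum((x - mx) * (y - my))
    slope = (syy - lam * sxx + np.sqrt((syy - lam * sxx) ** 2 + 4 * lam * sxy**2)) / (2 * sxy)
    return my - slope * mx, slope    # intercept, slope

rng = np.random.default_rng(0)
truth = rng.uniform(0.05, 0.60, 40)                      # e.g. variant allele fractions
assay_a = truth + rng.normal(0, 0.02, 40)
assay_b = 0.01 + 1.05 * truth + rng.normal(0, 0.02, 40)  # constant + proportional error

bias, loa = bland_altman(assay_a, assay_b)
intercept, slope = deming(assay_a, assay_b)
print(f"bias={bias:.3f}, LoA={loa[0]:.3f}..{loa[1]:.3f}")
print(f"Deming: intercept={intercept:.3f} (constant), slope={slope:.3f} (proportional)")
```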

  18. Classification and regression tree (CART) analyses of genomic signatures reveal sets of tetramers that discriminate temperature optima of archaea and bacteria

    Science.gov (United States)

    Dyer, Betsey D.; Kahn, Michael J.; LeBlanc, Mark D.

    2008-01-01

    Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results. PMID:19054742
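
    A CART of this kind is a one-liner with scikit-learn; the tetramer frequencies below are simulated to mimic the GAGA/AGGA separation the paper reports, and the class rule is a toy assumption.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 195
# Hypothetical genomic signatures: relative frequencies of two tetramers.
gaga = np.where(rng.random(n) < 0.5, rng.normal(0.8, 0.1, n), rng.normal(1.3, 0.1, n))
agga = gaga * 0.9 + rng.normal(0, 0.1, n)
X = np.column_stack([gaga, agga])
y = (gaga > 1.05).astype(int)        # 1 = thermophile, 0 = mesophile (toy rule)

cart = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(cart, feature_names=["GAGA", "AGGA"]))
```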

  19. Graph Regularized Meta-path Based Transductive Regression in Heterogeneous Information Network.

    Science.gov (United States)

    Wan, Mengting; Ouyang, Yunbo; Kaplan, Lance; Han, Jiawei

    2015-01-01

    A number of real-world networks are heterogeneous information networks, which are composed of different types of nodes and links. Numerical prediction in heterogeneous information networks is a challenging but significant area because network-based information for unlabeled objects is usually limited, making precise estimation difficult. In this paper, we consider a graph regularized meta-path based transductive regression model (Grempt), which combines the principal philosophies of typical graph-based transductive classification methods and transductive regression models designed for homogeneous networks. The computation of our method is time and space efficient, and the precision of our model is verified by numerical experiments.
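
    The homogeneous-network core of such a model can be sketched in closed form: minimize the squared error on labeled nodes plus a graph-Laplacian smoothness penalty. This omits Grempt's meta-path construction and shows only the generic transductive-regression step; the function name and toy graph are assumptions.

```python
import numpy as np

def transductive_regression(W, y, labeled, lam=1.0):
    """Predict values for all nodes of a graph from a few labeled ones by
    minimizing ||y_l - f_l||^2 + lam * f' L f, where L is the graph
    Laplacian. Closed form: f = (S + lam * L)^{-1} S y."""
    L = np.diag(W.sum(axis=1)) - W            # unnormalized Laplacian
    S = np.diag(labeled.astype(float))        # selector of labeled nodes
    y_full = np.where(labeled, y, 0.0)
    return np.linalg.solve(S + lam * L, S @ y_full)

# Tiny chain graph: node values should interpolate smoothly between labels.
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
labeled = np.array([True, False, False, False, True])
y = np.array([0.0, 0, 0, 0, 4.0])
print(transductive_regression(W, y, labeled, lam=0.5).round(2))
```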

  20. Comparative analysis of neural network and regression based condition monitoring approaches for wind turbine fault detection

    DEFF Research Database (Denmark)

    Schlechtingen, Meik; Santos, Ilmar

    2011-01-01

    This paper presents the research results of a comparison of three different model based approaches for wind turbine fault detection in online SCADA data, by applying developed models to five real measured faults and anomalies. The regression based model as the simplest approach to build a normal...

  1. Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the I2 statistic.

    Science.gov (United States)

    Bowden, Jack; Del Greco M, Fabiola; Minelli, Cosetta; Davey Smith, George; Sheehan, Nuala A; Thompson, John R

    2016-12-01

    MR-Egger regression has recently been proposed as a method for Mendelian randomization (MR) analyses incorporating summary data estimates of causal effect from multiple individual variants, which is robust to invalid instruments. It can be used to test for directional pleiotropy and provides an estimate of the causal effect adjusted for its presence. MR-Egger regression provides a useful additional sensitivity analysis to the standard inverse variance weighted (IVW) approach that assumes all variants are valid instruments. Both methods use weights that consider the single nucleotide polymorphism (SNP)-exposure associations to be known, rather than estimated. We call this the 'NO Measurement Error' (NOME) assumption. Causal effect estimates from the IVW approach exhibit weak instrument bias whenever the genetic variants utilized violate the NOME assumption, which can be reliably measured using the F-statistic. The effect of NOME violation on MR-Egger regression has yet to be studied. An adaptation of the I² statistic from the field of meta-analysis is proposed to quantify the strength of NOME violation for MR-Egger. It lies between 0 and 1, and indicates the expected relative bias (or dilution) of the MR-Egger causal estimate in the two-sample MR context. We call it I²GX. The method of simulation extrapolation is also explored to counteract the dilution. Their joint utility is evaluated using simulated data and applied to a real MR example. In simulated two-sample MR analyses we show that, when a causal effect exists, the MR-Egger estimate of causal effect is biased towards the null when NOME is violated, and the stronger the violation (as indicated by lower values of I²GX), the stronger the dilution. When additionally all genetic variants are valid instruments, the type I error rate of the MR-Egger test for pleiotropy is inflated and the causal effect underestimated. Simulation extrapolation is shown to substantially mitigate these adverse effects.
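
    Both quantities are straightforward to compute from summary statistics: MR-Egger is a weighted regression of SNP-outcome on SNP-exposure estimates with an intercept, and I²GX is an I²-type statistic computed on the SNP-exposure estimates. A sketch on simulated summary data; the paper's correction uses simulation extrapolation, and dividing by I²GX below is only a crude illustration of the expected dilution.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
k = 30
bx = rng.normal(0.08, 0.02, k)                   # SNP-exposure associations
se_bx = np.full(k, 0.01)
beta_causal = 0.4
by = beta_causal * bx + rng.normal(0, 0.02, k)   # SNP-outcome associations
se_by = np.full(k, 0.02)

# MR-Egger: weighted regression of by on bx WITH an intercept.
# A non-zero intercept indicates directional pleiotropy.
egger = sm.WLS(by, sm.add_constant(bx), weights=1 / se_by**2).fit()
intercept, slope = egger.params

# I^2_GX: expected relative dilution of the Egger slope under NOME violation,
# computed from the heterogeneity of the SNP-exposure estimates.
w = 1 / se_bx**2
bx_bar = np.sum(w * bx) / np.sum(w)
Q = np.sum(w * (bx - bx_bar) ** 2)
I2_GX = max(0.0, (Q - (k - 1)) / Q)

print(f"Egger slope {slope:.3f}, intercept {intercept:.4f}, I2_GX {I2_GX:.2f}")
print(f"crude attenuation-corrected slope: {slope / I2_GX:.3f}")
```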

  2. Study on Thermal Degradation Characteristics and Regression Rate Measurement of Paraffin-Based Fuel

    Directory of Open Access Journals (Sweden)

    Songqi Hu

    2015-09-01

    Full Text Available Paraffin fuel has been found to have a regression rate that is higher than that of conventional HTPB (hydroxyl-terminated polybutadiene) fuel and thus presents itself as an ideal energy source for a hybrid rocket engine. The energy characteristics of paraffin-based fuel and HTPB fuel were calculated by the method of minimum free energy. The thermal degradation characteristics were measured for paraffin, pretreated paraffin, HTPB and paraffin-based fuel under different working conditions using differential scanning calorimetry (DSC) and a thermogravimetric analyzer (TGA). The regression rates of paraffin-based fuel and HTPB fuel were tested in a rectangular solid-gas hybrid engine. The research findings showed that the specific impulse of paraffin-based fuel is almost the same as that of HTPB fuel; the decomposition temperature of pretreated paraffin is higher than that of unprocessed paraffin but lower than that of HTPB; with an increasing paraffin fraction, the initial exothermic reaction peak of paraffin-based fuel is reached earlier and the initial reaction heat release increases; the regression rate of paraffin-based fuel is higher than that of common HTPB fuel under the same conditions; and with increasing oxidizer mass flow rate, the regression rate of the solid fuel increases accordingly for the same fuel formulation.

  3. A Cross-Domain Collaborative Filtering Algorithm Based on Feature Construction and Locally Weighted Linear Regression.

    Science.gov (United States)

    Yu, Xu; Lin, Jun-Yu; Jiang, Feng; Du, Jun-Wei; Han, Ji-Zhong

    2018-01-01

    Cross-domain collaborative filtering (CDCF) solves the sparsity problem by transferring rating knowledge from auxiliary domains. Obviously, different auxiliary domains have different importance to the target domain. However, previous works cannot evaluate effectively the significance of different auxiliary domains. To overcome this drawback, we propose a cross-domain collaborative filtering algorithm based on Feature Construction and Locally Weighted Linear Regression (FCLWLR). We first construct features in different domains and use these features to represent different auxiliary domains. Thus the weight computation across different domains can be converted as the weight computation across different features. Then we combine the features in the target domain and in the auxiliary domains together and convert the cross-domain recommendation problem into a regression problem. Finally, we employ a Locally Weighted Linear Regression (LWLR) model to solve the regression problem. As LWLR is a nonparametric regression method, it can effectively avoid underfitting or overfitting problem occurring in parametric regression methods. We conduct extensive experiments to show that the proposed FCLWLR algorithm is effective in addressing the data sparsity problem by transferring the useful knowledge from the auxiliary domains, as compared to many state-of-the-art single-domain or cross-domain CF methods.
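
    The LWLR building block is compact enough to sketch on its own: each prediction solves a weighted least-squares problem with Gaussian weights centred on the query. In FCLWLR the inputs would be the constructed cross-domain features; here a generic one-dimensional example stands in.

```python
import numpy as np

def lwlr_predict(x_query, X, y, tau=0.5):
    """Locally weighted linear regression: solve one weighted least-squares
    problem per query point, with Gaussian weights centred on the query."""
    Xb = np.column_stack([np.ones(len(X)), X])          # add intercept
    xq = np.concatenate([[1.0], np.atleast_1d(x_query)])
    d2 = np.sum((X - x_query) ** 2, axis=1)
    w = np.exp(-d2 / (2 * tau**2))                      # locality weights
    W = np.diag(w)
    theta = np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y)
    return xq @ theta

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (100, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 100)           # nonlinear target
print(lwlr_predict(np.array([1.0]), X, y))              # close to sin(1) ~ 0.84
```

    Because the fit is recomputed around each query, LWLR adapts to local structure without committing to a global parametric form, which is why it avoids the underfitting/overfitting trade-off mentioned above.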

  5. A review and comparison of Bayesian and likelihood-based inferences in beta regression and zero-or-one-inflated beta regression.

    Science.gov (United States)

    Liu, Fang; Eugenio, Evercita C

    2018-04-01

    Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.

  6. Sample size calculation to externally validate scoring systems based on logistic regression models.

    Directory of Open Access Journals (Sweden)

    Antonio Palazón-Bru

    Full Text Available A sample size containing at least 100 events and 100 non-events has been suggested for validating a predictive model, regardless of the model being validated, even though certain factors (discrimination, parameterization and incidence) can influence the calibration of the predictive model. Scoring systems based on binary logistic regression models are a specific type of predictive model. The aim of this study was to develop an algorithm to determine the sample size for validating a scoring system based on a binary logistic regression model and to apply it to a case study. The algorithm was based on bootstrap samples in which the area under the ROC curve, the observed event probabilities through smooth curves, and a measure of the lack of calibration (the estimated calibration index) were calculated. To illustrate its use for interested researchers, the algorithm was applied to a scoring system, based on a binary logistic regression model, for determining mortality in intensive care units. In the case study provided, the algorithm yielded a sample size with 69 events, which is lower than the value suggested in the literature. An algorithm is thus provided for finding the appropriate sample size to validate scoring systems based on binary logistic regression models, and it could be applied to determine the sample size in other similar cases.

  7. Robust Face Recognition via Multi-Scale Patch-Based Matrix Regression.

    Directory of Open Access Journals (Sweden)

    Guangwei Gao

    Full Text Available In many real-world applications such as smart card solutions, law enforcement, surveillance and access control, the limited training sample size is the most fundamental problem. By making use of the low-rank structural information of the reconstructed error image, the so-called nuclear norm-based matrix regression has been demonstrated to be effective for robust face recognition with continuous occlusions. However, the recognition performance of nuclear norm-based matrix regression degrades greatly in the face of the small sample size problem. An alternative solution to tackle this problem is performing matrix regression on each patch and then integrating the outputs from all patches. However, it is difficult to set an optimal patch size across different databases. To fully utilize the complementary information from different patch scales for the final decision, we propose a multi-scale patch-based matrix regression scheme based on which the ensemble of multi-scale outputs can be achieved optimally. Extensive experiments on benchmark face databases validate the effectiveness and robustness of our method, which outperforms several state-of-the-art patch-based face recognition algorithms.

  8. Building a new predictor for multiple linear regression technique-based corrective maintenance turnaround time.

    Science.gov (United States)

    Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa

    2008-01-01

    This research's main goals were to build a predictor for a turnaround time (TAT) indicator for estimating its values and to use a numerical clustering technique to find possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction, and insight characterisation. To build the TAT indicator predictor, multiple linear regression and clustering techniques were used to improve corrective maintenance task efficiency in a clinical engineering department (CED). The indicator being studied was turnaround time (TAT). Multiple linear regression was used to build a predictive TAT value model. The variables contributing to this model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression process showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.

  9. The more total cognitive load is reduced by cues, the better retention and transfer of multimedia learning: A meta-analysis and two meta-regression analyses.

    Science.gov (United States)

    Xie, Heping; Wang, Fuxing; Hao, Yanbin; Chen, Jiaxue; An, Jing; Wang, Yuxin; Liu, Huashan

    2017-01-01

    Cueing facilitates retention and transfer of multimedia learning. From the perspective of cognitive load theory (CLT), cueing has a positive effect on learning outcomes because of the reduction in total cognitive load and avoidance of cognitive overload. However, this has not been systematically evaluated. Moreover, what remains ambiguous is the direct relationship between the cue-related cognitive load and learning outcomes. A meta-analysis and two subsequent meta-regression analyses were conducted to explore these issues. Subjective total cognitive load (SCL) and scores on a retention test and transfer test were selected as dependent variables. Through a systematic literature search, 32 eligible articles encompassing 3,597 participants were included in the SCL-related meta-analysis. Among them, 25 articles containing 2,910 participants were included in the retention-related meta-analysis and the following retention-related meta-regression, while there were 29 articles containing 3,204 participants included in the transfer-related meta-analysis and the transfer-related meta-regression. The meta-analysis revealed a statistically significant cueing effect on subjective ratings of cognitive load (d = -0.11, 95% CI = [-0.19, -0.02], p < 0.05), retention performance (d = 0.27, 95% CI = [0.08, 0.46], p < 0.01), and transfer performance (d = 0.34, 95% CI = [0.12, 0.56], p < 0.01). The subsequent meta-regression analyses showed that d_SCL for cueing significantly predicted d_retention for cueing (β = -0.70, 95% CI = [-1.02, -0.38], p < 0.001), as well as d_transfer for cueing (β = -0.60, 95% CI = [-0.92, -0.28], p < 0.001). Thus, in line with CLT, adding cues in multimedia materials can indeed reduce SCL and promote learning outcomes, and the more SCL is reduced by cues, the better the retention and transfer of multimedia learning.

  10. Modeling Personalized Email Prioritization: Classification-based and Regression-based Approaches

    Energy Technology Data Exchange (ETDEWEB)

    Yoo S.; Yang, Y.; Carbonell, J.

    2011-10-24

    Email overload, even after spam filtering, presents a serious productivity challenge for busy professionals and executives. One solution is automated prioritization of incoming emails to ensure the most important are read and processed quickly, while others are processed later as/if time permits in declining priority levels. This paper presents a study of machine learning approaches to email prioritization into discrete levels, comparing ordinal regression versus classifier cascades. Given the ordinal nature of discrete email priority levels, SVM ordinal regression would be expected to perform well, but surprisingly a cascade of SVM classifiers significantly outperforms ordinal regression for email prioritization. In contrast, SVM regression performs well -- better than classifiers -- on selected UCI data sets. This unexpected performance inversion is analyzed and results are presented, providing core functionality for email prioritization systems.
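
    A classifier cascade for ordinal levels can be sketched in the Frank-and-Hall style: one binary SVM per threshold ("is the priority greater than k?"), with the predicted level given by the number of positive votes. This is one plausible reading of a cascade, not necessarily the paper's exact architecture; data and names are invented.

```python
import numpy as np
from sklearn.svm import SVC

def fit_cascade(X, y, levels):
    """One binary SVM per threshold: 'is priority greater than k?'."""
    return [SVC(probability=True, random_state=0).fit(X, (y > k).astype(int))
            for k in levels[:-1]]

def predict_cascade(cascade, X, levels):
    # Predicted level = lowest level + number of 'greater-than' votes.
    votes = sum((clf.predict_proba(X)[:, 1] > 0.5).astype(int) for clf in cascade)
    return levels[0] + votes

rng = np.random.default_rng(0)
X = rng.normal(0, 1, (300, 5))
score = X @ np.array([1.0, 0.8, 0.0, -0.5, 0.3])
y = np.digitize(score, [-1.5, 0.0, 1.5])      # priority levels 0..3
levels = np.array([0, 1, 2, 3])

cascade = fit_cascade(X, y, levels)
print((predict_cascade(cascade, X, levels) == y).mean())  # training accuracy
```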

  11. Model-based bootstrapping when correcting for measurement error with application to logistic regression.

    Science.gov (United States)

    Buonaccorsi, John P; Romeo, Giovanni; Thoresen, Magne

    2018-03-01

    When fitting regression models, measurement error in any of the predictors typically leads to biased coefficients and incorrect inferences. A plethora of methods have been proposed to correct for this. Obtaining standard errors and confidence intervals using the corrected estimators can be challenging and, in addition, there is concern about remaining bias in the corrected estimators. The bootstrap, which is one option to address these problems, has received limited attention in this context. It has usually been employed by simply resampling observations, which, while suitable in some situations, is not always formally justified. In addition, the simple bootstrap does not allow for estimating bias in non-linear models, including logistic regression. Model-based bootstrapping, which can potentially estimate bias in addition to being robust to the original sampling or whether the measurement error variance is constant or not, has received limited attention. However, it faces challenges that are not present in handling regression models with no measurement error. This article develops new methods for model-based bootstrapping when correcting for measurement error in logistic regression with replicate measures. The methodology is illustrated using two examples, and a series of simulations are carried out to assess and compare the simple and model-based bootstrap methods, as well as other standard methods. While not always perfect, the model-based approaches offer some distinct improvements over the other methods. © 2017, The International Biometric Society.

  12. In service monitoring based on fatigue analyses, possibilities and limitations

    International Nuclear Information System (INIS)

    Dittmar, S.; Binder, F.

    2004-01-01

    German LWR reactors are equipped with monitoring systems which are intended to enable a comparison of real transients with load case catalogues and fatigue catalogues for fatigue analyses. The accuracy of the information depends on the accuracy of the measurements, on the consideration of parameters influencing fatigue (medium, component surface, component size, etc.), and on the accuracy of the load analyses. The contribution attempts a critical evaluation, also in view of the fact that real fatigue damage is often impossible to quantify on the basis of fatigue analyses at a later stage. The effects of considering or not considering various influencing factors are discussed, as well as the consequences of the scatter of the material characteristics on which the analyses are based. Possible measures to be taken in operational monitoring are derived. (orig.) [de

  13. Linear regression based on Minimum Covariance Determinant (MCD) and TELBS methods on the productivity of phytoplankton

    Science.gov (United States)

    Gusriani, N.; Firdaniza

    2018-03-01

    The existence of outliers in multiple linear regression analysis causes the Gaussian assumption to be violated. If the least squares method is nevertheless applied to such data, it will produce a model that cannot represent most of the data. We therefore need a regression method that is robust against outliers. This paper compares the Minimum Covariance Determinant (MCD) method and the TELBS method on secondary data on the productivity of phytoplankton, which contain outliers. Based on the robust coefficient of determination, the MCD method produces a better model than the TELBS method.
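
    The MCD idea can be sketched with scikit-learn: estimate a robust location and scatter of the joint (x, y) data with MinCovDet, then read the regression line off the robust covariance matrix. TELBS is not sketched here, and the data are synthetic with planted outliers.

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 10, n)
y = 1.0 + 0.7 * x + rng.normal(0, 0.5, n)
x[:10], y[:10] = 9.5, 2.0          # a cluster of outliers

# Robust location/scatter of the joint (x, y) data via MCD...
mcd = MinCovDet(random_state=0).fit(np.column_stack([x, y]))
mu, S = mcd.location_, mcd.covariance_
# ...then read the regression line off the robust covariance matrix.
slope = S[0, 1] / S[0, 0]
intercept = mu[1] - slope * mu[0]

ols_slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # classical fit, for contrast
print(f"MCD-based slope {slope:.2f} vs OLS slope {ols_slope:.2f} (true 0.7)")
```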

  14. Brillouin Scattering Spectrum Analysis Based on Auto-Regressive Spectral Estimation

    Science.gov (United States)

    Huang, Mengyun; Li, Wei; Liu, Zhangyun; Cheng, Linghao; Guan, Bai-Ou

    2018-06-01

    An auto-regressive (AR) spectral estimation technique is proposed to analyze the Brillouin scattering spectrum in Brillouin optical time-domain reflectometry. It is shown that the AR-based method can reliably estimate the Brillouin frequency shift with an accuracy much better than fast Fourier transform (FFT) based methods, provided the data length is not too short. It enables about a 3-fold improvement over FFT at a moderate spatial resolution.
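
    A Yule-Walker AR fit followed by evaluation of the AR power spectrum is one standard way to implement this; the sketch below finds the peak of the AR spectrum of a synthetic noisy line as the estimated Brillouin frequency shift. The sampling setup and model order are assumptions, not the paper's experimental parameters.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def ar_spectrum(x, order, freqs, fs):
    """Yule-Walker AR fit, then evaluate the AR power spectrum
    P(f) = sigma^2 / |1 - sum_k a_k exp(-2j*pi*f*k/fs)|^2."""
    x = x - x.mean()
    r = np.correlate(x, x, mode="full")[len(x) - 1:] / len(x)  # autocorrelation
    a = solve_toeplitz(r[:order], r[1:order + 1])              # AR coefficients
    sigma2 = r[0] - a @ r[1:order + 1]                         # prediction error
    k = np.arange(1, order + 1)
    denom = np.abs(1 - np.exp(-2j * np.pi * np.outer(freqs / fs, k)) @ a) ** 2
    return sigma2 / denom

# Synthetic noisy Brillouin line near 10.85 GHz (frequencies in GHz).
fs = 50.0
t = np.arange(2000) / fs
rng = np.random.default_rng(0)
x = np.cos(2 * np.pi * 10.85 * t) + 0.8 * rng.standard_normal(len(t))
freqs = np.linspace(10.5, 11.2, 500)
bfs = freqs[np.argmax(ar_spectrum(x, order=8, freqs=freqs, fs=fs))]
print(f"estimated Brillouin frequency shift: {bfs:.3f} GHz")
```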

  16. A bromine-based dichroic X-ray polarization analyser

    CERN Document Server

    Collins, S P; Brown, S D; Thompson, P

    2001-01-01

    We have demonstrated the advantages offered by dichroic X-ray polarization filters for linear polarization analysis, and describe such a device, based on a dibromoalkane/urea inclusion compound. The polarizer has been successfully tested by analysing the polarization of magnetic diffraction from holmium.

  17. Predictors of success of external cephalic version and cephalic presentation at birth among 1253 women with non-cephalic presentation using logistic regression and classification tree analyses.

    Science.gov (United States)

    Hutton, Eileen K; Simioni, Julia C; Thabane, Lehana

    2017-08-01

    Among women with a fetus with a non-cephalic presentation, external cephalic version (ECV) has been shown to reduce the rate of breech presentation at birth and cesarean birth. Compared with ECV at term, beginning ECV prior to 37 weeks' gestation decreases the number of infants in a non-cephalic presentation at birth. The purpose of this secondary analysis was to investigate factors associated with a successful ECV procedure and to present this in a clinically useful format. Data were collected as part of the Early ECV Pilot and Early ECV2 Trials, which randomized 1776 women with a fetus in breech presentation to either early ECV (34-36 weeks' gestation) or delayed ECV (at or after 37 weeks). The outcome of interest was successful ECV, defined as the fetus being in a cephalic presentation immediately following the procedure, as well as at the time of birth. The importance of several factors in predicting successful ECV was investigated using two statistical methods: logistic regression and classification and regression tree (CART) analyses. Among nulliparas, non-engagement of the presenting part and an easily palpable fetal head were independently associated with success. Among multiparas, non-engagement of the presenting part, gestation less than 37 weeks and an easily palpable fetal head were found to be independent predictors of success. These findings were consistent with results of the CART analyses. Regardless of parity, descent of the presenting part was the most discriminating factor in predicting successful ECV and cephalic presentation at birth. © 2017 Nordic Federation of Societies of Obstetrics and Gynecology.

  18. Generating patient specific pseudo-CT of the head from MR using atlas-based regression

    International Nuclear Information System (INIS)

    Sjölund, J; Forsberg, D; Andersson, M; Knutsson, H

    2015-01-01

    Radiotherapy planning and attenuation correction of PET images require simulation of radiation transport. The necessary physical properties are typically derived from computed tomography (CT) images, but in some cases, including stereotactic neurosurgery and combined PET/MR imaging, only magnetic resonance (MR) images are available. With these applications in mind, we describe how a realistic, patient-specific, pseudo-CT of the head can be derived from anatomical MR images. We refer to the method as atlas-based regression, because of its similarity to atlas-based segmentation. Given a target MR and an atlas database comprising MR and CT pairs, atlas-based regression works by registering each atlas MR to the target MR, applying the resulting displacement fields to the corresponding atlas CTs and, finally, fusing the deformed atlas CTs into a single pseudo-CT. We use a deformable registration algorithm known as the Morphon and augment it with a certainty mask that allows a tailoring of the influence certain regions are allowed to have on the registration. Moreover, we propose a novel method of fusion, wherein the collection of deformed CTs is iteratively registered to their joint mean and find that the resulting mean CT becomes more similar to the target CT. However, the voxelwise median provided even better results; at least as good as earlier work that required special MR imaging techniques. This makes atlas-based regression a good candidate for clinical use. (paper)

  19. Accounting for estimated IQ in neuropsychological test performance with regression-based techniques.

    Science.gov (United States)

    Testa, S Marc; Winicki, Jessica M; Pearlson, Godfrey D; Gordon, Barry; Schretlen, David J

    2009-11-01

    Regression-based normative techniques account for variability in test performance associated with multiple predictor variables and generate expected scores based on algebraic equations. Using this approach, we show that estimated IQ, based on oral word reading, accounts for 1-9% of the variability beyond that explained by individual differences in age, sex, race, and years of education for most cognitive measures. These results confirm that adding estimated "premorbid" IQ to demographic predictors in multiple regression models can incrementally improve the accuracy with which regression-based norms (RBNs) benchmark expected neuropsychological test performance in healthy adults. It remains to be seen whether the incremental variance in test performance explained by estimated "premorbid" IQ translates to improved diagnostic accuracy in patient samples. We describe these methods, and illustrate the step-by-step application of RBNs with two cases. We also discuss the rationale, assumptions, and caveats of this approach. More broadly, we note that adjusting test scores for age and other characteristics might actually decrease the accuracy with which test performance predicts absolute criteria, such as the ability to drive or live independently.
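    A hedged sketch of the regression-based norms (RBN) idea on simulated data: expected test scores are fitted in a healthy sample from demographic predictors plus estimated IQ, and a new examinee's observed score is then expressed as a standardized residual (a demographically adjusted Z-score). The variable names and coefficients below are illustrative, not the authors' equations.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
age = rng.uniform(20, 80, n)
educ = rng.uniform(8, 20, n)
iq_est = rng.normal(100, 15, n)            # e.g. estimated from oral word reading
score = 50 - 0.2 * age + 0.8 * educ + 0.1 * iq_est + rng.normal(0, 3, n)

X = np.column_stack([np.ones(n), age, educ, iq_est])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
resid_sd = np.std(score - X @ beta, ddof=X.shape[1])

def rbn_z(age, educ, iq_est, observed):
    """Demographically adjusted Z-score: (observed - expected) / residual SD."""
    expected = beta @ np.array([1.0, age, educ, iq_est])
    return (observed - expected) / resid_sd

print(rbn_z(65, 12, 95, observed=40.0))
```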

  20. Analyser-based phase contrast image reconstruction using geometrical optics

    International Nuclear Information System (INIS)

    Kitchen, M J; Pavlov, K M; Siu, K K W; Menk, R H; Tromba, G; Lewis, R A

    2007-01-01

    Analyser-based phase contrast imaging can provide radiographs of exceptional contrast at high resolution (<100 μm), whilst quantitative phase and attenuation information can be extracted using just two images when the approximations of geometrical optics are satisfied. Analytical phase retrieval can be performed by fitting the analyser rocking curve with a symmetric Pearson type VII function. The Pearson VII function provided at least a 10% better fit to experimentally measured rocking curves than linear or Gaussian functions. A test phantom, a hollow nylon cylinder, was imaged at 20 keV using a Si(1 1 1) analyser at the ELETTRA synchrotron radiation facility. Our phase retrieval method yielded a more accurate object reconstruction than methods based on a linear fit to the rocking curve. Where reconstructions failed to map expected values, calculations of the Takagi number permitted distinction between the violation of the geometrical optics conditions and the failure of curve fitting procedures. The need for synchronized object/detector translation stages was removed by using a large, divergent beam and imaging the object in segments. Our image acquisition and reconstruction procedure enables quantitative phase retrieval for systems with a divergent source and accounts for imperfections in the analyser.
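    A small sketch of the rocking-curve fitting step using a standard symmetric Pearson type VII profile with scipy; the parametrization and the synthetic data are assumptions, not the authors' exact implementation.

```python
import numpy as np
from scipy.optimize import curve_fit

def pearson7(theta, amp, theta0, width, m):
    """Symmetric Pearson VII profile centred at theta0."""
    return amp / (1.0 + ((theta - theta0) / width) ** 2 * (2 ** (1 / m) - 1)) ** m

theta = np.linspace(-20, 20, 201)                  # offset from Bragg peak (arcsec)
true = pearson7(theta, 1.0, 0.0, 5.0, 1.8)
meas = true + np.random.default_rng(3).normal(0, 0.01, theta.size)

popt, _ = curve_fit(pearson7, theta, meas, p0=[1.0, 0.0, 4.0, 1.5])
print("amp, centre, width, shape m:", popt)
```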

  2. A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

    Directory of Open Access Journals (Sweden)

    Baofeng Shi

    2015-01-01

    We introduce an imbalanced data classification approach based on logistic regression significant discriminant and Fisher discriminant. First, a key indicator extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Second, a customer scoring model is established by linear weighting using the Fisher discriminant. Then, a customer rating model is constructed in which the number of customers across ratings follows a normal distribution. The performance of the proposed model and the classical SVM classification method are evaluated in terms of their ability to correctly classify consumers as default or non-default customers. Empirical results using data on 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model in dealing with imbalanced data classification. Moreover, our approach helps locate qualified customers for banks and bond investors.

  3. COLOR IMAGE RETRIEVAL BASED ON FEATURE FUSION THROUGH MULTIPLE LINEAR REGRESSION ANALYSIS

    Directory of Open Access Journals (Sweden)

    K. Seetharaman

    2015-08-01

    This paper proposes a novel technique based on feature fusion using multiple linear regression analysis, with the least-squares estimation method employed to estimate the parameters. The given input query image is segmented into various regions according to the structure of the image. The color and texture features are extracted from each region of the query image, and the features are fused together using the multiple linear regression model. The estimated parameters of the model, which is built on the features, form a vector called the feature vector. The Canberra distance measure is adopted to compare the feature vectors of the query and target images. The F-measure is applied to evaluate the performance of the proposed technique. The results show that the proposed technique is comparable to existing techniques.
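    A minimal sketch of the matching step only: the fitted regression parameters of each image act as its feature vector, and the Canberra distance ranks target images against the query. The feature extraction itself is not shown; the vectors below are illustrative.

```python
import numpy as np
from scipy.spatial.distance import canberra

query_vec = np.array([0.42, 1.10, 0.05, 0.77])       # fitted model parameters
target_vecs = {
    "img_a": np.array([0.40, 1.15, 0.04, 0.80]),
    "img_b": np.array([0.90, 0.20, 0.33, 0.10]),
}
ranked = sorted(target_vecs, key=lambda k: canberra(query_vec, target_vecs[k]))
print(ranked)  # nearest first
```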

  4. Inference for multivariate regression model based on multiply imputed synthetic data generated via posterior predictive sampling

    Science.gov (United States)

    Moura, Ricardo; Sinha, Bimal; Coelho, Carlos A.

    2017-06-01

    The recent popularity of synthetic data as a Statistical Disclosure Control technique has enabled the development of several methods for generating and analyzing such data, but these almost always rely on asymptotic distributions and are consequently inadequate for small-sample datasets. Thus, a likelihood-based exact inference procedure is derived for the matrix of regression coefficients of the multivariate regression model, for multiply imputed synthetic data generated via Posterior Predictive Sampling. Since it is based on exact distributions, this procedure may be used even in small-sample datasets. Simulation studies compare the results obtained from the proposed exact inferential procedure with the results obtained from an adaptation of Reiter's combination rule to multiply imputed synthetic datasets, and an application to the 2000 Current Population Survey is discussed.

  5. Least square regression based integrated multi-parameteric demand modeling for short term load forecasting

    International Nuclear Information System (INIS)

    Halepoto, I.A.; Uqaili, M.A.

    2014-01-01

    Nowadays, due to the power crisis, electricity demand forecasting is deemed an important area for socioeconomic development, and accurate load forecasting is considered an essential step towards efficient power system operation, scheduling and planning. In this paper, we present STLF (Short Term Load Forecasting) using multiple regression techniques (i.e. linear, multiple linear, quadratic and exponential) by considering an hour-by-hour load model based on a specific targeted day approach with temperature as a variant parameter. The proposed work forecasts the future load demand by correlating it with linear and non-linear parameters (i.e. temperature in our case) through different regression approaches. The overall load forecasting error is 2.98%, which is very much acceptable. Among the proposed regression techniques, the quadratic regression technique performs better than the other techniques because it can optimally fit a broad range of functions and data sets. The work proposed in this paper will pave a path to effectively forecast the load of a specific day with multiple variance factors in a way that optimal accuracy can be maintained. (author)
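    A hedged sketch of the quadratic fit on synthetic data: for one target hour, load is regressed on temperature with a degree-2 polynomial and the fit is used to forecast load at tomorrow's expected temperature. A full STLF system would fit one such model per hour of the targeted day; all numbers here are invented.

```python
import numpy as np

rng = np.random.default_rng(4)
temp = rng.uniform(15, 40, 60)                             # deg C, same hour on past days
load = 900 + 4 * (temp - 25) ** 2 + rng.normal(0, 20, 60)  # MW

coeffs = np.polyfit(temp, load, deg=2)        # quadratic regression
forecast = np.polyval(coeffs, 33.0)           # tomorrow's forecast temperature
mape = np.mean(np.abs(load - np.polyval(coeffs, temp)) / load) * 100
print(f"forecast {forecast:.0f} MW, in-sample MAPE {mape:.2f}%")
```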

  6. Time series modeling by a regression approach based on a latent process.

    Science.gov (United States)

    Chamroukhi, Faicel; Samé, Allou; Govaert, Gérard; Aknin, Patrice

    2009-01-01

    Time series are used in many domains, including finance, engineering, economics and bioinformatics, generally to represent the change of a measurement over time. Modeling techniques may then be used to give a synthetic representation of such data. A new approach for time series modeling is proposed in this paper. It consists of a regression model incorporating a discrete hidden logistic process that allows different polynomial regression models to be activated smoothly or abruptly. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The M step of the EM algorithm uses a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm to estimate the hidden process parameters. To evaluate the proposed approach, an experimental study on simulated data and real world data was performed against two alternative approaches: a heteroskedastic piecewise regression model using a global optimization algorithm based on dynamic programming, and a Hidden Markov Regression Model whose parameters are estimated by the Baum-Welch algorithm. Finally, in the context of the remote monitoring of components of the French railway infrastructure, and more particularly the switch mechanism, the proposed approach has been applied to modeling and classifying time series representing the condition measurements acquired during switch operations.

  7. A Comparative Study of Pairwise Learning Methods Based on Kernel Ridge Regression.

    Science.gov (United States)

    Stock, Michiel; Pahikkala, Tapio; Airola, Antti; De Baets, Bernard; Waegeman, Willem

    2018-06-12

    Many machine learning problems can be formulated as predicting labels for a pair of objects. Problems of that kind are often referred to as pairwise learning, dyadic prediction, or network inference problems. During the past decade, kernel methods have played a dominant role in pairwise learning. They still obtain a state-of-the-art predictive performance, but a theoretical analysis of their behavior has been underexplored in the machine learning literature. In this work we review and unify kernel-based algorithms that are commonly used in different pairwise learning settings, ranging from matrix filtering to zero-shot learning. To this end, we focus on closed-form efficient instantiations of Kronecker kernel ridge regression. We show that independent task kernel ridge regression, two-step kernel ridge regression, and a linear matrix filter arise naturally as a special case of Kronecker kernel ridge regression, implying that all these methods implicitly minimize a squared loss. In addition, we analyze universality, consistency, and spectral filtering properties. Our theoretical results provide valuable insights into assessing the advantages and limitations of existing pairwise learning methods.

  8. Model-based Recursive Partitioning for Subgroup Analyses

    OpenAIRE

    Seibold, Heidi; Zeileis, Achim; Hothorn, Torsten

    2016-01-01

    The identification of patient subgroups with differential treatment effects is the first step towards individualised treatments. A current draft guideline by the EMA discusses potentials and problems in subgroup analyses and formulated challenges to the development of appropriate statistical procedures for the data-driven identification of patient subgroups. We introduce model-based recursive partitioning as a procedure for the automated detection of patient subgroups that are identifiable by...

  9. Analyses of polycyclic aromatic hydrocarbon (PAH) and chiral-PAH analogues-methyl-β-cyclodextrin guest-host inclusion complexes by fluorescence spectrophotometry and multivariate regression analysis.

    Science.gov (United States)

    Greene, LaVana; Elzey, Brianda; Franklin, Mariah; Fakayode, Sayo O

    2017-03-05

    The negative health impact of polycyclic aromatic hydrocarbons (PAHs) and differences in pharmacological activity of enantiomers of chiral molecules in humans highlight the need for analysis of PAHs and their chiral analogue molecules in humans. Herein, the first use of cyclodextrin guest-host inclusion complexation, fluorescence spectrophotometry, and a chemometric approach for PAH (anthracene) and chiral-PAH analogue derivative (1-(9-anthryl)-2,2,2-trifluoroethanol (TFE)) analyses is reported. The binding constants (K_b), stoichiometry (n), and thermodynamic properties (Gibbs free energy (ΔG), enthalpy (ΔH), and entropy (ΔS)) of anthracene and enantiomers of TFE-methyl-β-cyclodextrin (Me-β-CD) guest-host complexes were also determined. Chemometric partial-least-squares (PLS) regression analysis of the emission spectra of Me-β-CD guest-host inclusion complexes was used to determine anthracene and TFE enantiomer concentrations in Me-β-CD guest-host inclusion complex samples. The calculated K_b values and negative ΔG suggest the thermodynamic favorability of anthracene-Me-β-CD and enantiomeric TFE-Me-β-CD inclusion complexation reactions. However, anthracene-Me-β-CD and enantiomeric TFE-Me-β-CD inclusion complexations showed notable differences in binding affinity behavior and thermodynamic properties. The PLS regression analysis resulted in square correlation coefficients of 0.997530 or better and low LODs of 3.81×10⁻⁷ M for anthracene and 3.48×10⁻⁸ M for TFE enantiomers at physiological conditions. Most importantly, PLS regression accurately determined the anthracene and TFE enantiomer concentrations with average low errors of 2.31% for anthracene, 4.44% for R-TFE and 3.60% for S-TFE. The results of the study are highly significant because of the high sensitivity and accuracy for analysis of PAH and chiral-PAH analogue derivatives without the need of an expensive chiral column, enantiomeric resolution, or use of a polarized

  10. Bisphenol-A exposures and behavioural aberrations: median and linear spline and meta-regression analyses of 12 toxicity studies in rodents.

    Science.gov (United States)

    Peluso, Marco E M; Munnia, Armelle; Ceppi, Marcello

    2014-11-05

    Exposures to bisphenol-A, a weak estrogenic chemical largely used for the production of plastic containers, can affect rodent behaviour. Thus, we examined the relationships between bisphenol-A and anxiety-like behaviour, spatial skills, and aggressiveness in 12 toxicity studies of rodent offspring from females orally exposed to bisphenol-A while pregnant and/or lactating, by median and linear spline analyses. Subsequently, meta-regression analysis was applied to quantify the behavioural changes. U-shaped, inverted U-shaped and J-shaped dose-response curves were found to describe the relationships between bisphenol-A and the behavioural outcomes. The occurrence of anxiogenic-like effects and spatial skill changes displayed U-shaped and inverted U-shaped curves, respectively, providing examples of effects that are observed at low doses. Conversely, a J-shaped dose-response relationship was observed for aggressiveness. When the proportion of rodents expressing certain traits or the time they took to manifest an attitude was analysed, the meta-regression indicated that a borderline significant increment of anxiogenic-like effects was present at low doses regardless of sex (β=-0.8%, 95% C.I. -1.7/0.1, P=0.076, at ≤120 μg bisphenol-A), whereas only bisphenol-A-exposed males exhibited a significant inhibition of spatial skills (β=0.7%, 95% C.I. 0.2/1.2, P=0.004, at ≤100 μg/day). A significant increment of aggressiveness was observed in both sexes (β=67.9, 95% C.I. 3.4/172.5, P=0.038, at >4.0 μg). Bisphenol-A treatments also significantly abrogated spatial learning and ability in males (P<0.05). Overall, low doses of bisphenol-A, e.g. ≤120 μg/day, were associated with behavioural aberrations in offspring. Copyright © 2014. Published by Elsevier Ireland Ltd.

  11. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part II: Evaluation of Sample Models

    Science.gov (United States)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Previous studies have shown that probabilistic forecasting may be a useful method for predicting persistent contrail formation. A probabilistic forecast to accurately predict contrail formation over the contiguous United States (CONUS) is created by using meteorological data based on hourly meteorological analyses from the Advanced Regional Prediction System (ARPS) and from the Rapid Update Cycle (RUC) as well as GOES water vapor channel measurements, combined with surface and satellite observations of contrails. Two groups of logistic models were created. The first group of models (SURFACE models) is based on surface-based contrail observations supplemented with satellite observations of contrail occurrence. The second group of models (OUTBREAK models) is derived from a selected subgroup of satellite-based observations of widespread persistent contrails. The mean accuracies for both the SURFACE and OUTBREAK models typically exceeded 75 percent when based on the RUC or ARPS analysis data, but decreased when the logistic models were derived from ARPS forecast data.

  12. Short-term solar irradiation forecasting based on Dynamic Harmonic Regression

    International Nuclear Information System (INIS)

    Trapero, Juan R.; Kourentzes, Nikolaos; Martin, A.

    2015-01-01

    Solar power generation is a crucial research area for countries that have a high dependency on fossil energy sources, and it is gaining prominence with the current shift to renewable sources of energy. In order to integrate the electricity generated by solar energy into the grid, solar irradiation must be reasonably well forecasted, since deviations of the forecasted value from the actual measured value involve significant costs. The present paper proposes a univariate Dynamic Harmonic Regression model set up in a State Space framework for short-term (1–24 h) solar irradiation forecasting. Hourly aggregated time series of Global Horizontal Irradiation and Direct Normal Irradiation are used to illustrate the proposed approach. This method provides a fast automatic identification and estimation procedure based on the frequency domain. Furthermore, the recursive algorithms applied offer adaptive predictions. The good forecasting performance is illustrated with solar irradiance measurements collected from ground-based weather stations located in Spain. The results show that the Dynamic Harmonic Regression achieves the lowest relative Root Mean Squared Error: about 30% and 47% for the Global and Direct irradiation components, respectively, for a forecast horizon of 24 h ahead. - Highlights: • Solar irradiation forecasts at short term are required to operate solar power plants. • This paper assesses the Dynamic Harmonic Regression to forecast solar irradiation. • Models are evaluated using hourly GHI and DNI data collected in Spain. • The results show that forecasting accuracy is improved by using the model proposed

  13. Soft Sensor Modeling Based on Multiple Gaussian Process Regression and Fuzzy C-mean Clustering

    Directory of Open Access Journals (Sweden)

    Xianglin ZHU

    2014-06-01

    In order to overcome the difficulties of online measurement of some crucial biochemical variables in fermentation processes, a new soft sensor modeling method is presented based on Gaussian process regression and fuzzy C-means clustering. Considering that a typical fermentation process can be divided into 4 phases, including the lag phase, exponential growth phase, stable phase and death phase, the training samples are classified into 4 subcategories using the fuzzy C-means clustering algorithm. For each subcategory, the samples are trained using Gaussian process regression and the corresponding soft-sensing sub-model is established. For a new sample, the memberships between this sample and the sub-models are computed based on the Euclidean distance, and the prediction output of the soft sensor is then obtained using the weighted sum. Taking lysine fermentation as an example, simulations and experiments were carried out, and the results show that the presented method achieves better fitting and generalization ability than a radial basis function neural network and a single Gaussian process regression model.
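    A hedged sketch of the phase-wise scheme on simulated data: cluster the training samples into four phases, train one Gaussian process per cluster, and blend predictions with Euclidean-distance-based memberships. scikit-learn's KMeans stands in for fuzzy C-means (which scikit-learn does not provide), and the data are toy values.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(5)
X = rng.uniform(0, 40, size=(200, 1))              # e.g. fermentation time (h)
y = np.tanh((X[:, 0] - 20) / 5) * 10 + rng.normal(0, 0.3, 200)

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
models = [GaussianProcessRegressor().fit(X[km.labels_ == k], y[km.labels_ == k])
          for k in range(4)]

def predict(x_new):
    # Memberships from inverse Euclidean distance to the cluster centres.
    d = np.linalg.norm(km.cluster_centers_ - x_new, axis=1)
    w = 1.0 / (d + 1e-9)
    w /= w.sum()
    return sum(wk * m.predict(x_new.reshape(1, -1))[0] for wk, m in zip(w, models))

print(predict(np.array([12.0])))
```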

  14. Support vector regression model based predictive control of water level of U-tube steam generators

    Energy Technology Data Exchange (ETDEWEB)

    Kavaklioglu, Kadir, E-mail: kadir.kavaklioglu@pau.edu.tr

    2014-10-15

    Highlights: • Water level of U-tube steam generators was controlled in a model predictive fashion. • Models for steam generator water level were built using support vector regression. • Cost function minimization for future optimal controls was performed by using the steepest descent method. • The results indicated the feasibility of the proposed method. - Abstract: A predictive control algorithm using support vector regression based models was proposed for controlling the water level of U-tube steam generators of pressurized water reactors. Steam generator data were obtained using a transfer function model of U-tube steam generators. Support vector regression based models were built using a time series type model structure for five different operating powers. Feedwater flow controls were calculated by minimizing a cost function that includes the level error, the feedwater change and the mismatch between feedwater and steam flow rates. The proposed algorithm was applied to a scenario consisting of a level setpoint change and a steam flow disturbance. The results showed that the steam generator level can be controlled effectively at all powers by the proposed method.

  15. Regression-based model of skin diffuse reflectance for skin color analysis

    Science.gov (United States)

    Tsumura, Norimichi; Kawazoe, Daisuke; Nakaguchi, Toshiya; Ojima, Nobutoshi; Miyake, Yoichi

    2008-11-01

    A simple regression-based model of skin diffuse reflectance is developed based on reflectance samples calculated by Monte Carlo simulation of light transport in a two-layered skin model. This reflectance model includes the values of spectral reflectance in the visible spectrum for Japanese women. The modified Lambert-Beer law holds in the proposed model with a modified mean free path length in non-linear density space. The average RMS and maximum errors of the proposed model were 1.1 and 3.1%, respectively, over this range.

  16. Mass estimation of loose parts in nuclear power plant based on multiple regression

    International Nuclear Information System (INIS)

    He, Yuanfeng; Cao, Yanlong; Yang, Jiangxin; Gan, Chunbiao

    2012-01-01

    According to the application of the Hilbert–Huang transform to the non-stationary signal and the relation between the mass of loose parts in nuclear power plant and corresponding frequency content, a new method for loose part mass estimation based on the marginal Hilbert–Huang spectrum (MHS) and multiple regression is proposed in this paper. The frequency spectrum of a loose part in a nuclear power plant can be expressed by the MHS. The multiple regression model that is constructed by the MHS feature of the impact signals for mass estimation is used to predict the unknown masses of a loose part. A simulated experiment verified that the method is feasible and the errors of the results are acceptable. (paper)

  17. Dynamic Optimization for IPS2 Resource Allocation Based on Improved Fuzzy Multiple Linear Regression

    Directory of Open Access Journals (Sweden)

    Maokuan Zheng

    2017-01-01

    The study focuses on resource allocation optimization for industrial product-service systems (IPS2). The development of IPS2 contributes to a sustainable economy by introducing cooperative mechanisms beyond commodity transactions. The randomness and fluctuation of service requests from customers lead to volatility in the IPS2 resource utilization ratio. Three basic rules for resource allocation optimization are put forward to improve system operation efficiency and cut unnecessary costs. An approach based on fuzzy multiple linear regression (FMLR) is developed, which integrates the strength and concision of multiple linear regression in data fitting and factor analysis with the merit of fuzzy theory in dealing with uncertain or vague problems, and thereby helps reduce the costs caused by unnecessary resource transfer. An iteration mechanism is introduced in the FMLR algorithm to improve forecasting accuracy. A case study of human resource allocation optimization in the construction machinery industry is implemented to test and verify the proposed model.

  18. Prediction of monthly electric energy consumption using pattern-based fuzzy nearest neighbour regression

    Directory of Open Access Journals (Sweden)

    Pełka Paweł

    2017-01-01

    Electricity demand forecasting plays an important role in power system planning and operation. In this work, fuzzy nearest neighbour regression is utilised to estimate monthly electricity demand. The forecasting model is based on a pre-processed energy consumption time series, where input and output variables are defined as patterns representing unified fragments of the time series. The relationships between inputs and outputs, simplified by this pattern representation, are modelled using nonparametric regression with a weighting function defined as a fuzzy membership of the learning points in the neighbourhood of a query point. In the experimental part of the work the model is evaluated using real-world data. The results are encouraging and show the high performance of the model and its competitiveness compared to other forecasting models.
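    A hedged sketch of the pattern-based scheme on a synthetic monthly series: windows of the series, normalized by their own mean, form input patterns; nearest-neighbour regression with a fuzzy, distance-decaying weighting predicts the next normalized value, which is then decoded back to the demand level. The membership function and window length are assumptions, not the paper's exact choices.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(6)
e = 100 + 10 * np.sin(np.arange(120) * 2 * np.pi / 12) + rng.normal(0, 2, 120)

win = 12
means = np.array([e[i:i + win].mean() for i in range(len(e) - win)])
X = np.array([e[i:i + win] for i in range(len(e) - win)]) / means[:, None]
y = e[win:] / means                                  # normalized next value

def fuzzy_weights(dist):                             # membership decays with distance
    scale = dist.mean(axis=1, keepdims=True) + 1e-9
    return 1.0 / (1.0 + (dist / scale) ** 2)

q = len(X) - 1                                       # hold out the last pattern
knn = KNeighborsRegressor(n_neighbors=5, weights=fuzzy_weights).fit(X[:q], y[:q])
pred = knn.predict(X[q:q + 1])[0] * means[q]         # decode back to demand level
print(pred, "vs actual", e[q + win])
```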

  19. Order Selection for General Expression of Nonlinear Autoregressive Model Based on Multivariate Stepwise Regression

    Science.gov (United States)

    Shi, Jinfei; Zhu, Songqing; Chen, Ruwen

    2017-12-01

    An order selection method based on multiple stepwise regression is proposed for the General Expression of the Nonlinear Autoregressive (GNAR) model, which converts the model order problem into variable selection for a multiple linear regression equation. The partial autocorrelation function is adopted to define the linear terms in the GNAR model. The result is set as the initial model, and the nonlinear terms are then introduced gradually. Statistics are chosen to assess how both the newly introduced and previously included variables improve the model characteristics, and these statistics are used to determine which model variables to retain or eliminate. The optimal model is thus obtained through measurement of the data-fitting effect or significance testing. Simulations and experiments on classic time-series data show that the proposed method is simple, reliable and applicable to practical engineering.

  20. A Spline-Based Lack-Of-Fit Test for Independent Variable Effect in Poisson Regression.

    Science.gov (United States)

    Li, Chin-Shang; Tu, Wanzhu

    2007-05-01

    In regression analysis of count data, independent variables are often modeled by their linear effects under the assumption of log-linearity. In reality, the validity of such an assumption is rarely tested, and its use is at times unjustifiable. A lack-of-fit test is proposed for the adequacy of a postulated functional form of an independent variable within the framework of semiparametric Poisson regression models based on penalized splines. It offers added flexibility in accommodating the potentially non-loglinear effect of the independent variable. A likelihood ratio test is constructed for the adequacy of the postulated parametric form, for example log-linearity, of the independent variable effect. Simulations indicate that the proposed model performs well, and that misspecified parametric models have much reduced power. An example is given.
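    A hedged sketch of the underlying idea (not the authors' exact penalized-spline construction): fit a log-linear Poisson model and a B-spline expansion of the same covariate, then compare them with an ordinary likelihood ratio test. The data are simulated so that the true effect is non-loglinear.

```python
import numpy as np
import statsmodels.api as sm
from patsy import dmatrix
from scipy.stats import chi2

rng = np.random.default_rng(7)
x = rng.uniform(0, 3, 500)
y = rng.poisson(np.exp(0.3 + np.sin(x)))          # truth is non-loglinear in x

X_lin = sm.add_constant(x)                        # log-linear specification
X_spl = dmatrix("bs(x, df=5, degree=3)", {"x": x}, return_type="dataframe")

fit_lin = sm.GLM(y, X_lin, family=sm.families.Poisson()).fit()
fit_spl = sm.GLM(y, X_spl, family=sm.families.Poisson()).fit()

lr = 2 * (fit_spl.llf - fit_lin.llf)              # likelihood ratio statistic
df = fit_spl.df_model - fit_lin.df_model
print("LRT p-value for log-linearity:", chi2.sf(lr, df))
```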

  1. Fuzzy Regression Prediction and Application Based on Multi-Dimensional Factors of Freight Volume

    Science.gov (United States)

    Xiao, Mengting; Li, Cheng

    2018-01-01

    Based on the realities of air cargo development, the multi-dimensional fuzzy regression method is used to determine the influencing factors, and the three most important factors are identified as GDP, total fixed-asset investment and regular flight route mileage. Combining systems viewpoints and analogy methods, fuzzy numbers and multiple regression are used to predict civil aviation cargo volume. A comparison with the 13th Five-Year Plan for China's Civil Aviation Development (2016-2020) shows that this method can effectively improve forecasting accuracy and reduce forecasting risk, and that the model is feasible for predicting civil aviation freight volume, with high practical significance and operability.

  2. Extreme Learning Machine and Moving Least Square Regression Based Solar Panel Vision Inspection

    Directory of Open Access Journals (Sweden)

    Heng Liu

    2017-01-01

    In recent years, learning-based machine intelligence has attracted considerable attention across science and engineering. Particularly in the field of automatic industry inspection, machine learning based vision inspection plays an increasingly important role in defect identification and feature extraction. Through learning from image samples, many features of industrial objects, such as shapes, positions, and orientation angles, can be obtained and then utilized to determine whether there is a defect or not. However, robustness and speed are not easily achieved in such an inspection approach. In this work, for solar panel vision inspection, we present an extreme learning machine (ELM) and moving least square regression based approach to identify solder joint defects and detect the panel position. First, histogram peaks distribution (HPD) and fractional calculus are applied for image preprocessing. Then an ELM-based defective solder joint identification method is discussed in detail. Finally, the moving least square regression (MLSR) algorithm is introduced for solar panel position determination. Experimental results and comparisons show that the proposed ELM and MLSR based inspection method performs well in both detection accuracy and processing speed.
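    A minimal ELM sketch on toy feature vectors: a fixed random hidden layer followed by a ridge least-squares readout. The image preprocessing (HPD, fractional calculus) and the MLSR positioning step are outside the scope of this sketch, and the labels are invented.

```python
import numpy as np

rng = np.random.default_rng(8)
X = rng.normal(size=(400, 20))                    # solder-joint feature vectors
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)   # 1 = defective (toy labels)

n_hidden, lam = 100, 1e-2
W = rng.normal(size=(20, n_hidden))               # random input weights (never trained)
b = rng.normal(size=n_hidden)
H = np.tanh(X @ W + b)                            # hidden-layer activations

# Ridge least-squares readout: beta = (H'H + lam I)^-1 H'y
beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)
pred = (np.tanh(X @ W + b) @ beta > 0.5).astype(float)
print("train accuracy:", (pred == y).mean())
```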

  3. Quantitative metagenomic analyses based on average genome size normalization

    DEFF Research Database (Denmark)

    Frank, Jeremy Alexander; Sørensen, Søren Johannes

    2011-01-01

    provide not just a census of the community members but direct information on metabolic capabilities and potential interactions among community members. Here we introduce a method for the quantitative characterization and comparison of microbial communities based on the normalization of metagenomic data...... marine sources using both conventional small-subunit (SSU) rRNA gene analyses and our quantitative method to calculate the proportion of genomes in each sample that are capable of a particular metabolic trait. With both environments, to determine what proportion of each community they make up and how......). These analyses demonstrate how genome proportionality compares to SSU rRNA gene relative abundance and how factors such as average genome size and SSU rRNA gene copy number affect sampling probability and therefore both types of community analysis....

  4. Process for carrying out analyses based on concurrent reactions

    Energy Technology Data Exchange (ETDEWEB)

    Glover, J S; Shepherd, B P

    1980-01-03

    The invention refers to a process for carrying out analyses based on concurrent (competitive) reactions. A part of the compound to be analysed is subjected, together with a standard quantity of this compound in labelled form, to a common reaction with a standard quantity of a reagent, which must be less than the sum of the two parts of the reacting compound. The remaining labelled starting compound and the labelled final compound resulting from the competition are separated in a tube (e.g. by centrifuging) after a forced phase change (precipitation, absorption, etc.), and the radioactivity of the two phases in contact is measured separately. The shielded measuring device developed for this purpose, suitable for centrifuge tubes of known dimensions, is also included in the patent claims. The insulin concentration of a defined serum is measured as an example of the applications of the method (radioimmunoassay).

  5. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    Science.gov (United States)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis in the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map land cover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, land cover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database, and the logistic regression coefficient of each factor was computed. The landslide hazard was then analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were compared with the field-verified landslide locations. Among the three cases of applying logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross-application of logistic regression coefficients to the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross

  6. Item Response Theory Modeling and Categorical Regression Analyses of the Five-Factor Model Rating Form: A Study on Italian Community-Dwelling Adolescent Participants and Adult Participants.

    Science.gov (United States)

    Fossati, Andrea; Widiger, Thomas A; Borroni, Serena; Maffei, Cesare; Somma, Antonella

    2017-06-01

    To extend the evidence on the reliability and construct validity of the Five-Factor Model Rating Form (FFMRF) in its self-report version, two independent samples of Italian participants, which were composed of 510 adolescent high school students and 457 community-dwelling adults, respectively, were administered the FFMRF in its Italian translation. Adolescent participants were also administered the Italian translation of the Borderline Personality Features Scale for Children-11 (BPFSC-11), whereas adult participants were administered the Italian translation of the Triarchic Psychopathy Measure (TriPM). Cronbach α values were consistent with previous findings; in both samples, average interitem r values indicated acceptable internal consistency for all FFMRF scales. A multidimensional graded item response theory model indicated that the majority of FFMRF items had adequate discrimination parameters; information indices supported the reliability of the FFMRF scales. Both categorical (i.e., item-level) and scale-level regression analyses suggested that the FFMRF scores may predict a nonnegligible amount of variance in the BPFSC-11 total score in adolescent participants, and in the TriPM scale scores in adult participants.

  7. Regression-Based Norms for the Symbol Digit Modalities Test in the Dutch Population: Improving Detection of Cognitive Impairment in Multiple Sclerosis?

    Science.gov (United States)

    Burggraaff, Jessica; Knol, Dirk L; Uitdehaag, Bernard M J

    2017-01-01

    Appropriate and timely screening instruments that sensitively capture the cognitive functioning of multiple sclerosis (MS) patients are the need of the hour. We evaluated newly derived regression-based norms for the Symbol Digit Modalities Test (SDMT) in a Dutch-speaking sample, as an indicator of the cognitive state of MS patients. Regression-based norms for the SDMT were created from a healthy control sample (n = 96) and used to convert MS patients' (n = 157) raw scores to demographically adjusted Z-scores, correcting for the effects of age, age², gender, and education. Conventional and regression-based norms were compared on their impairment-classification rates and related to other neuropsychological measures. The regression analyses revealed that age was the only significantly influencing demographic in our healthy sample. Regression-based norms for the SDMT more readily detected impairment in MS patients than conventional normalization methods (32 patients instead of 15). Patients changing from an SDMT-preserved to -impaired status (n = 17) were also impaired on other cognitive domains (p < 0.05), except for visuospatial memory (p = 0.34). Regression-based norms for the SDMT more readily detect abnormal performance in MS patients than conventional norms, identifying those patients at highest risk for cognitive impairment, which was supported by a worse performance on other neuropsychological measures. © 2017 S. Karger AG, Basel.

  8. Weighted functional linear regression models for gene-based association analysis.

    Science.gov (United States)

    Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I

    2018-01-01

    Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods where several differently informative components are combined, weights are introduced to give an advantage to the more informative components. Allele-specific weights have been introduced in collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights into functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes were detected using the weighted functional models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18×10⁻⁶) when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.

  9. Squeezeposenet: Image Based Pose Regression with Small Convolutional Neural Networks for Real Time Uas Navigation

    Science.gov (United States)

    Müller, M. S.; Urban, S.; Jutzi, B.

    2017-08-01

    The number of unmanned aerial vehicles (UAVs) is increasing, since low-cost airborne systems are available for a wide range of users. The outdoor navigation of such vehicles is mostly based on global navigation satellite system (GNSS) methods to obtain the vehicle's trajectory. The drawbacks of satellite-based navigation are failures caused by occlusions and multi-path interference. Besides this, local image-based solutions like Simultaneous Localization and Mapping (SLAM) and Visual Odometry (VO) can e.g. be used to support the GNSS solution by closing trajectory gaps, but are computationally expensive. However, if the trajectory estimation is interrupted or not available, a re-localization is mandatory. In this paper we provide a novel method for GNSS-free and fast image-based pose regression in a known area by utilizing a small convolutional neural network (CNN). With on-board processing in mind, we employ a lightweight CNN called SqueezeNet and use transfer learning to adapt the network to pose regression. Our experiments show promising results for GNSS-free and fast localization.
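    A hedged sketch of the transfer-learning setup in PyTorch (recent torchvision API assumed), not the authors' exact architecture: take a pretrained SqueezeNet, replace its classification head with a 7-output regression head (3 translation + 4 quaternion components), and normalize the quaternion part. Training loop and dataset are omitted.

```python
import torch
import torch.nn as nn
from torchvision import models

# Pretrained backbone (downloads ImageNet weights; use weights=None offline).
net = models.squeezenet1_1(weights=models.SqueezeNet1_1_Weights.DEFAULT)
# Swap the 1000-class classifier for a 7-output regression head.
net.classifier = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Conv2d(512, 7, kernel_size=1),
    nn.AdaptiveAvgPool2d((1, 1)),
)
net.eval()

img = torch.randn(1, 3, 224, 224)          # dummy input image
pose = net(img)                            # shape (1, 7) after internal flatten
t, q = pose[:, :3], nn.functional.normalize(pose[:, 3:], dim=1)
print(t.shape, q.shape)                    # translation and unit quaternion
```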

  10. Convergent Time-Varying Regression Models for Data Streams: Tracking Concept Drift by the Recursive Parzen-Based Generalized Regression Neural Networks.

    Science.gov (United States)

    Duda, Piotr; Jaworski, Maciej; Rutkowski, Leszek

    2018-03-01

    One of the greatest challenges in data mining is related to the processing and analysis of massive data streams. Contrary to traditional static data mining problems, data streams require that each element is processed only once, that the amount of allocated memory is constant, and that the models incorporate changes of the investigated streams. A vast majority of available methods have been developed for data stream classification, and only a few of them have attempted to solve regression problems, using various heuristic approaches. In this paper, we develop mathematically justified regression models working in a time-varying environment. More specifically, we study incremental versions of generalized regression neural networks, called IGRNNs, and we prove their tracking properties - weak (in probability) and strong (with probability one) convergence - assuming various concept drift scenarios. First, we present the IGRNNs, based on the Parzen kernels, for modeling stationary systems under nonstationary noise. Next, we extend our approach to modeling time-varying systems under nonstationary noise. We present several types of concept drift to be handled by our approach in such a way that weak and strong convergence holds under certain conditions. In a series of simulations, we compare our method with commonly used heuristic approaches based on forgetting mechanisms or sliding windows to deal with concept drift. Finally, we apply our concept in a real-life scenario, solving the problem of currency exchange rate prediction.

  11. Unconscious analyses of visual scenes based on feature conjunctions.

    Science.gov (United States)

    Tachibana, Ryosuke; Noguchi, Yasuki

    2015-06-01

    To efficiently process a cluttered scene, the visual system analyzes statistical properties or regularities of visual elements embedded in the scene. It is controversial, however, whether those scene analyses could also work for stimuli unconsciously perceived. Here we show that our brain performs unconscious scene analyses not only using a single featural cue (e.g., orientation) but also based on conjunctions of multiple visual features (e.g., combinations of color and orientation information). Subjects foveally viewed a stimulus array (duration: 50 ms) where 4 types of bars (red-horizontal, red-vertical, green-horizontal, and green-vertical) were intermixed. Although conscious perception of those bars was inhibited by a subsequent mask stimulus, the brain correctly analyzed the information about color, orientation, and color-orientation conjunctions of the invisible bars. The information about these features was then used for the unconscious configuration analysis (statistical processing) of the central bars, which induced a perceptual bias and illusory feature binding in visible stimuli at peripheral locations. While statistical analyses and feature binding are normally 2 key functions of the visual system for constructing coherent percepts of visual scenes, our results show that a high-level analysis combining those 2 functions is correctly performed by unconscious computations in the brain. (c) 2015 APA, all rights reserved.

  12. Effective behaviour change techniques for physical activity and healthy eating in overweight and obese adults; systematic review and meta-regression analyses.

    Science.gov (United States)

    Samdal, Gro Beate; Eide, Geir Egil; Barth, Tom; Williams, Geoffrey; Meland, Eivind

    2017-03-28

    This systematic review aims to explain the heterogeneity in results of interventions to promote physical activity and healthy eating for overweight and obese adults, by exploring the differential effects of behaviour change techniques (BCTs) and other intervention characteristics. The inclusion criteria specified RCTs with ≥ 12 weeks' duration, from January 2007 to October 2014, for adults (mean age ≥ 40 years, mean BMI ≥ 30). Primary outcomes were measures of healthy diet or physical activity. Two reviewers rated study quality, coded the BCTs, and collected outcome results at short (≤6 months) and long term (≥12 months). Meta-analyses and meta-regressions were used to estimate effect sizes (ES), heterogeneity indices (I²) and regression coefficients. We included 48 studies containing a total of 82 outcome reports. The 32 long-term reports had an overall ES = 0.24 with 95% confidence interval (CI): 0.15 to 0.33 and I² = 59.4%. The 50 short-term reports had an ES = 0.37 with 95% CI: 0.26 to 0.48, and I² = 71.3%. The number of BCTs unique to the intervention group, and the BCTs goal setting and self-monitoring of behaviour, predicted the effect at short and long term. The total number of BCTs in both intervention arms and use of the BCTs goal setting of outcome, feedback on outcome of behaviour, implementing graded tasks, and adding objects to the environment, e.g. using a step counter, significantly predicted the effect at long term. Setting a goal for change and the presence of reporting bias independently explained 58.8% of inter-study variation at short term. Autonomy-supportive and person-centred methods, as in Motivational Interviewing, the BCT goal setting of behaviour, and receiving feedback on the outcome of behaviour explained all of the between-study variation in effects at long term. There are similarities, but also differences, in effective BCTs promoting change in healthy eating and physical activity and

  13. Quadratic Regression-based Non-uniform Response Correction for Radiochromic Film Scanners

    International Nuclear Information System (INIS)

    Jeong, Hae Sun; Kim, Chan Hyeong; Han, Young Yih; Kum, O Yeon

    2009-01-01

    In recent years, several types of radiochromic film have been extensively used for two-dimensional dose measurements, such as dosimetry in radiotherapy as well as imaging and radiation protection applications. One of the critical aspects of radiochromic film dosimetry is accurate readout by the scanner without dose distortion. However, most charge-coupled device (CCD) scanners used for the optical density readout of the film employ a fluorescent lamp or a cold-cathode lamp as a light source, which leads to a significant amount of light scattering in the active layer of the film. Due to this light scattering, dose distortions with non-uniform responses are produced even when the dose is uniformly irradiated onto the film. In order to correct the distorted doses, a method based on correction factors (CF) has been reported and used. However, predicting the real incident doses is difficult when arbitrary doses are delivered to the film, since dose correction with the CF-based method is restricted to cases where the incident doses are already known. In a previous study, therefore, a pixel-based algorithm with linear regression was developed to correct the dose distortion of a flatbed scanner and to estimate the initial doses. The results, however, were not very good in some cases, especially when the incident dose was under approximately 100 cGy. In the present study, this problem was addressed by replacing the linear regression with quadratic regression. The corrected doses using this method were also compared with the results of other conventional methods.
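    A hedged sketch of the pixel-wise idea on toy numbers: at a given pixel position, calibration exposures give (scanner reading, delivered dose) pairs, and a quadratic regression from reading to dose corrects new readings at that position. Real use repeats this per pixel; the response curve below is invented.

```python
import numpy as np

true_dose = np.array([0, 50, 100, 200, 300, 400.0])        # cGy delivered
reading = 0.9 * true_dose + 4e-4 * true_dose**2 + 5.0       # distorted response (toy)

c = np.polyfit(reading, true_dose, deg=2)   # quadratic regression, reading -> dose
print(np.polyval(c, 190.0))                 # corrected dose for a new reading
```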

  14. Time series regression-based pairs trading in the Korean equities market

    Science.gov (United States)

    Kim, Saejoon; Heo, Jun

    2017-07-01

    Pairs trading is an instance of statistical arbitrage that relies on heavy quantitative data analysis to profit by capitalising on low-risk trading opportunities provided by anomalies of related assets. A key element in pairs trading is the rule by which open and close trading triggers are defined. This paper investigates the use of time series regression to define this rule, which has previously been identified with fixed threshold-based approaches. Empirical results indicate that our approach may yield significantly increased excess returns compared to those obtained by previous approaches on large-capitalisation stocks in the Korean equities market.
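    A hedged sketch of regression-defined triggers on simulated cointegrated prices: a rolling regression of one leg on the other defines the spread, and z-scores of the spread open and close positions. The window length and thresholds are illustrative choices, not the paper's rule.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(9)
x = np.cumsum(rng.normal(0, 1, 500)) + 100          # asset A price proxy
y = 1.5 * x + rng.normal(0, 1, 500)                 # cointegrated asset B

df = pd.DataFrame({"x": x, "y": y})
win = 60
beta = df["y"].rolling(win).cov(df["x"]) / df["x"].rolling(win).var()
spread = df["y"] - beta * df["x"]                   # rolling regression residual
z = (spread - spread.rolling(win).mean()) / spread.rolling(win).std()

open_short = z > 2.0      # short the spread (sell y, buy x)
open_long = z < -2.0      # long the spread
close_pos = z.abs() < 0.5 # unwind when the spread reverts
print(z.tail())
```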

  15. Ordinal Regression Based Subpixel Shift Estimation for Video Super-Resolution

    Directory of Open Access Journals (Sweden)

    Petrovic Nemanja

    2007-01-01

    Full Text Available We present a supervised learning-based approach for subpixel motion estimation which is then used to perform video super-resolution. The novelty of this work is the formulation of the problem of subpixel motion estimation in a ranking framework. The ranking formulation is a variant of classification and regression formulation, in which the ordering present in class labels namely, the shift between patches is explicitly taken into account. Finally, we demonstrate the applicability of our approach on superresolving synthetically generated images with global subpixel shifts and enhancing real video frames by accounting for both local integer and subpixel shifts.

  16. An Application to the Prediction of LOD Change Based on General Regression Neural Network

    Science.gov (United States)

    Zhang, X. H.; Wang, Q. J.; Zhu, J. J.; Zhang, H.

    2011-07-01

    Traditional prediction of the LOD (length of day) change was based on linear models, such as the least square model and the autoregressive technique, etc. Due to the complex non-linear features of the LOD variation, the performances of the linear model predictors are not fully satisfactory. This paper applies a non-linear neural network - general regression neural network (GRNN) model to forecast the LOD change, and the results are analyzed and compared with those obtained with the back propagation neural network and other models. The comparison shows that the performance of the GRNN model in the prediction of the LOD change is efficient and feasible.
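    A minimal GRNN sketch (equivalent to Nadaraya-Watson kernel regression): the prediction is a Gaussian-weighted average of training targets. A toy quasi-periodic series stands in for LOD data, and the smoothing width sigma is an assumption.

```python
import numpy as np

def grnn_predict(X_train, y_train, x_query, sigma=0.5):
    """GRNN output: Gaussian-kernel weighted average of the training targets."""
    d2 = np.sum((X_train - x_query) ** 2, axis=1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    return np.sum(w * y_train) / (np.sum(w) + 1e-12)

t = np.arange(200, dtype=float)
lod = 2.0 * np.sin(2 * np.pi * t / 365.25) + 0.5 * np.sin(2 * np.pi * t / 27.3)

win = 10   # predict the next value from the last `win` values
X = np.array([lod[i:i + win] for i in range(len(lod) - win - 1)])
y = lod[win:-1]
print(grnn_predict(X, y, lod[-win - 1:-1]))  # one-step-ahead prediction
```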

  17. Distinguishing between Rural and Urban Road Segment Traffic Safety Based on Zero-Inflated Negative Binomial Regression Models

    Directory of Open Access Journals (Sweden)

    Xuedong Yan

    2012-01-01

    In this study, the traffic crash rate, total crash frequency, and injury and fatal crash frequency were taken into consideration for distinguishing between rural and urban road segment safety. GIS-based crash data covering four and a half years in the Pikes Peak area, US, were applied for the analyses. The comparative statistical results show that the crash rates in rural segments are consistently lower than in urban segments. Further, the regression results based on Zero-Inflated Negative Binomial (ZINB) regression models indicate that urban areas have a higher crash risk in terms of both total crash frequency and injury and fatal crash frequency, compared to rural areas. Additionally, it is found that crash frequencies increase as traffic volume and segment length increase, though higher traffic volumes lower the likelihood of severe crash occurrence; compared to 2-lane roads, 4-lane roads have lower crash frequencies but a higher probability of severe crash occurrence; and better road facilities with higher free-flow speeds benefit from high-standard design features, resulting in a lower total crash frequency, but they cannot mitigate the severe crash risk.
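    A hedged sketch of a ZINB fit with statsmodels on simulated segment data; the covariates (log AADT, log segment length, an urban flag) mirror the abstract, but the specification and data are invented, and convergence settings may need tuning on real data.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

rng = np.random.default_rng(11)
n = 1000
aadt = rng.uniform(1e3, 3e4, n)          # annual average daily traffic
length = rng.uniform(0.1, 5.0, n)        # segment length (miles)
urban = rng.integers(0, 2, n)            # 1 = urban segment

mu = np.exp(-6 + 0.6 * np.log(aadt) + 0.8 * np.log(length) + 0.3 * urban)
crashes = rng.poisson(mu) * (rng.random(n) > 0.3)   # extra structural zeros

X = sm.add_constant(np.column_stack([np.log(aadt), np.log(length), urban]))
model = ZeroInflatedNegativeBinomialP(crashes, X, exog_infl=np.ones((n, 1)), p=2)
res = model.fit(method="bfgs", maxiter=500, disp=False)
print(res.params)
```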

  18. Vocational Teachers and Professionalism - A Model Based on Empirical Analyses

    DEFF Research Database (Denmark)

    Duch, Henriette Skjærbæk; Andreasen, Karen E

    Several theorists have developed models to illustrate the processes of adult learning and professional development (e.g. Illeris, Argyris, Engeström; Wahlgren & Aarkorg, Kolb and Wenger). Models can sometimes be criticized...... emphasis on the adult employee, the organization, its surroundings as well as other contextual factors. Our concern is adult vocational teachers attending a pedagogical course and teaching at vocational colleges. The aim of the paper is to discuss different models and develop a model concerning teachers...... at vocational colleges based on empirical data in a specific context, a vocational teacher-training course in Denmark. By offering a basis and concepts for the analysis of practice, such a model is meant to support the development of vocational teachers' professionalism at courses and in organizational contexts...

  19. A robust background regression based score estimation algorithm for hyperspectral anomaly detection

    Science.gov (United States)

    Zhao, Rui; Du, Bo; Zhang, Liangpei; Zhang, Lefei

    2016-12-01

    Anomaly detection has become a hot topic in the hyperspectral image analysis and processing fields in recent years. The most important issue for hyperspectral anomaly detection is background estimation and suppression. Unreasonable or non-robust background estimation usually leads to unsatisfactory anomaly detection results. Furthermore, the inherent nonlinearity of hyperspectral images may cover up the intrinsic data structure in anomaly detection. In order to implement robust background estimation, as well as to explore the intrinsic data structure of the hyperspectral image, we propose a robust background regression based score estimation algorithm (RBRSE) for hyperspectral anomaly detection. The Robust Background Regression (RBR) is essentially a label assignment procedure which segments the hyperspectral data into a robust background dataset and a potential anomaly dataset with an intersection boundary. In the RBR, a kernel expansion technique, which explores the nonlinear structure of the hyperspectral data in a reproducing kernel Hilbert space, is utilized to formulate the data as a density feature representation. A minimum squared loss relationship is constructed between the data density feature and the corresponding assigned labels of the hyperspectral data, to form the foundation of the regression. Furthermore, a manifold regularization term, which exploits the manifold smoothness of the hyperspectral data, and a maximization term of the robust background average density, which suppresses the bias caused by potential anomalies, are jointly appended in the RBR procedure. After this, a paired-dataset based k-NN score estimation method is applied to the robust background and potential anomaly datasets to produce the detection output. The experimental results show that RBRSE achieves superior ROC curves, AUC values, and background-anomaly separation compared with some other state-of-the-art anomaly detection methods, and is easy to implement.

  20. Correcting for cryptic relatedness by a regression-based genomic control method

    Directory of Open Access Journals (Sweden)

    Yang Yaning

    2009-12-01

    Full Text Available Abstract Background The genomic control (GC) method is a useful tool for correcting for cryptic relatedness in population-based association studies. It was originally proposed for correcting the variance inflation of the Cochran-Armitage additive trend test by using information from unlinked null markers, and was later generalized to be applicable to other tests with the additional requirement that the null markers be matched with the candidate marker in allele frequencies. However, matching allele frequencies limits the number of available null markers and thus limits the applicability of the GC method. On the other hand, errors in genotype/allele frequencies may cause further bias and variance inflation and thereby aggravate the effect of GC correction. Results In this paper, we propose a regression-based GC method using null markers that are not necessarily matched in allele frequencies with the candidate marker. Variation in the allele frequencies of the null markers is adjusted by a regression method. Conclusion The proposed method can be readily applied to Cochran-Armitage trend tests other than the additive trend test, to Pearson's chi-square test, and to other robust efficiency tests. Simulation results show that the proposed method is effective in controlling type I error in the presence of population substructure.
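
    As an illustrative sketch of the mechanism only (not the authors' estimator), the snippet below computes the Cochran-Armitage additive trend statistic — which equals N times the squared genotype-phenotype correlation — at simulated unlinked null markers, regresses the null statistics on allele frequency, and deflates a candidate marker's statistic by the inflation predicted at its own frequency; the quadratic frequency adjustment and all data are assumptions.

        import numpy as np

        def trend_stat(g, y):
            """Cochran-Armitage additive trend test: N * corr(g, y)^2 ~ chi2(1) under H0."""
            r = np.corrcoef(g, y)[0, 1]
            return len(g) * r ** 2

        rng = np.random.default_rng(1)
        n_ind, n_null = 600, 200
        y = rng.integers(0, 2, n_ind)                     # case/control labels
        mafs = rng.uniform(0.05, 0.5, n_null)             # null-marker frequencies
        stats = np.array([trend_stat(rng.binomial(2, f, n_ind), y) for f in mafs])

        # Regress the null statistics on allele frequency (and its square), then
        # deflate the candidate statistic by the inflation predicted at its MAF.
        X = np.column_stack([np.ones(n_null), mafs, mafs ** 2])
        coef, *_ = np.linalg.lstsq(X, stats, rcond=None)
        cand = trend_stat(rng.binomial(2, 0.3, n_ind), y)
        inflation = max(np.array([1.0, 0.3, 0.3 ** 2]) @ coef, 1.0)
        print(cand / inflation)                           # GC-corrected statistic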

  1. GIS-based rare events logistic regression for mineral prospectivity mapping

    Science.gov (United States)

    Xiong, Yihui; Zuo, Renguang

    2018-02-01

    Mineralization is a special type of singularity event and can be considered a rare event, because within a specific study area the number of prospective locations (1s) is considerably smaller than the number of non-prospective locations (0s). In this study, GIS-based rare events logistic regression (RELR) was used to map mineral prospectivity in the southwestern Fujian Province, China. An odds ratio was used to measure the relative importance of the evidence variables with respect to mineralization. The results suggest that formations, granites, and skarn alterations, followed by faults and aeromagnetic anomalies, are the most important indicators for the formation of Fe-related mineralization in the study area. The prediction rate and the area under the curve (AUC) values show that areas with higher probability have a strong spatial relationship with the known mineral deposits. Comparing the results with original logistic regression (OLR) demonstrates that GIS-based RELR performs better than OLR. The prospectivity map obtained in this study benefits the search for skarn Fe-related mineralization in the study area.
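
    Rare-events logistic regression is often implemented by fitting on a sample in which the scarce 1s are enriched and then correcting the intercept back to the true prevalence (in the spirit of King and Zeng's prior correction). The sketch below is a hedged stand-in with invented evidence variables, not the paper's exact procedure.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(2)
        n = 2000
        X = rng.normal(size=(n, 3))                       # evidence layers (hypothetical)
        p = 1 / (1 + np.exp(-(-4 + X @ np.array([1.2, 0.8, 0.5]))))
        y = rng.binomial(1, p)                            # few prospective cells (1s)

        tau = y.mean()                                    # true prevalence of 1s
        keep = (y == 1) | (rng.random(n) < 0.2)           # subsample the abundant 0s
        clf = LogisticRegression().fit(X[keep], y[keep])

        ybar = y[keep].mean()                             # sample proportion of 1s
        clf.intercept_ -= np.log(((1 - tau) / tau) * (ybar / (1 - ybar)))
        prob = clf.predict_proba(X)[:, 1]                 # prospectivity scores
        print(prob.mean(), tau)                           # close after the correction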

  2. Robust Image Regression Based on the Extended Matrix Variate Power Exponential Distribution of Dependent Noise.

    Science.gov (United States)

    Luo, Lei; Yang, Jian; Qian, Jianjun; Tai, Ying; Lu, Gui-Fu

    2017-09-01

    Dealing with partial occlusion or illumination is one of the most challenging problems in image representation and classification. In this problem, the characterization of the representation error plays a crucial role. In most current approaches, the error matrix needs to be stretched into a vector and each element is assumed to be independently corrupted. This ignores the dependence between the elements of the error. In this paper, it is assumed that the error image caused by partial occlusion or illumination changes is a random matrix variate and follows the extended matrix variate power exponential distribution. This distribution has heavy-tailed regions and can be used to describe a matrix pattern of l×m-dimensional observations that are not independent. This paper reveals the essence of the proposed distribution: it actually alleviates the correlations between pixels in an error matrix E and makes E approximately Gaussian. On the basis of this distribution, we derive a Schatten p-norm-based matrix regression model with Lq regularization. The alternating direction method of multipliers is applied to solve this model. To obtain a closed-form solution in each step of the algorithm, two singular value function thresholding operators are introduced. In addition, the extended Schatten p-norm is utilized to characterize the distance between the test samples and classes in the design of the classifier. Extensive experimental results for image reconstruction and classification with structural noise demonstrate that the proposed algorithm works much more robustly than some existing regression-based methods.

  3. Hybrid data mining-regression for infrastructure risk assessment based on zero-inflated data

    International Nuclear Information System (INIS)

    Guikema, S.D.; Quiring, S.M.

    2012-01-01

    Infrastructure disaster risk assessment seeks to estimate the probability of a given customer or area losing service during a disaster, sometimes in conjunction with estimating the duration of each outage. This is often done on the basis of past data about the effects of similar events impacting the same or similar systems. In many situations this past performance data from infrastructure systems is zero-inflated; it has more zeros than can be appropriately modeled with standard probability distributions. The data are also often non-linear and exhibit threshold effects due to the complexities of infrastructure system performance. Standard zero-inflated statistical models such as zero-inflated Poisson and zero-inflated negative binomial regression models do not adequately capture these complexities. In this paper we develop a novel hybrid classification-tree/regression method for complex, zero-inflated data sets. We investigate its predictive accuracy on a large number of simulated data sets and then demonstrate its practical usefulness with an application to hurricane power outage risk assessment for a large utility, based on actual data from the utility. While formulated for infrastructure disaster risk assessment, this method is promising for data-driven analysis in other situations with zero-inflated, complex data exhibiting response thresholds.
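
    A generic two-stage stand-in for the hybrid idea (a tree decides whether any outages occur; a regressor fitted to the nonzero records predicts how many) can be sketched as below; this is an illustration under assumed data, not the authors' algorithm.

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.ensemble import RandomForestRegressor

        rng = np.random.default_rng(3)
        n = 3000
        X = rng.normal(size=(n, 4))                               # storm/system covariates
        occurs = rng.random(n) < 1 / (1 + np.exp(-2 * X[:, 0]))   # threshold-like onset
        counts = np.where(occurs, rng.poisson(np.exp(1 + 0.5 * X[:, 1])), 0)

        clf = DecisionTreeClassifier(max_depth=4).fit(X, counts > 0)
        reg = RandomForestRegressor(n_estimators=200).fit(X[counts > 0], counts[counts > 0])

        # Expected outages = P(any outage) * E[outages | at least one]
        expected = clf.predict_proba(X)[:, 1] * reg.predict(X)
        print(expected[:5])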

  4. hMuLab: A Biomedical Hybrid MUlti-LABel Classifier Based on Multiple Linear Regression.

    Science.gov (United States)

    Wang, Pu; Ge, Ruiquan; Xiao, Xuan; Zhou, Manli; Zhou, Fengfeng

    2017-01-01

    Many biomedical classification problems are multi-label by nature, e.g., a gene involved in a variety of functions or a patient with multiple diseases. The majority of existing classification algorithms assume that each sample has only one class label, and multi-label classification remains a challenge for biomedical researchers. This study proposes a novel multi-label learning algorithm, hMuLab, that integrates both feature-based and neighbor-based similarity scores. Its multiple linear regression modeling makes hMuLab capable of producing multiple label assignments for a query sample. Comparison results over six commonly used multi-label performance measurements suggest that hMuLab performs accurately and stably on the biomedical datasets, and may serve as a complement to the existing literature.
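
    The abstract's combination of a feature-based score with a neighbor-based score can be sketched roughly as follows: per-label linear regression supplies one score, the mean label vector of the k nearest training samples supplies the other, and their average is thresholded. The equal weighting and the 0.5 threshold are assumptions for illustration, not hMuLab's exact rule.

        import numpy as np
        from sklearn.linear_model import LinearRegression
        from sklearn.neighbors import NearestNeighbors

        rng = np.random.default_rng(4)
        X = rng.normal(size=(300, 10))                   # e.g. gene/patient features
        Y = (X @ rng.normal(size=(10, 4)) + rng.normal(size=(300, 4)) > 0).astype(float)

        reg = LinearRegression().fit(X, Y)               # feature-based label scores
        nn = NearestNeighbors(n_neighbors=5).fit(X)

        def predict_labels(Xq):
            s_feat = reg.predict(Xq)
            _, idx = nn.kneighbors(Xq)
            s_nbr = Y[idx].mean(axis=1)                  # neighbor-based label scores
            return ((s_feat + s_nbr) / 2 > 0.5).astype(int)

        print(predict_labels(X[:3]))                     # multiple labels per sample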

  5. Mixture model-based clustering and logistic regression for automatic detection of microaneurysms in retinal images

    Science.gov (United States)

    Sánchez, Clara I.; Hornero, Roberto; Mayo, Agustín; García, María

    2009-02-01

    Diabetic retinopathy is one of the leading causes of blindness and vision defects in developed countries. Early detection and diagnosis are crucial to avoid visual complications. Microaneurysms are the first ocular signs of the presence of this ocular disease. Their detection is of paramount importance for the development of a computer-aided diagnosis technique that permits a prompt diagnosis of the disease. However, the detection of microaneurysms in retinal images is a difficult task due to the wide variability that these images usually present in screening programs. We propose a statistical approach based on mixture model-based clustering and logistic regression which is robust to changes in the appearance of retinal fundus images. The method is evaluated on the public database of the Retinopathy Online Challenge in order to obtain an objective performance measure and to allow a comparative study with other proposed algorithms.

  6. Fault prediction for nonlinear stochastic system with incipient faults based on particle filter and nonlinear regression.

    Science.gov (United States)

    Ding, Bo; Fang, Huajing

    2017-05-01

    This paper is concerned with fault prediction for nonlinear stochastic systems with incipient faults. Based on the particle filter and a reasonable assumption about incipient faults, a modified fault estimation algorithm is proposed, and the system state is estimated simultaneously. Based on the modified fault estimation, an intuitive fault detection strategy is introduced. Once an incipient fault is detected, its parameters are identified by a nonlinear regression method. Then, based on the estimated parameters, the future fault signal can be predicted. Finally, the effectiveness of the proposed method is verified by simulations of a three-tank system.

  7. Computational neural network regression model for Host based Intrusion Detection System

    Directory of Open Access Journals (Sweden)

    Sunil Kumar Gautam

    2016-09-01

    Full Text Available The current scenario of gathering and storing information in secure systems is a challenging task due to increasing cyber-attacks. There exist computational neural network techniques designed for intrusion detection systems, which provide security to a single machine and to machines across an entire network. In this paper, we have used two types of computational neural network models, namely the Generalized Regression Neural Network (GRNN) model and the Multilayer Perceptron Neural Network (MPNN) model, for a Host-based Intrusion Detection System using log files generated by a single personal computer. The simulation results show the correctly classified percentage of normal and abnormal (intrusion) classes using a confusion matrix. On the basis of the results and discussion, we found that the Host-based Intrusion System Model (HISM) significantly improved the detection accuracy while retaining a minimum false alarm rate.
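
    A GRNN is essentially a Nadaraya-Watson kernel-weighted average whose only tunable parameter is the spread; a compact sketch on toy log-derived features (all values assumed) is shown below.

        import numpy as np

        class GRNN:
            """Generalized regression neural network: kernel-weighted average of targets."""
            def __init__(self, spread=0.5):
                self.spread = spread

            def fit(self, X, y):
                self.X, self.y = np.asarray(X, float), np.asarray(y, float)
                return self

            def predict(self, Xq):
                d2 = ((np.asarray(Xq, float)[:, None, :] - self.X[None]) ** 2).sum(-1)
                w = np.exp(-d2 / (2 * self.spread ** 2))   # Gaussian kernel weights
                return (w @ self.y) / w.sum(axis=1)

        rng = np.random.default_rng(5)
        X = rng.normal(size=(200, 6))                      # features from log files
        y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(float)     # 1 = intrusion, 0 = normal
        scores = GRNN(spread=0.7).fit(X, y).predict(X)
        print(((scores > 0.5) == y.astype(bool)).mean())   # resubstitution accuracy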

  8. Evaluation of the comprehensive palatability of Japanese sake paired with dishes by multiple regression analysis based on subdomains.

    Science.gov (United States)

    Nakamura, Ryo; Nakano, Kumiko; Tamura, Hiroyasu; Mizunuma, Masaki; Fushiki, Tohru; Hirata, Dai

    2017-08-01

    Many factors contribute to palatability. In order to evaluate the palatability of Japanese sake paired with certain dishes by integrating multiple factors, we applied an evaluation method previously reported for the palatability of cheese: multiple regression analysis based on 3 subdomain factors (rewarding, cultural, and informational). We asked 94 Japanese participants to evaluate the palatability of sake (1st evaluation/E1 for the first cup; 2nd/E2 and 3rd/E3 for the palatability with the aftertaste/afterglow of certain dishes) and to respond to a questionnaire related to the 3 subdomains. In E1, 3 factors were extracted by factor analysis, and the subsequent multiple regression analyses indicated that the palatability of sake was explained mainly by the rewarding factor. Further, the results of attribution dissections in E1 indicated that 2 factors (rewarding and informational) contributed to the palatability. Finally, our results indicated that the palatability of sake was influenced by the dish eaten just before drinking.

  9. Univariate and multiple linear regression analyses for 23 single nucleotide polymorphisms in 14 genes predisposing to chronic glomerular diseases and IgA nephropathy in Han Chinese.

    Science.gov (United States)

    Wang, Hui; Sui, Weiguo; Xue, Wen; Wu, Junyong; Chen, Jiejing; Dai, Yong

    2014-09-01

    Immunoglobulin A nephropathy (IgAN) is a complex trait regulated by the interaction among multiple physiologic regulatory systems and probably involving numerous genes, which leads to inconsistent findings in genetic studies. One possible reason for the failure to replicate some single-locus results is that the underlying genetics of IgAN is based on multiple genes with minor effects. To study the association between 23 single nucleotide polymorphisms (SNPs) in 14 genes predisposing to chronic glomerular diseases and IgAN in Han males, the genotypes of the 23 SNPs in 21 Han males were detected with a BaiO gene chip, and their associations were analyzed with univariate analysis and multiple linear regression analysis. The analysis showed that CTLA4 rs231726 and CR2 rs1048971 revealed a significant association with IgAN. These findings support the multi-gene nature of the etiology of IgAN and propose a potential gene-gene interactive model for future studies.

  10. Real-time prediction of respiratory motion based on local regression methods

    International Nuclear Information System (INIS)

    Ruan, D; Fessler, J A; Balter, J M

    2007-01-01

    Recent developments in modulation techniques enable conformal delivery of radiation doses to small, localized target volumes. One of the challenges in using these techniques is real-time tracking and prediction of target motion, which is necessary to accommodate system latencies. For image-guided-radiotherapy systems, it is also desirable to minimize sampling rates to reduce imaging dose. This study focuses on predicting respiratory motion, which can significantly affect the position of lung tumours. Predicting respiratory motion in real time is challenging, due to the complexity of breathing patterns and the many sources of variability. We propose a prediction method based on local regression. There are three major ingredients of this approach: (1) forming an augmented state space to capture system dynamics, (2) local regression in the augmented space to train the predictor from previous observation data using the semi-periodicity of respiratory motion, and (3) local weighting adjustment to incorporate fading temporal correlations. To evaluate prediction accuracy, we computed the root mean square error between predicted tumour motion and its observed location for ten patients. For comparison, we also applied commonly used predictive methods, namely linear prediction, neural networks and Kalman filtering, to the same data. The proposed method reduced the prediction error for all imaging rates and latency lengths, particularly for long prediction lengths.
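
    The three ingredients can be sketched in a few lines: delay-embed the breathing trace into an augmented state, weight past states by similarity to the current one, and fit a weighted local linear map from state to future position. Embedding dimension, horizon, neighbor count and the weighting kernel below are illustrative assumptions.

        import numpy as np

        def predict_ahead(x, dim=4, horizon=5, k=20):
            # Augmented states (x_t, x_{t-1}, ..., x_{t-dim+1}) paired with x_{t+horizon}
            states = np.column_stack([x[dim - 1 - j: len(x) - horizon - j] for j in range(dim)])
            targets = x[dim - 1 + horizon:]
            query = x[-dim:][::-1]                          # current augmented state
            d = np.linalg.norm(states - query, axis=1)
            idx = np.argsort(d)[:k]                         # k most similar past states
            sw = np.sqrt(np.exp(-d[idx] / (d[idx].mean() + 1e-12)))  # local weights
            A = np.column_stack([np.ones(k), states[idx]])
            coef, *_ = np.linalg.lstsq(A * sw[:, None], targets[idx] * sw, rcond=None)
            return np.concatenate([[1.0], query]) @ coef

        t = np.arange(400) * 0.1                            # semi-periodic toy trace
        trace = np.sin(0.8 * t) + 0.05 * np.random.default_rng(6).normal(size=t.size)
        print(predict_ahead(trace))                         # position 5 samples ahead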

  11. A robust regression based on weighted LSSVM and penalized trimmed squares

    International Nuclear Information System (INIS)

    Liu, Jianyong; Wang, Yong; Fu, Chengqun; Guo, Jie; Yu, Qin

    2016-01-01

    Least squares support vector machine (LS-SVM) for nonlinear regression is sensitive to outliers, a well-known drawback in machine learning. Weighted LS-SVM (WLS-SVM) overcomes this drawback by adding a weight to each training sample. However, as the number of outliers increases, the accuracy of WLS-SVM may decrease. In order to improve the robustness of WLS-SVM, a new robust regression method based on WLS-SVM and penalized trimmed squares (WLSSVM–PTS) is proposed. The algorithm comprises three main stages. First, initial parameters are obtained by least trimmed squares. Then, the significant outliers are identified and eliminated by the Fast-PTS algorithm. Finally, the remaining samples, which contain few outliers, are estimated by WLS-SVM. Statistical tests of experimental results carried out on numerical and real-world datasets show that the proposed WLSSVM–PTS is significantly more robust than LS-SVM, WLS-SVM and LSSVM–LTS.

  12. Waste generated in high-rise buildings construction: a quantification model based on statistical multiple regression.

    Science.gov (United States)

    Parisi Kern, Andrea; Ferreira Dias, Michele; Piva Kulakowski, Marlova; Paulo Gomes, Luciana

    2015-05-01

    Reducing construction waste is becoming a key environmental issue in the construction industry. The quantification of waste generation rates in the construction sector is an invaluable management tool in supporting mitigation actions. However, the quantification of waste can be a difficult process because of the specific characteristics and the wide range of materials used in different construction projects. Large variations are observed in the methods used to predict the amount of waste generated because of the range of variables involved in construction processes and the different contexts in which these methods are employed. This paper proposes a statistical model to determine the amount of waste generated in the construction of high-rise buildings by assessing the influence of the design process and the production system, often mentioned as the major culprits behind the generation of waste in construction. Multiple regression was used to conduct a case study based on multiple sources of data from eighteen residential buildings. The resulting statistical model related the dependent variable (the amount of waste generated) to independent variables associated with the design and the production system used. The best regression model obtained from the sample data had an adjusted R² value of 0.694, meaning that it explains approximately 69% of the variability in waste generation for similar constructions. Most independent variables showed a low determination coefficient when assessed in isolation, which emphasizes the importance of assessing their joint influence on the response (dependent) variable.
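
    The quantification step itself reduces to an ordinary multiple regression with an adjusted R-squared check; a minimal sketch with hypothetical design/production predictors and simulated data for eighteen buildings is given below.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(7)
        n = 18                                            # eighteen buildings, as in the study
        df = pd.DataFrame({
            "floor_area": rng.uniform(2e3, 2e4, n),       # m2, assumed design predictor
            "masonry_ratio": rng.uniform(0.1, 0.6, n),    # assumed production-system proxy
        })
        df["waste_t"] = 0.05 * df["floor_area"] * (1 + df["masonry_ratio"]) \
            + rng.normal(0, 80, n)                        # tonnes of waste (simulated)

        fit = smf.ols("waste_t ~ floor_area + masonry_ratio", data=df).fit()
        print(fit.rsquared_adj)                           # analogous to the reported 0.694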

  13. Screening for ketosis using multiple logistic regression based on milk yield and composition.

    Science.gov (United States)

    Kayano, Mitsunori; Kataoka, Tomoko

    2015-11-01

    Multiple logistic regression was applied to milk yield and composition data for 632 records of healthy cows and 61 records of ketotic cows in Hokkaido, Japan. The purpose was to diagnose ketosis based on milk yield and composition simultaneously. The cows were divided into two groups, since nutritional status, milk yield and composition are affected by parity: (1) multiparous, including 314 healthy cows and 45 ketotic cows, and (2) primiparous, including 318 healthy cows and 16 ketotic cows. Multiple logistic regression was applied to these groups separately. For multiparous cows, milk yield (kg/day/cow) and protein-to-fat (P/F) ratio in milk were significant factors (P < 0.05) for ketosis. For primiparous cows, lactose content (%), solid not fat (SNF) content (%) and milk urea nitrogen (MUN) content (mg/dl) were significantly associated with ketosis (P < 0.05). The resulting models for the diagnosis of ketosis provided sensitivity, specificity and AUC values of (1) 0.711, 0.726 and 0.781, and (2) 0.678, 0.767 and 0.738, respectively.
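
    A hedged sketch of the multiparous-cow screening model — logistic regression of ketosis status on milk yield and protein-to-fat ratio, judged by AUC — is shown below; the data and coefficients are simulated, not the Hokkaido records.

        import numpy as np
        import statsmodels.api as sm
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(8)
        n = 359                                          # 314 healthy + 45 ketotic
        milk_kg = rng.normal(32, 6, n)                   # milk yield (kg/day/cow)
        pf_ratio = rng.normal(0.85, 0.15, n)             # protein-to-fat ratio
        logit = 4 - 6 * pf_ratio - 0.05 * milk_kg        # low P/F raises ketosis risk
        y = rng.binomial(1, 1 / (1 + np.exp(-logit)))    # 1 = ketotic

        X = sm.add_constant(np.column_stack([milk_kg, pf_ratio]))
        fit = sm.Logit(y, X).fit(disp=False)
        print(roc_auc_score(y, fit.predict(X)))          # compare with the reported 0.781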

  14. Large biases in regression-based constituent flux estimates: causes and diagnostic tools

    Science.gov (United States)

    Hirsch, Robert M.

    2014-01-01

    It has been documented in the literature that, in some cases, widely used regression-based models can produce severely biased estimates of long-term mean river fluxes of various constituents. These models, estimated using sample values of concentration, discharge, and date, are used to compute estimated fluxes for a multiyear period at a daily time step. This study compares results of the LOADEST seven-parameter model, LOADEST five-parameter model, and the Weighted Regressions on Time, Discharge, and Season (WRTDS) model using subsampling of six very large datasets to better understand this bias problem. This analysis considers sample datasets for dissolved nitrate and total phosphorus. The results show that LOADEST-7 and LOADEST-5, although they often produce very nearly unbiased results, can produce highly biased results. This study identifies three conditions that can give rise to these severe biases: (1) lack of fit of the log of concentration vs. log discharge relationship, (2) substantial differences in the shape of this relationship across seasons, and (3) severely heteroscedastic residuals. The WRTDS model is more resistant to the bias problem than the LOADEST models but is not immune to them. Understanding the causes of the bias problem is crucial to selecting an appropriate method for flux computations. Diagnostic tools for identifying the potential for bias problems are introduced, and strategies for resolving bias problems are described.
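
    One bias mechanism behind such regression-based flux estimators can be reproduced in a few lines: exponentiating a fitted log-log regression underestimates the mean unless a retransformation correction (here Duan's smearing estimator, used as a generic stand-in) is applied. Purely simulated; not the LOADEST or WRTDS code.

        import numpy as np

        rng = np.random.default_rng(9)
        logq = rng.normal(2, 0.8, 2000)                    # log discharge
        logc = 0.5 + 0.7 * logq + rng.normal(0, 0.6, 2000) # log concentration

        A = np.column_stack([np.ones_like(logq), logq])
        coef, *_ = np.linalg.lstsq(A, logc, rcond=None)
        resid = logc - A @ coef
        naive = np.exp(A @ coef)                           # biased back-transform
        smear = np.exp(resid).mean()                       # Duan's smearing factor

        true_mean = np.exp(logc).mean()
        print(naive.mean() / true_mean)                    # < 1: systematic underestimate
        print(naive.mean() * smear / true_mean)            # ~ 1 after correction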

  15. Measurement error in epidemiologic studies of air pollution based on land-use regression models.

    Science.gov (United States)

    Basagaña, Xavier; Aguilera, Inmaculada; Rivera, Marcela; Agis, David; Foraster, Maria; Marrugat, Jaume; Elosua, Roberto; Künzli, Nino

    2013-10-15

    Land-use regression (LUR) models are increasingly used to estimate air pollution exposure in epidemiologic studies. These models use air pollution measurements taken at a small set of locations and modeling based on geographical covariates for which data are available at all study participant locations. The process of LUR model development commonly includes a variable selection procedure. When LUR model predictions are used as explanatory variables in a model for a health outcome, measurement error can lead to bias of the regression coefficients and to inflation of their variance. In previous studies dealing with spatial predictions of air pollution, bias was shown to be small while most of the effect of measurement error was on the variance. In this study, we show that in realistic cases where LUR models are applied to health data, bias in health-effect estimates can be substantial. This bias depends on the number of air pollution measurement sites, the number of available predictors for model selection, and the amount of explainable variability in the true exposure. These results should be taken into account when interpreting health effects from studies that used LUR models.

  16. Support Vector Regression-Based Adaptive Divided Difference Filter for Nonlinear State Estimation Problems

    Directory of Open Access Journals (Sweden)

    Hongjian Wang

    2014-01-01

    Full Text Available We present a support vector regression-based adaptive divided difference filter (SVRADDF) algorithm for improving the low state-estimation accuracy of nonlinear systems, which are typically affected by large initial estimation errors and imprecise prior knowledge of process and measurement noises. The derivative-free SVRADDF algorithm is significantly simpler to compute than other methods and is implemented using only functional evaluations. The SVRADDF algorithm involves the use of the theoretical and actual covariance of the innovation sequence. Support vector regression (SVR) is employed to generate the adaptive factor that tunes the noise covariance at each sampling instant when the measurement update step executes, which improves the algorithm’s robustness. The performance of the proposed algorithm is evaluated by estimating states for (i) an underwater non-maneuvering target bearing-only tracking system and (ii) maneuvering-target bearing-only tracking in an air-traffic control system. The simulation results show that the proposed SVRADDF algorithm exhibits better performance than a traditional DDF algorithm.

  17. Observer-Based and Regression Model-Based Detection of Emerging Faults in Coal Mills

    DEFF Research Database (Denmark)

    Odgaard, Peter Fogh; Lin, Bao; Jørgensen, Sten Bay

    2006-01-01

    In order to improve the reliability of power plants, it is important to detect faults as early as possible. In doing so, it is interesting to find the most efficient method. Since modeling of large-scale systems is time consuming, it is interesting to compare a model-based method with data-driven ones....

  18. Breast mass detection in mammography and tomosynthesis via fully convolutional network-based heatmap regression

    Science.gov (United States)

    Zhang, Jun; Cain, Elizabeth Hope; Saha, Ashirbani; Zhu, Zhe; Mazurowski, Maciej A.

    2018-02-01

    Breast mass detection in mammography and digital breast tomosynthesis (DBT) is an essential step in computerized breast cancer analysis. Deep learning-based methods incorporate feature extraction and model learning into a unified framework and have achieved impressive performance in various medical applications (e.g., disease diagnosis, tumor detection, and landmark detection). However, these methods require large-scale, accurately annotated data. Unfortunately, it is challenging to get precise annotations of breast masses. To address this issue, we propose a fully convolutional network (FCN)-based heatmap regression method for breast mass detection, using only weakly annotated mass regions in mammography images. Specifically, we first generate heatmaps of masses from the human-annotated rough regions. We then develop an FCN model for end-to-end heatmap regression with an F-score loss function, where the mammography images are the input and the heatmaps for breast masses are the output. Finally, the probability map of mass locations can be estimated with the trained model. Experimental results on a mammography dataset with 439 subjects demonstrate the effectiveness of our method. Furthermore, we evaluate whether mammography data can be used to improve detection models for DBT, since mammography shares similar structure with tomosynthesis. We propose a transfer learning strategy that fine-tunes the FCN model learned from mammography images. We test this approach on a small tomosynthesis dataset with only 40 subjects, and we show an improvement in detection performance compared to training the model from scratch.

  19. Trace analysis of acids and bases by conductometric titration with multiparametric non-linear regression.

    Science.gov (United States)

    Coelho, Lúcia H G; Gutz, Ivano G R

    2006-03-15

    A chemometric method for the analysis of conductometric titration data was introduced to extend its applicability to lower concentrations and more complex acid-base systems. Auxiliary pH measurements were made during the titration to assist the calculation of the distribution of protonatable species on the basis of known or estimated equilibrium constants. Conductivity values of each ionized or ionizable species possibly present in the sample were introduced into a general equation in which the only unknown parameters were the total concentrations of (conjugated) bases and of strong electrolytes not involved in acid-base equilibria. All these concentrations were adjusted by a multiparametric nonlinear regression (NLR) method based on the Levenberg-Marquardt algorithm. This first conductometric titration method with NLR analysis (CT-NLR) was successfully applied to simulated conductometric titration data and to synthetic samples with multiple components at concentrations as low as those found in rainwater (approximately 10 micromol L(-1)). It was possible to resolve and quantify mixtures containing a strong acid, formic acid, acetic acid, ammonium ion, bicarbonate and an inert electrolyte with an accuracy of 5% or better.
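
    The numerical core — Levenberg-Marquardt adjustment of unknown total concentrations until the modeled conductivity matches the titration curve — can be sketched as below; the two-parameter conductivity model is a deliberately simplified surrogate, not the paper's full speciation equation.

        import numpy as np
        from scipy.optimize import least_squares

        v = np.linspace(0, 2, 40)                          # titrant volume (mL)

        def conductivity(params, v):
            c_strong, c_weak = params                      # total concentrations to fit
            return 350 * np.clip(c_strong - 0.1 * v, 0, None) + 50 * c_weak / (1 + v)

        rng = np.random.default_rng(10)
        kappa_obs = conductivity([1.5, 0.8], v) + rng.normal(0, 0.5, v.size)

        fit = least_squares(lambda p: conductivity(p, v) - kappa_obs,
                            x0=[1.0, 1.0], method="lm")    # Levenberg-Marquardt
        print(fit.x)                                       # recovered concentrations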

  20. Dual Regression

    OpenAIRE

    Spady, Richard; Stouli, Sami

    2012-01-01

    We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while avoiding the need for repairing the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach introduces a mathematical programming characterization of conditional distribution f...

  1. Distance Based Root Cause Analysis and Change Impact Analysis of Performance Regressions

    Directory of Open Access Journals (Sweden)

    Junzan Zhou

    2015-01-01

    Full Text Available Performance regression testing is applied to uncover both performance and functional problems of software releases. A performance problem revealed by performance testing can manifest as high response time, low throughput, or even loss of service. A mature performance testing process helps systematically detect software performance problems. However, it is difficult to identify the root cause and evaluate the potential change impact. In this paper, we present an approach that leverages server-side logs to identify the root causes of performance problems. First, server-side logs are used to recover the call tree of each business transaction. We define a novel distance-based metric computed from call trees for root cause analysis and apply an inverted index from methods to business transactions for change impact analysis. Empirical studies show that our approach can effectively and efficiently help developers diagnose the root causes of performance problems.

  2. Focused information criterion and model averaging based on weighted composite quantile regression

    KAUST Repository

    Xu, Ganggang

    2013-08-13

    We study the focused information criterion and frequentist model averaging and their application to post-model-selection inference for weighted composite quantile regression (WCQR) in the context of additive partial linear models. With the non-parametric functions approximated by polynomial splines, we show that, under certain conditions, the asymptotic distribution of the frequentist model averaging WCQR estimator of a focused parameter is a non-linear mixture of normal distributions. This asymptotic distribution is used to construct confidence intervals that achieve the nominal coverage probability. With properly chosen weights, the focused information criterion based WCQR estimators are not only robust to outliers and non-normal residuals but can also achieve efficiency close to that of the maximum likelihood estimator, without assuming the true error distribution. Simulation studies and a real data analysis are used to illustrate the effectiveness of the proposed procedure.

  3. Forecasting systems reliability based on support vector regression with genetic algorithms

    International Nuclear Information System (INIS)

    Chen, K.-Y.

    2007-01-01

    This study applies a novel neural-network technique, support vector regression (SVR), to forecast reliability in engine systems. The aim of this study is to examine the feasibility of SVR for systems reliability prediction by comparing it with existing neural-network approaches and the autoregressive integrated moving average (ARIMA) model. To build an effective SVR model, SVR's parameters must be set carefully. This study proposes a novel approach, known as GA-SVR, which searches for SVR's optimal parameters using real-valued genetic algorithms, and then adopts the optimal parameters to construct the SVR models. Real reliability data for 40 sets of turbochargers were employed as the data set. The experimental results demonstrate that SVR outperforms the existing neural-network approaches and the traditional ARIMA models in terms of normalized root mean square error and mean absolute percentage error.
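
    As a hedged stand-in for GA-SVR (a cross-validated grid search replaces the genetic algorithm here, and the reliability series is toy data), tuning C, epsilon and gamma for one-step-ahead SVR prediction looks like this:

        import numpy as np
        from sklearn.svm import SVR
        from sklearn.model_selection import GridSearchCV

        rng = np.random.default_rng(11)
        t = np.arange(60, dtype=float)
        reliability = np.exp(-t / 80) + 0.02 * rng.normal(size=t.size)  # toy series

        X, y = t[:-1, None], reliability[1:]               # one-step-ahead regression
        search = GridSearchCV(
            SVR(kernel="rbf"),
            {"C": [1, 10, 100], "epsilon": [0.001, 0.01], "gamma": [0.01, 0.1, 1.0]},
            cv=5,
        )
        search.fit(X, y)
        print(search.best_params_)                         # tuned SVR hyperparameters
        print(search.predict(np.array([[60.0]])))          # forecast the next value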

  4. Multiple Linear Regression Model Based on Neural Network and Its Application in the MBR Simulation

    Directory of Open Access Journals (Sweden)

    Chunqing Li

    2012-01-01

    Full Text Available Computer simulation of the membrane bioreactor (MBR) has become a focus of MBR research. In order to compensate for the drawbacks of physical experiments, for example, long test periods, high cost, sealed and thus unobservable equipment, and so forth, on the basis of an in-depth study of the mathematical model of the MBR and combining it with neural network theory, this paper proposes a three-dimensional simulation system for MBR wastewater treatment with fast speed, high efficiency, and good visualization. The system is researched and developed with hybrid programming in the VC++ programming language and OpenGL, with a neural-network-based multifactor linear regression model of the factors affecting MBR membrane flux, applying an integer (rather than floating-point) modeling method and quadtree recursion. The experiments show that the three-dimensional simulation system, using the above models and methods, offers inspiration and a reference for future research and application of MBR simulation technology.

  5. Earthquake prediction in California using regression algorithms and cloud-based big data infrastructure

    Science.gov (United States)

    Asencio-Cortés, G.; Morales-Esteban, A.; Shang, X.; Martínez-Álvarez, F.

    2018-06-01

    Earthquake magnitude prediction is a challenging problem that has been widely studied during the last decades. Statistical, geophysical and machine learning approaches can be found in the literature, with no particularly satisfactory results. In recent years, powerful computational techniques to analyze big data have emerged, making possible the analysis of massive datasets. These new methods make use of physical resources such as cloud-based architectures. California is known for being one of the regions with the highest seismic activity in the world, and many data are available. In this work, the use of several regression algorithms combined with ensemble learning is explored in the context of big data (a 1 GB catalog is used), in order to predict earthquake magnitudes within the next seven days. The Apache Spark framework, the H2O library in the R language and Amazon cloud infrastructure were used, reporting very promising results.

  6. Operational Satellite-based Surface Oil Analyses (Invited)

    Science.gov (United States)

    Streett, D.; Warren, C.

    2010-12-01

    During the Deepwater Horizon spill, NOAA imagery analysts in the Satellite Analysis Branch (SAB) issued more than 300 near-real-time satellite-based oil spill analyses. These analyses were used by the oil spill response community for planning, issuing surface oil trajectories and tasking assets (e.g., oil containment booms, skimmers, overflights). SAB analysts used both Synthetic Aperture Radar (SAR) and high resolution visible/near IR multispectral satellite imagery as well as a variety of ancillary datasets. Satellite imagery used included ENVISAT ASAR (ESA), TerraSAR-X (DLR), Cosmo-Skymed (ASI), ALOS (JAXA), Radarsat (MDA), ENVISAT MERIS (ESA), SPOT (SPOT Image Corp.), Aster (NASA), MODIS (NASA), and AVHRR (NOAA). Ancillary datasets included ocean current information, wind information, location of natural oil seeps and a variety of in situ oil observations. The analyses were available as jpegs, pdfs, shapefiles and through Google, KML files and also available on a variety of websites including Geoplatform and ERMA. From the very first analysis issued just 5 hours after the rig sank through the final analysis issued in August, the complete archive is still publicly available on the NOAA/NESDIS website http://www.ssd.noaa.gov/PS/MPS/deepwater.html SAB personnel also served as the Deepwater Horizon International Disaster Charter Project Manager (at the official request of the USGS). The Project Manager’s primary responsibility was to acquire and oversee the processing and dissemination of satellite data generously donated by numerous private companies and nations in support of the oil spill response including some of the imagery described above. SAB has begun to address a number of goals that will improve our routine oil spill response as well as help assure that we are ready for the next spill of national significance. We hope to (1) secure a steady, abundant and timely stream of suitable satellite imagery even in the absence of large-scale emergencies such as

  7. Financial analysis and forecasting of the results of small businesses performance based on regression model

    Directory of Open Access Journals (Sweden)

    Svetlana O. Musienko

    2017-03-01

    Full Text Available Objective: to develop an economic-mathematical model of the dependence of revenue on other balance sheet items, taking into account the sectoral affiliation of the companies. Methods: using comparative analysis, the article studies the existing approaches to the construction of company management models. Applying regression analysis and the least squares method, which is widely used for the financial management of enterprises in Russia and abroad, the author builds a model of the dependence of revenue on other balance sheet items, taking into account the sectoral affiliation of the companies, which can be used in the financial analysis and prediction of small enterprises' performance. Results: the article states the need to identify factors affecting financial management efficiency. The author analyzed scientific research and revealed the lack of comprehensive studies on the methodology for assessing the management of small enterprises, while the methods used for large companies are not always suitable for the task. The systematized approaches of various authors to the formation of regression models describe the influence of certain factors on company activity. It is revealed that the resulting indicators in the studies were revenue, profit, or the company's relative profitability. The main drawback of most models is the mathematical, not economic, approach to the definition of the dependent and independent variables. Based on the analysis, it was determined that the most correct model is that of the dependence between revenue and total assets of the company using the decimal logarithm. The model was built using data on the activities of 507 small businesses operating in three spheres of economic activity. Using the presented model, it was proved that there is a direct dependence between sales proceeds and the main items of the asset balance, as well as differences in the degree of this effect depending on the economic activity of small

  8. Principal components based support vector regression model for on-line instrument calibration monitoring in NPPs

    International Nuclear Information System (INIS)

    Seo, In Yong; Ha, Bok Nam; Lee, Sung Woo; Shin, Chang Hoon; Kim, Seong Jun

    2010-01-01

    In nuclear power plants (NPPs), periodic sensor calibrations are required to assure that sensors are operating correctly. Because the sensors' operating status is checked only at every fuel outage, faulty sensors may remain undetected for periods of up to 24 months. Moreover, typically only a few of the calibrated sensors are actually found to be faulty. For the safe operation of NPPs and the reduction of unnecessary calibration, on-line instrument calibration monitoring is needed. In this study, principal component based auto-associative support vector regression (PCSVR) using response surface methodology (RSM) is proposed for the sensor signal validation of NPPs. This paper describes the design of a PCSVR-based sensor validation system for a power generation system. RSM is employed to determine the optimal values of the SVR hyperparameters and is compared to the genetic algorithm (GA). The proposed PCSVR model is confirmed with actual plant data from Kori Nuclear Power Plant Unit 3 and is compared with the auto-associative support vector regression (AASVR) and auto-associative neural network (AANN) models. The auto-sensitivity of AASVR is improved by around six times by using PCA, resulting in good detection of sensor drift. Compared to AANN, accuracy and cross-sensitivity are better while the auto-sensitivity is almost the same. Meanwhile, the proposed RSM for the optimization of the PCSVR algorithm performs even better in terms of accuracy, auto-sensitivity, and averaged maximum error, except in averaged RMS error, and is much more time efficient compared to the conventional GA method.
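
    The auto-associative idea can be sketched generically: project the sensor vector onto principal components, train one SVR per sensor to reproduce its own value from the components, and flag drift via the residual between a sensor and its reconstruction. Hyperparameters and data below are illustrative, not the plant model.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.svm import SVR

        rng = np.random.default_rng(12)
        latent = rng.normal(size=(500, 2))                 # shared process state
        X = latent @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(500, 6))  # 6 sensors

        pca = PCA(n_components=2).fit(X)
        Z = pca.transform(X)
        models = [SVR(C=10.0).fit(Z, X[:, j]) for j in range(X.shape[1])]

        x_new = X[0].copy()
        x_new[3] += 0.5                                    # inject drift on sensor 3
        z_new = pca.transform(x_new[None])
        recon = np.array([m.predict(z_new)[0] for m in models])
        print(np.abs(x_new - recon))                       # largest residual flags sensor 3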

  9. A Logistic Regression Based Auto Insurance Rate-Making Model Designed for the Insurance Rate Reform

    Directory of Open Access Journals (Sweden)

    Zhengmin Duan

    2018-02-01

    Full Text Available Using a generalized linear model to determine the claim frequency of auto insurance is a key ingredient in non-life insurance research. Among auto insurance rate-making models, very few consider auto types. Therefore, in this paper we propose a model that takes auto types into account by making innovative use of the auto burden index. Based on this model and data from a Chinese insurance company, we built a clustering model that classifies auto insurance rates into three risk levels. The claim frequency and the claim costs were fitted to select a better loss distribution. Then the logistic regression model was employed to fit the claim frequency, with the auto burden index considered. Three key findings can be concluded from our study. First, more than 80% of the autos with an auto burden index of 20 or higher belong to the highest risk level. Secondly, the claim frequency is better fitted by the Poisson distribution, whereas the claim cost is better fitted by the Gamma distribution. Lastly, based on the AIC criterion, the claim frequency is more adequately represented by models that consider the auto burden index than by those that do not. It is believed that insurance policy recommendations based on generalized linear models (GLMs) can benefit from our findings.
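
    The rate-making core — claim frequency as a count GLM with the auto burden index as a covariate and policy exposure as an offset — can be sketched as below; the variable names and data are invented for illustration.

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(13)
        n = 5000
        df = pd.DataFrame({
            "burden_index": rng.uniform(0, 40, n),         # auto burden index
            "driver_age": rng.uniform(20, 70, n),
            "exposure": rng.uniform(0.2, 1.0, n),          # policy-years
        })
        lam = df["exposure"] * np.exp(-2.5 + 0.03 * df["burden_index"]
                                      - 0.005 * df["driver_age"])
        df["claims"] = rng.poisson(lam)

        fit = smf.glm("claims ~ burden_index + driver_age", data=df,
                      family=sm.families.Poisson(),
                      offset=np.log(df["exposure"])).fit()
        print(fit.params)                                  # higher burden -> more claims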

  10. A neutron spectrum unfolding code based on generalized regression artificial neural networks

    International Nuclear Information System (INIS)

    Ortiz R, J. M.; Martinez B, M. R.; Castaneda M, R.; Solis S, L. O.; Vega C, H. R.

    2015-10-01

    The most delicate part of neutron spectrometry is the unfolding process. The derivation of the spectral information is not simple because the unknown is not given directly as a result of the measurements. Novel methods based on artificial neural networks have been widely investigated. In prior works, back-propagation neural networks (BPNN) have been used to solve the neutron spectrometry problem; however, some drawbacks still exist with this kind of neural net, such as the optimal selection of the network topology and the long training time. Compared to a BPNN, a generalized regression neural network (GRNN) is usually much faster to train, mainly because the spread constant is the only parameter used in a GRNN. Another feature is that the network will converge to a global minimum. In addition, GRNNs are often more accurate than BPNNs in prediction. These characteristics make GRNNs of great interest in the neutron spectrometry domain. In this work, a computational tool based on a GRNN is presented, capable of solving the neutron spectrometry problem. This computational code automates the pre-processing, training and testing stages, the statistical analysis and the post-processing of the information, using the count rates of 7 Bonner spheres as the only input data. The code was designed for a Bonner sphere system based on a ⁶LiI(Eu) neutron detector and a response matrix expressed in 60 energy bins taken from an International Atomic Energy Agency compilation. (Author)

  11. Estimation of trabecular bone parameters in children from multisequence MRI using texture-based regression

    Energy Technology Data Exchange (ETDEWEB)

    Lekadir, Karim, E-mail: karim.lekadir@upf.edu; Hoogendoorn, Corné [Center for Computational Imaging and Simulation Technologies in Biomedicine, Universitat Pompeu Fabra, Barcelona 08018 (Spain); Armitage, Paul [The Academic Unit of Radiology, The University of Sheffield, Sheffield S10 2JF (United Kingdom); Whitby, Elspeth [The Academic Unit of Reproductive and Developmental Medicine, The University of Sheffield, Sheffield S10 2SF (United Kingdom); King, David [The Academic Unit of Child Health, The University of Sheffield, Sheffield S10 2TH (United Kingdom); Dimitri, Paul [The Mellanby Centre for Bone Research, The University of Sheffield, Sheffield S10 2RX (United Kingdom); Frangi, Alejandro F. [Center for Computational Imaging and Simulation Technologies in Biomedicine, The University of Sheffield, Sheffield S1 3JD (United Kingdom)

    2016-06-15

    Purpose: This paper presents a statistical approach for the prediction of trabecular bone parameters from low-resolution multisequence magnetic resonance imaging (MRI) in children, thus addressing the limitations of high-resolution modalities such as HR-pQCT, including the significant exposure of young patients to radiation and the limited applicability of such modalities to peripheral bones in vivo. Methods: A statistical predictive model is constructed from a database of MRI and HR-pQCT datasets to relate the low-resolution MRI appearance in the cancellous bone to the trabecular parameters extracted from the high-resolution images. The description of the MRI appearance is achieved across subjects by using a collection of feature descriptors, which describe the texture properties inside the cancellous bone and are invariant to the geometry and size of the trabecular areas. The predictive model is built by fitting to the training data a nonlinear partial least squares regression between the input MRI features and the output trabecular parameters. Results: Detailed validation based on a sample of 96 datasets shows correlations >0.7 between the trabecular parameters predicted from low-resolution multisequence MRI by the proposed statistical model and the values extracted from high-resolution HR-pQCT. Conclusions: The obtained results indicate the promise of the proposed predictive technique for the estimation of trabecular parameters in children from multisequence MRI, thus reducing the need for high-resolution radiation-based scans for a fragile population that is under development and growth.
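
    The regression step can be sketched with scikit-learn, where a linear partial least squares model stands in for the paper's nonlinear variant; texture extraction is omitted and the 96 datasets are simulated.

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression
        from sklearn.model_selection import cross_val_predict

        rng = np.random.default_rng(14)
        texture = rng.normal(size=(96, 30))               # 96 datasets, 30 texture features
        W = rng.normal(size=(30, 3))
        trabecular = texture @ W + 0.5 * rng.normal(size=(96, 3))  # e.g. BV/TV, Tb.N, Tb.Th

        pls = PLSRegression(n_components=5)
        pred = cross_val_predict(pls, texture, trabecular, cv=10)
        for j in range(3):                                # per-parameter correlation
            print(np.corrcoef(pred[:, j], trabecular[:, j])[0, 1])  # aim: > 0.7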

  12. A neutron spectrum unfolding code based on generalized regression artificial neural networks

    Energy Technology Data Exchange (ETDEWEB)

    Ortiz R, J. M.; Martinez B, M. R.; Castaneda M, R.; Solis S, L. O. [Universidad Autonoma de Zacatecas, Unidad Academica de Ingenieria Electrica, Av. Ramon Lopez Velarde 801, Col. Centro, 98000 Zacatecas, Zac. (Mexico); Vega C, H. R., E-mail: morvymm@yahoo.com.mx [Universidad Autonoma de Zacatecas, Unidad Academica de Estudios Nucleares, Cipres No. 10, Fracc. La Penuela, 98068 Zacatecas, Zac. (Mexico)

    2015-10-15

    The most delicate part of neutron spectrometry is the unfolding process. The derivation of the spectral information is not simple because the unknown is not given directly as a result of the measurements. Novel methods based on artificial neural networks have been widely investigated. In prior works, back-propagation neural networks (BPNN) have been used to solve the neutron spectrometry problem; however, some drawbacks still exist with this kind of neural net, such as the optimal selection of the network topology and the long training time. Compared to a BPNN, a generalized regression neural network (GRNN) is usually much faster to train, mainly because the spread constant is the only parameter used in a GRNN. Another feature is that the network will converge to a global minimum. In addition, GRNNs are often more accurate than BPNNs in prediction. These characteristics make GRNNs of great interest in the neutron spectrometry domain. In this work, a computational tool based on a GRNN is presented, capable of solving the neutron spectrometry problem. This computational code automates the pre-processing, training and testing stages, the statistical analysis and the post-processing of the information, using the count rates of 7 Bonner spheres as the only input data. The code was designed for a Bonner sphere system based on a ⁶LiI(Eu) neutron detector and a response matrix expressed in 60 energy bins taken from an International Atomic Energy Agency compilation. (Author)

  13. Single Image Super-Resolution Using Global Regression Based on Multiple Local Linear Mappings.

    Science.gov (United States)

    Choi, Jae-Seok; Kim, Munchurl

    2017-03-01

    Super-resolution (SR) has become more vital because of its capability to generate high-quality ultra-high definition (UHD) high-resolution (HR) images from low-resolution (LR) input images. Conventional SR methods entail high computational complexity, which makes them difficult to implement for up-scaling of full-high-definition input images into UHD-resolution images. Nevertheless, our previous super-interpolation (SI) method showed a good compromise between peak signal-to-noise ratio (PSNR) performance and computational complexity. However, since SI only utilizes simple linear mappings, it may fail to precisely reconstruct HR patches with complex texture. In this paper, we present a novel SR method which inherits the large-to-small patch conversion scheme from SI but uses global regression based on local linear mappings (GLM). Thus, our new SR method is called GLM-SI. In GLM-SI, each LR input patch is divided into 25 overlapped subpatches. Next, based on the local properties of these subpatches, 25 different local linear mappings are applied to the current LR input patch to generate 25 HR patch candidates, which are then regressed into one final HR patch using a global regressor. The local linear mappings are learned cluster-wise in our off-line training phase. The main contribution of this paper is as follows: previously, linear-mapping-based conventional SR methods, including SI, applied only one simple yet coarse linear mapping to each patch to reconstruct its HR version. On the contrary, for each LR input patch, our GLM-SI is the first to apply a combination of multiple local linear mappings, where each local linear mapping is found according to the local properties of the current LR patch. Therefore, it can better approximate nonlinear LR-to-HR mappings for HR patches with complex texture. Experiment results show that the proposed GLM-SI method outperforms most of the state-of-the-art methods, and shows comparable PSNR performance with much lower computational complexity.

  14. SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.

    Science.gov (United States)

    Chu, Annie; Cui, Jenny; Dinov, Ivo D

    2009-03-01

    The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include models commonly used in undergraduate statistics courses, such as linear models (simple linear regression, multiple linear regression, one-way and two-way ANOVA). In addition, we implemented tests for sample comparisons, such as the t-test in the parametric category, and the Wilcoxon rank sum test and the Kruskal-Wallis test in the non-parametric category. SOCR Analyses also includes several hypothesis test models, such as contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model, and it has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with an API (Application Programming Interface) have been implemented for statistical summaries, least-squares solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is ongoing and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for the most recent updates.

  15. Conducting Meta-Analyses Based on p Values

    Science.gov (United States)

    van Aert, Robbie C. M.; Wicherts, Jelte M.; van Assen, Marcel A. L. M.

    2016-01-01

    Because of overwhelming evidence of publication bias in psychology, techniques to correct meta-analytic estimates for such bias are greatly needed. The methodology on which the p-uniform and p-curve methods are based has great promise for providing accurate meta-analytic estimates in the presence of publication bias. However, in this article, we show that in some situations, p-curve behaves erratically, whereas p-uniform may yield implausible estimates of negative effect size. Moreover, we show that (and explain why) p-curve and p-uniform result in overestimation of effect size under moderate-to-large heterogeneity and may yield unpredictable bias when researchers employ p-hacking. We offer hands-on recommendations on applying and interpreting results of meta-analyses in general and p-uniform and p-curve in particular. Both methods as well as traditional methods are applied to a meta-analysis on the effect of weight on judgments of importance. We offer guidance for applying p-uniform or p-curve using R and a user-friendly web application for applying p-uniform. PMID:27694466

  16. PC based 8K multichannel analyser for nuclear spectroscopy

    International Nuclear Information System (INIS)

    Jain, S.K.; Gupta, J.D.; Suman Kumari, B.

    1989-01-01

    An IBM-PC based 8K multichannel analyser (MCA) has been developed which incorporates all the features of an advanced system, such as very high throughput for data acquisition in PHA as well as MCS modes, fast real-time display, extensive display manipulation facilities, various preset controls and concurrent data processing. The compact system hardware consists of a 2-unit wide NIM module and a PC add-on card. Because of the external acquisition hardware, the system can acquire data independently after initial programming by the PC, allowing the PC to be switched off. To attain very high throughput, the most desirable feature of an MCA, a dual-port memory architecture has been used. The asymmetric dual-port RAM, housed in the NIM module, offers 24-bit parallel access to the ADC and 8-bit wide access to the PC, which results in a fast real-time histogram display on the monitor. The PC emulation software is menu driven and user friendly. It integrates a comprehensive set of commonly required application routines for concurrent data processing. After the transfer of know-how to the Electronics Corporation of India Ltd. (ECIL), this system is being produced at ECIL. (author). 5 refs., 4 figs

  17. Analyser-based x-ray imaging for biomedical research

    International Nuclear Information System (INIS)

    Suortti, Pekka; Keyriläinen, Jani; Thomlinson, William

    2013-01-01

    Analyser-based imaging (ABI) is one of several phase-contrast x-ray imaging techniques being pursued at synchrotron radiation facilities. With advancements in compact source technology, there is a possibility that ABI will become a clinical imaging modality. This paper presents the history of ABI as it has developed from laboratory sources to synchrotron imaging. The fundamental physics of phase-contrast imaging is presented, both in a general sense and specifically for ABI. The technology depends on the use of perfect-crystal monochromator optics. The theory of the x-ray optics is developed and presented in a way that allows optimization of the imaging for specific biomedical systems. The advancement of analytical algorithms to produce separate images of the sample absorption, refraction-angle map and small-angle x-ray scattering is detailed. Several detailed applications to biomedical imaging are presented to illustrate the broad range of systems and body sites studied preclinically to date: breast, cartilage and bone, soft tissue and organs. Ultimately, the application of ABI in clinical imaging will depend partly on the availability of compact sources with x-ray intensity comparable to that of the current synchrotron environment. (paper)

  18. Progress Report on Computational Analyses of Water-Based NSTF

    Energy Technology Data Exchange (ETDEWEB)

    Lv, Q. [Argonne National Lab. (ANL), Argonne, IL (United States); Kraus, A. [Argonne National Lab. (ANL), Argonne, IL (United States); Hu, R. [Argonne National Lab. (ANL), Argonne, IL (United States); Bucknor, M. [Argonne National Lab. (ANL), Argonne, IL (United States); Lisowski, D. [Argonne National Lab. (ANL), Argonne, IL (United States); Nunez, D. [Argonne National Lab. (ANL), Argonne, IL (United States)

    2017-08-01

    CFD analysis has focused on important component-level phenomena using STAR-CCM+ to supplement the system analysis of integral system behavior. A notable area of interest was the cavity region. This area is of particular interest for CFD analysis because of its multi-dimensional flow and complex heat transfer (thermal radiation heat transfer and natural convection), which are not simulated directly by RELAP5. CFD simulations allow for the estimation of the boundary heat flux distribution along the riser tubes, which is needed in the RELAP5 simulations. The CFD results can also provide additional data to help establish what level of modeling detail is necessary in RELAP5. It was found that the flow profiles in the cavity region are simpler for the water-based concept than for the air-cooled concept. The local heat flux increases noticeably in the axial direction and is higher in the fins than in the riser tubes. These results were utilized in RELAP5 simulations as boundary conditions, to provide better temperature predictions in the system-level analyses. It was also determined that temperatures were higher in the fins than in the riser tubes, but within design limits for thermal stresses. Higher temperatures were predicted in the edge fins, in part due to additional thermal radiation from the side cavity walls.

  19. Model-Based Recursive Partitioning for Subgroup Analyses.

    Science.gov (United States)

    Seibold, Heidi; Zeileis, Achim; Hothorn, Torsten

    2016-05-01

    The identification of patient subgroups with differential treatment effects is the first step towards individualised treatments. A current draft guideline by the EMA discusses potentials and problems of subgroup analyses and formulates challenges for the development of appropriate statistical procedures for the data-driven identification of patient subgroups. We introduce model-based recursive partitioning as a procedure for the automated detection of patient subgroups that are identifiable by predictive factors. The method starts with a model for the overall treatment effect, as defined for the primary analysis in the study protocol, and uses measures for detecting parameter instabilities in this treatment effect. The procedure produces a segmented model with differential treatment parameters corresponding to each patient subgroup. The subgroups are linked to predictive factors by means of a decision tree. The method is applied to the search for subgroups of patients suffering from amyotrophic lateral sclerosis that differ with respect to their treatment effect of Riluzole, the only drug currently approved for this disease.
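
    The flavor of the procedure can be conveyed with a deliberately simplified, one-level analogue: fit the protocol model y = a + b·treatment, then scan a candidate covariate for the split that most improves the fit and compare treatment effects on either side. The real method (e.g., lmtree/glmtree in the R package partykit) uses formal parameter-instability tests rather than this naive residual-sum-of-squares scan; everything below is a toy with made-up data.

```python
import numpy as np

def fit_rss(y, t):
    """Least-squares fit of y = a + b*t; return (RSS, treatment effect b)."""
    X = np.column_stack([np.ones_like(t), t])
    beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = res[0] if res.size else np.sum((y - X @ beta) ** 2)
    return rss, beta[1]

def one_level_split(y, treat, x):
    """Search covariate x for the split that most improves the fit of the
    treatment-effect model (a crude stand-in for instability tests)."""
    best = (np.inf, None)
    for c in np.quantile(x, np.linspace(0.1, 0.9, 17)):
        left, right = x <= c, x > c
        rss = fit_rss(y[left], treat[left])[0] + fit_rss(y[right], treat[right])[0]
        if rss < best[0]:
            best = (rss, c)
    return best[1]

# toy trial: the treatment works only for older patients (age > 60)
rng = np.random.default_rng(7)
n = 2000
age = rng.uniform(20, 90, n)
treat = rng.integers(0, 2, n).astype(float)
y = 1.0 + 2.0 * treat * (age > 60) + rng.normal(0, 1, n)
cut = one_level_split(y, treat, age)
lo, hi = age <= cut, age > cut
print(f"split at age {cut:.1f}; effect below: {fit_rss(y[lo], treat[lo])[1]:.2f}, "
      f"above: {fit_rss(y[hi], treat[hi])[1]:.2f}")
```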

  20. Polychotomization of continuous variables in regression models based on the overall C index

    Directory of Open Access Journals (Sweden)

    Bax Leon

    2006-12-01

    Full Text Available Abstract Background When developing multivariable regression models for diagnosis or prognosis, continuous independent variables can be categorized to make a prediction table instead of a prediction formula. Although many methods have been proposed to dichotomize prognostic variables, to date there has been no integrated method for polychotomization. The latter is necessary when dichotomization results in too much loss of information or when central values refer to normal states and more dispersed values refer to less preferable states, a situation that is not unusual in medical settings (e.g. body temperature, blood pressure). The goal of our study was to develop a theoretical and practical method for polychotomization. Methods We used the overall discrimination index C, introduced by Harrell, as a measure of the predictive ability of an independent regressor variable and derived a method for polychotomization mathematically. Since the naïve application of our method, like some existing methods, gives rise to positive bias, we developed a parametric method that minimizes this bias and assessed its performance by the use of Monte Carlo simulation. Results The overall C is closely related to the area under the ROC curve, and the di(poly)chotomized variable's predictive performance is comparable to that of the original continuous variable. The simulation shows that the parametric method is essentially unbiased for both the estimates of performance and the cutoff points. Application of our method to the predictor variables of a previous study on rhabdomyolysis shows that it can be used to make probability profile tables that are applicable to the diagnosis or prognosis of individual patient status. Conclusion We propose a polychotomization (including dichotomization) method for independent continuous variables in regression models based on the overall discrimination index C and clarified its meaning mathematically. To avoid positive bias in
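
    Harrell's overall C for a candidate set of cutoffs can be scanned directly. A minimal sketch for a binary outcome and one predictor follows (our naive grid search, with synthetic data; note that choosing cutoffs by maximizing C on the same data exhibits exactly the positive bias the authors' parametric method is designed to remove):

```python
import numpy as np
from itertools import combinations

def c_index(marker, outcome):
    """Harrell's overall C for a binary outcome:
    P(marker_case > marker_control) + 0.5 * P(tie) over case-control pairs."""
    diff = marker[outcome == 1][:, None] - marker[outcome == 0][None, :]
    return np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

def best_polychotomy(x, y, n_cuts=2):
    """Naive scan: pick the cutoffs whose categorized variable maximizes C."""
    grid = np.quantile(x, np.linspace(0.05, 0.95, 19))
    best = (0.0, None)
    for cuts in combinations(grid, n_cuts):
        c = c_index(np.digitize(x, cuts).astype(float), y)
        if c > best[0]:
            best = (c, cuts)
    return best

# U-shaped risk: central values are "normal", dispersed values are not,
# so we polychotomize the dispersion |x| (synthetic data, names ours)
rng = np.random.default_rng(3)
x = rng.normal(size=2000)
y = rng.binomial(1, 1 / (1 + np.exp(-(np.abs(x) - 1))))
c_cont = c_index(np.abs(x), y)
c_cat, cuts = best_polychotomy(np.abs(x), y, n_cuts=2)
print(f"C continuous={c_cont:.3f}, trichotomized={c_cat:.3f}, "
      f"cuts={np.round(cuts, 2)}")
```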

  1. Geographically weighted regression based methods for merging satellite and gauge precipitation

    Science.gov (United States)

    Chao, Lijun; Zhang, Ke; Li, Zhijia; Zhu, Yuelong; Wang, Jingfeng; Yu, Zhongbo

    2018-03-01

    Real-time precipitation data with high spatiotemporal resolution are crucial for accurate hydrological forecasting. To improve the spatial resolution and quality of satellite precipitation, a three-step satellite and gauge precipitation merging method was formulated in this study: (1) bilinear interpolation is first applied to downscale coarser satellite precipitation to a finer resolution (PS); (2) (mixed) geographically weighted regression methods coupled with a weighting function are then used to estimate biases of PS as functions of gauge observations (PO) and PS; and (3) biases of PS are finally corrected to produce a merged precipitation product. Based on the above framework, eight algorithms, combinations of two geographically weighted regression methods and four weighting functions, are developed to merge CMORPH (CPC MORPHing technique) precipitation with station observations on a daily scale in the Ziwuhe Basin of China. The geographical variables (elevation, slope, aspect, surface roughness, and distance to the coastline) and a meteorological variable (wind speed) were used for merging precipitation to avoid the artificial spatial autocorrelation resulting from traditional interpolation methods. The results show that the combination of mixed geographically weighted regression and the bi-square weighting function (MGWR-BI) has the best performance (R = 0.863 and RMSE = 7.273 mm/day) among the eight algorithms. The MGWR-BI algorithm was then applied to produce an hourly merged precipitation product. Compared to the original CMORPH product (R = 0.208 and RMSE = 1.208 mm/hr), the quality of the merged data is significantly higher (R = 0.724 and RMSE = 0.706 mm/hr). The developed merging method not only improves the spatial resolution and quality of the satellite product but is also easy to implement, which is valuable for hydrological modeling and other applications.
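
    Step (2) can be sketched as a locally weighted least-squares fit at every location, with a bi-square kernel controlling how quickly gauge influence decays with distance. The toy below (coordinates, bandwidth and covariates are ours; only plain GWR is shown, not the mixed variant or the full covariate set) estimates a spatially varying bias and adds it back to the satellite field:

```python
import numpy as np

def bisquare(d, bw):
    """Bi-square kernel: w = (1 - (d/bw)^2)^2 inside the bandwidth, else 0."""
    w = np.zeros_like(d)
    inside = d < bw
    w[inside] = (1 - (d[inside] / bw) ** 2) ** 2
    return w

def gwr_predict(coords_obs, X_obs, y_obs, coords_new, X_new, bw):
    """GWR: a separate weighted least-squares fit at every target location."""
    out = np.empty(len(coords_new))
    for i, (c, x) in enumerate(zip(coords_new, X_new)):
        w = bisquare(np.linalg.norm(coords_obs - c, axis=1), bw)
        XtW = X_obs.T * w                       # row-wise weighting
        beta = np.linalg.pinv(XtW @ X_obs) @ XtW @ y_obs
        out[i] = x @ beta
    return out

# toy merge: the gauge-satellite bias varies smoothly across the basin
rng = np.random.default_rng(5)
coords = rng.uniform(0, 100, (300, 2))          # gauge locations (km)
PS = rng.gamma(2.0, 5.0, 300)                   # downscaled satellite estimate
PO = PS + 0.05 * coords[:, 0] - 2.0 + rng.normal(0, 0.5, 300)  # gauges
X = np.column_stack([np.ones(300), PS])         # intercept + PS as covariate
bias_hat = gwr_predict(coords, X, PO - PS, coords, X, bw=30.0)
merged = PS + bias_hat
print("r(merged, gauge) =", np.corrcoef(merged, PO)[0, 1].round(3))
```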

  2. Fruit fly optimization based least square support vector regression for blind image restoration

    Science.gov (United States)

    Zhang, Jiao; Wang, Rui; Li, Junshan; Yang, Yawei

    2014-11-01

    The goal of image restoration is to reconstruct the original scene from a degraded observation. It is a critical and challenging task in image processing. Classical restoration requires explicit knowledge of the point spread function (PSF) and a description of the noise as priors. However, this is not practical for much real image processing, where the recovery needs to be a blind image restoration scenario. Since blind deconvolution is an ill-posed problem, many blind restoration methods need to make additional assumptions to construct restrictions. Because of differences in PSF and noise energy, blurred images can be quite different, and it is difficult to achieve a good balance between proper assumptions and high restoration quality in blind deconvolution. Recently, machine learning techniques have been applied to blind image restoration. Least square support vector regression (LSSVR) has been proven to offer strong potential in estimation and forecasting problems. Therefore, this paper proposes an LSSVR-based image restoration method. However, selecting the optimal parameters for a support vector machine is essential to the training result. As a novel meta-heuristic algorithm, the fruit fly optimization algorithm (FOA) can be used to handle optimization problems and has the advantage of fast convergence to the global optimal solution. In the proposed method, the training samples are created from a neighborhood in the degraded image to the central pixel in the original image. The mapping between the degraded image and the original image is learned by training the LSSVR, whose two parameters are optimized through FOA with a fitness function calculated from the restoration error. With the acquired mapping, the degraded image can be recovered. Experimental results show the proposed method can obtain a satisfactory restoration effect. Compared with BP neural network regression, the SVR method and the Lucy-Richardson algorithm, it speeds up the restoration rate and
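
    Both ingredients are compact enough to sketch: LSSVR training reduces to a single linear system, and FOA can be caricatured as candidate flies scattering around the best parameter location found so far. The demo below tunes the two LSSVR hyperparameters on a 1-D toy function rather than image patches; the swarm update is a simplified stand-in for the published FOA, and all names and values are ours.

```python
import numpy as np

def rbf(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvr_fit(X, y, gamma, sigma):
    """LSSVR training reduces to one linear system in (b, alpha)."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    b, alpha = sol[0], sol[1:]
    return lambda Xq: rbf(Xq, X, sigma) @ alpha + b

def foa_tune(X_tr, y_tr, X_va, y_va, iters=30, flies=15, step=0.3, seed=0):
    """Caricature of fruit fly optimization over log10(gamma), log10(sigma):
    candidates scatter around the best 'smell' location found so far."""
    rng = np.random.default_rng(seed)
    center, best = np.array([1.0, 0.0]), (np.inf, None)
    for _ in range(iters):
        for c in center + rng.uniform(-step, step, (flies, 2)):
            model = lssvr_fit(X_tr, y_tr, 10 ** c[0], 10 ** c[1])
            err = np.mean((model(X_va) - y_va) ** 2)
            if err < best[0]:
                best, center = (err, c), c
    return 10 ** best[1][0], 10 ** best[1][1], best[0]

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, (120, 1))
y = np.sinc(X[:, 0]) + rng.normal(0, 0.05, 120)
g, s, mse = foa_tune(X[:80], y[:80], X[80:], y[80:])
print(f"gamma={g:.3g}, sigma={s:.3g}, validation MSE={mse:.4f}")
```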

  3. Predictive based monitoring of nuclear plant component degradation using support vector regression

    International Nuclear Information System (INIS)

    Agarwal, Vivek; Alamaniotis, Miltiadis; Tsoukalas, Lefteri H.

    2015-01-01

    Nuclear power plants (NPPs) are large installations comprising many active and passive assets. Degradation monitoring of all these assets is an expensive (labor cost) and highly demanding task. In this paper a framework based on Support Vector Regression (SVR) for online surveillance of critical parameter degradation of NPP components is proposed. In this case, on-time replacement or maintenance of components will prevent potential plant malfunctions and reduce the overall operational cost. In the current work, we apply SVR equipped with a Gaussian kernel function to monitor components. Monitoring includes the one-step-ahead prediction of the component's respective operational quantity using the SVR model, where the SVR model is trained on a set of previously recorded degradation histories of similar components. The predictive capability of the model is evaluated upon arrival of a sensor measurement, which is compared to the component failure threshold. A maintenance decision is based on a fuzzy inference system that utilizes three parameters: (i) the prediction evaluation in the previous steps, (ii) the predicted value of the current step, and (iii) the difference between the current predicted value and the component's failure threshold. The proposed framework will be tested on turbine blade degradation data.
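
    The training-and-monitoring loop can be sketched with scikit-learn's SVR; the fuzzy inference layer is omitted and replaced by a plain threshold check. All data, names and the threshold below are illustrative, not real turbine blade values.

```python
import numpy as np
from sklearn.svm import SVR

# Illustrative setup: degradation histories of similar components train
# a one-step-ahead predictor for a monitored unit.
rng = np.random.default_rng(2)
histories = [np.cumsum(rng.gamma(0.5, 0.02, 200)) for _ in range(10)]

LAG = 5
X_tr = np.vstack([[h[i - LAG:i] for i in range(LAG, len(h))] for h in histories])
y_tr = np.concatenate([h[LAG:] for h in histories])
model = SVR(kernel="rbf", C=10.0, gamma="scale").fit(X_tr, y_tr)

FAILURE_THRESHOLD = 0.5          # illustrative, not a real design limit
window = list(histories[0][:LAG])
for t in range(LAG, 60):
    pred = model.predict(np.array([window[-LAG:]]))[0]   # one step ahead
    if pred >= FAILURE_THRESHOLD:
        print(f"step {t}: predicted {pred:.3f} >= threshold, schedule maintenance")
        break
    window.append(pred + rng.normal(0, 0.01))  # stand-in for the next reading
else:
    print(f"no alarm by step {t}; latest prediction {pred:.3f}")
```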

  4. The Chaotic Prediction for Aero-Engine Performance Parameters Based on Nonlinear PLS Regression

    Directory of Open Access Journals (Sweden)

    Chunxiao Zhang

    2012-01-01

    Full Text Available The prediction of aero-engine performance parameters is very important for aero-engine condition monitoring and fault diagnosis. In this paper, the chaotic phase space of an engine exhaust temperature (EGT) time series, which comes from actual airborne ACARS data, is reconstructed by selecting suitable nearby points. Partial least squares (PLS), based on a cubic spline or kernel function transformation, is adopted to obtain a chaotic predictive function for the EGT series. The experimental results indicate that the proposed PLS chaotic prediction algorithm based on a biweight kernel function transformation has a significant advantage in overcoming multicollinearity of the independent variables and improves the stability of the regression model. Its predictive NMSE is 16.5 percent less than that of the traditional ordinary least squares (OLS) method and 10.38 percent less than that of the linear PLS approach. At the same time, its forecast error is less than that of the nonlinear PLS algorithm under bootstrap test screening.
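
    The pipeline (delay embedding, a nonlinear feature transform, then PLS) is easy to prototype. Below, a noisy logistic map stands in for the EGT series, and squared features stand in for the spline/kernel transformation used in the paper; everything is a toy with our own settings.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def embed(series, dim, tau):
    """Takens-style phase-space reconstruction: rows are delay vectors,
    targets are the next value of the series."""
    n = len(series) - (dim - 1) * tau
    X = np.column_stack([series[i * tau:i * tau + n] for i in range(dim)])
    return X[:-1], series[(dim - 1) * tau + 1:]

# stand-in for an EGT series: a noisy chaotic logistic map
rng = np.random.default_rng(4)
x = np.empty(1200)
x[0] = 0.4
for i in range(1199):
    x[i + 1] = 3.9 * x[i] * (1 - x[i])
x += rng.normal(0, 0.005, 1200)

X, y = embed(x, dim=4, tau=1)
X = np.hstack([X, X ** 2])       # crude stand-in for the spline/kernel transform
split = 1000
pls = PLSRegression(n_components=4).fit(X[:split], y[:split])
pred = pls.predict(X[split:]).ravel()
print("one-step NMSE:", np.mean((pred - y[split:]) ** 2) / np.var(y[split:]))
```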

  5. Reference Function Based Spatiotemporal Fuzzy Logic Control Design Using Support Vector Regression Learning

    Directory of Open Access Journals (Sweden)

    Xian-Xia Zhang

    2013-01-01

    Full Text Available This paper presents a reference function based 3D FLC design methodology using support vector regression (SVR learning. The concept of reference function is introduced to 3D FLC for the generation of 3D membership functions (MF, which enhance the capability of the 3D FLC to cope with more kinds of MFs. The nonlinear mathematical expression of the reference function based 3D FLC is derived, and spatial fuzzy basis functions are defined. Via relating spatial fuzzy basis functions of a 3D FLC to kernel functions of an SVR, an equivalence relationship between a 3D FLC and an SVR is established. Therefore, a 3D FLC can be constructed using the learned results of an SVR. Furthermore, the universal approximation capability of the proposed 3D fuzzy system is proven in terms of the finite covering theorem. Finally, the proposed method is applied to a catalytic packed-bed reactor and simulation results have verified its effectiveness.

  6. Reducing false-positive incidental findings with ensemble genotyping and logistic regression based variant filtering methods.

    Science.gov (United States)

    Hwang, Kyu-Baek; Lee, In-Hee; Park, Jin-Ho; Hambuch, Tina; Choe, Yongjoon; Kim, MinHyeok; Lee, Kyungjoon; Song, Taemin; Neu, Matthew B; Gupta, Neha; Kohane, Isaac S; Green, Robert C; Kong, Sek Won

    2014-08-01

    As whole genome sequencing (WGS) uncovers variants associated with rare and common diseases, an immediate challenge is to minimize false-positive findings due to sequencing and variant calling errors. False positives can be reduced by combining results from orthogonal sequencing methods, but this is costly. Here, we present variant filtering approaches using logistic regression (LR) and ensemble genotyping to minimize false positives without sacrificing sensitivity. We evaluated the methods using paired WGS datasets of an extended family prepared using two sequencing platforms and a validated set of variants in NA12878. Using LR- or ensemble-genotyping-based filtering, false-negative rates were significantly reduced by 1.1- to 17.8-fold at the same levels of false discovery rate (5.4% for heterozygous and 4.5% for homozygous single nucleotide variants (SNVs); 30.0% for heterozygous and 18.7% for homozygous insertions; 25.2% for heterozygous and 16.6% for homozygous deletions) compared to filtering based on genotype quality scores. Moreover, ensemble genotyping excluded >98% (105,080 of 107,167) of false positives while retaining >95% (897 of 937) of true positives in de novo mutation (DNM) discovery in NA12878, and performed better than a consensus method using two sequencing platforms. Our proposed methods were effective in prioritizing phenotype-associated variants, and ensemble genotyping would be essential to minimize false-positive DNM candidates. © 2014 WILEY PERIODICALS, INC.
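
    The LR filtering idea, train a classifier on variant-call features against validated labels and pick a probability cutoff that meets a target false discovery rate, can be sketched as follows. The feature names and distributions are invented for illustration; real pipelines would use caller-specific annotations.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative features a variant caller might expose (names ours),
# with labels that would come from an orthogonal validation set.
rng = np.random.default_rng(6)
n = 5000
X = np.column_stack([
    rng.normal(40, 10, n),          # genotype quality
    rng.uniform(5, 80, n),          # read depth
    rng.beta(5, 5, n),              # allele balance
    rng.poisson(1.0, n),            # nearby indel count
])
logit = (-4 + 0.08 * X[:, 0] + 0.02 * X[:, 1]
         - 2.0 * np.abs(X[:, 2] - 0.5) - 0.3 * X[:, 3])
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # 1 = true variant

clf = LogisticRegression(max_iter=1000).fit(X[:4000], y[:4000])
p = clf.predict_proba(X[4000:])[:, 1]

# lowest cutoff achieving ~5% FDR keeps sensitivity as high as possible
for cut in np.linspace(0.05, 0.95, 91):
    keep = p >= cut
    if keep.any() and 1 - y[4000:][keep].mean() <= 0.05:
        sens = (keep & (y[4000:] == 1)).sum() / (y[4000:] == 1).sum()
        print(f"cutoff {cut:.2f}: FDR {1 - y[4000:][keep].mean():.3f}, "
              f"sensitivity {sens:.3f}")
        break
```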

  7. Model-free prediction and regression a transformation-based approach to inference

    CERN Document Server

    Politis, Dimitris N

    2015-01-01

    The Model-Free Prediction Principle expounded upon in this monograph is based on the simple notion of transforming a complex dataset to one that is easier to work with, e.g., i.i.d. or Gaussian. As such, it restores the emphasis on observable quantities, i.e., current and future data, as opposed to unobservable model parameters and estimates thereof, and yields optimal predictors in diverse settings such as regression and time series. Furthermore, the Model-Free Bootstrap takes us beyond point prediction in order to construct frequentist prediction intervals without resort to unrealistic assumptions such as normality. Prediction has been traditionally approached via a model-based paradigm, i.e., (a) fit a model to the data at hand, and (b) use the fitted model to extrapolate/predict future data. Due to both mathematical and computational constraints, 20th century statistical practice focused mostly on parametric models. Fortunately, with the advent of widely accessible powerful computing in the late 1970s, co...

  8. A simplified calculation procedure for mass isotopomer distribution analysis (MIDA) based on multiple linear regression.

    Science.gov (United States)

    Fernández-Fernández, Mario; Rodríguez-González, Pablo; García Alonso, J Ignacio

    2016-10-01

    We have developed a novel, rapid and easy calculation procedure for Mass Isotopomer Distribution Analysis based on multiple linear regression which allows the simultaneous calculation of the precursor pool enrichment and the fraction of newly synthesized labelled proteins (fractional synthesis) using linear algebra. To test this approach, we used the peptide RGGGLK as a model tryptic peptide containing three subunits of glycine. We selected glycine labelled with two 13C atoms (13C2-glycine) as the labelled amino acid to demonstrate that spectral overlap is not a problem in the proposed methodology. The developed methodology was tested first in vitro by changing the precursor pool enrichment from 10 to 40% of 13C2-glycine. Secondly, a simulated in vivo synthesis of proteins was designed by combining the natural-abundance RGGGLK peptide and 10 or 20% 13C2-glycine at 1:1, 1:3 and 3:1 ratios. Precursor pool enrichments and fractional synthesis values were calculated with satisfactory precision and accuracy using a simple spreadsheet. This novel approach can provide a relatively rapid and easy means to measure protein turnover based on stable isotope tracers. Copyright © 2016 John Wiley & Sons, Ltd.
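
    The regression step can be sketched as follows: express the measured isotopologue pattern as a linear combination of basis patterns for 0-3 labelled glycines, solve by least squares, then recover enrichment and fractional synthesis from the fitted abundances. The natural-abundance pattern below is invented, and the moment-based recovery of p is our shortcut, not the authors' spreadsheet procedure.

```python
import numpy as np
from scipy.stats import binom
from scipy.optimize import brentq

# Toy natural-abundance isotopologue pattern of the peptide (M+0..M+4);
# values are illustrative, not measured RGGGLK abundances.
natural = np.array([0.70, 0.18, 0.08, 0.03, 0.01])

def basis(k, n_bins=11):
    """Pattern when exactly k of the 3 glycines carry a 13C2 label
    (each label shifts the whole pattern up by 2 mass units)."""
    b = np.zeros(n_bins)
    b[2 * k:2 * k + len(natural)] = natural
    return b

B = np.column_stack([basis(k) for k in range(4)])      # columns: k = 0..3

# simulate a measurement with precursor enrichment p, fractional synthesis f
p, f = 0.20, 0.60
w = binom.pmf(np.arange(4), 3, p)                      # labels per new peptide
measured = (1 - f) * basis(0) + f * (B @ w)
measured += np.random.default_rng(8).normal(0, 1e-4, len(measured))

a, *_ = np.linalg.lstsq(B, measured, rcond=None)       # the MLR step
labeled = a[1:]                                        # f * Binom(3, k, p), k >= 1
mean_k = (np.arange(1, 4) * labeled).sum() / labeled.sum()
p_hat = brentq(lambda q: 3 * q / (1 - (1 - q) ** 3) - mean_k, 1e-6, 0.999)
f_hat = labeled.sum() / (1 - (1 - p_hat) ** 3)
print(f"precursor enrichment ~{p_hat:.3f}, fractional synthesis ~{f_hat:.3f}")
```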

  9. EPMLR: sequence-based linear B-cell epitope prediction method using multiple linear regression.

    Science.gov (United States)

    Lian, Yao; Ge, Meng; Pan, Xian-Ming

    2014-12-19

    B-cell epitopes have been studied extensively due to their immunological applications, such as peptide-based vaccine development, antibody production, and disease diagnosis and therapy. Despite several decades of research, the accurate prediction of linear B-cell epitopes has remained a challenging task. In this work, based on the antigen's primary sequence information, a novel linear B-cell epitope prediction model was developed using multiple linear regression (MLR). A 10-fold cross-validation test on a large non-redundant dataset was performed to evaluate the performance of our model. To alleviate the problem caused by the noise of the negative dataset, 300 experiments utilizing 300 sub-datasets were performed. We achieved an overall sensitivity of 81.8%, precision of 64.1% and area under the receiver operating characteristic curve (AUC) of 0.728. We have presented a reliable method for the identification of linear B-cell epitopes using the antigen's primary sequence information. Moreover, a web server, EPMLR, has been developed for linear B-cell epitope prediction: http://www.bioinfo.tsinghua.edu.cn/epitope/EPMLR/.

  10. An Ionospheric Index Model based on Linear Regression and Neural Network Approaches

    Science.gov (United States)

    Tshisaphungo, Mpho; McKinnell, Lee-Anne; Bosco Habarulema, John

    2017-04-01

    The ionosphere is well known to reflect radio wave signals in the high frequency (HF) band due to the presence of electrons and ions within the region. To optimise the use of long-distance HF communications, it is important to understand the drivers of ionospheric storms and accurately predict propagation conditions, especially during disturbed days. This paper presents the development of an ionospheric storm-time index over the South African region for HF communication users. The model will provide a valuable tool for measuring complex ionospheric behaviour in an operational space weather monitoring and forecasting environment. The development of the index is based on data from a single ionosonde station at Grahamstown (33.3°S, 26.5°E), South Africa. Critical frequency of the F2 layer (foF2) measurements for the period 1996-2014 were considered for this study. The model was developed using linear regression and neural network approaches. In this talk, validation results for low, medium and high solar activity periods will be discussed to demonstrate the model's performance.

  11. Accurate palm vein recognition based on wavelet scattering and spectral regression kernel discriminant analysis

    Science.gov (United States)

    Elnasir, Selma; Shamsuddin, Siti Mariyam; Farokhi, Sajad

    2015-01-01

    Palm vein recognition (PVR) is a promising new biometric that has been applied successfully as a method of access control by many organizations and has even further potential in the field of forensics. The palm vein pattern has highly discriminative features that are difficult to forge because of its subcutaneous position in the palm. Despite considerable progress, a few practical issues remain, and providing accurate palm vein readings is still an unsolved problem in biometrics. We propose a robust and more accurate PVR method based on the combination of wavelet scattering (WS) with spectral regression kernel discriminant analysis (SRKDA). As the dimension of the WS-generated features is quite large, SRKDA is required to reduce the extracted features and enhance the discrimination. The results, based on two public databases (the PolyU Hyperspectral Palmprint and PolyU Multispectral Palmprint databases), show the high performance of the proposed scheme in comparison with state-of-the-art methods. The proposed approach scored a 99.44% identification rate and a 99.90% verification rate [equal error rate (EER) = 0.1%] for the hyperspectral database and a 99.97% identification rate and a 99.98% verification rate (EER = 0.019%) for the multispectral database.

  12. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part I: Effects of Random Error

    Science.gov (United States)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
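
    Both verification scores are simple to compute from a 2×2 contingency table once a probability threshold dichotomizes the forecasts. A toy sketch with synthetic, roughly calibrated forecasts (all names and distributions ours) that also illustrates the threshold sensitivity described above:

```python
import numpy as np

def pc_and_hkd(prob, observed, threshold):
    """Percent correct and Hanssen-Kuipers discriminant (hit rate minus
    false-alarm rate) after dichotomizing probabilistic forecasts."""
    f = prob >= threshold
    hits, misses = np.sum(f & observed), np.sum(~f & observed)
    fas, cns = np.sum(f & ~observed), np.sum(~f & ~observed)
    pc = (hits + cns) / len(prob)
    hkd = hits / (hits + misses) - fas / (fas + cns)
    return pc, hkd

# synthetic contrail forecasts: occurrence is the rarer event
rng = np.random.default_rng(9)
prob = rng.beta(1, 4, 5000)
observed = rng.random(5000) < prob
for thr in (0.5, observed.mean()):           # 0.5 vs climatological frequency
    pc, hkd = pc_and_hkd(prob, observed, thr)
    print(f"threshold {thr:.2f}: PC={pc:.3f}, HKD={hkd:.3f}")
```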

  13. Consequences of kriging and land use regression for PM2.5 predictions in epidemiologic analyses: insights into spatial variability using high-resolution satellite data.

    Science.gov (United States)

    Alexeeff, Stacey E; Schwartz, Joel; Kloog, Itai; Chudnovsky, Alexandra; Koutrakis, Petros; Coull, Brent A

    2015-01-01

    Many epidemiological studies use predicted air pollution exposures as surrogates for true air pollution levels. These predicted exposures contain exposure measurement error, yet simulation studies have typically found negligible bias in resulting health effect estimates. However, previous studies typically assumed a statistical spatial model for air pollution exposure, which may be oversimplified. We address this shortcoming by assuming a realistic, complex exposure surface derived from fine-scale (1 km × 1 km) remote-sensing satellite data. Using simulation, we evaluate the accuracy of epidemiological health effect estimates in linear and logistic regression when using spatial air pollution predictions from kriging and land use regression models. We examined chronic (long-term) and acute (short-term) exposure to air pollution. Results varied substantially across different scenarios. Exposure models with low out-of-sample R² yielded severe biases in the health effect estimates of some models, ranging from 60% upward bias to 70% downward bias. One land use regression exposure model with out-of-sample R² > 0.9 yielded upward biases up to 13% for acute health effect estimates. Almost all models drastically underestimated the SEs. Land use regression models performed better in chronic effect simulations. These results can help researchers when interpreting health effect estimates in these types of studies.

  14. Regression trees for predicting mortality in patients with cardiovascular disease: What improvement is achieved by using ensemble-based methods?

    Science.gov (United States)

    Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V

    2012-01-01

    In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999-2001 and 2004-2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
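
    The comparison is straightforward to reproduce in outline with scikit-learn; B-splines stand in for the restricted cubic splines of the paper, and the imbalanced synthetic outcome is only a rough analogue of 30-day mortality.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              GradientBoostingClassifier)
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import SplineTransformer
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=6000, n_features=12, n_informative=6,
                           weights=[0.9], random_state=0)  # ~10% events
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "single tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "bagged trees": BaggingClassifier(DecisionTreeClassifier(),
                                      n_estimators=200, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "boosted trees": GradientBoostingClassifier(random_state=0),
    "logistic + splines": make_pipeline(
        SplineTransformer(degree=3, n_knots=5),
        LogisticRegression(max_iter=2000)),
}
for name, m in models.items():
    auc = roc_auc_score(y_te, m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
    print(f"{name:20s} AUC={auc:.3f}")
```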

  15. An Analysis of Bank Service Satisfaction Based on Quantile Regression and Grey Relational Analysis

    Directory of Open Access Journals (Sweden)

    Wen-Tsao Pan

    2016-01-01

    Full Text Available Bank service satisfaction is vital to the success of a bank. In this paper, we propose to use grey relational analysis to gauge the levels of service satisfaction of banks. With the grey relational analysis, we compared the effects of different variables on service satisfaction and ranked the banks according to their levels of service satisfaction. We further used the quantile regression model to find the variables that affect the satisfaction of a customer at a specific quantile of the satisfaction level. The result of the quantile regression analysis provides a bank manager with information to formulate policies to further promote satisfaction of the customers at different quantiles of satisfaction level. We also compared the prediction accuracies of the regression models at different quantiles. The experimental result showed that, among the seven quantile regression models, the median regression model has the best performance in terms of the RMSE, RTIC, and CE performance measures.
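
    Fitting a quantile regression at several quantiles shows how covariate effects can differ along the satisfaction distribution; the grey relational analysis step is not reproduced here. A sketch with statsmodels on synthetic data (variable names and effect sizes are ours):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# toy satisfaction data with heteroscedastic noise, so that covariate
# effects genuinely differ across quantiles
rng = np.random.default_rng(10)
n = 1000
df = pd.DataFrame({
    "wait_time": rng.uniform(0, 30, n),
    "staff_score": rng.uniform(1, 5, n),
})
noise = rng.standard_t(5, n) * (1 + 0.1 * df["wait_time"])
df["satisfaction"] = 70 - 0.8 * df["wait_time"] + 4 * df["staff_score"] + noise

for q in (0.1, 0.5, 0.9):
    res = smf.quantreg("satisfaction ~ wait_time + staff_score", df).fit(q=q)
    print(f"q={q}: wait_time {res.params['wait_time']:+.2f}, "
          f"staff_score {res.params['staff_score']:+.2f}")
```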

  16. Regression Tree-Based Methodology for Customizing Building Energy Benchmarks to Individual Commercial Buildings

    Science.gov (United States)

    Kaskhedikar, Apoorva Prakash

    According to the U.S. Energy Information Administration, commercial buildings represent about 40% of the United States' energy consumption, of which office buildings consume a major portion. Gauging the extent to which an individual building consumes energy in excess of its peers is the first step in initiating energy efficiency improvement. Energy benchmarking offers an initial building energy performance assessment without rigorous evaluation. Energy benchmarking tools based on the Commercial Buildings Energy Consumption Survey (CBECS) database are investigated in this thesis. This study proposes a new benchmarking methodology based on decision trees, where a relationship between the energy use intensities (EUI) and building parameters (continuous and categorical) is developed for different building types. This methodology was applied to the medium office and school building types contained in the CBECS database. The Random Forest technique was used to find the most influential parameters that impact building energy use intensities. Subsequently, significant correlations were identified between EUIs and CBECS variables. Other than floor area, some of the important variables were number of workers, location, number of PCs and main cooling equipment. The coefficient of variation was used to evaluate the effectiveness of the new model. The customization technique proposed in this thesis was compared with another benchmarking model that is widely used by building owners and designers, namely ENERGY STAR's Portfolio Manager. This tool relies on standard linear regression methods, which can only handle continuous variables. The proposed model uses data mining techniques and was found to perform slightly better than the Portfolio Manager. The broader impact of the new benchmarking methodology is that it allows for identifying important categorical variables, and then incorporating them in a local, as against a global, model framework for EUI

  17. A neutron spectrum unfolding code based on generalized regression artificial neural networks

    International Nuclear Information System (INIS)

    Rosario Martinez-Blanco, Ma. del

    2016-01-01

    The most delicate part of neutron spectrometry is the unfolding process. The derivation of the spectral information is not simple because the unknown is not given directly as a result of the measurements. Novel methods based on Artificial Neural Networks have been widely investigated. In prior works, back-propagation neural networks (BPNN) have been used to solve the neutron spectrometry problem; however, some drawbacks still exist with this kind of neural net, i.e. the optimum selection of the network topology and the long training time. Compared to BPNN, it is usually much faster to train a generalized regression neural network (GRNN), mainly because the spread constant is the only parameter used in a GRNN. Another feature is that the network will converge to a global minimum, provided that the optimal value of the spread has been determined and that the dataset adequately represents the problem space. In addition, GRNNs are often more accurate than BPNNs in prediction. These characteristics make GRNNs of great interest in the neutron spectrometry domain. This work presents a computational tool based on GRNN capable of solving the neutron spectrometry problem. This computational code automates the pre-processing, training and testing stages using a k-fold cross validation with 3 folds, the statistical analysis and the post-processing of the information, using 7 Bonner sphere rate counts as the only input data. The code was designed for a Bonner Spheres System based on a 6LiI(Eu) neutron detector and a response matrix expressed in 60 energy bins taken from an International Atomic Energy Agency compilation. - Highlights: • The main drawback of neutron spectrometry with BPNN is network topology optimization. • Compared to BPNN, it is usually much faster to train a GRNN. • GRNNs are often more accurate than BPNNs in prediction; these characteristics make GRNNs of great interest. • This computational code automates the pre-processing, training
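
    A GRNN is essentially a Nadaraya-Watson kernel regressor, which is why the spread is its only free parameter. The sketch below maps 7 synthetic sphere count rates to a 60-bin spectrum; the random "response matrix" and all settings are ours, not the IAEA compilation.

```python
import numpy as np

class GRNN:
    """Generalized regression neural network: a Nadaraya-Watson kernel
    regressor whose single free parameter is the spread sigma."""
    def __init__(self, sigma):
        self.sigma = sigma

    def fit(self, X, Y):
        self.X, self.Y = np.asarray(X), np.asarray(Y)
        return self

    def predict(self, Xq):
        d2 = ((np.asarray(Xq)[:, None, :] - self.X[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d2 / (2 * self.sigma ** 2))
        return (W @ self.Y) / W.sum(axis=1, keepdims=True)

# toy unfolding: learn sphere-count-rates -> 60-bin spectrum
rng = np.random.default_rng(11)
R = rng.uniform(0, 1, (7, 60))                    # stand-in response matrix
spectra = rng.dirichlet(np.ones(60) * 0.5, 500)   # synthetic spectra
counts = spectra @ R.T                            # 7 "Bonner sphere" rates each
model = GRNN(sigma=0.1).fit(counts[:400], spectra[:400])
pred = model.predict(counts[400:])
print("mean reconstruction error:", np.abs(pred - spectra[400:]).mean())
```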

  18. Spatial Bayesian latent factor regression modeling of coordinate-based meta-analysis data.

    Science.gov (United States)

    Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D; Nichols, Thomas E

    2018-03-01

    Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the article are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to (i) identify areas of consistent activation; and (ii) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterized as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. © 2017, The International Biometric Society.

  19. [Socioeconomic factors conditioning obesity in adults. Evidence based on quantile regression and panel data].

    Science.gov (United States)

    Temporelli, Karina L; Viego, Valentina N

    2016-08-01

    Objective To measure the effect of socioeconomic variables on the prevalence of obesity. Factors such as income level, urbanization, incorporation of women into the labor market and access to unhealthy foods are considered in this paper. Method Econometric estimates of the proportion of obese men and women by country were calculated using models based on panel data and quantile regressions, with data from 192 countries for the period 2002-2005. Levels of per capita income, urbanization, the income/Big Mac price ratio and labor indicators for the female population were considered as explanatory variables. Results The factors that influence obesity in adults differ between men and women: accessibility of fast food is related to male obesity, while the mode of employment drives higher rates in women. The underlying socioeconomic factors for obesity also differ depending on the magnitude of the problem in each country; in countries with low prevalence, a greater level of income favors the transition to obesogenic habits, while a higher income level mitigates the problem in countries with high rates of obesity. Discussion Identifying the socioeconomic causes of the significant increase in the prevalence of obesity is essential for the implementation of effective prevention strategies, since this condition not only affects the quality of life of those who suffer from it but also puts pressure on health systems due to the treatment costs of associated diseases.

  20. Mass balance-based regression modeling of PAHs accumulation in urban soils, role of urban development

    International Nuclear Information System (INIS)

    Peng, Chi; Wang, Meie; Chen, Weiping; Chang, Andrew C.

    2015-01-01

    We investigated the polycyclic aromatic hydrocarbon (PAH) content of 68 soil samples collected at housing developments representing different lengths of development period across Beijing. Based on the data, we derived a mass-balance mathematical model to simulate the dynamics of PAH accumulation in urban soils as affected by urban development. The key parameters were estimated by fitting the modified mass-balance model to the data of PAH concentration vs. building age of the sampled green areas. The total PAH concentration would increase from the baseline of 267 ng g⁻¹ to 3631 ng g⁻¹ during the period 1978-2048. The dynamic changes in the accumulation rates of light and heavy PAH species were related to the shifting of fuel sources, combustion efficiencies, and amounts of energy consumed during the course of development. - Highlights: • Introduced a mass balance model for soil PAH accumulation with urbanization. • Reconstructed the historical data of PAH accumulation in the soil of Beijing, China. • The soil PAH concentrations would double in the following 40 years. • The composition of PAH emissions is shifting toward light PAH species. - Introduced a regression modeling approach to predict the changes of PAH concentrations in urban soil
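
    A mass-balance accumulation model of this general kind reduces, under a constant input I and first-order loss k, to dC/dt = I − kC, whose closed-form solution can be fitted to concentration-versus-building-age data. A sketch with synthetic data (the parameter values are ours, not the Beijing estimates, and the paper's model is more elaborate):

```python
import numpy as np
from scipy.optimize import curve_fit

def mass_balance(t, C0, I, k):
    """Solution of dC/dt = I - k*C with C(0) = C0:
    C(t) = I/k + (C0 - I/k) * exp(-k*t)."""
    return I / k + (C0 - I / k) * np.exp(-k * t)

rng = np.random.default_rng(12)
age = rng.uniform(0, 35, 68)                     # building age (years)
true = mass_balance(age, 267.0, 150.0, 0.03)     # ng/g, illustrative values
conc = true * rng.lognormal(0, 0.15, 68)         # multiplicative scatter

p, _ = curve_fit(mass_balance, age, conc, p0=[300, 100, 0.05],
                 bounds=([0, 0, 1e-4], [1e4, 1e4, 1.0]))
C0, I, k = p
print(f"baseline ~{C0:.0f} ng/g, input ~{I:.0f} ng/g/yr, loss ~{k:.3f}/yr")
print(f"projected 70-yr concentration: {mass_balance(70, *p):.0f} ng/g")
```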

  1. Improved regression models for ventilation estimation based on chest and abdomen movements

    International Nuclear Information System (INIS)

    Liu, Shaopeng; Gao, Robert; He, Qingbo; Staudenmayer, John; Freedson, Patty

    2012-01-01

    Non-invasive estimation of minute ventilation is important for quantifying the intensity of physical activity of individuals. In this paper, several improved regression models are presented, based on the measurement of chest and abdomen movements from sensor belts worn by subjects (n = 50) engaged in 14 types of physical activity. Five linear models involving a combination of 11 features were developed, and the effects of different model training approaches and window sizes for computing the features were investigated. The performance of the models was evaluated using experimental data collected during the physical activity protocol. The predicted minute ventilation was compared to the criterion ventilation measured using a bidirectional digital volume transducer housed in a respiratory gas exchange system. The results indicate that the inclusion of breathing frequency and the use of percentile points instead of interdecile ranges over a 60 s window size reduced error by about 43%, when applied to the classical two-degrees-of-freedom model. The mean percentage error of the minute ventilation estimated for all the activities was below 7.5%, verifying reasonably good performance of the models and the applicability of the wearable sensing system for minute ventilation estimation during physical activity. (paper)

  2. A Gaussian process regression based hybrid approach for short-term wind speed prediction

    International Nuclear Information System (INIS)

    Zhang, Chi; Wei, Haikun; Zhao, Xin; Liu, Tianhong; Zhang, Kanjian

    2016-01-01

    Highlights: • A novel hybrid approach is proposed for short-term wind speed prediction. • This method combines the parametric AR model with the non-parametric GPR model. • The relative importance of different inputs is considered. • Different types of covariance functions are considered and combined. • It can provide both accurate point forecasts and satisfactory prediction intervals. - Abstract: This paper proposes a hybrid model based on autoregressive (AR) model and Gaussian process regression (GPR) for probabilistic wind speed forecasting. In the proposed approach, the AR model is employed to capture the overall structure from wind speed series, and the GPR is adopted to extract the local structure. Additionally, automatic relevance determination (ARD) is used to take into account the relative importance of different inputs, and different types of covariance functions are combined to capture the characteristics of the data. The proposed hybrid model is compared with the persistence model, artificial neural network (ANN), and support vector machine (SVM) for one-step ahead forecasting, using wind speed data collected from three wind farms in China. The forecasting results indicate that the proposed method can not only improve point forecasts compared with other methods, but also generate satisfactory prediction intervals.
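
    The division of labor in the highlights is easy to prototype: an AR model fitted by least squares captures the overall structure, and a GPR on the AR residuals, with an anisotropic RBF kernel standing in for ARD over the lagged inputs, captures the local structure and supplies prediction intervals. A toy sketch on synthetic data (all settings ours):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# toy wind-speed-like series: slow oscillation plus mild persistence
rng = np.random.default_rng(13)
n = 600
t = np.arange(n)
speed = 8 + 2 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 0.4, n)
for i in range(1, n):
    speed[i] += 0.3 * (speed[i - 1] - 8)

# step 1: AR model by least squares (overall structure)
LAG = 4
X = np.column_stack([speed[i:n - LAG + i] for i in range(LAG)])
y = speed[LAG:]
D = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(D, y, rcond=None)
ar_pred = D @ coef

# step 2: GPR on the AR residuals (local structure + intervals);
# a per-dimension length scale plays the role of ARD
resid = y - ar_pred
gpr = GaussianProcessRegressor(RBF(length_scale=np.ones(LAG)) + WhiteKernel(),
                               normalize_y=True).fit(X[:500], resid[:500])
mu, sd = gpr.predict(X[500:], return_std=True)
hybrid = ar_pred[500:] + mu
rmse = np.sqrt(np.mean((hybrid - y[500:]) ** 2))
print(f"RMSE={rmse:.3f}; mean 95% interval width ~{(2 * 1.96 * sd).mean():.3f}")
```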

  3. Groundwater level prediction of landslide based on classification and regression tree

    Directory of Open Access Journals (Sweden)

    Yannan Zhao

    2016-09-01

    Full Text Available Based on groundwater level monitoring data for the Shuping landslide in the Three Gorges Reservoir area, influential factors for the groundwater level were selected from the response relationships between factors such as rainfall and reservoir level and changes in the groundwater level. A classification and regression tree (CART) model was then constructed from this subset and used to predict the groundwater level. On verification, the predictions for the test sample were consistent with the actually measured values, with a mean absolute error of 0.28 m and a relative error of 1.15%. For a support vector machine (SVM) model constructed using the same set of factors, the mean absolute error and relative error of the predictions were 1.53 m and 6.11%, respectively. This indicates that the CART model not only has better fitting and generalization ability, but also strong advantages in the analysis of landslide groundwater dynamics and the screening of important variables. It is an effective method for predicting groundwater levels in landslides.

  4. Automatic coronary artery segmentation based on multi-domains remapping and quantile regression in angiographies.

    Science.gov (United States)

    Li, Zhixun; Zhang, Yingtao; Gong, Huiling; Li, Weimin; Tang, Xianglong

    2016-12-01

    Coronary artery disease has become one of the most dangerous diseases to human life, and coronary artery segmentation is the basis of computer-aided diagnosis and analysis. Existing segmentation methods have difficulty handling the complex vascular texture that arises from the projective nature of conventional coronary angiography. Given the large amount of data and complex vascular shapes, manual annotation has become increasingly unrealistic, so a fully automatic segmentation method is necessary in clinical practice. In this work, we study a method based on reliable boundaries via multi-domain remapping and robust discrepancy correction via distance balance and quantile regression for automatic coronary artery segmentation of angiography images. The proposed method can not only segment overlapping vascular structures robustly, but also achieves good performance in low-contrast regions. The effectiveness of our approach is demonstrated on a variety of coronary blood vessels in comparison with existing methods. The overall segmentation performances si, fnvf, fvpf and tpvf were 95.135%, 3.733%, 6.113% and 96.268%, respectively. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. Normalization in Unsupervised Segmentation Parameter Optimization: A Solution Based on Local Regression Trend Analysis

    Directory of Open Access Journals (Sweden)

    Stefanos Georganos

    2018-02-01

    Full Text Available In object-based image analysis (OBIA), the appropriate parametrization of segmentation algorithms is crucial for obtaining satisfactory image classification results. One of the ways this can be done is by unsupervised segmentation parameter optimization (USPO). A popular USPO method does this through the optimization of a "global score" (GS), which minimizes intrasegment heterogeneity and maximizes intersegment heterogeneity. However, the calculated GS values are sensitive to the minimum and maximum ranges of the candidate segmentations. Previous research proposed the use of fixed minimum/maximum threshold values for the intrasegment/intersegment heterogeneity measures to deal with the sensitivity of user-defined ranges, but the performance of this approach has not been investigated in detail. In the context of a very-high-resolution remote sensing urban application, we show the limitations of the fixed threshold approach, both theoretically and in application, and instead propose a novel solution that identifies the range of candidate segmentations using local regression trend analysis. We found that the proposed approach showed significant improvements over the use of fixed minimum/maximum values and is less subjective than user-defined threshold values; thus, it can be of merit for a fully automated procedure and big data applications.
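
    The local-regression idea can be illustrated with a LOWESS trend over scores computed across segmentation scales, using the trend's slope to bound the candidate range. This is our toy stand-in for the authors' procedure, with a synthetic score curve and an arbitrary slope criterion:

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

# synthetic score curve over segmentation scale parameters
rng = np.random.default_rng(14)
scale = np.linspace(10, 200, 96)
gs = 1 / (1 + np.exp(-(scale - 60) / 15)) + rng.normal(0, 0.03, 96)

trend = lowess(gs, scale, frac=0.3, return_sorted=False)  # local regression
slope = np.gradient(trend, scale)

# retain the scales where the local trend is still clearly improving
candidates = scale[slope > 0.5 * slope.max()]
print(f"candidate segmentation range: {candidates.min():.0f}-{candidates.max():.0f}")
```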

  7. Monthly streamflow forecasting based on hidden Markov model and Gaussian Mixture Regression

    Science.gov (United States)

    Liu, Yongqi; Ye, Lei; Qin, Hui; Hong, Xiaofeng; Ye, Jiajun; Yin, Xingli

    2018-06-01

    Reliable streamflow forecasts can be highly valuable for water resources planning and management. In this study, we combined a hidden Markov model (HMM) and Gaussian Mixture Regression (GMR) for probabilistic monthly streamflow forecasting. The HMM is initialized using a kernelized K-medoids clustering method, and the Baum-Welch algorithm is then executed to learn the model parameters. GMR derives a conditional probability distribution for the predictand given covariate information, including the antecedent flow at a local station and two surrounding stations. The performance of HMM-GMR was verified based on the mean square error and continuous ranked probability score skill scores. The reliability of the forecasts was assessed by examining the uniformity of the probability integral transform values. The results show that HMM-GMR obtained reasonably high skill scores and the uncertainty spread was appropriate. Different HMM states were assumed to be different climate conditions, which would lead to different types of observed values. We demonstrated that the HMM-GMR approach can handle multimodal and heteroscedastic data.
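
    The GMR half of the approach can be sketched compactly: fit a Gaussian mixture over the joint (covariate, predictand) space, then condition it on the covariates to obtain a predictive mean. The HMM layer is omitted for brevity (a package such as hmmlearn could supply it), and the station setup and data below are entirely synthetic.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmr_predict(gmm, x, d_in):
    """Gaussian mixture regression: condition a joint GMM over (x, y) on x
    to obtain the predictive mean of y."""
    means, covs, w = gmm.means_, gmm.covariances_, gmm.weights_
    K = len(w)
    mu, resp = np.zeros(K), np.zeros(K)
    for k in range(K):
        mx, my = means[k, :d_in], means[k, d_in:]
        Sxx, Syx = covs[k][:d_in, :d_in], covs[k][d_in:, :d_in]
        diff = x - mx
        mu[k] = (my + Syx @ np.linalg.solve(Sxx, diff)).item()
        # responsibility of component k for this input (constants cancel)
        quad = diff @ np.linalg.solve(Sxx, diff)
        resp[k] = w[k] * np.exp(-0.5 * quad) / np.sqrt(np.linalg.det(Sxx))
    resp /= resp.sum()
    return float(resp @ mu)

# toy monthly flows: covariates are antecedent flow at the local station
# and at two surrounding stations (all synthetic)
rng = np.random.default_rng(15)
n = 600
X = rng.gamma(3.0, 50.0, (n, 3))
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + 0.2 * X[:, 2] + rng.normal(0, 10, n)
data = np.column_stack([X, y])
gmm = GaussianMixture(n_components=3, covariance_type="full",
                      random_state=0).fit(data[:500])
pred = np.array([gmr_predict(gmm, x, 3) for x in data[500:, :3]])
print("RMSE:", np.sqrt(np.mean((pred - y[500:]) ** 2)))
```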

  8. Correction of TRMM 3B42V7 Based on Linear Regression Models over China

    Directory of Open Access Journals (Sweden)

    Shaohua Liu

    2016-01-01

    Full Text Available High spatiotemporal-resolution precipitation data are necessary for hydrological simulation and water resource management, and remotely sensed precipitation products (RSPPs) play a key role in supplying them, especially in sparsely gauged regions. TRMM 3B42V7 data (TRMM precipitation) is an essential RSPP, outperforming other RSPPs, yet the use of TRMM precipitation is still limited by its inaccuracy and low spatial resolution at the regional scale. In this paper, linear regression models (LRMs) were constructed to correct and downscale TRMM precipitation based on gauge precipitation at 2257 stations over China from 1998 to 2013. The corrected TRMM precipitation was then validated against gauge precipitation at 839 of the 2257 stations in 2014, at station and grid scales. The results show that both monthly and annual LRMs obviously improved the accuracy of corrected TRMM precipitation with acceptable error, and the monthly LRM performs slightly better than the annual LRM in mideastern China. Although the performance of corrected TRMM precipitation from the LRMs has improved in Northwest China and the Tibetan Plateau, the error of corrected TRMM precipitation there remains significant due to the large deviation between TRMM precipitation and low-density gauge precipitation.
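
    A monthly LRM amounts to one gauge-versus-satellite linear fit per calendar month, applied back to the satellite field. A sketch with synthetic pairs (the seasonal bias, units and magnitudes are invented):

```python
import numpy as np
import pandas as pd

# toy gauge/satellite pairs (mm/day) with a month-dependent bias
rng = np.random.default_rng(16)
n = 5000
month = rng.integers(1, 13, n)
gauge = rng.gamma(2, 3, n)
trmm = gauge * (0.7 + 0.05 * month) + rng.normal(0, 1, n)
df = pd.DataFrame({"month": month, "gauge": gauge, "trmm": trmm})

# fit one linear correction per calendar month: gauge ~ a + b * trmm
params = {
    m: np.polyfit(g["trmm"], g["gauge"], 1)      # (slope, intercept)
    for m, g in df.groupby("month")
}
df["corrected"] = [
    np.polyval(params[m], v) for m, v in zip(df["month"], df["trmm"])
]
for col in ("trmm", "corrected"):
    rmse = np.sqrt(np.mean((df[col] - df["gauge"]) ** 2))
    print(f"{col}: RMSE {rmse:.2f} mm/day")
```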

  9. Predicting number of hospitalization days based on health insurance claims data using bagged regression trees.

    Science.gov (United States)

    Xie, Yang; Schreier, Günter; Chang, David C W; Neubauer, Sandra; Redmond, Stephen J; Lovell, Nigel H

    2014-01-01

    Healthcare administrators worldwide are striving to lower the cost of care while improving the quality of care given, so better clinical and administrative decision making is needed. Anticipating outcomes such as the number of hospitalization days could contribute to addressing this problem. In this paper, a method was developed, using large-scale health insurance claims data, to predict the number of hospitalization days in a population. We utilized a regression decision tree algorithm, along with insurance claim data from 300,000 individuals over three years, to predict the number of days in hospital in the third year from medical admissions and claims data of the first two years. Our method performs well in the general population. For the population aged 65 years and over, the predictive model significantly improves predictions over a baseline method (predicting a constant number of days for each patient), and achieved a specificity of 70.20% and sensitivity of 75.69% in classifying these subjects into the two categories of 'no hospitalization' and 'at least one day in hospital'.

  10. Advanced signal processing based on support vector regression for lidar applications

    Science.gov (United States)

    Gelfusa, M.; Murari, A.; Malizia, A.; Lungaroni, M.; Peluso, E.; Parracino, S.; Talebzadeh, S.; Vega, J.; Gaudio, P.

    2015-10-01

    The LIDAR technique has recently found many applications in atmospheric physics and remote sensing. One of the main issues in the deployment of LIDAR-based systems is the filtering of the backscattered signal to alleviate the problems generated by noise. Improvement in the signal-to-noise ratio is typically achieved by averaging a quite large number (of the order of hundreds) of successive laser pulses. This approach can be effective but presents significant limitations. First of all, it implies great stress on the laser source, particularly in the case of systems for automatic monitoring of large areas over long periods. Secondly, this solution can become difficult to implement in applications characterised by rapid variations of the atmosphere, for example in the case of pollutant emissions, or by abrupt changes in the noise. In this contribution, a new method for software filtering and denoising of LIDAR signals is presented. The technique is based on support vector regression. The proposed method is insensitive to the statistics of the noise and is therefore fully general and quite robust. The developed numerical tool has been systematically compared with the most powerful techniques available, using both synthetic and experimental data. Its performance has been tested for various statistical distributions of the noise and also for other disturbances of the acquired signal, such as outliers. The competitive advantages of the proposed method are fully documented. The potential of the proposed approach to widen the capability of the LIDAR technique, particularly in the detection of widespread smoke, is discussed in detail.

  11. A Regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data

    Directory of Open Access Journals (Sweden)

    Ruzzo Walter L

    2006-03-01

    Full Text Available Abstract Background As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. Methods In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN) algorithm. The choice of KNN is motivated by its simplicity, flexibility to incorporate different data types and adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. Results We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms naive KNN methods and is competitive with support vector machine (SVM) algorithms for integrating heterogeneous data. We also show that by combining different data sources, prediction accuracy can improve significantly. Conclusion Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference enhances prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.

  12. Focused information criterion and model averaging based on weighted composite quantile regression

    KAUST Repository

    Xu, Ganggang; Wang, Suojin; Huang, Jianhua Z.

    2013-01-01

    We study the focused information criterion and frequentist model averaging and their application to post-model-selection inference for weighted composite quantile regression (WCQR) in the context of the additive partial linear models. With the non

  13. Simulation-based Investigations of Electrostatic Beam Energy Analysers

    CERN Document Server

    Pahl, Hannes

    2015-01-01

    An energy analyser is needed to measure the beam energy profile behind the REX-EBIS at ISOLDE. The device should be able to operate with an accuracy of 1 V at voltages up to 30 kV. In order to find a working concept for an electrostatic energy analyser different designs were evaluated with simulations. A spherical device and its design issues are presented. The potential deformation effects of grids at high voltages and their influence on the energy resolution were investigated. First tests were made with a grid-free ring electrode device and show promising results.

  14. Cost-of-illness studies based on massive data: a prevalence-based, top-down regression approach.

    Science.gov (United States)

    Stollenwerk, Björn; Welchowski, Thomas; Vogl, Matthias; Stock, Stephanie

    2016-04-01

    Despite the increasing availability of routine data, no analysis method has yet been presented for cost-of-illness (COI) studies based on massive data. We aim, first, to present such a method and, second, to assess the relevance of the associated gain in numerical efficiency. We propose a prevalence-based, top-down regression approach consisting of five steps: aggregating the data; fitting a generalized additive model (GAM); predicting costs via the fitted GAM; comparing predicted costs between prevalent and non-prevalent subjects; and quantifying the stochastic uncertainty via error propagation. To demonstrate the method, it was applied, in the context of chronic lung disease, to German sickness fund data (from 1999) covering over 7.3 million insured. To assess the gain in numerical efficiency, the computational time of the innovative approach was compared with that of corresponding GAMs applied to simulated individual-level data. Furthermore, the probability of model failure was modeled via logistic regression. Applying the innovative method was reasonably fast (19 min). In contrast, for patient-level data, computational time increased disproportionately with sample size. Furthermore, using patient-level data was accompanied by a substantial risk of model failure (about 80% for 6 million subjects). The gain in computational efficiency of the innovative COI method seems to be of practical relevance. Furthermore, it may yield more precise cost estimates.
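
    Steps one through four can be sketched with pandas and statsmodels; a weighted regression with B-spline terms stands in for the GAM, and step five (error propagation) is omitted. All variable names, cost magnitudes and the prevalence rate below are invented for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Step 1: aggregate person-level claims into (age x sex x prevalence) cells
rng = np.random.default_rng(17)
n = 200_000
people = pd.DataFrame({
    "age": rng.integers(18, 95, n),
    "male": rng.integers(0, 2, n),
    "prevalent": rng.binomial(1, 0.07, n),
})
people["cost"] = rng.gamma(2, 400 + 8 * people["age"] + 900 * people["prevalent"])
cells = (people.groupby(["age", "male", "prevalent"], as_index=False)
               .agg(mean_cost=("cost", "mean"), n=("cost", "size")))

# Steps 2-3: fit a smooth cost model on the aggregated cells, weighted by
# cell size, and predict costs under both prevalence states
fit = smf.wls("mean_cost ~ bs(age, df=4) + male + prevalent",
              data=cells, weights=cells["n"]).fit()
counterfactual = cells.assign(prevalent=0)
excess = fit.predict(cells) - fit.predict(counterfactual)

# Step 4: excess cost attributable to the disease among prevalent cells
prev = cells["prevalent"] == 1
total_excess = np.average(excess[prev], weights=cells.loc[prev, "n"])
print(f"excess annual cost per prevalent person: ~{total_excess:.0f}")
```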

  15. Techniques for Scaling Up Analyses Based on Pre-interpretations

    DEFF Research Database (Denmark)

    Gallagher, John Patrick; Henriksen, Kim Steen; Banda, Gourinath

    2005-01-01

    a variety of analyses, both generic (such as mode analysis) and program-specific (with respect to a type describing some particular property of interest). Previous work demonstrated the approach using pre-interpretations over small domains. In this paper we present techniques that allow the method...

  16. Droplet Image Super Resolution Based on Sparse Representation and Kernel Regression

    Science.gov (United States)

    Zou, Zhenzhen; Luo, Xinghong; Yu, Qiang

    2018-05-01

    Microgravity and containerless conditions, which are produced via electrostatic levitation combined with a drop tube, are important when studying the intrinsic properties of new metastable materials. Generally, temperature and image sensors can be used to measure the changes in sample temperature, morphology and volume. Then, the specific heat, surface tension, viscosity changes and sample density can be obtained. Considering that the falling speed of the material sample droplet is approximately 31.3 m/s when it reaches the bottom of a 50-meter-high drop tube, a high-speed camera with a collection rate of up to 10^6 frames/s is required to image the falling droplet. However, in the high-speed mode, very few pixels, approximately 48-120, will be obtained in each exposure, which results in low image quality. Super-resolution image reconstruction is an algorithm that provides finer details than the sampling grid of a given imaging device by increasing the number of pixels per unit area in the image. In this work, we demonstrate the application of single-image super-resolution reconstruction in microgravity and electrostatic levitation for the first time. Here, using an image super-resolution method based on sparse representation, a low-resolution droplet image can be reconstructed. Employing Yang's related dictionary model, high- and low-resolution image patches were combined for dictionary training, and related high- and low-resolution dictionaries were obtained. An online double-sparse dictionary training algorithm was used to learn the related dictionaries, overcoming the shortcomings of the traditional training algorithm with small image patches. During the image reconstruction stage, a kernel regression algorithm is added, which effectively reduces the edge blur of Yang's method.
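    As a minimal illustration of the kernel-regression ingredient (not the paper's full sparse-representation pipeline), a generic Nadaraya-Watson estimator applied to a 1-D edge profile looks like this; all data here are synthetic.

    ```python
    import numpy as np

    def nadaraya_watson(x_train, y_train, x_query, h=0.1):
        """Generic Nadaraya-Watson kernel regression with a Gaussian kernel.
        A 1-D toy version of the kernel-regression smoothing step; the paper
        works on 2-D image patches."""
        d = x_query[:, None] - x_train[None, :]
        w = np.exp(-0.5 * (d / h) ** 2)
        return (w @ y_train) / w.sum(axis=1)

    # Toy signal with a sharp edge, loosely like a droplet silhouette profile.
    x = np.linspace(0, 1, 40)
    y = (x > 0.5).astype(float) + np.random.default_rng(2).normal(0, 0.05, 40)
    xq = np.linspace(0, 1, 200)
    y_hat = nadaraya_watson(x, y, xq, h=0.02)  # small bandwidth preserves the edge
    print(np.round(y_hat[95:105], 2))
    ```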

  17. A prediction model for spontaneous regression of cervical intraepithelial neoplasia grade 2, based on simple clinical parameters.

    Science.gov (United States)

    Koeneman, Margot M; van Lint, Freyja H M; van Kuijk, Sander M J; Smits, Luc J M; Kooreman, Loes F S; Kruitwagen, Roy F P M; Kruse, Arnold J

    2017-01-01

    This study aims to develop a prediction model for spontaneous regression of cervical intraepithelial neoplasia grade 2 (CIN 2) lesions based on simple clinicopathological parameters. The study was conducted at Maastricht University Medical Center, the Netherlands. The prediction model was developed in a retrospective cohort of 129 women with a histologic diagnosis of CIN 2 who were managed by watchful waiting for 6 to 24 months. Five potential predictors for spontaneous regression were selected based on the literature and expert opinion and were analyzed in a multivariable logistic regression model, followed by backward stepwise deletion based on the Wald test. The prediction model was internally validated by the bootstrapping method. Discriminative capacity and accuracy were tested by assessing the area under the receiver operating characteristic curve (AUC) and a calibration plot. Disease regression within 24 months was seen in 91 (71%) of 129 patients. A prediction model was developed including the following variables: smoking, Papanicolaou test outcome before the CIN 2 diagnosis, concomitant CIN 1 diagnosis in the same biopsy, and more than 1 biopsy containing CIN 2. Not smoking and a less severe Papanicolaou test outcome before the CIN 2 diagnosis were among the factor levels predictive of disease regression. The AUC was 69.2% (95% confidence interval, 58.5%-79.9%), indicating a moderate discriminative ability of the model. The calibration plot indicated good calibration of the predicted probabilities. This prediction model for spontaneous regression of CIN 2 may aid physicians in the personalized management of these lesions. Copyright © 2016 Elsevier Inc. All rights reserved.
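    The modelling workflow, multivariable logistic regression followed by bootstrap internal validation of the AUC, can be sketched as follows on synthetic stand-ins for the four predictors; none of this is the study's data or code.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(3)
    n = 129
    # Synthetic binary stand-ins for the four predictors (smoking, Pap outcome,
    # concomitant CIN 1, >1 biopsy with CIN 2); coefficients are invented.
    X = rng.integers(0, 2, (n, 4)).astype(float)
    logit = -0.5 - 0.8 * X[:, 0] + 0.9 * X[:, 1] + 0.7 * X[:, 2] - 0.6 * X[:, 3]
    y = rng.random(n) < 1 / (1 + np.exp(-logit))  # 1 = regression within 24 months

    model = LogisticRegression().fit(X, y)
    apparent_auc = roc_auc_score(y, model.predict_proba(X)[:, 1])

    # Bootstrap internal validation: optimism-corrected AUC.
    optimism = []
    for _ in range(200):
        idx = rng.integers(0, n, n)
        m = LogisticRegression().fit(X[idx], y[idx])
        auc_boot = roc_auc_score(y[idx], m.predict_proba(X[idx])[:, 1])
        auc_orig = roc_auc_score(y, m.predict_proba(X)[:, 1])
        optimism.append(auc_boot - auc_orig)
    print("optimism-corrected AUC:", apparent_auc - np.mean(optimism))
    ```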

  18. Quality assurance challenges in x-ray emission based analyses

    International Nuclear Information System (INIS)

    Papp, T.

    2005-01-01

    Complete text of publication follows. There is a large scatter in the results of X-ray analysis with solid-state detectors, suggesting a methodological origin. Although the PIXE (proton induced X-ray emission) analytical technique can work without relation to any physics, as was commented at the recent PIXE conference, one could argue that if the same technique, when used to measure physical quantities, reveals problems, then perhaps potential methodological issues cannot a priori be excluded. We present a simple example which could be interpreted as an indication that methodological considerations are needed. Recently an inter-comparison was made of analyses of the spectra measured at the laboratory of the International Atomic Energy Agency (IAEA). Four participating analytical software packages were used to evaluate the X-ray spectra. There are several thin metal sample spectra for which a common energy scale could not be established. The quality of a spectrum can be judged from the line shape. The line shape is parametrized by the full width at half maximum (FWHM) of a peak and the so-called low-energy tailing. Fitting the spectra individually, we obtained FWHM squared values at different energies and determined the linear regression parameters. The parameters suggest a rather poor detector performance. It is generally assumed that the (FWHM)^2 values have a first-order polynomial form as a function of X-ray energy. Having done a linear regression analysis, we can plot the standardized residuals, presented in Fig. 1, which clearly show a three-sigma deviation. The probability of having a three-sigma deviation is 1%. In other words, the probability that these spectra are in accordance with the expected FWHM functional form is less than 1%. The main problem is that, although the composite spectra were analyzed using four different programs, the difficulty in interpreting the spectra was not commented upon by any of the participants in the inter-comparison. (author)
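    The consistency check described, fitting (FWHM)^2 linearly against energy and inspecting standardized residuals for three-sigma outliers, can be reproduced in a few lines; the numbers below are hypothetical, not the IAEA data.

    ```python
    import numpy as np

    # Hypothetical (energy, FWHM) pairs from individually fitted peaks.
    E = np.array([1.5, 2.3, 3.7, 4.5, 5.9, 6.4, 8.0])             # keV
    fwhm = np.array([0.11, 0.12, 0.14, 0.15, 0.17, 0.21, 0.19])   # keV

    y = fwhm ** 2
    slope, intercept = np.polyfit(E, y, 1)   # (FWHM)^2 assumed linear in E
    resid = y - (slope * E + intercept)
    std_resid = resid / resid.std(ddof=2)    # standardized residuals (2 fitted params)

    # A |standardized residual| near 3 flags a point inconsistent with the
    # assumed first-order polynomial form, as in the paper's Fig. 1 argument.
    print(np.round(std_resid, 2))
    ```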

  19. A Web-based Tool Combining Different Type Analyses

    DEFF Research Database (Denmark)

    Henriksen, Kim Steen; Gallagher, John Patrick

    2006-01-01

    of both, and they can be goal-dependent or goal-independent. We describe a prototype tool that can be accessed from a web browser, allowing various type analyses to be run. The first goal of the tool is to allow the analysis results to be examined conveniently by clicking on points in the original program...... the minimal "domain model" of the program with respect to the corresponding pre-interpretation, which can give more precise information than the original descriptive type....

  20. Graphic-based musculoskeletal model for biomechanical analyses and animation.

    Science.gov (United States)

    Chao, Edmund Y S

    2003-04-01

    The ability to combine physiology and engineering analyses with computer sciences has opened the door to the possibility of creating the 'Virtual Human' reality. This paper presents a broad foundation for a full-featured biomechanical simulator for the human musculoskeletal system physiology. This simulation technology unites the expertise in biomechanical analysis and graphic modeling to investigate joint and connective tissue mechanics at the structural level and to visualize the results in both static and animated forms together with the model. Adaptable anatomical models including prosthetic implants and fracture fixation devices and a robust computational infrastructure for static, kinematic, kinetic, and stress analyses under varying boundary and loading conditions are incorporated on a common platform, the VIMS (Virtual Interactive Musculoskeletal System). Within this software system, a manageable database containing long bone dimensions, connective tissue material properties and a library of skeletal joint system functional activities and loading conditions are also available and they can easily be modified, updated and expanded. Application software is also available to allow end-users to perform biomechanical analyses interactively. This paper details the design, capabilities, and features of the VIMS development at Johns Hopkins University, an effort possible only through academic and commercial collaborations. Examples using these models and the computational algorithms in a virtual laboratory environment are used to demonstrate the utility of this unique database and simulation technology. This integrated system will impact on medical education, basic research, device development and application, and clinical patient care related to musculoskeletal diseases, trauma, and rehabilitation.

  1. Education-Based Gaps in eHealth: A Weighted Logistic Regression Approach.

    Science.gov (United States)

    Amo, Laura

    2016-10-12

    Persons with a college degree are more likely to engage in eHealth behaviors than persons without a college degree, compounding the health disadvantages of undereducated groups in the United States. However, the extent to which quality of recent eHealth experience reduces the education-based eHealth gap is unexplored. The goal of this study was to examine how eHealth information search experience moderates the relationship between college education and eHealth behaviors. Based on a nationally representative sample of adults who reported using the Internet to conduct the most recent health information search (n=1458), I evaluated eHealth search experience in relation to the likelihood of engaging in different eHealth behaviors. I examined whether Internet health information search experience reduces the eHealth behavior gaps among college-educated and noncollege-educated adults. Weighted logistic regression models were used to estimate the probability of different eHealth behaviors. College education was significantly positively related to the likelihood of 4 eHealth behaviors. In general, eHealth search experience was negatively associated with health care behaviors, health information-seeking behaviors, and user-generated or content sharing behaviors after accounting for other covariates. Whereas Internet health information search experience has narrowed the education gap in terms of likelihood of using email or Internet to communicate with a doctor or health care provider and likelihood of using a website to manage diet, weight, or health, it has widened the education gap in the instances of searching for health information for oneself, searching for health information for someone else, and downloading health information on a mobile device. The relationship between college education and eHealth behaviors is moderated by Internet health information search experience in different ways depending on the type of eHealth behavior. After controlling for college
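    A weighted logistic regression with an education-by-experience interaction, the kind of moderation model described here, might be set up as below; the variable names, weights and data are invented for illustration and are not the survey's.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Hypothetical survey extract; integer frequency weights stand in for
    # the survey weights of the nationally representative sample.
    rng = np.random.default_rng(4)
    df = pd.DataFrame({
        "college": rng.integers(0, 2, 1458),
        "search_experience": rng.normal(0, 1, 1458),  # quality of last search
        "weight": rng.integers(1, 4, 1458),
    })
    p = 1 / (1 + np.exp(-(-1 + 0.8 * df.college + 0.3 * df.search_experience
                          - 0.2 * df.college * df.search_experience)))
    df["emailed_doctor"] = (rng.random(1458) < p).astype(int)

    # Weighted logistic regression; the interaction term captures search
    # experience as a moderator of the education gap.
    fit = smf.glm("emailed_doctor ~ college * search_experience", data=df,
                  freq_weights=df["weight"],
                  family=sm.families.Binomial()).fit()
    print(fit.summary().tables[1])
    ```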

  2. An Apple II -based bidimensional pulse height analyser

    International Nuclear Information System (INIS)

    Bateman, J.E.; Flesher, A.C.; Honeyman, R.N.; Pritchard, T.E.; Price, W.P.R.

    1984-06-01

    The implementation of a pulse height analyser function in an Apple II microcomputer using minimal purpose-built hardware is described. Except for a small interface module, the system consists of two suites of software, one giving a conventional one-dimensional analysis on a span of 1024 channels, and the other a two-dimensional analysis on a 128 x 128 image format. Using the recently introduced ACCELERATOR coprocessor card, the system performs with a dead time per event of less than 50 μs. Full software facilities are provided for display, storage and processing of the data using standard Applesoft BASIC. (author)

  3. Kernel based eigenvalue-decomposition methods for analysing ham

    DEFF Research Database (Denmark)

    Christiansen, Asger Nyman; Nielsen, Allan Aasbjerg; Møller, Flemming

    2010-01-01

    methods, such as PCA, MAF or MNF. We therefore investigated the applicability of kernel based versions of these transformations. This meant implementing the kernel based methods and developing new theory, since kernel based MAF and MNF are not yet described in the literature. The traditional methods only...... have two factors that are useful for segmentation, and none of them can be used to segment the two types of meat. The kernel based methods have a lot of useful factors and they are able to capture the subtle differences in the images. This is illustrated in Figure 1. You can see a comparison of the most...... useful factor of PCA and kernel based PCA respectively in Figure 2. The factor of the kernel based PCA turned out to be able to segment the two types of meat, and in general that factor is much more distinct compared to the traditional factor. After the orthogonal transformation a simple thresholding

  4. Avoiding and Correcting Bias in Score-Based Latent Variable Regression with Discrete Manifest Items

    Science.gov (United States)

    Lu, Irene R. R.; Thomas, D. Roland

    2008-01-01

    This article considers models involving a single structural equation with latent explanatory and/or latent dependent variables where discrete items are used to measure the latent variables. Our primary focus is the use of scores as proxies for the latent variables and carrying out ordinary least squares (OLS) regression on such scores to estimate…

  5. Construction of risk prediction model of type 2 diabetes mellitus based on logistic regression

    Directory of Open Access Journals (Sweden)

    Li Jian

    2017-01-01

    Full Text Available Objective: to construct a multi-factor prediction model for the individual risk of T2DM, and to explore new ideas for early warning, prevention and personalized health services for T2DM. Methods: logistic regression techniques were used to screen the risk factors for T2DM and construct the risk prediction model of T2DM. Results: the male risk prediction model logistic regression equation was logit(P) = BMI × 0.735 + vegetables × (−0.671) + age × 0.838 + diastolic pressure × 0.296 + physical activity × (−2.287) + sleep × (−0.009) + smoking × 0.214; the female risk prediction model logistic regression equation was logit(P) = BMI × 1.979 + vegetables × (−0.292) + age × 1.355 + diastolic pressure × 0.522 + physical activity × (−2.287) + sleep × (−0.010). The area under the ROC curve for males was 0.83, the sensitivity was 0.72, and the specificity was 0.86; the area under the ROC curve for females was 0.84, the sensitivity was 0.75, and the specificity was 0.90. Conclusion: the data for this model come from a nested case-control study; the risk prediction model was established using the mature technique of logistic regression, and the model has high predictive sensitivity, specificity and stability.
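    As a worked example, the male equation can be evaluated directly. Note that the abstract reports no intercept and does not specify the variable codings (units or category levels), so the subject values below are purely illustrative:

    ```python
    import math

    # Coefficients as printed in the abstract (male model); intercept and
    # exact variable codings are not reported there, so this is illustrative only.
    coef_male = {"BMI": 0.735, "vegetables": -0.671, "age": 0.838,
                 "diastolic": 0.296, "physical_activity": -2.287,
                 "sleep": -0.009, "smoking": 0.214}

    subject = {"BMI": 1, "vegetables": 0, "age": 1, "diastolic": 1,
               "physical_activity": 0, "sleep": 1, "smoking": 1}  # hypothetical codings

    logit_p = sum(coef_male[k] * subject[k] for k in coef_male)
    risk = 1 / (1 + math.exp(-logit_p))  # inverse-logit gives the predicted risk
    print(f"logit = {logit_p:.3f}, predicted T2DM risk = {risk:.2%}")
    ```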

  6. Genomic prediction based on data from three layer lines using non-linear regression models

    NARCIS (Netherlands)

    Huang, H.; Windig, J.J.; Vereijken, A.; Calus, M.P.L.

    2014-01-01

    Background - Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. Methods - In an attempt to alleviate

  7. Photovoltaic Array Condition Monitoring Based on Online Regression of Performance Model

    DEFF Research Database (Denmark)

    Spataru, Sergiu; Sera, Dezso; Kerekes, Tamas

    2013-01-01

    regression modeling, from PV array production, plane-of-array irradiance, and module temperature measurements, acquired during an initial learning phase of the system. After the model has been parameterized automatically, the condition monitoring system enters the normal operation phase, where...

  8. Regression Phalanxes

    OpenAIRE

    Zhang, Hongyang; Welch, William J.; Zamar, Ruben H.

    2017-01-01

    Tomal et al. (2015) introduced the notion of "phalanxes" in the context of rare-class detection in two-class classification problems. A phalanx is a subset of features that work well for classification tasks. In this paper, we propose a different class of phalanxes for application in regression settings. We define a "Regression Phalanx" - a subset of features that work well together for prediction. We propose a novel algorithm which automatically chooses Regression Phalanxes from high-dimensi...

  9. Genomic prediction based on data from three layer lines using non-linear regression models.

    Science.gov (United States)

    Huang, Heyun; Windig, Jack J; Vereijken, Addie; Calus, Mario P L

    2014-11-06

    Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods. In an attempt to alleviate potential discrepancies between assumptions of linear models and multi-population data, two types of alternative models were used: (1) a multi-trait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) non-linear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values. When the training dataset included only data from the evaluated line, non-linear models yielded at best a similar accuracy as linear models. In some cases, when adding a distantly related line, the linear models showed a slight decrease in performance, while non-linear models generally showed no change in accuracy. When only information from a closely related line was used for training, linear models and non-linear radial basis function (RBF) kernel models performed similarly. The multi-trait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and non-linear models improved the accuracy of multi-line genomic prediction. Linear models and non-linear RBF models performed very similarly for genomic prediction, despite the expectation that non-linear models could deal better with the heterogeneous multi-population data. This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional
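    The contrast between a linear (GBLUP-like) kernel and a non-linear RBF kernel can be illustrated with kernel ridge regression on synthetic genotypes; the sample sizes loosely echo the paper's lines, but the data and hyperparameters are invented.

    ```python
    import numpy as np
    from sklearn.kernel_ridge import KernelRidge

    rng = np.random.default_rng(5)
    n_train, n_val, n_snp = 1000, 240, 500
    X = rng.integers(0, 3, (n_train + n_val, n_snp)).astype(float)  # 0/1/2 genotypes
    beta = rng.normal(0, 0.1, n_snp)                                # additive effects
    y = X @ beta + rng.normal(0, 1, n_train + n_val)

    Xtr, Xva, ytr, yva = X[:n_train], X[n_train:], y[:n_train], y[n_train:]

    for kernel, name in [("linear", "linear (GBLUP-like)"), ("rbf", "non-linear RBF")]:
        m = KernelRidge(kernel=kernel, alpha=1.0,
                        gamma=1e-3 if kernel == "rbf" else None)
        m.fit(Xtr, ytr)
        # Prediction accuracy as in the paper: correlation between observed
        # phenotypes and predictions.
        acc = np.corrcoef(yva, m.predict(Xva))[0, 1]
        print(f"{name}: r = {acc:.3f}")
    ```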

  10. [Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].

    Science.gov (United States)

    Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao

    2016-03-01

    Leaf area index (LAI) is a dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively, providing a reference for monitoring tree growth and estimating yield. Red Fuji apple trees at the full fruit-bearing stage were the research objects. The canopy spectral reflectance and LAI values of ninety apple trees were measured with an ASD Fieldspec3 spectrometer and LAI-2200 in thirty orchards over two consecutive years in the Qixia research area of Shandong Province. The optimal vegetation indices were selected by correlation analysis of the original spectral reflectance and vegetation indices. Models for predicting the LAI were built with the multivariate regression analysis methods of support vector machine (SVM) and random forest (RF). The new vegetation indices GNDVI527, NDVI676, RVI682, FD-NVI656 and GRVI517 and the two previous main vegetation indices NDVI670 and NDVI705 are in accordance with LAI. In the RF regression model, the calibration set determination coefficient C-R2 of 0.920 and validation set determination coefficient V-R2 of 0.889 are higher than those of the SVM regression model by 0.045 and 0.033, respectively. The calibration set root mean square error C-RMSE of 0.249 and the validation set root mean square error V-RMSE of 0.236 are lower than those of the SVM regression model by 0.054 and 0.058, respectively. The RPD values of the calibration set (C-RPD) and validation set (V-RPD) reached 3.363 and 2.520, higher than those of the SVM regression model by 0.598 and 0.262, respectively. The slopes of the measured-versus-predicted scatterplot trend lines for the calibration and validation sets (C-S and V-S) are close to 1. The estimation result of the RF regression model is better than that of the SVM, and the RF regression model can be used to estimate the LAI of Red Fuji apple trees in the full fruit period.
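    A minimal sketch of the SVM-versus-RF comparison, with synthetic stand-ins for the seven vegetation indices and the reported metrics (R2, RMSE, RPD); scikit-learn defaults are assumed rather than the authors' settings.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.svm import SVR
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import r2_score, mean_squared_error

    rng = np.random.default_rng(6)
    n = 90                                  # ninety canopies, as in the study
    VI = rng.normal(0, 1, (n, 7))           # stand-ins for the 7 vegetation indices
    lai = 2.5 + VI @ rng.normal(0.3, 0.1, 7) + rng.normal(0, 0.2, n)

    Xc, Xv, yc, yv = train_test_split(VI, lai, test_size=0.3, random_state=0)
    for name, model in [("SVM", SVR(C=10.0)),
                        ("RF", RandomForestRegressor(random_state=0))]:
        model.fit(Xc, yc)
        pred = model.predict(Xv)
        rmse = mean_squared_error(yv, pred) ** 0.5
        rpd = yv.std(ddof=1) / rmse         # ratio of SD to RMSE (RPD)
        print(f"{name}: V-R2={r2_score(yv, pred):.3f}  "
              f"V-RMSE={rmse:.3f}  V-RPD={rpd:.2f}")
    ```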

  11. Sensitivity and uncertainty analyses in aging risk-based prioritizations

    International Nuclear Information System (INIS)

    Hassan, M.; Uryas'ev, S.; Vesely, W.E.

    1993-01-01

    Aging risk evaluations of nuclear power plants using Probabilistic Risk Analyses (PRAs) involve assessments of the impact of aging structures, systems, and components (SSCs) on plant core damage frequency (CDF). These assessments can be used to prioritize the contributors to aging risk reflecting the relative risk potential of the SSCs. Aging prioritizations are important for identifying the SSCs contributing most to plant risk and can provide a systematic basis on which aging risk control and management strategies for a plant can be developed. However, these prioritizations are subject to variabilities arising from uncertainties in data, and/or from various modeling assumptions. The objective of this paper is to present an evaluation of the sensitivity of aging prioritizations of active components to uncertainties in aging risk quantifications. Approaches for robust prioritization of SSCs also are presented which are less susceptible to the uncertainties

  12. A SOCIOLOGICAL ANALYSIS OF THE CHILDBEARING COEFFICIENT IN THE ALTAI REGION BASED ON METHOD OF FUZZY LINEAR REGRESSION

    Directory of Open Access Journals (Sweden)

    Sergei Vladimirovich Varaksin

    2017-06-01

    Full Text Available Purpose. Construction of a mathematical model of the dynamics of childbearing change in the Altai region in 2000–2016, and analysis of the dynamics of changes in birth rates for multiple age categories of women of childbearing age. Methodology. An auxiliary element of the analysis is the construction of linear mathematical models of the dynamics of childbearing by the fuzzy linear regression method based on fuzzy numbers. Fuzzy linear regression is considered as an alternative to standard statistical linear regression for short time series with an unknown distribution law. The parameters of the fuzzy linear and standard statistical regressions for the childbearing time series were determined using an algorithm implemented in the MatLab language. The method of fuzzy linear regression has not previously been used in sociological research. Results. Conclusions are drawn about the socio-demographic changes in society, the high efficiency of the demographic policy of the leadership of the region and the country, and the applicability of the method of fuzzy linear regression for sociological analysis.

  13. Stress Regression Analysis of Asphalt Concrete Deck Pavement Based on Orthogonal Experimental Design and Interlayer Contact

    Science.gov (United States)

    Wang, Xuntao; Feng, Jianhu; Wang, Hu; Hong, Shidi; Zheng, Supei

    2018-03-01

    A three-dimensional finite element box girder bridge and its asphalt concrete deck pavement were established with ANSYS software, and the interlayer bonding of the asphalt concrete deck pavement was assumed to be a contact bonding condition. Orthogonal experimental design was used to arrange the testing plans for the material parameters, and the effect of the different material parameters on the mechanical response of the asphalt concrete surface layer was evaluated with a multiple linear regression model using the results from the finite element analysis. Results indicated that the stress regression equations can predict the stress of the asphalt concrete surface layer well, and that the elastic modulus of the waterproof layer has a significant influence on the stress values of the asphalt concrete surface layer.

  14. Discussion on Regression Methods Based on Ensemble Learning and Applicability Domains of Linear Submodels.

    Science.gov (United States)

    Kaneko, Hiromasa

    2018-02-26

    To develop a new ensemble learning method and construct highly predictive regression models in chemoinformatics and chemometrics, applicability domains (ADs) are introduced into the ensemble learning process of prediction. When estimating values of an objective variable using subregression models, only the submodels with ADs that cover a query sample, i.e., the sample is inside the model's AD, are used. By constructing submodels and changing a list of selected explanatory variables, the union of the submodels' ADs, which defines the overall AD, becomes large, and the prediction performance is enhanced for diverse compounds. By analyzing a quantitative structure-activity relationship data set and a quantitative structure-property relationship data set, it is confirmed that the ADs can be enlarged and the estimation performance of regression models is improved compared with traditional methods.
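    The idea of averaging only the submodels whose applicability domain covers the query can be sketched as follows; the AD is defined here as a distance-to-centroid threshold in each submodel's feature subspace, which is one common choice, not necessarily the paper's.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(7)
    X = rng.normal(0, 1, (200, 10))
    y = X[:, 0] * 2 - X[:, 3] + rng.normal(0, 0.3, 200)

    submodels = []
    for _ in range(30):
        feats = rng.choice(10, size=4, replace=False)   # vary the variable list
        rows = rng.choice(200, size=120, replace=False)
        m = LinearRegression().fit(X[np.ix_(rows, feats)], y[rows])
        centroid = X[np.ix_(rows, feats)].mean(axis=0)
        # AD threshold: e.g. the 95th percentile of training distances to centroid.
        d_train = np.linalg.norm(X[np.ix_(rows, feats)] - centroid, axis=1)
        submodels.append((feats, m, centroid, np.percentile(d_train, 95)))

    def predict_with_ad(x):
        """Average only the submodels whose applicability domain covers x."""
        preds = [m.predict(x[feats].reshape(1, -1))[0]
                 for feats, m, c, thr in submodels
                 if np.linalg.norm(x[feats] - c) <= thr]
        return np.mean(preds) if preds else None   # None: outside the overall AD

    print(predict_with_ad(rng.normal(0, 1, 10)))
    ```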

  15. Inverse estimation of multiple muscle activations based on linear logistic regression.

    Science.gov (United States)

    Sekiya, Masashi; Tsuji, Toshiaki

    2017-07-01

    This study deals with a technology for estimating muscle activity from movement data using a statistical model. A linear regression (LR) model and artificial neural networks (ANN) are known statistical models for such use. Although ANN has a high estimation capability, in clinical applications the limited amount of data often leads to performance deterioration. On the other hand, the LR model has a limitation in generalization performance. We therefore propose a muscle activity estimation method that improves the generalization performance through the use of a linear logistic regression model. The proposed method was compared with the LR model and ANN in a verification experiment with 7 participants. As a result, the proposed method showed better generalization performance than the conventional methods in various tasks.

  16. Linear and support vector regressions based on geometrical correlation of data

    Directory of Open Access Journals (Sweden)

    Kaijun Wang

    2007-10-01

    Full Text Available Linear regression (LR) and support vector regression (SVR) are widely used in data analysis. Geometrical correlation learning (GcLearn) was proposed recently to improve the predictive ability of LR and SVR through mining and using correlations between the data of a variable (inner correlation). This paper theoretically analyzes the prediction performance of the GcLearn method and proves that GcLearn LR and SVR will have better prediction performance than traditional LR and SVR when good inner correlations are obtained and the predictions by traditional LR and SVR are far away from their neighbor training data under inner correlation. This gives the applicability condition of the GcLearn method.

  17. Meta-regression analyses to explain statistical heterogeneity in a systematic review of strategies for guideline implementation in primary health care.

    Directory of Open Access Journals (Sweden)

    Susanne Unverzagt

    Full Text Available This study is an in-depth analysis to explain statistical heterogeneity in a systematic review of implementation strategies to improve guideline adherence of primary care physicians in the treatment of patients with cardiovascular diseases. The systematic review included randomized controlled trials from a systematic search in MEDLINE, EMBASE, CENTRAL, conference proceedings and registers of ongoing studies. Implementation strategies were shown to be effective, with substantial heterogeneity of treatment effects across all investigated strategies. The primary aim of this study was to explain the different effects of eligible trials and to identify methodological and clinical effect modifiers. Random effects meta-regression models were used to simultaneously assess the influence of multimodal implementation strategies and effect modifiers on physician adherence. Effect modifiers included the staff responsible for implementation, level of prevention and definition of the primary outcome, unit of randomization, duration of follow-up and risk of bias. Six clinical and methodological factors were investigated as potential effect modifiers of the efficacy of different implementation strategies on guideline adherence in primary care practices on the basis of information from 75 eligible trials. Five effect modifiers were able to explain a substantial amount of statistical heterogeneity. Physician adherence was improved by 62% (95% confidence interval (95% CI) 29 to 104%) or 29% (95% CI 5 to 60%) in trials where other non-medical professionals or nurses were included in the implementation process. Improvement of physician adherence was more successful in primary and secondary prevention of cardiovascular diseases by around 30% (30%; 95% CI −2 to 71% and 31%; 95% CI 9 to 57%, respectively) compared to tertiary prevention. This study aimed to identify effect modifiers of implementation strategies on physician adherence. Especially the cooperation of different health

  18. Evaluation of an optoacoustic based gas analysing device

    Science.gov (United States)

    Markmann, Janine; Lange, Birgit; Theisen-Kunde, Dirk; Danicke, Veit; Mayorov, Fedor; Eckert, Sebastian; Kettmann, Pascal; Brinkmann, Ralf

    2017-07-01

    The relative occurrence of volatile organic compounds in human respiratory gas is disease-specific (ppb range). A prototype of a gas-analysing device using two tuneable laser systems, an OPO-laser (2.5 to 10 μm) and a CO2-laser (9 to 11 μm), and an optoacoustic measurement cell was developed to detect concentrations in the ppb range. The sensitivity and resolution of the system were determined by test gas measurements, measuring ethylene and sulfur hexafluoride with the CO2-laser and butane with the OPO-laser. System sensitivity was found to be 13 ppb for sulfur hexafluoride and 17 ppb for ethylene. Respiratory gas samples of 8 healthy volunteers were investigated by irradiation with 17 laser lines of the CO2-laser. Several of those lines overlap with strong absorption bands of ammonia. As ammonia concentration is known to increase with age, a separation of people younger and older than 35 years was attempted. To evaluate the data, the first seven gas samples were used to train a discriminant analysis algorithm. The eighth subject, aged 49 years, was then correctly assigned to the group >35 years.

  19. A Fuzzy Logic Based Method for Analysing Test Results

    Directory of Open Access Journals (Sweden)

    Le Xuan Vinh

    2017-11-01

    Full Text Available Network operators must perform many tasks to ensure smooth operation of the network, such as planning, monitoring, etc. Among those tasks, regular testing of network performance and network errors, and troubleshooting, is very important. Meaningful test results will allow the operators to evaluate network performance, identify any shortcomings and better plan for network upgrades. Due to the diverse and mainly unquantifiable nature of network testing results, there is a need to develop a method for systematically and rigorously analysing these results. In this paper, we present STAM (System Test-result Analysis Method), which employs a bottom-up hierarchical processing approach using fuzzy logic. STAM is capable of combining all test results into a quantitative description of the network performance in terms of network stability, the significance of various network errors, and the performance of each function block within the network. The validity of this method has been successfully demonstrated in assisting the testing of a VoIP system at the Research Institute of Post and Telecoms in Vietnam. The paper is organized as follows. The first section gives an overview of fuzzy logic theory, the concepts of which will be used in the development of STAM. The next section describes STAM. The last section, demonstrating STAM's capability, presents a success story in which STAM is successfully applied.

  20. Visualizing Confidence in Cluster-Based Ensemble Weather Forecast Analyses.

    Science.gov (United States)

    Kumpf, Alexander; Tost, Bianca; Baumgart, Marlene; Riemer, Michael; Westermann, Rudiger; Rautenhaus, Marc

    2018-01-01

    In meteorology, cluster analysis is frequently used to determine representative trends in ensemble weather predictions in a selected spatio-temporal region, e.g., to reduce a set of ensemble members to simplify and improve their analysis. Identified clusters (i.e., groups of similar members), however, can be very sensitive to small changes of the selected region, so that clustering results can be misleading and bias subsequent analyses. In this article, we, a team of visualization scientists and meteorologists, deliver visual analytics solutions to analyze the sensitivity of clustering results with respect to changes of a selected region. We propose an interactive visual interface that enables simultaneous visualization of a) the variation in composition of identified clusters (i.e., their robustness), b) the variability in cluster membership for individual ensemble members, and c) the uncertainty in the spatial locations of identified trends. We demonstrate that our solution shows meteorologists how representative a clustering result is, and with respect to which changes in the selected region it becomes unstable. Furthermore, our solution helps to identify those ensemble members which stably belong to a given cluster and can thus be considered similar. In a real-world application case we show how our approach is used to analyze the clustering behavior of different regions in a forecast of "Tropical Cyclone Karl", guiding the user towards the cluster robustness information required for subsequent ensemble analysis.

  1. A Cyber-Attack Detection Model Based on Multivariate Analyses

    Science.gov (United States)

    Sakai, Yuto; Rinsaka, Koichiro; Dohi, Tadashi

    In the present paper, we propose a novel cyber-attack detection model that applies two multivariate-analysis methods to the audit data observed on a host machine. The statistical techniques used here are the well-known Hayashi's quantification method IV and the cluster analysis method. We quantify the observed qualitative audit event sequence via the quantification method IV, and collect similar audit event sequences into the same groups based on the cluster analysis. It is shown in simulation experiments that our model can improve the cyber-attack detection accuracy in some realistic cases where both normal and attack activities are intermingled.

  2. How distributed processing produces false negatives in voxel-based lesion-deficit analyses.

    Science.gov (United States)

    Gajardo-Vidal, Andrea; Lorca-Puls, Diego L; Crinion, Jennifer T; White, Jitrachote; Seghier, Mohamed L; Leff, Alex P; Hope, Thomas M H; Ludersdorfer, Philipp; Green, David W; Bowman, Howard; Price, Cathy J

    2018-07-01

    In this study, we hypothesized that if the same deficit can be caused by damage to one or another part of a distributed neural system, then voxel-based analyses might miss critical lesion sites because preservation of each site will not be consistently associated with preserved function. The first part of our investigation used voxel-based multiple regression analyses of data from 359 right-handed stroke survivors to identify brain regions where lesion load is associated with picture naming abilities after factoring out variance related to object recognition, semantics and speech articulation so as to focus on deficits arising at the word retrieval level. A highly significant lesion-deficit relationship was identified in left temporal and frontal/premotor regions. Post-hoc analyses showed that damage to either of these sites caused the deficit of interest in less than half the affected patients (76/162 = 47%). After excluding all patients with damage to one or both of the identified regions, our second analysis revealed a new region, in the anterior part of the left putamen, which had not been previously detected because many patients had the deficit of interest after temporal or frontal damage that preserved the left putamen. The results illustrate how (i) false negative results arise when the same deficit can be caused by different lesion sites; (ii) some of the missed effects can be unveiled by adopting an iterative approach that systematically excludes patients with lesions to the areas identified in previous analyses, (iii) statistically significant voxel-based lesion-deficit mappings can be driven by a subset of patients; (iv) focal lesions to the identified regions are needed to determine whether the deficit of interest is the consequence of focal damage or much more extensive damage that includes the identified region; and, finally, (v) univariate voxel-based lesion-deficit mappings cannot, in isolation, be used to predict outcome in other patients

  3. Adjusting for overdispersion in piecewise exponential regression models to estimate excess mortality rate in population-based research.

    Science.gov (United States)

    Luque-Fernandez, Miguel Angel; Belot, Aurélien; Quaresma, Manuela; Maringe, Camille; Coleman, Michel P; Rachet, Bernard

    2016-10-01

    In population-based cancer research, piecewise exponential regression models are used to derive adjusted estimates of excess mortality due to cancer using the Poisson generalized linear modelling framework. However, the assumption that the conditional mean and variance of the rate parameter given the set of covariates x_i are equal is strong and may fail to account for overdispersion given the variability of the rate parameter (the variance exceeds the mean). Using an empirical example, we aimed to describe simple methods to test and correct for overdispersion. We used a regression-based score test for overdispersion under the relative survival framework and proposed different approaches to correct for overdispersion, including a quasi-likelihood, robust standard errors estimation, negative binomial regression and flexible piecewise modelling. All piecewise exponential regression models showed the presence of significant inherent overdispersion (p-value < 0.001). Flexible piecewise regression modelling, with either a quasi-likelihood or robust standard errors, was the best approach as it deals with both overdispersion due to model misspecification and true or inherent overdispersion.
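    A minimal illustration of the testing-and-correcting workflow on synthetic person-time data, using statsmodels rather than the authors' relative-survival code: the Pearson chi-squared ratio flags overdispersion, and robust standard errors or a negative binomial family are two of the corrections discussed.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Hypothetical split data: one row per subject x follow-up time band.
    rng = np.random.default_rng(8)
    df = pd.DataFrame({
        "band": rng.integers(0, 5, 4000),     # follow-up interval
        "age": rng.normal(65, 10, 4000),
        "pt": rng.uniform(0.2, 1.0, 4000),    # person-time at risk
    })
    mu = np.exp(-3 + 0.03 * df.age + 0.1 * df.band) * df.pt
    df["deaths"] = rng.negative_binomial(1.5, (1.5 / (1.5 + mu)).to_numpy())  # overdispersed counts

    f = "deaths ~ age + C(band)"
    poisson = smf.glm(f, df, family=sm.families.Poisson(),
                      offset=np.log(df.pt)).fit()
    # A ratio well above 1 signals overdispersion relative to the Poisson model.
    print("Pearson chi2 / df:", poisson.pearson_chi2 / poisson.df_resid)

    # Two of the corrections discussed: sandwich (robust) standard errors
    # and a negative binomial model.
    robust = smf.glm(f, df, family=sm.families.Poisson(),
                     offset=np.log(df.pt)).fit(cov_type="HC0")
    negbin = smf.glm(f, df, family=sm.families.NegativeBinomial(),
                     offset=np.log(df.pt)).fit()
    print("SE(age):", robust.bse["age"], negbin.bse["age"])
    ```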

  4. Analysing co-articulation using frame-based feature trajectories

    CSIR Research Space (South Africa)

    Badenhorst, J

    2010-11-01

    Full Text Available The authors investigate several approaches aimed at a more detailed understanding of co-articulation in spoken utterances. They find that the Euclidean difference between instantaneous frame-based feature values and the mean values of these features...

  5. PCR and RFLP analyses based on the ribosomal protein operon

    Science.gov (United States)

    Differentiation and classification of phytoplasmas have been primarily based on the highly conserved 16Sr RNA gene. RFLP analysis of 16Sr RNA gene sequences has identified 31 16Sr RNA (16Sr) groups and more than 100 16Sr subgroups. Classification of phytoplasma strains can however, become more refin...

  6. Landslide susceptibility mapping using decision-tree based CHi-squared automatic interaction detection (CHAID) and Logistic regression (LR) integration

    International Nuclear Information System (INIS)

    Althuwaynee, Omar F; Pradhan, Biswajeet; Ahmad, Noordin

    2014-01-01

    This article uses methodology based on chi-squared automatic interaction detection (CHAID), a multivariate method that has an automatic classification capacity to analyse large numbers of landslide conditioning factors. This new algorithm was developed to overcome the subjectivity of the manual categorization of scale data of landslide conditioning factors, and to produce a rainfall-induced susceptibility map for Kuala Lumpur city and the surrounding areas using a geographic information system (GIS). The main objective of this article is to use the chi-squared automatic interaction detection (CHAID) method to perform the best classification fit for each conditioning factor, and then to combine it with logistic regression (LR). The LR model was used to find the corresponding coefficients of the best-fitting function that assesses the optimal terminal nodes. A cluster pattern of landslide locations was extracted in a previous study using the nearest neighbour index (NNI), which was then used to identify the range of clustered landslide locations. Clustered locations were used as model training data with 14 landslide conditioning factors such as topographically derived parameters, lithology, NDVI, and land use and land cover maps. The Pearson chi-squared value was used to find the best classification fit between the dependent variable and the conditioning factors. Finally, the relationships between the conditioning factors were assessed and the landslide susceptibility map (LSM) was produced. An area under the curve (AUC) test was used to assess the model's reliability and prediction capability with the training and validation landslide locations, respectively. This study proved the efficiency and reliability of the decision tree (DT) model in landslide susceptibility mapping. It also provided a valuable scientific basis for spatial decision making in planning and urban management studies

  7. Landslide susceptibility mapping using decision-tree based CHi-squared automatic interaction detection (CHAID) and Logistic regression (LR) integration

    Science.gov (United States)

    Althuwaynee, Omar F.; Pradhan, Biswajeet; Ahmad, Noordin

    2014-06-01

    This article uses methodology based on chi-squared automatic interaction detection (CHAID), a multivariate method that has an automatic classification capacity to analyse large numbers of landslide conditioning factors. This new algorithm was developed to overcome the subjectivity of the manual categorization of scale data of landslide conditioning factors, and to produce a rainfall-induced susceptibility map for Kuala Lumpur city and the surrounding areas using a geographic information system (GIS). The main objective of this article is to use the chi-squared automatic interaction detection (CHAID) method to perform the best classification fit for each conditioning factor, and then to combine it with logistic regression (LR). The LR model was used to find the corresponding coefficients of the best-fitting function that assesses the optimal terminal nodes. A cluster pattern of landslide locations was extracted in a previous study using the nearest neighbour index (NNI), which was then used to identify the range of clustered landslide locations. Clustered locations were used as model training data with 14 landslide conditioning factors such as topographically derived parameters, lithology, NDVI, and land use and land cover maps. The Pearson chi-squared value was used to find the best classification fit between the dependent variable and the conditioning factors. Finally, the relationships between the conditioning factors were assessed and the landslide susceptibility map (LSM) was produced. An area under the curve (AUC) test was used to assess the model's reliability and prediction capability with the training and validation landslide locations, respectively. This study proved the efficiency and reliability of the decision tree (DT) model in landslide susceptibility mapping. It also provided a valuable scientific basis for spatial decision making in planning and urban management studies.
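    CHAID itself is not available in scikit-learn, so the following sketch approximates the two-stage idea: a small decision tree categorizes each conditioning factor (an entropy CART stump stands in for the chi-squared splits), and logistic regression on the one-hot categories yields the susceptibility probabilities. Data and settings are illustrative.

    ```python
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.preprocessing import OneHotEncoder
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(9)
    X = rng.normal(0, 1, (1000, 5))    # stand-ins for 5 conditioning factors
    y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(0, 1, 1000) > 0).astype(int)  # landslide?

    # Stage 1: tree-based categorization of each factor (CHAID uses chi-squared
    # splits; a per-factor CART stump is used here as an approximation).
    bins = []
    for j in range(5):
        t = DecisionTreeClassifier(criterion="entropy", max_leaf_nodes=4,
                                   random_state=0)
        t.fit(X[:, [j]], y)
        bins.append(t.apply(X[:, [j]]))   # leaf id = category of factor j
    B = np.column_stack(bins)

    # Stage 2: logistic regression on the one-hot encoded categories gives
    # the susceptibility probability for each mapping unit.
    Z = OneHotEncoder(sparse_output=False).fit_transform(B)
    lsm = LogisticRegression(max_iter=1000).fit(Z, y)
    print("susceptibility of first unit:", lsm.predict_proba(Z[:1])[0, 1])
    ```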

  8. Propensity-score matching in economic analyses: comparison with regression models, instrumental variables, residual inclusion, differences-in-differences, and decomposition methods.

    Science.gov (United States)

    Crown, William H

    2014-02-01

    This paper examines the use of propensity score matching in economic analyses of observational data. Several excellent papers have previously reviewed practical aspects of propensity score estimation and other aspects of the propensity score literature. The purpose of this paper is to compare the conceptual foundation of propensity score models with alternative estimators of treatment effects. References are provided to empirical comparisons among methods that have appeared in the literature. These comparisons are available for a subset of the methods considered in this paper. However, in some cases, no pairwise comparisons of particular methods are yet available, and there are no examples of comparisons across all of the methods surveyed here. Irrespective of the availability of empirical comparisons, the goal of this paper is to provide some intuition about the relative merits of alternative estimators in health economic evaluations where nonlinearity, sample size, availability of pre/post data, heterogeneity, and missing variables can have important implications for choice of methodology. Also considered is the potential combination of propensity score matching with alternative methods such as differences-in-differences and decomposition methods that have not yet appeared in the empirical literature.
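    For concreteness, a generic 1:1 nearest-neighbour propensity score matching estimator, one of the methods being compared, might look like this on synthetic data with a known treatment effect of 2:

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import NearestNeighbors

    rng = np.random.default_rng(10)
    X = rng.normal(0, 1, (2000, 4))    # observed covariates
    treat = (X @ np.array([0.5, 0.3, 0, 0]) + rng.normal(0, 1, 2000) > 0).astype(int)
    y = 2.0 * treat + X @ np.array([1, 1, 0.5, 0]) + rng.normal(0, 1, 2000)

    # Estimate propensity scores and match treated to controls on the logit scale.
    ps = LogisticRegression().fit(X, treat).predict_proba(X)[:, 1]
    lps = np.log(ps / (1 - ps))

    nn = NearestNeighbors(n_neighbors=1).fit(lps[treat == 0].reshape(-1, 1))
    _, idx = nn.kneighbors(lps[treat == 1].reshape(-1, 1))
    att = (y[treat == 1] - y[treat == 0][idx.ravel()]).mean()
    print("matched estimate of treatment effect:", round(att, 2))  # ~2 expected
    ```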

  9. Short term load forecasting technique based on the seasonal exponential adjustment method and the regression model

    International Nuclear Information System (INIS)

    Wu, Jie; Wang, Jianzhou; Lu, Haiyan; Dong, Yao; Lu, Xiaoxiao

    2013-01-01

    Highlights: ► The seasonal and trend items of the data series are forecasted separately. ► The seasonal item in the data series is verified by Kendall τ correlation testing. ► Different regression models are applied to the trend item forecasting. ► We examine the superiority of the combined models by quartile value comparison. ► A paired-sample T test is utilized to confirm the superiority of the combined models. - Abstract: For an energy-limited economic system, it is crucial to forecast load demand accurately. This paper is devoted to a 1-week-ahead daily load forecasting approach in which the load demand series is predicted by employing information from days similar to the forecast day. As in many nonlinear systems, a seasonal item and a trend item coexist in load demand datasets. In this paper, the existence of the seasonal item in the load demand data series is first verified with the Kendall τ correlation testing method. Then, in the belief that forecasting the seasonal item and the trend item separately would improve the forecasting accuracy, hybrid models combining the seasonal exponential adjustment method (SEAM) with regression methods are proposed, where SEAM and the regression models are employed for seasonal and trend item forecasting, respectively. Comparisons of the quartile values as well as the mean absolute percentage error values demonstrate that this forecasting technique can significantly improve the accuracy, even though eleven different models are applied to the trend item forecasting. The superior performance of this separate forecasting technique is further confirmed by paired-sample T tests.
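    A simplified stand-in for the hybrid scheme, using multiplicative day-of-week indices for the seasonal item (in place of the paper's SEAM) and plain linear regression for the trend item; the data are synthetic.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(11)
    days = np.arange(28 * 4)    # 16 weeks of daily load
    weekly = np.tile([1.1, 1.05, 1.0, 1.0, 1.05, 0.85, 0.8], 16)
    load = (500 + 1.5 * days) * weekly + rng.normal(0, 10, len(days))

    # Seasonal item: multiplicative day-of-week indices (a simplified stand-in
    # for the seasonal exponential adjustment method).
    dow = days % 7
    idx = np.array([load[dow == d].mean() for d in range(7)])
    idx /= idx.mean()
    deseason = load / idx[dow]

    # Trend item: any regression model; plain linear regression here.
    trend = LinearRegression().fit(days.reshape(-1, 1), deseason)

    # 1-week-ahead forecast: re-apply the seasonal index to the trend forecast.
    future = np.arange(len(days), len(days) + 7)
    forecast = trend.predict(future.reshape(-1, 1)) * idx[future % 7]
    print(np.round(forecast, 1))
    ```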

  10. Analysing Leontiev Tube Capabilities in the Space-based Plants

    Directory of Open Access Journals (Sweden)

    N. L. Shchegolev

    2017-01-01

    Full Text Available The paper presents a review of publications dedicated to the gas-dynamic temperature stratification device (the Leontiev tube) and shows the main factors affecting its efficiency. It describes an experimental installation that is used to obtain data on the value of energy separation in air to prove the operability of this device. The assumption that there is an optimal relationship between the flow velocities in the subsonic and supersonic channels of the gas-dynamic temperature stratification device is experimentally confirmed. The paper analyses possible ways to raise the efficiency of power plants of various basing (including space basing), and shows that the current mainstream of increasing the efficiency of their operation is to complicate design solutions. A scheme of a closed gas-turbine space-based plant using a mixture of inert gases (a helium-xenon mixture) for operation is proposed. What distinguishes it from the simplest variants is the lack of a cooler-radiator and the integration of a gas-dynamic temperature stratification device and a heat compressor. Based on the equations of one-dimensional gas dynamics, it is shown that the restorability of the total pressure when removing heat in a thermal compressor determines the operating capability of this scheme. An exploratory study of creating a heat compressor is performed, and it is shown that when operating on gases with a Prandtl number close to 1, the total pressure does not increase. The operability conditions of the heat compressor are operation on gases with a low Prandtl number (a helium-xenon mixture) at high supersonic velocities and with a longitudinal pressure gradient available. It is shown that there is a region of low Prandtl numbers (Pr < 0.3) for which, with a longitudinal pressure gradient available in supersonic flows of a viscous gas, the total pressure can be restored.

  11. Design of the storage location based on the ABC analyses

    Science.gov (United States)

    Jemelka, Milan; Chramcov, Bronislav; Kříž, Pavel

    2016-06-01

    The paper focuses on process efficiency and saving storage costs. Maintaining inventory through a putaway strategy takes personnel time and costs money. The aim is to control inventory in the best way. The ABC classification, based on Vilfredo Pareto's theory, is used for the design of the warehouse layout. The new design of storage locations reduces the travel distance of fork-lifters and total costs, and it increases inventory process efficiency. The suggested solutions and an evaluation of the achieved results are described in detail. The proposed solutions were realized in real warehouse operation.
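    A minimal ABC classification by cumulative Pareto share, with hypothetical pick frequencies; the 80%/95% cut-offs are conventional choices, not taken from the paper.

    ```python
    import pandas as pd

    # Hypothetical pick frequencies per SKU; classes follow cumulative share.
    skus = pd.DataFrame({"sku": [f"S{i:03d}" for i in range(10)],
                         "picks": [500, 340, 210, 90, 60, 40, 25, 15, 10, 5]})
    skus = skus.sort_values("picks", ascending=False)
    cum = skus["picks"].cumsum() / skus["picks"].sum()
    skus["class"] = pd.cut(cum, bins=[0, 0.8, 0.95, 1.0], labels=list("ABC"))
    print(skus)   # A-items would get the storage locations closest to dispatch
    ```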

  12. Recognition of NEMP and LEMP signals based on auto-regression model and artificial neural network

    International Nuclear Information System (INIS)

    Li Peng; Song Lijun; Han Chao; Zheng Yi; Cao Baofeng; Li Xiaoqiang; Zhang Xueqin; Liang Rui

    2010-01-01

    The auto-regression (AR) model, a power spectrum estimation method for stationary random signals, and an artificial neural network were adopted to recognize nuclear and lightning electromagnetic pulses. Autocorrelation-function and Burg algorithms were used to acquire the AR model coefficients as feature values, and a BP artificial neural network was introduced as the classifier with different numbers of hidden layers and hidden layer nodes. The results show that the AR model is effective for feature extraction from these signals, and that the Burg algorithm is more effective than the autocorrelation-function algorithm. (authors)
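    The feature-extraction step, AR coefficients estimated from the autocorrelation function, can be sketched with plain numpy (a Yule-Walker solve); the pulse shape and model order below are illustrative assumptions, and the resulting coefficient vector is what would be fed to the BP network classifier.

    ```python
    import numpy as np

    def yule_walker_ar(x, order=8):
        """AR coefficients via the autocorrelation (Yule-Walker) method,
        i.e. the autocorrelation-function algorithm mentioned in the abstract."""
        x = x - x.mean()
        r = np.correlate(x, x, mode="full")[len(x) - 1:] / len(x)
        R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
        return np.linalg.solve(R, r[1:order + 1])

    rng = np.random.default_rng(14)
    t = np.linspace(0, 1e-6, 512)
    # A double-exponential pulse as a crude NEMP-like waveform (illustrative).
    nemp_like = np.exp(-t / 2e-7) - np.exp(-t / 4e-8) + rng.normal(0, 0.01, 512)
    features = yule_walker_ar(nemp_like)
    print(np.round(features, 3))   # feature vector for the neural network
    ```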

  13. Classification of Error-Diffused Halftone Images Based on Spectral Regression Kernel Discriminant Analysis

    Directory of Open Access Journals (Sweden)

    Zhigao Zeng

    2016-01-01

    Full Text Available This paper proposes a novel algorithm to solve the challenging problem of classifying error-diffused halftone images. We first design the class feature matrices, after extracting the image patches according to their statistical characteristics, to classify the error-diffused halftone images. Then, spectral regression kernel discriminant analysis is used for feature dimension reduction. The error-diffused halftone images are finally classified using an idea similar to the nearest centroids classifier. As demonstrated by the experimental results, our method is fast and can achieve a high classification accuracy rate, with an added benefit of robustness in tackling noise.

  14. Forecasting Advective Sea Fog with the Use of Classification and Regression Tree Analyses for Kunsan Air Base

    National Research Council Canada - National Science Library

    Lewis, Danielle

    2004-01-01

    .... To date, there are no suitable methods developed for forecasting advective sea fog at Kunsan, primarily due to a lack of understanding of sea fog formation under various synoptic situations over the Yellow Sea...

  15. Economic evaluation of algae biodiesel based on meta-analyses

    Science.gov (United States)

    Zhang, Yongli; Liu, Xiaowei; White, Mark A.; Colosi, Lisa M.

    2017-08-01

    The objective of this study is to elucidate the economic viability of algae-to-energy systems at a large scale, by developing a meta-analysis of five previously published economic evaluations of systems producing algae biodiesel. Data from original studies were harmonised into a standardised framework using financial and technical assumptions. Results suggest that the selling price of algae biodiesel under the base case would be $5.00-10.31/gal, higher than the selected benchmarks: $3.77/gal for petroleum diesel, and $4.21/gal for commercial biodiesel (B100) from conventional vegetable oil or animal fat. However, the projected selling price of algal biodiesel ($2.76-4.92/gal), following anticipated improvements, would be competitive. A scenario-based sensitivity analysis reveals that the price of algae biodiesel is most sensitive to algae biomass productivity, algae oil content, and algae cultivation cost. This indicates that the improvements in the yield, quality, and cost of algae feedstock could be the key factors to make algae-derived biodiesel economically viable.

  16. Predicting Charging Time of Battery Electric Vehicles Based on Regression and Time-Series Methods: A Case Study of Beijing

    Directory of Open Access Journals (Sweden)

    Jun Bi

    2018-04-01

    Full Text Available Battery electric vehicles (BEVs reduce energy consumption and air pollution as compared with conventional vehicles. However, the limited driving range and potential long charging time of BEVs create new problems. Accurate charging time prediction of BEVs helps drivers determine travel plans and alleviate their range anxiety during trips. This study proposed a combined model for charging time prediction based on regression and time-series methods according to the actual data from BEVs operating in Beijing, China. After data analysis, a regression model was established by considering the charged amount for charging time prediction. Furthermore, a time-series method was adopted to calibrate the regression model, which significantly improved the fitting accuracy of the model. The parameters of the model were determined by using the actual data. Verification results confirmed the accuracy of the model and showed that the model errors were small. The proposed model can accurately depict the charging time characteristics of BEVs in Beijing.
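    The two-stage structure, a regression on the charged amount followed by a time-series calibration of its residuals, can be sketched as below; an AR(1) is assumed for the calibration step, which may differ from the paper's specification, and the data are simulated.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(12)
    charged_kwh = rng.uniform(5, 50, 300)   # charged amount per session
    # Charging time grows with charged amount, plus autocorrelated disturbances.
    noise = np.convolve(rng.normal(0, 3, 300), [1, 0.6], mode="same")
    minutes = 10 + 2.2 * charged_kwh + noise

    # Stage 1: regression of charging time on the charged amount.
    reg = LinearRegression().fit(charged_kwh.reshape(-1, 1), minutes)
    resid = minutes - reg.predict(charged_kwh.reshape(-1, 1))

    # Stage 2: a time-series model on the residual sequence calibrates the
    # regression prediction for the next session.
    cal = ARIMA(resid, order=(1, 0, 0)).fit()
    next_time = reg.predict([[30.0]])[0] + cal.forecast(1)[0]  # 30 kWh session
    print(f"predicted charging time: {next_time:.1f} min")
    ```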

  17. Blood glucose level prediction based on support vector regression using mobile platforms.

    Science.gov (United States)

    Reymann, Maximilian P; Dorschky, Eva; Groh, Benjamin H; Martindale, Christine; Blank, Peter; Eskofier, Bjoern M

    2016-08-01

    The correct treatment of diabetes is vital to a patient's health: staying within defined blood glucose levels prevents dangerous short- and long-term effects on the body. Mobile devices informing patients about their future blood glucose levels could enable them to take counter-measures to prevent hypo- or hyperglycemic periods. Previous work addressed this challenge by predicting blood glucose levels using regression models. However, these approaches required a physiological model, representing the human body's response to insulin and glucose intake, or were not directly applicable to mobile platforms (smart phones, tablets). In this paper, we propose an algorithm for mobile platforms to predict blood glucose levels without the need for a physiological model. Using an online software simulator program, we trained a Support Vector Regression (SVR) model and exported the parameter settings to our mobile platform. The prediction accuracy of our mobile platform was evaluated with pre-recorded data of a type 1 diabetes patient. The blood glucose level was predicted with an error of 19% compared to the true value. Considering the permitted error of commercially used devices of 15%, our algorithm is the basis for further development of mobile prediction algorithms.
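    A lag-feature SVR of the kind described might be set up as follows; the simulated glucose trace, window sizes and hyperparameters are all illustrative assumptions rather than the paper's settings.

    ```python
    import numpy as np
    from sklearn.svm import SVR

    rng = np.random.default_rng(13)
    # Synthetic CGM-like trace (mg/dL), one sample every 5 minutes.
    t = np.arange(600)
    glucose = 120 + 30 * np.sin(t / 40.0) + rng.normal(0, 3, 600)

    # Features: the last 6 samples (30 min of history); target: 30 min ahead.
    lags, horizon = 6, 6
    X = np.array([glucose[i - lags:i] for i in range(lags, len(glucose) - horizon)])
    y = glucose[lags + horizon:len(glucose)]

    svr = SVR(kernel="rbf", C=10.0, epsilon=1.0).fit(X[:400], y[:400])
    pred = svr.predict(X[400:])
    # Mean absolute relative difference, comparable to the error figure quoted.
    mard = np.mean(np.abs(pred - y[400:]) / y[400:]) * 100
    print(f"mean absolute relative error: {mard:.1f}%")
    ```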

  18. Spatial and Inter-temporal Sources of Poverty, Inequality and Gender Disparities in Cameroon: a Regression-Based Decomposition Analysis

    OpenAIRE

    Boniface Ngah Epo; Francis Menjo Baye; Nadine Teme Angele Manga

    2011-01-01

    This study applies the regression-based inequality decomposition technique to explain poverty and inequality trends in Cameroon. We also identify gender related factors which explain income disparities and discrimination based on the 2001 and 2007 Cameroon household consumption surveys. The results show that education, health, employment in the formal sector, age cohorts, household size, gender, ownership of farmland and urban versus rural residence explain household economic wellbeing; dispa...

  19. Development of a Multiple Linear Regression Model to Forecast Facility Electrical Consumption at an Air Force Base.

    Science.gov (United States)

    1981-09-01

    corresponds to the same square footage that consumed the electrical energy. 3. The basic assumptions of multiple linear regression, as enumerated in ... 7. Data related to the sample of bases is assumed to be representative of bases in the population. Limitations: Basic limitations on this research were ... Ratemaking--Overview. Rand Report R-5894, Santa Monica CA, May 1977. Chatterjee, Samprit, and Bertram Price. Regression Analysis by Example. New York: John Wiley & Sons.

  20. Exploring Person Fit with an Approach Based on Multilevel Logistic Regression

    Science.gov (United States)

    Walker, A. Adrienne; Engelhard, George, Jr.

    2015-01-01

    The idea that test scores may not be valid representations of what students know, can do, and should learn next is well known. Person fit provides an important aspect of validity evidence. Person fit analyses at the individual student level are not typically conducted and person fit information is not communicated to educational stakeholders. In…

  1. Evidence for Endothermy in Pterosaurs Based on Flight Capability Analyses

    Science.gov (United States)

    Jenkins, H. S.; Pratson, L. F.

    2005-12-01

    Previous attempts to constrain flight capability in pterosaurs have relied heavily on the fossil record, using bone articulation and apparent muscle allocation to evaluate flight potential (Frey et al., 1997; Padian, 1983; Bramwell, 1974). However, broad definitions of the physical parameters necessary for flight in pterosaurs remain loosely defined, and few systematic approaches to constraining flight capability have been synthesized (Templin, 2000; Padian, 1983). Here we present a new method to assess flight capability in pterosaurs as a function of humerus length and flight velocity. By creating an energy-balance model to evaluate the power required for flight against the power available to the animal, we derive a U-shaped power curve and infer optimal flight speeds and maximal wingspan lengths for the pterosaurs Quetzalcoatlus northropi and Pteranodon ingens. Our model corroborates empirically derived power curves for the modern black-billed magpie (Pica pica) and accurately reproduces the mechanical power curve for modern cockatiels (Nymphicus hollandicus) (Tobalske et al., 2003). When we adjust our model to include an endothermic metabolic rate for pterosaurs, we find a maximal wingspan length of 18 meters for Q. northropi. Model runs using an ectothermic metabolism derive maximal wingspans of 6-8 meters. As estimates based on fossil evidence show total wingspan lengths reaching up to 15 meters for Q. northropi, we conclude that large pterosaurs may have been endothermic and therefore more metabolically similar to birds than to reptiles.
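
    The U-shaped power curve the authors describe arises because induced power falls with speed while parasite power rises with its cube. The sketch below reproduces that shape with placeholder constants (the mass, wingspan and drag-area values are assumptions, not the study's fitted values) and locates the minimum-power speed:

      import numpy as np

      mass, g, rho = 200.0, 9.81, 1.225         # kg, m/s^2, kg/m^3 (assumed)
      wingspan = 10.0                           # m (assumed)
      k_ind = (mass * g) ** 2 / (0.5 * np.pi * rho * wingspan ** 2)  # induced-power factor
      k_par = 0.5 * rho * 0.5                   # parasite-drag area term (assumed)

      v = np.linspace(4, 30, 200)               # flight speed, m/s
      p_required = k_ind / v + k_par * v ** 3   # U-shaped power curve, W

      v_opt = v[np.argmin(p_required)]
      print(f"minimum-power speed: {v_opt:.1f} m/s, "
            f"power there: {p_required.min() / 1000:.1f} kW")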

  2. Quantitative Prediction of Coalbed Gas Content Based on Seismic Multiple-Attribute Analyses

    Directory of Open Access Journals (Sweden)

    Renfang Pan

    2015-09-01

    Full Text Available Accurate prediction of the planar distribution of gas is crucial to the selection and development of new CBM exploration areas. Based on seismic attributes, well logging and testing data, we found that seismic absorption attenuation, after eliminating the effects of burial depth, shows an evident correlation with CBM gas content; positive structure curvature has a negative correlation with gas content; and density has a negative correlation with gas content. It is feasible to use the hydrocarbon index (P*G) and pseudo-Poisson ratio attributes for detection of gas enrichment zones. Based on seismic multiple-attribute analyses, a multiple linear regression equation was established between the seismic attributes and gas content at the drilling wells. Application of this equation to the seismic attributes at locations other than the drilling wells yielded a quantitative prediction of planar gas distribution. Prediction calculations were performed for two different models, one using pre-stack inversion and the other disregarding it. A comparison of the results indicates that both models predicted a similar trend for gas content distribution, except that the model using pre-stack inversion yielded a prediction result with considerably higher precision.

  3. Regression-based statistical mediation and moderation analysis in clinical research: Observations, recommendations, and implementation.

    Science.gov (United States)

    Hayes, Andrew F; Rockwood, Nicholas J

    2017-11-01

    There have been numerous treatments in the clinical research literature about various design, analysis, and interpretation considerations when testing hypotheses about mechanisms and contingencies of effects, popularly known as mediation and moderation analysis. In this paper we address the practice of mediation and moderation analysis using linear regression in the pages of Behaviour Research and Therapy and offer some observations and recommendations, debunk some popular myths, describe some new advances, and provide an example of mediation, moderation, and their integration as conditional process analysis using the PROCESS macro for SPSS and SAS. Our goal is to nudge clinical researchers away from historically significant but increasingly old school approaches toward modifications, revisions, and extensions that characterize more modern thinking about the analysis of the mechanisms and contingencies of effects. Copyright © 2016 Elsevier Ltd. All rights reserved.
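
    For readers who want the bare mechanics behind tools like PROCESS, here is a minimal regression-based mediation sketch in Python (not the PROCESS macro itself): path a regresses M on X, path b regresses Y on M controlling for X, and a percentile bootstrap gives a confidence interval for the indirect effect a*b. Data and effect sizes are synthetic.

      import numpy as np
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(2)
      n = 300
      x = rng.normal(size=n)
      m = 0.5 * x + rng.normal(size=n)           # mediator
      y = 0.4 * m + 0.2 * x + rng.normal(size=n) # outcome

      def indirect(idx):
          """Indirect effect a*b on a (bootstrap) sample of row indices."""
          a = LinearRegression().fit(x[idx].reshape(-1, 1), m[idx]).coef_[0]
          b = LinearRegression().fit(np.c_[m[idx], x[idx]], y[idx]).coef_[0]
          return a * b

      boot = [indirect(rng.integers(0, n, n)) for _ in range(2000)]
      lo, hi = np.percentile(boot, [2.5, 97.5])
      print(f"indirect effect 95% bootstrap CI: [{lo:.3f}, {hi:.3f}]")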

  4. A Meta-Heuristic Regression-Based Feature Selection for Predictive Analytics

    Directory of Open Access Journals (Sweden)

    Bharat Singh

    2014-11-01

    Full Text Available High-dimensional feature selection, in which an optimal feature subset must be found among a very large number of features, is an NP-complete problem. Because conventional optimization techniques are unable to tackle large-scale feature selection problems, meta-heuristic algorithms are widely used. In this paper, we propose a particle swarm optimization technique that utilizes regression techniques for feature selection. We then use the selected features to classify the data. Classification accuracy is used as a criterion to evaluate classifier performance, and classification is accomplished through the use of k-nearest neighbour (KNN) and Bayesian techniques. Various high-dimensional data sets are used to evaluate the usefulness of the proposed approach. Results show that our approach gives better results when compared with other conventional feature selection algorithms.

  5. Analysis of stresses on buried pipeline subjected to landslide based on numerical simulation and regression analysis

    Energy Technology Data Exchange (ETDEWEB)

    Han, Bing; Jing, Hongyuan; Liu, Jianping; Wu, Zhangzhong [PetroChina Pipeline R&D Center, Langfang, Hebei (China); Hao, Jianbin [School of Petroleum Engineering, Southwest Petroleum University, Chengdu, Sichuan (China)

    2010-07-01

    Landslides have a serious impact on the integrity of oil and gas pipelines in the tough terrain of Western China. This paper introduces a method for solving the axial stress of pipelines subjected to landslides, using numerical simulation and regression analysis. Numerical simulation is performed to analyze how pipe stresses change with five vulnerability assessment indexes: the distance between the pipeline and the landslide tail; the thickness of the landslide; the inclination angle of the landslide; the length of pipeline passing through the landslide; and the buried depth of the pipeline. A pipeline passing through a landslide in southwest China was selected as an example to verify the feasibility and effectiveness of this method. The method is practically applicable, but it would need a large number of examples to better verify its reliability and should be modified accordingly. Also, it only considers the case where the direction of the pipeline is perpendicular to the primary slip direction of the landslide.

  6. The Effect of Latent Binary Variables on the Uncertainty of the Prediction of a Dichotomous Outcome Using Logistic Regression Based Propensity Score Matching.

    Science.gov (United States)

    Szekér, Szabolcs; Vathy-Fogarassy, Ágnes

    2018-01-01

    Logistic regression based propensity score matching is a widely used method in case-control studies to select the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if relevant latent variables exist as well, which are not taken into account during the calculations, the quality of the control group is uncertain. In this paper, we present statistics-based research in which we try to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. The observed phenomenon calls the attention of analysts to an important point, which must be taken into account when drawing conclusions.
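
    A minimal sketch of the matching procedure the paper builds on: fit a logistic-regression propensity model on the observed covariates, then greedily pair each case with the nearest-scoring control. Everything here is synthetic; a latent variable of the kind the authors study would simply be a column omitted from X below.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(3)
      n = 1000
      X = rng.normal(size=(n, 3))                       # observed covariates
      treated = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))

      ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

      cases = np.where(treated == 1)[0]
      controls = list(np.where(treated == 0)[0])
      matches = {}
      for c in cases:                                   # greedy 1:1 matching on PS
          if not controls:                              # stop if controls run out
              break
          j = min(controls, key=lambda k: abs(ps[k] - ps[c]))
          matches[c] = j
          controls.remove(j)
      print(f"matched {len(matches)} case-control pairs")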

  7. AucPR: an AUC-based approach using penalized regression for disease prediction with high-dimensional omics data.

    Science.gov (United States)

    Yu, Wenbao; Park, Taesung

    2014-01-01

    It is common to seek an optimal combination of markers for disease classification and prediction when multiple markers are available. Many approaches based on the area under the receiver operating characteristic curve (AUC) have been proposed. Existing works based on AUC in a high-dimensional context depend mainly on a non-parametric, smooth approximation of AUC, with no work using a parametric AUC-based approach for high-dimensional data. We propose an AUC-based approach using penalized regression (AucPR), which is a parametric method for obtaining a linear combination that maximizes the AUC. To obtain the AUC maximizer in a high-dimensional context, we transform a classical parametric AUC maximizer, which is used in a low-dimensional context, into a regression framework and thus apply the penalized regression approach directly. Two kinds of penalization, lasso and elastic net, are considered. The parametric approach can avoid some of the difficulties of a conventional non-parametric AUC-based approach, such as the lack of an appropriate concave objective function and a prudent choice of the smoothing parameter. We apply the proposed AucPR to gene selection and classification using four real microarray datasets and synthetic data. Through numerical studies, AucPR is shown to perform better than penalized logistic regression and the non-parametric AUC-based method, in the sense of AUC and sensitivity for a given specificity, particularly when there are many correlated genes. We propose a powerful, parametric and easily implementable linear classifier, AucPR, for gene selection and disease prediction with high-dimensional data. AucPR is recommended for its good prediction performance. Besides gene expression microarray data, AucPR can be applied to other types of high-dimensional omics data, such as miRNA and protein data.

  8. FY01 Supplemental Science and Performance Analyses, Volume 1: Scientific Bases and Analyses, Part 1 and 2

    International Nuclear Information System (INIS)

    Dobson, David

    2001-01-01

    The U.S. Department of Energy (DOE) is considering the possible recommendation of a site at Yucca Mountain, Nevada, for development as a geologic repository for the disposal of high-level radioactive waste and spent nuclear fuel. To facilitate public review and comment, in May 2001 the DOE released the Yucca Mountain Science and Engineering Report (S&ER) (DOE 2001 [DIRS 153849]), which presents technical information supporting the consideration of the possible site recommendation. The report summarizes the results of more than 20 years of scientific and engineering studies. A decision to recommend the site has not been made: the DOE has provided the S&ER and its supporting documents as an aid to the public in formulating comments on the possible recommendation. When the S&ER (DOE 2001 [DIRS 153849]) was released, the DOE acknowledged that technical and scientific analyses of the site were ongoing. Therefore, the DOE noted in the Federal Register Notice accompanying the report (66 FR 23013 [DIRS 155009], p. 2) that additional technical information would be released before the dates, locations, and times for public hearings on the possible recommendation were announced. This information includes: (1) the results of additional technical studies of a potential repository at Yucca Mountain, contained in this FY01 Supplemental Science and Performance Analyses: Vol. 1, Scientific Bases and Analyses, and FY01 Supplemental Science and Performance Analyses: Vol. 2, Performance Analyses (McNeish 2001 [DIRS 155023]) (collectively referred to as the SSPA), and (2) a preliminary evaluation of the Yucca Mountain site's preclosure and postclosure performance against the DOE's proposed site suitability guidelines (10 CFR Part 963 [64 FR 67054] [DIRS 124754]). By making the large amount of information developed on Yucca Mountain available in stages, the DOE intends to provide the public and interested parties with time to review the available materials and to formulate

  9. Landslide Fissure Inference Assessment by ANFIS and Logistic Regression Using UAS-Based Photogrammetry

    Directory of Open Access Journals (Sweden)

    Ozgun Akcay

    2015-10-01

    Full Text Available Unmanned Aerial Systems (UAS) are now capable of gathering high-resolution data, so landslides can be explored in detail at larger scales. In this research, 132 aerial photographs were captured, and 85,456 features were detected and matched automatically using UAS photogrammetry. The root mean square (RMS) values of the image coordinates of the Ground Control Points (GCPs) varied from 0.521 to 2.293 pixels, whereas the maximum RMS value of the automatically matched features was calculated as 2.921 pixels. Using the 3D point cloud acquired by aerial photogrammetry, raster datasets of aspect, slope, and maximally stable extremal regions (MSER, which detect visual uniformity) were defined as three variables in order to infer fissure structures on the landslide surface. In this research, an Adaptive Neuro Fuzzy Inference System (ANFIS) and Logistic Regression (LR) were implemented using training datasets to infer fissure data appropriately. The accuracy of the predictive models was evaluated by drawing receiver operating characteristic (ROC) curves and by calculating the area under the ROC curve (AUC). The experiments showed that high-resolution imagery is an indispensable data source for modelling and validating landslide fissures appropriately.
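
    The evaluation step generalizes readily: given any fissure/no-fissure classifier and held-out labels, the ROC AUC summarizes discrimination. A small sketch with a logistic-regression classifier and synthetic stand-ins for the aspect, slope and MSER rasters:

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics import roc_auc_score
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(4)
      n = 2000
      X = rng.normal(size=(n, 3))                       # aspect, slope, MSER stand-ins
      p = 1 / (1 + np.exp(-(1.5 * X[:, 1] + X[:, 2])))
      fissure = rng.binomial(1, p)                      # 1 = fissure pixel

      X_tr, X_te, y_tr, y_te = train_test_split(X, fissure, random_state=0)
      clf = LogisticRegression().fit(X_tr, y_tr)
      auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
      print(f"AUC on held-out data: {auc:.3f}")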

  10. Particle swarm optimization-based least squares support vector regression for critical heat flux prediction

    International Nuclear Information System (INIS)

    Jiang, B.T.; Zhao, F.Y.

    2013-01-01

    Highlights: ► CHF data are collected from the published literature. ► Less training data are used to train the LSSVR model. ► PSO is adopted to optimize the key parameters to improve the model precision. ► The reliability of LSSVR is proved through parametric trends analysis. - Abstract: In view of the practical importance of critical heat flux (CHF) for the design and safety of nuclear reactors, accurate prediction of CHF is of utmost significance. This paper presents a novel approach using least squares support vector regression (LSSVR) and particle swarm optimization (PSO) to predict CHF. Two available published datasets are used to train and test the proposed algorithm, in which PSO is employed to search for the best parameters involved in the LSSVR model. The CHF values obtained by the LSSVR model are compared with the corresponding experimental values and those of a previous method, the adaptive neuro fuzzy inference system (ANFIS). This comparison is also carried out in the investigation of parametric trends of CHF. It is found that the proposed method can achieve the desired performance and yields a more satisfactory fit with experimental results than ANFIS. Therefore, the LSSVR method is likely to be suitable for processing other parameters as well.

  11. How and for whom does web-based acceptance and commitment therapy work? Mediation and moderation analyses of web-based ACT for depressive symptoms.

    Science.gov (United States)

    Pots, Wendy T M; Trompetter, Hester R; Schreurs, Karlein M G; Bohlmeijer, Ernst T

    2016-05-23

    Acceptance and Commitment Therapy (ACT) has been demonstrated to be effective in reducing depressive symptoms. However, little is known about how and for whom therapeutic change occurs, specifically in web-based interventions. This study focuses on the mediators, moderators and predictors of change during a web-based ACT intervention. Data from 236 adults from the general population with mild to moderate depressive symptoms, randomized to either web-based ACT (n = 82) or one of two control conditions (web-based Expressive Writing (EW; n = 67) and a waiting list (n = 87)), were analysed. Single and multiple mediation analyses and exploratory linear regression analyses were performed using PROCESS and linear regression analyses, to examine mediators, moderators and predictors of pre- to post-treatment and follow-up change in depressive symptoms. The treatment effect of ACT versus the waiting list was mediated by psychological flexibility and two mindfulness facets. The treatment effect of ACT versus EW was not significantly mediated. The moderator analyses demonstrated that the effects of web-based ACT did not vary according to baseline patient characteristics when compared to both control groups. However, higher baseline depressive symptoms and positive mental health and lower baseline anxiety were identified as predictors of outcome across all conditions. Similar results were found at follow-up. The findings of this study corroborate the evidence that psychological flexibility and mindfulness are distinct process mechanisms that mediate the effects of web-based ACT intervention. The results indicate that there are no restrictions to the allocation of the web-based ACT intervention and that web-based ACT can work for different subpopulations. Netherlands Trial Register NTR2736. Registered 6 February 2011.

  12. Autistic Regression

    Science.gov (United States)

    Matson, Johnny L.; Kozlowski, Alison M.

    2010-01-01

    Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…

  13. Secondary mediation and regression analyses of the PTClinResNet database: determining causal relationships among the International Classification of Functioning, Disability and Health levels for four physical therapy intervention trials.

    Science.gov (United States)

    Mulroy, Sara J; Winstein, Carolee J; Kulig, Kornelia; Beneck, George J; Fowler, Eileen G; DeMuth, Sharon K; Sullivan, Katherine J; Brown, David A; Lane, Christianne J

    2011-12-01

    Each of the 4 randomized clinical trials (RCTs) hosted by the Physical Therapy Clinical Research Network (PTClinResNet) targeted a different disability group (low back disorder in the Muscle-Specific Strength Training Effectiveness After Lumbar Microdiskectomy [MUSSEL] trial, chronic spinal cord injury in the Strengthening and Optimal Movements for Painful Shoulders in Chronic Spinal Cord Injury [STOMPS] trial, adult stroke in the Strength Training Effectiveness Post-Stroke [STEPS] trial, and pediatric cerebral palsy in the Pediatric Endurance and Limb Strengthening [PEDALS] trial for children with spastic diplegic cerebral palsy) and tested the effectiveness of a muscle-specific or functional activity-based intervention on primary outcomes that captured pain (STOMPS, MUSSEL) or locomotor function (STEPS, PEDALS). The focus of these secondary analyses was to determine causal relationships among outcomes across levels of the International Classification of Functioning, Disability and Health (ICF) framework for the 4 RCTs. With the database from PTClinResNet, we used 2 separate secondary statistical approaches-mediation analysis for the MUSSEL and STOMPS trials and regression analysis for the STEPS and PEDALS trials-to test relationships among muscle performance, primary outcomes (pain related and locomotor related), activity and participation measures, and overall quality of life. Predictive models were stronger for the 2 studies with pain-related primary outcomes. Change in muscle performance mediated or predicted reductions in pain for the MUSSEL and STOMPS trials and, to some extent, walking speed for the STEPS trial. Changes in primary outcome variables were significantly related to changes in activity and participation variables for all 4 trials. Improvement in activity and participation outcomes mediated or predicted increases in overall quality of life for the 3 trials with adult populations. Variables included in the statistical models were limited to those

  14. Regression Analysis

    CERN Document Server

    Freund, Rudolf J; Sa, Ping

    2006-01-01

    The book provides complete coverage of the classical methods of statistical analysis. It is designed to give students an understanding of the purpose of statistical analyses, to allow them to determine, at least to some degree, the correct type of statistical analysis to be performed in a given situation, and to give them some appreciation of what constitutes good experimental design.

  15. Comparison of regression coefficient and GIS-based methodologies for regional estimates of forest soil carbon stocks

    International Nuclear Information System (INIS)

    Elliott Campbell, J.; Moen, Jeremie C.; Ney, Richard A.; Schnoor, Jerald L.

    2008-01-01

    Estimates of forest soil organic carbon (SOC) have applications in carbon science, soil quality studies, carbon sequestration technologies, and carbon trading. Forest SOC has been modeled using a regression coefficient methodology that applies mean SOC densities (mass/area) to broad forest regions. A higher resolution model is based on an approach that employs a geographic information system (GIS) with soil databases and satellite-derived landcover images. Despite this advancement, the regression approach remains the basis of current state and federal level greenhouse gas inventories. Both approaches are analyzed in detail for Wisconsin forest soils from 1983 to 2001, applying rigorous error-fixing algorithms to soil databases. Resulting SOC stock estimates are 20% larger when determined using the GIS method rather than the regression approach. Average annual rates of increase in SOC stocks are 3.6 and 1.0 million metric tons of carbon per year for the GIS and regression approaches respectively. - Large differences in estimates of soil organic carbon stocks and annual changes in stocks for Wisconsin forestlands indicate a need for validation from forthcoming forest surveys

  16. Forecasting the daily power output of a grid-connected photovoltaic system based on multivariate adaptive regression splines

    International Nuclear Information System (INIS)

    Li, Yanting; He, Yong; Su, Yan; Shu, Lianjie

    2016-01-01

    Highlights: • Suggests a nonparametric model based on MARS for output power prediction. • Compares the MARS model with a wide variety of prediction models. • Shows that the MARS model provides an overall good performance in both the training and testing stages. - Abstract: Both linear and nonlinear models have been proposed for forecasting the power output of photovoltaic systems. Linear models are simple to implement but less flexible. Due to the stochastic nature of the power output of PV systems, nonlinear models tend to provide better forecasts than linear models. Motivated by this, this paper suggests a fairly simple nonlinear regression model known as multivariate adaptive regression splines (MARS) as an alternative for forecasting solar power output. The MARS model is a data-driven modeling approach without any assumption about the relationship between the power output and predictors. It maintains the simplicity of the classical multiple linear regression (MLR) model while possessing the capability of handling nonlinearity. It is simpler in format than other nonlinear models such as artificial neural networks (ANN), k-nearest neighbors (KNN), classification and regression trees (CART), and support vector machines (SVM). The MARS model was applied to the daily output of a grid-connected 2.1 kW PV system to provide the 1-day-ahead mean daily forecast of the power output. Comparisons with a wide variety of forecast models show that the MARS model is able to provide reliable forecast performance.
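
    The building block of MARS is the hinge function max(0, x - k). The sketch below illustrates the idea on synthetic PV data with fixed knots and ordinary least squares; real MARS (for example, the py-earth package) chooses knots and terms adaptively, so this is a conceptual illustration only.

      import numpy as np
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(5)
      irradiance = rng.uniform(0, 1000, 300)             # daily solar input, W/m^2
      power = (np.clip(0.002 * (irradiance - 150), 0, None) ** 0.9
               + rng.normal(0, 0.05, 300))               # synthetic PV output, kW

      def hinges(x, knots):
          """Expand x into hinge features max(0, x-k) and max(0, k-x)."""
          cols = [np.maximum(0, x - k) for k in knots] + \
                 [np.maximum(0, k - x) for k in knots]
          return np.column_stack(cols)

      knots = np.quantile(irradiance, [0.25, 0.5, 0.75])
      X = hinges(irradiance, knots)
      model = LinearRegression().fit(X, power)
      print("R^2 of hinge-basis fit:", round(model.score(X, power), 3))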

  17. Edinburgh Cognitive and Behavioural ALS Screen (ECAS)-Italian version: regression based norms and equivalent scores.

    Science.gov (United States)

    Siciliano, Mattia; Trojano, Luigi; Trojsi, Francesca; Greco, Roberta; Santoro, Manuela; Basile, Giuseppe; Piscopo, Fausta; D'Iorio, Alfonsina; Patrone, Manila; Femiano, Cinzia; Monsurrò, Mariarosaria; Tedeschi, Gioacchino; Santangelo, Gabriella

    2017-06-01

    Cognitive assessment of individuals with Amyotrophic Lateral Sclerosis (ALS) can be difficult because of the frequent occurrence of difficulties with speech, writing, and drawing. The Edinburgh Cognitive and Behavioural ALS Screen (ECAS) is a recent multi-domain neuropsychological screening tool specifically devised for this purpose. It assesses the following domains: executive functions, social cognition, verbal fluency and language (ALS-specific), but also memory and visuospatial abilities (non-ALS specific). The ECAS total score ranges from 0 (worst performance) to 136 (best performance). Moreover, a brief caregiver interview provides an assessment of behaviour changes and psychotic symptoms usually associated with ALS. The aim of the present study was to provide normative values for the ECAS total score and sub-scores in a sample of Italian healthy subjects. Two hundred and seventy-seven Italian healthy subjects (151 women and 126 men; age range 30-79 years; educational level from primary school to university) underwent the ECAS and the Montreal Cognitive Assessment (MoCA). Multiple linear regression analysis revealed that age and education significantly influenced performance on the ECAS total score and sub-scale scores. From the derived linear equation, a correction grid for raw scores was built. Inferential cut-off scores were estimated using a non-parametric technique, and equivalent scores (ES) were computed. Correlation analysis showed a significant correlation between adjusted ECAS total scores and adjusted MoCA total scores (rho = 0.669, p < 0.0001). The present study provides normative data for the ECAS in an Italian population, useful for both clinical and research purposes.
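
    The regression-based norming step reduces to a few lines: regress raw scores on age and education in the normative sample, then correct each new raw score by the model-predicted offset from a reference profile. The sketch below uses synthetic data and an assumed reference of age 55 with 10 years of education; a published correction grid must of course come from the actual normative sample.

      import numpy as np
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(6)
      n = 277
      age = rng.uniform(30, 79, n)
      edu = rng.uniform(5, 18, n)                       # years of education
      score = 120 - 0.25 * (age - 50) + 0.8 * (edu - 10) + rng.normal(0, 4, n)

      fit = LinearRegression().fit(np.c_[age, edu], score)

      def adjusted(raw, subj_age, subj_edu, ref=(55.0, 10.0)):
          """Correct a raw score toward the reference age/education profile."""
          delta = fit.predict([[subj_age, subj_edu]])[0] - fit.predict([list(ref)])[0]
          return raw - delta

      print(round(adjusted(110, subj_age=75, subj_edu=8), 1))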

  18. An adaptive functional regression-based prognostic model for applications with missing data

    International Nuclear Information System (INIS)

    Fang, Xiaolei; Zhou, Rensheng; Gebraeel, Nagi

    2015-01-01

    Most prognostic degradation models rely on a relatively accurate and comprehensive database of historical degradation signals. Typically, these signals are used to identify suitable degradation trends that are useful for predicting lifetime. In many real-world applications, these degradation signals are incomplete, i.e., they contain missing observations. Often the amount of missing data compromises the ability to identify a suitable parametric degradation model. This paper addresses this problem by developing a semi-parametric approach that can be used to predict the remaining lifetime of partially degraded systems. First, key signal features are identified by applying Functional Principal Components Analysis (FPCA) to the available historical data. Next, an adaptive functional regression model is used to model the extracted signal features and the corresponding times-to-failure. The model is then used to predict remaining lifetimes and to update these predictions using real-time signals observed from fielded components. Results show that the proposed approach is relatively robust to significant levels of missing data. The performance of the model is evaluated and shown to provide significantly accurate predictions of residual lifetime in two case studies. - Highlights: • We model degradation signals with missing data with the goal of predicting remaining lifetime. • We examine two types of signal characteristics, fragmented and sparse. • We provide a framework that updates remaining life predictions by incorporating real-time signal observations. • For missing data, we show that the proposed model outperforms other benchmark models. • For complete data, we show that the proposed model performs at least as well as a benchmark model.

  19. Agricultural drought prediction using climate indices based on Support Vector Regression in Xiangjiang River basin.

    Science.gov (United States)

    Tian, Ye; Xu, Yue-Ping; Wang, Guoqing

    2018-05-01

    Drought can have a substantial impact on the ecosystem and agriculture of the affected region and harms the local economy. This study aims to analyze the relation between soil moisture and drought and to predict agricultural drought in the Xiangjiang River basin. Agricultural droughts are represented by the Standardized Precipitation-Evapotranspiration Index (SPEI). A Support Vector Regression (SVR) model incorporating climate indices is developed to predict agricultural droughts. Analyses of climate forcing, including the El Niño Southern Oscillation and the western Pacific subtropical high (WPSH), are carried out to select climate indices. The results show that the SPEI at the six-month time scale (SPEI-6) represents soil moisture better than the three- and one-month time scales with respect to drought duration, severity and peaks. The key factor that influences agricultural drought is the ridge point of the WPSH, which mainly controls regional temperature. The SVR model incorporating climate indices, especially the ridge point of the WPSH, improves prediction accuracy compared to the model solely using the drought index, by 4.4% in training and 5.1% in testing as measured by the Nash-Sutcliffe efficiency coefficient (NSE) for a three-month lead time. The improvement is more significant for predictions with a one-month lead (15.8% in training and 27.0% in testing) than with a three-month lead time. However, input parameters must be selected with caution, since adding redundant information can have a counter effect on attaining a better prediction. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. Bayesian quantile regression-based partially linear mixed-effects joint models for longitudinal data with multiple features.

    Science.gov (United States)

    Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara

    2017-01-01

    In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most common models used to analyze such complex longitudinal data are based on mean regression, which fails to provide efficient estimates due to outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying the benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various features of such repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish Bayesian joint models that account for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.

  1. Improving ASTER GDEM Accuracy Using Land Use-Based Linear Regression Methods: A Case Study of Lianyungang, East China

    Directory of Open Access Journals (Sweden)

    Xiaoyan Yang

    2018-04-01

    Full Text Available The Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM) is important to a wide range of geographical and environmental studies. Its accuracy, to some extent associated with land-use types reflecting topography, vegetation coverage, and human activities, impacts the results and conclusions of these studies. In order to improve the accuracy of the ASTER GDEM prior to its application, we investigated ASTER GDEM errors based on individual land-use types and proposed two linear regression calibration methods, one considering only land use-specific errors and the other considering the impact of both land use and topography. Our calibration methods were tested on the coastal prefectural city of Lianyungang in eastern China. Results indicate that (1) the ASTER GDEM is highly accurate for rice, wheat, grass and mining lands but less accurate for scenic, garden, wood and bare lands; (2) despite improvements in ASTER GDEM2 accuracy, multiple linear regression calibration requires more data (topography) and a relatively complex calibration process; and (3) simple linear regression calibration proves a practicable and simplified means to systematically investigate and improve the impact of land use on ASTER GDEM accuracy. Our method is applicable to areas with detailed land-use data based on highly accurate field-based point-elevation measurements.

  2. A Gaussian mixture copula model based localized Gaussian process regression approach for long-term wind speed prediction

    International Nuclear Information System (INIS)

    Yu, Jie; Chen, Kuilin; Mori, Junichi; Rashid, Mudassir M.

    2013-01-01

    Optimizing wind power generation and controlling the operation of wind turbines to efficiently harness the renewable wind energy is a challenging task due to the intermittency and unpredictable nature of wind speed, which has significant influence on wind power production. A new approach for long-term wind speed forecasting is developed in this study by integrating GMCM (Gaussian mixture copula model) and localized GPR (Gaussian process regression). The time series of wind speed is first classified into multiple non-Gaussian components through the Gaussian mixture copula model and then Bayesian inference strategy is employed to incorporate the various non-Gaussian components using the posterior probabilities. Further, the localized Gaussian process regression models corresponding to different non-Gaussian components are built to characterize the stochastic uncertainty and non-stationary seasonality of the wind speed data. The various localized GPR models are integrated through the posterior probabilities as the weightings so that a global predictive model is developed for the prediction of wind speed. The proposed GMCM–GPR approach is demonstrated using wind speed data from various wind farm locations and compared against the GMCM-based ARIMA (auto-regressive integrated moving average) and SVR (support vector regression) methods. In contrast to GMCM–ARIMA and GMCM–SVR methods, the proposed GMCM–GPR model is able to well characterize the multi-seasonality and uncertainty of wind speed series for accurate long-term prediction. - Highlights: • A novel predictive modeling method is proposed for long-term wind speed forecasting. • Gaussian mixture copula model is estimated to characterize the multi-seasonality. • Localized Gaussian process regression models can deal with the random uncertainty. • Multiple GPR models are integrated through Bayesian inference strategy. • The proposed approach shows higher prediction accuracy and reliability
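
    A compressed version of this pipeline can be sketched with standard tools: cluster the series into regimes with a Gaussian mixture (a plain GMM here, standing in for the Gaussian mixture copula), fit one Gaussian process regressor per regime, and blend the predictions with the posterior probabilities. The data, kernel and two-component choice are all assumptions.

      import numpy as np
      from sklearn.mixture import GaussianMixture
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import RBF, WhiteKernel

      rng = np.random.default_rng(7)
      t = np.linspace(0, 10, 400)[:, None]                # time axis
      wind = 8 + 3 * np.sin(2 * np.pi * t[:, 0]) + rng.normal(0, 0.8, 400)

      # Split the speed values into regimes with a two-component mixture
      gmm = GaussianMixture(n_components=2, random_state=0).fit(wind[:, None])
      labels = gmm.predict(wind[:, None])

      # One localized GPR per regime
      kernel = RBF(length_scale=1.0) + WhiteKernel()
      gprs = [GaussianProcessRegressor(kernel=kernel, normalize_y=True)
              .fit(t[labels == k], wind[labels == k]) for k in range(2)]

      # Blend the per-regime predictions with the mixture posteriors
      t_new = np.array([[10.5]])
      post = gmm.predict_proba([[wind[-1]]])[0]           # weights from last reading
      pred = sum(w * g.predict(t_new)[0] for w, g in zip(post, gprs))
      print(f"blended wind speed forecast: {pred:.2f} m/s")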

  3. Linear regression

    CERN Document Server

    Olive, David J

    2017-01-01

    This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...

  4. Quantile Regression Methods

    DEFF Research Database (Denmark)

    Fitzenberger, Bernd; Wilke, Ralf Andreas

    2015-01-01

    Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression, which includes an illustrative application from empirical labor market research. This is followed by a brief sketch of the underlying statistical model for linear quantile regression.
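
    In the spirit of the chapter's labor market illustration, the sketch below fits the same linear wage equation at three quantiles using statsmodels' QuantReg on synthetic, heteroscedastic data; slopes that differ across quantiles are exactly what a mean regression would miss.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(8)
      n = 1000
      exper = rng.uniform(0, 30, n)
      # heteroscedastic outcome: the spread grows with experience
      wage = 10 + 0.4 * exper + rng.normal(0, 1 + 0.15 * exper, n)

      X = sm.add_constant(exper)
      for q in (0.1, 0.5, 0.9):
          res = sm.QuantReg(wage, X).fit(q=q)
          print(f"q={q}: slope = {res.params[1]:.3f}")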

  5. Innovation Strategy Research of Yunnan Image in Modern Entertainment Channels based on Regression and Inheritance of Culture

    Institute of Scientific and Technical Information of China (English)

    Yang Shao

    2016-01-01

    In this paper, we conduct innovation strategy research on the Yunnan image in modern entertainment channels, based on the regression and inheritance of culture. The concept of culture has complicated contents; put simply, it includes material culture and spiritual culture. Cultural inheritance is the process of acculturation of a nation: it is not only the transmission of the national culture, but also its inheritance and development, which can be reflected by many indexes. Multicultural education theory formed in the American civil rights movement of the 1960s. Multicultural education theory holds that, when the mainstream national culture and a minority subculture come into contact, every culture shall be entitled to retain its own cultural traits. Our research integrates the regression and inheritance of culture to propose an innovation strategy for the Yunnan image that will promote further development of the corresponding and related industries.

  6. A new fuzzy regression model based on interval-valued fuzzy neural network and its applications to management

    Directory of Open Access Journals (Sweden)

    Somaye Yeylaghi

    2017-06-01

    Full Text Available In this paper, a novel hybrid method based on an interval-valued fuzzy neural network for approximating interval-valued fuzzy regression models is presented. The work of this paper is an expansion of research on real fuzzy regression models. The interval-valued fuzzy neural network (IVFNN) can be trained with crisp and interval-valued fuzzy data. Here, a neural network is considered as a part of a large field called neural computing or soft computing. Moreover, in order to find the approximate parameters, a simple algorithm based on the cost function of the fuzzy neural network is proposed. Finally, we illustrate our approach with some numerical examples and compare this method with existing methods.

  7. Two-Stage Method Based on Local Polynomial Fitting for a Linear Heteroscedastic Regression Model and Its Application in Economics

    Directory of Open Access Journals (Sweden)

    Liyun Su

    2012-01-01

    Full Text Available We introduce the extension of local polynomial fitting to the linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function, then the coefficients of the regression model are obtained by the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Due to the nonparametric technique of local polynomial estimation, we do not need to know the form of the heteroscedastic function, and can therefore improve the estimation precision when the heteroscedastic function is unknown. Furthermore, we focus on the comparison of parameters and reach an optimal fit. Besides, we verify the asymptotic normality of the parameters based on numerical simulations. Finally, this approach is applied to a case in economics, and it indicates that our method is effective in finite-sample situations.
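
    The two-stage logic can be demonstrated with a nonparametric smoother standing in for the paper's local polynomial fit: estimate the variance function from squared OLS residuals, then refit by weighted least squares with weights 1/variance. LOWESS is used here for brevity, and the data are synthetic.

      import numpy as np
      import statsmodels.api as sm
      from statsmodels.nonparametric.smoothers_lowess import lowess

      rng = np.random.default_rng(9)
      n = 500
      x = np.sort(rng.uniform(0, 10, n))
      y = 2 + 1.5 * x + rng.normal(0, 0.3 + 0.3 * x, n)   # variance grows with x

      X = sm.add_constant(x)
      ols = sm.OLS(y, X).fit()

      # Stage 1: smooth squared residuals to estimate sigma^2(x)
      var_hat = lowess(ols.resid ** 2, x, frac=0.3, return_sorted=False)
      var_hat = np.clip(var_hat, 1e-6, None)              # guard against negatives

      # Stage 2: generalized (weighted) least squares
      wls = sm.WLS(y, X, weights=1.0 / var_hat).fit()
      print("OLS slope:", round(ols.params[1], 3),
            " WLS slope:", round(wls.params[1], 3))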

  8. Area under the curve predictions of dalbavancin, a new lipoglycopeptide agent, using the end of intravenous infusion concentration data point by regression analyses such as linear, log-linear and power models.

    Science.gov (United States)

    Bhamidipati, Ravi Kanth; Syed, Muzeeb; Mullangi, Ramesh; Srinivas, Nuggehally

    2018-02-01

    1. Dalbavancin, a lipoglycopeptide, is approved for treating gram-positive bacterial infections. The area under the plasma concentration versus time curve (AUCinf) of dalbavancin is a key parameter, and the AUCinf/MIC ratio is a critical pharmacodynamic marker. 2. Using the end of intravenous infusion concentration (i.e. Cmax), the Cmax versus AUCinf relationship for dalbavancin was established by regression analyses (i.e. linear, log-log, log-linear and power models) using 21 pairs of subject data. 3. Predictions of AUCinf were performed by applying the regression equations to published Cmax data. The quotient of observed/predicted values rendered the fold difference. The mean absolute error (MAE), root mean square error (RMSE) and correlation coefficient (r) were used in the assessment. 4. MAE and RMSE values for the various models were comparable. Cmax versus AUCinf exhibited excellent correlation (r > 0.9488). The internal data evaluation showed narrow confinement (0.84-1.14-fold difference), and the models predicted AUCinf with a RMSE of 3.02-27.46%, with fold differences largely contained within 0.64-1.48. 5. Regardless of the regression model, a single time point strategy of using Cmax (i.e. the end of the 30-min infusion) is amenable as a prospective tool for predicting the AUCinf of dalbavancin in patients.
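
    The regression forms compared here are easy to reproduce on synthetic Cmax/AUC pairs: linear (AUC = a + b*Cmax), log-linear (ln AUC = a + b*Cmax) and power (AUC = a*Cmax^b, fit in log-log space). All values below are illustrative, not dalbavancin data.

      import numpy as np

      rng = np.random.default_rng(10)
      cmax = rng.uniform(200, 400, 21)                    # illustrative scale
      auc = 55 * cmax ** 1.05 * rng.lognormal(0, 0.05, 21)

      b1, a1 = np.polyfit(cmax, auc, 1)                   # linear
      b2, a2 = np.polyfit(cmax, np.log(auc), 1)           # log-linear
      b3, a3 = np.polyfit(np.log(cmax), np.log(auc), 1)   # power (log-log)

      new_cmax = 300.0
      preds = {
          "linear": a1 + b1 * new_cmax,
          "log-linear": np.exp(a2 + b2 * new_cmax),
          "power": np.exp(a3) * new_cmax ** b3,
      }
      for name, p in preds.items():
          print(f"{name:10s} predicted AUC: {p:,.0f}")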

  9. Predicting volume of distribution with decision tree-based regression methods using predicted tissue:plasma partition coefficients.

    Science.gov (United States)

    Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat

    2015-01-01

    Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p), which are often used in physiologically-based pharmacokinetics. Therefore, this work assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. The decision tree based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and from flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical abstract: decision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.
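
    A bagged decision tree regressor of the kind benchmarked above, scored with the paper's mean fold error metric, can be sketched as follows; the descriptor matrix is a random stand-in and the log-scale target is synthetic.

      import numpy as np
      from sklearn.ensemble import BaggingRegressor
      from sklearn.tree import DecisionTreeRegressor
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(11)
      n = 400
      X = rng.normal(size=(n, 10))                      # descriptors + predicted Kt:p
      log_vss = 0.8 * X[:, 0] - 0.5 * X[:, 3] + rng.normal(0, 0.3, n)

      X_tr, X_te, y_tr, y_te = train_test_split(X, log_vss, random_state=0)
      model = BaggingRegressor(DecisionTreeRegressor(), n_estimators=100,
                               random_state=0).fit(X_tr, y_tr)

      pred = model.predict(X_te)
      fold = np.exp(np.abs(pred - y_te))                # per-compound fold error
      print(f"mean fold error: {fold.mean():.2f}")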

  10. Towards artificial intelligence based diesel engine performance control under varying operating conditions using support vector regression

    Directory of Open Access Journals (Sweden)

    Naradasu Kumar Ravi

    2013-01-01

    Full Text Available Diesel engine designers are constantly on the lookout for performance enhancement through efficient control of operating parameters. In this paper, the concept of an intelligent engine control system is proposed that seeks to ensure optimized performance under varying operating conditions. The concept is based on arriving at the optimum engine operating parameters to ensure the desired output in terms of efficiency. In addition, a Support Vector Machine-based prediction model has been developed to predict the engine performance under varying operating conditions. Experiments were carried out at varying loads, compression ratios and amounts of exhaust gas recirculation using a variable compression ratio diesel engine for data acquisition. It was observed that the SVM model was able to predict the engine performance accurately.

  11. Education-Based Gaps in eHealth: A Weighted Logistic Regression Approach

    OpenAIRE

    Amo, Laura

    2016-01-01

    Background Persons with a college degree are more likely to engage in eHealth behaviors than persons without a college degree, compounding the health disadvantages of undereducated groups in the United States. However, the extent to which quality of recent eHealth experience reduces the education-based eHealth gap is unexplored. Objective The goal of this study was to examine how eHealth information search experience moderates the relationship between college education and eHealth behaviors. ...

  12. Orbitrap-based mass analyser for in-situ characterization of asteroids: ILMA, Ion Laser Mass Analyser

    Science.gov (United States)

    Briois, C.; Cotti, H.; Thirkell, L.; French Space Orbitrap Consortium [Aradj, K.; Bouabdellah, A.; Boukrara, A.; Carrasco, N.; Chalumeau, G.; Chapelon, O.; Colin, F.; Coll, P.; Engrand, C.; Grand, N.; Kukui, A.; Lebreton, J.-P.; Pennanech, C.; Szopa, C.; Thissen, R.; Vuitton, V.; Zapf, P.]; Makarov, A.

    2014-07-01

    For about a decade, the boundaries between comets and carbonaceous asteroids have been fading [1,2]. No doubt the Rosetta mission should bring a new wealth of data on the composition of comets. But as promising as it may look, the mass resolving power of the mass spectrometers onboard (so far the best on a space mission) will only be able to partially account for the diversity of chemical structures present. ILMA (Ion-Laser Mass Analyser) is a new-generation high mass resolution LDI-MS (Laser Desorption-Ionization Mass Spectrometer) instrument concept using the Orbitrap technique, which has been developed in the frame of the two Marco Polo & Marco Polo-R proposals to the ESA Cosmic Vision program. Flagged by ESA as an instrument concept of interest for the mission in 2012, it has been under study for a few years in the frame of a Research and Technology (R&T) development programme between 5 French laboratories (LPC2E, IPAG, LATMOS, LISA, CSNSM) [3,4], partly funded by the French Space Agency (CNES). The work is undertaken in close collaboration with the Thermo Fisher Scientific company, which commercialises Orbitrap-based laboratory instruments. The R&T activities are currently concentrating on the core elements of the Orbitrap analyser that are required to reach a sufficient maturity level for allowing design studies of future space instruments. A prototype is under development at LPC2E, and a mass resolution (m/Δm FWHM) of 100,000 has been obtained at m/z = 150 for a background pressure of 10^{-8} mbar. ILMA would be a key instrument to measure the molecular, elemental and isotopic composition of objects such as carbonaceous asteroids, comets, or other bodies devoid of atmosphere such as the surface of an icy satellite, the Moon, or Mercury.

  13. Consistency analysis of subspace identification methods based on a linear regression approach

    DEFF Research Database (Denmark)

    Knudsen, Torben

    2001-01-01

    In the literature, results can be found which claim consistency for the subspace method under certain quite weak assumptions. Unfortunately, a new result gives a counterexample showing inconsistency under these assumptions, and then gives new, stricter sufficient assumptions which, however, do not include important model structures such as Box-Jenkins. Based on a simple least squares approach, this paper shows the possible inconsistency under the weak assumptions and develops only slightly stricter assumptions sufficient for consistency, which include any model structure.

  14. Ensemble regression model-based anomaly detection for cyber-physical intrusion detection in smart grids

    DEFF Research Database (Denmark)

    Kosek, Anna Magdalena; Gehrke, Oliver

    2016-01-01

    The shift from centralised large production to distributed energy production has several consequences for current power system operation. The replacement of large power plants by growing numbers of distributed energy resources (DERs) increases the dependency of the power system on small-scale, distributed production. Many of these DERs can be accessed and controlled remotely, posing a cybersecurity risk. This paper investigates an intrusion detection system which evaluates the DER operation in order to discover unauthorized control actions. The proposed anomaly detection method is based

  15. A case study to estimate costs using Neural Networks and regression based models

    Directory of Open Access Journals (Sweden)

    Nadia Bhuiyan

    2012-07-01

    Full Text Available Bombardier Aerospace's high-performance aircraft and services set the utmost standard for the aerospace industry. A case study in collaboration with Bombardier Aerospace was conducted in order to estimate the target cost of a landing gear. More precisely, the study uses both a parametric model and neural network models to estimate the cost of main landing gears, a major aircraft commodity. A comparative analysis between the parametric-based model and the neural network models is considered in order to determine the most accurate method to predict the cost of a main landing gear. Several trials are presented for the design and use of the neural network model. The analysis for the case under study shows the flexibility in the design of the neural network model. Furthermore, the performance of the neural network model is deemed superior to that of the parametric models for this case study.

  16. Recursive wind speed forecasting based on Hammerstein Auto-Regressive model

    International Nuclear Information System (INIS)

    Ait Maatallah, Othman; Achuthan, Ajit; Janoyan, Kerop; Marzocca, Pier

    2015-01-01

    Highlights: • Developed a new recursive WSF model for a 1-24 h horizon based on the Hammerstein model. • The nonlinear HAR model successfully captured the chaotic dynamics of wind speed time series. • Recursive WSF intrinsic error accumulation corrected by applying rotation. • Model verified on real wind speed data from two sites with different characteristics. • The HAR model outperformed both ARIMA and ANN models in terms of prediction accuracy. - Abstract: A new Wind Speed Forecasting (WSF) model, suitable for a short-term 1-24 h forecast horizon, is developed by adapting the Hammerstein model to an autoregressive approach. The model is applied to real data collected over a period of three years (2004-2006) from two different sites. The performance of the HAR model is evaluated by comparing its predictions with the classical Autoregressive Integrated Moving Average (ARIMA) model and a multi-layer perceptron Artificial Neural Network (ANN). Results show that the HAR model outperforms both the ARIMA model and the ANN model in terms of root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). When compared to the conventional models, the new HAR model can better capture various wind speed characteristics, including the asymmetric (non-Gaussian) wind speed distribution, the non-stationary time series profile, and the chaotic dynamics. The new model is beneficial for various applications in the renewable energy area, particularly for power scheduling.

  17. Gaussian process regression based optimal design of combustion systems using flame images

    International Nuclear Information System (INIS)

    Chen, Junghui; Chan, Lester Lik Teck; Cheng, Yi-Cheng

    2013-01-01

    Highlights: • The digital color images of flames are applied to combustion design. • Combustion modeling that accounts for stochastic behaviour is developed using GP. • GP-based uncertainty design is made and evaluated on a real combustion system. - Abstract: With advanced methods of digital image processing and optical sensing, it is possible to have continuous imaging carried out on-line in combustion processes. In this paper, a method that extracts characteristics from flame images is presented to immediately predict the outlet content of the flue gas. First, from the large volume of flame image data, principal component analysis is used to discover the principal components, or combinational variables, which describe the important trends and variations in the operation data. Then, stochastic modeling of the combustion process is done by a Gaussian process, with the aim of capturing the stochastic nature of the flame associated with the oxygen content. The designed oxygen combustion content takes account of the uncertainty present in the combustion. A reference image can be designed for the actual combustion process to provide easy and straightforward maintenance of the combustion process.

  18. Adaptability and stability based on nonparametric regression in coffee genotypes

    Directory of Open Access Journals (Sweden)

    Moysés Nascimento

    2010-01-01

    Full Text Available The objective of this work was to evaluate a methodology of phenotypic adaptability and stability analysis of coffee genotypes based on nonparametric regression. The technique differs from others in that it reduces the influence, on the estimation of the adaptability parameter, of extreme points caused by genotypes with markedly different responses to a particular environment. Data from an experiment on the average grain yield of 40 coffee (Coffea canephora) genotypes, in a randomized block design with six replicates, were used to evaluate the method. The genotypes were evaluated over five years (1996, 1998, 1999, 2000 and 2001) at two locations (Sooretama and Marilândia, ES, Brazil), for a total of ten environments. The proposed methodology proved adequate and efficient, since it eliminates the disproportionate effects induced by the presence of extreme points and avoids misleading genotype recommendations in terms of adaptability.
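
    The record's point is that a nonparametric regression damps the influence of extreme environments on the adaptability estimate. A minimal way to see this with standard tools is a LOWESS fit, as sketched below on synthetic genotype-yield data; the environmental-index framing and all numbers are invented for illustration, and the paper's exact estimator may differ.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(2)
env_index = np.linspace(-1, 1, 40)            # environmental index
yield_g = 2.0 + 1.5 * env_index + rng.normal(0, 0.3, 40)
yield_g[35] += 4.0                            # one extreme environment

fit = lowess(yield_g, env_index, frac=0.6)    # columns: sorted x, fitted y
# The local fit damps the influence of the extreme point, unlike OLS, whose
# slope (the classical adaptability parameter) the outlier would distort.
print(fit[-5:])
```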

  19. Complementary Exploratory and Confirmatory Factor Analyses of the French WISC-V: Analyses Based on the Standardization Sample.

    Science.gov (United States)

    Lecerf, Thierry; Canivez, Gary L

    2017-12-28

    Interpretation of the French Wechsler Intelligence Scale for Children-Fifth Edition (French WISC-V; Wechsler, 2016a) is based on a 5-factor model including Verbal Comprehension (VC), Visual Spatial (VS), Fluid Reasoning (FR), Working Memory (WM), and Processing Speed (PS). Evidence for the French WISC-V factorial structure was established exclusively through confirmatory factor analyses (CFAs). However, as recommended by Carroll (1995), Reise (2012), and Brown (2015), factorial structure should derive from both exploratory factor analysis (EFA) and CFA. The first goal of this study was to examine the factorial structure of the French WISC-V using EFA. The 15 French WISC-V primary and secondary subtest scaled scores intercorrelation matrix was used, and factor extraction criteria suggested from 1 to 4 factors. To disentangle the contribution of first- and second-order factors, the Schmid and Leiman (1957) orthogonalization transformation (SLT) was applied. Overall, no EFA evidence for 5 factors was found. Results indicated that the g factor accounted for about 67% of the common variance and that the contributions of the first-order factors were weak (3.6 to 11.9%). CFA was used to test numerous alternative models. Results indicated that bifactor models produced better fit to these data than higher-order models. Consistent with previous studies, findings suggested dominance of the general intelligence factor, and that users should thus emphasize the Full Scale IQ (FSIQ) when interpreting the French WISC-V. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  20. Copula Regression Analysis of Simultaneously Recorded Frontal Eye Field and Inferotemporal Spiking Activity during Object-Based Working Memory

    Science.gov (United States)

    Hu, Meng; Clark, Kelsey L.; Gong, Xiajing; Noudoost, Behrad; Li, Mingyao; Moore, Tirin

    2015-01-01

    Inferotemporal (IT) neurons are known to exhibit persistent, stimulus-selective activity during the delay period of object-based working memory tasks. Frontal eye field (FEF) neurons show robust, spatially selective delay period activity during memory-guided saccade tasks. We present a copula regression paradigm to examine neural interaction of these two types of signals between areas IT and FEF of the monkey during a working memory task. This paradigm is based on copula models that can account for both the marginal distribution over spiking activity of individual neurons within each area and the joint distribution over ensemble activity of neurons between areas. Considering the popular GLMs as marginal models, we developed a general and flexible likelihood framework that uses the copula to integrate separate GLMs into a joint regression analysis. Such joint analysis essentially leads to a multivariate analog of the marginal GLM theory and hence efficient model estimation. In addition, we show that Granger causality between spike trains can be readily assessed via the likelihood ratio statistic. The performance of this method is validated by extensive simulations and compares favorably to the widely used GLMs. When applied to spiking activity of simultaneously recorded FEF and IT neurons during a working memory task, we observed significant Granger causality influence from FEF to IT, but not in the opposite direction, suggesting the role of the FEF in the selection and retention of visual information during working memory. The copula model has the potential to provide unique neurophysiological insights about network properties of the brain. PMID:26063909
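
    As a toy illustration of the copula idea used here, namely a joint model that couples two count-valued marginals through a dependence function, the sketch below builds a Gaussian copula over two Poisson "spike count" margins with NumPy/SciPy. It shows only the generative side; the paper's likelihood-based estimation of copula-GLMs and the Granger-causality test are not reproduced, and the rates are invented.

```python
import numpy as np
from scipy.stats import norm, poisson

rng = np.random.default_rng(3)
rho = 0.6                                     # copula dependence parameter
cov = [[1.0, rho], [rho, 1.0]]
z = rng.multivariate_normal([0.0, 0.0], cov, size=5000)
u = norm.cdf(z)                               # uniforms with a Gaussian copula
fef = poisson.ppf(u[:, 0], mu=4.0)            # FEF-like spike counts per bin
it = poisson.ppf(u[:, 1], mu=2.5)             # IT-like spike counts per bin

# The marginals stay Poisson while the copula induces cross-area dependence.
print(np.corrcoef(fef, it)[0, 1])
```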

  1. Performance of an Axisymmetric Rocket Based Combined Cycle Engine During Rocket Only Operation Using Linear Regression Analysis

    Science.gov (United States)

    Smith, Timothy D.; Steffen, Christopher J., Jr.; Yungster, Shaye; Keller, Dennis J.

    1998-01-01

    The all-rocket mode of operation is shown to be a critical factor in the overall performance of a rocket-based combined cycle (RBCC) vehicle. An axisymmetric RBCC engine was used to determine specific impulse efficiency values based upon both full flow and gas generator configurations. Design-of-experiments methodology was used to construct a test matrix, and multiple linear regression analysis was used to build parametric models. The main parameters investigated in this study were: rocket chamber pressure, rocket exit area ratio, injected secondary flow, mixer-ejector inlet area, mixer-ejector area ratio, and mixer-ejector length-to-inlet diameter ratio. A perfect gas computational fluid dynamics analysis, using both the Spalart-Allmaras and k-omega turbulence models, was performed with the NPARC code to obtain values of vacuum specific impulse. Results from the multiple linear regression analysis showed that, for both the full flow and gas generator configurations, increasing mixer-ejector area ratio and rocket area ratio increases performance, while increasing mixer-ejector inlet area ratio and mixer-ejector length-to-diameter ratio decreases performance. Increasing injected secondary flow increased performance for the gas generator analysis, but was not statistically significant for the full flow analysis. Chamber pressure was found to be not statistically significant.
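
    A minimal sketch of the regression step described, fitting vacuum specific impulse to design-of-experiments factors with multiple linear regression and reading factor significance from the t-tests, using statsmodels on synthetic data. The abbreviated factor names and all coefficients are invented for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 64
df = pd.DataFrame({
    "pc":  rng.uniform(0, 1, n),    # chamber pressure (coded units)
    "ar":  rng.uniform(0, 1, n),    # rocket exit area ratio
    "mar": rng.uniform(0, 1, n),    # mixer-ejector area ratio
    "ld":  rng.uniform(0, 1, n),    # length-to-diameter ratio
})
# synthetic response: area ratios help, L/D hurts, pressure has no effect
df["isp"] = (300 + 20 * df["ar"] + 15 * df["mar"] - 10 * df["ld"]
             + rng.normal(0, 2, n))

model = smf.ols("isp ~ pc + ar + mar + ld", data=df).fit()
print(model.summary())  # t-tests show which factors are statistically significant
```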

  2. [Correlation coefficient-based classification method of hydrological dependence variability: With auto-regression model as example].

    Science.gov (United States)

    Zhao, Yu Xi; Xie, Ping; Sang, Yan Fang; Wu, Zi Yi

    2018-04-01

    Hydrological process evaluation is temporally dependent. Hydrological time series that include dependence components do not meet the data consistency assumption for hydrological computation. Both of these factors cause great difficulty for water research. Given the existence of hydrological dependence variability, we propose a correlation-coefficient-based method for significance evaluation of hydrological dependence based on an auto-regression model. By calculating the correlation coefficient between the original series and its dependence component and selecting reasonable thresholds of the correlation coefficient, this method divides the significance of dependence into five degrees: no variability, weak variability, mid variability, strong variability, and drastic variability. By deducing the relationship between the correlation coefficient and the auto-correlation coefficients at each order of the series, we found that the correlation coefficient is mainly determined by the magnitude of the auto-correlation coefficients from order 1 to order p, which clarifies the theoretical basis of the method. With the first-order and second-order auto-regression models as examples, the validity of the deduced formula was verified through Monte-Carlo experiments classifying the relationship between the correlation coefficient and the auto-correlation coefficients. The method was then used to analyze three observed hydrological time series. The results indicate the coexistence of stochastic and dependence characteristics in hydrological processes.
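
    The deduced relationship, that the correlation between a series and its dependence component tracks the autocorrelation coefficient, can be checked with a small Monte-Carlo experiment like the one below for AR(1). The threshold values used to grade variability are placeholders, since the record does not give the paper's "reasonable thresholds".

```python
import numpy as np

def dependence_correlation(x, phi):
    """Correlation between an AR(1) series and its dependence component phi*x[t-1]."""
    dep = phi * x[:-1]
    return abs(np.corrcoef(x[1:], dep)[0, 1])

def classify(r, thresholds=(0.2, 0.4, 0.6, 0.8)):   # assumed thresholds
    labels = ["no", "weak", "mid", "strong", "drastic"]
    return labels[sum(r >= t for t in thresholds)] + " variability"

rng = np.random.default_rng(5)
for phi in (0.1, 0.3, 0.5, 0.7, 0.9):
    x = np.zeros(3000)
    for t in range(1, 3000):
        x[t] = phi * x[t - 1] + rng.normal()
    r = dependence_correlation(x, phi)
    print(phi, round(r, 2), classify(r))  # r tracks |phi|, the lag-1 autocorrelation
```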

  3. A Technique for Estimating Intensity of Emotional Expressions and Speaking Styles in Speech Based on Multiple-Regression HSMM

    Science.gov (United States)

    Nose, Takashi; Kobayashi, Takao

    In this paper, we propose a technique for estimating the degree or intensity of emotional expressions and speaking styles appearing in speech. The key idea is based on a style control technique for speech synthesis using a multiple regression hidden semi-Markov model (MRHSMM), and the proposed technique can be viewed as the inverse of the style control. In the proposed technique, the acoustic features of spectrum, power, fundamental frequency, and duration are simultaneously modeled using the MRHSMM. We derive an algorithm for estimating explanatory variables of the MRHSMM, each of which represents the degree or intensity of emotional expressions and speaking styles appearing in acoustic features of speech, based on a maximum likelihood criterion. We show experimental results to demonstrate the ability of the proposed technique using two types of speech data, simulated emotional speech and spontaneous speech with different speaking styles. It is found that the estimated values have correlation with human perception.

  4. Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach

    International Nuclear Information System (INIS)

    Chen, Kuilin; Yu, Jie

    2014-01-01

    Highlights: • A novel hybrid modeling method is proposed for short-term wind speed forecasting. • Support vector regression model is constructed to formulate nonlinear state-space framework. • Unscented Kalman filter is adopted to recursively update states under random uncertainty. • The new SVR–UKF approach is compared to several conventional methods for short-term wind speed prediction. • The proposed method demonstrates higher prediction accuracy and reliability. - Abstract: Accurate wind speed forecasting is becoming increasingly important to improve and optimize renewable wind power generation. In particular, reliable short-term wind speed prediction can enable model predictive control of wind turbines and real-time optimization of wind farm operation. However, this task remains challenging due to the strong stochastic nature and dynamic uncertainty of wind speed. In this study, an unscented Kalman filter (UKF) is integrated with a support vector regression (SVR) based state-space model in order to precisely update the short-term estimation of the wind speed sequence. In the proposed SVR–UKF approach, support vector regression is first employed to formulate a nonlinear state-space model, and then an unscented Kalman filter is adopted to perform dynamic state estimation recursively on the wind sequence with stochastic uncertainty. The novel SVR–UKF method is compared with artificial neural networks (ANNs), SVR, autoregressive (AR) and autoregressive integrated with Kalman filter (AR-Kalman) approaches for predicting short-term wind speed sequences collected from three sites in Massachusetts, USA. The forecasting results indicate that the proposed method has much better performance in both one-step-ahead and multi-step-ahead wind speed predictions than the other approaches across all the locations
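
    As a reduced sketch of the SVR half of the proposed hybrid, learning a nonlinear map from lagged wind speeds to the next value and predicting one step ahead, the code below uses scikit-learn on a synthetic series. The UKF recursion that the paper layers on top for state updating under uncertainty is omitted here, and all hyperparameters are illustrative.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def embed(x, p):
    """Lag-embed a series: rows of p past values -> next value."""
    X = np.column_stack([x[k:len(x) - p + k] for k in range(p)])
    return X, x[p:]

rng = np.random.default_rng(6)
wind = 8 + np.cumsum(rng.normal(0, 0.2, 600))   # synthetic wind-speed series
X, y = embed(wind, p=6)

model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.05))
model.fit(X[:500], y[:500])
pred = model.predict(X[500:])                   # one-step-ahead predictions
print(np.sqrt(np.mean((pred - y[500:]) ** 2)))  # RMSE on the held-out tail
```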

  5. Geoelectrical parameter-based multivariate regression borehole yield model for predicting aquifer yield in managing groundwater resource sustainability

    Directory of Open Access Journals (Sweden)

    Kehinde Anthony Mogaji

    2016-07-01

    Full Text Available This study developed a GIS-based multivariate regression (MVR) yield rate prediction model for groundwater resource sustainability in the hard-rock geology terrain of southwestern Nigeria. This model offers an economical way to predict the aquifer yield rates that are often overlooked in groundwater resources development. The proposed model relates the borehole yield rate inventory of the area to geoelectrically derived parameters. Three borehole yield rate conditioning geoelectrically derived parameters, namely aquifer unit resistivity (ρ), aquifer unit thickness (D) and coefficient of anisotropy (λ), were determined from the acquired and interpreted geophysical data. The extracted borehole yield rate values and the geoelectrically derived parameter values were regressed to develop the MVR relationship model by applying linear regression and GIS techniques. The sensitivity analysis of the MVR model, evaluated at P ≤ 0.05, gave values of 2.68 × 10⁻⁵, 2 × 10⁻² and 2.09 × 10⁻⁶ for the predictors ρ, D and λ, respectively. The accuracy and predictive power tests conducted on the MVR model using the Theil inequality coefficient measurement approach, coupled with the sensitivity analysis results, confirmed the model's yield rate estimation and prediction capability. The MVR borehole yield predictions were processed in a GIS environment to produce an aquifer yield potential prediction map of the area. The information on the prediction map can serve as a scientific basis for predicting aquifer yield potential rates, which is relevant to groundwater resources sustainability management. The developed MVR borehole yield rate prediction model provides a good alternative to other methods used for this purpose.

  6. Evaluating the performance of different predictor strategies in regression-based downscaling with a focus on glacierized mountain environments

    Science.gov (United States)

    Hofer, Marlis; Nemec, Johanna

    2016-04-01

    This study presents first steps towards verifying the hypothesis that uncertainty in global and regional glacier mass simulations can be reduced considerably by reducing the uncertainty in the high-resolution atmospheric input data. To this aim, we systematically explore the potential of different predictor strategies for improving the performance of regression-based downscaling approaches. The investigated local-scale target variables are precipitation, air temperature, wind speed, relative humidity and global radiation, all at a daily time scale. Observations of these target variables are assessed from three sites in geo-environmentally and climatologically very distinct settings, all within highly complex topography and in the close proximity to mountain glaciers: (1) the Vernagtbach station in the Northern European Alps (VERNAGT), (2) the Artesonraju measuring site in the tropical South American Andes (ARTESON), and (3) the Brewster measuring site in the Southern Alps of New Zealand (BREWSTER). As the large-scale predictors, ERA interim reanalysis data are used. In the applied downscaling model training and evaluation procedures, particular emphasis is put on appropriately accounting for the pitfalls of limited and/or patchy observation records that are usually the only (if at all) available data from the glacierized mountain sites. Generalized linear models and beta regression are investigated as alternatives to ordinary least squares regression for the non-Gaussian target variables. By analyzing results for the three different sites, five predictands and for different times of the year, we look for systematic improvements in the downscaling models' skill specifically obtained by (i) using predictor data at the optimum scale rather than the minimum scale of the reanalysis data, (ii) identifying the optimum predictor allocation in the vertical, and (iii) considering multiple (variable, level and/or grid point) predictor options combined with state

  7. Improved predictive mapping of indoor radon concentrations using ensemble regression trees based on automatic clustering of geological units

    International Nuclear Information System (INIS)

    Kropat, Georg; Bochud, Francois; Jaboyedoff, Michel; Laedermann, Jean-Pascal; Murith, Christophe; Palacios, Martha; Baechler, Sébastien

    2015-01-01

    Purpose: According to estimates, around 230 people die as a result of radon exposure in Switzerland. This public health concern makes reliable indoor radon prediction and mapping methods necessary in order to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics and to develop mapping and predictive tools in order to improve local radon prediction. Method: About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between the IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). Results: The automated classification groups lithological units well in terms of their IRC characteristics. The IRC differences in metamorphic rocks such as gneiss, in particular, are well revealed by this method. The maps produced by random forests soundly represent the regional differences of IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variation in IRC data with random forests. Additionally, variable importance as evaluated by random forests shows that building characteristics are less important predictors of IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. Conclusion: Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of the radon properties of rock types. This study provides an important element for radon risk communication. Future approaches should consider taking into account further variables
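
    A minimal scikit-learn sketch of the random-forest part of this analysis: regress (log) indoor radon concentration on geological and building predictors, read explained variance from the out-of-bag score, and compare predictor influence via feature importances. All predictors and effect sizes are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
n = 2000
# synthetic stand-ins for the kinds of predictors the study mentions
X = np.column_stack([rng.integers(0, 8, n),      # lithology cluster id
                     rng.uniform(200, 2000, n),  # altitude [m]
                     rng.integers(0, 3, n),      # foundation type
                     rng.integers(0, 5, n)])     # floor of measurement
log_irc = (0.3 * X[:, 0] - 0.001 * X[:, 1] + 0.2 * X[:, 2]
           - 0.1 * X[:, 3] + rng.normal(0, 0.8, n))

rf = RandomForestRegressor(n_estimators=300, oob_score=True, random_state=0)
rf.fit(X, log_irc)
print(rf.oob_score_)            # explained variance on out-of-bag data
print(rf.feature_importances_)  # geological vs. building-related influence
```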

  8. [Study on the Recognition of Liquor Age of Gujing Based on Raman Spectra and Support Vector Regression].

    Science.gov (United States)

    Wang, Guo-xiang; Wang, Hai-yan; Wang, Hu; Zhang, Zheng-yong; Liu, Jun

    2016-03-01

    Recognizing the age of Chinese liquor rapidly and accurately is an important and difficult problem in liquor analysis, and it is also of great significance to the healthy development of the liquor industry and the protection of the legitimate rights and interests of consumers. Spectroscopy together with pattern recognition technology is a preferred method for achieving rapid identification of liquor quality, among which Raman spectroscopy is promising because it is little affected by water and requires little or no sample pretreatment. In this paper, Raman spectra and support vector regression (SVR) are therefore used to recognize different ages, and different storing times of liquor of the same age. The innovation of this paper is mainly reflected in the following three aspects. First, applications of Raman spectroscopy to liquor analysis have rarely been reported to date. Second, it concentrates on recognizing liquor age, while most studies focus on specific components of liquor, and studies combined with pattern recognition methods focus more on identifying brands or types of base liquor. Third, it applies a regression analysis framework, which can be used not only to identify liquors of different ages but also to analyze different storing times, which has theoretical and practical significance for the research and quality control of liquor. Three kinds of experiments are conducted in this paper. First, SVR is used to recognize liquors aged 5, 8, 16 and 26 years of the Gujing liquor; second, SVR is used to classify the storing time of the 8-year liquor; third, certain groups of training data are deleted from the training set and put into the test set to simulate the actual situation of liquor age recognition. Results show that the SVR model has good training and prediction performance in these experiments, and that it performs better than other non-linear regression methods such
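
    A compact sketch of the SVR framework described, applied to stand-in "Raman spectra": standardize the spectra, fit an SVR to the liquor age, and tune the hyperparameters by cross-validation. The data are random placeholders with a weak injected age signal; only the modelling pattern is illustrated, not the paper's actual settings.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(8)
spectra = rng.normal(size=(120, 600))             # stand-in Raman spectra
age = rng.choice([5.0, 8.0, 16.0, 26.0], size=120)
spectra += age[:, None] * 0.01                    # weak age-dependent signal

pipe = Pipeline([("scale", StandardScaler()), ("svr", SVR())])
grid = GridSearchCV(pipe,
                    {"svr__C": [1, 10, 100], "svr__gamma": ["scale", 1e-3]},
                    cv=5)
grid.fit(spectra, age)
print(grid.best_params_, grid.best_score_)        # cross-validated R^2
```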

  9. Prediction of coal response to froth flotation based on coal analysis using regression and artificial neural network

    Energy Technology Data Exchange (ETDEWEB)

    Jorjani, E.; Poorali, H.A.; Sam, A.; Chelgani, S.C.; Mesroghli, S.; Shayestehfar, M.R. [Islam Azad University, Tehran (Iran). Dept. of Mining Engineering

    2009-10-15

    In this paper, the combustible value (i.e. 100 - ash) and combustible recovery of coal flotation concentrate were predicted by regression and artificial neural network based on proximate and group macerals analysis. The regression method shows that the relationships of (a) ln(ash), volatile matter and moisture and (b) ln(ash), ln(liptinite), fusinite and vitrinite with combustible value can achieve correlation coefficients (R²) of 0.8 and 0.79, respectively. In addition, the input sets of (c) ash, volatile matter and moisture and (d) ash, liptinite and fusinite can predict the combustible recovery with correlation coefficients of 0.84 and 0.63, respectively. A feed-forward artificial neural network with a 6-8-12-11-2-1 arrangement for the moisture, ash and volatile matter input set was capable of estimating both combustible value and combustible recovery with a correlation of 0.95. It was shown that the proposed neural network model could accurately reproduce all the effects of proximate and group macerals analysis on the coal flotation system.

  10. Advanced exergy-based analyses applied to a system including LNG regasification and electricity generation

    Energy Technology Data Exchange (ETDEWEB)

    Morosuk, Tatiana; Tsatsaronis, George; Boyano, Alicia; Gantiva, Camilo [Technische Univ. Berlin (Germany)

    2012-07-01

    Liquefied natural gas (LNG) will contribute more in the future than in the past to the overall energy supply in the world. The paper discusses the application of advanced exergy-based analyses to a recently developed LNG-based cogeneration system. These analyses include advanced exergetic, advanced exergoeconomic, and advanced exergoenvironmental analyses in which thermodynamic inefficiencies (exergy destruction), costs, and environmental impacts have been split into avoidable and unavoidable parts. With the aid of these analyses, the potentials for improving the thermodynamic efficiency and for reducing the overall cost and the overall environmental impact are revealed. The objectives of this paper are to demonstrate (a) the potential for generating electricity while regasifying LNG and (b) some of the capabilities associated with advanced exergy-based methods. The most important subsystems and components are identified, and suggestions for improving them are made. (orig.)

  11. Quantitative X-ray Map Analyser (Q-XRMA): A new GIS-based statistical approach to Mineral Image Analysis

    Science.gov (United States)

    Ortolano, Gaetano; Visalli, Roberto; Godard, Gaston; Cirrincione, Rosolino

    2018-06-01

    We present a new ArcGIS®-based tool developed in the Python programming language for calibrating EDS/WDS X-ray element maps, with the aim of acquiring quantitative information of petrological interest. The calibration procedure is based on a multiple linear regression technique that takes into account interdependence among elements and is constrained by the stoichiometry of minerals. The procedure requires an appropriate number of spot analyses for use as internal standards and provides several test indexes for a rapid check of calibration accuracy. The code is based on an earlier image-processing tool designed primarily for classifying minerals in X-ray element maps; the original Python code has now been enhanced to yield calibrated maps of mineral end-members or the chemical parameters of each classified mineral. The semi-automated procedure can be used to extract a dataset that is automatically stored within queryable tables. As a case study, the software was applied to an amphibolite-facies garnet-bearing micaschist. The calibrated images obtained for both anhydrous (i.e., garnet and plagioclase) and hydrous (i.e., biotite) phases show a good fit with corresponding electron microprobe analyses. This new GIS-based tool package can thus find useful application in petrology and materials science research. Moreover, the huge quantity of data extracted opens new opportunities for the development of a thin-section microchemical database that, using a GIS platform, can be linked with other major global geoscience databases.

  12. Regression toward the mean – a detection method for unknown population mean based on Mee and Chua's algorithm

    Directory of Open Access Journals (Sweden)

    Lüdtke Rainer

    2008-08-01

    Full Text Available Abstract Background Regression to the mean (RTM) occurs in situations of repeated measurements when extreme values are followed by measurements in the same subjects that are closer to the mean of the basic population. In uncontrolled studies such changes are likely to be interpreted as a real treatment effect. Methods Several statistical approaches have been developed to analyse such situations, including the algorithm of Mee and Chua, which assumes a known population mean μ. We extend this approach to a situation where μ is unknown and suggest varying it systematically over a range of reasonable values. Using differential calculus we provide formulas to estimate the range of μ where treatment effects are likely to occur when RTM is present. Results We successfully applied our method to three real-world examples denoting situations when (a) no treatment effect can be confirmed regardless of which μ is true, (b) a treatment effect must be assumed independent of the true μ, and (c) in the appraisal of results of uncontrolled studies. Conclusion Our method can be used to separate the wheat from the chaff in situations when one has to interpret the results of uncontrolled studies. In meta-analyses, health-technology reports or systematic reviews this approach may be helpful to clarify the evidence given from uncontrolled observational studies.
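
    The extension described, varying the unknown population mean μ systematically, can be sketched as follows. Under the bivariate-normal conditional-mean formula that underlies Mee and Chua's approach (assuming equal means and variances at both time points), the follow-up expected from RTM alone is μ + ρ(x₁ − μ), so an RTM-adjusted treatment effect is the mean of x₂ − [μ + ρ(x₁ − μ)], evaluated here over a grid of candidate μ values. The data and the use of the sample correlation for ρ are illustrative simplifications.

```python
import numpy as np

def rtm_adjusted_effect(x1, x2, rho, mu):
    """Treatment effect adjusted for regression to the mean.

    Expected follow-up under RTM alone: mu + rho * (x1 - mu).
    """
    return np.mean(x2 - (mu + rho * (x1 - mu)))

rng = np.random.default_rng(9)
x1 = rng.normal(140, 15, 50)                              # baseline values
x2 = 0.7 * x1 + 0.3 * 120 + rng.normal(0, 8, 50) - 5.0    # RTM + true effect -5

rho = np.corrcoef(x1, x2)[0, 1]       # simplification: plug-in sample correlation
for mu in np.arange(100, 141, 10):    # vary the unknown population mean
    print(mu, round(rtm_adjusted_effect(x1, x2, rho, mu), 2))
```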

  13. Ca analysis: an Excel based program for the analysis of intracellular calcium transients including multiple, simultaneous regression analysis.

    Science.gov (United States)

    Greensmith, David J

    2014-01-01

    Here I present an Excel based program for the analysis of intracellular Ca transients recorded using fluorescent indicators. The program can perform all the necessary steps which convert recorded raw voltage changes into meaningful physiological information. The program performs two fundamental processes. (1) It can prepare the raw signal by several methods. (2) It can then be used to analyze the prepared data to provide information such as absolute intracellular Ca levels. Also, the rates of change of Ca can be measured using multiple, simultaneous regression analysis. I demonstrate that this program performs as well as commercially available software, but has numerous advantages, namely creating a simplified, self-contained analysis workflow. Copyright © 2013 The Author. Published by Elsevier Ireland Ltd. All rights reserved.

  14. Optimization to the Culture Conditions for Phellinus Production with Regression Analysis and Gene-Set Based Genetic Algorithm

    Science.gov (United States)

    Li, Zhongwei; Xin, Yuezhen; Wang, Xun; Sun, Beibei; Xia, Shengyu; Li, Hui

    2016-01-01

    Phellinus is a fungus known as one of the elemental components in drugs used to prevent cancers. With the purpose of finding optimized culture conditions for Phellinus production in the laboratory, numerous experiments focusing on single factors were performed and a large amount of experimental data was generated. In this work, we use the data collected from these experiments for regression analysis, obtaining a mathematical model for predicting Phellinus production. Subsequently, a gene-set based genetic algorithm is developed to optimize the values of the parameters involved in the culture conditions, including inoculum size, pH value, initial liquid volume, temperature, seed age, fermentation time, and rotation speed. The optimized parameter values accord with biological experimental results, which indicates that our method has good predictive power for culture-condition optimization. PMID:27610365
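
    As an illustrative skeleton of the optimization stage, a genetic algorithm searching the seven culture-condition parameters over a fitted response model, the sketch below implements a plain real-coded GA with truncation selection, uniform crossover and Gaussian mutation. The objective function, bounds and GA settings are placeholders; the paper's gene-set-based variant and its actual regression model are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(10)
# assumed bounds for (inoculum size, pH, liquid volume, temperature,
#                     seed age, fermentation time, rotation speed)
lo = np.array([1.0, 4.0, 50.0, 22.0, 3.0, 5.0, 120.0])
hi = np.array([10.0, 8.0, 150.0, 32.0, 9.0, 12.0, 220.0])

def production(x):
    """Stand-in for the fitted regression model of Phellinus yield."""
    opt = (lo + hi) / 2.0
    return -np.sum(((x - opt) / (hi - lo)) ** 2)

pop = rng.uniform(lo, hi, size=(40, 7))
for gen in range(200):
    fit = np.array([production(ind) for ind in pop])
    parents = pop[np.argsort(fit)[::-1][:20]]        # truncation selection
    kids = []
    for _ in range(20):
        a, b = parents[rng.integers(0, 20, 2)]
        child = np.where(rng.random(7) < 0.5, a, b)  # uniform crossover
        child += rng.normal(0, 0.02, 7) * (hi - lo)  # Gaussian mutation
        kids.append(np.clip(child, lo, hi))
    pop = np.vstack([parents, kids])

best = pop[np.argmax([production(ind) for ind in pop])]
print(best)   # optimized culture-condition vector
```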

  15. Ca analysis: An Excel based program for the analysis of intracellular calcium transients including multiple, simultaneous regression analysis☆

    Science.gov (United States)

    Greensmith, David J.

    2014-01-01

    Here I present an Excel based program for the analysis of intracellular Ca transients recorded using fluorescent indicators. The program can perform all the necessary steps which convert recorded raw voltage changes into meaningful physiological information. The program performs two fundamental processes. (1) It can prepare the raw signal by several methods. (2) It can then be used to analyze the prepared data to provide information such as absolute intracellular Ca levels. Also, the rates of change of Ca can be measured using multiple, simultaneous regression analysis. I demonstrate that this program performs as well as commercially available software, but has numerous advantages, namely creating a simplified, self-contained analysis workflow. PMID:24125908

  16. A multiple linear regression analysis of hot corrosion attack on a series of nickel base turbine alloys

    Science.gov (United States)

    Barrett, C. A.

    1985-01-01

    Multiple linear regression analysis was used to determine an equation for estimating hot corrosion attack for a series of Ni-base cast turbine alloys. The U transform (i.e., sin⁻¹[(%A/100)^(1/2)]) was shown to give the best estimate of the dependent variable, y. A complete second-degree equation is described for the "centered" weight chemistries for the elements Cr, Al, Ti, Mo, W, Cb, Ta, and Co. In addition, linear terms for the minor elements C, B, and Zr were added, for a basic 47-term equation. The best reduced equation was determined by the stepwise selection method, with essentially 13 terms. The Cr term was found to be the most important, accounting for 60 percent of the explained variability in hot corrosion attack.

  17. Multiple linear regression to develop strength scaled equations for knee and elbow joints based on age, gender and segment mass

    DEFF Research Database (Denmark)

    D'Souza, Sonia; Rasmussen, John; Schwirtz, Ansgar

    2012-01-01

    and valuable ergonomic tool. Objective: To investigate age and gender effects on the torque-producing ability in the knee and elbow in older adults. To create strength-scaled equations based on age, gender, upper/lower limb lengths and masses using multiple linear regression. To reduce the number of dependent...... flexors. Results: Males were significantly stronger than females across all age groups. Elbow peak torque (EPT) was better preserved from the 60s to the 70s, whereas knee peak torque (KPT) reduced significantly (P < 0.05). Gender, thigh mass and age best...... predicted KPT (R2=0.60). Gender, forearm mass and age best predicted EPT (R2=0.75). Good cross-validation was established for both elbow and knee models. Conclusion: This cross-sectional study of muscle strength created and validated strength-scaled equations of EPT and KPT using only gender, segment mass...

  18. Data-Based Control for Humanoid Robots Using Support Vector Regression, Fuzzy Logic, and Cubature Kalman Filter

    Directory of Open Access Journals (Sweden)

    Liyang Wang

    2016-01-01

    Full Text Available Time-varying external disturbances cause instability of humanoid robots or even tip robots over. In this work, a trapezoidal fuzzy least squares support vector regression- (TF-LSSVR- based control system is proposed to learn the external disturbances and increase the zero-moment-point (ZMP stability margin of humanoid robots. First, the humanoid states and the corresponding control torques of the joints for training the controller are collected by implementing simulation experiments. Secondly, a TF-LSSVR with a time-related trapezoidal fuzzy membership function (TFMF is proposed to train the controller using the simulated data. Thirdly, the parameters of the proposed TF-LSSVR are updated using a cubature Kalman filter (CKF. Simulation results are provided. The proposed method is shown to be effective in learning and adapting occasional external disturbances and ensuring the stability margin of the robot.

  19. Panel data specifications in nonparametric kernel regression

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...

  20. The Spatial Distribution of Hepatitis C Virus Infections and Associated Determinants--An Application of a Geographically Weighted Poisson Regression for Evidence-Based Screening Interventions in Hotspots.

    Science.gov (United States)

    Kauhl, Boris; Heil, Jeanne; Hoebe, Christian J P A; Schweikart, Jürgen; Krafft, Thomas; Dukers-Muijrers, Nicole H T M

    2015-01-01

    Hepatitis C Virus (HCV) infections are a major cause of liver disease. A large proportion of these infections remain hidden to care due to their mostly asymptomatic nature. Population-based screening and screening targeted at behavioural risk groups have not proven effective in revealing these hidden infections. Therefore, more practically applicable approaches to target screenings are necessary. Geographic Information Systems (GIS) and spatial epidemiological methods may provide a more feasible basis for screening interventions through the identification of hotspots as well as demographic and socio-economic determinants. Analysed data included all HCV tests (n = 23,800) performed in the southern area of the Netherlands between 2002 and 2008. HCV positivity was defined as a positive immunoblot or polymerase chain reaction test. Population data were matched to the geocoded HCV test data. The spatial scan statistic was applied to detect areas with elevated HCV risk. We applied global regression models to determine associations between population-based determinants and HCV risk. Geographically weighted Poisson regression models were then constructed to determine local differences in the association between HCV risk and population-based determinants. HCV prevalence varied geographically and clustered in urban areas. The main population at risk were middle-aged males, non-western immigrants and divorced persons. Socio-economic determinants consisted of one-person households, persons with low income and mean property value. However, the association between HCV risk and demographic as well as socio-economic determinants displayed strong regional and intra-urban differences. The detection of local hotspots in our study may serve as a basis for prioritization of areas for future targeted interventions. Demographic and socio-economic determinants associated with HCV risk show regional differences underlining that a one-size-fits-all approach may not be adequate even within small geographic
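
    Before weighting the model geographically, the global association step described here is an ordinary Poisson regression of case counts on population determinants, with the number of tests as the exposure. Below is a minimal statsmodels sketch on synthetic data; a geographically weighted version would refit this model locally at each location with spatial kernel weights. The predictor names, rates and coefficients are invented.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 400                                      # spatial units (e.g. postcode areas)
X = np.column_stack([rng.uniform(0, 1, n),   # share of middle-aged males
                     rng.uniform(0, 1, n),   # share of one-person households
                     rng.uniform(0, 1, n)])  # low-income share
tests = rng.integers(20, 200, n)             # HCV tests performed per unit
lam = tests * np.exp(-3.0 + 1.2 * X[:, 0] + 0.8 * X[:, 1])
cases = rng.poisson(lam)                     # synthetic positive tests

glm = sm.GLM(cases, sm.add_constant(X),
             family=sm.families.Poisson(),
             offset=np.log(tests)).fit()     # offset: log of tests at risk
print(glm.summary())
```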

  1. A new non-invasive approach based on polyhexamethylene biguanide increases the regression rate of HPV infection

    Directory of Open Access Journals (Sweden)

    Gentile Antonio

    2012-09-01

    Full Text Available Abstract Background HPV infection is a worldwide problem strictly linked to the development of cervical cancer. Persistence of the infection is one of the main factors responsible for invasive progression, and women diagnosed with intraepithelial squamous lesions are referred for further assessment and surgical treatments which are prone to complications. Despite this, there are several reports on the spontaneous regression of the infection. This study was carried out to evaluate the effectiveness of a long-term polyhexamethylene biguanide (PHMB)-based local treatment in improving viral clearance, reducing the time of exposure to the infection and avoiding the complications associated with the invasive treatments currently available. Method 100 women diagnosed with HPV infection were randomly assigned to receive six months of treatment with a PHMB-based gynecological solution (Monogin®, Lo.Li. Pharma, Rome, Italy) or to remain untreated for the same period of time. Results A greater number of patients who received the treatment were cleared of the infection at the two time points of the study (three and six months) compared with the control group. A significant difference in the regression rate (90% Monogin group vs 70% control group) was observed at the end of the study, highlighting the time-dependent ability of PHMB to interact with the infection progression. Conclusions Topical treatment with PHMB is a preliminary safe and promising approach for patients with detected HPV infection, increasing the chance of clearance and avoiding the use of invasive treatments when not strictly necessary. Trial registration ClinicalTrials.gov Identifier NCT01571141

  2. Robust spinal cord resting-state fMRI using independent component analysis-based nuisance regression noise reduction.

    Science.gov (United States)

    Hu, Yong; Jin, Richu; Li, Guangsheng; Luk, Keith Dk; Wu, Ed X

    2018-04-16

    Physiological noise reduction plays a critical role in spinal cord (SC) resting-state fMRI (rsfMRI). To reduce physiological noise and increase the robustness of SC rsfMRI, an independent component analysis (ICA)-based nuisance regression (ICANR) method was used. Retrospective. Ten healthy subjects (female/male = 4/6, age = 27 ± 3 years, range 24-34 years). 3T/gradient-echo echo planar imaging (EPI). We used three alternative methods (no regression [Nil], a conventional region of interest [ROI]-based noise reduction method without ICA [ROI-based], and correction of structured noise using spatial independent component analysis [CORSICA]) to compare with the performance of ICANR. The reduction of the influence of physiological noise on the SC and the reproducibility of rsfMRI analysis after noise reduction were examined. The correlation coefficient (CC) was calculated to assess the influence of physiological noise. Reproducibility was calculated by intraclass correlation (ICC). Results from the different methods were compared by one-way analysis of variance (ANOVA) with post-hoc analysis. No significant difference in cerebrospinal fluid (CSF) pulsation influence or tissue motion influence was found (P = 0.223 for CSF, P = 0.2461 for tissue motion) between the ROI-based method (CSF: 0.122 ± 0.020; tissue motion: 0.112 ± 0.015) and Nil (CSF: 0.134 ± 0.026; tissue motion: 0.124 ± 0.019). CORSICA showed a significantly stronger influence of CSF pulsation and tissue motion (CSF: 0.166 ± 0.045, P = 0.048; tissue motion: 0.160 ± 0.032, P = 0.048) than Nil. ICANR showed a significantly weaker influence of CSF pulsation and tissue motion (CSF: 0.076 ± 0.007, P = 0.0003; tissue motion: 0.081 ± 0.014, P = 0.0182) than Nil. The ICC values for Nil, ROI-based, CORSICA, and ICANR were 0.669, 0.645, 0.561, and 0.766, respectively. ICANR more effectively reduced physiological noise from both tissue motion and CSF pulsation than the three alternative methods. ICANR increases the robustness of SC rsfMRI.
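
    The core of an ICA-based nuisance regression can be sketched in a few lines: decompose the time series into independent components, flag the noise components, and regress them out of the data. The sketch below uses scikit-learn's FastICA on random stand-in data; the component flagging is hard-coded here, whereas a real pipeline would select noise components by their spatial and temporal signatures (e.g., CSF pulsation frequency).

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(12)
n_t, n_vox = 240, 500
data = rng.normal(size=(n_t, n_vox))         # stand-in rsfMRI data (time x voxel)

ica = FastICA(n_components=20, random_state=0)
sources = ica.fit_transform(data)            # component time courses

# suppose components 0 and 3 were flagged as CSF-pulsation/motion noise
noise = sources[:, [0, 3]]
beta, *_ = np.linalg.lstsq(noise, data, rcond=None)
cleaned = data - noise @ beta                # nuisance-regressed data
print(cleaned.shape)
```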

  3. GIS-based Approaches to Catchment Area Analyses of Mass Transit

    DEFF Research Database (Denmark)

    Andersen, Jonas Lohmann Elkjær; Landex, Alex

    2009-01-01

    Catchment area analyses of stops or stations are used to investigate the potential number of travelers to public transportation. These analyses are considered a strong decision tool in the planning process of mass transit, especially railroads. Catchment area analyses are GIS-based buffer and overlay...... analyses with different approaches depending on the desired level of detail. A simple and straightforward approach to implement is the Circular Buffer Approach, where catchment areas are circular. A more detailed approach is the Service Area Approach, where catchment areas are determined by a street network...... search to simulate the actual walking distances. A refinement of the Service Area Approach is to implement additional time resistance in the network search to simulate obstacles in the walking environment. This paper reviews and compares the different GIS-based catchment area approaches, their level

  4. Meta-Analyses of Human Cell-Based Cardiac Regeneration Therapies

    DEFF Research Database (Denmark)

    Gyöngyösi, Mariann; Wojakowski, Wojciech; Navarese, Eliano P

    2016-01-01

    In contrast to multiple publication-based meta-analyses involving clinical cardiac regeneration therapy in patients with recent myocardial infarction, a recently published meta-analysis based on individual patient data reported no effect of cell therapy on left ventricular function or clinical...

  5. Multi Objective Optimization of Multi Wall Carbon Nanotube Based Nanogrinding Wheel Using Grey Relational and Regression Analysis

    Science.gov (United States)

    Sethuramalingam, Prabhu; Vinayagam, Babu Kupusamy

    2016-07-01

    A carbon nanotube (CNT) mixed grinding wheel is used in the grinding process to analyze the surface characteristics of AISI D2 tool steel. To date, no work has been reported on CNT-based grinding wheels. A CNT-based grinding wheel has excellent thermal conductivity and good mechanical properties, which are used to improve the surface finish of the workpiece. In the present study, multi-response optimization of process parameters, namely surface roughness and metal removal rate, in grinding with single-wall carbon nanotubes in mixed cutting fluids is undertaken using an orthogonal array with grey relational analysis. Experiments are performed under the designated grinding conditions obtained using the L9 orthogonal array. Based on the results of the grey relational analysis, a set of optimum grinding parameters is obtained. The significant machining parameters are identified using the analysis-of-variance approach. An empirical model for predicting the output parameters has been developed using regression analysis, and the results are compared for grinding with and without the CNT wheel.
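
    The grey relational analysis step reported here is easy to reproduce in outline: normalize each response toward its ideal (smaller-is-better for roughness, larger-is-better for removal rate), convert deviations into grey relational coefficients, and rank the experiments by the mean coefficient (the grey relational grade). The response values below are invented for illustration; the distinguishing coefficient ζ = 0.5 is the conventional default.

```python
import numpy as np

# responses of 9 experiments (L9 array), invented for illustration:
# Ra = surface roughness (smaller is better), MRR = removal rate (larger is better)
ra  = np.array([0.42, 0.38, 0.45, 0.33, 0.40, 0.36, 0.31, 0.37, 0.35])
mrr = np.array([12.0, 14.5, 11.2, 15.8, 13.1, 14.0, 16.2, 13.7, 14.9])

def normalize(x, larger_better):
    span = x.max() - x.min()
    return (x - x.min()) / span if larger_better else (x.max() - x) / span

norm = np.column_stack([normalize(ra, False), normalize(mrr, True)])
delta = np.abs(1.0 - norm)                   # deviation from the ideal sequence
zeta = 0.5                                   # distinguishing coefficient
xi = (delta.min() + zeta * delta.max()) / (delta + zeta * delta.max())
grade = xi.mean(axis=1)                      # grey relational grade per experiment
print(np.argmax(grade) + 1)                  # best experiment number
```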

  6. Examination of Parameters Affecting the House Prices by Multiple Regression Analysis and its Contributions to Earthquake-Based Urban Transformation

    Science.gov (United States)

    Denli, H. H.; Durmus, B.

    2016-12-01

    The purpose of this study is to examine the factors which may affect apartment prices with multiple linear regression analysis models and to visualize the results with value maps. The study is focused on a county of Istanbul, Turkey. A total of 390 apartments around the county of Umraniye are evaluated according to their physical and locational conditions. The identification of factors affecting the price of apartments in the county, which has a population of approximately 600k, is expected to provide a significant contribution to the apartment market. Physical factors are selected as the age, number of rooms, size, number of floors of the building, and the floor on which the apartment is positioned. Positional factors are selected as the distances to the nearest hospital, school, park and police station. In total, ten physical and locational parameters are examined by regression analysis. After the regression analysis was performed, value maps were composed for the parameters age, price and price per square meter. The most significant of the composed maps is the price-per-square-meter map. Results show that the location of the apartment has the most influence on its price per square meter. A further use of the composed maps was developed by examining whether the price-per-square-meter map can support urban transformation practices. By marking the buildings older than 15 years on the price-per-square-meter map, a different and new interpretation was made to determine the buildings to which priority should be given during an urban transformation in the county. This county is very close to the North Anatolian Fault zone and is under the threat of earthquakes. By marking the apartments older than 15 years on the price-per-square-meter map, a list of apartments that are both old and expensive per square meter can be gathered. With the help of this list, priority could be given to the selected high-valued old apartments to support the economy of the country

  7. Assessment of triglyceride and cholesterol in overweight people based on multiple linear regression and artificial intelligence model.

    Science.gov (United States)

    Ma, Jing; Yu, Jiong; Hao, Guangshu; Wang, Dan; Sun, Yanni; Lu, Jianxin; Cao, Hongcui; Lin, Feiyan

    2017-02-20

    The prevalence of hyperlipidemia is increasing around the world. Our aims are to analyze the relationship of triglyceride (TG) and cholesterol (TC) with indexes of liver function and kidney function, and to develop a prediction model of TG and TC in overweight people. A total of 302 healthy adult subjects and 273 overweight subjects were enrolled in this study. The levels of fasting TG (fs-TG), fasting TC (fs-TC), blood glucose, liver function, and kidney function were measured and analyzed by correlation analysis and multiple linear regression (MLR). A back propagation artificial neural network (BP-ANN) was applied to develop prediction models of fs-TG and fs-TC. The results showed there were significant differences in biochemical indexes between healthy and overweight people. The correlation analysis showed that fs-TG was related to weight, height, blood glucose, and indexes of liver and kidney function, while fs-TC was correlated with age and indexes of liver function (P < 0.01). The MLR analysis indicated that the regression equations of fs-TG and fs-TC were both statistically significant (P < 0.01) for the included independent indexes. The BP-ANN model of fs-TG reached its training goal at epoch 59, while the fs-TC model achieved high prediction accuracy after 1000 training epochs. In conclusion, fs-TG and fs-TC were strongly related to weight, height, age, blood glucose, and indexes of liver and kidney function. Based on these variables, fs-TG and fs-TC can be predicted by BP-ANN models in overweight people.
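
    As a minimal stand-in for the BP-ANN model described, the sketch below trains a back-propagation multilayer perceptron to regress fasting triglyceride on anthropometric and biochemical indexes with scikit-learn. The predictor set, network size and data are assumptions for illustration, not the study's configuration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(13)
n = 273
# stand-ins: weight, height, age, blood glucose, ALT (liver), creatinine (kidney)
X = rng.normal(size=(n, 6))
fs_tg = 1.5 + 0.6 * X[:, 0] + 0.3 * X[:, 3] + rng.normal(0, 0.2, n)

bp_ann = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0),
)
bp_ann.fit(X, fs_tg)            # trained by back-propagation
print(bp_ann.score(X, fs_tg))   # in-sample R^2
```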

  8. Adaptive smoothing based on Gaussian processes regression increases the sensitivity and specificity of fMRI data.

    Science.gov (United States)

    Strappini, Francesca; Gilboa, Elad; Pitzalis, Sabrina; Kay, Kendrick; McAvoy, Mark; Nehorai, Arye; Snyder, Abraham Z

    2017-03-01

    Temporal and spatial filtering of fMRI data is often used to improve statistical power. However, conventional methods, such as smoothing with fixed-width Gaussian filters, remove fine-scale structure in the data, necessitating a tradeoff between sensitivity and specificity. Specifically, smoothing may increase sensitivity (reduce noise and increase statistical power) but at the cost of specificity, in that fine-scale structure in neural activity patterns is lost. Here, we propose an alternative smoothing method based on Gaussian process (GP) regression for single-subject fMRI experiments. This method adapts the level of smoothing on a voxel-by-voxel basis according to the characteristics of the local neural activity patterns. GP-based fMRI analysis has heretofore been impractical owing to computational demands. Here, we demonstrate a new implementation of GP that makes it possible to handle the massive data dimensionality of the typical fMRI experiment. We demonstrate how GP can be used as a drop-in replacement for conventional preprocessing steps for temporal and spatial smoothing in a standard fMRI pipeline. We present simulated and experimental results that show the increased sensitivity and specificity compared to conventional smoothing strategies. Hum Brain Mapp 38:1438-1459, 2017. © 2016 Wiley Periodicals, Inc.

  9. Estimation of the daily global solar radiation based on the Gaussian process regression methodology in the Saharan climate

    Science.gov (United States)

    Guermoui, Mawloud; Gairaa, Kacem; Rabehi, Abdelaziz; Djafer, Djelloul; Benkaciali, Said

    2018-06-01

    Accurate estimation of solar radiation is a major concern in renewable energy applications. Over the past few years, many machine learning paradigms have been proposed to improve estimation performance, mostly based on artificial neural networks, fuzzy logic, support vector machines and adaptive neuro-fuzzy inference systems. The aim of this work is the prediction of the daily global solar radiation received on a horizontal surface through the Gaussian process regression (GPR) methodology. A case study of the Ghardaïa region (Algeria) has been used to validate the methodology. Several combinations were tested; it was found that the GPR model based on sunshine duration, minimum air temperature and relative humidity gives the best results in terms of mean absolute bias error (MBE), root mean square error (RMSE), relative root mean square error (rRMSE), and correlation coefficient (r). The obtained values of these indicators are 0.67 MJ/m2, 1.15 MJ/m2, 5.2%, and 98.42%, respectively.

  10. INDIA’S ELECTRICITY DEMAND FORECAST USING REGRESSION ANALYSIS AND ARTIFICIAL NEURAL NETWORKS BASED ON PRINCIPAL COMPONENTS

    Directory of Open Access Journals (Sweden)

    S. Saravanan

    2012-07-01

    Full Text Available Power system planning starts with electric load (demand) forecasting. Accurate electricity load forecasting is one of the most important challenges in managing the supply and demand of electricity, since electricity demand is volatile in nature; it cannot be stored and has to be consumed instantly. This study deals with electricity consumption in India and forecasts the future demand for a period of 19 years, from 2012 to 2030. The eleven input variables used are amount of CO2 emission, population, per capita GDP, per capita gross national income, gross domestic savings, industry, consumer price index, wholesale price index, imports, exports and per capita power consumption. A new methodology based on Artificial Neural Networks (ANNs) using principal components is also used. Data from 29 years were used for training and data from 10 years for testing the ANNs. Comparisons were made with multiple linear regression (based on the original data and the principal components) and ANNs with the original data as input variables. The results show that the use of ANNs with principal components (PC) is more effective.

  11. To Set Up a Logistic Regression Prediction Model for Hepatotoxicity of Chinese Herbal Medicines Based on Traditional Chinese Medicine Theory

    Science.gov (United States)

    Liu, Hongjie; Li, Tianhao; Zhan, Sha; Pan, Meilan; Ma, Zhiguo; Li, Chenghua

    2016-01-01

    Aims. To establish a logistic regression (LR) prediction model for hepatotoxicity of Chinese herbal medicines (HMs) based on traditional Chinese medicine (TCM) theory and to provide a statistical basis for predicting hepatotoxicity of HMs. Methods. The correlations of hepatotoxic and nonhepatotoxic Chinese HMs with the four properties, five flavors, and channel tropism were analyzed with the chi-square test for two-way unordered categorical data. An LR prediction model was established and the accuracy of its predictions was evaluated. Results. The hepatotoxic and nonhepatotoxic Chinese HMs were related to the four properties (P < 0.05). In total, 12 variables from the four properties and five flavors were available for the LR. Four variables (warm and neutral of the four properties, and pungent and salty of the five flavors) were selected to establish the LR prediction model, with the cutoff value being 0.204. Conclusions. Warm and neutral of the four properties and pungent and salty of the five flavors were the variables affecting hepatotoxicity. Based on these results, the established LR prediction model had some predictive power for hepatotoxicity of Chinese HMs. PMID:27656240
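
    A sketch of the prediction step described: a logistic regression on property/flavor indicators, with the record's reported probability cutoff of 0.204 applied to the predicted probabilities. The coding of the four variables and all coefficients are invented with scikit-learn for illustration; only the cutoff value comes from the record.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(14)
n = 300
# binary indicators: warm, neutral (properties); pungent, salty (flavors)
X = rng.integers(0, 2, size=(n, 4)).astype(float)
logit = -1.5 + 1.0 * X[:, 0] + 0.6 * X[:, 2]          # synthetic ground truth
hepatotoxic = rng.random(n) < 1 / (1 + np.exp(-logit))

lr = LogisticRegression().fit(X, hepatotoxic)
prob = lr.predict_proba(X)[:, 1]
pred = prob >= 0.204          # the study's reported cutoff value
print(pred.mean())            # fraction of herbs flagged as hepatotoxic
```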

  12. CLASSIFICATION OF URBAN AERIAL DATA BASED ON PIXEL LABELLING WITH DEEP CONVOLUTIONAL NEURAL NETWORKS AND LOGISTIC REGRESSION

    Directory of Open Access Journals (Sweden)

    W. Yao

    2016-06-01

    Full Text Available The recent success of deep convolutional neural networks (CNN) on a large number of applications can be attributed to large amounts of available training data and increasing computing power. In this paper, a semantic pixel labelling scheme for urban areas using multi-resolution CNN and hand-crafted spatial-spectral features of airborne remotely sensed data is presented. Both CNN and hand-crafted features are applied to image/DSM patches to produce per-pixel class probabilities with an L1-norm regularized logistic regression classifier. Evidence theory infers a degree of belief for pixel labelling from the different sources to smooth regions, by handling the conflicts present in both classifiers while reducing the uncertainty. The aerial data used in this study were provided by ISPRS as benchmark datasets for 2D semantic labelling tasks in urban areas and consist of two data sources: LiDAR and a color infrared camera. The test sites are parts of a city in Germany which is assumed to consist of typical object classes including impervious surfaces, trees, buildings, low vegetation, vehicles and clutter. The evaluation is based on the computation of pixel-based confusion matrices by random sampling. The performance of the strategy with respect to scene characteristics and method combination strategies is analyzed and discussed. The competitive classification accuracy can be explained not only by the nature of the input data sources (e.g. the above-ground height of the nDSM highlights the vertical dimension of houses, trees and even cars, and the near-infrared spectrum indicates vegetation) but also by the decision-level fusion of the CNN’s texture-based approach with multichannel spatial-spectral hand-crafted features based on the evidence combination theory.

  13. Classification of Urban Aerial Data Based on Pixel Labelling with Deep Convolutional Neural Networks and Logistic Regression

    Science.gov (United States)

    Yao, W.; Poleswki, P.; Krzystek, P.

    2016-06-01

    The recent success of deep convolutional neural networks (CNN) on a large number of applications can be attributed to large amounts of available training data and increasing computing power. In this paper, a semantic pixel labelling scheme for urban areas using multi-resolution CNN and hand-crafted spatial-spectral features of airborne remotely sensed data is presented. Both CNN and hand-crafted features are applied to image/DSM patches to produce per-pixel class probabilities with an L1-norm regularized logistic regression classifier. Evidence theory infers a degree of belief for pixel labelling from the different sources to smooth regions, by handling the conflicts present in both classifiers while reducing the uncertainty. The aerial data used in this study were provided by ISPRS as benchmark datasets for 2D semantic labelling tasks in urban areas and consist of two data sources: LiDAR and a color infrared camera. The test sites are parts of a city in Germany which is assumed to consist of typical object classes including impervious surfaces, trees, buildings, low vegetation, vehicles and clutter. The evaluation is based on the computation of pixel-based confusion matrices by random sampling. The performance of the strategy with respect to scene characteristics and method combination strategies is analyzed and discussed. The competitive classification accuracy can be explained not only by the nature of the input data sources (e.g. the above-ground height of the nDSM highlights the vertical dimension of houses, trees and even cars, and the near-infrared spectrum indicates vegetation) but also by the decision-level fusion of the CNN's texture-based approach with multichannel spatial-spectral hand-crafted features based on the evidence combination theory.

  14. Boosted beta regression.

    Directory of Open Access Journals (Sweden)

    Matthias Schmid

    Full Text Available Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fitting a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
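
    For readers unfamiliar with beta regression, the following sketch fits the classical maximum-likelihood version with a logit link for the mean, on simulated data; the boosting-based estimation proposed in the paper is not shown. The parameterization uses mean mu and precision phi, so the beta shape parameters are mu*phi and (1 - mu)*phi.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln, expit

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + 1 covariate
beta_true, phi_true = np.array([0.3, 0.8]), 20.0
mu = expit(X @ beta_true)
y = rng.beta(mu * phi_true, (1 - mu) * phi_true)       # response in (0,1)

def neg_loglik(params):
    beta, log_phi = params[:-1], params[-1]
    mu = expit(X @ beta)                 # logit link for the mean
    phi = np.exp(log_phi)                # precision, kept positive
    p, q = mu * phi, (1 - mu) * phi
    ll = (gammaln(phi) - gammaln(p) - gammaln(q)
          + (p - 1) * np.log(y) + (q - 1) * np.log1p(-y))
    return -ll.sum()

res = minimize(neg_loglik, x0=np.zeros(X.shape[1] + 1), method="BFGS")
print(res.x[:-1], np.exp(res.x[-1]))     # estimated coefficients and precision
```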

  15. Applications of high lateral and energy resolution imaging XPS with a double hemispherical analyser based spectromicroscope

    International Nuclear Information System (INIS)

    Escher, M.; Winkler, K.; Renault, O.; Barrett, N.

    2010-01-01

    The design and applications of an instrument for imaging X-ray photoelectron spectroscopy (XPS) are reviewed. The instrument is based on a photoelectron microscope and a double hemispherical analyser whose symmetric configuration avoids the spherical aberration (α²-term) inherent in standard analysers. The analyser allows high transmission imaging without sacrificing the lateral and energy resolution of the instrument. The importance of high transmission, especially for highest-resolution imaging XPS with monochromated laboratory X-ray sources, is outlined, and the close interrelation of energy resolution, lateral resolution and analyser transmission is illustrated. Chemical imaging applications using a monochromatic laboratory Al Kα source are shown, with a lateral resolution of 610 nm. Examples of measurements made using synchrotron and laboratory ultra-violet light show the broad field of applications, from imaging of core level electrons with chemical shift identification to high resolution threshold photoelectron emission microscopy (PEEM), work function imaging and band structure imaging.

  16. Transgender Population Size in the United States: a Meta-Regression of Population-Based Probability Samples

    Science.gov (United States)

    Sevelius, Jae M.

    2017-01-01

    Background. Transgender individuals have a gender identity that differs from the sex they were assigned at birth. The population size of transgender individuals in the United States is not well known, in part because official records, including the US Census, do not include data on gender identity. Population surveys today more often collect transgender-inclusive gender-identity data, and secular trends in culture and the media have created a somewhat more favorable environment for transgender people. Objectives. To estimate the current population size of transgender individuals in the United States and evaluate any trend over time. Search methods. In June and July 2016, we searched PubMed, Cumulative Index to Nursing and Allied Health Literature, and Web of Science for national surveys, as well as “gray” literature, through an Internet search. We limited the search to 2006 through 2016. Selection criteria. We selected population-based surveys that used probability sampling and included self-reported transgender-identity data. Data collection and analysis. We used random-effects meta-analysis to pool eligible surveys and used meta-regression to address our hypothesis that the transgender population size estimate would increase over time. We used subsample and leave-one-out analysis to assess for bias. Main results. Our meta-regression model, based on 12 surveys covering 2007 to 2015, explained 62.5% of model heterogeneity, with a significant effect for each unit increase in survey year (F = 17.122; df = 1,10; b = 0.026%; P = .002). Extrapolating these results to 2016 suggested a current US population size of 390 adults per 100 000, or almost 1 million adults nationally. This estimate may be more indicative of younger adults, who represented more than 50% of the respondents in our analysis. Authors’ conclusions. Future national surveys are likely to observe higher numbers of transgender people. The large variety in questions used to ask
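
    A meta-regression of this kind can be approximated with inverse-variance weighted least squares of the per-survey prevalence estimates on survey year. The sketch below uses statsmodels with illustrative placeholder numbers, not the study's data.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical survey-level prevalence estimates (%) and standard errors
year = np.array([2007.0, 2009, 2011, 2013, 2014, 2015])
prev = np.array([0.30, 0.32, 0.35, 0.39, 0.41, 0.44])
se = np.array([0.05, 0.05, 0.04, 0.04, 0.03, 0.03])

X = sm.add_constant(year - 2007)                 # intercept + years since first survey
fit = sm.WLS(prev, X, weights=1.0 / se**2).fit() # inverse-variance weighting
b0, b1 = fit.params
print(f"slope per survey year: {b1:.4f} percentage points")
print(f"extrapolated 2016 prevalence: {b0 + b1 * (2016 - 2007):.3f}%")
```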

  17. Estimation of snowpack matching ground-truth data and MODIS satellite-based observations by using regression kriging

    Science.gov (United States)

    Collados-Lara, Antonio-Juan; Pardo-Iguzquiza, Eulogio; Pulido-Velazquez, David

    2016-04-01

    The estimation of the Snow Water Equivalent (SWE) is essential for an appropriate assessment of the available water resources in Alpine catchments. The hydrologic regime in these areas is dominated by the storage of water in the snowpack, which is discharged to rivers throughout the melt season. An accurate estimation of the resources is necessary for an appropriate analysis of system operation alternatives using basin-scale management models. In order to obtain an appropriate estimation of the SWE, we need to know the spatial distribution of the snowpack and the snow density within the Snow Cover Area (SCA). Data for these snow variables can be extracted from in-situ point measurements and airborne/spaceborne remote sensing observations. Different interpolation and simulation techniques have been employed for the estimation of the cited variables. In this paper we propose to estimate the snowpack from a reduced number of ground-truth data (1 or 2 campaigns per year, with 23 observation points, from 2000-2014) and MODIS satellite-based observations in the Sierra Nevada Mountains (Southern Spain). Regression-based methodologies have been used to study the snowpack distribution using different kinds of explicative variables: geographic, topographic and climatic. Forty explicative variables were considered: longitude, latitude, altitude, slope, eastness, northness, radiation and maximum upwind slope, plus some mathematical transformations of each of them [Ln(v), (v)^-1, (v)^2, (v)^0.5]. Eight regression model structures were tested, combining 1, 2, 3 or 4 explicative variables: Y = B0 + B1*Xi (1); Y = B0 + B1*Xi*Xj (2); Y = B0 + B1*Xi + B2*Xj (3); Y = B0 + B1*Xi + B2*Xj*Xl (4); Y = B0 + B1*Xi*Xk + B2*Xj*Xl (5); Y = B0 + B1*Xi + B2*Xj + B3*Xl (6); Y = B0 + B1*Xi + B2*Xj + B3*Xl*Xk (7); Y = B0 + B1*Xi + B2*Xj + B3*Xl + B4*Xk (8), where Y is the snow depth, (Xi, Xj, Xl, Xk) are the prediction variables (any of the 40 variables) and (B0, B1, B2, B3, B4) are the coefficients to be estimated. The ground data are employed to calibrate the multiple regressions. In
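
    A search over candidate structures like these can be written compactly by enumerating predictor combinations and comparing residual sums of squares. The sketch below, using synthetic data and only structure (3) for brevity, is a minimal illustration, not the authors' code.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
n, p = 120, 8                                  # observations, candidate predictors
X = rng.normal(size=(n, p))
y = 1.5 + 2.0 * X[:, 0] - 0.7 * X[:, 3] + rng.normal(scale=0.5, size=n)

def fit_rss(cols):
    """Residual sum of squares of an OLS fit of y on an intercept plus cols."""
    A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    _, rss, rank, _ = np.linalg.lstsq(A, y, rcond=None)
    return rss[0] if rss.size else np.inf

# Structure (3) in the abstract, Y = B0 + B1*Xi + B2*Xj, over all pairs (i, j)
best_pair = min(itertools.combinations(range(p), 2), key=fit_rss)
print("best two-predictor model uses columns", best_pair)
```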

  18. A regression-based differential expression detection algorithm for microarray studies with ultra-low sample size.

    Directory of Open Access Journals (Sweden)

    Daniel Vasiliu

    Full Text Available Global gene expression analysis using microarrays and, more recently, RNA-seq, has allowed investigators to understand biological processes at a system level. However, the identification of differentially expressed genes in experiments with small sample size, high dimensionality, and high variance remains challenging, limiting the usability of these tens of thousands of publicly available, and possibly many more unpublished, gene expression datasets. We propose a novel variable selection algorithm for ultra-low-n microarray studies using generalized linear model-based variable selection with a penalized binomial regression algorithm called penalized Euclidean distance (PED). Our method uses PED to build a classifier on the experimental data to rank genes by importance. In place of cross-validation, which is required by most similar methods but not reliable for experiments with small sample size, we use a simulation-based approach to additively build a list of differentially expressed genes from the rank-ordered list. Our simulation-based approach maintains a low false discovery rate while maximizing the number of differentially expressed genes identified, a feature critical for downstream pathway analysis. We apply our method to microarray data from an experiment perturbing the Notch signaling pathway in Xenopus laevis embryos. This dataset was chosen because it showed very little differential expression according to limma, a powerful and widely-used method for microarray analysis. Our method was able to detect a significant number of differentially expressed genes in this dataset and suggest future directions for investigation. Our method is easily adaptable for analysis of data from RNA-seq and other global expression experiments with low sample size and high dimensionality.

  19. Developing a Referral Protocol for Community-Based Occupational Therapy Services in Taiwan: A Logistic Regression Analysis.

    Science.gov (United States)

    Mao, Hui-Fen; Chang, Ling-Hui; Tsai, Athena Yi-Jung; Huang, Wen-Ni; Wang, Jye

    2016-01-01

    Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients' functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan's community-based occupational therapy (OT) service referral based on experts' beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values on receiver operating characteristic (ROC) curve testing (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services.

  20. Basing assessment and treatment of problem behavior on behavioral momentum theory: Analyses of behavioral persistence.

    Science.gov (United States)

    Schieltz, Kelly M; Wacker, David P; Ringdahl, Joel E; Berg, Wendy K

    2017-08-01

    The connection, or bridge, between applied and basic behavior analysis has been long-established (Hake, 1982; Mace & Critchfield, 2010). In this article, we describe how clinical decisions can be based more directly on behavioral processes and how basing clinical procedures on behavioral processes can lead to improved clinical outcomes. As a case in point, we describe how applied behavior analyses of maintenance, and specifically the long-term maintenance of treatment effects related to problem behavior, can be adjusted and potentially enhanced by basing treatment on Behavioral Momentum Theory. We provide a brief review of the literature including descriptions of two translational studies that proposed changes in how differential reinforcement of alternative behavior treatments are conducted based on Behavioral Momentum Theory. We then describe current clinical examples of how these translations are continuing to impact the definitions, designs, analyses, and treatment procedures used in our clinical practice. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. IPF-LASSO: Integrative L1-Penalized Regression with Penalty Factors for Prediction Based on Multi-Omics Data

    Directory of Open Access Journals (Sweden)

    Anne-Laure Boulesteix

    2017-01-01

    Full Text Available As modern biotechnologies advance, it has become increasingly frequent that different modalities of high-dimensional molecular data (termed “omics” data in this paper), such as gene expression, methylation, and copy number, are collected from the same patient cohort to predict the clinical outcome. While prediction based on omics data has been widely studied in the last fifteen years, little has been done in the statistical literature on the integration of multiple omics modalities to select a subset of variables for prediction, which is a critical task in personalized medicine. In this paper, we propose a simple penalized regression method to address this problem by assigning different penalty factors to different data modalities for feature selection and prediction. The penalty factors can be chosen in a fully data-driven fashion by cross-validation or by taking practical considerations into account. In simulation studies, we compare the prediction performance of our approach, called IPF-LASSO (Integrative LASSO with Penalty Factors) and implemented in the R package ipflasso, with the standard LASSO and the sparse group LASSO. The use of IPF-LASSO is also illustrated through applications to two real-life cancer datasets. All data and codes are available on the companion website to ensure reproducibility.
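
    The penalty-factor idea can be emulated with a standard LASSO by rescaling each modality's columns: dividing a block of features by its penalty factor before fitting is equivalent to penalizing that block's coefficients more heavily. The Python sketch below illustrates this trick on synthetic data; the authors' actual implementation is the R package ipflasso.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n, p_expr, p_meth = 100, 50, 30
X_expr = rng.normal(size=(n, p_expr))          # e.g. gene expression block
X_meth = rng.normal(size=(n, p_meth))          # e.g. methylation block
y = X_expr[:, 0] - X_expr[:, 1] + rng.normal(scale=0.1, size=n)

pf_expr, pf_meth = 1.0, 4.0                    # heavier penalty on methylation
X = np.hstack([X_expr / pf_expr, X_meth / pf_meth])

model = Lasso(alpha=0.05).fit(X, y)
# Undo the rescaling to recover coefficients on the original feature scale
pf = np.concatenate([np.full(p_expr, pf_expr), np.full(p_meth, pf_meth)])
coef = model.coef_ / pf
print((coef != 0).sum(), "features selected")
```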

  2. Use of different marker pre-selection methods based on single SNP regression in the estimation of Genomic-EBVs

    Directory of Open Access Journals (Sweden)

    Corrado Dimauro

    2010-01-01

    Full Text Available Two methods of SNP pre-selection based on single marker regression for the estimation of genomic breeding values (G-EBVs) were compared using simulated data provided by the XII QTL-MAS workshop: (i) Bonferroni correction of the significance threshold and (ii) a permutation test to obtain the reference distribution of the null hypothesis and identify significant markers at the P<0.01 and P<0.001 significance thresholds. From the set of markers significant at P<0.001, random subsets of 50% and 25% of the markers were extracted to evaluate the effect of further reducing the number of significant SNPs on G-EBV predictions. The Bonferroni correction method allowed the identification of 595 significant SNPs that gave the best G-EBV accuracies in the prediction generations (82.80%). The permutation methods gave slightly lower G-EBV accuracies even though a larger number of SNPs resulted significant (2,053 and 1,352 for the 0.01 and 0.001 significance thresholds, respectively). Interestingly, halving or dividing by four the number of SNPs significant at P<0.001 resulted in only a slight decrease of G-EBV accuracies. The genetic structure of the simulated population, with few QTL carrying large effects, might have favoured the Bonferroni method.
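
    Single-marker pre-selection reduces to fitting one simple regression per SNP and keeping markers whose p-values pass a multiplicity-corrected threshold. The sketch below shows the Bonferroni variant on simulated genotypes; the permutation variant would replace the threshold with a quantile of the permutation null distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, m = 400, 5000                                     # individuals, SNP markers
G = rng.integers(0, 3, size=(n, m)).astype(float)    # 0/1/2 genotype codes
y = 0.5 * G[:, 10] + rng.normal(size=n)              # one simulated QTL

# One simple regression per marker, collecting the slope p-values
pvals = np.array([stats.linregress(G[:, j], y).pvalue for j in range(m)])
alpha = 0.05
selected = np.flatnonzero(pvals < alpha / m)         # Bonferroni-corrected threshold
print(len(selected), "markers retained")
```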

  3. Configuring calendar variation based on time series regression method for forecasting of monthly currency inflow and outflow in Central Java

    Science.gov (United States)

    Setiawan; Suhartono; Ahmad, Imam Safawi; Rahmawati, Noorgam Ika

    2015-12-01

    Bank Indonesia (BI), the central bank of the Republic of Indonesia, has a single overarching objective: to establish and maintain rupiah stability. This objective can be pursued by monitoring the traffic of currency inflow and outflow. Inflow and outflow are related to the stock and distribution of currency across Indonesian territory, and they affect economic activities. Economic activity in Indonesia, a predominantly Muslim country, is closely tied to the Islamic (lunar) calendar, which differs from the Gregorian calendar. This research aims to forecast the currency inflow and outflow of the Representative Office (RO) of BI Semarang, in the Central Java region. The results of the analysis show that the characteristics of currency inflow and outflow are influenced by calendar variation effects, namely the date of Eid al-Fitr (a Muslim holiday), as well as by seasonal patterns. In addition, the particular week in which Eid al-Fitr falls also affects the increase in currency inflow and outflow. The best model for the inflow data, based on the smallest Root Mean Square Error (RMSE), is an ARIMA model, while the best model for predicting the outflow data at the RO of BI Semarang is the ARIMAX model or, equivalently, Time Series Regression, since both reduce to the same model. The forecasts for 2015 show an increase in currency inflow in August and an increase in currency outflow in July.
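
    A time series regression with calendar variation can be expressed as an ordinary regression on a trend, monthly dummies, and an indicator for the months in which Eid al-Fitr falls. The sketch below simulates such a series and recovers the Eid effect; the month indices and effect sizes are illustrative assumptions, not the paper's data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
months = np.arange(60)                        # Jan 2010 .. Dec 2014, monthly index
month_of_year = months % 12 + 1
t = months.astype(float)

# Hypothetical Eid al-Fitr months (the lunar date shifts ~11 days per year)
eid = np.isin(months, [8, 19, 31, 43, 54]).astype(float)

# Simulated inflow series: trend + seasonality + Eid effect + noise
y = (100 + 0.5 * t + 5 * np.sin(2 * np.pi * month_of_year / 12)
     + 30 * eid + rng.normal(scale=3, size=60))

season = np.column_stack([(month_of_year == m).astype(float) for m in range(2, 13)])
X = sm.add_constant(np.column_stack([t, eid, season]))
fit = sm.OLS(y, X).fit()                      # time series regression with dummies
print("estimated Eid al-Fitr effect:", round(fit.params[2], 1))
```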

  4. Regression-based approach for testing the association between multi-region haplotype configuration and complex trait

    Directory of Open Access Journals (Sweden)

    Zhao Hongbo

    2009-09-01

    Full Text Available Abstract Background It is quite common that the genetic architecture of complex traits involves many genes and their interactions. Therefore, dealing with multiple unlinked genomic regions simultaneously is desirable. Results In this paper we develop a regression-based approach to assess the interactions of haplotypes that belong to different unlinked regions, and we use score statistics to test the null hypothesis of no genetic association. Additionally, multiple marker combinations at each unlinked region are considered. The multiple tests are settled via the minP approach. The P value of the "best" multi-region multi-marker configuration is corrected via Monte-Carlo simulations. Through simulation studies, we assess the performance of the proposed approach and demonstrate its validity and power in testing for haplotype interaction association. Conclusion Our simulations showed that, for a binary trait without covariates, our proposed methods prove to be as powerful as, and sometimes more powerful than, htr and hapcc, which are part of the FAMHAP program. Additionally, our model can be applied to a wider variety of traits and allows adjustment for other covariates. To test its validity, our methods are applied to analyze the association between four unlinked candidate genes and pig meat quality.

  5. Wave-optics uncertainty propagation and regression-based bias model in GNSS radio occultation bending angle retrievals

    Directory of Open Access Journals (Sweden)

    M. E. Gorbunov

    2018-01-01

    Full Text Available A new reference occultation processing system (rOPS) will include a Global Navigation Satellite System (GNSS) radio occultation (RO) retrieval chain with integrated uncertainty propagation. In this paper, we focus on wave-optics bending angle (BA) retrieval in the lower troposphere and introduce (1) an empirically estimated boundary layer bias (BLB) model, then employed to reduce the systematic uncertainty of excess phases and bending angles in about the lowest 2 km of the troposphere, and (2) the estimation of (residual) systematic uncertainties and their propagation together with random uncertainties from excess phase to bending angle profiles. Our BLB model describes the estimated bias of the excess phase transferred from the estimated bias of the bending angle, for which the model is built, informed by analyzing refractivity fluctuation statistics shown to induce such biases. The model is derived from regression analysis using a large ensemble of Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) RO observations and concurrent European Centre for Medium-Range Weather Forecasts (ECMWF) analysis fields. It is formulated in terms of predictors and adaptive functions (powers and cross products of predictors), where we use six main predictors derived from observations: impact altitude, latitude, bending angle and its standard deviation, canonical transform (CT) amplitude, and its fluctuation index. Based on an ensemble of test days, independent of the days of data used for the regression analysis to establish the BLB model, we find the model very effective for bias reduction and capable of reducing bending angle and corresponding refractivity biases by about a factor of 5. The estimated residual systematic uncertainty, after the BLB profile subtraction, is lower bounded by the uncertainty from the (indirect) use of ECMWF analysis fields but is significantly lower than the systematic uncertainty without BLB correction. The

  6. Wave-optics uncertainty propagation and regression-based bias model in GNSS radio occultation bending angle retrievals

    Science.gov (United States)

    Gorbunov, Michael E.; Kirchengast, Gottfried

    2018-01-01

    A new reference occultation processing system (rOPS) will include a Global Navigation Satellite System (GNSS) radio occultation (RO) retrieval chain with integrated uncertainty propagation. In this paper, we focus on wave-optics bending angle (BA) retrieval in the lower troposphere and introduce (1) an empirically estimated boundary layer bias (BLB) model then employed to reduce the systematic uncertainty of excess phases and bending angles in about the lowest 2 km of the troposphere and (2) the estimation of (residual) systematic uncertainties and their propagation together with random uncertainties from excess phase to bending angle profiles. Our BLB model describes the estimated bias of the excess phase transferred from the estimated bias of the bending angle, for which the model is built, informed by analyzing refractivity fluctuation statistics shown to induce such biases. The model is derived from regression analysis using a large ensemble of Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) RO observations and concurrent European Centre for Medium-Range Weather Forecasts (ECMWF) analysis fields. It is formulated in terms of predictors and adaptive functions (powers and cross products of predictors), where we use six main predictors derived from observations: impact altitude, latitude, bending angle and its standard deviation, canonical transform (CT) amplitude, and its fluctuation index. Based on an ensemble of test days, independent of the days of data used for the regression analysis to establish the BLB model, we find the model very effective for bias reduction and capable of reducing bending angle and corresponding refractivity biases by about a factor of 5. The estimated residual systematic uncertainty, after the BLB profile subtraction, is lower bounded by the uncertainty from the (indirect) use of ECMWF analysis fields but is significantly lower than the systematic uncertainty without BLB correction. The systematic and
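
    The regression setup described in both versions of this record amounts to expanding a handful of observation-derived predictors into powers and pairwise cross products and fitting the bias by least squares. The sketch below illustrates that construction on synthetic values; the predictor names follow the abstract, while the target and coefficients are placeholders.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(6)
n = 1000
preds = {"impact_alt": rng.uniform(0.0, 2.0, n),     # km, placeholder range
         "latitude": rng.uniform(-90.0, 90.0, n),
         "ct_amplitude": rng.uniform(0.0, 1.0, n)}

cols = [np.ones(n)]
cols += list(preds.values())                                  # linear terms
cols += [v**2 for v in preds.values()]                        # powers
cols += [a * b for a, b in combinations(preds.values(), 2)]   # cross products
X = np.column_stack(cols)

# Toy bias target; the real model is fitted to COSMIC/ECMWF-derived biases
bias = 0.1 * preds["impact_alt"]**2 + rng.normal(scale=0.01, size=n)
coef, *_ = np.linalg.lstsq(X, bias, rcond=None)
```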

  7. Robustness of intra urban land-use regression models for ultrafine particles and black carbon based on mobile monitoring.

    Science.gov (United States)

    Kerckhoffs, Jules; Hoek, Gerard; Vlaanderen, Jelle; van Nunen, Erik; Messier, Kyle; Brunekreef, Bert; Gulliver, John; Vermeulen, Roel

    2017-11-01

    Land-use regression (LUR) models for ultrafine particles (UFP) and black carbon (BC) in urban areas have been developed using short-term stationary monitoring or mobile platforms in order to capture the high variability of these pollutants. However, little is known about the comparability of predictions of mobile and short-term stationary models, and especially the validity of these models for assessing residential exposures and the robustness of model predictions developed in different campaigns. We used an electric car to collect mobile measurements (n = 5236 unique road segments) and short-term stationary measurements (3 × 30 min, n = 240) of UFP and BC in three Dutch cities (Amsterdam, Utrecht, Maastricht) in 2014-2015. Predictions of LUR models based on mobile measurements were compared to (i) measured concentrations at the short-term stationary sites, (ii) LUR model predictions based on short-term stationary measurements at 1500 random addresses in the three cities, (iii) externally obtained home outdoor measurements (3 × 24 h samples; n = 42) and (iv) predictions of a LUR model developed based upon a 2013 mobile campaign in two cities (Amsterdam, Rotterdam). Despite the poor model R² of 15%, the ability of mobile UFP models to predict measurements with longer averaging times increased substantially, from 36% for short-term stationary measurements to 57% for home outdoor measurements. In contrast, the mobile BC model predicted only 14% of the variation at the short-term stationary sites and likewise 14% at the home outdoor sites. Models based upon mobile and short-term stationary monitoring provided fairly highly correlated predictions of UFP concentrations at 1500 randomly selected addresses in the three Dutch cities (R² = 0.64). We found higher UFP predictions (of about 30%) based on mobile models as opposed to short-term model predictions and home outdoor measurements, with no clear geospatial patterns. The mobile model for UFP was stable over different settings as

  8. Percentile-Based ETCCDI Temperature Extremes Indices for CMIP5 Model Output: New Results through Semiparametric Quantile Regression Approach

    Science.gov (United States)

    Li, L.; Yang, C.

    2017-12-01

    Climate extremes often manifest as rare events in terms of surface air temperature and precipitation with an annual reoccurrence period. In order to represent the manifold characteristics of climate extremes for monitoring and analysis, the Expert Team on Climate Change Detection and Indices (ETCCDI) worked out a set of 27 core indices based on daily temperature and precipitation data, describing extreme weather and climate events on an annual basis. The CLIMDEX project (http://www.climdex.org) has produced public domain datasets of such indices for data from a variety of sources, including output from global climate models (GCM) participating in the Coupled Model Intercomparison Project Phase 5 (CMIP5). Among the 27 ETCCDI indices, there are six percentile-based temperature extremes indices that fall into two groups: exceedance rates (ER) (TN10p, TN90p, TX10p and TX90p) and durations (CSDI and WSDI). Percentiles must be estimated prior to the calculation of the indices and can be more or less biased by the adopted algorithm. Such biases will in turn be propagated to the final results of the indices. CLIMDEX used an empirical quantile estimator combined with a bootstrap resampling procedure to reduce the inhomogeneity in the annual series of the ER indices. However, some problems remain in the CLIMDEX datasets, namely the overestimated climate variability due to unaccounted autocorrelation in the daily temperature data, seasonally varying biases, and inconsistency between the algorithms applied to the ER indices and to the duration indices. We now present new results for the six indices through a semiparametric quantile regression approach for the CMIP5 model output. By using the base-period data as a whole and taking seasonality and autocorrelation into account, this approach successfully addressed the aforementioned issues and produced consistent results. The new datasets cover the historical and three projected (RCP2.6, RCP4.5 and RCP
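
    The core of such an approach is estimating a smoothly varying percentile, for example the calendar-day 90th percentile of daily maximum temperature, by quantile regression on seasonal terms rather than by empirical quantiles in short windows. A minimal sketch with statsmodels' QuantReg on simulated data:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
doy = np.tile(np.arange(1, 366), 5)                 # 5 years of daily data
temp = 15 + 10 * np.sin(2 * np.pi * doy / 365) + rng.normal(scale=3, size=doy.size)
df = pd.DataFrame({"temp": temp,
                   "s": np.sin(2 * np.pi * doy / 365),   # seasonal harmonics
                   "c": np.cos(2 * np.pi * doy / 365)})

q90 = smf.quantreg("temp ~ s + c", df).fit(q=0.90)  # seasonal 90th percentile
df["tx90"] = q90.predict(df)
print("exceedance rate:", (df["temp"] > df["tx90"]).mean())   # ~0.10 by design
```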

  9. Modified Regression Correlation Coefficient for Poisson Regression Model

    Science.gov (United States)

    Kaengthong, Nattacha; Domthong, Uthumporn

    2017-09-01

    This study considers indicators of the predictive power of the Generalized Linear Model (GLM), which are widely used but often subject to restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model, where the dependent variable is Poisson distributed. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables and in the presence of multicollinearity among the independent variables. The results show that the proposed regression correlation coefficient is better than the traditional one in terms of bias and Root Mean Square Error (RMSE).
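
    The traditional regression correlation coefficient referenced here is the correlation between the observed response and the fitted conditional mean E(Y|X) from the Poisson model. A minimal sketch on simulated data (the paper's modified version is not reproduced):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 300
X = sm.add_constant(rng.normal(size=(n, 2)))
mu = np.exp(X @ np.array([0.2, 0.5, -0.3]))          # true conditional mean
y = rng.poisson(mu)

fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
r = np.corrcoef(y, fit.mu)[0, 1]     # correlation of Y with fitted E(Y|X)
print("regression correlation coefficient:", round(r, 3))
```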

  10. riskRegression

    DEFF Research Database (Denmark)

    Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas

    2017-01-01

    In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface...... for predicting the covariate specific absolute risks, their confidence intervals, and their confidence bands based on right censored time to event data. We provide explicit formulas for our implementation of the estimator of the (stratified) baseline hazard function in the presence of tied event times. As a by...... functionals. The software presented here is implemented in the riskRegression package....

  11. Reliable change indices and standardized regression-based change score norms for evaluating neuropsychological change in children with epilepsy.

    Science.gov (United States)

    Busch, Robyn M; Lineweaver, Tara T; Ferguson, Lisa; Haut, Jennifer S

    2015-06-01

    Reliable change indices (RCIs) and standardized regression-based (SRB) change score norms permit evaluation of meaningful changes in test scores following treatment interventions, like epilepsy surgery, while accounting for test-retest reliability, practice effects, score fluctuations due to error, and relevant clinical and demographic factors. Although these methods are frequently used to assess cognitive change after epilepsy surgery in adults, they have not been widely applied to examine cognitive change in children with epilepsy. The goal of the current study was to develop RCIs and SRB change score norms for use in children with epilepsy. Sixty-three children with epilepsy (age range: 6-16; M=10.19, SD=2.58) underwent comprehensive neuropsychological evaluations at two time points an average of 12 months apart. Practice effect-adjusted RCIs and SRB change score norms were calculated for all cognitive measures in the battery. Practice effects were quite variable across the neuropsychological measures, with the greatest differences observed among older children, particularly on the Children's Memory Scale and Wisconsin Card Sorting Test. There was also notable variability in test-retest reliabilities across measures in the battery, with coefficients ranging from 0.14 to 0.92. Reliable change indices and SRB change score norms for use in assessing meaningful cognitive change in children following epilepsy surgery are provided for measures with reliability coefficients above 0.50. This is the first study to provide RCIs and SRB change score norms for a comprehensive neuropsychological battery based on a large sample of children with epilepsy. Tables to aid in evaluating cognitive changes in children who have undergone epilepsy surgery are provided for clinical use. An Excel sheet to perform all relevant calculations is also available to interested clinicians or researchers. Copyright © 2015 Elsevier Inc. All rights reserved.
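
    For readers unfamiliar with the mechanics, a practice-adjusted reliable change index divides the retest-test difference, corrected for the mean practice effect, by the standard error of the difference derived from the test-retest reliability. The sketch below uses illustrative values, not the study's norms:

```python
import math

def rci(score_t1, score_t2, sd_t1, r_xx, practice_effect):
    """Practice-adjusted reliable change index (Jacobson-Truax style)."""
    sem = sd_t1 * math.sqrt(1 - r_xx)        # standard error of measurement
    sediff = math.sqrt(2 * sem ** 2)         # standard error of the difference
    return (score_t2 - score_t1 - practice_effect) / sediff

# Illustrative values: IQ-type score, reliability 0.80, mean practice gain of 2
z = rci(score_t1=95, score_t2=88, sd_t1=15, r_xx=0.80, practice_effect=2.0)
print(round(z, 2), abs(z) > 1.96)            # flag change beyond the 95% CI
```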

  12. Reliable Change Indices and Standardized Regression-Based Change Score Norms for Evaluating Neuropsychological Change in Children with Epilepsy

    Science.gov (United States)

    Busch, Robyn M.; Lineweaver, Tara T.; Ferguson, Lisa; Haut, Jennifer S.

    2015-01-01

    Reliable change index scores (RCIs) and standardized regression-based change score norms (SRBs) permit evaluation of meaningful changes in test scores following treatment interventions, like epilepsy surgery, while accounting for test-retest reliability, practice effects, score fluctuations due to error, and relevant clinical and demographic factors. Although these methods are frequently used to assess cognitive change after epilepsy surgery in adults, they have not been widely applied to examine cognitive change in children with epilepsy. The goal of the current study was to develop RCIs and SRBs for use in children with epilepsy. Sixty-three children with epilepsy (age range 6–16; M=10.19, SD=2.58) underwent comprehensive neuropsychological evaluations at two time points an average of 12 months apart. Practice-adjusted RCIs and SRBs were calculated for all cognitive measures in the battery. Practice effects were quite variable across the neuropsychological measures, with the greatest differences observed among older children, particularly on the Children’s Memory Scale and Wisconsin Card Sorting Test. There was also notable variability in test-retest reliabilities across measures in the battery, with coefficients ranging from 0.14 to 0.92. RCIs and SRBs for use in assessing meaningful cognitive change in children following epilepsy surgery are provided for measures with reliability coefficients above 0.50. This is the first study to provide RCIs and SRBs for a comprehensive neuropsychological battery based on a large sample of children with epilepsy. Tables to aid in evaluating cognitive changes in children who have undergone epilepsy surgery are provided for clinical use. An Excel sheet to perform all relevant calculations is also available to interested clinicians or researchers. PMID:26043163

  13. Noise model based ν-support vector regression with its application to short-term wind speed forecasting.

    Science.gov (United States)

    Hu, Qinghua; Zhang, Shiguang; Xie, Zongxia; Mi, Jusheng; Wan, Jie

    2014-09-01

    Support vector regression (SVR) techniques are aimed at discovering a linear or nonlinear structure hidden in sample data. Most existing regression techniques take the assumption that the error distribution is Gaussian. However, it has been observed that the noise in some real-world applications, such as wind power forecasting and direction-of-arrival estimation, does not satisfy a Gaussian distribution but rather a beta distribution, Laplace distribution, or other model. In these cases the current regression techniques are not optimal. Following the Bayesian approach, we derive a general loss function and develop a unified ν-support vector regression technique for this general noise model (N-SVR). The Augmented Lagrange Multiplier method is introduced to solve N-SVR. Numerical experiments on artificial data sets, UCI data and short-term wind speed prediction are conducted. The results show the effectiveness of the proposed technique. Copyright © 2014 Elsevier Ltd. All rights reserved.
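
    As a point of reference, standard ν-SVR (implicitly tuned to symmetric, roughly Gaussian noise) is available off the shelf. The sketch below forecasts a simulated wind speed series from lagged values with scikit-learn's NuSVR; the paper's general-noise N-SVR has no off-the-shelf implementation and is not reproduced here.

```python
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(9)
speed = 8 + 2 * np.sin(np.arange(600) / 20) + rng.normal(scale=0.5, size=600)

lags = 3                                      # predict from the last 3 readings
X = np.column_stack([speed[i:len(speed) - lags + i] for i in range(lags)])
y = speed[lags:]

model = NuSVR(nu=0.5, C=10.0, kernel="rbf").fit(X[:500], y[:500])
pred = model.predict(X[500:])                 # held-out one-step forecasts
```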

  14. PCA-based algorithm for calibration of spectrophotometric analysers of food

    International Nuclear Information System (INIS)

    Morawski, Roman Z; Miekina, Andrzej

    2013-01-01

    Spectrophotometric analysers of food, being instruments for the determination of the composition of food products and ingredients, are today of growing importance for the food industry, as well as for food distributors and consumers. Their metrological performance significantly depends on the numerical performance of the available means for spectrophotometric data processing, in particular the means for calibration of the analysers. In this paper, a new algorithm for this purpose is proposed, viz. an algorithm using principal component analysis (PCA). It is almost as efficient as PLS-based calibration algorithms, but much simpler.
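
    Calibration by PCA amounts to principal component regression: project each spectrum onto a few leading components and regress the analyte concentration on the scores. A minimal sketch on simulated single-component spectra, assuming scikit-learn:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(10)
n_samples, n_wavelengths = 80, 200
conc = rng.uniform(0, 1, n_samples)                    # analyte concentration
band = np.exp(-0.5 * ((np.arange(n_wavelengths) - 100) / 15) ** 2)  # pure band
spectra = np.outer(conc, band) + rng.normal(scale=0.01,
                                            size=(n_samples, n_wavelengths))

pcr = make_pipeline(PCA(n_components=5), LinearRegression())
pcr.fit(spectra, conc)                                 # calibration step
print("R^2 on training spectra:", pcr.score(spectra, conc))
```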

  15. A Server-Client-Based Graphical Development Environment for Physics Analyses (VISPA)

    International Nuclear Information System (INIS)

    Bretz, H-P; Erdmann, M; Fischer, R; Hinzmann, A; Klingebiel, D; Komm, M; Müller, G; Rieger, M; Steffens, J; Steggemann, J; Urban, M; Winchen, T

    2012-01-01

    The Visual Physics Analysis (VISPA) project provides a graphical development environment for data analysis. It addresses the typical development cycle of (re-)designing, executing, and verifying an analysis. We present the new server-client-based web application of the VISPA project to perform physics analyses via a standard internet browser. This enables individual scientists to work with a large variety of devices including touch screens, and teams of scientists to share, develop, and execute analyses on a server via the web interface.

  16. Automatic image-based analyses using a coupled quadtree-SBFEM/SCM approach

    Science.gov (United States)

    Gravenkamp, Hauke; Duczek, Sascha

    2017-10-01

    Quadtree-based domain decomposition algorithms offer an efficient option to create meshes for automatic image-based analyses. Without introducing hanging nodes the scaled boundary finite element method (SBFEM) can directly operate on such meshes by only discretizing the edges of each subdomain. However, the convergence of a numerical method that relies on a quadtree-based geometry approximation is often suboptimal due to the inaccurate representation of the boundary. To overcome this problem a combination of the SBFEM with the spectral cell method (SCM) is proposed. The basic idea is to treat each uncut quadtree cell as an SBFEM polygon, while all cut quadtree cells are computed employing the SCM. This methodology not only reduces the required number of degrees of freedom but also avoids a two-dimensional quadrature in all uncut quadtree cells. Numerical examples including static, harmonic, modal and transient analyses of complex geometries are studied, highlighting the performance of this novel approach.

  17. Engineering design and exergy analyses for combustion gas turbine based power generation system

    International Nuclear Information System (INIS)

    Sue, D.-C.; Chuang, C.-C.

    2004-01-01

    This paper presents the engineering design and theoretical exergetic analyses of plants for combustion gas turbine based power generation systems. Exergy analysis is performed based on the first and second laws of thermodynamics for power generation systems. The results show that exergy analyses for a steam cycle system predict the plant efficiency more precisely. The plant efficiency for partial load operation is lower than for full load operation. Increasing the pinch points will decrease the combined cycle plant efficiency. The engineering design is based on inlet air cooling and natural gas preheating for increasing the net power output and efficiency. To evaluate the energy utilization, one combined cycle unit and one cogeneration system, consisting of gas turbine generators, heat recovery steam generators, and one steam turbine generator with steam extracted for process use, have been analyzed. The analytical results are used for engineering design and component selection

  18. Coalescent-based genome analyses resolve the early branches of the euarchontoglires.

    Directory of Open Access Journals (Sweden)

    Vikas Kumar

    Full Text Available Despite numerous large-scale phylogenomic studies, certain parts of the mammalian tree are extraordinarily difficult to resolve. We used the coding regions from 19 completely sequenced genomes to study the relationships within the super-clade Euarchontoglires (Primates, Rodentia, Lagomorpha, Dermoptera and Scandentia), because the placement of Scandentia within this clade is controversial. The difficulty in resolving this issue is due to the short time spans between the early divergences of Euarchontoglires, which may cause incongruent gene trees. The conflict in the data can be depicted by network analyses, and the contentious relationships are best reconstructed by coalescent-based analyses. This method is expected to be superior to analyses of concatenated data in reconstructing a species tree from numerous gene trees. The total concatenated dataset used to study the relationships in this group comprises 5,875 protein-coding genes (9,799,170 nucleotides) from all orders except Dermoptera (flying lemurs). Reconstruction of the species tree from 1,006 gene trees using coalescent models placed Scandentia as sister group to the primates, which is in agreement with maximum likelihood analyses of concatenated nucleotide sequence data. Additionally, both analytical approaches favoured the tarsier as sister taxon to Anthropoidea, thus belonging to the haplorrhine clade. When divergence times are short, such as in radiations over periods of a few million years, even genome-scale analyses struggle to resolve phylogenetic relationships. On these short branches, processes such as incomplete lineage sorting and possibly hybridization occur and make it preferable to base phylogenomic analyses on coalescent methods.

  19. THE GOAL OF VALUE-BASED MEDICINE ANALYSES: COMPARABILITY. THE CASE FOR NEOVASCULAR MACULAR DEGENERATION

    Science.gov (United States)

    Brown, Gary C.; Brown, Melissa M.; Brown, Heidi C.; Kindermann, Sylvia; Sharma, Sanjay

    2007-01-01

    Purpose To evaluate the comparability of articles in the peer-reviewed literature assessing the (1) patient value and (2) cost-utility (cost-effectiveness) associated with interventions for neovascular age-related macular degeneration (ARMD). Methods A search was performed in the National Library of Medicine database of 16 million peer-reviewed articles using the key words cost-utility, cost-effectiveness, value, verteporfin, pegaptanib, laser photocoagulation, ranibizumab, and therapy. All articles that used an outcome of quality-adjusted life-years (QALYs) were studied in regard to (1) percent improvement in quality of life, (2) utility methodology, (3) utility respondents, (4) types of costs included (eg, direct healthcare, direct nonhealthcare, indirect), (5) cost bases (eg, Medicare, National Health Service in the United Kingdom), and (6) study cost perspective (eg, government, societal, third-party insurer). To qualify as a value-based medicine analysis, the patient value had to be measured using the outcome of the QALYs conferred by respective interventions. As with value-based medicine analyses, patient-based time tradeoff utility analysis had to be utilized, patient utility respondents were necessary, and direct medical costs were used. Results Among 21 cost-utility analyses performed on interventions for neovascular macular degeneration, 15 (71%) met value-based medicine criteria. The 6 others (29%) were not comparable owing to (1) varying utility methodology, (2) varying utility respondents, (3) differing costs utilized, (4) differing cost bases, and (5) varying study perspectives. Among value-based medicine studies, laser photocoagulation confers a 4.4% value gain (improvement in quality of life) for the treatment of classic subfoveal choroidal neovascularization. Intravitreal pegaptanib confers a 5.9% value gain (improvement in quality of life) for classic, minimally classic, and occult subfoveal choroidal neovascularization, and photodynamic therapy

  20. The goal of value-based medicine analyses: comparability. The case for neovascular macular degeneration.

    Science.gov (United States)

    Brown, Gary C; Brown, Melissa M; Brown, Heidi C; Kindermann, Sylvia; Sharma, Sanjay

    2007-01-01

    To evaluate the comparability of articles in the peer-reviewed literature assessing the (1) patient value and (2) cost-utility (cost-effectiveness) associated with interventions for neovascular age-related macular degeneration (ARMD). A search was performed in the National Library of Medicine database of 16 million peer-reviewed articles using the key words cost-utility, cost-effectiveness, value, verteporfin, pegaptanib, laser photocoagulation, ranibizumab, and therapy. All articles that used an outcome of quality-adjusted life-years (QALYs) were studied in regard to (1) percent improvement in quality of life, (2) utility methodology, (3) utility respondents, (4) types of costs included (eg, direct healthcare, direct nonhealthcare, indirect), (5) cost bases (eg, Medicare, National Health Service in the United Kingdom), and (6) study cost perspective (eg, government, societal, third-party insurer). To qualify as a value-based medicine analysis, the patient value had to be measured using the outcome of the QALYs conferred by respective interventions. As with value-based medicine analyses, patient-based time tradeoff utility analysis had to be utilized, patient utility respondents were necessary, and direct medical costs were used. Among 21 cost-utility analyses performed on interventions for neovascular macular degeneration, 15 (71%) met value-based medicine criteria. The 6 others (29%) were not comparable owing to (1) varying utility methodology, (2) varying utility respondents, (3) differing costs utilized, (4) differing cost bases, and (5) varying study perspectives. Among value-based medicine studies, laser photocoagulation confers a 4.4% value gain (improvement in quality of life) for the treatment of classic subfoveal choroidal neovascularization. Intravitreal pegaptanib confers a 5.9% value gain (improvement in quality of life) for classic, minimally classic, and occult subfoveal choroidal neovascularization, and photodynamic therapy with verteporfin confers

  1. Quantifying the dose-response relationship between circulating folate concentrations and colorectal cancer in cohort studies: a meta-analysis based on a flexible meta-regression model.

    Science.gov (United States)

    Chuang, Shu-Chun; Rota, Matteo; Gunter, Marc J; Zeleniuch-Jacquotte, Anne; Eussen, Simone J P M; Vollset, Stein Emil; Ueland, Per Magne; Norat, Teresa; Ziegler, Regina G; Vineis, Paolo

    2013-10-01

    Most epidemiologic studies on folate intake suggest that folate may be protective against colorectal cancer, but the results on circulating (plasma or serum) folate are mostly inconclusive. We conducted a meta-analysis of case-control studies nested within prospective studies on circulating folate and colorectal cancer risk by using flexible meta-regression models to test the linear and nonlinear dose-response relationships. A total of 8 publications (10 cohorts, representing 3,477 cases and 7,039 controls) were included in the meta-analysis. The linear and nonlinear models corresponded to relative risks of 0.96 (95% confidence interval (CI): 0.91, 1.02) and 0.99 (95% CI: 0.96, 1.02), respectively, per 10 nmol/L of circulating folate in contrast to the reference value. The pooled relative risks when comparing the highest with the lowest category were 0.80 (95% CI: 0.61, 0.99) for radioimmunoassay and 1.03 (95% CI: 0.83, 1.22) for microbiological assay. Overall, our analyses suggest a null association between circulating folate and colorectal cancer risk. The stronger association for the radioimmunoassay-based studies could reflect differences in cohorts and study designs rather than assay performance. Further investigations need to integrate more accurate measurements and flexible modeling to explore the effects of folate in the presence of genetic, lifestyle, dietary, and hormone-related factors.

  2. Multiple linear regression model for bromate formation based on the survey data of source waters from geographically different regions across China.

    Science.gov (United States)

    Yu, Jianwei; Liu, Juan; An, Wei; Wang, Yongjing; Zhang, Junzhi; Wei, Wei; Su, Ming; Yang, Min

    2015-01-01

    A total of 86 source water samples from 38 cities across major watersheds of China were collected for a bromide (Br−) survey, and the bromate (BrO3−) formation potentials (BFPs) of 41 samples with Br− concentration >20 μg L−1 were evaluated using a batch ozonation reactor. Statistical analyses indicated that higher alkalinity, hardness, and pH of water samples could lead to higher BFPs, with alkalinity as the most important factor. Based on the survey data, a multiple linear regression (MLR) model including three parameters (alkalinity, ozone dose, and total organic carbon (TOC)) was established with a relatively good prediction performance (model selection criterion = 2.01, R² = 0.724), using logarithmic transformation of the variables. Furthermore, a contour plot was used to interpret the influence of alkalinity and TOC on BrO3− formation with a prediction accuracy as high as 71%, suggesting that these two parameters, apart from ozone dosage, were the most important ones affecting the BFPs of source waters with Br− concentration >20 μg L−1. The model could be a useful tool for the prediction of the BFPs of source water.
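
    Because the model is linear in the logarithms, its coefficients act as elasticities of the bromate formation potential with respect to each parameter. A sketch of fitting such a log-log MLR on simulated survey-like data (all values are placeholders, not the survey's measurements):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 41                                   # samples with Br- above 20 ug/L
alk = rng.uniform(50, 300, n)            # alkalinity, mg/L as CaCO3
o3 = rng.uniform(1, 5, n)                # ozone dose, mg/L
toc = rng.uniform(1, 8, n)               # total organic carbon, mg/L
bfp = 2.0 * alk**0.8 * o3**0.5 * toc**-0.3 * rng.lognormal(sigma=0.2, size=n)

X = sm.add_constant(np.log(np.column_stack([alk, o3, toc])))
fit = sm.OLS(np.log(bfp), X).fit()       # log-log fit: coefficients = elasticities
print(fit.params)
```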

  3. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    Science.gov (United States)

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Due to the upcoming deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analysis tools, and efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements; therefore, biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on the Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via the HTCondor scheduler), and support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as a performance evaluation are presented to validate the feasibility of the proposed approach. Copyright © 2014 Elsevier Inc. All rights reserved.

  4. Estimation of Tree Cover in an Agricultural Parkland of Senegal Using Rule-Based Regression Tree Modeling

    Directory of Open Access Journals (Sweden)

    Stefanie M. Herrmann

    2013-10-01

    Full Text Available Field trees are an integral part of the farmed parkland landscape in West Africa and provide multiple benefits to the local environment and livelihoods. While field trees have received increasing interest in the context of strengthening resilience to climate variability and change, the actual extent of farmed parkland and the spatial patterns of tree cover are largely unknown. We used the rule-based predictive modeling tool Cubist® to estimate field tree cover in the west-central agricultural region of Senegal. A collection of rules and associated multiple linear regression models was constructed from (1) a reference dataset of percent tree cover derived from very high spatial resolution data (2 m OrbView) as the dependent variable, and (2) ten years of 10-day 250 m Moderate Resolution Imaging Spectroradiometer (MODIS) Normalized Difference Vegetation Index (NDVI) composites and derived phenological metrics as independent variables. Correlation coefficients between modeled and reference percent tree cover of 0.88 and 0.77 were achieved for training and validation data respectively, with absolute mean errors of 1.07 and 1.03 percent tree cover. The resulting map shows a west-east gradient, from high tree cover in the peri-urban areas of horticulture and arboriculture to low tree cover in the more sparsely populated eastern part of the study area. A comparison of current (2000s) tree cover along this gradient with historic cover as seen on Corona images reveals dynamics of change but also areas of remarkable stability of field tree cover since 1968. The proposed modeling approach can help to identify locations of high and low tree cover in dryland environments and guide ground studies and management interventions aimed at promoting the integration of field trees in agricultural systems.
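
    Cubist fits a set of rules with a linear regression model in each rule; as a rough, freely available stand-in, the sketch below fits an ordinary regression tree (piecewise-constant rather than piecewise-linear) to map simulated NDVI-derived metrics to percent tree cover. All data and variable names are placeholders.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(13)
n, p = 500, 12                                   # pixels, NDVI-derived metrics
metrics = rng.uniform(0, 1, size=(n, p))         # placeholder phenological metrics
tree_cover = 40 * metrics[:, 0] + 10 * metrics[:, 3] + rng.normal(scale=2, size=n)

model = DecisionTreeRegressor(max_depth=4, min_samples_leaf=20)
model.fit(metrics, tree_cover)
r = np.corrcoef(tree_cover, model.predict(metrics))[0, 1]
print("correlation, modeled vs reference cover:", round(r, 2))
```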

  5. A Versatile Software Package for Inter-subject Correlation Based Analyses of fMRI

    Directory of Open Access Journals (Sweden)

    Jukka-Pekka eKauppi

    2014-01-01

    Full Text Available In the inter-subject correlation (ISC) based analysis of functional magnetic resonance imaging (fMRI) data, the extent of shared processing across subjects during the experiment is determined by calculating correlation coefficients between the fMRI time series of the subjects in the corresponding brain locations. This implies that ISC can be used to analyze fMRI data without explicitly modelling the stimulus, and thus ISC is a potential method to analyze fMRI data acquired under complex naturalistic stimuli. Despite the suitability of the ISC-based approach to analyze complex fMRI data, no generic software tools have been made available for this purpose, limiting the widespread use of ISC-based analysis techniques in the neuroimaging community. In this paper, we present a graphical user interface (GUI) based software package, ISC Toolbox, implemented in Matlab for computing various ISC-based analyses. Many advanced computations such as comparison of ISCs between different stimuli, time window ISC, and inter-subject phase synchronization are supported by the toolbox. The analyses are coupled with re-sampling based statistical inference. The ISC-based analyses are data and computation intensive, and the ISC toolbox is equipped with mechanisms to execute the parallel computations in a cluster environment automatically and with automatic detection of the cluster environment in use. Currently, SGE-based (Oracle Grid Engine, Son of a Grid Engine or Open Grid Scheduler) and Slurm environments are supported. In this paper, we present a detailed account of the methods behind the ISC Toolbox, the implementation of the toolbox and demonstrate the possible use of the toolbox by summarizing selected example applications. We also report computation time experiments both using a single desktop computer and two grid environments, demonstrating that parallelization effectively reduces the computing time. The ISC Toolbox is available at https://code.google.com/p/isc-toolbox/.

  6. A versatile software package for inter-subject correlation based analyses of fMRI.

    Science.gov (United States)

    Kauppi, Jukka-Pekka; Pajula, Juha; Tohka, Jussi

    2014-01-01

    In the inter-subject correlation (ISC) based analysis of functional magnetic resonance imaging (fMRI) data, the extent of shared processing across subjects during the experiment is determined by calculating correlation coefficients between the fMRI time series of the subjects in the corresponding brain locations. This implies that ISC can be used to analyze fMRI data without explicitly modeling the stimulus, and thus ISC is a potential method to analyze fMRI data acquired under complex naturalistic stimuli. Despite the suitability of the ISC-based approach to analyze complex fMRI data, no generic software tools have been made available for this purpose, limiting the widespread use of ISC-based analysis techniques in the neuroimaging community. In this paper, we present a graphical user interface (GUI) based software package, ISC Toolbox, implemented in Matlab for computing various ISC-based analyses. Many advanced computations such as comparison of ISCs between different stimuli, time window ISC, and inter-subject phase synchronization are supported by the toolbox. The analyses are coupled with re-sampling based statistical inference. The ISC-based analyses are data and computation intensive, and the ISC toolbox is equipped with mechanisms to execute the parallel computations in a cluster environment automatically and with automatic detection of the cluster environment in use. Currently, SGE-based (Oracle Grid Engine, Son of a Grid Engine, or Open Grid Scheduler) and Slurm environments are supported. In this paper, we present a detailed account of the methods behind the ISC Toolbox, the implementation of the toolbox and demonstrate the possible use of the toolbox by summarizing selected example applications. We also report computation time experiments both using a single desktop computer and two grid environments, demonstrating that parallelization effectively reduces the computing time. The ISC Toolbox is available at https://code.google.com/p/isc-toolbox/
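
    At its core, voxelwise ISC is the average pairwise correlation of the subjects' time series at each voxel. A minimal NumPy sketch on random placeholder data (the Matlab toolbox itself offers far more, including statistical inference):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(12)
n_subj, n_time, n_vox = 10, 200, 500
data = rng.normal(size=(n_subj, n_time, n_vox))   # placeholder fMRI time series

# z-score each subject's time series per voxel (population std, so that the
# time-average of z products equals the Pearson correlation)
z = (data - data.mean(axis=1, keepdims=True)) / data.std(axis=1, keepdims=True)

pairs = list(combinations(range(n_subj), 2))
isc = np.mean([(z[i] * z[j]).mean(axis=0) for i, j in pairs], axis=0)  # (n_vox,)
```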

  7. Artificial Neural Network and Multinomial Logistic Regression

    African Journals Online (AJOL)

    This work presented the results of an experimental comparison of two models: Multinomial Logistic Regression (MLR) and Artificial Neural Network (ANN) for classifying students based on their academic performance. The predictive accuracy for each model was measured by their average Classification Correct Rate (CCR).

  8. A spline-based regression parameter set for creating customized DARTEL MRI brain templates from infancy to old age

    Directory of Open Access Journals (Sweden)

    Marko Wilke

    2018-02-01

    Full Text Available This dataset contains the regression parameters derived by analyzing segmented brain MRI images (gray matter and white matter) from a large population of healthy subjects, using a multivariate adaptive regression splines approach. A total of 1919 MRI datasets ranging in age from 1–75 years from four publicly available datasets (NIH, C-MIND, fCONN, and IXI) were segmented using the CAT12 segmentation framework, writing out gray matter and white matter images normalized using an affine-only spatial normalization approach. These images were then subjected to a six-step DARTEL procedure, employing an iterative non-linear registration approach and yielding increasingly crisp intermediate images. The resulting six datasets per tissue class were then analyzed using multivariate adaptive regression splines, using the CerebroMatic toolbox. This approach allows for flexibly modelling smoothly varying trajectories while taking into account demographic (age, gender) as well as technical (field strength, data quality) predictors. The resulting regression parameters described here can be used to generate matched DARTEL or SHOOT templates for a given population under study, from infancy to old age. The dataset and the algorithm used to generate it are publicly available at https://irc.cchmc.org/software/cerebromatic.php. Keywords: MRI template creation, Multivariate adaptive regression splines, DARTEL, Structural MRI

  9. Comparison of two regression-based approaches for determining nutrient and sediment fluxes and trends in the Chesapeake Bay watershed

    Science.gov (United States)

    Moyer, Douglas; Hirsch, Robert M.; Hyer, Kenneth

    2012-01-01

    Nutrient and sediment fluxes and changes in fluxes over time are key indicators that water resource managers can use to assess the progress being made in improving the structure and function of the Chesapeake Bay ecosystem. The U.S. Geological Survey collects annual nutrient (nitrogen and phosphorus) and sediment flux data and computes trends that describe the extent to which water-quality conditions are changing within the major Chesapeake Bay tributaries. Two regression-based approaches were compared for estimating annual nutrient and sediment fluxes and for characterizing how these annual fluxes are changing over time. The two regression models compared are the traditionally used ESTIMATOR and the newly developed Weighted Regression on Time, Discharge, and Season (WRTDS). The model comparison focused on answering three questions: (1) What are the differences between the functional form and construction of each model? (2) Which model produces estimates of flux with the greatest accuracy and least amount of bias? (3) How different would the historical estimates of annual flux be if WRTDS had been used instead of ESTIMATOR? One additional point of comparison between the two models is how each model determines trends in annual flux once the year-to-year variations in discharge have been determined. All comparisons were made using total nitrogen, nitrate, total phosphorus, orthophosphorus, and suspended-sediment concentration data collected at the nine U.S. Geological Survey River Input Monitoring stations located on the Susquehanna, Potomac, James, Rappahannock, Appomattox, Pamunkey, Mattaponi, Patuxent, and Choptank Rivers in the Chesapeake Bay watershed. Two model characteristics that uniquely distinguish ESTIMATOR and WRTDS are the fundamental model form and the determination of model coefficients. ESTIMATOR and WRTDS both predict water-quality constituent concentration by developing a linear relation between the natural logarithm of observed constituent
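
    As background to the comparison, the traditional ESTIMATOR approach regresses the natural logarithm of concentration on discharge, time, and seasonal terms, commonly with the seven-parameter form ln C = b0 + b1 ln Q + b2 (ln Q)^2 + b3 T + b4 T^2 + b5 sin(2πT) + b6 cos(2πT), whereas WRTDS re-estimates such a relation locally with weights in time, discharge, and season. Below is a hedged sketch of fitting the fixed-coefficient ESTIMATOR-style model by least squares; all data and coefficients are invented for illustration.

```python
import numpy as np

def estimator_design(t_years, q):
    """Design matrix of a seven-parameter ESTIMATOR-style model:
    ln C = b0 + b1*lnQ + b2*lnQ^2 + b3*T + b4*T^2 + b5*sin(2 pi T) + b6*cos(2 pi T)."""
    lnq = np.log(q)
    return np.column_stack([
        np.ones_like(t_years), lnq, lnq**2, t_years, t_years**2,
        np.sin(2 * np.pi * t_years), np.cos(2 * np.pi * t_years),
    ])

# synthetic daily record: 10 years of discharge and log-concentration
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 3650)              # decimal years
q = np.exp(rng.normal(3.0, 0.8, t.size))      # discharge, lognormal
ln_c = 0.5 + 0.6 * np.log(q) - 0.02 * t + 0.3 * np.sin(2 * np.pi * t) \
       + rng.normal(0.0, 0.3, t.size)

beta, *_ = np.linalg.lstsq(estimator_design(t, q), ln_c, rcond=None)
print("fitted coefficients:", np.round(beta, 3))
```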

  10. Analysing task design and students' responses to context-based problems through different analytical frameworks

    Science.gov (United States)

    Broman, Karolina; Bernholt, Sascha; Parchmann, Ilka

    2015-05-01

    Background: Context-based learning approaches are used to enhance students' interest in, and knowledge about, science. According to different empirical studies, students' interest is improved by applying these more non-conventional approaches, while effects on learning outcomes are less coherent. Hence, further insights are needed into the structure of context-based problems in comparison to traditional problems, and into students' problem-solving strategies. Therefore, a suitable framework is necessary, both for the analysis of tasks and strategies. Purpose: The aim of this paper is to explore traditional and context-based tasks as well as students' responses to exemplary tasks to identify a suitable framework for future design and analyses of context-based problems. The paper discusses different established frameworks and applies the Higher-Order Cognitive Skills/Lower-Order Cognitive Skills (HOCS/LOCS) taxonomy and the Model of Hierarchical Complexity in Chemistry (MHC-C) to analyse traditional tasks and students' responses. Sample: Upper secondary students (n=236) at the Natural Science Programme, i.e. possible future scientists, are investigated to explore learning outcomes when they solve chemistry tasks, both more conventional as well as context-based chemistry problems. Design and methods: A typical chemistry examination test has been analysed, first the test items in themselves (n=36), and thereafter 236 students' responses to one representative context-based problem. Content analysis using HOCS/LOCS and MHC-C frameworks has been applied to analyse both quantitative and qualitative data, allowing us to describe different problem-solving strategies. Results: The empirical results show that both frameworks are suitable to identify students' strategies, mainly focusing on recall of memorized facts when solving chemistry test items. Almost all test items were also assessing lower order thinking. The combination of frameworks with the chemistry syllabus has been

  11. A protein relational database and protein family knowledge bases to facilitate structure-based design analyses.

    Science.gov (United States)

    Mobilio, Dominick; Walker, Gary; Brooijmans, Natasja; Nilakantan, Ramaswamy; Denny, R Aldrin; Dejoannis, Jason; Feyfant, Eric; Kowticwar, Rupesh K; Mankala, Jyoti; Palli, Satish; Punyamantula, Sairam; Tatipally, Maneesh; John, Reji K; Humblet, Christine

    2010-08-01

    The Protein Data Bank is the most comprehensive source of experimental macromolecular structures. It can, however, be difficult at times to locate relevant structures with the Protein Data Bank search interface. This is particularly true when searching for complexes containing specific interactions between protein and ligand atoms. Moreover, searching within a family of proteins can be tedious. For example, one cannot search for some conserved residue as residue numbers vary across structures. We describe herein three databases, Protein Relational Database, Kinase Knowledge Base, and Matrix Metalloproteinase Knowledge Base, containing protein structures from the Protein Data Bank. In Protein Relational Database, atom-atom distances between protein and ligand have been precalculated allowing for millisecond retrieval based on atom identity and distance constraints. Ring centroids, centroid-centroid and centroid-atom distances and angles have also been included permitting queries for pi-stacking interactions and other structural motifs involving rings. Other geometric features can be searched through the inclusion of residue pair and triplet distances. In Kinase Knowledge Base and Matrix Metalloproteinase Knowledge Base, the catalytic domains have been aligned into common residue numbering schemes. Thus, by searching across Protein Relational Database and Kinase Knowledge Base, one can easily retrieve structures wherein, for example, a ligand of interest is making contact with the gatekeeper residue.

  12. Minimax Regression Quantiles

    DEFF Research Database (Denmark)

    Bache, Stefan Holst

    A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is, however, a different and therefore new estimator. It allows for both linear and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice, but whether it has theoretical justification is still an open question.
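
    For orientation, the usual quantile regression estimator that the minimax estimator is benchmarked against minimizes the asymmetric "pinball" loss. The short sketch below uses statsmodels' QuantReg on synthetic heteroscedastic data; it illustrates the standard estimator, not the minimax variant proposed in the abstract.

```python
import numpy as np
import statsmodels.api as sm

# synthetic data whose noise grows with x, so the quantile slopes differ
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 500)
y = 1.0 + 0.5 * x + (0.2 + 0.1 * x) * rng.standard_normal(500)

X = sm.add_constant(x)
for q in (0.1, 0.5, 0.9):
    res = sm.QuantReg(y, X).fit(q=q)  # minimizes the pinball loss at quantile q
    print(f"tau={q}: intercept={res.params[0]:.3f}, slope={res.params[1]:.3f}")
```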

  13. Estimation of effective block conductivities based on discrete network analyses using data from the Aespoe site

    International Nuclear Information System (INIS)

    La Pointe, P.R.; Wallmann, P.; Follin, S.

    1995-09-01

    Numerical continuum codes may be used for assessing the role of regional groundwater flow in far-field safety analyses of a nuclear waste repository at depth. The focus of this project is to develop and evaluate one method based on Discrete Fracture Network (DFN) models to estimate block-scale permeability values for continuum codes. Data from the Aespoe HRL and surrounding area are used. 57 refs, 76 figs, 15 tabs

  14. Differentiating regressed melanoma from regressed lichenoid keratosis.

    Science.gov (United States)

    Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A

    2017-04-01

    Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% of regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  15. Comparison of stochastic and regression based methods for quantification of predictive uncertainty of model-simulated wellhead protection zones in heterogeneous aquifers

    DEFF Research Database (Denmark)

    Christensen, Steen; Moore, C.; Doherty, J.

    2006-01-01

    For a synthetic case we computed three types of individual prediction intervals for the location of the aquifer entry point of a particle that moves through a heterogeneous aquifer and ends up in a pumping well. (a) The nonlinear regression-based interval (Cooley, 2004) was found to be nearly accurate and required a few hundred model calls to be computed. (b) The linearized regression-based interval (Cooley, 2004) required just over a hundred model calls and also appeared to be nearly correct. (c) The calibration-constrained Monte-Carlo interval (Doherty, 2003) was found to be narrower than the regression-based intervals but required about half a million model calls. It is unclear whether or not this type of prediction interval is accurate.

  16. GIS-based spatial regression and prediction of water quality in river networks: A case study in Iowa

    Science.gov (United States)

    Yang, X.; Jin, W.

    2010-01-01

    Nonpoint source pollution is the leading cause of the U.S.'s water quality problems. One important component of nonpoint source pollution control is an understanding of what and how watershed-scale conditions influence ambient water quality. This paper investigated the use of spatial regression to evaluate the impacts of watershed characteristics on stream NO3NO2-N concentration in the Cedar River Watershed, Iowa. An Arc Hydro geodatabase was constructed to organize various datasets on the watershed. Spatial regression models were developed to evaluate the impacts of watershed characteristics on stream NO3NO2-N concentration and predict NO3NO2-N concentration at unmonitored locations. Unlike the traditional ordinary least squares (OLS) method, the spatial regression method incorporates the potential spatial correlation among the observations in its coefficient estimation. Study results show that NO3NO2-N observations in the Cedar River Watershed are spatially correlated, and by ignoring the spatial correlation, the OLS method tends to over-estimate the impacts of watershed characteristics on stream NO3NO2-N concentration. In conjunction with kriging, the spatial regression method not only makes better stream NO3NO2-N concentration predictions than the OLS method, but also gives estimates of the uncertainty of the predictions, which provides useful information for optimizing the design of the stream monitoring network. It is a promising tool for better managing and controlling nonpoint source pollution. © 2010 Elsevier Ltd.
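
    The methodological point, that coefficient estimation should account for spatially correlated errors, can be illustrated with a generalized least squares fit under an assumed exponential covariance model. The covariance parameters, covariate, and data below are invented for illustration and are not from the study.

```python
import numpy as np

def gls_fit(X, y, coords, sill=1.0, corr_range=5.0, nugget=0.1):
    """Generalized least squares with an exponential spatial covariance.

    Cov(e_i, e_j) = nugget*1(i==j) + sill*exp(-d_ij / corr_range).
    Returns the GLS coefficient estimates (X'C^-1 X)^-1 X'C^-1 y.
    """
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    C = sill * np.exp(-d / corr_range) + nugget * np.eye(len(y))
    Ci_X = np.linalg.solve(C, X)
    Ci_y = np.linalg.solve(C, y)
    return np.linalg.solve(X.T @ Ci_X, X.T @ Ci_y)

# synthetic monitoring sites: concentration driven by one land-use covariate
# (the toy noise here is independent; real data would be spatially correlated)
rng = np.random.default_rng(3)
coords = rng.uniform(0, 50, (80, 2))
landuse = rng.uniform(0, 1, 80)
X = np.column_stack([np.ones(80), landuse])
y = 2.0 + 3.0 * landuse + rng.standard_normal(80)
print("GLS coefficients:", gls_fit(X, y, coords))
```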

  17. A spline-based regression parameter set for creating customized DARTEL MRI brain templates from infancy to old age.

    Science.gov (United States)

    Wilke, Marko

    2018-02-01

    This dataset contains the regression parameters derived by analyzing segmented brain MRI images (gray matter and white matter) from a large population of healthy subjects, using a multivariate adaptive regression splines approach. A total of 1919 MRI datasets ranging in age from 1-75 years from four publicly available datasets (NIH, C-MIND, fCONN, and IXI) were segmented using the CAT12 segmentation framework, writing out gray matter and white matter images normalized using an affine-only spatial normalization approach. These images were then subjected to a six-step DARTEL procedure, employing an iterative non-linear registration approach and yielding increasingly crisp intermediate images. The resulting six datasets per tissue class were then analyzed using multivariate adaptive regression splines, using the CerebroMatic toolbox. This approach allows for flexibly modelling smoothly varying trajectories while taking into account demographic (age, gender) as well as technical (field strength, data quality) predictors. The resulting regression parameters described here can be used to generate matched DARTEL or SHOOT templates for a given population under study, from infancy to old age. The dataset and the algorithm used to generate it are publicly available at https://irc.cchmc.org/software/cerebromatic.php.
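
    Multivariate adaptive regression splines build piecewise-linear "hinge" basis functions of the form max(0, x - k) and max(0, k - x). The sketch below fits such a basis with fixed, hand-picked knots by ordinary least squares on a synthetic age trajectory; the CerebroMatic toolbox itself selects knots adaptively and uses real predictors, so this only illustrates the basis construction.

```python
import numpy as np

def hinge_basis(age, knots):
    """MARS-style hinge functions max(0, age-k) and max(0, k-age) for fixed knots."""
    cols = [np.ones_like(age)]
    for k in knots:
        cols.append(np.maximum(0.0, age - k))
        cols.append(np.maximum(0.0, k - age))
    return np.column_stack(cols)

# synthetic tissue-intensity trajectory over ages 1-75 with a bend around 20
rng = np.random.default_rng(4)
age = rng.uniform(1, 75, 400)
signal = 0.8 + 0.02 * np.minimum(age, 20) - 0.003 * np.maximum(age - 20, 0)
y = signal + 0.02 * rng.standard_normal(400)

B = hinge_basis(age, knots=[20.0, 50.0])
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
print("basis coefficients:", np.round(coef, 4))
```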

  18. On the use of a regression model for trend estimates from ground-based atmospheric observations in the Southern hemisphere

    CSIR Research Space (South Africa)

    Bencherif, H

    2010-09-01

    Full Text Available The present paper reports on the use of a multi-regression model adapted at Reunion University for temperature and ozone trend estimates. Depending on the location of the observing site, the studied geophysical signal is broken down in the form of a sum...

  19. Physical characterization of biomass-based pyrolysis liquids. Application of standard fuel oil analyses

    Energy Technology Data Exchange (ETDEWEB)

    Oasmaa, A; Leppaemaeki, E; Koponen, P; Levander, J; Tapola, E [VTT Energy, Espoo (Finland). Energy Production Technologies]

    1998-12-31

    The main purpose of the study was to test the applicability of standard fuel oil methods developed for petroleum-based fuels to pyrolysis liquids. In addition, research on sampling, homogeneity, stability, miscibility and corrosivity was carried out. The standard methods have been tested for several different pyrolysis liquids. Recommendations on sampling, sample size and small modifications of standard methods are presented. In general, most of the methods can be used as such, but the accuracy of the analysis can be improved by minor modifications. Fuel oil analyses not suitable for pyrolysis liquids have been identified. Homogeneity of the liquids is the most critical factor in accurate analysis. The presence of air bubbles may disturb several analyses. Sample preheating and prefiltration should be avoided when possible. The former may cause changes in the composition and structure of the pyrolysis liquid. The latter may remove part of the organic material with the particles. The size of the sample should be determined on the basis of the homogeneity and the water content of the liquid. The basic analyses of the Technical Research Centre of Finland (VTT) include water, pH, solids, ash, Conradson carbon residue, heating value, CHN, density, viscosity, pour point, flash point, and stability. Additional analyses are carried out when needed. (orig.) 53 refs.

  20. A regression-based method for mapping traffic-related air pollution. Application and testing in four contrasting urban environments

    International Nuclear Information System (INIS)

    Briggs, D.J.; De Hoogh, C.; Elliot, P.; Gulliver, J.; Wills, J.; Kingham, S.; Smallbone, K.

    2000-01-01

    Accurate, high-resolution maps of traffic-related air pollution are needed both as a basis for assessing exposures as part of epidemiological studies, and to inform urban air-quality policy and traffic management. This paper assesses the use of a GIS-based, regression mapping technique to model spatial patterns of traffic-related air pollution. The model - developed using data from 80 passive sampler sites in Huddersfield, as part of the SAVIAH (Small Area Variations in Air Quality and Health) project - uses data on traffic flows and land cover in the 300-m buffer zone around each site, and altitude of the site, as predictors of NO2 concentrations. It was tested here by application in four urban areas in the UK: Huddersfield (for the year following that used for initial model development), Sheffield, Northampton, and part of London. In each case, a GIS was built in ArcInfo, integrating relevant data on road traffic, urban land use and topography. Monitoring of NO2 was undertaken using replicate passive samplers (in London, data were obtained from surveys carried out as part of the London network). In Huddersfield, Sheffield and Northampton, the model was first calibrated by comparing modelled results with monitored NO2 concentrations at 10 randomly selected sites; the calibrated model was then validated against data from a further 10-28 sites. In London, where data for only 11 sites were available, validation was not undertaken. Results showed that the model performed well in all cases. After local calibration, the model gave estimates of mean annual NO2 concentrations within a factor of 1.5 of the actual mean approximately 70-90% of the time, and within a factor of 2 between 70 and 100% of the time. r2 values between modelled and observed concentrations are in the range of 0.58-0.76. These results are comparable to those achieved by more sophisticated dispersion models. The model also has several advantages over dispersion modelling. It is able, for example, to
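
    The validation statistic quoted above, the fraction of sites where the modelled value falls within a given factor of the observed value, is straightforward to compute. A small sketch with invented monitoring numbers:

```python
import numpy as np

def fraction_within_factor(observed, modelled, factor):
    """Share of sites where modelled lies within the given factor of observed."""
    ratio = modelled / observed
    return np.mean((ratio >= 1.0 / factor) & (ratio <= factor))

obs = np.array([28.0, 35.0, 41.0, 22.0, 30.0, 55.0])   # monitored NO2, ug/m3
mod = np.array([31.0, 30.0, 52.0, 20.0, 44.0, 50.0])   # regression-mapped NO2
for f in (1.5, 2.0):
    print(f"within factor {f}: {fraction_within_factor(obs, mod, f):.0%}")
```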

  1. A New Optimization Method for Centrifugal Compressors Based on 1D Calculations and Analyses

    Directory of Open Access Journals (Sweden)

    Pei-Yuan Li

    2015-05-01

    Full Text Available This paper presents an optimization design method for centrifugal compressors based on one-dimensional calculations and analyses. It consists of two parts: (1) centrifugal compressor geometry optimization based on one-dimensional calculations and (2) matching optimization of the vaned diffuser with an impeller based on the required throat area. A low pressure stage centrifugal compressor in a MW level gas turbine is optimized by this method. One-dimensional calculation results show that D3/D2 is too large in the original design, resulting in the low efficiency of the entire stage. Based on the one-dimensional optimization results, the geometry of the diffuser has been redesigned. The outlet diameter of the vaneless diffuser has been reduced, and the original single stage diffuser has been replaced by a tandem vaned diffuser. After optimization, the entire stage pressure ratio is increased by approximately 4%, and the efficiency is increased by approximately 2%.

  2. FY01 Supplemental Science and Performance Analysis: Volume 1, Scientific Bases and Analyses

    International Nuclear Information System (INIS)

    Bodvarsson, G.S.; Dobson, David

    2001-01-01

    The U.S. Department of Energy (DOE) is considering the possible recommendation of a site at Yucca Mountain, Nevada, for development as a geologic repository for the disposal of high-level radioactive waste and spent nuclear fuel. To facilitate public review and comment, in May 2001 the DOE released the Yucca Mountain Science and Engineering Report (S and ER) (DOE 2001 [DIRS 153849]), which presents technical information supporting the consideration of the possible site recommendation. The report summarizes the results of more than 20 years of scientific and engineering studies. A decision to recommend the site has not been made: the DOE has provided the S and ER and its supporting documents as an aid to the public in formulating comments on the possible recommendation. When the S and ER (DOE 2001 [DIRS 153849]) was released, the DOE acknowledged that technical and scientific analyses of the site were ongoing. Therefore, the DOE noted in the Federal Register Notice accompanying the report (66 FR 23013 [DIRS 155009], p. 2) that additional technical information would be released before the dates, locations, and times for public hearings on the possible recommendation were announced. This information includes: (1) the results of additional technical studies of a potential repository at Yucca Mountain, contained in this FY01 Supplemental Science and Performance Analyses: Vol. 1, Scientific Bases and Analyses; and FY01 Supplemental Science and Performance Analyses: Vol. 2, Performance Analyses (McNeish 2001 [DIRS 155023]) (collectively referred to as the SSPA) and (2) a preliminary evaluation of the Yucca Mountain site's preclosure and postclosure performance against the DOE's proposed site suitability guidelines (10 CFR Part 963 [64 FR 67054 [DIRS 124754

  3. FY01 Supplemental Science and Performance Analysis: Volume 1,Scientific Bases and Analyses

    Energy Technology Data Exchange (ETDEWEB)

    Bodvarsson, G.S.; Dobson, David

    2001-05-30

    The U.S. Department of Energy (DOE) is considering the possible recommendation of a site at Yucca Mountain, Nevada, for development as a geologic repository for the disposal of high-level radioactive waste and spent nuclear fuel. To facilitate public review and comment, in May 2001 the DOE released the Yucca Mountain Science and Engineering Report (S&ER) (DOE 2001 [DIRS 153849]), which presents technical information supporting the consideration of the possible site recommendation. The report summarizes the results of more than 20 years of scientific and engineering studies. A decision to recommend the site has not been made: the DOE has provided the S&ER and its supporting documents as an aid to the public in formulating comments on the possible recommendation. When the S&ER (DOE 2001 [DIRS 153849]) was released, the DOE acknowledged that technical and scientific analyses of the site were ongoing. Therefore, the DOE noted in the Federal Register Notice accompanying the report (66 FR 23013 [DIRS 155009], p. 2) that additional technical information would be released before the dates, locations, and times for public hearings on the possible recommendation were announced. This information includes: (1) the results of additional technical studies of a potential repository at Yucca Mountain, contained in this FY01 Supplemental Science and Performance Analyses: Vol. 1, Scientific Bases and Analyses; and FY01 Supplemental Science and Performance Analyses: Vol. 2, Performance Analyses (McNeish 2001 [DIRS 155023]) (collectively referred to as the SSPA) and (2) a preliminary evaluation of the Yucca Mountain site's preclosure and postclosure performance against the DOE's proposed site suitability guidelines (10 CFR Part 963 [64 FR 67054 [DIRS 124754

  4. Issues and approaches in risk-based aging analyses of passive components

    International Nuclear Information System (INIS)

    Uryasev, S.P.; Samanta, P.K.; Vesely, W.E.

    1994-01-01

    In previous NRC-sponsored work, a general methodology was developed to quantify the risk contributions from aging components at nuclear plants. The methodology allowed Probabilistic Risk Analyses (PRAs) to be modified to incorporate age-dependent component failure rates, as well as aging maintenance models, to evaluate and prioritize the aging contributions from active components using the linear aging failure rate model and empirical component aging rates. In the present paper, this methodology is extended to passive components (for example, the pipes, heat exchangers, and the vessel). The analyses of passive components bring in issues different from active components. Here, we specifically focus on three aspects that need to be addressed in risk-based aging prioritization of passive components.
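
    For context, the linear aging failure rate model referred to above takes the failure rate as lambda(t) = lambda0 + a*t, so the time-averaged rate over an interval grows linearly with component age. A minimal sketch with illustrative rate constants; the numbers are invented, not values from the referenced work.

```python
# Minimal sketch of the linear aging failure rate model, lambda(t) = lambda0 + a*t.
lambda0 = 1.0e-5   # baseline failure rate per hour (illustrative)
a = 2.0e-9         # aging rate per hour^2 (illustrative)

def average_rate(t_start, t_end):
    """Time-averaged failure rate over [t_start, t_end] for the linear model."""
    return lambda0 + a * (t_start + t_end) / 2.0

for years in (5, 10, 20):
    hours = years * 8760.0
    print(f"{years:>2} yr: average rate = {average_rate(0.0, hours):.3e} /h")
```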

  5. DESIGNING EAP MATERIALS BASED ON INTERCULTURAL CORPUS ANALYSES: THE CASE OF LOGICAL MARKERS IN RESEARCH ARTICLES

    Directory of Open Access Journals (Sweden)

    Pilar Mur Dueñas

    2009-10-01

    Full Text Available The ultimate aim of intercultural analyses in English for Academic Purposes is to help non-native scholars function successfully in the international disciplinary community in English. The aim of this paper is to show how corpus-based intercultural analyses can be useful to design EAP materials on a particular metadiscourse category, logical markers, in research article writing. The paper first describes the analysis of additive, contrastive and consecutive logical markers carried out in a corpus of research articles in English and in Spanish in a particular discipline, Business Management. Differences were found in their frequency and also in the use of each of the sub-categories. Then, five activities designed on the basis of these results are presented. They are aimed at raising Spanish Business scholars' awareness of the specific uses and pragmatic function of frequent logical markers in international research articles in English.

  6. Integrated approach for fusion multi-physics coupled analyses based on hybrid CAD and mesh geometries

    Energy Technology Data Exchange (ETDEWEB)

    Qiu, Yuefeng, E-mail: yuefeng.qiu@kit.edu; Lu, Lei; Fischer, Ulrich

    2015-10-15

    Highlights: • Integrated approach for neutronics, thermal and structural analyses was developed. • MCNP5/6, TRIPOLI-4 were coupled with CFX, Fluent and ANSYS Workbench. • A novel meshing approach has been proposed for describing MC geometry. - Abstract: Coupled multi-physics analyses of fusion reactor devices require high-fidelity neutronic models and flexible, accurate data exchange between various calculation codes. An integrated coupling approach has been developed to enable the conversion of CAD, mesh, or hybrid geometries for the Monte Carlo (MC) codes MCNP5/6 and TRIPOLI-4, and the translation of nuclear heating data for the CFD codes Fluent and CFX and the structural mechanical software ANSYS Workbench. The coupling approach has been implemented on the SALOME platform with CAD modeling, mesh generation and data visualization capabilities. A novel meshing approach has been developed for generating suitable meshes for MC geometry descriptions. Verification calculations on several application cases showed the coupling approach to be reliable and efficient.

  7. Tracing common origins of Genomic Islands in prokaryotes based on genome signature analyses.

    Science.gov (United States)

    van Passel, Mark Wj

    2011-09-01

    Horizontal gene transfer constitutes a powerful and innovative force in evolution, but often little is known about the actual origins of transferred genes. Sequence alignments are generally of limited use in tracking the original donor, since still only a small fraction of the total genetic diversity is thought to be uncovered. Alternatively, approaches based on similarities in the genome-specific relative oligonucleotide frequencies do not require alignments. Even though the exact origins of horizontally transferred genes may still not be established using these compositional analyses, they do suggest that compositionally very similar regions are likely to have had a common origin. These analyses have shown that up to a third of large acquired gene clusters that reside in the same genome are compositionally very similar, indicative of a shared origin. This brings us closer to uncovering the original donors of horizontally transferred genes, and could help in elucidating possible regulatory interactions between previously unlinked sequences.

  8. Regression: A Bibliography.

    Science.gov (United States)

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has been the subject of much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  9. Regression-Based Norms for a Bi-factor Model for Scoring the Brief Test of Adult Cognition by Telephone (BTACT).

    Science.gov (United States)

    Gurnani, Ashita S; John, Samantha E; Gavett, Brandon E

    2015-05-01

    The current study developed regression-based normative adjustments for a bi-factor model of the Brief Test of Adult Cognition by Telephone (BTACT). Archival data from the Midlife Development in the United States-II Cognitive Project were used to develop eight separate linear regression models that predicted bi-factor BTACT scores, accounting for age, education, gender, and occupation, alone and in various combinations. All regression models provided statistically significant fit to the data. A three-predictor regression model fit best and accounted for 32.8% of the variance in the global bi-factor BTACT score. The fit of the regression models was not improved by gender. Eight different regression models are presented to allow the user flexibility in applying demographic corrections to the bi-factor BTACT scores. Occupation corrections, while not widely used, may provide useful demographic adjustments for adult populations or for those individuals who have attained an occupational status not commensurate with expected educational attainment. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
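
    Regression-based norming of this kind converts a raw score into a demographically adjusted z-score: the regression predicts the expected score from demographics, and the residual is scaled by the residual standard deviation. A sketch with invented coefficients; the published BTACT model parameters are not reproduced here.

```python
# Illustrative regression-based norming; all coefficients below are invented.
b0, b_age, b_edu = 1.20, -0.015, 0.060   # intercept, age and education weights
sd_resid = 0.85                          # SD of the regression residuals

def adjusted_z(raw_score, age, education_years):
    """Demographically adjusted z-score: (observed - predicted) / residual SD."""
    predicted = b0 + b_age * age + b_edu * education_years
    return (raw_score - predicted) / sd_resid

print(round(adjusted_z(raw_score=0.9, age=62, education_years=16), 2))
```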

  10. Features of Computer-Based Decision Aids: Systematic Review, Thematic Synthesis, and Meta-Analyses.

    Science.gov (United States)

    Syrowatka, Ania; Krömker, Dörthe; Meguerditchian, Ari N; Tamblyn, Robyn

    2016-01-26

    Patient information and education, such as decision aids, are gradually moving toward online, computer-based environments. Considerable research has been conducted to guide content and presentation of decision aids. However, given the relatively new shift to computer-based support, little attention has been given to how multimedia and interactivity can improve upon paper-based decision aids. The first objective of this review was to summarize published literature into a proposed classification of features that have been integrated into computer-based decision aids. Building on this classification, the second objective was to assess whether integration of specific features was associated with higher-quality decision making. Relevant studies were located by searching MEDLINE, Embase, CINAHL, and CENTRAL databases. The review identified studies that evaluated computer-based decision aids for adults faced with preference-sensitive medical decisions and reported quality of decision-making outcomes. A thematic synthesis was conducted to develop the classification of features. Subsequently, meta-analyses were conducted based on standardized mean differences (SMD) from randomized controlled trials (RCTs) that reported knowledge or decisional conflict. Further subgroup analyses compared pooled SMDs for decision aids that incorporated a specific feature to other computer-based decision aids that did not incorporate the feature, to assess whether specific features improved quality of decision making. Of 3541 unique publications, 58 studies met the target criteria and were included in the thematic synthesis. The synthesis identified six features: content control, tailoring, patient narratives, explicit values clarification, feedback, and social support. A subset of 26 RCTs from the thematic synthesis was used to conduct the meta-analyses. As expected, computer-based decision aids performed better than usual care or alternative aids; however, some features performed better than
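
    The meta-analytic machinery described, pooling standardized mean differences across RCTs, can be sketched with fixed-effect inverse-variance weighting; the per-trial SMDs and variances below are invented for illustration and are not the review's data.

```python
import numpy as np

def pooled_smd(smds, variances):
    """Fixed-effect inverse-variance pooling of standardized mean differences."""
    w = 1.0 / np.asarray(variances)
    est = np.sum(w * np.asarray(smds)) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return est, (est - 1.96 * se, est + 1.96 * se)

# illustrative per-trial SMDs (e.g., knowledge gain) and their variances
smds = [0.42, 0.31, 0.55, 0.20]
variances = [0.020, 0.015, 0.040, 0.010]
est, ci = pooled_smd(smds, variances)
print(f"pooled SMD = {est:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```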

  11. Features of Computer-Based Decision Aids: Systematic Review, Thematic Synthesis, and Meta-Analyses

    Science.gov (United States)

    Krömker, Dörthe; Meguerditchian, Ari N; Tamblyn, Robyn

    2016-01-01

    Background Patient information and education, such as decision aids, are gradually moving toward online, computer-based environments. Considerable research has been conducted to guide content and presentation of decision aids. However, given the relatively new shift to computer-based support, little attention has been given to how multimedia and interactivity can improve upon paper-based decision aids. Objective The first objective of this review was to summarize published literature into a proposed classification of features that have been integrated into computer-based decision aids. Building on this classification, the second objective was to assess whether integration of specific features was associated with higher-quality decision making. Methods Relevant studies were located by searching MEDLINE, Embase, CINAHL, and CENTRAL databases. The review identified studies that evaluated computer-based decision aids for adults faced with preference-sensitive medical decisions and reported quality of decision-making outcomes. A thematic synthesis was conducted to develop the classification of features. Subsequently, meta-analyses were conducted based on standardized mean differences (SMD) from randomized controlled trials (RCTs) that reported knowledge or decisional conflict. Further subgroup analyses compared pooled SMDs for decision aids that incorporated a specific feature to other computer-based decision aids that did not incorporate the feature, to assess whether specific features improved quality of decision making. Results Of 3541 unique publications, 58 studies met the target criteria and were included in the thematic synthesis. The synthesis identified six features: content control, tailoring, patient narratives, explicit values clarification, feedback, and social support. A subset of 26 RCTs from the thematic synthesis was used to conduct the meta-analyses. As expected, computer-based decision aids performed better than usual care or alternative aids; however

  12. Genomic-Enabled Prediction Based on Molecular Markers and Pedigree Using the Bayesian Linear Regression Package in R

    Directory of Open Access Journals (Sweden)

    Paulino Pérez

    2010-09-01

    Full Text Available The availability of dense molecular markers has made possible the use of genomic selection in plant and animal breeding. However, models for genomic selection pose several computational and statistical challenges and require specialized computer programs, not always available to the end user and not implemented in standard statistical software yet. The R-package BLR (Bayesian Linear Regression) implements several statistical procedures (e.g., Bayesian Ridge Regression, Bayesian LASSO) in a unified framework that allows including marker genotypes and pedigree data jointly. This article describes the classes of models implemented in the BLR package and illustrates their use through examples. Some challenges faced when applying genomic-enabled selection, such as model choice, evaluation of predictive ability through cross-validation, and choice of hyper-parameters, are also addressed.
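
    BLR itself is an R package, but the flavor of one of its model classes, Bayesian ridge regression on marker genotypes with cross-validated predictive ability, can be sketched in Python with scikit-learn's BayesianRidge. This is an analogous illustration, not the BLR implementation, and the toy genotype data are synthetic.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.model_selection import cross_val_score

# synthetic genomic-selection toy data: 200 lines x 500 biallelic markers
rng = np.random.default_rng(5)
X = rng.integers(0, 3, size=(200, 500)).astype(float)   # 0/1/2 genotype codes
true_effects = rng.normal(0.0, 0.05, 500)
y = X @ true_effects + rng.normal(0.0, 1.0, 200)        # phenotype

model = BayesianRidge()
r = cross_val_score(model, X, y, cv=5, scoring="r2")
print("cross-validated R^2 per fold:", np.round(r, 2))
```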

  13. Non-Invasive Methodology to Estimate Polyphenol Content in Extra Virgin Olive Oil Based on Stepwise Multilinear Regression.

    Science.gov (United States)

    Martínez Gila, Diego Manuel; Cano Marchal, Pablo; Gómez Ortega, Juan; Gámez García, Javier

    2018-03-25

    Normally, olive oil quality is assessed by chemical analysis according to international standards. These norms define chemical and organoleptic markers, and depending on the markers, the olive oil can be labelled as lampante, virgin, or extra virgin olive oil (EVOO), the last being an indicator of top quality. The polyphenol content is related to EVOO organoleptic features, and different scientific works have studied the positive influence that these compounds have on human health. The work carried out in this paper focuses on studying relations between the polyphenol content in olive oil samples and their spectral response in the near infrared spectrum. In this context, several acquisition parameters have been assessed to optimize the measurement process within the virgin olive oil production process. The best regression model reached a mean error value of 156.14 mg/kg in leave-one-out cross validation, and the highest regression coefficient was 0.81 through holdout validation.
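
    Stepwise multilinear regression of the kind used here can be illustrated with greedy forward selection: repeatedly add the predictor (here, a wavelength) that most reduces the residual sum of squares of an ordinary least squares fit. A sketch on synthetic NIR-like data; the selection criterion, stopping rule, and data are assumptions, not the authors' exact procedure.

```python
import numpy as np

def forward_stepwise(X, y, n_select):
    """Greedy forward selection: repeatedly add the column that most
    reduces the residual sum of squares of an OLS fit."""
    selected = []
    for _ in range(n_select):
        best, best_rss = None, np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            cols = selected + [j]
            A = np.column_stack([np.ones(len(y)), X[:, cols]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = np.sum((y - A @ beta) ** 2)
            if rss < best_rss:
                best, best_rss = j, rss
        selected.append(best)
    return selected

# toy NIR-like data: 100 samples x 50 wavelengths, 3 informative bands
rng = np.random.default_rng(6)
X = rng.standard_normal((100, 50))
y = 2.0 * X[:, 5] - 1.5 * X[:, 20] + 0.8 * X[:, 33] + 0.3 * rng.standard_normal(100)
print("selected wavelength indices:", forward_stepwise(X, y, 3))
```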

  14. Non-Invasive Methodology to Estimate Polyphenol Content in Extra Virgin Olive Oil Based on Stepwise Multilinear Regression

    Directory of Open Access Journals (Sweden)

    Diego Manuel Martínez Gila

    2018-03-01

    Full Text Available Normally, olive oil quality is assessed by chemical analysis according to international standards. These norms define chemical and organoleptic markers, and depending on the markers, the olive oil can be labelled as lampante, virgin, or extra virgin olive oil (EVOO), the last being an indicator of top quality. The polyphenol content is related to EVOO organoleptic features, and different scientific works have studied the positive influence that these compounds have on human health. The work carried out in this paper focuses on studying relations between the polyphenol content in olive oil samples and their spectral response in the near infrared spectrum. In this context, several acquisition parameters have been assessed to optimize the measurement process within the virgin olive oil production process. The best regression model reached a mean error value of 156.14 mg/kg in leave-one-out cross validation, and the highest regression coefficient was 0.81 through holdout validation.

  15. A web-based endpoint adjudication system for interim analyses in clinical trials.

    Science.gov (United States)

    Nolen, Tracy L; Dimmick, Bill F; Ostrosky-Zeichner, Luis; Kendrick, Amy S; Sable, Carole; Ngai, Angela; Wallace, Dennis

    2009-02-01

    A data monitoring committee (DMC) is often employed to assess trial progress and review safety data and efficacy endpoints throughout a trial. Interim analyses performed for the DMC should use data that are as complete and verified as possible. Such analyses are complicated when data verification involves subjective study endpoints or requires clinical expertise to determine each subject's status with respect to the study endpoint. Therefore, procedures are needed to obtain adjudicated data for interim analyses in an efficient manner. In the past, methods for handling such data included using locally reported results as surrogate endpoints, adjusting analysis methods for unadjudicated data, or simply performing the adjudication as rapidly as possible. These methods all have inadequacies that make their sole usage suboptimal. For a study of prophylaxis for invasive candidiasis, adjudication of both study eligibility criteria and clinical endpoints prior to two interim analyses was required. Because the study was expected to enroll at a moderate rate and the sponsor required adjudicated endpoints to be used for interim analyses, an efficient process for adjudication was required. We created a web-based endpoint adjudication system (WebEAS) that allows for expedited review by the endpoint adjudication committee (EAC). This system automatically identifies when a subject's data are complete, creates a subject profile from the study data, and assigns EAC reviewers. The reviewers use the WebEAS to review the subject profile and submit their completed review form. The WebEAS then compares the reviews, assigns an additional review as a tiebreaker if needed, and stores the adjudicated data. The study for which this system was originally built was administratively closed after 10 months with only 38 subjects enrolled. The adjudication process was finalized and the WebEAS system activated prior to study closure. Some website accessibility issues presented initially. However

  16. Determination of the spatial response of neutron based analysers using a Monte Carlo based method

    International Nuclear Information System (INIS)

    Tickner, James

    2000-01-01

    One of the principal advantages of using thermal neutron capture (TNC, also called prompt gamma neutron activation analysis or PGNAA) or neutron inelastic scattering (NIS) techniques for measuring elemental composition is the high penetrating power of both the incident neutrons and the resultant gamma-rays, which means that large sample volumes can be interrogated. Gauges based on these techniques are widely used in the mineral industry for on-line determination of the composition of bulk samples. However, attenuation of both neutrons and gamma-rays in the sample and geometric (source/detector distance) effects typically result in certain parts of the sample contributing more to the measured composition than others. In turn, this introduces errors in the determination of the composition of inhomogeneous samples. This paper discusses a combined Monte Carlo/analytical method for estimating the spatial response of a neutron gauge. Neutron propagation is handled using a Monte Carlo technique which allows an arbitrarily complex neutron source and gauge geometry to be specified. Gamma-ray production and detection is calculated analytically which leads to a dramatic increase in the efficiency of the method. As an example, the method is used to study ways of reducing the spatial sensitivity of on-belt composition measurements of cement raw meal

  17. CrusView: a Java-based visualization platform for comparative genomics analyses in Brassicaceae species.

    Science.gov (United States)

    Chen, Hao; Wang, Xiangfeng

    2013-09-01

    In plants and animals, chromosomal breakage and fusion events based on conserved syntenic genomic blocks lead to conserved patterns of karyotype evolution among species of the same family. However, karyotype information has not been well utilized in genomic comparison studies. We present CrusView, a Java-based bioinformatic application utilizing Standard Widget Toolkit/Swing graphics libraries and a SQLite database for performing visualized analyses of comparative genomics data in Brassicaceae (crucifer) plants. Compared with similar software and databases, one of the unique features of CrusView is its integration of karyotype information when comparing two genomes. This feature allows users to perform karyotype-based genome assembly and karyotype-assisted genome synteny analyses with preset karyotype patterns of the Brassicaceae genomes. Additionally, CrusView is a local program, which gives its users high flexibility when analyzing unpublished genomes and allows users to upload self-defined genomic information so that they can visually study the associations between genome structural variations and genetic elements, including chromosomal rearrangements, genomic macrosynteny, gene families, high-frequency recombination sites, and tandem and segmental duplications between related species. This tool will greatly facilitate karyotype, chromosome, and genome evolution studies using visualized comparative genomics approaches in Brassicaceae species. CrusView is freely available at http://www.cmbb.arizona.edu/CrusView/.

  18. Better Autologistic Regression

    Directory of Open Access Journals (Sweden)

    Mark A. Wolters

    2017-11-01

    Full Text Available Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine) to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding—the two numbers used to represent the two possible states of the variables—might differ. Common coding choices are (zero, one) and (minus one, plus one). Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
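
    A common way to fit autologistic models is maximum pseudolikelihood, and the coding choice enters directly into the conditional probabilities. For the plus/minus-coded (symmetric) model favored above, P(z_i | rest) = 1 / (1 + exp(-2 z_i eta_i)) with eta_i = x_i' beta + lambda * (sum of neighboring z values). A minimal sketch of the resulting objective on a toy graph; the data, graph, and parameter values are invented for illustration.

```python
import numpy as np

def neg_log_pseudolikelihood(z, X, adj, beta, lam):
    """Negative log-pseudolikelihood of a plus/minus-coded autologistic model.

    z: responses in {-1, +1}; X: covariate matrix; adj: binary adjacency
    matrix; beta: covariate effects; lam: pairwise dependence parameter.
    Conditional on its neighbours, P(z_i | rest) = 1 / (1 + exp(-2 z_i eta_i))
    with eta_i = x_i'beta + lam * (sum of neighbouring z values).
    """
    eta = X @ beta + lam * (adj @ z)
    return np.sum(np.log1p(np.exp(-2.0 * z * eta)))

# toy 4-cycle graph with one covariate
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
X = np.array([[0.2], [-0.1], [0.4], [0.0]])
z = np.array([1.0, -1.0, 1.0, 1.0])
print(neg_log_pseudolikelihood(z, X, adj, beta=np.array([0.5]), lam=0.3))
```

    Minimizing this objective over beta and lam (e.g., with a generic optimizer) yields the pseudolikelihood estimates for the chosen coding.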

  19. A Game-based Corpus for Analysing the Interplay between Game Context and Player Experience

    DEFF Research Database (Denmark)

    Shaker, Noor; Yannakakis, Georgios N.; Asteriadis, Stylianos

    2011-01-01

    Recognizing players' affective state while playing video games has been the focus of many recent research studies. In this paper we describe the process that has been followed to build a corpus based on game events and recorded video sessions from human players while playing Super Mario Bros. We present different types of information that have been extracted from game context, player preferences and perception of the game, as well as user features, automatically extracted from video recordings. We run a number of initial experiments to analyse players' behavior while playing video games as a case...

  20. SU-G-BRA-08: Diaphragm Motion Tracking Based On KV CBCT Projections with a Constrained Linear Regression Optimization

    Energy Technology Data Exchange (ETDEWEB)

    Wei, J [City College of New York, New York, NY (United States)]; Chao, M [The Mount Sinai Medical Center, New York, NY (United States)]

    2016-06-15

    Purpose: To develop a novel strategy to extract the respiratory motion of the thoracic diaphragm from kilovoltage cone beam computed tomography (CBCT) projections by a constrained linear regression optimization technique. Methods: A parabolic function was identified as the geometric model and was employed to fit the shape of the diaphragm on the CBCT projections. The search was initialized by five manually placed seeds on a pre-selected projection image. Temporal redundancies inherent in the dynamic properties of the diaphragm motion, the enabling phenomenology in video compression and encoding techniques, were integrated with the geometrical shape of the diaphragm boundary and an associated algebraic constraint that significantly reduced the search space of viable parabolic parameters; the resulting problem can be effectively optimized by a constrained linear regression approach on the subsequent projections. The innovative algebraic constraints stipulating the kinetic range of the motion, and the spatial constraint preventing any unphysical deviations, were able to obtain the optimal contour of the diaphragm with minimal initialization. The algorithm was assessed by a fluoroscopic movie acquired at a fixed anterior-posterior direction and kilovoltage CBCT projection image sets from four lung and two liver patients. The automatic tracing by the proposed algorithm and manual tracking by a human operator were compared in both the space and frequency domains. Results: The error between the estimated and manual detections for the fluoroscopic movie was 0.54 mm with a standard deviation (SD) of 0.45 mm, while the average error for the CBCT projections was 0.79 mm with an SD of 0.64 mm for all enrolled patients. The submillimeter accuracy outcome exhibits the promise of the proposed constrained linear regression approach to track the diaphragm motion on rotational projection images. Conclusion: The new algorithm will provide a potential solution to rendering diaphragm motion and ultimately
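
    Fitting a parabola v = a*u^2 + b*u + c to candidate edge points with box constraints on the coefficients is a constrained linear least-squares problem. Below is a hedged sketch using scipy.optimize.lsq_linear on invented points; the bounds here are illustrative stand-ins for the paper's algebraic and spatial constraints, not the published values.

```python
import numpy as np
from scipy.optimize import lsq_linear

# candidate diaphragm edge points (u, v) on one projection; invented numbers
u = np.linspace(-40.0, 40.0, 21)                       # lateral position, px
v_true = 0.012 * u**2 + 0.1 * u + 250.0                # parabolic boundary
v = v_true + np.random.default_rng(7).normal(0, 1.0, u.size)

# model v = a*u^2 + b*u + c with box constraints keeping the fit physical
A = np.column_stack([u**2, u, np.ones_like(u)])
res = lsq_linear(A, v, bounds=([0.0, -1.0, 200.0], [0.05, 1.0, 300.0]))
print("a, b, c =", np.round(res.x, 4))
```

    In a frame-to-frame tracker, the bounds for each new projection would be tightened around the previous frame's fit, which is one way to exploit the temporal redundancy the abstract describes.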

  1. SU-G-BRA-08: Diaphragm Motion Tracking Based On KV CBCT Projections with a Constrained Linear Regression Optimization

    International Nuclear Information System (INIS)

    Wei, J; Chao, M

    2016-01-01

    Purpose: To develop a novel strategy to extract the respiratory motion of the thoracic diaphragm from kilovoltage cone beam computed tomography (CBCT) projections by a constrained linear regression optimization technique. Methods: A parabolic function was identified as the geometric model and was employed to fit the shape of the diaphragm on the CBCT projections. The search was initialized by five manually placed seeds on a pre-selected projection image. Temporal redundancies inherent in the dynamic properties of the diaphragm motion, the enabling phenomenology in video compression and encoding techniques, were integrated with the geometrical shape of the diaphragm boundary and an associated algebraic constraint that significantly reduced the search space of viable parabolic parameters; the resulting problem can be effectively optimized by a constrained linear regression approach on the subsequent projections. The innovative algebraic constraints stipulating the kinetic range of the motion, and the spatial constraint preventing any unphysical deviations, were able to obtain the optimal contour of the diaphragm with minimal initialization. The algorithm was assessed by a fluoroscopic movie acquired at a fixed anterior-posterior direction and kilovoltage CBCT projection image sets from four lung and two liver patients. The automatic tracing by the proposed algorithm and manual tracking by a human operator were compared in both the space and frequency domains. Results: The error between the estimated and manual detections for the fluoroscopic movie was 0.54 mm with a standard deviation (SD) of 0.45 mm, while the average error for the CBCT projections was 0.79 mm with an SD of 0.64 mm for all enrolled patients. The submillimeter accuracy outcome exhibits the promise of the proposed constrained linear regression approach to track the diaphragm motion on rotational projection images. Conclusion: The new algorithm will provide a potential solution to rendering diaphragm motion and ultimately

  2. Individual-based analyses reveal limited functional overlap in a coral reef fish community.

    Science.gov (United States)

    Brandl, Simon J; Bellwood, David R

    2014-05-01

    Detailed knowledge of a species' functional niche is crucial for the study of ecological communities and processes. The extent of niche overlap, functional redundancy and functional complementarity is of particular importance if we are to understand ecosystem processes and their vulnerability to disturbances. Coral reefs are among the most threatened marine systems, and anthropogenic activity is changing the functional composition of reefs. The loss of herbivorous fishes is particularly concerning as the removal of algae is crucial for the growth and survival of corals. Yet, the foraging patterns of the various herbivorous fish species are poorly understood. Using a multidimensional framework, we present novel individual-based analyses of species' realized functional niches, which we apply to a herbivorous coral reef fish community. In calculating niche volumes for 21 species, based on their microhabitat utilization patterns during foraging, and computing functional overlaps, we provide a measurement of functional redundancy or complementarity. Complementarity is the inverse of redundancy and is defined as less than 50% overlap in niche volumes. The analyses reveal extensive complementarity with an average functional overlap of just 15.2%. Furthermore, the analyses divide herbivorous reef fishes into two broad groups. The first group (predominantly surgeonfishes and parrotfishes) comprises species feeding on exposed surfaces and predominantly open reef matrix or sandy substrata, resulting in small niche volumes and extensive complementarity. In contrast, the second group consists of species (predominantly rabbitfishes) that feed over a wider range of microhabitats, penetrating the reef matrix to exploit concealed surfaces of various substratum types. These species show high variation among individuals, leading to large niche volumes, more overlap and less complementarity. These results may have crucial consequences for our understanding of herbivorous processes on

  3. Generalized regression neural network (GRNN)-based approach for colored dissolved organic matter (CDOM) retrieval: case study of Connecticut River at Middle Haddam Station, USA.

    Science.gov (United States)

    Heddam, Salim

    2014-11-01

    The prediction of colored dissolved organic matter (CDOM) using artificial neural network approaches has received little attention in the past few decades. In this study, colored dissolved organic matter (CDOM) was modeled using generalized regression neural network (GRNN) and multiple linear regression (MLR) models as a function of water temperature (TE), pH, specific conductance (SC), and turbidity (TU). Evaluation of the prediction accuracy of the models is based on the root mean square error (RMSE), mean absolute error (MAE), coefficient of correlation (CC), and Willmott's index of agreement (d). The results indicated that GRNN can be applied successfully for prediction of colored dissolved organic matter (CDOM).
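
    A generalized regression neural network is, in essence, Nadaraya-Watson kernel regression: the prediction is a Gaussian-kernel-weighted average of the training targets, with the kernel width as the only tuning parameter. A minimal sketch with a synthetic stand-in for the water-quality predictors; the function name, data, and smoothing parameter are invented for illustration.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma=0.5):
    """GRNN prediction: a Gaussian-kernel weighted average of training
    targets (the Nadaraya-Watson form with a single smoothing parameter)."""
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma**2))
    return (w @ y_train) / w.sum(axis=1)

# toy stand-in for CDOM vs. temperature/pH/conductance/turbidity predictors
rng = np.random.default_rng(8)
X = rng.uniform(0, 1, (150, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.standard_normal(150)
Xq = rng.uniform(0, 1, (5, 4))
print(np.round(grnn_predict(X, y, Xq), 3))
```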

  4. Genome-based comparative analyses of Antarctic and temperate species of Paenibacillus.

    Directory of Open Access Journals (Sweden)

    Melissa Dsouza

    Full Text Available Antarctic soils represent a unique environment characterised by extremes of temperature, salinity, elevated UV radiation, low nutrient and low water content. Despite the harshness of this environment, members of 15 bacterial phyla have been identified in soils of the Ross Sea Region (RSR). However, the survival mechanisms and ecological roles of these phyla are largely unknown. The aim of this study was to investigate whether strains of Paenibacillus darwinianus owe their resilience to substantial genomic changes. For this, genome-based comparative analyses were performed on three P. darwinianus strains, isolated from gamma-irradiated RSR soils, together with nine temperate, soil-dwelling Paenibacillus spp. The genome of each strain was sequenced to over 1,000-fold coverage, then assembled into contigs totalling approximately 3 Mbp per genome. Based on the occurrence of essential, single-copy genes, genome completeness was estimated at approximately 88%. Genome analysis revealed between 3,043 and 3,091 protein-coding sequences (CDSs), primarily associated with two-component systems, sigma factors, transporters, sporulation and genes induced by cold-shock, oxidative and osmotic stresses. These comparative analyses provide an insight into the metabolic potential of P. darwinianus, revealing potential adaptive mechanisms for survival in Antarctic soils. However, a large proportion of these mechanisms were also identified in temperate Paenibacillus spp., suggesting that these mechanisms are beneficial for growth and survival in a range of soil environments. These analyses have also revealed that the P. darwinianus genomes contain significantly fewer CDSs and have a lower paralogous content. Notwithstanding the incompleteness of the assemblies, the large differences in genome sizes, determined by the number of genes in paralogous clusters and the CDS content, are indicative of genome content scaling. Finally, these sequences are a resource for further

  5. MULTI-DIMENSIONAL MASS SPECTROMETRY-BASED SHOTGUN LIPIDOMICS AND NOVEL STRATEGIES FOR LIPIDOMIC ANALYSES

    Science.gov (United States)

    Han, Xianlin; Yang, Kui; Gross, Richard W.

    2011-01-01

    Since our last comprehensive review on multi-dimensional mass spectrometry-based shotgun lipidomics (Mass Spectrom. Rev. 24 (2005), 367), many new developments in the field of lipidomics have occurred. These developments include new strategies and refinements for shotgun lipidomic approaches that use direct infusion, including novel fragmentation strategies, identification of multiple new informative dimensions for mass spectrometric interrogation, and the development of new bioinformatic approaches for enhanced identification and quantitation of the individual molecular constituents that comprise each cell’s lipidome. Concurrently, advances in liquid chromatography-based platforms and novel strategies for quantitative matrix-assisted laser desorption/ionization mass spectrometry for lipidomic analyses have been developed. Through the synergistic use of this repertoire of new mass spectrometric approaches, the power and scope of lipidomics has been greatly expanded to accelerate progress toward the comprehensive understanding of the pleiotropic roles of lipids in biological systems. PMID:21755525

  6. An Integrated Software Suite for Surface-based Analyses of Cerebral Cortex

    Science.gov (United States)

    Van Essen, David C.; Drury, Heather A.; Dickson, James; Harwell, John; Hanlon, Donna; Anderson, Charles H.

    2001-01-01

    The authors describe and illustrate an integrated trio of software programs for carrying out surface-based analyses of cerebral cortex. The first component of this trio, SureFit (Surface Reconstruction by Filtering and Intensity Transformations), is used primarily for cortical segmentation, volume visualization, surface generation, and the mapping of functional neuroimaging data onto surfaces. The second component, Caret (Computerized Anatomical Reconstruction and Editing Tool Kit), provides a wide range of surface visualization and analysis options as well as capabilities for surface flattening, surface-based deformation, and other surface manipulations. The third component, SuMS (Surface Management System), is a database and associated user interface for surface-related data. It provides for efficient insertion, searching, and extraction of surface and volume data from the database. PMID:11522765

  7. An integrated software suite for surface-based analyses of cerebral cortex

    Science.gov (United States)

    Van Essen, D. C.; Drury, H. A.; Dickson, J.; Harwell, J.; Hanlon, D.; Anderson, C. H.

    2001-01-01

    The authors describe and illustrate an integrated trio of software programs for carrying out surface-based analyses of cerebral cortex. The first component of this trio, SureFit (Surface Reconstruction by Filtering and Intensity Transformations), is used primarily for cortical segmentation, volume visualization, surface generation, and the mapping of functional neuroimaging data onto surfaces. The second component, Caret (Computerized Anatomical Reconstruction and Editing Tool Kit), provides a wide range of surface visualization and analysis options as well as capabilities for surface flattening, surface-based deformation, and other surface manipulations. The third component, SuMS (Surface Management System), is a database and associated user interface for surface-related data. It provides for efficient insertion, searching, and extraction of surface and volume data from the database.

  8. The Seismic Reliability of Offshore Structures Based on Nonlinear Time History Analyses

    International Nuclear Information System (INIS)

    Hosseini, Mahmood; Karimiyani, Somayyeh; Ghafooripour, Amin; Jabbarzadeh, Mohammad Javad

    2008-01-01

    Regarding past earthquake damage to offshore structures, as vital structures in the oil and gas industries, it is important that their seismic design is performed with very high reliability. Accepting the Nonlinear Time History Analyses (NLTHA) as the most reliable seismic analysis method, in this paper an offshore platform of jacket type with the height of 304 feet, having a deck of 96 feet by 94 feet, and weighing 290 million pounds has been studied. At first, some Push-Over Analyses (POA) have been performed to recognize the more critical members of the jacket, based on the range of their plastic deformations. Then NLTHA have been performed by using the 3-component accelerograms of 100 earthquakes, covering a wide range of frequency content, and normalized to three Peak Ground Acceleration (PGA) levels of 0.3 g, 0.65 g, and 1.0 g. By using the results of NLTHA the damage and rupture probabilities of critical members have been studied to assess the reliability of the jacket structure. Regarding that different structural members of the jacket have different effects on the stability of the platform, an "importance factor" has been considered for each critical member based on its location and orientation in the structure, and then the reliability of the whole structure has been obtained by combining the reliability of the critical members, each having its specific importance factor

  9. Analyses of criticality and reactivity for TRACY experiments based on JENDL-3.3 data library

    International Nuclear Information System (INIS)

    Sono, Hiroki; Miyoshi, Yoshinori; Nakajima, Ken

    2003-01-01

    The parameters on criticality and reactivity employed for computational simulations of the TRACY supercritical experiments were analyzed using a recently revised nuclear data library, JENDL-3.3. The parameters based on the JENDL-3.3 library were compared to those based on two formerly used libraries, JENDL-3.2 and ENDF/B-VI. In the analyses, the computational codes MVP, MCNP version 4C and TWOTRAN were used. The following conclusions were obtained from the analyses: (1) The computational biases of the effective neutron multiplication factor attributable to the nuclear data libraries and to the computational codes do not depend on the TRACY experimental conditions such as fuel conditions. (2) The fractional discrepancies in the kinetic parameters and coefficients of reactivity are within ∼5% between the three libraries. By comparison between calculations and measurements of the parameters, the JENDL-3.3 library is expected to give closer values to the measurements than the JENDL-3.2 and ENDF/B-VI libraries. (3) While the reactivity worth of transient rods expressed in the $ unit shows ∼5% discrepancy between the three libraries according to their respective βeff values, there is little discrepancy in that expressed in the Δk/k unit. (author)

  10. Novel citation-based search method for scientific literature: application to meta-analyses.

    Science.gov (United States)

    Janssens, A Cecile J W; Gwinn, M

    2015-10-13

    Finding eligible studies for meta-analysis and systematic reviews relies on keyword-based searching as the gold standard, despite its inefficiency. Searching based on direct citations is not sufficiently comprehensive. We propose a novel strategy that ranks articles on their degree of co-citation with one or more "known" articles before reviewing their eligibility. In two independent studies, we aimed to reproduce the results of literature searches for sets of published meta-analyses (n = 10 and n = 42). For each meta-analysis, we extracted co-citations for the randomly selected 'known' articles from the Web of Science database, counted their frequencies and screened all articles with a score above a selection threshold. In the second study, we extended the method by retrieving direct citations for all selected articles. In the first study, we retrieved 82% of the studies included in the meta-analyses while screening only 11% as many articles as were screened for the original publications. Articles that we missed were published in non-English languages, published before 1975, published very recently, or available only as conference abstracts. In the second study, we retrieved 79% of included studies while screening half the original number of articles. Citation searching appears to be an efficient and reasonably accurate method for finding articles similar to one or more articles of interest for meta-analysis and reviews.
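
    To make the ranking step concrete, the toy sketch below scores candidate articles by how often they are co-cited with a set of "known" articles and shortlists those above a selection threshold. The citation graph, threshold, and function names are hypothetical; a real run would pull reference lists from the Web of Science, as the abstract describes.

```python
# Illustrative co-citation ranking on a toy citation graph (hypothetical data).
from collections import Counter

def cocitation_scores(citing_to_refs, known_articles):
    """Count how often each candidate is cited together with a known article."""
    scores = Counter()
    known = set(known_articles)
    for refs in citing_to_refs.values():
        refs = set(refs)
        if refs & known:                  # this paper cites >=1 known article
            for candidate in refs - known:
                scores[candidate] += 1    # co-cited once more
    return scores

citing_to_refs = {                        # toy graph: citing paper -> references
    "P1": ["A", "B", "C"],
    "P2": ["A", "C", "D"],
    "P3": ["B", "E"],
}
scores = cocitation_scores(citing_to_refs, known_articles=["A"])
threshold = 1
shortlist = [art for art, s in scores.most_common() if s >= threshold]
print(shortlist)  # articles to screen for eligibility, e.g. ['C', 'B', 'D']
```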

  11. Benefits of Exercise Training For Computer-Based Staff: A Meta Analyses

    Directory of Open Access Journals (Sweden)

    Mothna Mohammed

    2017-04-01

    Full Text Available Background: Office workers sit down to work for approximately 8 hours a day and, as a result, many of them do not have enough time for any form of physical exercise. This can lead to musculoskeletal discomforts, especially low back pain, and recently many researchers have focused on home/office-based exercise training for the prevention/treatment of low back pain among this population. Objective: This meta-analysis discusses the latest suggested exercises for office workers based on the mechanisms and theories behind low back pain among office workers. Method: In this meta-analysis, the author collected relevant papers published previously on the subject. Google Scholar, Scopus, and PubMed were used as sources to find the articles. Only articles that were published using the same methodology, including office workers, musculoskeletal discomforts, low back pain, and exercise training keywords, were selected. Studies that failed to report sufficient sample statistics, or lacked a substantial review of past academic scholarship and/or clear methodologies, were excluded. Results: Limited evidence regarding the prevention of, and treatment methods for, musculoskeletal discomfort, especially those in the low back, among office workers, is available. The findings showed that training exercises had a significant effect (p<0.05) on low back pain discomfort scores and decreased pain levels in response to office-based exercise training. Conclusion: Office-based exercise training can affect pain/discomfort scores among office workers through positive effects on flexibility and strength of muscles. As such, it should be suggested to occupational therapists as a practical way for the treatment/prevention of low back pain among office workers.

  12. Advances in simultaneous atmospheric profile and cloud parameter regression based retrieval from high-spectral resolution radiance measurements

    Science.gov (United States)

    Weisz, Elisabeth; Smith, William L.; Smith, Nadia

    2013-06-01

    The dual-regression (DR) method retrieves information about the Earth surface and vertical atmospheric conditions from measurements made by any high-spectral resolution infrared sounder in space. The retrieved information includes temperature and atmospheric gases (such as water vapor, ozone, and carbon species) as well as surface and cloud top parameters. The algorithm was designed to produce a high-quality product with low latency and has been demonstrated to yield accurate results in real-time environments. The speed of the retrieval is achieved through linear regression, while accuracy is achieved through a series of classification schemes and decision-making steps. These steps are necessary to account for the nonlinearity of hyperspectral retrievals. In this work, we detail the key steps that have been developed in the DR method to advance accuracy in the retrieval of nonlinear parameters, specifically cloud top pressure. The steps and their impact on retrieval results are discussed in-depth and illustrated through relevant case studies. In addition to discussing and demonstrating advances made in addressing nonlinearity in a linear geophysical retrieval method, advances toward multi-instrument geophysical analysis by applying the DR to three different operational sounders in polar orbit are also noted. For any area on the globe, the DR method achieves consistent accuracy and precision, making it potentially very valuable to both the meteorological and environmental user communities.

  13. Spatial prediction of landslides using a hybrid machine learning approach based on Random Subspace and Classification and Regression Trees

    Science.gov (United States)

    Pham, Binh Thai; Prakash, Indra; Tien Bui, Dieu

    2018-02-01

    A hybrid machine learning approach of Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. This model is a combination of the RSS method, which is known as an efficient ensemble technique, and CART, which is a state-of-the-art classifier. The Luc Yen district of Yen Bai province, a prominent landslide-prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models, namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In the development of the model, ten important landslide-affecting factors related to geomorphology, geology and geo-environment were considered, namely slope angle, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Performance of the RSSCART model (AUC = 0.841) is the best compared with the other popular landslide models, namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that RSSCART is a promising method for spatial landslide prediction.
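
    A rough scikit-learn analogue of the RSSCART idea is sketched below: a random-subspace ensemble (feature sampling without row bootstrap) over CART trees, scored by AUC. The synthetic factors and ensemble settings are assumptions, and the `estimator` keyword assumes scikit-learn >= 1.2.

```python
# Random-subspace ensemble of CART trees, a rough RSSCART analogue.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))            # stand-ins for slope, elevation, ...
y = (X[:, 0] + 0.5 * X[:, 4] + rng.normal(scale=0.8, size=500) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

rsscart = BaggingClassifier(
    estimator=DecisionTreeClassifier(),   # CART base classifier
    n_estimators=100,
    max_features=0.5,                     # each tree sees a random subspace
    bootstrap=False,                      # subspace method: no row bootstrap
    random_state=1,
)
rsscart.fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, rsscart.predict_proba(X_te)[:, 1]))
```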

  14. Modeling Typhoon Event-Induced Landslides Using GIS-Based Logistic Regression: A Case Study of Alishan Forestry Railway, Taiwan

    Directory of Open Access Journals (Sweden)

    Sheng-Chuan Chen

    2013-01-01

    Full Text Available This study develops a model for evaluating the hazard level of landslides at Alishan Forestry Railway, Taiwan, by using logistic regression with the assistance of a geographical information system (GIS). A typhoon event-induced landslide inventory, independent variables, and a triggering factor were used to build the model. The environmental factors such as bedrock lithology from the geology database; topographic aspect, terrain roughness, profile curvature, and distance to river, from the topographic database; and the vegetation index value from SPOT 4 satellite images were used as variables that influence landslide occurrence. The area under curve (AUC) of a receiver operator characteristic (ROC) curve was used to validate the model. Effects of parameters on landslide occurrence were assessed from the corresponding coefficient that appears in the logistic regression function. Thereafter, the model was applied to predict the probability of landslides for rainfall data of different return periods. Using a predicted map of probability, the study area was classified into four ranks of landslide susceptibility: low, medium, high, and very high. As a result, most high susceptibility areas are located on the western portion of the study area. Several train stations and railways are located on sites with a high susceptibility ranking.
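
    The modelling step reduces to fitting a logistic regression on raster-derived factors and binning the predicted probabilities into four susceptibility ranks, as the hedged sketch below shows. Variable names, factor effects, and the probability cut points are illustrative only.

```python
# Logistic-regression susceptibility mapping with four probability ranks.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
X = rng.normal(size=(1000, 6))    # stand-ins for lithology, aspect, curvature...
y = (1.2 * X[:, 0] - 0.8 * X[:, 2] + rng.logistic(size=1000) > 0).astype(int)

model = LogisticRegression().fit(X, y)   # cell-wise landslide occurrence
p = model.predict_proba(X)[:, 1]         # probability per raster cell

# Classify into low / medium / high / very high susceptibility (assumed cuts)
ranks = np.digitize(p, bins=[0.25, 0.50, 0.75])
labels = np.array(["low", "medium", "high", "very high"])[ranks]
print(dict(zip(*np.unique(labels, return_counts=True))))
```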

  15. riskRegression

    DEFF Research Database (Denmark)

    Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas

    2017-01-01

    In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface. ... As a by-product we obtain fast access to the baseline hazards (compared to survival::basehaz()) and predictions of survival probabilities, their confidence intervals and confidence bands. Confidence intervals and confidence bands are based on point-wise asymptotic expansions of the corresponding statistical ...

  16. Reviewing PSA-based analyses to modify technical specifications at nuclear power plants

    International Nuclear Information System (INIS)

    Samanta, P.K.; Martinez-Guridi, G.; Vesely, W.E.

    1995-12-01

    Changes to Technical Specifications (TSs) at nuclear power plants (NPPs) require review and approval by the United States Nuclear Regulatory Commission (USNRC). Currently, many requests for changes to TSs use analyses that are based on a plant's probabilistic safety assessment (PSA). This report presents an approach to reviewing such PSA-based submittals for changes to TSs. We discuss the basic objectives of reviewing a PSA-based submittal to modify NPP TSs, the methodology of reviewing a TS submittal, and the differing roles of a PSA review, a PSA Computer Code review, and a review of a TS submittal. To illustrate this approach, we discuss our review of changes to allowed outage time (AOT) and surveillance test interval (STI) in the TS for the South Texas Project Nuclear Generating Station. Based on the experience gained, a check-list of items is given for future reviewers; it can be used to verify that the submittal contains sufficient information, and also that the review has addressed the relevant issues. Finally, recommended steps in the review process and the expected findings of each step are discussed

  17. Performance Analyses of Renewable and Fuel Power Supply Systems for Different Base Station Sites

    Directory of Open Access Journals (Sweden)

    Josip Lorincz

    2014-11-01

    Full Text Available Base station sites (BSSs) powered with renewable energy sources have gained the attention of cellular operators during the last few years. This is because such “green” BSSs impose significant reductions in the operational expenditures (OPEX) of telecom operators due to the possibility of on-site renewable energy harvesting. In this paper, the green BSS power supply system parameters detected through remote and centralized real-time sensing are presented. An implemented sensing system based on a wireless sensor network enables reliable collection and post-processing analyses of many parameters, such as: total charging/discharging current of the power supply system, battery voltage and temperature, wind speed, etc. As an example, yearly sensing results for three different BSS configurations powered by solar and/or wind energy are discussed in terms of renewable energy supply (RES) system performance. In the case of powering those BSSs with standalone systems based on a fuel generator, fuel consumption models expressing the interdependence between generator load and fuel consumption are proposed. This has allowed an energy-efficiency comparison of the fuel-powered and RES systems, which is presented in terms of the OPEX and carbon dioxide (CO2) reductions. Additionally, approaches based on different BSS air-conditioning systems and the on/off regulation of daily fuel generator activity are proposed and validated in terms of energy and capital expenditure (CAPEX) savings.

  18. Hierarchical cluster-based partial least squares regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models.

    Science.gov (United States)

    Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald

    2011-06-01

    Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. HC-PLSR is a promising approach for
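
    The clustering-then-local-regression structure is easy to emulate. The sketch below partitions the input space and fits one PLS model per cluster; it substitutes plain KMeans for the paper's fuzzy C-means (scikit-learn ships no fuzzy variant), so it is a simplified illustration rather than HC-PLSR itself, and all data are synthetic.

```python
# Simplified HC-PLSR-style metamodel: cluster inputs, fit local PLS models.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(3)
X = rng.uniform(-2, 2, size=(600, 5))            # model parameters (inputs)
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=600)  # output feature

km = KMeans(n_clusters=4, n_init=10, random_state=3).fit(X)
local_models = {}
for c in range(km.n_clusters):
    mask = km.labels_ == c
    local_models[c] = PLSRegression(n_components=3).fit(X[mask], y[mask])

def predict(X_new):
    """Route each point to its cluster's local PLS model."""
    labels = km.predict(X_new)
    out = np.empty(len(X_new))
    for c, model in local_models.items():
        m = labels == c
        if m.any():
            out[m] = model.predict(X_new[m]).ravel()
    return out

print(predict(X[:5]), y[:5])   # local predictions vs. true outputs
```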

  19. Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR is an efficient tool for metamodelling of nonlinear dynamic models

    Directory of Open Access Journals (Sweden)

    Omholt Stig W

    2011-06-01

    Full Text Available Abstract Background Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops

  20. Research on the psychological gap, personality and achievement of in-school youth based on regression analysis

    Directory of Open Access Journals (Sweden)

    Lei Zhou

    2017-02-01

    Full Text Available Although our society is developing rapidly, psychological problems among university students are increasingly evident. To address this problem, this study applied the Psychological Gap Scale developed by Caixia Ma together with the EPQ and AMS, and administered a random questionnaire survey to 400 students at a comprehensive university. The survey found that the average scores of all psychological gap dimensions exceeded the critical value of 3, showing that most students at that university experience a psychological gap. Their personality stability, introversion and extroversion are all above the national norm level, while their stubbornness is below it. Besides, the students' motivation to pursue success is stronger than their motivation to avoid failure. Finally, through a regression analysis model and a study of the quantitative relations among personality, achievement and psychological gap, this thesis concludes that personality has a great impact on the students' psychology.

  1. Estimating severity of sideways fall using a generic multi linear regression model based on kinematic input variables.

    Science.gov (United States)

    van der Zijden, A M; Groen, B E; Tanck, E; Nienhuis, B; Verdonschot, N; Weerdesteyn, V

    2017-03-21

    Many research groups have studied fall impact mechanics to understand how fall severity can be reduced to prevent hip fractures. Yet, direct impact force measurements with force plates are restricted to a very limited repertoire of experimental falls. The purpose of this study was to develop a generic model for estimating hip impact forces (i.e. fall severity) in in vivo sideways falls without the use of force plates. Twelve experienced judokas performed sideways Martial Arts (MA) and Block ('natural') falls on a force plate, both with and without a mat on top. Data were analyzed to determine the hip impact force and to derive 11 selected (subject-specific and kinematic) variables. Falls from kneeling height were used to perform a stepwise regression procedure to assess the effects of these input variables and build the model. The final model includes four input variables, involving one subject-specific measure and three kinematic variables: maximum upper body deceleration, body mass, shoulder angle at the instant of 'maximum impact' and maximum hip deceleration. The results showed that estimated and measured hip impact forces were linearly related (explained variances ranging from 46 to 63%). Hip impact forces of MA falls onto the mat from a standing position (3650±916N) estimated by the final model were comparable with measured values (3698±689N), even though these data were not used for training the model. In conclusion, a generic linear regression model was developed that enables the assessment of fall severity through kinematic measures of sideways falls, without using force plates. Copyright © 2017 Elsevier Ltd. All rights reserved.
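
    The stepwise procedure the abstract describes can be sketched in a few lines. Below is a hedged forward-selection loop over a hypothetical pool of 11 candidate variables, scored by adjusted R^2; the synthetic data, variable names, and stopping rule are illustrative, not those of the study.

```python
# Forward stepwise selection for a linear impact-force model (illustrative).
import numpy as np
from sklearn.linear_model import LinearRegression

def adjusted_r2(model, X, y):
    n, k = X.shape
    r2 = model.score(X, y)
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def forward_stepwise(X, y, names):
    selected, best_score = [], -np.inf
    improved = True
    while improved and len(selected) < len(names):
        improved = False
        for j in range(X.shape[1]):
            if j in selected:
                continue
            trial = selected + [j]
            m = LinearRegression().fit(X[:, trial], y)
            score = adjusted_r2(m, X[:, trial], y)
            if score > best_score:                 # keep the best addition
                best_score, best_j, improved = score, j, True
        if improved:
            selected.append(best_j)
    return [names[j] for j in selected]

rng = np.random.default_rng(5)
names = [f"var{i}" for i in range(11)]             # e.g. mass, decelerations
X = rng.normal(size=(120, 11))
y = 3.0 * X[:, 1] + 1.5 * X[:, 6] + rng.normal(size=120)  # synthetic hip force
print(forward_stepwise(X, y, names))               # e.g. ['var1', 'var6']
```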

  2. Growth curves of preschool children in the northeast of iran: a population based study using quantile regression approach.

    Science.gov (United States)

    Payande, Abolfazl; Tabesh, Hamed; Shakeri, Mohammad Taghi; Saki, Azadeh; Safarian, Mohammad

    2013-01-14

    Growth charts are widely used to assess children's growth status and can provide a trajectory of growth during early important months of life. The objective of this study is to construct growth charts and normal values of weight-for-age for children aged 0 to 5 years using a powerful and applicable methodology. The results are compared with the World Health Organization (WHO) references and the semi-parametric LMS method of Cole and Green. A total of 70737 apparently healthy boys and girls aged 0 to 5 years were recruited in July 2004 for 20 days from those attending community clinics for routine health checks as a part of a national survey. Anthropometric measurements were done by trained health staff using WHO methodology. The nonparametric quantile regression method, based on local constant kernel estimation of conditional quantile curves, was used to estimate the curves and normal values. The weight-for-age growth curves for boys and girls aged from 0 to 5 years were derived utilizing a population of children living in the northeast of Iran. The results were similar to the ones obtained by the semi-parametric LMS method in the same data. Among all age groups from 0 to 5 years, the median values of children's weight living in the northeast of Iran were lower than the corresponding values in WHO reference data. The weight curves of boys were higher than those of girls in all age groups. The differences between growth patterns of children living in the northeast of Iran versus international ones necessitate using local and regional growth charts. International normal values may not properly recognize the populations at risk for growth problems in Iranian children. Quantile regression (QR), a flexible method that does not require restrictive assumptions, is proposed for estimating reference curves and normal values.
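
    Local-constant kernel estimation of a conditional quantile reduces to a weighted sample quantile computed around each grid point, as in the sketch below. The synthetic weight-for-age data, bandwidth, and growth formula are assumptions for illustration only.

```python
# Local-constant kernel estimation of conditional quantile (growth) curves.
import numpy as np

def kernel_quantile(age, weight, grid, tau, bandwidth=0.25):
    """Weighted sample quantile of `weight` around each grid age."""
    curve = np.empty(len(grid))
    order = np.argsort(weight)
    w_sorted = weight[order]
    for i, a0 in enumerate(grid):
        w = np.exp(-0.5 * ((age - a0) / bandwidth) ** 2)   # Gaussian kernel
        cum = np.cumsum(w[order]) / w.sum()                # weighted CDF
        curve[i] = w_sorted[np.searchsorted(cum, tau)]
    return curve

rng = np.random.default_rng(11)
age = rng.uniform(0, 5, size=5000)                         # years (synthetic)
weight = 3.3 + 3.8 * np.sqrt(age) + rng.normal(scale=1.0, size=5000)  # kg

grid = np.linspace(0.1, 5, 50)
p50 = kernel_quantile(age, weight, grid, tau=0.50)         # median curve
p97 = kernel_quantile(age, weight, grid, tau=0.97)         # 97th percentile
print(p50[:3], p97[:3])
```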

  3. Comparison of Multiple Linear Regressions and Neural Networks based QSAR models for the design of new antitubercular compounds.

    Science.gov (United States)

    Ventura, Cristina; Latino, Diogo A R S; Martins, Filomena

    2013-01-01

    The performance of two QSAR methodologies, namely Multiple Linear Regressions (MLR) and Neural Networks (NN), towards the modeling and prediction of antitubercular activity was evaluated and compared. A data set of 173 potentially active compounds belonging to the hydrazide family and represented by 96 descriptors was analyzed. Models were built with Multiple Linear Regressions (MLR), single Feed-Forward Neural Networks (FFNNs), ensembles of FFNNs and Associative Neural Networks (AsNNs) using four different data sets and different types of descriptors. The predictive ability of the different techniques used was assessed and discussed on the basis of different validation criteria, and the results show, in general, a better performance of AsNNs in terms of learning ability and prediction of antitubercular behaviors when compared with all other methods. MLR has, however, the advantage of pinpointing the most relevant molecular characteristics responsible for the behavior of these compounds against Mycobacterium tuberculosis. The best results for the larger data set (94 compounds in training set and 18 in test set) were obtained with AsNNs using seven descriptors (R(2) of 0.874 and RMSE of 0.437 against R(2) of 0.845 and RMSE of 0.472 in MLRs, for test set). Counter-Propagation Neural Networks (CPNNs) were trained with the same data sets and descriptors. From the scrutiny of the weight levels in each CPNN and the information retrieved from MLRs, a rational design of potentially active compounds was attempted. Two new compounds were synthesized and tested against M. tuberculosis showing an activity close to that predicted by the majority of the models. Copyright © 2013 Elsevier Masson SAS. All rights reserved.

  4. An Optimal Sample Data Usage Strategy to Minimize Overfitting and Underfitting Effects in Regression Tree Models Based on Remotely-Sensed Data

    Directory of Open Access Journals (Sweden)

    Yingxin Gu

    2016-11-01

    Full Text Available Regression tree models have been widely used for remote sensing-based ecosystem mapping. Improper use of the sample data (model training and testing data) may cause overfitting and underfitting effects in the model. The goal of this study is to develop an optimal sampling data usage strategy for any dataset and identify an appropriate number of rules in the regression tree model that will improve its accuracy and robustness. Landsat 8 data and Moderate-Resolution Imaging Spectroradiometer-scaled Normalized Difference Vegetation Index (NDVI) were used to develop regression tree models. A Python procedure was designed to generate random replications of model parameter options across a range of model development data sizes and rule number constraints. The mean absolute difference (MAD) between the predicted and actual NDVI (scaled NDVI, value from 0–200) and its variability across the different randomized replications were calculated to assess the accuracy and stability of the models. In our case study, a six-rule regression tree model developed from 80% of the sample data had the lowest MAD (MADtraining = 2.5 and MADtesting = 2.4), which was suggested as the optimal model. This study demonstrates how the training data and rule number selections impact model accuracy and provides important guidance for future remote-sensing-based ecosystem modeling.
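
    Since the study itself used a Python procedure, the replication experiment translates directly into a short loop. The sketch below sweeps training fractions and rule limits, using DecisionTreeRegressor's max_leaf_nodes as a stand-in for the study's rule count, and reports the mean and spread of MAD; the data and parameter grids are synthetic assumptions.

```python
# Sweep training fraction and rule limit, replicate with random splits.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(2)
X = rng.uniform(size=(2000, 6))          # stand-ins for Landsat 8 band values
ndvi = 200 * (0.4 * X[:, 0] + 0.3 * X[:, 1] ** 2) + rng.normal(scale=3, size=2000)

for frac in (0.6, 0.7, 0.8):
    for rules in (4, 6, 8, 12):
        mads = []
        for rep in range(20):            # randomized replications
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, ndvi, train_size=frac, random_state=rep)
            tree = DecisionTreeRegressor(max_leaf_nodes=rules).fit(X_tr, y_tr)
            mads.append(mean_absolute_error(y_te, tree.predict(X_te)))
        print(f"train={frac:.0%} rules={rules:2d} "
              f"MAD={np.mean(mads):.2f}+/-{np.std(mads):.2f}")
```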

  5. Comparative Analyses of Zebrafish Anxiety-Like Behavior Using Conflict-Based Novelty Tests.

    Science.gov (United States)

    Kysil, Elana V; Meshalkina, Darya A; Frick, Erin E; Echevarria, David J; Rosemberg, Denis B; Maximino, Caio; Lima, Monica Gomes; Abreu, Murilo S; Giacomini, Ana C; Barcellos, Leonardo J G; Song, Cai; Kalueff, Allan V

    2017-06-01

    Modeling of stress and anxiety in adult zebrafish (Danio rerio) is increasingly utilized in neuroscience research and central nervous system (CNS) drug discovery. Representing the most commonly used zebrafish anxiety models, the novel tank test (NTT) focuses on zebrafish diving in response to potentially threatening stimuli, whereas the light-dark test (LDT) is based on fish scototaxis (innate preference for dark vs. bright areas). Here, we systematically evaluate the utility of these two tests, combining meta-analyses of published literature with comparative in vivo behavioral and whole-body endocrine (cortisol) testing. Overall, the NTT and LDT behaviors demonstrate a generally good cross-test correlation in vivo, whereas meta-analyses of published literature show that both tests have similar sensitivity to zebrafish anxiety-like states. Finally, NTT evokes higher levels of cortisol, likely representing a more stressful procedure than LDT. Collectively, our study reappraises NTT and LDT for studying anxiety-like states in zebrafish, and emphasizes their developing utility for neurobehavioral research. These findings can help optimize drug screening procedures by choosing more appropriate models for testing anxiolytic or anxiogenic drugs.

  6. AucPR: An AUC-based approach using penalized regression for disease prediction with high-dimensional omics data

    OpenAIRE

    Yu, Wenbao; Park, Taesung

    2014-01-01

    Motivation It is common to get an optimal combination of markers for disease classification and prediction when multiple markers are available. Many approaches based on the area under the receiver operating characteristic curve (AUC) have been proposed. Existing works based on AUC in a high-dimensional context depend mainly on a non-parametric, smooth approximation of AUC, with no work using a parametric AUC-based approach, for high-dimensional data. Results We propose an AUC-based approach u...

  7. Chemometrical characterization of four italian rice varieties based on genetic and chemical analyses.

    Science.gov (United States)

    Brandolini, Vincenzo; Coïsson, Jean Daniel; Tedeschi, Paola; Barile, Daniela; Cereti, Elisabetta; Maietti, Annalisa; Vecchiati, Giorgio; Martelli, Aldo; Arlorio, Marco

    2006-12-27

    This paper describes a method for achieving qualitative identification of four rice varieties from two different Italian regions. To estimate the presence of genetic diversity among the four rice varieties, we used polymerase chain reaction-randomly amplified polymorphic DNA (PCR-RAPD) markers, and to elucidate whether a relationship exists between the ground and the specific characteristics of the product, we studied proximate composition, fatty acid composition, mineral content, and total antioxidant capacity. Using principal component analysis on genomic and compositional data, we were able to classify rice samples according to their variety and their district of production. This work also examined the discrimination ability of different parameters. It was found that genomic data give the best discrimination based on varieties, indicating that RAPD assays could be useful in discriminating among closely related species, while compositional analyses do not depend on the genetic characters only but are related to the production area.

  8. Stress and deflection analyses of floating roofs based on a load-modifying method

    International Nuclear Information System (INIS)

    Sun Xiushan; Liu Yinghua; Wang Jianbin; Cen Zhangzhi

    2008-01-01

    This paper proposes a load-modifying method for the stress and deflection analyses of floating roofs used in cylindrical oil storage tanks. The formulations of loads and deformations are derived according to the equilibrium analysis of floating roofs. Based on these formulations, the load-modifying method is developed to conduct a geometrically nonlinear analysis of floating roofs with the finite element (FE) simulation. In the procedure with the load-modifying method, the analysis is carried out through a series of iterative computations until a convergence is achieved within the error tolerance. Numerical examples are given to demonstrate the validity and reliability of the proposed method, which provides an effective and practical numerical solution to the design and analysis of floating roofs

  9. Seismic fragility analyses of nuclear power plant structures based on the recorded earthquake data in Korea

    International Nuclear Information System (INIS)

    Cho, Sung Gook; Joe, Yang Hee

    2005-01-01

    By nature, the seismic fragility analysis results will be considerably affected by the statistical data of design information and site-dependent ground motions. The engineering characteristics of small-magnitude earthquake spectra recorded in the Korean peninsula during the last several years are analyzed in this paper. An improved method of seismic fragility analysis is evaluated by comparative analyses to verify its efficiency for practical application to nuclear power plant structures. The effects of the recorded earthquake on the seismic fragilities of Korean nuclear power plant structures are also evaluated from the comparative studies. From the obtained results, the proposed method is more efficient for multi-mode structures. The case study results show that seismic fragility analysis based on the Newmark's spectra in Korea might over-estimate the seismic capacities of Korean facilities

  10. Seismic fragility analyses of nuclear power plant structures based on the recorded earthquake data in Korea

    Energy Technology Data Exchange (ETDEWEB)

    Cho, Sung Gook [Department of Civil and Environmental System Engineering, University of Incheon, 177 Dohwa-dong, Nam-gu, Incheon 402-749 (Korea, Republic of)]. E-mail: sgcho@incheon.ac.kr; Joe, Yang Hee [Department of Civil and Environmental System Engineering, University of Incheon, 177 Dohwa-dong, Nam-gu, Incheon 402-749 (Korea, Republic of)

    2005-08-01

    By nature, the seismic fragility analysis results will be considerably affected by the statistical data of design information and site-dependent ground motions. The engineering characteristics of small-magnitude earthquake spectra recorded in the Korean peninsula during the last several years are analyzed in this paper. An improved method of seismic fragility analysis is evaluated by comparative analyses to verify its efficiency for practical application to nuclear power plant structures. The effects of the recorded earthquake on the seismic fragilities of Korean nuclear power plant structures are also evaluated from the comparative studies. From the obtained results, the proposed method is more efficient for multi-mode structures. The case study results show that seismic fragility analysis based on the Newmark's spectra in Korea might over-estimate the seismic capacities of Korean facilities.

  11. Secondary Data Analyses of Subjective Outcome Evaluation Data Based on Nine Databases

    Directory of Open Access Journals (Sweden)

    Daniel T. L. Shek

    2012-01-01

    Full Text Available The purpose of this study was to evaluate the effectiveness of the Tier 1 Program of the Project P.A.T.H.S. (Positive Adolescent Training through Holistic Social Programmes) in Hong Kong by analyzing 1,327 school-based program reports submitted by program implementers. In each report, program implementers were invited to write down five conclusions based on an integration of the subjective outcome evaluation data collected from the program participants and program implementers. Secondary data analyses were carried out by aggregating nine databases, with 14,390 meaningful units extracted from 6,618 conclusions. Results showed that most of the conclusions were positive in nature. The findings generally showed that the workers perceived the program and program implementers to be positive, and they also pointed out that the program could promote holistic development of the program participants in societal, familial, interpersonal, and personal aspects. However, difficulties encountered during program implementation (2.15%) and recommendations for improvement (16.26%) were also reported. In conjunction with the evaluation findings based on other strategies, the present study suggests that the Tier 1 Program of the Project P.A.T.H.S. is beneficial to the holistic development of the program participants.

  12. What is needed to eliminate new pediatric HIV infections: The contribution of model-based analyses

    Science.gov (United States)

    Doherty, Katie; Ciaranello, Andrea

    2013-01-01

    Purpose of Review Computer simulation models can identify key clinical, operational, and economic interventions that will be needed to achieve the elimination of new pediatric HIV infections. In this review, we summarize recent findings from model-based analyses of strategies for prevention of mother-to-child HIV transmission (MTCT). Recent Findings In order to achieve elimination of MTCT (eMTCT), model-based studies suggest that scale-up of services will be needed in several domains: uptake of services and retention in care (the PMTCT “cascade”), interventions to prevent HIV infections in women and reduce unintended pregnancies (the “four-pronged approach”), efforts to support medication adherence through long periods of pregnancy and breastfeeding, and strategies to make breastfeeding safer and/or shorter. Models also project the economic resources that will be needed to achieve these goals in the most efficient ways to allocate limited resources for eMTCT. Results suggest that currently recommended PMTCT regimens (WHO Option A, Option B, and Option B+) will be cost-effective in most settings. Summary Model-based results can guide future implementation science, by highlighting areas in which additional data are needed to make informed decisions and by outlining critical interventions that will be necessary in order to eliminate new pediatric HIV infections. PMID:23743788

  13. Logistic regression models

    CERN Document Server

    Hilbe, Joseph M

    2009-01-01

    This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect: great clarity. The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … . -Annette J. Dobson, Biometric...

  14. A multi-criteria evaluation system for marine litter pollution based on statistical analyses of OSPAR beach litter monitoring time series.

    Science.gov (United States)

    Schulz, Marcus; Neumann, Daniel; Fleet, David M; Matthies, Michael

    2013-12-01

    During the last decades, marine pollution with anthropogenic litter has become a worldwide major environmental concern. Standardized monitoring of litter since 2001 on 78 beaches selected within the framework of the Convention for the Protection of the Marine Environment of the North-East Atlantic (OSPAR) has been used to identify temporal trends of marine litter. Based on statistical analyses of this dataset a two-part multi-criteria evaluation system for beach litter pollution of the North-East Atlantic and the North Sea is proposed. Canonical correlation analyses, linear regression analyses, and non-parametric analyses of variance were used to identify different temporal trends. A classification of beaches was derived from cluster analyses and served to define different states of beach quality according to abundances of 17 input variables. The evaluation system is easily applicable and relies on the above-mentioned classification and on significant temporal trends implied by significant rank correlations. Copyright © 2013 Elsevier Ltd. All rights reserved.
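
    The trend-detection ingredient is simple to illustrate: a significant rank correlation between year and annual litter abundance flags an increasing or decreasing trend for a beach. The sketch below uses synthetic counts; the significance threshold and data are assumptions, not the OSPAR series.

```python
# Rank-correlation trend test on a synthetic beach-litter time series.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(4)
years = np.arange(2001, 2013)
litter_counts = 120 + 6 * (years - 2001) + rng.normal(scale=15, size=len(years))

rho, p = spearmanr(years, litter_counts)
if p < 0.05:
    trend = "increasing" if rho > 0 else "decreasing"
else:
    trend = "no significant trend"
print(f"rho={rho:.2f}, p={p:.3f}: {trend}")
```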

  15. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin

    2017-01-19

    In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13: H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100]

  16. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun

    2017-01-01

    In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13: H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100]

  17. Geometrical quality evaluation in laser cutting of Inconel-718 sheet by using Taguchi based regression analysis and particle swarm optimization

    Science.gov (United States)

    Shrivastava, Prashant Kumar; Pandey, Arun Kumar

    2018-03-01

    Inconel-718 is one of the most in-demand advanced engineering materials because of its superior properties. Conventional machining techniques face many problems in cutting intricate profiles on this material due to its low thermal conductivity, low elasticity and high chemical affinity at elevated temperatures. Laser beam cutting is an advanced cutting method that may be used to achieve geometrical accuracy with more precision through suitable management of the input process parameters. In this research work, an experimental investigation of the pulsed Nd:YAG laser cutting of Inconel-718 has been carried out. The experiments have been conducted using the well-planned L27 orthogonal array. The experimentally measured values of different quality characteristics have been used for developing the second-order regression models of bottom kerf deviation (KD), bottom kerf width (KW) and kerf taper (KT). The developed models of different quality characteristics have been utilized as a quality function for single-objective optimization by using the particle swarm optimization (PSO) method. The optimum results obtained by the proposed hybrid methodology have been compared with the experimental results. The comparison of the optimized results with the experimental results shows individual improvements of 75%, 12.67% and 33.70% in bottom kerf deviation, bottom kerf width and kerf taper, respectively. The parametric effects of the most significant input process parameters on the quality characteristics have also been discussed.
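
    The optimization stage can be sketched with a bare-bones particle swarm. The snippet below minimizes a made-up second-order response surface standing in for a fitted kerf model; the swarm settings and the surrogate f() are assumptions, not the paper's fitted coefficients.

```python
# Bare-bones particle swarm optimizer over a quadratic response surface.
import numpy as np

def f(x):  # hypothetical regression surrogate of a kerf quality characteristic
    return 1.0 + 0.5 * x[..., 0] ** 2 + 0.8 * x[..., 1] ** 2 - 0.3 * x[..., 0] * x[..., 1]

rng = np.random.default_rng(6)
n, dim, iters = 30, 2, 200
pos = rng.uniform(-3, 3, size=(n, dim))        # particle positions
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), f(pos)          # personal bests
gbest = pbest[pbest_val.argmin()].copy()       # global best

for _ in range(iters):
    r1, r2 = rng.uniform(size=(2, n, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    val = f(pos)
    better = val < pbest_val
    pbest[better], pbest_val[better] = pos[better], val[better]
    gbest = pbest[pbest_val.argmin()].copy()

print("optimum parameters:", gbest, "objective:", f(gbest))
```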

  18. Discrimination of Transgenic Rice Based on Near Infrared Reflectance Spectroscopy and Partial Least Squares Regression Discriminant Analysis

    Directory of Open Access Journals (Sweden)

    ZHANG Long

    2015-09-01

    Full Text Available Near infrared reflectance spectroscopy (NIRS), a non-destructive measurement technique, was combined with partial least squares regression discriminant analysis (PLS-DA) to discriminate the transgenic (TCTP and mi166) and wild type (Zhonghua 11) rice. Furthermore, rice lines transformed with a protein gene (OsTCTP) and a regulation gene (Osmi166) were also discriminated by the NIRS method. The performances of PLS-DA in the spectral ranges of 4 000–8 000 cm-1 and 4 000–10 000 cm-1 were compared to obtain the optimal spectral range. As a result, the transgenic and wild type rice were distinguished from each other in the range of 4 000–10 000 cm-1, and the correct classification rate was 100.0% in the validation test. The transgenic rice TCTP and mi166 were also distinguished from each other in the range of 4 000–10 000 cm-1, and the correct classification rate was also 100.0%. In conclusion, NIRS combined with PLS-DA can be used for the discrimination of transgenic rice.
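
    PLS-DA amounts to regressing a one-hot class-indicator matrix on the spectra with PLS and classifying by the largest predicted response, as the sketch below shows. The synthetic spectra, class shift, and component count are assumptions for illustration.

```python
# PLS-DA decision rule on synthetic NIR-like spectra.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(8)
classes = rng.integers(0, 3, size=150)           # wild type, TCTP, mi166
spectra = rng.normal(size=(150, 300))            # absorbance at 300 wavenumbers
spectra += classes[:, None] * np.linspace(0, 0.5, 300)  # class-dependent shift

Y = np.eye(3)[classes]                           # one-hot indicator matrix
pls = PLSRegression(n_components=10).fit(spectra, Y)
pred = pls.predict(spectra).argmax(axis=1)       # PLS-DA decision rule
print("correct classification rate:", (pred == classes).mean())
```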

  19. Time-Frequency Analysis of Non-Stationary Biological Signals with Sparse Linear Regression Based Fourier Linear Combiner

    Directory of Open Access Journals (Sweden)

    Yubo Wang

    2017-06-01

    Full Text Available It is often difficult to analyze biological signals because of their nonlinear and non-stationary characteristics. This necessitates the usage of time-frequency decomposition methods for analyzing the subtle changes in these signals that are often connected to an underlying phenomenon. This paper presents a new approach to analyze the time-varying characteristics of such signals by employing a simple truncated Fourier series model, namely the band-limited multiple Fourier linear combiner (BMFLC). In contrast to the earlier designs, we first identified the sparsity imposed on the signal model in order to reformulate the model to a sparse linear regression model. The coefficients of the proposed model are then estimated by a convex optimization algorithm. The performance of the proposed method was analyzed with benchmark test signals. An energy ratio metric is employed to quantify the spectral performance and results show that the proposed method Sparse-BMFLC has a high mean energy ratio (0.9976) and outperforms existing methods such as short-time Fourier transform (STFT), continuous wavelet transform (CWT) and the BMFLC Kalman Smoother. Furthermore, the proposed method provides an overall 6.22% in reconstruction error.

  20. Time-Frequency Analysis of Non-Stationary Biological Signals with Sparse Linear Regression Based Fourier Linear Combiner.

    Science.gov (United States)

    Wang, Yubo; Veluvolu, Kalyana C

    2017-06-14

    It is often difficult to analyze biological signals because of their nonlinear and non-stationary characteristics. This necessitates the usage of time-frequency decomposition methods for analyzing the subtle changes in these signals that are often connected to an underlying phenomenon. This paper presents a new approach to analyze the time-varying characteristics of such signals by employing a simple truncated Fourier series model, namely the band-limited multiple Fourier linear combiner (BMFLC). In contrast to the earlier designs, we first identified the sparsity imposed on the signal model in order to reformulate the model to a sparse linear regression model. The coefficients of the proposed model are then estimated by a convex optimization algorithm. The performance of the proposed method was analyzed with benchmark test signals. An energy ratio metric is employed to quantify the spectral performance and results show that the proposed method Sparse-BMFLC has a high mean energy ratio (0.9976) and outperforms existing methods such as short-time Fourier transform (STFT), continuous wavelet transform (CWT) and the BMFLC Kalman Smoother. Furthermore, the proposed method provides an overall 6.22% in reconstruction error.
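
    A hedged sketch of the sparse-regression reading of BMFLC: build a band-limited Fourier design matrix and fit it with an L1-penalized (Lasso) solver, so only a few frequency bins keep nonzero weights. The band edges, penalty, and test signal below are assumptions; the paper's own convex solver and data are not reproduced.

```python
# Sparse linear regression over a band-limited Fourier basis (BMFLC-style).
import numpy as np
from sklearn.linear_model import Lasso

fs, T = 100.0, 10.0                              # sample rate (Hz), duration (s)
t = np.arange(0, T, 1 / fs)
rng = np.random.default_rng(9)
signal = 0.8 * np.sin(2 * np.pi * 7.3 * t) + 0.1 * rng.normal(size=t.size)

freqs = np.arange(5.0, 12.0, 0.1)                # band-limited frequency grid
A = np.hstack([np.sin(2 * np.pi * freqs * t[:, None]),
               np.cos(2 * np.pi * freqs * t[:, None])])  # combiner basis

model = Lasso(alpha=0.01, max_iter=5000).fit(A, signal)
active = freqs[np.flatnonzero(model.coef_[:freqs.size])]  # surviving sine bins
print("active frequencies (Hz):", active)
print("relative reconstruction error:",
      np.linalg.norm(signal - model.predict(A)) / np.linalg.norm(signal))
```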

  1. Assessment of perfusion by dynamic contrast-enhanced imaging using a deconvolution approach based on regression and singular value decomposition.

    Science.gov (United States)

    Koh, T S; Wu, X Y; Cheong, L H; Lim, C C T

    2004-12-01

    The assessment of tissue perfusion by dynamic contrast-enhanced (DCE) imaging involves a deconvolution process. For analysis of DCE imaging data, we implemented a regression approach to select appropriate regularization parameters for deconvolution using the standard and generalized singular value decomposition methods. Monte Carlo simulation experiments were carried out to study the performance and to compare with other existing methods used for deconvolution analysis of DCE imaging data. The present approach is found to be robust and reliable at the levels of noise commonly encountered in DCE imaging, and for different models of the underlying tissue vasculature. The advantages of the present method, as compared with previous methods, include its efficiency of computation, ability to achieve adequate regularization to reproduce less noisy solutions, and that it does not require prior knowledge of the noise condition. The proposed method is applied on actual patient study cases with brain tumors and ischemic stroke, to illustrate its applicability as a clinical tool for diagnosis and assessment of treatment response.
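
    The deconvolution at the core of DCE analysis can be written as a lower-triangular linear system and regularized by truncating small singular values; choosing that truncation level is exactly the step the authors' regression approach automates. The sketch below is a minimal truncated-SVD version with a fixed, assumed threshold and synthetic curves, not the paper's regression-based selection.

```python
# Truncated-SVD deconvolution of a synthetic DCE tissue curve.
import numpy as np
from scipy.linalg import toeplitz

dt = 1.0                                        # sampling interval (s)
t = np.arange(0, 60, dt)
aif = (t / 8.0) ** 3 * np.exp(-t / 8.0)         # synthetic arterial input
residue = np.exp(-t / 12.0)                     # true tissue residue function
tissue = dt * np.convolve(aif, residue)[: t.size]
tissue += np.random.default_rng(10).normal(scale=0.02, size=t.size)  # noise

A = dt * np.tril(toeplitz(aif))                 # lower-triangular convolution
U, s, Vt = np.linalg.svd(A)
keep = s > 0.15 * s[0]                          # truncation = regularization
s_inv = np.where(keep, 1.0 / s, 0.0)
residue_est = Vt.T @ (s_inv * (U.T @ tissue))   # regularized inverse solution
print("estimated residue peak (true value 1.0):", residue_est.max())
```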

  2. Geographically Weighted Regression Models in Estimating Median Home Prices in Towns of Massachusetts Based on an Urban Sustainability Framework

    Directory of Open Access Journals (Sweden)

    Yaxiong Ma

    2018-03-01

    Full Text Available Housing is a key component of urban sustainability. The objective of this study was to assess the significance of key spatial determinants of median home price in towns in Massachusetts that impact sustainable growth. Our analysis investigates the presence or absence of spatial non-stationarity in the relationship between sustainable growth, measured in terms of the relationship between home values and various parameters including the amount of unprotected forest land, residential land, unemployment, education, vehicle ownership, accessibility to commuter rail stations, school district performance, and senior population. We use the standard geographically weighted regression (GWR) and Mixed GWR models to analyze the effects of spatial non-stationarity. Mixed GWR performed better than GWR in terms of Akaike Information Criterion (AIC) values. Our findings highlight the nature and spatial extent of the non-stationary vs. stationary qualities of key environmental and social determinants of median home price. Understanding the key determinants of housing values, such as valuation of green spaces, public school performance metrics, and proximity to public transport, enables towns to use different strategies of sustainable urban planning, while understanding urban housing determinants—such as unemployment and senior population—can help modify urban sustainable housing policies.
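
    The GWR estimator itself is compact: at each location, fit weighted least squares with a distance kernel so the coefficients vary over space. The sketch below uses synthetic coordinates, covariates, and a fixed Gaussian bandwidth as stand-ins for the town-level data.

```python
# Minimal GWR: one locally weighted least-squares fit per location.
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth=0.3):
    """Return one local coefficient vector per observation location."""
    X1 = np.column_stack([np.ones(len(X)), X])   # add intercept
    betas = np.empty((len(X), X1.shape[1]))
    for i, c in enumerate(coords):
        d2 = ((coords - c) ** 2).sum(axis=1)
        w = np.exp(-d2 / (2 * bandwidth ** 2))   # Gaussian spatial kernel
        sw = np.sqrt(w)
        betas[i], *_ = np.linalg.lstsq(sw[:, None] * X1, sw * y, rcond=None)
    return betas

rng = np.random.default_rng(12)
coords = rng.uniform(size=(300, 2))              # town centroids (synthetic)
X = rng.normal(size=(300, 3))                    # e.g. forest, unemployment, rail
beta1 = 2.0 * coords[:, 0]                       # spatially varying effect on X0
y = 5.0 + beta1 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=300)

betas = gwr_coefficients(coords, X, y)
print("local coefficient on X0, west vs east:",
      betas[coords[:, 0] < 0.2, 1].mean(), betas[coords[:, 0] > 0.8, 1].mean())
```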

  3. Crude Oil Price Forecasting Based on Hybridizing Wavelet Multiple Linear Regression Model, Particle Swarm Optimization Techniques, and Principal Component Analysis

    Science.gov (United States)

    Shabri, Ani; Samsudin, Ruhaidah

    2014-01-01

    Crude oil prices do play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelet and multiple linear regressions (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first selected to decompose an original time series into several subseries with different scales. Then, principal component analysis (PCA) is used in processing the subseries data in MLR for crude oil price forecasting. Particle swarm optimization (PSO) is used to adopt the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily crude oil market, West Texas Intermediate (WTI), has been used as the case study. The time series prediction capability of the WMLR model is compared with the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting of the crude oil price series. PMID:24895666
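
    A rough end-to-end sketch of the pipeline: wavelet-decompose the series, build lagged subseries features, compress them with PCA, and fit a linear model. PyWavelets stands in for the Mallat implementation, the PSO tuning step is omitted, and the lag structure, wavelet choice, and data are all assumptions.

```python
# Wavelet + PCA + linear regression forecasting sketch (requires PyWavelets).
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(13)
price = np.cumsum(rng.normal(size=600)) + 50     # synthetic WTI-like series

coeffs = pywt.wavedec(price, "db4", level=3)     # multiresolution coefficients
bands = []
for i in range(len(coeffs)):                     # reconstruct one band at a time
    sel = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
    bands.append(pywt.waverec(sel, "db4")[: len(price)])

lag = 1
Xb = np.column_stack([b[:-lag] for b in bands])  # subseries at time t-1
y = price[lag:]                                  # next-day price

X_pca = PCA(n_components=3).fit_transform(Xb)    # decorrelate subseries features
split = 500
model = LinearRegression().fit(X_pca[:split], y[:split])
pred = model.predict(X_pca[split:])
print("out-of-sample RMSE:", np.sqrt(np.mean((pred - y[split:]) ** 2)))
```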

  4. Regression-based season-ahead drought prediction for southern Peru conditioned on large-scale climate variables

    Science.gov (United States)

    Mortensen, Eric; Wu, Shu; Notaro, Michael; Vavrus, Stephen; Montgomery, Rob; De Piérola, José; Sánchez, Carlos; Block, Paul

    2018-01-01

    Located at a complex topographic, climatic, and hydrologic crossroads, southern Peru is a semiarid region that exhibits high spatiotemporal variability in precipitation. The economic viability of the region hinges on this water, yet southern Peru is prone to water scarcity caused by seasonal meteorological drought. Meteorological droughts in this region are often triggered during El Niño episodes; however, other large-scale climate mechanisms also play a noteworthy role in controlling the region's hydrologic cycle. An extensive season-ahead precipitation prediction model is developed to help bolster the existing capacity of stakeholders to plan for and mitigate the deleterious impacts of drought. In addition to existing climate indices, large-scale climatic variables, such as sea surface temperature, are investigated to identify potential drought predictors. A principal component regression framework is applied to 11 potential predictors to produce an ensemble forecast of regional January-March precipitation totals. Model hindcasts over 51 years, compared to climatology and to another model conditioned solely on an El Niño-Southern Oscillation index, achieve notable skill and perform better on several metrics, including the ranked probability skill score and a hit-miss statistic. The information provided by the developed model and by ancillary modeling efforts, such as extending the lead time of precipitation predictions, spatially disaggregating them to the local level, and forecasting the number of wet and dry days per rainy season, may further assist regional stakeholders and policymakers in preparing for drought.
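
    As a sketch of the principal component regression framework described here, the Python fragment below compresses a hypothetical 51-year-by-11-predictor matrix with PCA, fits a linear model, and produces leave-one-out hindcasts; the data, the number of retained components, and the skill measure are all placeholders.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression
        from sklearn.model_selection import LeaveOneOut, cross_val_predict
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(2)
        X = rng.normal(size=(51, 11))        # 11 candidate climate predictors
        y = 100 + 20 * X[:, 0] - 10 * X[:, 3] + rng.normal(scale=15, size=51)

        pcr = make_pipeline(StandardScaler(), PCA(n_components=3),
                            LinearRegression())
        hindcast = cross_val_predict(pcr, X, y, cv=LeaveOneOut())
        skill_rmse = np.sqrt(np.mean((hindcast - y) ** 2))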

  5. Crude Oil Price Forecasting Based on Hybridizing Wavelet Multiple Linear Regression Model, Particle Swarm Optimization Techniques, and Principal Component Analysis

    Directory of Open Access Journals (Sweden)

    Ani Shabri

    2014-01-01

    Full Text Available Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelets and multiple linear regression (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first used to decompose the original time series into several subseries at different scales. Then, principal component analysis (PCA) is used to process the subseries data entering the MLR model for crude oil price forecasting. Particle swarm optimization (PSO) is used to select the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily West Texas Intermediate (WTI) crude oil price series has been used as the case study. The time-series prediction performance of the WMLR model is compared with that of the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting the crude oil price series.

  6. Crude oil price forecasting based on hybridizing wavelet multiple linear regression model, particle swarm optimization techniques, and principal component analysis.

    Science.gov (United States)

    Shabri, Ani; Samsudin, Ruhaidah

    2014-01-01

    Crude oil prices play a significant role in the global economy and are a key input into option pricing formulas, portfolio allocation, and risk measurement. In this paper, a hybrid model integrating wavelets and multiple linear regression (MLR) is proposed for crude oil price forecasting. In this model, the Mallat wavelet transform is first used to decompose the original time series into several subseries at different scales. Then, principal component analysis (PCA) is used to process the subseries data entering the MLR model for crude oil price forecasting. Particle swarm optimization (PSO) is used to select the optimal parameters of the MLR model. To assess the effectiveness of this model, the daily West Texas Intermediate (WTI) crude oil price series has been used as the case study. The time-series prediction performance of the WMLR model is compared with that of the MLR, ARIMA, and GARCH models using various statistical measures. The experimental results show that the proposed model outperforms the individual models in forecasting the crude oil price series.

  7. Time-adaptive quantile regression

    DEFF Research Database (Denmark)

    Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik

    2008-01-01

    An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered.
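
    The linear optimization formulation mentioned here can be written down directly: for quantile tau, minimize tau*1'u + (1-tau)*1'v subject to X beta + u - v = y with u, v >= 0. The Python sketch below solves this static problem with scipy's linprog on synthetic data; the paper's contribution, reusing the old simplex solution as new observations arrive, is not reproduced.

        import numpy as np
        from scipy.optimize import linprog

        def quantile_regression(X, y, tau):
            # min tau*1'u + (1-tau)*1'v  s.t.  X beta + u - v = y, u,v >= 0
            n, p = X.shape
            c = np.concatenate([np.zeros(p), tau * np.ones(n),
                                (1 - tau) * np.ones(n)])
            A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
            bounds = [(None, None)] * p + [(0, None)] * (2 * n)
            res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method='highs')
            return res.x[:p]

        rng = np.random.default_rng(3)
        x = rng.uniform(0, 1, 300)                    # e.g., normalized wind speed
        y = 2 * x + rng.normal(scale=0.1 + 0.5 * x)   # heteroscedastic response
        X = np.column_stack([np.ones_like(x), x])
        beta_q90 = quantile_regression(X, y, tau=0.9)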

  8. A regression tree for identifying combinations of fall risk factors associated to recurrent falling: a cross-sectional elderly population-based study.

    Science.gov (United States)

    Kabeshova, A; Annweiler, C; Fantino, B; Philip, T; Gromov, V A; Launay, C P; Beauchet, O

    2014-06-01

    Regression tree (RT) analyses are particularly well adapted, compared to logistic regression models, to exploring the risk of recurrent falling according to various combinations of fall risk factors. The aims of this study were (1) to determine which combinations of fall risk factors were associated with the occurrence of recurrent falls in older community-dwellers, and (2) to compare the efficacy of RT and a multiple logistic regression model for the identification of recurrent falls. A total of 1,760 community-dwelling volunteers (mean age ± standard deviation, 71.0 ± 5.1 years; 49.4% female) were recruited prospectively in this cross-sectional study. Age, gender, polypharmacy, use of psychoactive drugs, fear of falling (FOF), cognitive disorders and sad mood were recorded. In addition, the history of falls within the past year was recorded using a standardized questionnaire. Among the 1,760 participants, 19.7% (n = 346) were recurrent fallers. The RT identified 14 node groups and 8 end nodes, with FOF as the first major split. Among participants with FOF, those who had sad mood and polypharmacy formed the end node with the greatest OR for recurrent falls (OR = 6.06), while another end node had the lowest OR for recurrent falls (OR = 0.25). The combination most associated with recurrent falls involved FOF, sad mood and polypharmacy, and FOF emerged as the risk factor most strongly associated with recurrent falls. In addition, RT and multiple logistic regression were not sensitive enough to identify the majority of recurrent fallers, but appeared efficient in detecting individuals not at risk of recurrent falls.
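
    To make the RT idea concrete, the Python sketch below grows a small classification tree on synthetic binary risk factors shaped loosely like those in the abstract (FOF, sad mood, polypharmacy); the data, effect sizes, and tree settings are invented for illustration only.

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier, export_text

        rng = np.random.default_rng(4)
        n = 1760
        X = rng.integers(0, 2, size=(n, 3))     # FOF, sad mood, polypharmacy
        logit = -2.2 + 1.2 * X[:, 0] + 0.8 * X[:, 1] + 0.6 * X[:, 2]
        y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))   # recurrent faller

        tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=50)
        tree.fit(X, y)
        print(export_text(tree,
                          feature_names=['FOF', 'sad_mood', 'polypharmacy']))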

  9. Validation of a fully autonomous phosphate analyser based on a microfluidic lab-on-a-chip

    DEFF Research Database (Denmark)

    Slater, Conor; Cleary, J.; Lau, K.T.

    2010-01-01

    ...of long-term operation. This was proven by a bench-top calibration of the analyser using standard solutions, and also by comparing the analyser's performance to that of a commercially available phosphate monitor installed at a wastewater treatment plant. The output of the microfluidic lab-on-a-chip analyser...

  10. Reduced Rank Regression

    DEFF Research Database (Denmark)

    Johansen, Søren

    2008-01-01

    The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
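
    In its simplest identity-weighted form, the reduced rank regression estimator can be computed by fitting ordinary least squares and then projecting the coefficient matrix onto the leading right-singular vectors of the fitted values, which is where the connection to canonical correlations arises. The following numpy sketch, on synthetic data, is that minimal variant:

        import numpy as np

        def reduced_rank_regression(X, Y, rank):
            # OLS fit, then a rank constraint via the SVD of the fitted values.
            B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
            _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
            V = Vt[:rank].T
            return B_ols @ V @ V.T          # coefficient matrix of rank <= rank

        rng = np.random.default_rng(5)
        X = rng.normal(size=(100, 6))
        B_true = np.outer(rng.normal(size=6), rng.normal(size=4))  # rank 1
        Y = X @ B_true + 0.1 * rng.normal(size=(100, 4))
        B_hat = reduced_rank_regression(X, Y, rank=1)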

  11. Regression analysis with categorized regression calibrated exposure: some interesting findings

    Directory of Open Access Journals (Sweden)

    Hjartåker Anette

    2006-07-01

    Full Text Available Abstract Background Regression calibration as a method for handling measurement error is becoming increasingly well known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposures analyzed on a categorical (e.g., quintile) scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on the quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC). Results In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution; thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is, however, vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a categorical scale.

  12. Development of hybrid genetic-algorithm-based neural networks using regression trees for modeling air quality inside a public transportation bus.

    Science.gov (United States)

    Kadiyala, Akhil; Kaur, Devinder; Kumar, Ashok

    2013-02-01

    The present study developed a novel approach to modeling the indoor air quality (IAQ) of a public transportation bus through the development of hybrid genetic-algorithm-based neural networks (also known as evolutionary neural networks) with input variables optimized using regression trees, referred to as the GART approach. This study validated the applicability of the GART modeling approach to solving complex nonlinear systems by accurately predicting the monitored contaminants of carbon dioxide (CO2), carbon monoxide (CO), nitric oxide (NO), sulfur dioxide (SO2), 0.3-0.4 microm sized particle numbers, 0.4-0.5 microm sized particle numbers, particulate matter (PM) concentrations less than 1.0 microm (PM1.0), and PM concentrations less than 2.5 microm (PM2.5) inside a public transportation bus operating on 20% grade biodiesel in Toledo, OH. First, the important variables affecting each monitored in-bus contaminant were determined using regression trees. Second, analysis of variance was used as a complementary sensitivity analysis to the regression tree results to determine a subset of statistically significant variables affecting each monitored in-bus contaminant. Finally, the identified subsets of statistically significant variables were used as inputs to develop three artificial neural network (ANN) models. The models developed were the regression tree-based back-propagation network (BPN-RT), the regression tree-based radial basis function network (RBFN-RT), and the GART models. Performance measures were used to validate the predictive capacity of the developed IAQ models. The results from this approach were compared with the results obtained from using a theoretical approach and a generalized practicable approach to modeling IAQ that included the consideration of additional independent variables when developing the aforementioned ANN models. The hybrid GART models were able to capture the majority of the variance in the monitored in-bus contaminants. The genetic-algorithm-based...
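
    A stripped-down version of the tree-screened neural network idea, with the genetic-algorithm training replaced by scikit-learn's default optimizer and all data synthetic, might look like this:

        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from sklearn.tree import DecisionTreeRegressor

        rng = np.random.default_rng(6)
        X = rng.normal(size=(400, 10))          # candidate input variables
        y = 2.0 * X[:, 0] - 1.5 * X[:, 4] + 0.1 * rng.normal(size=400)

        # 1) Screen inputs with a regression tree's variable importances.
        tree = DecisionTreeRegressor(max_depth=5).fit(X, y)
        keep = np.argsort(tree.feature_importances_)[-3:]   # top 3 inputs

        # 2) Train the neural network on the selected subset only.
        net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                           random_state=0).fit(X[:, keep], y)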

  13. Evaluation of the efficiency of continuous wavelet transform as processing and preprocessing algorithm for resolution of overlapped signals in univariate and multivariate regression analyses; an application to ternary and quaternary mixtures

    Science.gov (United States)

    Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany

    2016-07-01

    Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of the continuous wavelet transform (CWT) as a signal-processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). In contrast, the univariate CWT failed to simultaneously determine the quaternary mixture components; it was able to determine only PAR and PAP, along with the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the CWT calculations, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and the concentration matrices, and validation was performed by both cross-validation and an external validation set. Both methods were successfully applied for the determination of the studied drugs in pharmaceutical formulations.
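
    The pre-processing step can be imitated with the PyWavelets and scikit-learn libraries: transform each spectrum with the CWT and regress the concentration matrix on the coefficients with PLS. Everything below (spectra, scale, wavelet family, number of latent variables) is an illustrative assumption rather than the paper's validated settings.

        import numpy as np
        import pywt
        from sklearn.cross_decomposition import PLSRegression

        rng = np.random.default_rng(7)
        spectra = rng.random((40, 200))     # 40 mixtures x 200 wavelengths
        conc = rng.random((40, 4))          # e.g., DRO, CAF, PAR, PAP

        def cwt_features(s, scale=10):
            coef, _ = pywt.cwt(s, scales=[scale], wavelet='mexh')
            return coef[0]                  # coefficients at one chosen scale

        Xw = np.array([cwt_features(s) for s in spectra])
        pls = PLSRegression(n_components=5).fit(Xw, conc)
        predicted_conc = pls.predict(Xw)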

  14. Genome based analyses of six hexacorallian species reject the “naked coral” hypothesis

    KAUST Repository

    Wang, Xin

    2017-09-23

    Scleractinian corals are the foundation species of the coral-reef ecosystem. Their calcium carbonate skeletons form extensive structures that are home to millions of species, making coral reefs one of the most diverse ecosystems of our planet. However, our understanding of how reef-building corals have evolved the ability to calcify and become the ecosystem builders they are today is hampered by uncertain relationships within their subclass Hexacorallia. Corallimorpharians have been proposed to originate from a complex scleractinian ancestor that lost the ability to calcify in response to increasing ocean acidification, suggesting that corals could lose and regain the ability to calcify over evolutionary time. Here we employed a phylogenomic approach using whole-genome data from six hexacorallian species to resolve the evolutionary relationship between reef-building corals and their non-calcifying relatives. Phylogenetic analysis based on 1,421 single-copy orthologs, as well as gene presence/absence and synteny information, converged on the same topologies, showing strong support for scleractinian monophyly and a corallimorpharian sister clade. Our broad phylogenomic approach using sequence-based and sequence-independent analyses provides unambiguous evidence for the monophyly of scleractinian corals and the rejection of corallimorpharians as descendants of a complex coral ancestor.

  15. Genome based analyses of six hexacorallian species reject the “naked coral” hypothesis

    KAUST Repository

    Wang, Xin; Drillon, Guénola; Ryu, Taewoo; Voolstra, Christian R.; Aranda, Manuel

    2017-01-01

    Scleractinian corals are the foundation species of the coral-reef ecosystem. Their calcium carbonate skeletons form extensive structures that are home to millions of species, making coral reefs one of the most diverse ecosystems of our planet. However, our understanding of how reef-building corals have evolved the ability to calcify and become the ecosystem builders they are today is hampered by uncertain relationships within their subclass Hexacorallia. Corallimorpharians have been proposed to originate from a complex scleractinian ancestor that lost the ability to calcify in response to increasing ocean acidification, suggesting that corals could lose and regain the ability to calcify over evolutionary time. Here we employed a phylogenomic approach using whole-genome data from six hexacorallian species to resolve the evolutionary relationship between reef-building corals and their non-calcifying relatives. Phylogenetic analysis based on 1,421 single-copy orthologs, as well as gene presence/absence and synteny information, converged on the same topologies, showing strong support for scleractinian monophyly and a corallimorpharian sister clade. Our broad phylogenomic approach using sequence-based and sequence-independent analyses provides unambiguous evidence for the monophyly of scleractinian corals and the rejection of corallimorpharians as descendants of a complex coral ancestor.

  16. Risk-based analyses in support of California hazardous site remediation

    International Nuclear Information System (INIS)

    Ringland, J.T.

    1995-08-01

    The California Environmental Enterprise (CEE) is a joint program of the Department of Energy (DOE), Lawrence Livermore National Laboratory, Lawrence Berkeley Laboratory, and Sandia National Laboratories. Its goal is to make DOE laboratory expertise accessible to hazardous site cleanups in the state. This support might involve working directly with parties responsible for individual cleanups, or it might involve working with the California Environmental Protection Agency to develop tools that would be applicable across a broad range of sites. As part of its initial year's activities, the CEE supported a review to examine where laboratory risk and risk-based systems analysis capabilities might be most effectively applied. From this review, the study draws the following observations. The labs have a clear role in analyses supporting the demonstration and transfer of laboratory characterization or remediation technologies. The labs may have opportunities in developing broadly applicable analysis tools and computer codes for problems such as site characterization or efficient management of resources. Analysis at individual sites, separate from supporting lab technologies or prototyping general tools, may be appropriate only in limited circumstances. In any of these roles, the labs' capabilities extend beyond health risk assessment to the broader areas of risk management and risk-based systems analysis.

  17. Analyses of microstructural and elastic properties of porous SOFC cathodes based on focused ion beam tomography

    Science.gov (United States)

    Chen, Zhangwei; Wang, Xin; Giuliani, Finn; Atkinson, Alan

    2015-01-01

    Mechanical properties of porous SOFC electrodes are largely determined by their microstructures. Measurements of the elastic properties and microstructural parameters can be achieved by modelling digitally reconstructed 3D volumes based on the real electrode microstructures. However, the reliability of such measurements is greatly dependent on the processing of the raw images acquired for reconstruction. In this work, the actual microstructures of La0.6Sr0.4Co0.2Fe0.8O3-δ (LSCF) cathodes sintered at an elevated temperature were reconstructed based on dual-beam FIB/SEM tomography. Key microstructural and elastic parameters were estimated and correlated. Analyses of their sensitivity to the grayscale threshold value applied in the image segmentation were performed. The important microstructural parameters included porosity, tortuosity, specific surface area, particle and pore size distributions, and the inter-particle neck size distribution, which may affect to varying extents the elastic properties simulated from the microstructures using FEM. Results showed that different threshold-value ranges resulted in different degrees of sensitivity for a given parameter. The estimated porosity and tortuosity were more sensitive than the surface-area-to-volume ratio. Pore and neck sizes were found to be less sensitive than particle size. Results also showed that the modulus was essentially sensitive to the porosity, which was largely controlled by the threshold value.

  18. GIS-based groundwater potential analysis using novel ensemble weights-of-evidence with logistic regression and functional tree models.

    Science.gov (United States)

    Chen, Wei; Li, Hui; Hou, Enke; Wang, Shengquan; Wang, Guirong; Panahi, Mahdi; Li, Tao; Peng, Tao; Guo, Chen; Niu, Chao; Xiao, Lele; Wang, Jiale; Xie, Xiaoshen; Ahmad, Baharin Bin

    2018-09-01

    The aim of the current study was to produce groundwater spring potential maps using novel ensembles of weights-of-evidence (WoE) with logistic regression (LR) and functional tree (FT) models. First, a total of 66 springs were identified by field surveys; 70% of the spring locations were used for training the models and 30% for the validation process. Second, a total of 14 affecting factors, including aspect, altitude, slope, plan curvature, profile curvature, stream power index (SPI), topographic wetness index (TWI), sediment transport index (STI), lithology, normalized difference vegetation index (NDVI), land use, soil, distance to roads, and distance to streams, were used to analyze the spatial relationship between these factors and spring occurrences. Multicollinearity analysis and feature selection using the correlation attribute evaluation (CAE) method were employed to optimize the affecting factors. Subsequently, the novel ensembles of the WoE, LR, and FT models were constructed using the training dataset. Finally, receiver operating characteristic (ROC) curves, standard errors, confidence intervals (CI) at 95%, and significance levels P were employed to validate and compare the performance of the three models. Overall, all three models performed well for groundwater spring potential evaluation. The prediction capability of the FT model, with the highest AUC values, the smallest standard errors, the narrowest CIs, and the smallest P values for the training and validation datasets, is better than that of the other models. The groundwater spring potential maps can be adopted for the management of water resources and land use by planners and engineers. Copyright © 2018 Elsevier B.V. All rights reserved.
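
    The model-comparison step can be imitated with scikit-learn, using a plain logistic regression and a plain decision tree as stand-ins (neither the WoE ensemble weighting nor functional trees are available there) and AUC as the yardstick; the factors and labels below are synthetic.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeClassifier

        rng = np.random.default_rng(8)
        X = rng.normal(size=(400, 14))               # 14 affecting factors
        y = (X[:, 0] - X[:, 5] + rng.normal(size=400) > 0).astype(int)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)

        models = {'LR': LogisticRegression(max_iter=1000),
                  'tree (FT stand-in)': DecisionTreeClassifier(max_depth=4)}
        for name, clf in models.items():
            clf.fit(X_tr, y_tr)
            auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
            print(name, round(auc, 3))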

  19. Personality, Driving Behavior and Mental Disorders Factors as Predictors of Road Traffic Accidents Based on Logistic Regression

    Science.gov (United States)

    Alavi, Seyyed Salman; Mohammadi, Mohammad Reza; Souri, Hamid; Mohammadi Kalhori, Soroush; Jannatifard, Fereshteh; Sepahbodi, Ghazal

    2017-01-01

    Background: The aim of this study was to evaluate the effect of variables such as personality traits, driving behavior and mental illness on road traffic accidents, comparing drivers with accidents and those without road crashes. Methods: In this cohort study, 800 bus and truck drivers were recruited. Participants were selected among drivers who were referred to Imam Sajjad Hospital (Tehran, Iran) during 2013-2015. The Manchester driving behavior questionnaire (MDBQ), the big five personality test (NEO personality inventory) and a semi-structured interview (schizophrenia and affective disorders scale) were used. After two years, we surveyed all accidents due to human factors that involved the recruited drivers. The data were analyzed using the SPSS software by performing descriptive statistics, t-tests, and multiple logistic regression analysis. P values less than 0.05 were considered statistically significant. Results: After controlling for the relevant demographic variables, the findings revealed significant differences between the two groups of drivers that were and were not involved in road accidents. In addition, it was found that depression and anxiety could increase the odds ratio (OR) of road accidents by 2.4- and 2.7-fold, respectively (P=0.04, P=0.004). It is noteworthy that neuroticism alone can increase the odds of road accidents by 1.1-fold (P=0.009), but other personality factors did not have a significant effect in the model. Conclusion: The results revealed that some mental disorders affect the incidence of road collisions. Considering the importance and sensitivity of driving behavior, it is necessary to evaluate multiple psychological factors influencing drivers before and after they receive or renew their driver’s license. PMID:28293047
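
    Odds ratios of the kind reported here come straight from the exponentiated coefficients of a fitted logistic regression. A Python sketch with statsmodels, on invented driver data, follows; the variable names and effect sizes are placeholders, not the study's dataset.

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm

        rng = np.random.default_rng(9)
        df = pd.DataFrame({'depression': rng.integers(0, 2, 800),
                           'anxiety': rng.integers(0, 2, 800),
                           'neuroticism': rng.normal(size=800)})
        lin = (-1.0 + 0.9 * df.depression + 1.0 * df.anxiety
               + 0.1 * df.neuroticism)
        df['accident'] = (rng.random(800) < 1 / (1 + np.exp(-lin))).astype(int)

        X = sm.add_constant(df[['depression', 'anxiety', 'neuroticism']])
        fit = sm.Logit(df['accident'], X).fit(disp=False)
        odds_ratios = np.exp(fit.params)     # ORs, cf. the reported 2.4/2.7
        or_ci = np.exp(fit.conf_int())       # 95% CIs on the OR scale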

  20. A glucose model based on support vector regression for the prediction of hypoglycemic events under free-living conditions.

    Science.gov (United States)

    Georga, Eleni I; Protopappas, Vasilios C; Ardigò, Diego; Polyzos, Demosthenes; Fotiadis, Dimitrios I

    2013-08-01

    The prevention of hypoglycemic events is of paramount importance in the daily management of insulin-treated diabetes. The use of short-term prediction algorithms for the subcutaneous (s.c.) glucose concentration may contribute significantly toward this direction. The literature suggests that, although the recent glucose profile is a prominent predictor of hypoglycemia, the overall patient context greatly impacts its accurate estimation. The objective of this study is to evaluate the performance of a support vector regression (SVR) s.c. glucose model for hypoglycemia prediction. We extend our SVR model to predict separately the nocturnal events during sleep and the non-nocturnal (i.e., diurnal) ones over 30-min and 60-min horizons, using information on the recent glucose profile, meals, insulin intake, and physical activities for a hypoglycemic threshold of 70 mg/dL. We also introduce additional variables accounting for recurrent nocturnal hypoglycemia due to antecedent hypoglycemia, exercise, and sleep. SVR predictions are compared with those from two other machine learning techniques. The method is assessed on a dataset of 15 patients with type 1 diabetes under free-living conditions. Nocturnal hypoglycemic events are predicted with 94% sensitivity for both horizons and with time lags of 5.43 min and 4.57 min, respectively. As for the diurnal events, when physical activities are not considered, the sensitivity is 92% and 96% for a 30-min and 60-min horizon, respectively, with both time lags being less than 5 min. However, when such information is introduced, the diurnal sensitivity decreases by 8% and 3%, respectively. Both nocturnal and diurnal predictions show a high (>90%) precision. The results suggest that hypoglycemia prediction using SVR can be accurate and performs better in most diurnal and nocturnal cases compared with other techniques. It is advised that the problem of hypoglycemia prediction should be handled differently for nocturnal and diurnal events.
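
    A bare-bones version of short-term glucose prediction with SVR, using only lagged glucose values (no meal, insulin, or activity inputs) and synthetic data, can be written with scikit-learn as follows; the window length, horizon, and kernel settings are assumptions for illustration.

        import numpy as np
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVR

        rng = np.random.default_rng(10)
        t = np.arange(600)                            # 5-min CGM samples
        glucose = 120 + 30 * np.sin(t / 30.0) + rng.normal(scale=5, size=600)

        H, lead = 12, 6                               # 1 h history, 30 min ahead
        X = np.array([glucose[i - H:i] for i in range(H, len(glucose) - lead)])
        y = glucose[H + lead:]

        model = make_pipeline(StandardScaler(),
                              SVR(kernel='rbf', C=10.0, epsilon=1.0)).fit(X, y)
        pred = model.predict(X[-1:])                  # 30-min-ahead estimate
        hypo_alarm = pred[0] < 70.0                   # 70 mg/dL threshold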

  1. Climate-based statistical regression models for crop yield forecasting of coffee in humid tropical Kerala, India

    Science.gov (United States)

    Jayakumar, M.; Rajavel, M.; Surendran, U.

    2016-12-01

    A study on the variability of the coffee yield of both Coffea arabica and Coffea canephora as influenced by climate parameters (rainfall (RF), maximum temperature (Tmax), minimum temperature (Tmin), and mean relative humidity (RH)) was undertaken at the Regional Coffee Research Station, Chundale, Wayanad, Kerala State, India. Analysis of 30 years of coffee yield data (1980 to 2009) revealed that coffee yield fluctuates with variations in the climatic parameters. Between the species, productivity was higher for C. canephora than for C. arabica in most years. The maximum yield of C. canephora (2040 kg ha-1) was recorded in 2003-2004, and a declining trend in yield was noticed in recent years. Similarly, the maximum yield of C. arabica (1745 kg ha-1) was recorded in 1988-1989, and decreased yields were noticed in the subsequent years until 1997-1998 due to year-to-year variability in climate. The highest correlation coefficients were found between the yield of C. arabica and maximum temperature during January (0.7) and between C. arabica yield and RH during July (0.4). The yield of C. canephora had the highest correlations with maximum temperature, RH and rainfall during February. A statistical regression model between selected climatic parameters and the yields of C. arabica and C. canephora was developed to forecast the yield of coffee in Wayanad district in Kerala. The model was validated for the years 2010, 2011, and 2012 with the coffee yield data obtained during those years, and the predictions were found to be good.

  2. Non-contact video-based vital sign monitoring using ambient light and auto-regressive models

    International Nuclear Information System (INIS)

    Tarassenko, L; Villarroel, M; Guazzi, A; Jorge, J; Clifton, D A; Pugh, C

    2014-01-01

    Remote sensing of the reflectance photoplethysmogram using a video camera typically positioned 1 m away from the patient's face is a promising method for monitoring the vital signs of patients without attaching any electrodes or sensors to them. Most of the papers in the literature on non-contact vital sign monitoring report results on human volunteers in controlled environments. We have been able to obtain estimates of heart rate and respiratory rate, and preliminary results on changes in oxygen saturation, from double-monitored patients undergoing haemodialysis in the Oxford Kidney Unit. To achieve this, we first devised a novel method of cancelling out aliased frequency components caused by artificial light flicker, using auto-regressive (AR) modelling and pole cancellation. Second, we were able to construct accurate maps of the spatial distribution of heart rate and respiratory rate information from the coefficients of the AR model. In stable sections with minimal patient motion, the mean absolute error between the camera-derived estimate of heart rate and the reference value from a pulse oximeter is similar to the mean absolute error between two pulse oximeter measurements at different sites (finger and earlobe). The activities of daily living affect the respiratory rate, but the camera-derived estimates of this parameter are at least as accurate as those derived from a thoracic expansion sensor (chest belt). During a period of obstructive sleep apnoea, we tracked changes in oxygen saturation using the ratio of normalized reflectance changes in two colour channels (red and blue), although this required calibration against the reference data from a pulse oximeter.
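
    The flicker-cancellation idea rests on the poles of a fitted AR model: each pole pair corresponds to a frequency, and poles sitting at the known alias of the mains flicker can be discarded before reading off the heart rate. A hypothetical Python sketch with statsmodels (synthetic 24 Hz "camera" signal, 50 Hz flicker aliased to 2 Hz) follows:

        import numpy as np
        from statsmodels.tsa.ar_model import AutoReg

        fs = 24.0                                    # assumed frame rate, Hz
        t = np.arange(0, 60, 1 / fs)
        alias = abs(50.0 - 2 * fs)                   # 50 Hz flicker -> 2 Hz alias
        x = (np.sin(2 * np.pi * 1.2 * t)             # ~72 bpm cardiac component
             + 0.5 * np.sin(2 * np.pi * alias * t)
             + 0.1 * np.random.default_rng(11).normal(size=t.size))

        res = AutoReg(x, lags=8).fit()
        poles = np.roots(np.r_[1.0, -res.params[1:]])   # skip the intercept
        pole_freqs = np.abs(np.angle(poles)) * fs / (2 * np.pi)
        # Poles near the known 2 Hz alias would be cancelled; the remaining
        # dominant pole frequency gives the heart-rate estimate.
        print(np.round(np.sort(pole_freqs), 2))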

  3. Personality, Driving Behavior and Mental Disorders Factors as Predictors of Road Traffic Accidents Based on Logistic Regression.

    Science.gov (United States)

    Alavi, Seyyed Salman; Mohammadi, Mohammad Reza; Souri, Hamid; Mohammadi Kalhori, Soroush; Jannatifard, Fereshteh; Sepahbodi, Ghazal

    2017-01-01

    The aim of this study was to evaluate the effect of variables such as personality traits, driving behavior and mental illness on road traffic accidents, comparing drivers with accidents and those without road crashes. In this cohort study, 800 bus and truck drivers were recruited. Participants were selected among drivers who were referred to Imam Sajjad Hospital (Tehran, Iran) during 2013-2015. The Manchester driving behavior questionnaire (MDBQ), the big five personality test (NEO personality inventory) and a semi-structured interview (schizophrenia and affective disorders scale) were used. After two years, we surveyed all accidents due to human factors that involved the recruited drivers. The data were analyzed using the SPSS software by performing descriptive statistics, t-tests, and multiple logistic regression analysis. P values less than 0.05 were considered statistically significant. After controlling for the relevant demographic variables, the findings revealed significant differences between the two groups of drivers that were and were not involved in road accidents. In addition, it was found that depression and anxiety could increase the odds ratio (OR) of road accidents by 2.4- and 2.7-fold, respectively (P=0.04, P=0.004). It is noteworthy that neuroticism alone can increase the odds of road accidents by 1.1-fold (P=0.009), but other personality factors did not have a significant effect in the model. The results revealed that some mental disorders affect the incidence of road collisions. Considering the importance and sensitivity of driving behavior, it is necessary to evaluate multiple psychological factors influencing drivers before and after they receive or renew their driver's license.

  4. Development and evaluation of a regression-based model to predict cesium-137 concentration ratios for saltwater fish

    International Nuclear Information System (INIS)

    Pinder, John E.; Rowan, David J.; Smith, Jim T.

    2016-01-01

    Data from published studies and World Wide Web sources were combined to develop a regression model to predict 137Cs concentration ratios for saltwater fish. Predictions were developed from 1) numeric trophic levels computed primarily from random resampling of known food items and 2) K concentrations in the saltwater for 65 samplings from 41 different species from both the Atlantic and Pacific Oceans. A number of different models were initially developed and evaluated for accuracy, which was assessed as the ratio of independently measured concentration ratios to those predicted by the model. In contrast to freshwater systems, where K concentrations are highly variable and are an important factor affecting fish concentration ratios, the less variable K concentrations in saltwater were relatively unimportant in affecting concentration ratios. As a result, the simplest model, which used only trophic level as a predictor, had accuracy comparable to more complex models that also included K concentrations. A test of model accuracy involving comparisons of 56 published concentration ratios from 51 species of marine fish to those predicted by the model indicated that 52 of the predicted concentration ratios were within a factor of 2 of the observed concentration ratios. - Highlights: • We developed a model to predict concentration ratios (C_r) for saltwater fish. • The model requires only a single input variable to predict C_r. • That variable is a mean numeric trophic level available at fishbase.org. • The K concentrations in seawater were not an important predictor variable. • The median observed-to-predicted ratio for 56 independently measured C_r was 0.83.

  5. Development and evaluation of a regression-based model to predict cesium concentration ratios for freshwater fish

    International Nuclear Information System (INIS)

    Pinder, John E.; Rowan, David J.; Rasmussen, Joseph B.; Smith, Jim T.; Hinton, Thomas G.; Whicker, F.W.

    2014-01-01

    Data from published studies and World Wide Web sources were combined to produce and test a regression model to predict Cs concentration ratios for freshwater fish species. The accuracies of predicted concentration ratios, which were computed using 1) species trophic levels obtained from random resampling of known food items and 2) K concentrations in the water for 207 fish from 44 species and 43 locations, were tested against independent observations of ratios for 57 fish from 17 species from 25 locations. Accuracy was assessed as the percent of observed-to-predicted ratios within factors of 2 or 3. Conservatism, expressed as the lack of underprediction, was assessed as the percent of observed-to-predicted ratios that were less than 2 or less than 3. The model's median observed-to-predicted ratio was 1.26, which was not significantly different from 1, and 50% of the ratios were between 0.73 and 1.85. The percentages of ratios within factors of 2 or 3 were 67 and 82%, respectively. The percentages of ratios that were <2 or <3 were 79 and 88%, respectively. An example for Perca fluviatilis demonstrated that increased prediction accuracy could be obtained when more detailed knowledge of diet was available to estimate trophic level. - Highlights: • We developed a model to predict Cs concentration ratios for freshwater fish species. • The model uses only two variables to predict a species CR for any location. • One variable is the K concentration in the freshwater. • The other is a species mean trophic level, easily obtained from fishbase.org. • The median observed-to-predicted ratio for 57 independent test cases was 1.26.

  6. Personality, Driving Behavior and Mental Disorders Factors as Predictors of Road Traffic Accidents Based on Logistic Regression

    Directory of Open Access Journals (Sweden)

    Seyyed Salman Alavi

    2017-01-01

    Full Text Available Background: The aim of this study was to evaluate the effect of variables such as personality traits, driving behavior and mental illness on road traffic accidents, comparing drivers with accidents and those without road crashes. Methods: In this cohort study, 800 bus and truck drivers were recruited. Participants were selected among drivers who were referred to Imam Sajjad Hospital (Tehran, Iran) during 2013-2015. The Manchester driving behavior questionnaire (MDBQ), the big five personality test (NEO personality inventory) and a semi-structured interview (SADS) were used. After two years, we surveyed all accidents due to human factors that involved the recruited drivers. The data were analyzed using the SPSS software by performing descriptive statistics, t-tests, and multiple logistic regression analysis. P values less than 0.05 were considered statistically significant. Results: After controlling for the relevant demographic variables, the findings revealed significant differences between the two groups of drivers that were and were not involved in road accidents. In addition, it was found that depression and anxiety could increase the odds ratio (OR) of road accidents by 2.4- and 2.7-fold, respectively (P=0.04, P=0.004). It is noteworthy that neuroticism alone can increase the odds of road accidents by 1.1-fold (P=0.009), but other personality factors did not have a significant effect in the model. Conclusion: The results revealed that some mental disorders affect the incidence of road collisions. Considering the importance and sensitivity of driving behavior, it is necessary to evaluate multiple psychological factors influencing drivers before and after they receive or renew their driver’s license.

  7. Use of regression-based models to map sensitivity of aquatic resources to atmospheric deposition in Yosemite National Park, USA

    Science.gov (United States)

    Clow, D. W.; Nanus, L.; Huggett, B. W.

    2010-12-01

    An abundance of exposed bedrock, sparse soil and vegetation, and fast hydrologic flushing rates make aquatic ecosystems in Yosemite National Park susceptible to nutrient enrichment and episodic acidification due to atmospheric deposition of nitrogen (N) and sulfur (S). In this study, multiple-linear regression (MLR) models were created to estimate fall-season nitrate and acid neutralizing capacity (ANC) in surface water in Yosemite wilderness. Input data included estimated winter N deposition, fall-season surface-water chemistry measurements at 52 sites, and basin characteristics derived from geographic information system layers of topography, geology, and vegetation. The MLR models accounted for 84% and 70% of the variance in surface-water nitrate and ANC, respectively. Explanatory variables (and the sign of their coefficients) for nitrate included elevation (positive) and the abundance of neoglacial and talus deposits (positive), unvegetated terrain (positive), alluvium (negative), and riparian (negative) areas in the basins. Explanatory variables for ANC included basin area (positive) and the abundance of metamorphic rocks (positive), unvegetated terrain (negative), water (negative), and winter N deposition (negative) in the basins. The MLR equations were applied to 1407 stream reaches delineated in the National Hydrography Dataset for Yosemite, and maps of predicted surface-water nitrate and ANC concentrations were created. Predicted surface-water nitrate concentrations were highest in small, high-elevation cirques, and concentrations declined downstream. Predicted ANC concentrations showed the opposite pattern, except in high-elevation areas underlain by metamorphic rocks along the Sierran Crest, which had relatively high predicted ANC (>200 µeq L-1). Maps were created to show where basin characteristics predispose aquatic resources to nutrient enrichment and acidification effects from N and S deposition. The maps can be used to help guide development of

  8. A New Predictive Model Based on the ABC Optimized Multivariate Adaptive Regression Splines Approach for Predicting the Remaining Useful Life in Aircraft Engines

    Directory of Open Access Journals (Sweden)

    Paulino José García Nieto

    2016-05-01

    Full Text Available Remaining useful life (RUL) estimation is considered one of the most central tasks in prognostics and health management (PHM). The present paper describes a nonlinear hybrid ABC–MARS-based model for the prediction of the remaining useful life of aircraft engines. Indeed, it is well known that an accurate RUL estimation allows failure prevention in a more controllable way, so that effective maintenance can be carried out in appropriate time to correct impending faults. The proposed hybrid model combines multivariate adaptive regression splines (MARS), which have been successfully adopted for regression problems, with the artificial bee colony (ABC) technique. This optimization technique involves parameter setting in the MARS training procedure, which significantly influences the regression accuracy. However, its use in reliability applications has not yet been widely explored. Bearing this in mind, remaining useful life values have been successfully predicted here by using the hybrid ABC–MARS-based model from the remaining measured parameters (input variables) for aircraft engines. A correlation coefficient equal to 0.92 was obtained when this hybrid ABC–MARS-based model was applied to experimental data. The agreement of this model with experimental data confirmed its good performance. The main advantage of this predictive model is that it does not require information about the previous operation states of the aircraft engine.
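
    MARS models are linear combinations of hinge functions max(0, x - k) and max(0, k - x). With the knot locations fixed by hand (the part that MARS, and in this paper the ABC search, would optimize), the fit reduces to least squares on the hinge basis, as this assumed-data numpy sketch shows:

        import numpy as np

        def hinge_basis(x, knots):
            cols = [np.ones_like(x)]
            for k in knots:
                cols += [np.maximum(0, x - k), np.maximum(0, k - x)]
            return np.column_stack(cols)

        rng = np.random.default_rng(12)
        x = rng.uniform(0, 10, 300)                  # e.g., a degradation index
        y = np.where(x < 4, 100 - 2 * x,
                     92 - 8 * (x - 4)) + rng.normal(size=300)   # kinked RUL

        knots = [2.0, 4.0, 6.0]                      # fixed here, searched in MARS
        coef, *_ = np.linalg.lstsq(hinge_basis(x, knots), y, rcond=None)
        rul_at_5 = hinge_basis(np.array([5.0]), knots) @ coef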

  9. Statistical parameters of random heterogeneity estimated by analysing coda waves based on finite difference method

    Science.gov (United States)

    Emoto, K.; Saito, T.; Shiomi, K.

    2017-12-01

    We analysed short-period (<2 s) and long-period (>2 s) seismograms. We found that the energy of the coda of long-period seismograms shows a spatially flat distribution. This phenomenon is well known in short-period seismograms and results from the scattering by small-scale heterogeneities. We estimate the statistical parameters that characterize the small-scale random heterogeneity by modelling the spatiotemporal energy distribution of long-period seismograms. We analyse three moderate-size earthquakes that occurred in southwest Japan. We calculate the spatial distribution of the energy density recorded by a dense seismograph network in Japan at the period bands of 8-16 s, 4-8 s and 2-4 s and model them by using 3-D finite difference (FD) simulations. Compared to conventional methods based on statistical theories, we can calculate more realistic synthetics by using the FD simulation. It is not necessary to assume a uniform background velocity, body or surface waves, or the scattering properties considered in general scattering theories. By taking the ratio of the energy of the coda area to that of the entire area, we can separately estimate the scattering and the intrinsic absorption effects. Our result reveals the spectrum of the random inhomogeneity in a wide wavenumber range, including the intensity around the corner wavenumber, as P(m) = 8πε²a³/(1 + a²m²)², where ε = 0.05 and a = 3.1 km, even though past studies analysing higher-frequency records could not detect the corner. Finally, we estimate the intrinsic attenuation by modelling the decay rate of the energy. The method proposed in this study is suitable for quantifying the statistical properties of long-wavelength subsurface random inhomogeneity, which leads the way to characterizing a wider wavenumber range of spectra, including the corner wavenumber.

  10. Comprehensive logic based analyses of Toll-like receptor 4 signal transduction pathway.

    Directory of Open Access Journals (Sweden)

    Mahesh Kumar Padwal

    Full Text Available Among the 13 TLRs in vertebrate systems, only TLR4 utilizes both the Myeloid differentiation factor 88 (MyD88) and the Toll/Interleukin-1 receptor (TIR)-domain-containing adapter interferon-β-inducing factor (TRIF) adaptors to transduce signals triggering host-protective immune responses. Earlier studies on the pathway combined various experimental data in the form of one comprehensive map of TLR signaling. But in the absence of adequate kinetic parameters, quantitative mathematical models that reveal emerging systems-level properties and dynamic inter-regulation among the kinases/phosphatases of the TLR4 network are not yet available. So, here we used a reaction stoichiometry-based and parameter-independent logical modeling formalism to build a TLR4 signaling network model that captured the feedback regulations, the interdependencies between signaling kinases and phosphatases, and the outcome of simulated infections. The analyses of the TLR4 signaling network revealed 360 feedback loops, 157 negative and 203 positive; of these, 334 loops had the phosphatase PP1 as an essential component. The network elements' interdependencies (positive or negative dependencies) in perturbation conditions such as the phosphatase knockout conditions revealed interdependencies between the dual-specific phosphatases MKP-1 and MKP-3 and the kinases in MAPK modules, and the role of PP2A in the auto-regulation of Calmodulin kinase-II. Our simulations under specific kinase or phosphatase gene-deficiency or inhibition conditions corroborated several previously reported experimental data. The simulations to mimic Yersinia pestis and E. coli infections identified the key perturbations in the network and potential drug targets. Thus, our analyses of TLR4 signaling highlight the role of phosphatases as key regulatory factors in determining the global interdependencies among the network elements; uncover novel signaling connections; and identify potential drug targets for

  11. Has Childhood Smoking Reduced Following Smoke-Free Public Places Legislation? A Segmented Regression Analysis of Cross-Sectional UK School-Based Surveys.

    Science.gov (United States)

    Katikireddi, Srinivasa Vittal; Der, Geoff; Roberts, Chris; Haw, Sally

    2016-07-01

    Smoke-free legislation has been a great success for tobacco control, but its impact on smoking uptake remains under-explored. We investigated whether trends in smoking uptake amongst adolescents differed before and after the introduction of smoke-free legislation in the United Kingdom. Prevalence estimates for regular smoking were obtained from representative school-based surveys for the four countries of the United Kingdom. Post-intervention status was represented using a dummy variable and, to allow for a change in trend, the number of years since implementation was included. To estimate the association between smoke-free legislation and adolescent smoking, the percentage of regular smokers was modeled using linear regression adjusted for trends over time and country. All models were stratified by age (13 and 15 years) and sex. For 15-year-old girls, the implementation of smoke-free legislation in the United Kingdom was associated with a 4.3% reduction in the prevalence of regular smoking (P = .029). In addition, regular smoking fell by an additional 1.5% per annum post-legislation in this group (P = .005). Among 13-year-old girls, there was a reduction of 2.8% in regular smoking (P = .051), with no evidence of a change in trend post-legislation. Smaller and nonsignificant reductions in regular smoking were observed for 15- and 13-year-old boys (P = .175 and P = .113, respectively). Smoke-free legislation may help reduce smoking uptake amongst teenagers, with stronger evidence for an association seen in females. Further research that analyses longitudinal data across more countries is required. Previous research has established that smoke-free legislation has led to many improvements in population health, including reductions in heart attack, stroke, and asthma. However, the impacts of smoke-free legislation on the rates of smoking amongst children have been less investigated. Analysis of repeated cross-sectional surveys across the four countries of the United Kingdom
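
    The segmented (interrupted time series) regression described here has three ingredients: an underlying time trend, a step dummy for the post-legislation period, and a years-since-implementation term for the change in trend. A small statsmodels sketch on made-up annual prevalence data illustrates the setup:

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        years = np.arange(2000, 2015)                # hypothetical survey years
        post = (years >= 2007).astype(int)           # legislation in 2007
        yrs_since = np.clip(years - 2007, 0, None)
        rng = np.random.default_rng(13)
        prev = (22 - 0.5 * (years - 2000) - 4.3 * post - 1.5 * yrs_since
                + rng.normal(scale=0.5, size=years.size))

        df = pd.DataFrame({'prev': prev, 'trend': years - 2000,
                           'post': post, 'yrs_since': yrs_since})
        fit = smf.ols('prev ~ trend + post + yrs_since', data=df).fit()
        print(fit.params)   # 'post' = level change, 'yrs_since' = trend change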

  12. Improving sensitivity of linear regression-based cell type-specific differential expression deconvolution with per-gene vs. global significance threshold.

    Science.gov (United States)

    Glass, Edmund R; Dozmorov, Mikhail G

    2016-10-06

    The goal of many human disease-oriented studies is to detect molecular mechanisms that differ between healthy controls and patients. Yet commonly used gene expression measurements from blood samples suffer from variability in cell composition. This variability hinders the detection of differentially expressed genes and is often ignored. Combined with cell counts, heterogeneous gene expression may provide deeper insights into gene expression differences at the cell type-specific level. Published computational methods use linear regression to estimate cell type-specific differential expression and a global cutoff to judge significance, such as the False Discovery Rate (FDR). Yet they do not consider many artifacts hidden in high-dimensional gene expression data that may negatively affect linear regression. In this paper we quantify the parameter space affecting the performance of linear regression (sensitivity of cell type-specific differential expression detection) on a per-gene basis. We evaluated the effects of sample size, cell type-specific proportion variability, and mean squared error on the sensitivity of cell type-specific differential expression detection using linear regression. Each parameter affected the variability of cell type-specific expression estimates and, subsequently, the sensitivity of differential expression detection. We provide the R package LRCDE, which performs linear regression-based cell type-specific differential expression (deconvolution) detection on a gene-by-gene basis. Accounting for variability around cell type-specific gene expression estimates, it computes per-gene t-statistics of differential detection, p-values, t-statistic-based sensitivity, group-specific mean squared error, and several gene-specific diagnostic metrics. The sensitivity of linear regression-based cell type-specific differential expression detection differed for each gene as a function of mean squared error, per-group sample sizes, and variability of the proportions.

  13. Voxel-based morphometry analyses of in-vivo MRI in the aging mouse lemur primate

    Directory of Open Access Journals (Sweden)

    Stephen John Sawiak

    2014-05-01

    Full Text Available Cerebral atrophy is one of the most widespread brain alterations associated with aging. A clear relationship has been established between age-associated cognitive impairments and cerebral atrophy. The mouse lemur (Microcebus murinus) is a small primate used as a model of age-related neurodegenerative processes. It is the first nonhuman primate in which cerebral atrophy has been correlated with cognitive deficits. Previous studies of cerebral atrophy in this model were based on time-consuming manual delineation or measurement of selected brain regions from magnetic resonance images (MRI). These measures could not be used to analyse regions that cannot be easily outlined, such as the nucleus basalis of Meynert or the subiculum. In humans, morphometric assessment of structural changes with age is generally performed with automated procedures such as voxel-based morphometry (VBM). The objective of our work was to perform user-independent assessment of age-related morphological changes in the whole brain of large mouse lemur populations using VBM. The study was based on the SPMMouse toolbox of SPM 8 and involved thirty mouse lemurs aged from 1.9 to 11.3 years. The automatic method revealed for the first time atrophy in regions where manual delineation is prohibitive (nucleus basalis of Meynert, subiculum, prepiriform cortex, Brodmann areas 13-16, hypothalamus, putamen, thalamus, corpus callosum). Some of these regions are described as particularly sensitive to age-associated alterations in humans. The method also revealed age-associated atrophy in cortical regions (cingulate, occipital, parietal), the nucleus septalis, and the caudate. Manual measures performed in some of these regions were in good agreement with the results from automatic measures. The templates generated in this study, as well as the toolbox for SPM8, can be downloaded. These tools will be valuable for future evaluation of various treatments that are tested to modulate cerebral aging in lemurs.

  14. [State Recognition of Solid Fermentation Process Based on Near Infrared Spectroscopy with Adaboost and Spectral Regression Discriminant Analysis].

    Science.gov (United States)

    Yu, Shuang; Liu, Guo-hai; Xia, Rong-sheng; Jiang, Hui

    2016-01-01

    In order to achieve rapid monitoring of the process state of solid-state fermentation (SSF), this study attempted qualitative identification of the process state of the SSF of feed protein by use of the Fourier transform near infrared (FT-NIR) spectroscopy analysis technique. More specifically, FT-NIR spectroscopy combined with the Adaboost-SRDA-NN integrated learning algorithm was used as an analysis tool to accurately and rapidly monitor chemical and physical changes in the SSF of feed protein without the need for chemical analysis. First, the raw spectra of all 140 fermentation samples were collected with a Fourier transform near infrared spectrometer (Antaris II), and the raw spectra obtained were preprocessed with the standard normal variate transformation (SNV) spectral preprocessing algorithm. Thereafter, the characteristic information of the preprocessed spectra was extracted by spectral regression discriminant analysis (SRDA). Finally, the nearest neighbors (NN) algorithm was selected as the basic classifier, and a state recognition model was built to identify the different fermentation samples in the validation set. Experimental results showed that the SRDA-NN model revealed superior performance compared with two other NN models, which were developed using the feature information from principal component analysis (PCA) and linear discriminant analysis (LDA), and the correct recognition rate of the SRDA-NN model reached 94.28% in the validation set. In this work, in order to further improve the recognition accuracy of the final model, the Adaboost-SRDA-NN ensemble learning algorithm was proposed by integrating the Adaboost and SRDA-NN methods, and the presented algorithm was used to construct the online monitoring model of the process state of the SSF of feed protein. Experimental results showed that the prediction performance of the SRDA-NN model was further enhanced by the AdaBoost boosting algorithm, and the correct recognition rate was further improved.

  15. Long Term Association of Tropospheric Trace gases over Pakistan by exploiting satellite observations and development of Econometric Regression based Model

    Science.gov (United States)

    Zeb, Naila; Fahim Khokhar, Muhammad; Khan, Saud Ahmed; Noreen, Asma; Murtaza, Rabbia

    2017-04-01

    ...Furthermore, to explore causal relations, regression analysis is employed to estimate a model for CO and TOC. This model numerically estimates the long-term association of trace gases over the region.

  16. Personal, social, and game-related correlates of active and non-active gaming among Dutch gaming adolescents: survey-based multivariable, multilevel logistic regression analyses

    NARCIS (Netherlands)

    Simons, Monique; de Vet, Emely; Chinapaw, Mai Jm; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-01-01

    BACKGROUND: Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games, active games, seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents.

  17. Vanadium NMR Chemical Shifts of (Imido)vanadium(V) Dichloride Complexes with Imidazolin-2-iminato and Imidazolidin-2-iminato Ligands: Cooperation with Quantum-Chemical Calculations and Multiple Linear Regression Analyses.

    Science.gov (United States)

    Yi, Jun; Yang, Wenhong; Sun, Wen-Hua; Nomura, Kotohiro; Hada, Masahiko

    2017-11-30

    The NMR chemical shifts of vanadium (51V) in (imido)vanadium(V) dichloride complexes with imidazolin-2-iminato and imidazolidin-2-iminato ligands were calculated by the density functional theory (DFT) method with GIAO. The calculated 51V NMR chemical shifts were analyzed by the multiple linear regression analysis (MLRA) method with a series of calculated molecular properties. Some of the calculated NMR chemical shifts were incorrect when the optimized molecular geometries of the X-ray structures were used. After the global minimum geometries of all of the molecules were determined, the trend of the observed chemical shifts was well reproduced by the present DFT method. The MLRA method was performed to investigate the correlation between the 51V NMR chemical shift and the natural charge, band energy gap, and Wiberg bond index of the V═N bond. The 51V NMR chemical shifts obtained with the present MLR model were reproduced well, with a correlation coefficient of 0.97.
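
    The MLRA step amounts to an ordinary least-squares fit of the observed shift against a few computed descriptors. A hedged sketch with synthetic numbers follows; the descriptor values and coefficients are invented for illustration, and only the three descriptor names come from the abstract.

```python
import numpy as np

# Synthetic descriptor matrix: natural charge, band energy gap (eV),
# and Wiberg bond index of the V=N bond for a handful of complexes.
rng = np.random.default_rng(1)
n = 12
X = np.column_stack([
    rng.uniform(0.3, 0.9, n),   # natural charge (illustrative range)
    rng.uniform(2.0, 4.0, n),   # band energy gap
    rng.uniform(1.5, 2.5, n),   # Wiberg bond index of the V=N bond
])
true_beta = np.array([-250.0, 40.0, -120.0])
delta_51V = -500.0 + X @ true_beta + rng.normal(0, 5, n)  # synthetic 51V shifts (ppm)

# Multiple linear regression via least squares.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, delta_51V, rcond=None)
pred = A @ coef

# Correlation coefficient between fitted and observed shifts.
r = np.corrcoef(pred, delta_51V)[0, 1]
print(f"R = {r:.3f}")
```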

  18. Predicted effect size of lisdexamfetamine treatment of attention deficit/hyperactivity disorder (ADHD) in European adults: Estimates based on indirect analysis using a systematic review and meta-regression analysis.

    Science.gov (United States)

    Fridman, M; Hodgkins, P S; Kahle, J S; Erder, M H

    2015-06-01

    There are few approved therapies for adults with attention-deficit/hyperactivity disorder (ADHD) in Europe. Lisdexamfetamine (LDX) is an effective treatment for ADHD; however, no clinical trials examining the efficacy of LDX specifically in European adults have been conducted. Therefore, to estimate the efficacy of LDX in European adults we performed a meta-regression of existing clinical data. A systematic review identified US- and Europe-based randomized efficacy trials of LDX, atomoxetine (ATX), or osmotic-release oral system methylphenidate (OROS-MPH) in children/adolescents and adults. A meta-regression model was then fitted to the published/calculated effect sizes (Cohen's d) using medication, geographical location, and age group as predictors. The LDX effect size in European adults was extrapolated from the fitted model. Sensitivity analyses performed included using adult-only studies and adding studies with placebo designs other than a standard pill-placebo design. Twenty-two of 2832 identified articles met inclusion criteria. The model-estimated effect size of LDX for European adults was 1.070 (95% confidence interval: 0.738, 1.401), larger than the 0.8 threshold for large effect sizes. The overall model fit was adequate (80%) and stable in the sensitivity analyses. This model predicts that LDX may have a large treatment effect size in European adults with ADHD. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
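
    A meta-regression of this kind fits study-level effect sizes on moderators, weighting each study by the inverse of its effect-size variance, and then predicts the effect for an unobserved cell. A minimal sketch with fabricated study rows follows; all numbers, and the choice of a fixed-effect weighted least-squares fit in statsmodels, are my assumptions — only the moderators medication, location, and age group come from the abstract.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Fabricated study-level data: Cohen's d, its variance, and three moderators.
studies = pd.DataFrame({
    "d":   [0.9, 1.1, 0.6, 0.5, 0.8, 0.4, 1.0, 0.7],
    "var": [0.02, 0.03, 0.02, 0.04, 0.03, 0.02, 0.05, 0.03],
    "med": ["LDX", "LDX", "ATX", "ATX", "MPH", "MPH", "LDX", "ATX"],
    "loc": ["US", "US", "EU", "US", "EU", "EU", "US", "EU"],
    "age": ["adult", "child", "adult", "adult", "child", "adult", "child", "child"],
})

# Inverse-variance weighted meta-regression on the moderators.
model = smf.wls("d ~ C(med) + C(loc) + C(age)", data=studies,
                weights=1.0 / studies["var"]).fit()
print(model.params)

# The quantity of interest is the model prediction for an unobserved cell,
# e.g. LDX in European adults.
new = pd.DataFrame({"med": ["LDX"], "loc": ["EU"], "age": ["adult"]})
print(model.predict(new))
```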

  19. Equações de regressão para estimar valores energéticos do grão de trigo e seus subprodutos para frangos de corte, a partir de análises químicas Regression equations to evaluate the energy values of wheat grain and its by-products for broiler chickens from chemical analyses

    Directory of Open Access Journals (Sweden)

    F.M.O. Borges

    2003-12-01

    One experiment was run with broiler chickens to obtain prediction equations for metabolizable energy (ME) based on feedstuff chemical analyses, and to determine the ME of wheat grain and its by-products using four different methodologies. Seven wheat products were used in five treatments: wheat grain, wheat germ, white wheat flour, dark wheat flour, wheat bran for human use, wheat bran for animal use and rough wheat bran. Based on chemical analyses of crude fiber (CF), ether extract (EE), crude protein (CP), ash (AS) and starch (ST) of the feeds and the determined values of apparent energy (MEA), true energy (MEV), apparent corrected energy (MEAn) and true energy corrected by nitrogen balance (MEVn) in the five treatments, prediction equations were obtained using the stepwise procedure. CF showed the best relationship with metabolizable energy values; however, this variable alone was not enough for a good estimate of the energy values (R² below 0.80). When EE and CP were included in the equations, R² increased to 0.90 or higher in most estimates. When the equations were calculated with all treatments, the equations for MEA were less precise and R² decreased. When ME data of the traditional or force-feeding methods were used separately, the precision of the equations increased (R² higher than 0.85). For MEV and MEVn values, the best multiple linear equations included CF, EE and CP (R² > 0.90), whether using all experimental data or separating by methodology. The estimates of MEVn values showed high precision, and the linear coefficients (a) of the equations were similar for all treatments and methodologies, which explains the small influence of the different methodologies on this parameter. NDF was not a better predictor of ME than CF.

  20. Ecology of Subglacial Lake Vostok (Antarctica), Based on Metagenomic/Metatranscriptomic Analyses of Accretion Ice

    Directory of Open Access Journals (Sweden)

    Tom D'Elia

    2013-03-01

    Full Text Available Lake Vostok is the largest of the nearly 400 subglacial Antarctic lakes and has been continuously buried by glacial ice for 15 million years. Extreme cold, heat (from possible hydrothermal activity), pressure (from the overriding glacier) and dissolved oxygen (delivered by melting meteoric ice), in addition to limited nutrients and complete darkness, combine to produce one of the most extreme environments on Earth. Metagenomic/metatranscriptomic analyses of ice that accreted over a shallow embayment and over the southern main lake basin indicate the presence of thousands of species of organisms (94% Bacteria, 6% Eukarya, and two Archaea). The predominant bacterial sequences were closest to those from species of Firmicutes, Proteobacteria and Actinobacteria, while the predominant eukaryotic sequences were most similar to those from species of ascomycetous and basidiomycetous Fungi. Based on the sequence data, the lake appears to contain a mixture of autotrophs and heterotrophs capable of performing nitrogen fixation, nitrogen cycling, carbon fixation and nutrient recycling. Sequences closest to those of psychrophiles and thermophiles indicate a cold lake with possible hydrothermal activity. Sequences most similar to those from marine and aquatic species suggest the presence of marine and freshwater regions.

  1. Loss of Flow Accident (LOFA) analyses using LabView-based NRR simulator

    Energy Technology Data Exchange (ETDEWEB)

    Arafa, Amany Abdel Aziz; Saleh, Hassan Ibrahim [Atomic Energy Authority, Cairo (Egypt). Radiation Engineering Dept.; Ashoub, Nagieb [Atomic Energy Authority, Cairo (Egypt). Reactor Physics Dept.

    2016-12-15

    This paper presents a generic Loss of Flow Accident (LOFA) scenario module, integrated into the LabView-based simulator, that imitates the behavior of a Nuclear Research Reactor (NRR) under different user-defined LOFA scenarios. It also provides analyses of a LOFA in a single fuel channel and its impact on operational transactions and on the behavior of the reactor. The generic LOFA scenario module includes the graphs needed to clarify the effects of the LOFA under study. Furthermore, the percentage of the loss of mass flow rate, the mode of flow reduction, and the start time and transient time of the LOFA are user-defined, adding flexibility to the LOFA scenarios. The objective of integrating such a generic LOFA module is to be able to deal with such incidents and avoid their significant effects. It is also useful for developing expertise in this area and for reducing operator training and simulation costs. The results of the implemented generic LOFA module agree well with those of the COBRA-IIIC code and the earlier guidebook for this series of transients.
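
    The user-defined quantities listed above suggest a small scenario structure. The following is a hypothetical sketch of how such a LOFA scenario could be parameterized; the class, field names, and decay shape are my own illustration, not the simulator's actual interface.

```python
import math
from dataclasses import dataclass
from enum import Enum

class FlowReductionMode(Enum):
    LINEAR = "linear"            # flow ramps down linearly
    EXPONENTIAL = "exponential"  # flow decays exponentially (e.g., pump coast-down)

@dataclass
class LofaScenario:
    loss_fraction: float      # fraction of nominal mass flow rate lost (0-1)
    mode: FlowReductionMode   # shape of the flow reduction
    start_time_s: float       # when the LOFA begins
    transient_time_s: float   # how long the flow takes to reach its final value

    def flow_factor(self, t: float) -> float:
        """Nominal-flow multiplier at time t for this scenario."""
        if t < self.start_time_s:
            return 1.0
        x = min((t - self.start_time_s) / self.transient_time_s, 1.0)
        if self.mode is FlowReductionMode.LINEAR:
            return 1.0 - self.loss_fraction * x
        # exponential decay toward the final flow level
        return (1.0 - self.loss_fraction) + self.loss_fraction * math.exp(-5.0 * x)

scenario = LofaScenario(0.5, FlowReductionMode.LINEAR,
                        start_time_s=10.0, transient_time_s=30.0)
print(scenario.flow_factor(25.0))  # 0.75: halfway through a 50% linear flow loss
```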

  2. TAXONOMY AND GENETIC RELATIONSHIPS OF PANGASIIDAE, ASIAN CATFISHES, BASED ON MORPHOLOGICAL AND MOLECULAR ANALYSES

    Directory of Open Access Journals (Sweden)

    Rudhy Gustiano

    2007-12-01

    Full Text Available Pangasiids are economically important riverine catfishes generally residing in freshwater from the Indian subcontinent to the Indonesian Archipelago. The systematics of this family are still poorly known, and the lack of such basic information impedes the understanding of the biology of the Pangasiids, the study of their aquaculture potential, and the improvement of seed production and growth performance. The objectives of the present study were to clarify the phylogeny of this family based on a biometric analysis and molecular evidence using 12S ribosomal mtDNA on a total of 1070 specimens. The study revealed that 28 species are recognized as valid in Pangasiidae. Four genera are also recognized (Helicophagus Bleeker 1858, Pangasianodon Chevey 1930, Pteropangasius Fowler 1937, and Pangasius Valenciennes 1840) instead of the two reported by previous workers. The phylogenetic analysis demonstrated the recognized genera and the genetic relationships among taxa. Overall, trees from the different analyses show similar topologies and confirm the hypothesis derived from geological history, palaeontology, and similar models in other taxa of fishes from the same area. The oldest genus may already have existed when the Asian mainland was still connected to the islands in the southern part about 20 million years ago.

  3. Historical Weathering Based on Chemical Analyses of Two Spodosols in Southern Sweden

    International Nuclear Information System (INIS)

    Melkerud, Per-Arne; Bain, Derek C.; Olsson, Mats T.

    2003-01-01

    Chemical weathering losses were calculated for two conifer stands in relation to ongoing studies on the effects of liming and ash amendments on chemical status, soil solution chemistry and soil genesis. Weathering losses were based on elemental depletion trends in soil profiles since deglaciation and exposure to the weathering environment. Gradients in total geochemical composition were assumed to reflect alteration over time. The study sites were Horroed and Hassloev in southern Sweden; both are located on sandy loamy Weichselian till, at altitudes of 85 and 190 m a.s.l., respectively. Aliquots from volume-determined samples from a number of soil levels were fused with lithium metaborate, dissolved in HNO3, and analysed by ICP-AES. Results indicated the highest cumulative weathering losses at Hassloev. The weathering losses for the elements are in the following order: Si > Al > K > Na > Ca > Mg. Total annual losses for Ca+Mg+K+Na, expressed in mmolc m⁻² yr⁻¹, amounted to c. 28 and 58 at Horroed and Hassloev, respectively. Variations between study sites could not be explained by differences in bulk density, geochemistry or mineralogy. The accumulated weathering losses since deglaciation were larger in the uppermost 15 cm than in deeper B horizons for most elements studied.

  4. Aleatoric and epistemic uncertainties in sampling based nuclear data uncertainty and sensitivity analyses

    International Nuclear Information System (INIS)

    Zwermann, W.; Krzykacz-Hausmann, B.; Gallner, L.; Klein, M.; Pautz, A.; Velkov, K.

    2012-01-01

    Sampling based uncertainty and sensitivity analyses due to epistemic input uncertainties, i.e. to an incomplete knowledge of uncertain input parameters, can be performed with arbitrary application programs to solve the physical problem under consideration. For the description of steady-state particle transport, direct simulations of the microscopic processes with Monte Carlo codes are often used. This introduces an additional source of uncertainty, the aleatoric sampling uncertainty, which is due to the randomness of the simulation process performed by sampling, and which adds to the total combined output sampling uncertainty. So far, this aleatoric part of the uncertainty has been minimized by running a sufficiently large number of Monte Carlo histories for each sample calculation, thus making its impact negligible compared to the impact from sampling the epistemic uncertainties. Obviously, this process may incur high computational costs. The present paper shows that in many applications reliable epistemic uncertainty results can also be obtained with substantially lower computational effort by performing and analyzing two appropriately generated series of samples, each with a much smaller number of Monte Carlo histories. The method is applied, along with the nuclear data uncertainty and sensitivity code package XSUSA and the Monte Carlo transport code KENO-Va, to various critical assemblies and a full-scale reactor calculation. It is shown that the proposed method yields output uncertainties and sensitivities equivalent to the traditional approach, with a reduction of computing time by factors on the order of 100. (authors)
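
    One way to realize the two-series idea is to evaluate each epistemic sample twice with independent Monte Carlo seeds: the covariance between the paired outputs then estimates the epistemic variance, while the residual scatter is aleatoric simulation noise. The toy sketch below illustrates that decomposition; the model function and all numbers are invented, and since the abstract does not spell out the estimator, this is one plausible reading rather than the paper's method.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 200   # epistemic samples of uncertain input parameters
sigma_mc = 0.05   # aleatoric noise from a deliberately short MC run

# Epistemic input uncertainty: e.g., a sampled cross-section perturbation.
theta = rng.normal(1.0, 0.02, n_samples)

def mc_transport(theta_i):
    """Stand-in for a Monte Carlo transport run: true response plus statistical noise."""
    return 0.9 * theta_i + rng.normal(0.0, sigma_mc)

# Two series: same epistemic samples, independent MC histories.
y1 = np.array([mc_transport(t) for t in theta])
y2 = np.array([mc_transport(t) for t in theta])

# Covariance of the paired series isolates the epistemic variance;
# the MC noise is independent between series and averages out.
var_epistemic = np.cov(y1, y2)[0, 1]
var_total = 0.5 * (y1.var(ddof=1) + y2.var(ddof=1))
var_aleatoric = var_total - var_epistemic
print(f"epistemic std ~ {np.sqrt(max(var_epistemic, 0)):.4f} (true 0.018)")
print(f"aleatoric std ~ {np.sqrt(max(var_aleatoric, 0)):.4f} (true {sigma_mc})")
```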

  5. Neural Spike-Train Analyses of the Speech-Based Envelope Power Spectrum Model

    Science.gov (United States)

    Rallapalli, Varsha H.

    2016-01-01

    Diagnosing and treating hearing impairment is challenging because people with similar degrees of sensorineural hearing loss (SNHL) often have different speech-recognition abilities. The speech-based envelope power spectrum model (sEPSM) has demonstrated that the signal-to-noise ratio in the envelope domain (SNRenv), computed from a modulation filter bank, provides a robust speech-intelligibility measure across a wider range of degraded conditions than many long-standing models. In the sEPSM, noise (N) is assumed to (a) reduce the S+N envelope power by filling in dips within clean speech (S) and (b) introduce an envelope noise floor from intrinsic fluctuations in the noise itself. While the promise of SNRenv has been demonstrated for normal-hearing listeners, it has not been thoroughly extended to hearing-impaired listeners because of limited physiological knowledge of how SNHL affects speech-in-noise envelope coding relative to noise alone. Here, envelope coding of speech-in-noise stimuli was quantified from auditory-nerve model spike trains using shuffled correlograms, which were analyzed in the modulation-frequency domain to compute modulation-band estimates of neural SNRenv. Preliminary spike-train analyses show strong similarities to the sEPSM, demonstrating the feasibility of neural SNRenv computations. Results suggest that individual differences can occur based on differential degrees of outer- and inner-hair-cell dysfunction in listeners currently diagnosed into the single audiological SNHL category. The predicted acoustic-SNR dependence in individual differences suggests that the SNR-dependent rate of susceptibility could be an important metric in diagnosing individual differences. Future measurements of the neural SNRenv in animal studies with various forms of SNHL will provide valuable insight for understanding individual differences in speech-in-noise intelligibility.
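
    The sEPSM's core quantity compares the envelope power of speech-plus-noise to that of noise alone within each modulation band, SNRenv = (P_env,S+N − P_env,N) / P_env,N. Below is a rough sketch of that computation on synthetic signals, using a Hilbert envelope and a single modulation band; the filter design and all parameters are illustrative choices, not the published model.

```python
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
rng = np.random.default_rng(0)

# Synthetic "speech" (4 Hz amplitude-modulated tone) and additive noise.
speech = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
noise = rng.normal(0, 0.5, t.size)

def env_power(x, band=(2, 8)):
    """AC envelope power of x within one modulation band (Hz)."""
    env = np.abs(hilbert(x))
    env = env - env.mean()  # discard the DC component
    sos = butter(2, band, btype="bandpass", fs=fs, output="sos")
    return np.mean(sosfiltfilt(sos, env) ** 2)

p_sn = env_power(speech + noise)
p_n = env_power(noise)
snr_env = (p_sn - p_n) / p_n
print(f"SNRenv in the 2-8 Hz modulation band: {snr_env:.2f}")
```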

  6. Analysing the operative experience of basic surgical trainees in Ireland using a web-based logbook

    LENUS (Irish Health Repository)

    Lonergan, Peter E

    2011-09-25

    Abstract Background There is concern about the adequacy of operative exposure in surgical training programmes, in the context of changing work practices. We aimed to quantify the operative exposure of all trainees on the National Basic Surgical Training (BST) programme in Ireland and compare the results with arbitrary training targets. Methods Retrospective analysis of data obtained from a web-based logbook (http://www.elogbook.org) for all general surgery and orthopaedic training posts between July 2007 and June 2009. Results 104 trainees recorded 23,918 operations across two 6-month general surgery posts. The most common general surgery operation performed was simple skin excision, with trainees performing an average of 19.7 (± 9.9) over the 2-year training programme. Trainees most frequently assisted with cholecystectomy, with an average of 16.0 (± 11.0) per trainee. Comparison of trainee operative experience to arbitrary training targets found that 2-38% of trainees achieved the targets for 9 emergency index operations and 24-90% of trainees achieved the targets for 8 index elective operations. 72 trainees also completed a 6-month post in orthopaedics and recorded 7,551 operations. The most common orthopaedic operation that trainees performed was removal of metal, with an average of 2.90 (± 3.27) per trainee. The most common orthopaedic operation that trainees assisted with was total hip replacement, with an average of 10.46 (± 6.21) per trainee. Conclusions A centralised web-based logbook provides valuable data to analyse training programme performance. Analysis of logbooks raises concerns about operative experience at junior trainee level. The provision of adequate operative exposure for trainees should be a key performance indicator for training programmes.

  7. [Research on fast classification based on LIBS technology and principal component analysis].

    Science.gov (United States)

    Yu, Qi; Ma, Xiao-Hong; Wang, Rui; Zhao, Hua-Feng

    2014-11-01

    Laser-induced breakdown spectroscopy (LIBS) and principal component analysis (PCA) were combined to study aluminum alloy classification in the present article. Classification experiments were done on thirteen different kinds of standard aluminum alloy samples belonging to 4 different types, and the results suggested that the LIBS-PCA method can be used for fast classification of aluminum alloys. PCA was used to analyze the spectral data from the LIBS experiments: the three principal components contributing the most were identified, the principal component scores of the spectra were calculated, and the scores were plotted in three-dimensional coordinates. It was found that the spectrum sample points cluster clearly according to the type of aluminum alloy they belong to. This result established the three principal components and a preliminary aluminum alloy type zoning. In order to verify its accuracy, 20 different aluminum alloy samples were used in the same experiments to test the type zoning. The experimental results showed that the spectrum sample points all fell in the corresponding area of their aluminum alloy type, which proved the correctness of the earlier type zoning derived from the standard samples. On this basis, identification of an unknown type of aluminum alloy can be performed. All the experimental results showed that the accuracy of the principal component analysis method based on laser-induced breakdown spectroscopy is more than 97.14%, and that it can classify the different types effectively. Compared to commonly used chemical methods, laser-induced breakdown spectroscopy can detect the sample in situ and quickly, with little sample preparation; therefore, combining LIBS and PCA in areas such as quality testing and on-line industrial control can save a lot of time and cost, and greatly improve the efficiency of detection.
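
    The classification scheme reduces each LIBS spectrum to three principal-component scores and assigns a class by where the scores fall. A compact sketch with synthetic spectra using scikit-learn follows; the data and class structure are fabricated, and the nearest-neighbor rule stands in for the paper's manual "type zoning".

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Synthetic LIBS spectra: 4 alloy types with different line intensities.
n_per_type, n_channels = 35, 800
centers = rng.random((4, n_channels))
X = np.vstack([c + 0.05 * rng.standard_normal((n_per_type, n_channels))
               for c in centers])
y = np.repeat(np.arange(4), n_per_type)

# Project onto the three leading principal components.
pca = PCA(n_components=3)
scores = pca.fit_transform(X)
print("variance explained:", pca.explained_variance_ratio_.round(3))

# Classify a new sample by proximity in the 3-D score space.
clf = KNeighborsClassifier(n_neighbors=5).fit(scores, y)
x_new = centers[2] + 0.05 * rng.standard_normal(n_channels)
print("predicted type:", clf.predict(pca.transform(x_new[None, :])))
```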

  8. Data-driven method based on particle swarm optimization and k-nearest neighbor regression for estimating capacity of lithium-ion battery

    International Nuclear Information System (INIS)

    Hu, Chao; Jain, Gaurav; Zhang, Puqiang; Schmidt, Craig; Gomadam, Parthasarathy; Gorka, Tom

    2014-01-01

    Highlights: • We develop a data-driven method for the battery capacity estimation. • Five charge-related features that are indicative of the capacity are defined. • The kNN regression model captures the dependency of the capacity on the features. • Results with 10 years’ continuous cycling data verify the effectiveness of the method. - Abstract: Reliability of lithium-ion (Li-ion) rechargeable batteries used in implantable medical devices has been recognized as of high importance from a broad range of stakeholders, including medical device manufacturers, regulatory agencies, physicians, and patients. To ensure Li-ion batteries in these devices operate reliably, it is important to be able to assess the battery health condition by estimating the battery capacity over the life-time. This paper presents a data-driven method for estimating the capacity of Li-ion battery based on the charge voltage and current curves. The contributions of this paper are three-fold: (i) the definition of five characteristic features of the charge curves that are indicative of the capacity, (ii) the development of a non-linear kernel regression model, based on the k-nearest neighbor (kNN) regression, that captures the complex dependency of the capacity on the five features, and (iii) the adaptation of particle swarm optimization (PSO) to finding the optimal combination of feature weights for creating a kNN regression model that minimizes the cross validation (CV) error in the capacity estimation. Verification with 10 years’ continuous cycling data suggests that the proposed method is able to accurately estimate the capacity of Li-ion battery throughout the whole life-time
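
    The method's key move is to scale each of the five features by a learned weight before the kNN distance computation, with PSO searching the weight space to minimize cross-validation error. A condensed sketch follows; the feature data are synthetic, and the PSO settings and the use of scikit-learn's KNeighborsRegressor are my choices, not the paper's implementation.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(3)

# Synthetic data set: 5 charge-curve features vs. battery capacity.
n = 120
X = rng.random((n, 5))
capacity = 1.0 - 0.5 * X[:, 0] + 0.2 * X[:, 3] + 0.02 * rng.standard_normal(n)

def cv_error(weights):
    """Cross-validation MSE of a kNN regressor on weight-scaled features."""
    scores = cross_val_score(KNeighborsRegressor(n_neighbors=5),
                             X * weights, capacity, cv=5,
                             scoring="neg_mean_squared_error")
    return -scores.mean()

# Minimal particle swarm over the 5 feature weights.
n_particles, n_iter = 20, 40
pos = rng.random((n_particles, 5))
vel = np.zeros_like(pos)
pbest, pbest_err = pos.copy(), np.array([cv_error(p) for p in pos])
gbest = pbest[pbest_err.argmin()].copy()
for _ in range(n_iter):
    r1, r2 = rng.random((2, n_particles, 5))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 0.0, 1.0)
    err = np.array([cv_error(p) for p in pos])
    improved = err < pbest_err
    pbest[improved], pbest_err[improved] = pos[improved], err[improved]
    gbest = pbest[pbest_err.argmin()].copy()

print("optimal feature weights:", gbest.round(2))
print("CV MSE:", pbest_err.min().round(5))
```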

  9. Comprehensive preference optimization of an irreversible thermal engine using pareto based mutable smart bee algorithm and generalized regression neural network

    DEFF Research Database (Denmark)

    Mozaffari, Ahmad; Gorji-Bandpy, Mofid; Samadian, Pendar

    2013-01-01

    Optimization and control of complex engineering systems has attracted increasing interest from numerous scientists. A variety of intelligent optimizing and controlling techniques, such as neural networks, fuzzy logic, game theory, support vector machines … and stochastic algorithms, have been proposed to facilitate control of engineering systems. In this study, an extended version of the mutable smart bee algorithm (MSBA), called Pareto based mutable smart bee (PBMSB), is proposed to cope with multi-objective problems. Besides, a set of benchmark problems and four well-known Pareto based optimizing algorithms, i.e. the multi-objective bee algorithm (MOBA), the multi-objective particle swarm optimization (MOPSO) algorithm, the non-dominated sorting genetic algorithm (NSGA-II), and the strength Pareto evolutionary algorithm (SPEA 2), are utilized to confirm the acceptable …

  10. Airway management education: simulation based training versus non-simulation based training - a systematic review and meta-analyses.

    Science.gov (United States)

    Sun, Yanxia; Pan, Chuxiong; Li, Tianzuo; Gan, Tong J

    2017-02-01

    Simulation-based training (SBT) has become a standard for medical education. However, the efficacy of simulation-based training in airway management education remains unclear. The aim of this study was to evaluate all published evidence comparing the effectiveness of SBT for airway management with non-simulation based training (NSBT) on learner and patient outcomes. A systematic review with meta-analyses was used. Data were derived from PubMed, EMBASE, CINAHL, Scopus, the Cochrane Controlled Trials Register and the Cochrane Database of Systematic Reviews from inception to May 2016. Published comparative trials that evaluated the effect of SBT on airway management training compared with NSBT were considered. Effect sizes with 95% confidence intervals (CI) were calculated for outcome measures. Seventeen eligible studies were included. SBT was associated with improved behavior performance [standardized mean difference (SMD): 0.30, 95% CI: 0.06 to 0.54] in comparison with NSBT. However, the benefits of SBT were not seen in time-skill (SMD: -0.13, 95% CI: -0.82 to 0.52), written examination score (SMD: 0.39, 95% CI: -0.09 to 0.86) or success rate of procedure completion on patients [relative risk (RR): 1.26, 95% CI: 0.96 to 1.66]. SBT may not be superior to NSBT in airway management training.

  11. Regression analysis by example

    CERN Document Server

    Chatterjee, Samprit

    2012-01-01

    Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." - Journal of the American Statistical Association. Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded

  12. Regression and tracing methodology based prediction of oncoming demand and losses in deregulated operation of power systems

    DEFF Research Database (Denmark)

    Nallagownden, P.; Mukerjee, R.N.; Masri, S.

    2010-01-01

    The deregulated electricity market can be thought of as a conglomeration of generation providers, transmission service operators (TSO) and retailers, where both generation and retailing may have open access to the transmission grid for trading electricity. For a transaction contract bid to take … of the transmission services hiring contract, inputs such as the extent of use of a transmission circuit for a transaction and the associated power loss in the said transmission circuit are also required. To provide the necessary lead time to frame transaction and transmission contracts for an oncoming operational …, coefficients are used advantageously to predict a generator's contribution to a retailer's demand and the power loss for this transaction. This paper proposes a procedure that can be implemented in real time, to quantify losses in each transmission circuit used by a specific transaction, based on proportionality …

  13. Metabolomics study on primary dysmenorrhea patients during the luteal regression stage based on ultra performance liquid chromatography coupled with quadrupole-time-of-flight mass spectrometry

    Science.gov (United States)

    Fang, Ling; Gu, Caiyun; Liu, Xinyu; Xie, Jiabin; Hou, Zhiguo; Tian, Meng; Yin, Jia; Li, Aizhu; Li, Yubo

    2017-01-01

    Primary dysmenorrhea (PD) is a common gynecological disorder which, while not life-threatening, severely affects the quality of life of women. Most patients with PD suffer ovarian hormone imbalances caused by uterine contraction, which results in dysmenorrhea. PD patients may also suffer from increases in estrogen levels caused by increased prostaglandin synthesis and release during luteal regression and early menstruation. Although PD pathogenesis has been reported on previously, those studies examined only the menstrual period and neglected the importance of the luteal regression stage. Therefore, the present study used urine metabolomics to examine changes in endogenous substances and to detect urine biomarkers for PD during luteal regression. Ultra performance liquid chromatography coupled with quadrupole-time-of-flight mass spectrometry was used to create metabolomic profiles for 36 patients with PD and 27 healthy controls. Principal component analysis and partial least squares discriminant analysis were used to investigate the metabolic alterations associated with PD. Ten biomarkers for PD were identified, including ornithine, dihydrocortisol, histidine, citrulline, sphinganine, phytosphingosine, progesterone, 17-hydroxyprogesterone, androstenedione, and 15-keto-prostaglandin F2α. The specificity and sensitivity of these biomarkers were assessed based on the area under the curve of receiver operator characteristic curves, which can be used to distinguish patients with PD from healthy controls. These results provide novel targets for the treatment of PD. PMID:28098892
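
    Biomarker performance of this kind is summarized by the area under the ROC curve for separating patients from controls. A small sketch with simulated intensities for one metabolite, using scikit-learn, follows; all values are fabricated, and only the patient/control group sizes echo the abstract.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(7)

# Simulated urine intensities of one candidate biomarker:
# 36 PD patients (shifted distribution) vs. 27 healthy controls.
patients = rng.normal(1.6, 0.4, 36)
controls = rng.normal(1.0, 0.4, 27)
y_true = np.r_[np.ones(36), np.zeros(27)]
intensity = np.r_[patients, controls]

auc = roc_auc_score(y_true, intensity)
fpr, tpr, thresholds = roc_curve(y_true, intensity)
best = (tpr - fpr).argmax()  # Youden's J picks an operating threshold
print(f"AUC = {auc:.2f}; threshold {thresholds[best]:.2f} gives "
      f"sensitivity {tpr[best]:.2f}, specificity {1 - fpr[best]:.2f}")
```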

  14. Exploring a physico-chemical multi-array explanatory model with a new multiple covariance-based technique: structural equation exploratory regression.

    Science.gov (United States)

    Bry, X; Verron, T; Cazes, P

    2009-05-29

    In this work, we consider chemical and physical variable groups describing a common set of observations (cigarettes). One of the groups, minor smoke compounds (minSC), is assumed to depend on the others (minSC predictors). PLS regression (PLSR) of minSC on the set of all predictors appears not to lead to a satisfactory analytic model, because it does not take into account the expert's knowledge. PLS path modeling (PLSPM) does not use the multidimensional structure of the predictor groups. Indeed, the expert needs to separate the influence of several pre-designed predictor groups on minSC, in order to see which dimensions this influence involves. To meet these needs, we consider a multi-group component-regression model, and propose a method to extract from each group several strong uncorrelated components that fit the model. Estimation is based on a global multiple covariance criterion, used in combination with an appropriate nesting approach. Compared to PLSR and PLSPM, the structural equation exploratory regression (SEER) we propose fully uses predictor group complementarity, both conceptually and statistically, to predict the dependent group.

  15. How to regress and predict in a Bland-Altman plot? Review and contribution based on tolerance intervals and correlated-errors-in-variables models.

    Science.gov (United States)

    Francq, Bernard G; Govaerts, Bernadette

    2016-06-30

    Two main methodologies for assessing equivalence in method-comparison studies are presented separately in the literature. The first one is the well-known and widely applied Bland-Altman approach with its agreement intervals, where two methods are considered interchangeable if their differences are not clinically significant. The second approach is based on errors-in-variables regression in a classical (X,Y) plot and focuses on confidence intervals, whereby two methods are considered equivalent when providing similar measures notwithstanding the random measurement errors. This paper reconciles these two methodologies and shows their similarities and differences using both real data and simulations. A new consistent correlated-errors-in-variables regression is introduced, as the errors are shown to be correlated in the Bland-Altman plot. Indeed, the coverage probabilities collapse and the biases soar when this correlation is ignored. Novel tolerance intervals are compared with agreement intervals with or without replicated data, and novel predictive intervals are introduced to predict a single measure in an (X,Y) plot or in a Bland-Altman plot with excellent coverage probabilities. We conclude that the (correlated)-errors-in-variables regressions should not be avoided in method comparison studies, although the Bland-Altman approach is usually applied to avert their complexity. We argue that tolerance or predictive intervals are better alternatives than agreement intervals, and we provide guidelines for practitioners regarding method comparison studies. Copyright © 2016 John Wiley & Sons, Ltd.
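
    The Bland-Altman construction itself is simple: plot the pairwise differences against the pairwise means and draw the bias and its 95% limits of agreement. A minimal sketch on synthetic measurements follows; the limits shown are the classical agreement interval, not the paper's novel tolerance or predictive intervals.

```python
import numpy as np

rng = np.random.default_rng(5)

# Two methods measuring the same quantity with random errors.
truth = rng.uniform(50, 150, 60)
method_x = truth + rng.normal(0, 3, 60)
method_y = truth + 1.5 + rng.normal(0, 3, 60)  # method Y reads ~1.5 units high

mean_xy = (method_x + method_y) / 2
diff = method_y - method_x

bias = diff.mean()
sd = diff.std(ddof=1)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)  # classical 95% limits of agreement
print(f"bias = {bias:.2f}, limits of agreement = [{loa[0]:.2f}, {loa[1]:.2f}]")
# A Bland-Altman plot is then scatter(mean_xy, diff) with horizontal
# lines at the bias and at the two limits of agreement.
```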

  16. A Novel Hybrid Model Based on Extreme Learning Machine, k-Nearest Neighbor Regression and Wavelet Denoising Applied to Short-Term Electric Load Forecasting

    Directory of Open Access Journals (Sweden)

    Weide Li

    2017-05-01

    Full Text Available Electric load forecasting plays an important role in electricity markets and power systems. Because electric load time series are complicated and nonlinear, it is very difficult to achieve a satisfactory forecasting accuracy. In this paper, a hybrid model, Wavelet Denoising-Extreme Learning Machine optimized by k-Nearest Neighbor Regression (EWKM), which combines k-Nearest Neighbor (KNN) and Extreme Learning Machine (ELM) based on a wavelet denoising technique, is proposed for short-term load forecasting. The proposed hybrid model first decomposes the time series into a low-frequency main signal and some detailed signals associated with high frequencies, then uses KNN to determine the independent and dependent variables from the low-frequency signal. Finally, the ELM is used to get the non-linear relationship between these variables to get
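
    The ELM component at the end of the pipeline is a single-hidden-layer network whose input weights are random and whose output weights are solved in closed form. Below is a bare-bones sketch on a synthetic load series; the architecture sizes and ridge term are my assumptions, and the wavelet denoising and KNN variable-selection steps are omitted.

```python
import numpy as np

rng = np.random.default_rng(11)

# Synthetic hourly load with daily periodicity; predict load from the previous 24 values.
t = np.arange(2000)
load = 100 + 20 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 2, t.size)
X = np.lib.stride_tricks.sliding_window_view(load[:-1], 24)
y = load[24:]

# Extreme learning machine: random hidden layer, least-squares output layer.
n_hidden = 64
W = rng.normal(0, 1, (X.shape[1], n_hidden))
b = rng.normal(0, 1, n_hidden)
Xn = (X - X.mean(0)) / X.std(0)                 # normalize each lag feature
H = np.tanh(Xn @ W + b)                         # random hidden-layer activations
beta = np.linalg.solve(H.T @ H + 1e-3 * np.eye(n_hidden), H.T @ y)  # ridge solve

# One-step-ahead forecast from the most recent 24-hour window.
h_new = np.tanh(((load[-24:] - X.mean(0)) / X.std(0)) @ W + b)
print(f"next-hour load forecast: {h_new @ beta:.1f}")
```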