WorldWideScience

Sample records for binary regression model

  1. A binary logistic regression model with complex sampling design of ...

    African Journals Online (AJOL)

    2017-09-03

    Bi-variable and multi-variable binary logistic regression models with complex sampling design were fitted. ... Data were entered into STATA-12 and analyzed using SPSS-21. ... lack of access/too far or costs too much ...

  2. Flexible link functions in nonparametric binary regression with Gaussian process priors.

    Science.gov (United States)

    Li, Dan; Wang, Xia; Lin, Lizhen; Dey, Dipak K

    2016-09-01

    In many scientific fields, it is a common practice to collect a sequence of 0-1 binary responses from a subject across time, space, or a collection of covariates. Researchers are interested in finding out how the expected binary outcome is related to covariates, and aim at better prediction in the future 0-1 outcomes. Gaussian processes have been widely used to model nonlinear systems; in particular to model the latent structure in a binary regression model allowing nonlinear functional relationship between covariates and the expectation of binary outcomes. A critical issue in modeling binary response data is the appropriate choice of link functions. Commonly adopted link functions such as probit or logit links have fixed skewness and lack the flexibility to allow the data to determine the degree of the skewness. To address this limitation, we propose a flexible binary regression model which combines a generalized extreme value link function with a Gaussian process prior on the latent structure. Bayesian computation is employed in model estimation. Posterior consistency of the resulting posterior distribution is demonstrated. The flexibility and gains of the proposed model are illustrated through detailed simulation studies and two real data examples. Empirical results show that the proposed model outperforms a set of alternative models, which only have either a Gaussian process prior on the latent regression function or a Dirichlet prior on the link function. © 2015, The International Biometric Society.
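
    The point about link skewness can be illustrated with a minimal sketch. It assumes one common parameterization of a generalized extreme value (GEV) link, p = 1 - exp(-(1 - ξη)_+^(-1/ξ)), which is not given in the abstract and may differ from the authors' exact form; the function name and the shape parameter `xi` are illustrative only. With `xi = 0` it reduces to the complementary log-log link.

```python
import math


def gev_inverse_link(eta: float, xi: float) -> float:
    """Map a linear predictor eta to a probability with a GEV-based link.

    Assumed form (not taken from the abstract): p = 1 - G(-eta; xi), where
    G(x; xi) = exp(-(1 + xi * x)_+ ** (-1 / xi)) is the standard GEV cdf.
    The shape parameter xi controls the skewness of the response curve.
    """
    if abs(xi) < 1e-12:
        # Limit xi -> 0: the complementary log-log link.
        return 1.0 - math.exp(-math.exp(eta))
    base = 1.0 - xi * eta
    if base <= 0.0:
        # -eta lies outside the GEV support: the cdf is 0 (xi > 0) or 1 (xi < 0).
        return 1.0 if xi > 0 else 0.0
    return 1.0 - math.exp(-(base ** (-1.0 / xi)))


# Same linear predictor, different skewness of the link.
for xi in (-0.5, 0.0, 0.5):
    print(xi, round(gev_inverse_link(1.0, xi), 4))
```

    A fixed link such as logit or probit has no counterpart to `xi`; here the data could in principle inform it.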

  3. Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.

    Science.gov (United States)

    Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai

    2017-04-01

    This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The impact on the exhibits under certain pollution scenarios (environmental impact) was predicted by a mathematical model based on binary logistic regression; the model identifies, from a multitude of possible environmental parameters, those with a significant impact on exhibits and ranks them by the severity of their effect. Air quality (NO₂, SO₂, O₃ and PM2.5) and microclimate (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest were used to develop and validate the binary logistic regression method and the mathematical model. The logistic regression analysis was run on 794 data combinations (715 to develop the model and 79 to validate it) using the Statistical Package for Social Sciences (SPSS 20.0). The results demonstrated that, of the six parameters considered, four have a significant effect on exhibits, in the following order: O₃ > PM2.5 > NO₂ > humidity, followed at a significant distance by the effects of SO₂ and temperature. The mathematical model developed in this study correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decisional process regarding the preventive preservation measures that should be implemented within the exhibition space. The mathematical model developed on the environmental parameters analyzed by the binary logistic regression method could be useful in a decision-making process establishing the best measures for pollution reduction and preventive...

  4. A computer tool for a minimax criterion in binary response and heteroscedastic simple linear regression models.

    Science.gov (United States)

    Casero-Alonso, V; López-Fidalgo, J; Torsney, B

    2017-01-01

    Binary response models are used in many real applications. For these models the Fisher information matrix (FIM) is proportional to the FIM of a weighted simple linear regression model. The same is also true when the weight function has a finite integral. Thus, optimal designs for one binary model are also optimal for the corresponding weighted linear regression model. The main objective of this paper is to provide a tool for the construction of MV-optimal designs, minimizing the maximum of the variances of the estimates, for a general design space. MV-optimality is a potentially difficult criterion because of its nondifferentiability at equal variance designs. A methodology for obtaining MV-optimal designs where the design space is a compact interval [a, b] will be given for several standard weight functions. The methodology will allow us to build a user-friendly computer tool based on Mathematica to compute MV-optimal designs. Some illustrative examples will show a representation of MV-optimal designs in the Euclidean plane, taking a and b as the axes. The applet will be explained using two relevant models. In the first one the case of a weighted linear regression model is considered, where the weight function is directly chosen from a typical family. In the second example a binary response model is assumed, where the probability of the outcome is given by a typical probability distribution. Practitioners can use the provided applet to identify the solution and to know the exact support points and design weights. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  5. Bayesian binary regression model: an application to in-hospital death after AMI prediction

    Directory of Open Access Journals (Sweden)

    Aparecida D. P. Souza

    2004-08-01

    A Bayesian binary regression model is developed to predict death of patients after acute myocardial infarction (AMI). Markov Chain Monte Carlo (MCMC) methods are used to make inference and to evaluate Bayesian binary regression models. A model building strategy based on Bayes factor is proposed and aspects of model validation are extensively discussed in the paper, including the posterior distribution for the c-index and the analysis of residuals. Risk assessment, based on variables easily available within minutes of the patients' arrival at the hospital, is very important to decide the course of the treatment. The identified model proves highly reliable and accurate, with a rate of correct classification of 88% and a concordance index of 83%.

  6. Face Alignment via Regressing Local Binary Features.

    Science.gov (United States)

    Ren, Shaoqing; Cao, Xudong; Wei, Yichen; Sun, Jian

    2016-03-01

    This paper presents a highly efficient and accurate regression approach for face alignment. Our approach has two novel components: 1) a set of local binary features and 2) a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. This approach achieves the state-of-the-art results when tested on the most challenging benchmarks to date. Furthermore, because extracting and regressing local binary features are computationally very cheap, our system is much faster than previous methods. It achieves over 3000 frames per second (FPS) on a desktop or 300 FPS on a mobile phone for locating a few dozens of landmarks. We also study a key issue that is important but has received little attention in the previous research, which is the face detector used to initialize alignment. We investigate several face detectors and perform quantitative evaluation on how they affect alignment accuracy. We find that an alignment friendly detector can further greatly boost the accuracy of our alignment method, reducing the error up to 16% relatively. To facilitate practical usage of face detection/alignment methods, we also propose a convenient metric to measure how good a detector is for alignment initialization.

  7. Modelling binary data

    CERN Document Server

    Collett, David

    2002-01-01

    INTRODUCTION: Some Examples; The Scope of this Book; Use of Statistical Software. STATISTICAL INFERENCE FOR BINARY DATA: The Binomial Distribution; Inference about the Success Probability; Comparison of Two Proportions; Comparison of Two or More Proportions. MODELS FOR BINARY AND BINOMIAL DATA: Statistical Modelling; Linear Models; Methods of Estimation; Fitting Linear Models to Binomial Data; Models for Binomial Response Data; The Linear Logistic Model; Fitting the Linear Logistic Model to Binomial Data; Goodness of Fit of a Linear Logistic Model; Comparing Linear Logistic Models; Linear Trend in Proportions; Comparing Stimulus-Response Relationships; Non-Convergence and Overfitting; Some other Goodness of Fit Statistics; Strategy for Model Selection; Predicting a Binary Response Probability. BIOASSAY AND SOME OTHER APPLICATIONS: The Tolerance Distribution; Estimating an Effective Dose; Relative Potency; Natural Response; Non-Linear Logistic Regression Models; Applications of the Complementary Log-Log Model. MODEL CHECKING: Definition of Re...

  8. Risk of Recurrence in Operated Parasagittal Meningiomas: A Logistic Binary Regression Model.

    Science.gov (United States)

    Escribano Mesa, José Alberto; Alonso Morillejo, Enrique; Parrón Carreño, Tesifón; Huete Allut, Antonio; Narro Donate, José María; Méndez Román, Paddy; Contreras Jiménez, Ascensión; Pedrero García, Francisco; Masegosa González, José

    2018-02-01

    Parasagittal meningiomas arise from the arachnoid cells of the angle formed between the superior sagittal sinus (SSS) and the brain convexity. In this retrospective study, we focused on factors that predict early recurrence and recurrence times. We reviewed 125 patients with parasagittal meningiomas operated on from 1985 to 2014. We studied the following variables: age, sex, location, laterality, histology, surgeons, invasion of the SSS, Simpson removal grade, follow-up time, angiography, embolization, radiotherapy, recurrence and recurrence time, reoperation, neurologic deficit, degree of dependency, and patient status at the end of follow-up. Patients ranged in age from 26 to 81 years (mean 57.86 years; median 60 years). There were 44 men (35.2%) and 81 women (64.8%). There were 57 patients with neurologic deficits (45.2%). The most common presenting symptom was motor deficit. World Health Organization grade I tumors were identified in 104 patients (84.6%), and the majority were the meningothelial type. Recurrence was detected in 34 cases. Time of recurrence was 9 to 336 months (mean: 84.4 months; median: 79.5 months). Male sex was identified as an independent risk for recurrence with relative risk 2.7 (95% confidence interval 1.21-6.15), P = 0.014. Kaplan-Meier curves for recurrence had statistically significant differences depending on sex, age, histologic type, and World Health Organization histologic grade. A binary logistic regression model was fitted, with a Hosmer-Lemeshow test P > 0.05; sex, tumor size, and histologic type were used in this model. Male sex is an independent risk factor for recurrence that, associated with other factors such as tumor size and histologic type, explains 74.5% of all cases in a binary regression model. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. Modelling of binary logistic regression for obesity among secondary students in a rural area of Kedah

    Science.gov (United States)

    Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.

    2014-07-01

    Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used to select important predictor variables; diagnostics are used to check that assumptions are valid, including independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in the family and by the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
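
    The logit relationship described above can be sketched in a few lines. The coefficients below are hypothetical values chosen for illustration, not estimates from this study, and the predictor names in the comments are assumptions.

```python
import math


def predicted_probability(intercept, coefs, x):
    """Logit model: the log odds are a linear combination of the predictors."""
    log_odds = intercept + sum(b * xi for b, xi in zip(coefs, x))
    # Invert the logit to obtain the probability of the event.
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds_ratio(coef):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(coef)


# Hypothetical fitted values (assumed, not from the abstract):
# predictor 1 = family history of obesity (0/1), predictor 2 = some lifestyle score.
intercept = -1.0
coefs = [0.8, -0.5]

p = predicted_probability(intercept, coefs, [1, 0])
print(round(p, 4))                     # probability for a student with family history
print(round(odds_ratio(coefs[0]), 4))  # odds ratio for family history
```

    Exponentiating a coefficient gives the odds ratio, which is how a statement such as "the odds are higher for a student having a family history of obesity" is usually read off a fitted model.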

  10. Predicting Social Trust with Binary Logistic Regression

    Science.gov (United States)

    Adwere-Boamah, Joseph; Hufstedler, Shirley

    2015-01-01

    This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…

  11. A novel hybrid method of beta-turn identification in protein using binary logistic regression and neural network.

    Science.gov (United States)

    Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz

    2012-01-01

    From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was used, for the first time, to select significant sequence parameters for the identification of β-turns through a re-substitution test procedure. The sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in the sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were the percentages of Gly and Ser and the occurrence of Asn in position i+2, respectively. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed with the parameters selected by binary logistic regression to build a hybrid predictor. The networks were trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test on the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with the results of other β-turn prediction methods. In conclusion, this study shows that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks leads to the development of more precise models for identifying β-turns in proteins.

  12. Performance and separation occurrence of binary probit regression estimator using maximum likelihood method and Firths approach under different sample size

    Science.gov (United States)

    Lusiana, Evellin Dewi

    2017-12-01

    The parameters of the binary probit regression model are commonly estimated using the maximum likelihood estimation (MLE) method. However, the MLE method has a limitation when the binary data contain separation. Separation is the condition in which one or several independent variables exactly group the categories of the binary response. It causes the MLE estimators to become non-convergent, so that they cannot be used in modeling. One way to resolve separation is to use Firth's approach instead. This research has two aims. First, to identify the chance of separation occurrence in the binary probit regression model under the MLE method and Firth's approach. Second, to compare the performance of the binary probit regression model estimators obtained by the MLE method and Firth's approach using the RMSE criterion. Both are carried out by simulation under different sample sizes. The results showed that the chance of separation occurrence under the MLE method for small sample sizes is higher than under Firth's approach. For larger sample sizes, on the other hand, the probability decreased and was nearly identical for the MLE method and Firth's approach. Meanwhile, Firth's estimators have smaller RMSE than the MLEs, especially for smaller sample sizes, while for larger sample sizes the RMSEs are not much different. This means that Firth's estimators outperformed the MLE estimators.
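
    As a rough illustration of what separation means, the sketch below checks the simplest case only: whether a single covariate completely separates a 0/1 response. It is an assumption-laden toy check, not the detection procedure used in the study, and it does not cover quasi-complete separation or separation by a combination of covariates.

```python
def is_completely_separated(x, y):
    """Return True if one threshold on covariate x splits y = 0 from y = 1 exactly.

    Covers only single-covariate complete separation; in that case the
    maximum likelihood estimate of the slope diverges.
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome categories must be present")
    # Separated if all x for one category lie strictly below all x for the other.
    return max(x0) < min(x1) or max(x1) < min(x0)


print(is_completely_separated([1, 2, 3, 4], [0, 0, 1, 1]))  # separated
print(is_completely_separated([1, 3, 2, 4], [0, 0, 1, 1]))  # overlapping
```

    Small samples make such an exact split more likely by chance, which is consistent with the sample-size effect reported in the abstract.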

  13. Modelling Status Food Security Households Disease Sufferers Pulmonary Tuberculosis Uses the Method Regression Logistics Binary

    Science.gov (United States)

    Wulandari, S. P.; Salamah, M.; Rositawati, A. F. D.

    2018-04-01

    Food security is the condition in which food needs are met at every level, from the country down to the individual. Indonesia is one of the countries committed to making food security a main priority. However, food needs are often met without regard to nutritional standards or the health condition of family members, so meeting food needs must also take into account diseases suffered by family members, one of which is pulmonary tuberculosis. For these reasons, this research was conducted to identify the factors that influence the food security status of households with pulmonary tuberculosis sufferers in the coastal area of Surabaya, using the binary logistic regression method. The binary logistic regression analysis shows that the variables wife's latest education, house density and house ventilation area significantly affect the food security status of households with pulmonary tuberculosis sufferers in the coastal area of Surabaya: a household in which the wife's education level is university/equivalent, the house density is eligible, or 8 m²/person, and the house ventilation area is 10% of the floor area has a probability of 0.911089 of being food secure and a probability of 0.088911 of being food insecure. The model of food security status for households with pulmonary tuberculosis sufferers in the coastal area of Surabaya fits adequately, and the overall classification accuracy is 71.8%.

  14. Binary logistic regression modelling: Measuring the probability of relapse cases among drug addict

    Science.gov (United States)

    Ismail, Mohd Tahir; Alias, Siti Nor Shadila

    2014-07-01

    For many years Malaysia has faced drug addiction issues. The most serious is the relapse phenomenon among treated drug addicts (drug addicts who have undergone the rehabilitation programme at the Narcotic Addiction Rehabilitation Centre, PUSPEN). Thus, the main objective of this study is to find the most significant factors that contribute to relapse. Binary logistic regression analysis was employed to model the relationship between the independent variables (predictors) and the dependent variable. The dependent variable is the status of the drug addict: relapse (Yes, coded as 1) or not (No, coded as 0). The predictors involved are age, age at first taking drugs, family history, education level, family crisis, community support and self motivation. The sample comprises 200 cases, with data provided by AADK (National Antidrug Agency). The study found that age and self motivation are statistically significant for relapse cases.

  15. Unobserved Heterogeneity in the Binary Logit Model with Cross-Sectional Data and Short Panels

    DEFF Research Database (Denmark)

    Holm, Anders; Jæger, Mads Meier; Pedersen, Morten

    This paper proposes a new approach to dealing with unobserved heterogeneity in applied research using the binary logit model with cross-sectional data and short panels. Unobserved heterogeneity is particularly important in non-linear regression models such as the binary logit model because, unlike...... in linear regression models, estimates of the effects of observed independent variables are biased even when omitted independent variables are uncorrelated with the observed independent variables. We propose an extension of the binary logit model based on a finite mixture approach in which we conceptualize...

  16. Parameter Estimation for Improving Association Indicators in Binary Logistic Regression

    Directory of Open Access Journals (Sweden)

    Mahdi Bashiri

    2012-02-01

    The aim of this paper is the estimation of binary logistic regression parameters by maximizing the log-likelihood function with improved association indicators. The parameter estimation steps are explained, and measures of association are then introduced and their calculation analyzed. Moreover, new related indicators based on membership degree level are presented. Association measures reflect the number of success responses relative to failures in a certain number of independent Bernoulli experiments. In parameter estimation, the values of existing indicators are not sensitive to the parameter values, whereas the proposed indicators are sensitive to the estimated parameters during the iterative procedure. The innovation of this study is therefore a new association indicator for binary logistic regression that is more sensitive to the estimated parameters when maximizing the log-likelihood in the iterative procedure.

  17. No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.

    Science.gov (United States)

    van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B

    2016-11-24

    Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
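
    For readers unfamiliar with the criterion being questioned, a minimal sketch of how EPV is typically computed follows. It assumes the usual convention that "events" means the less frequent outcome category and that "variables" counts estimated coefficients excluding the intercept; the function name and the sample data are illustrative.

```python
def events_per_variable(outcomes, n_parameters):
    """EPV: size of the smaller outcome category divided by the number of
    regression coefficients (intercept excluded, by assumption)."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    events = sum(outcomes)
    non_events = len(outcomes) - events
    return min(events, non_events) / n_parameters


# Made-up data: 30 events among 100 subjects, 3 candidate predictors.
outcomes = [1] * 30 + [0] * 70
print(events_per_variable(outcomes, 3))
```

    The abstract's argument is that this ratio alone does not determine estimator behaviour; total sample size and the handling of separation matter as well.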

  18. Sample size calculation to externally validate scoring systems based on logistic regression models.

    Directory of Open Access Journals (Sweden)

    Antonio Palazón-Bru

    A sample size containing at least 100 events and 100 non-events has been suggested to validate a predictive model, regardless of the model being validated and of the fact that certain factors can influence calibration of the predictive model (discrimination, parameterization and incidence). Scoring systems based on binary logistic regression models are a specific type of predictive model. The aim of this study was to develop an algorithm to determine the sample size for validating a scoring system based on a binary logistic regression model and to apply it to a case study. The algorithm was based on bootstrap samples in which the area under the ROC curve, the observed event probabilities through smooth curves, and a measure to determine the lack of calibration (estimated calibration index) were calculated. To illustrate its use for interested researchers, the algorithm was applied to a scoring system, based on a binary logistic regression model, to determine mortality in intensive care units. In the case study provided, the algorithm obtained a sample size with 69 events, which is lower than the value suggested in the literature. An algorithm is provided for finding the appropriate sample size to validate scoring systems based on binary logistic regression models. This could be applied to determine the sample size in other similar cases.
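
    One ingredient mentioned above, the area under the ROC curve evaluated over bootstrap samples, can be sketched as follows. This is not the authors' sample-size algorithm (the smooth calibration curves and the estimated calibration index are omitted); it is a generic percentile-bootstrap sketch, and all function names, defaults and data are assumptions.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve as the probability that a random event
    scores higher than a random non-event (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both events and non-events are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_interval(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (resampling subjects)."""
    if not 0 < sum(labels) < len(labels):
        raise ValueError("both events and non-events are required")
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one outcome category
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Made-up scores from a hypothetical scoring system.
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.3]
labels = [0, 0, 1, 1, 1, 0, 1, 0]
print(auc(scores, labels))
print(bootstrap_auc_interval(scores, labels, n_boot=200))
```

    Repeating such a calculation for increasing validation sample sizes and watching the interval width is one plausible way to reason about how many events are needed, though the study's own stopping rule is not described in the abstract.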

  19. No rationale for 1 variable per 10 events criterion for binary logistic regression analysis

    Directory of Open Access Journals (Sweden)

    Maarten van Smeden

    2016-11-01

    Background: Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. Methods: The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth’s correction, are compared. Results: The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect (‘separation’). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth’s correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. Conclusions: The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.

  20. Efficient and robust estimation for longitudinal mixed models for binary data

    DEFF Research Database (Denmark)

    Holst, René

    2009-01-01

    This paper proposes a longitudinal mixed model for binary data. The model extends the classical Poisson trick, in which a binomial regression is fitted by switching to a Poisson framework. A recent estimating equations method for generalized linear longitudinal mixed models, called GEEP, is used...... as a vehicle for fitting the conditional Poisson regressions, given a latent process of serial correlated Tweedie variables. The regression parameters are estimated using a quasi-score method, whereas the dispersion and correlation parameters are estimated by use of bias-corrected Pearson-type estimating...... equations, using second moments only. Random effects are predicted by BLUPs. The method provides a computationally efficient and robust approach to the estimation of longitudinal clustered binary data and accommodates linear and non-linear models. A simulation study is used for validation and finally...

  1. Introduction to the use of regression models in epidemiology.

    Science.gov (United States)

    Bender, Ralf

    2009-01-01

    Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.

  2. Predicting the "graduate on time (GOT)" of PhD students using binary logistics regression model

    Science.gov (United States)

    Shariff, S. Sarifah Radiah; Rodzi, Nur Atiqah Mohd; Rahman, Kahartini Abdul; Zahari, Siti Meriam; Deni, Sayang Mohd

    2016-10-01

    The Malaysian government has recently set a new goal to produce 60,000 Malaysian PhD holders by the year 2023. As Malaysia's largest institution of higher learning in terms of size and population, offering more than 500 academic programmes in a conducive and vibrant environment, UiTM has taken several initiatives to fill the gap. Increasing the number of PhD graduates is a challenging process. On many occasions it has been observed that the struggle to reach the target is even more daunting, and that implementation is far from ideal. Progress has slowed further as the attrition rate increases. This study aims to apply the proposed model, which incorporates several factors, to predict the number of PhD students who will complete their PhD studies on time. A binary logistic regression model is proposed and applied to the data to determine this number. The results show that only 6.8% of the 2014 PhD students are predicted to graduate on time, and the results are compared with the actual number for validation purposes.

  3. The likelihood of achieving quantified road safety targets: a binary logistic regression model for possible factors.

    Science.gov (United States)

    Sze, N N; Wong, S C; Lee, C Y

    2014-12-01

    In the past several decades, many countries have set quantified road safety targets to motivate transport authorities to develop systematic road safety strategies and measures and facilitate the achievement of continuous road safety improvement. Studies have been conducted to evaluate the association between the setting of quantified road safety targets and road fatality reduction, in both the short and long run, by comparing road fatalities before and after the implementation of a quantified road safety target. However, not much work has been done to evaluate whether the quantified road safety targets are actually achieved. In this study, we used a binary logistic regression model to examine the factors (including vehicle ownership, fatality rate, and national income, in addition to level of ambition and duration of target) that contribute to a target's success. We analyzed 55 quantified road safety targets set by 29 countries from 1981 to 2009, and the results indicate that targets that are in progress and have a lower level of ambition had a higher likelihood of eventually being achieved. Moreover, possible interaction effects on the association between level of ambition and the likelihood of success are also revealed. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. RBSURFpred: Modeling protein accessible surface area in real and binary space using regularized and optimized regression.

    Science.gov (United States)

    Tarafder, Sumit; Toukir Ahmed, Md; Iqbal, Sumaiya; Tamjidul Hoque, Md; Sohel Rahman, M

    2018-03-14

    Accessible surface area (ASA) of a protein residue is an effective feature for protein structure prediction, binding region identification, fold recognition problems etc. Improving the prediction of ASA by the application of effective feature variables is a challenging but explorable task to consider, especially in the field of machine learning. Among the existing predictors of ASA, REGAd³p is a highly accurate ASA predictor which is based on regularized exact regression with polynomial kernel of degree 3. In this work, we present a new predictor RBSURFpred, which extends REGAd³p on several dimensions by incorporating 58 physicochemical, evolutionary and structural properties into 9-tuple peptides via Chou's general PseAAC, which allowed us to obtain higher accuracies in predicting both real-valued and binary ASA. We have compared RBSURFpred for both real and binary space predictions with state-of-the-art predictors, such as REGAd³p and SPIDER2. We also have carried out a rigorous analysis of the performance of RBSURFpred in terms of different amino acids and their properties, and also with biologically relevant case-studies. The performance of RBSURFpred establishes itself as a useful tool for the community. Copyright © 2018 Elsevier Ltd. All rights reserved.

  5. Better Autologistic Regression

    Directory of Open Access Journals (Sweden)

    Mark A. Wolters

    2017-11-01

    Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine) to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding (the two numbers used to represent the two possible states of the variables) might differ. Common coding choices are (zero, one) and (minus one, plus one). Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.

  6. Mesoscopic model for binary fluids

    Science.gov (United States)

    Echeverria, C.; Tucci, K.; Alvarez-Llamoza, O.; Orozco-Guillén, E. E.; Morales, M.; Cosenza, M. G.

    2017-10-01

    We propose a model for studying binary fluids based on the mesoscopic molecular simulation technique known as multiparticle collision, where the space and state variables are continuous, and time is discrete. We include a repulsion rule to simulate segregation processes that does not require calculation of the interaction forces between particles, so binary fluids can be described on a mesoscopic scale. The model is conceptually simple and computationally efficient; it maintains Galilean invariance and conserves the mass and energy in the system at the micro- and macro-scale, whereas momentum is conserved globally. For a wide range of temperatures and densities, the model yields results in good agreement with the known properties of binary fluids, such as the density profile, interface width, phase separation, and phase growth. We also apply the model to the study of binary fluids in crowded environments with consistent results.

  7. Accuracy of binary black hole waveform models for aligned-spin binaries

    Science.gov (United States)

    Kumar, Prayush; Chu, Tony; Fong, Heather; Pfeiffer, Harald P.; Boyle, Michael; Hemberger, Daniel A.; Kidder, Lawrence E.; Scheel, Mark A.; Szilagyi, Bela

    2016-05-01

    Coalescing binary black holes are among the primary science targets for second generation ground-based gravitational wave detectors. Reliable gravitational waveform models are central to detection of such systems and subsequent parameter estimation. This paper performs a comprehensive analysis of the accuracy of recent waveform models for binary black holes with aligned spins, utilizing a new set of 84 high-accuracy numerical relativity simulations. Our analysis covers comparable mass binaries (mass ratio 1 ≤ q ≤ 3), and samples independently both black hole spins up to a dimensionless spin magnitude of 0.9 for equal-mass binaries and 0.85 for unequal mass binaries. Furthermore, we focus on the high-mass regime (total mass ≳ 50 M⊙). The two most recent waveform models considered (PhenomD and SEOBNRv2) both perform very well for signal detection, losing less than 0.5% of the recoverable signal-to-noise ratio ρ, except that SEOBNRv2's efficiency drops slightly for both black hole spins aligned at large magnitude. For parameter estimation, modeling inaccuracies of the SEOBNRv2 model are found to be smaller than systematic uncertainties for moderately strong GW events up to roughly ρ ≲ 15. PhenomD's modeling errors are found to be smaller than SEOBNRv2's, and are generally irrelevant for ρ ≲ 20. Both models' accuracy deteriorates with increased mass ratio, and when at least one black hole spin is large and aligned. The SEOBNRv2 model shows a pronounced disagreement with the numerical relativity simulation in the merger phase, for unequal masses and simultaneously both black hole spins very large and aligned. Two older waveform models (PhenomC and SEOBNRv1) are found to be distinctly less accurate than the more recent PhenomD and SEOBNRv2 models. Finally, we quantify the bias expected from all four waveform models during parameter estimation for several recovered binary parameters: chirp mass, mass ratio, and effective spin.

  8. Modelling long-term fire occurrence factors in Spain by accounting for local variations with geographically weighted regression

    Science.gov (United States)

    Martínez-Fernández, J.; Chuvieco, E.; Koutsias, N.

    2013-02-01

    Humans are responsible for most forest fires in Europe, but anthropogenic factors behind these events are still poorly understood. We tried to identify the driving factors of human-caused fire occurrence in Spain by applying two different statistical approaches. Firstly, assuming stationary processes for the whole country, we created models based on multiple linear regression and binary logistic regression to find factors associated with fire density and fire presence, respectively. Secondly, we used geographically weighted regression (GWR) to better understand and explore the local and regional variations of those factors behind human-caused fire occurrence. The number of human-caused fires occurring within a 25-yr period (1983-2007) was computed for each of the 7638 Spanish mainland municipalities, creating a binary variable (fire/no fire) to develop logistic models, and a continuous variable (fire density) to build standard linear regression models. A total of 383 657 fires were registered in the study dataset. The binary logistic model, which estimates the probability of having/not having a fire, successfully classified 76.4% of the total observations, while the ordinary least squares (OLS) regression model explained 53% of the variation of the fire density patterns (adjusted R2 = 0.53). Both approaches confirmed, in addition to forest and climatic variables, the importance of variables related with agrarian activities, land abandonment, rural population exodus and developmental processes as underlying factors of fire occurrence. For the GWR approach, the explanatory power of the GW linear model for fire density using an adaptive bandwidth increased from 53% to 67%, while for the GW logistic model the correctly classified observations improved only slightly, from 76.4% to 78.4%, but significantly according to the corrected Akaike Information Criterion (AICc), from 3451.19 to 3321.19. The results from GWR indicated a significant spatial variation in the local

  9. Structured Additive Regression Models: An R Interface to BayesX

    Directory of Open Access Journals (Sweden)

    Nikolaus Umlauf

    2015-02-01

    Structured additive regression (STAR) models provide a flexible framework for modeling possible nonlinear effects of covariates: they contain the well-established frameworks of generalized linear models and generalized additive models as special cases but also allow a wider class of effects, e.g., for geographical or spatio-temporal data, allowing for specification of complex and realistic models. BayesX is a standalone software package for fitting a general class of STAR models. Based on a comprehensive open-source regression toolbox written in C++, BayesX uses Bayesian inference for estimating STAR models based on Markov chain Monte Carlo simulation techniques, a mixed model representation of STAR models, or stepwise regression techniques combining penalized least squares estimation with model selection. BayesX not only covers models for responses from univariate exponential families, but also models from less-standard regression situations such as models for multi-categorical responses with either ordered or unordered categories, continuous time survival data, or continuous time multi-state models. This paper presents a new fully interactive R interface to BayesX: the R package R2BayesX. With the new package, STAR models can be conveniently specified using R's formula language (with some extended terms), fitted using the BayesX binary, represented in R with objects of suitable classes, and finally printed/summarized/plotted. This makes BayesX much more accessible to users familiar with R and adds extensive graphics capabilities for visualizing fitted STAR models. Furthermore, R2BayesX complements the already impressive capabilities for semiparametric regression in R by a comprehensive toolbox comprising in particular more complex response types and alternative inferential procedures such as simulation-based Bayesian inference.

  10. Sunspot Cycle Prediction Using Multivariate Regression and Binary ...

    Indian Academy of Sciences (India)

    Multivariate regression model has been derived based on the available cycles 1 .... The flare index correlates well with various parameters of the solar activity. ...... 32) Sabarinath A and Anilkumar A K 2011 A stochastic prediction model for the.

  11. Modified Regression Correlation Coefficient for Poisson Regression Model

    Science.gov (United States)

    Kaengthong, Nattacha; Domthong, Uthumporn

    2017-09-01

    This study focuses on measures of predictive power for generalized linear models (GLMs), which are widely used but often subject to restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This measure of predictive power is defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] under the Poisson regression model, where the dependent variable follows a Poisson distribution. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables and in the presence of multicollinearity among the independent variables. The results show that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient in terms of bias and root mean square error (RMSE).
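
    As a rough illustration of the quantity discussed in this record, the sketch below fits a one-covariate Poisson regression by Newton-Raphson and then computes the Pearson correlation between Y and the fitted E(Y|X). It assumes that the traditional regression correlation coefficient is this plain Pearson correlation; the modified coefficient proposed in the paper is not described in the abstract and is not implemented here. The data values and function names are hypothetical.

```python
import math

def fit_poisson(x, y, iters=50):
    """Newton-Raphson fit of log E[Y|x] = b0 + b1*x (one covariate, log link)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2x2), inverted in closed form.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1

def pearson(a, b):
    """Plain Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)

# Hypothetical toy data: counts that grow with the covariate.
x = [0, 1, 2, 3, 4, 5]
y = [1, 2, 2, 4, 6, 9]

b0, b1 = fit_poisson(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]   # estimated E(Y|X)
r = pearson(y, fitted)                          # regression correlation coefficient
print(f"b0={b0:.3f}, b1={b1:.3f}, r={r:.3f}")
```

    With more than one covariate, the same idea applies once the fitted means are available from any Poisson GLM routine; only the fitting step changes.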

  12. bayesQR: A Bayesian Approach to Quantile Regression

    Directory of Open Access Journals (Sweden)

    Dries F. Benoit

    2017-01-01

    After its introduction by Koenker and Bassett (1978), quantile regression has become an important and popular tool to investigate the conditional response distribution in regression. The R package bayesQR contains a number of routines to estimate quantile regression parameters using a Bayesian approach based on the asymmetric Laplace distribution. The package contains functions for the typical quantile regression with continuous dependent variable, but also supports quantile regression for binary dependent variables. For both types of dependent variables, an approach to variable selection using the adaptive lasso approach is provided. For the binary quantile regression model, the package also contains a routine that calculates the fitted probabilities for each vector of predictors. In addition, functions for summarizing the results, creating traceplots, posterior histograms and drawing quantile plots are included. This paper starts with a brief overview of the theoretical background of the models used in the bayesQR package. The main part of this paper discusses the computational problems that arise in the implementation of the procedure and illustrates the usefulness of the package through selected examples.

  13. Multilevel Cross-Dependent Binary Longitudinal Data

    KAUST Repository

    Serban, Nicoleta

    2013-10-16

    We provide insights into new methodology for the analysis of multilevel binary data observed longitudinally, when the repeated longitudinal measurements are correlated. The proposed model is logistic functional regression conditioned on three latent processes describing the within- and between-variability, and describing the cross-dependence of the repeated longitudinal measurements. We estimate the model components without employing mixed-effects modeling but assuming an approximation to the logistic link function. The primary objectives of this article are to highlight the challenges in the estimation of the model components, to compare two approximations to the logistic regression function, linear and exponential, and to discuss their advantages and limitations. The linear approximation is computationally efficient whereas the exponential approximation applies for rare events functional data. Our methods are inspired by and applied to a scientific experiment on spectral backscatter from long range infrared light detection and ranging (LIDAR) data. The models are general and relevant to many new binary functional data sets, with or without dependence between repeated functional measurements.

  14. Guidance for the utility of linear models in meta-analysis of genetic association studies of binary phenotypes.

    Science.gov (United States)

    Cook, James P; Mahajan, Anubha; Morris, Andrew P

    2017-02-01

    Linear mixed models are increasingly used for the analysis of genome-wide association studies (GWAS) of binary phenotypes because they can efficiently and robustly account for population stratification and relatedness through inclusion of random effects for a genetic relationship matrix. However, the utility of linear (mixed) models in the context of meta-analysis of GWAS of binary phenotypes has not been previously explored. In this investigation, we present simulations to compare the performance of linear and logistic regression models under alternative weighting schemes in a fixed-effects meta-analysis framework, considering designs that incorporate variable case-control imbalance, confounding factors and population stratification. Our results demonstrate that linear models can be used for meta-analysis of GWAS of binary phenotypes, without loss of power, even in the presence of extreme case-control imbalance, provided that one of the following schemes is used: (i) effective sample size weighting of Z-scores or (ii) inverse-variance weighting of allelic effect sizes after conversion onto the log-odds scale. Our conclusions thus provide essential recommendations for the development of robust protocols for meta-analysis of binary phenotypes with linear models.
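
    The two weighting schemes named in this record can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the effective sample size formula 4 / (1/cases + 1/controls) is one common convention, and the linear-to-log-odds conversion beta / (phi * (1 - phi)), with phi the case fraction, is a commonly used first-order approximation; the abstract does not say which formulas the authors used. All function names and numbers are hypothetical.

```python
import math

def inverse_variance_meta(betas, ses):
    """Fixed-effects inverse-variance pooling of per-study effect sizes."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se

def effective_n(n_cases, n_controls):
    """Effective sample size of a case-control study (assumed convention)."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)

def z_score_meta(zs, n_effs):
    """Pool per-study Z-scores with weights sqrt(effective sample size)."""
    weights = [math.sqrt(n) for n in n_effs]
    return sum(w * z for w, z in zip(weights, zs)) / math.sqrt(
        sum(w * w for w in weights)
    )

def to_log_odds(beta_linear, case_fraction):
    """Approximate conversion of a linear-model effect to the log-odds scale
    (assumed first-order approximation, not taken from the abstract)."""
    return beta_linear / (case_fraction * (1.0 - case_fraction))

# Hypothetical two-study example.
pooled_beta, pooled_se = inverse_variance_meta([0.10, 0.20], [0.05, 0.10])
balanced = effective_n(1000, 1000)      # balanced design
imbalanced = effective_n(100, 1900)     # same total N, strong imbalance
pooled_z = z_score_meta([2.0, 2.0], [100.0, 100.0])
print(pooled_beta, pooled_se, balanced, imbalanced, pooled_z)
```

    Note how the imbalanced study, despite having the same total number of subjects, contributes a much smaller effective sample size and therefore a smaller weight.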

  15. Logic regression and its extensions.

    Science.gov (United States)

    Schwender, Holger; Ruczinski, Ingo

    2010-01-01

    Logic regression is an adaptive classification and regression procedure, initially developed to reveal interacting single nucleotide polymorphisms (SNPs) in genetic association studies. In general, this approach can be used in any setting with binary predictors, when the interaction of these covariates is of primary interest. Logic regression searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome variable, and thus, reveals variables and interactions that are associated with the response and/or have predictive capabilities. The logic expressions are embedded in a generalized linear regression framework, and thus, logic regression can handle a variety of outcome types, such as binary responses in case-control studies, numeric responses, and time-to-event data. In this chapter, we provide an introduction to the logic regression methodology, list some applications in public health and medicine, and summarize some of the direct extensions and modifications of logic regression that have been proposed in the literature. Copyright © 2010 Elsevier Inc. All rights reserved.

  16. The Effect of Latent Binary Variables on the Uncertainty of the Prediction of a Dichotomous Outcome Using Logistic Regression Based Propensity Score Matching.

    Science.gov (United States)

    Szekér, Szabolcs; Vathy-Fogarassy, Ágnes

    2018-01-01

    Logistic regression based propensity score matching is a widely used method in case-control studies to select the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if relevant latent variables exist as well, which are not taken into account during the calculations, the quality of the control group is uncertain. In this paper, we present a statistics-based study in which we try to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. The observed phenomenon draws analysts' attention to an important point, which must be taken into account when drawing conclusions.
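
    To make the matching step concrete, here is a minimal sketch of greedy 1:1 nearest-neighbour matching on propensity scores that are assumed to have been estimated already (for example by a logistic regression). The abstract does not state which matching algorithm the authors used, so the greedy strategy, the optional caliper, and all identifiers below are assumptions for illustration only.

```python
def greedy_match(treated, controls, caliper=None):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, controls: dicts mapping a subject id to its propensity score.
    caliper: optional maximum allowed score difference for a valid match.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls not yet used
    pairs = []
    # Process treated subjects in order of increasing propensity score.
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control on the propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        if caliper is None or abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs

# Hypothetical scores.
treated = {"t1": 0.30, "t2": 0.70}
controls = {"c1": 0.28, "c2": 0.65, "c3": 0.90}

pairs = greedy_match(treated, controls)
strict_pairs = greedy_match(treated, controls, caliper=0.01)
print(pairs, strict_pairs)
```

    If a latent variable is missing from the propensity model, the scores themselves are biased, and no matching rule applied afterwards can repair that; this is the situation the record above examines.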

  17. Prediction of Mind-Wandering with Electroencephalogram and Non-linear Regression Modeling.

    Science.gov (United States)

    Kawashima, Issaku; Kumano, Hiroaki

    2017-01-01

    Mind-wandering (MW), task-unrelated thought, has been examined by researchers in an increasing number of articles using models to predict whether subjects are in MW, using numerous physiological variables. However, these models are not applicable in general situations. Moreover, they output only binary classification. The current study suggests that the combination of electroencephalogram (EEG) variables and non-linear regression modeling can be a good indicator of MW intensity. We recorded EEGs of 50 subjects during the performance of a Sustained Attention to Response Task, including a thought sampling probe that inquired the focus of attention. We calculated the power and coherence value and prepared 35 patterns of variable combinations and applied Support Vector machine Regression (SVR) to them. Finally, we chose four SVR models: two of them non-linear models and the others linear models; two of the four models are composed of a limited number of electrodes to satisfy model usefulness. Examination using the held-out data indicated that all models had robust predictive precision and provided significantly better estimations than a linear regression model using single electrode EEG variables. Furthermore, in limited electrode condition, non-linear SVR model showed significantly better precision than linear SVR model. The method proposed in this study helps investigations into MW in various little-examined situations. Further, by measuring MW with a high temporal resolution EEG, unclear aspects of MW, such as time series variation, are expected to be revealed. Furthermore, our suggestion that a few electrodes can also predict MW contributes to the development of neuro-feedback studies.

  18. Prediction of Mind-Wandering with Electroencephalogram and Non-linear Regression Modeling

    Directory of Open Access Journals (Sweden)

    Issaku Kawashima

    2017-07-01

    Mind-wandering (MW), task-unrelated thought, has been examined by researchers in an increasing number of articles using models to predict whether subjects are in MW, using numerous physiological variables. However, these models are not applicable in general situations. Moreover, they output only binary classification. The current study suggests that the combination of electroencephalogram (EEG) variables and non-linear regression modeling can be a good indicator of MW intensity. We recorded EEGs of 50 subjects during the performance of a Sustained Attention to Response Task, including a thought sampling probe that inquired the focus of attention. We calculated the power and coherence value and prepared 35 patterns of variable combinations and applied Support Vector machine Regression (SVR) to them. Finally, we chose four SVR models: two of them non-linear models and the others linear models; two of the four models are composed of a limited number of electrodes to satisfy model usefulness. Examination using the held-out data indicated that all models had robust predictive precision and provided significantly better estimations than a linear regression model using single electrode EEG variables. Furthermore, in limited electrode condition, non-linear SVR model showed significantly better precision than linear SVR model. The method proposed in this study helps investigations into MW in various little-examined situations. Further, by measuring MW with a high temporal resolution EEG, unclear aspects of MW, such as time series variation, are expected to be revealed. Furthermore, our suggestion that a few electrodes can also predict MW contributes to the development of neuro-feedback studies.

  19. Comparison of robustness to outliers between robust poisson models and log-binomial models when estimating relative risks for common binary outcomes: a simulation study.

    Science.gov (United States)

    Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P

    2014-06-26

    To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Of the two methods, it is believed that the log-binomial regression yields more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence to support the robustness of robust Poisson models in comparison with log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination is low. The robust Poisson models are more robust (or less sensitive) to outliers compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.

  20. Reducing Monte Carlo error in the Bayesian estimation of risk ratios using log-binomial regression models.

    Science.gov (United States)

    Salmerón, Diego; Cano, Juan A; Chirlaque, María D

    2015-08-30

    In cohort studies, binary outcomes are very often analyzed by logistic regression. However, it is well known that when the goal is to estimate a risk ratio, the logistic regression is inappropriate if the outcome is common. In these cases, a log-binomial regression model is preferable. On the other hand, the estimation of the regression coefficients of the log-binomial model is difficult owing to the constraints that must be imposed on these coefficients. Bayesian methods allow a straightforward approach for log-binomial regression models and produce smaller mean squared errors in the estimation of risk ratios than the frequentist methods, and the posterior inferences can be obtained using the software WinBUGS. However, Markov chain Monte Carlo methods implemented in WinBUGS can lead to large Monte Carlo errors in the approximations to the posterior inferences because they produce correlated simulations, and the accuracy of the approximations is inversely related to this correlation. To reduce correlation and to improve accuracy, we propose a reparameterization based on a Poisson model and a sampling algorithm coded in R. Copyright © 2015 John Wiley & Sons, Ltd.
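
    The statement that logistic regression is inappropriate for risk ratios when the outcome is common can be seen from a 2x2 table alone: the odds ratio (what logistic regression estimates) tracks the risk ratio only when the outcome is rare. The short sketch below uses made-up counts chosen so that the risk ratio is about 1.5 in both scenarios; the function names are illustrative.

```python
def risk_ratio(a, b, c, d):
    """a, b: exposed with / without the outcome; c, d: unexposed with / without."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for the same 2x2 table."""
    return (a * d) / (b * c)

# Common outcome: risks of 60% vs 40% (hypothetical counts).
common = (60, 40, 40, 60)
# Rare outcome: risks of 0.6% vs 0.4% (hypothetical counts).
rare = (6, 994, 4, 996)

for label, table in (("common", common), ("rare", rare)):
    rr, odds = risk_ratio(*table), odds_ratio(*table)
    print(f"{label}: RR = {rr:.3f}, OR = {odds:.3f}")
```

    In the common-outcome table the odds ratio (2.25) overstates the risk ratio of about 1.5 substantially, while in the rare-outcome table the two nearly coincide, which is why models that target the risk ratio directly, such as the log-binomial model, are preferred for common outcomes.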

  1. A simple model for binary star evolution

    International Nuclear Information System (INIS)

    Whyte, C.A.; Eggleton, P.P.

    1985-01-01

    A simple model for calculating the evolution of binary stars is presented. Detailed stellar evolution calculations of stars undergoing mass and energy transfer at various rates are reported and used to identify the dominant physical processes which determine the type of evolution. These detailed calculations are used to calibrate the simple model and a comparison of calculations using the detailed stellar evolution equations and the simple model is made. Results of the evolution of a few binary systems are reported and compared with previously published calculations using normal stellar evolution programs. (author)

  2. An appraisal of convergence failures in the application of logistic regression model in published manuscripts.

    Science.gov (United States)

    Yusuf, O B; Bamgboye, E A; Afolabi, R F; Shodimu, M A

    2014-09-01

    The logistic regression model is widely used in health research for descriptive and predictive purposes. Unfortunately, researchers are often unaware that the underlying principles of the technique have failed when the maximum likelihood algorithm does not converge. Young researchers, particularly postgraduate students, may not know why the separation problem (quasi or complete) occurs, how to identify it, or how to fix it. This study was designed to critically evaluate convergence issues in articles that employed logistic regression analysis published in the African Journal of Medicine and Medical Sciences between 2004 and 2013. Problems of quasi or complete separation were described and illustrated with the National Demographic and Health Survey dataset. A critical evaluation of articles that employed logistic regression was conducted. A total of 581 articles were reviewed, of which 40 (6.9%) used binary logistic regression. Twenty-four (60.0%) stated the use of the logistic regression model in the methodology, while none of the articles assessed model fit. Only 3 (12.5%) properly described the procedures. Of the 40 that used the logistic regression model, the problem of convergence occurred in 6 (15.0%) of the articles. Logistic regression tends to be poorly reported in studies published between 2004 and 2013. Our findings showed that the procedure may not be well understood by researchers, since very few described the process in their reports, and that they may be totally unaware of the problem of convergence or how to deal with it.
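
    For readers unfamiliar with the separation problem mentioned in this record, the sketch below shows a simple check for the single-predictor case: complete separation occurs when some threshold on the predictor splits the two outcome groups perfectly, and quasi-complete separation when the groups only touch at the boundary value. This is an illustrative check on hypothetical data, not the procedure used in the study, and it does not cover separation by combinations of several predictors.

```python
def separation_status(x, y):
    """Classify separation for one numeric predictor x and binary outcome y.

    Returns "complete", "quasi-complete", or "none".
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome classes must be present")
    # Strict gap between the groups: a threshold classifies perfectly,
    # so the maximum likelihood estimate of the slope is infinite.
    if max(x0) < min(x1) or max(x1) < min(x0):
        return "complete"
    # Groups overlap only at a shared boundary value.
    if max(x0) <= min(x1) or max(x1) <= min(x0):
        return "quasi-complete"
    return "none"

# Hypothetical examples.
print(separation_status([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1]))
print(separation_status([1, 2, 3, 3, 4, 5], [0, 0, 0, 1, 1, 1]))
print(separation_status([1, 2, 3, 4, 5, 6], [0, 1, 0, 1, 0, 1]))
```

    When either form of separation is present, the likelihood has no finite maximum in at least one coefficient, which is what a non-converging fitting routine or an implausibly large estimate with a huge standard error is signalling.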

  3. Binary Logistic Regression Versus Boosted Regression Trees in Assessing Landslide Susceptibility for Multiple-Occurring Regional Landslide Events: Application to the 2009 Storm Event in Messina (Sicily, southern Italy).

    Science.gov (United States)

    Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.

    2014-12-01

    This study aims to compare the performance of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) methods in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and the Giampilieri streams, both stretching for a few kilometres from the Peloritan ridge (eastern Sicily, Italy) to the Ionian sea. This area was struck on 1 October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly of the debris flow and debris avalanche types, involving the weathered layer of a low to high grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; in addition, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-fold tests so as to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in the predictor selection and the precision of the susceptibility estimates. All the models obtained with the two methods produced very high predictive performances, with a general congruence between BLR and BRT in the predictor importance. In particular, the research highlighted that BRT models reached a higher prediction performance than BLR models for RP-based modelling, whilst for the SP-based models the difference in predictive skill between the two methods dropped drastically, converging to an analogous excellent performance. However, when looking at the precision of the probability estimates, BLR proved to produce more robust

  4. Statistical power to detect violation of the proportional hazards assumption when using the Cox regression model.

    Science.gov (United States)

    Austin, Peter C

    2018-01-01

    The use of the Cox proportional hazards regression model is widespread. A key assumption of the model is that of proportional hazards. Analysts frequently test the validity of this assumption using statistical significance testing. However, the statistical power of such assessments is frequently unknown. We used Monte Carlo simulations to estimate the statistical power of two different methods for detecting violations of this assumption. When the covariate was binary, we found that a model-based method had greater power than a method based on cumulative sums of martingale residuals. Furthermore, the parametric nature of the distribution of event times had an impact on power when the covariate was binary. Statistical power to detect a strong violation of the proportional hazards assumption was low to moderate even when the number of observed events was high. In many data sets, power to detect a violation of this assumption is likely to be low to modest.

  5. Performance of models for estimating absolute risk difference in multicenter trials with binary outcome

    Directory of Open Access Journals (Sweden)

    Claudia Pedroza

    2016-08-01

    Background: Reporting of absolute risk difference (RD) is recommended for clinical and epidemiological prospective studies. In analyses of multicenter studies, adjustment for center is necessary when randomization is stratified by center or when there is large variation in patient outcomes across centers. While regression methods are used to estimate RD adjusted for baseline predictors and clustering, no formal evaluation of their performance has been previously conducted. Methods: We performed a simulation study to evaluate 6 regression methods fitted under a generalized estimating equation framework: binomial identity, Poisson identity, Normal identity, log binomial, log Poisson, and logistic regression model. We compared the model estimates to unadjusted estimates. We varied the true response function (identity or log), number of subjects per center, true risk difference, control outcome rate, effect of baseline predictor, and intracenter correlation. We compared the models in terms of convergence, absolute bias and coverage of 95 % confidence intervals for RD. Results: The 6 models performed very similarly to each other for the majority of scenarios. However, the log binomial model did not converge for a large portion of the scenarios including a baseline predictor. In scenarios with outcome rate close to the parameter boundary, the binomial and Poisson identity models had the best performance, but differences from other models were negligible. The unadjusted method introduced little bias to the RD estimates, but its coverage was larger than the nominal value in some scenarios with an identity response. Under the log response, coverage from the unadjusted method was well below the nominal value (<80 %) for some scenarios. Conclusions: We recommend the use of a binomial or Poisson GEE model with identity link to estimate RD for correlated binary outcome data. If these models fail to run, then either a logistic regression, log Poisson

  6. Misclassification in binary choice models

    Czech Academy of Sciences Publication Activity Database

    Meyer, B. D.; Mittag, Nikolas

    2017-01-01

    Vol. 200, No. 2 (2017), pp. 295-311 ISSN 0304-4076 Institutional support: RVO:67985998 Keywords: measurement error * binary choice models * program take-up Subject RIV: AH - Economics OECD field: Economic Theory Impact factor: 1.633, year: 2016

  7. Accuracy of Binary Black Hole Waveform Models for Advanced LIGO

    Science.gov (United States)

    Kumar, Prayush; Fong, Heather; Barkett, Kevin; Bhagwat, Swetha; Afshari, Nousha; Chu, Tony; Brown, Duncan; Lovelace, Geoffrey; Pfeiffer, Harald; Scheel, Mark; Szilagyi, Bela; Simulating Extreme Spacetimes (SXS) Team

    2016-03-01

    Coalescing binaries of compact objects, such as black holes and neutron stars, are the primary targets for gravitational-wave (GW) detection with Advanced LIGO. Accurate modeling of the emitted GWs is required to extract information about the binary source. The most accurate solution to the general relativistic two-body problem is available in numerical relativity (NR), which is however limited in application due to computational cost. Current searches use semi-analytic models that are based in post-Newtonian (PN) theory and calibrated to NR. In this talk, I will present comparisons between contemporary models and high-accuracy numerical simulations performed using the Spectral Einstein Code (SpEC), focusing on the questions: (i) How well do models capture a binary's late inspiral, where they lack a-priori accurate information from PN or NR? and (ii) How accurately do they model binaries with parameters outside their range of calibration? These results guide the choice of templates for future GW searches, and motivate future modeling efforts.

  8. Regression modeling of ground-water flow

    Science.gov (United States)

    Cooley, R.L.; Naff, R.L.

    1985-01-01

    Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)

  9. The mechanical properties of high speed GTAW weld and factors of nonlinear multiple regression model under external transverse magnetic field

    Science.gov (United States)

    Lu, Lin; Chang, Yunlong; Li, Yingmin; He, Youyou

    2013-05-01

    A transverse magnetic field was introduced to the arc plasma in the process of welding stainless steel tubes by high-speed Tungsten Inert Gas Arc Welding (TIG for short) without filler wire. The influence of external magnetic field on welding quality was investigated. 9 sets of parameters were designed by the means of orthogonal experiment. The welding joint tensile strength and form factor of weld were regarded as the main standards of welding quality. A binary quadratic nonlinear regression equation was established with the conditions of magnetic induction and flow rate of Ar gas. The residual standard deviation was calculated to adjust the accuracy of regression model. The results showed that, the regression model was correct and effective in calculating the tensile strength and aspect ratio of weld. Two 3D regression models were designed respectively, and then the impact law of magnetic induction on welding quality was researched.

  10. Regression Models for Market-Shares

    DEFF Research Database (Denmark)

    Birch, Kristina; Olsen, Jørgen Kai; Tjur, Tue

    2005-01-01

    On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put on the interpretation of the parameters in relation to models for the total sales based on discrete choice models. Key words and phrases: MCI model, discrete choice model, market-shares, price elasticity, regression model.

  11. Binary versus non-binary information in real time series: empirical results and maximum-entropy matrix models

    Science.gov (United States)

    Almog, Assaf; Garlaschelli, Diego

    2014-09-01

    The dynamics of complex systems, from financial markets to the brain, can be monitored in terms of multiple time series of activity of the constituent units, such as stocks or neurons, respectively. While the main focus of time series analysis is on the magnitude of temporal increments, a significant piece of information is encoded into the binary projection (i.e. the sign) of such increments. In this paper we provide further evidence of this by showing strong nonlinear relations between binary and non-binary properties of financial time series. These relations are a novel quantification of the fact that extreme price increments occur more often when most stocks move in the same direction. We then introduce an information-theoretic approach to the analysis of the binary signature of single and multiple time series. Through the definition of maximum-entropy ensembles of binary matrices and their mapping to spin models in statistical physics, we quantify the information encoded into the simplest binary properties of real time series and identify the most informative property given a set of measurements. Our formalism is able to accurately replicate, and mathematically characterize, the observed binary/non-binary relations. We also obtain a phase diagram allowing us to identify, based only on the instantaneous aggregate return of a set of multiple time series, a regime where the so-called ‘market mode’ has an optimal interpretation in terms of collective (endogenous) effects, a regime where it is parsimoniously explained by pure noise, and a regime where it can be regarded as a combination of endogenous and exogenous factors. Our approach allows us to connect spin models, simple stochastic processes, and ensembles of time series inferred from partial information.
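
    As an illustration of the binary projection described above, the following minimal Python sketch reduces each series to the signs of its increments and averages those signs across series at each time step. The function names and the toy price lists are hypothetical, chosen for illustration only; this is not the authors' maximum-entropy formalism.

```python
def binary_projection(series):
    """Return the sign (+1, -1 or 0) of each increment of one time series."""
    increments = (b - a for a, b in zip(series, series[1:]))
    return [(d > 0) - (d < 0) for d in increments]


def mean_sign(all_series):
    """Average the binary projections of several equal-length series, step by step."""
    projections = [binary_projection(s) for s in all_series]
    return [sum(step) / len(step) for step in zip(*projections)]


# Hypothetical toy prices for two stocks (not data from the paper).
stock_a = [10.0, 10.5, 10.2, 10.9]
stock_b = [20.0, 19.5, 19.0, 19.8]
print(binary_projection(stock_a))
print(mean_sign([stock_a, stock_b]))
```

    A mean sign close to +1 or -1 marks a time step at which most series moved in the same direction, which is the situation the abstract associates with extreme price increments.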

  12. Binary versus non-binary information in real time series: empirical results and maximum-entropy matrix models

    International Nuclear Information System (INIS)

    Almog, Assaf; Garlaschelli, Diego

    2014-01-01

    The dynamics of complex systems, from financial markets to the brain, can be monitored in terms of multiple time series of activity of the constituent units, such as stocks or neurons, respectively. While the main focus of time series analysis is on the magnitude of temporal increments, a significant piece of information is encoded into the binary projection (i.e. the sign) of such increments. In this paper we provide further evidence of this by showing strong nonlinear relations between binary and non-binary properties of financial time series. These relations are a novel quantification of the fact that extreme price increments occur more often when most stocks move in the same direction. We then introduce an information-theoretic approach to the analysis of the binary signature of single and multiple time series. Through the definition of maximum-entropy ensembles of binary matrices and their mapping to spin models in statistical physics, we quantify the information encoded into the simplest binary properties of real time series and identify the most informative property given a set of measurements. Our formalism is able to accurately replicate, and mathematically characterize, the observed binary/non-binary relations. We also obtain a phase diagram allowing us to identify, based only on the instantaneous aggregate return of a set of multiple time series, a regime where the so-called ‘market mode’ has an optimal interpretation in terms of collective (endogenous) effects, a regime where it is parsimoniously explained by pure noise, and a regime where it can be regarded as a combination of endogenous and exogenous factors. Our approach allows us to connect spin models, simple stochastic processes, and ensembles of time series inferred from partial information. (paper)

  13. A binary logistic regression model with complex sampling design of unmet need for family planning among all women aged (15-49) in Ethiopia.

    Science.gov (United States)

    Workie, Demeke Lakew; Zike, Dereje Tesfaye; Fenta, Haile Mekonnen; Mekonnen, Mulusew Admasu

    2017-09-01

    Unintended pregnancy related to unmet need is a worldwide problem that affects societies. The main objective of this study was to identify the prevalence and determinants of unmet need for family planning among women aged (15-49) in Ethiopia. The Performance Monitoring and Accountability 2020/Ethiopia survey was conducted in April 2016 at round-4 from 7494 women with two-stage-stratified sampling. Bi-variable and multi-variable binary logistic regression model with complex sampling design was fitted. The prevalence of unmet need for family planning was 16.2% in Ethiopia. Women between the age range of 15-24 years were 2.266 times more likely to have an unmet need for family planning compared to those above 35 years. Women who were currently married were about 8 times more likely to have an unmet need for family planning compared to never married women. Women who had no under-five child were 0.125 times as likely to have an unmet need for family planning compared to those who had more than two under-5 children. The key determinants of unmet need for family planning in Ethiopia were residence, age, marital-status, education, household members, birth-events and number of under-5 children. Thus the Government of Ethiopia should take immediate steps to address the causes of high unmet need for family planning among women.

  14. Misclassification in binary choice models

    Czech Academy of Sciences Publication Activity Database

    Meyer, B. D.; Mittag, Nikolas

    2017-01-01

    Vol. 200, No. 2 (2017), pp. 295-311 ISSN 0304-4076 R&D Projects: GA ČR(CZ) GJ16-07603Y Institutional support: Progres-Q24 Keywords: measurement error * binary choice models * program take-up Subject RIV: AH - Economics OECD field: Economic Theory Impact factor: 1.633, year: 2016

  15. A Seemingly Unrelated Poisson Regression Model

    OpenAIRE

    King, Gary

    1989-01-01

    This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.

  16. Solving and interpreting binary classification problems in marketing with SVMs

    NARCIS (Netherlands)

    J.C. Bioch (Cor); P.J.F. Groenen (Patrick); G.I. Nalbantov (Georgi)

    2005-01-01

    Marketing problems often involve binary classification of customers into "buyers" versus "non-buyers" or "prefers brand A" versus "prefers brand B". These cases require binary classification models such as logistic regression, linear, and quadratic discriminant analysis. A

  17. Radio emission from symbiotic stars: a binary model

    International Nuclear Information System (INIS)

    Taylor, A.R.; Seaquist, E.R.

    1985-01-01

    The authors examine a binary model for symbiotic stars to account for their radio properties. The system is comprised of a cool, mass-losing star and a hot companion. Radio emission arises in the portion of the stellar wind photo-ionized by the hot star. Computer simulations for the case of uniform mass loss at constant velocity show that when less than half the wind is ionized, optically thick spectral indices greater than +0.6 are produced. Model fits to radio spectra allow the binary separation, wind density and ionizing photon luminosity to be calculated. They apply the model to the symbiotic star H1-36. (orig.)

  18. Panel Smooth Transition Regression Models

    DEFF Research Database (Denmark)

    González, Andrés; Terasvirta, Timo; Dijk, Dick van

    We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...

  19. The intermediate endpoint effect in logistic and probit regression

    Science.gov (United States)

    MacKinnon, DP; Lockwood, CM; Brown, CH; Wang, W; Hoffman, JM

    2010-01-01

    Background: An intermediate endpoint is hypothesized to be in the middle of the causal sequence relating an independent variable to a dependent variable. The intermediate variable is also called a surrogate or mediating variable and the corresponding effect is called the mediated, surrogate endpoint, or intermediate endpoint effect. Clinical studies are often designed to change an intermediate or surrogate endpoint and through this intermediate change influence the ultimate endpoint. In many intermediate endpoint clinical studies the dependent variable is binary, and logistic or probit regression is used. Purpose: The purpose of this study is to describe a limitation of a widely used approach to assessing intermediate endpoint effects and to propose an alternative method, based on products of coefficients, that yields more accurate results. Methods: The intermediate endpoint model for a binary outcome is described for a true binary outcome and for a dichotomization of a latent continuous outcome. Plots of true values and a simulation study are used to evaluate the different methods. Results: Distorted estimates of the intermediate endpoint effect and incorrect conclusions can result from the application of widely used methods to assess the intermediate endpoint effect. The same problem occurs for the proportion of an effect explained by an intermediate endpoint, which has been suggested as a useful measure for identifying intermediate endpoints. A solution to this problem is given based on the relationship between latent variable modeling and logistic or probit regression. Limitations: More complicated intermediate variable models are not addressed in the study, although the methods described in the article can be extended to these more complicated models. Conclusions: Researchers are encouraged to use an intermediate endpoint method based on the product of regression coefficients. A common method based on difference in coefficient methods can lead to distorted

  20. Binary classification of dyslipidemia from the waist-to-hip ratio and body mass index: a comparison of linear, logistic, and CART models

    Directory of Open Access Journals (Sweden)

    Paccaud Fred

    2004-04-01

    Background: We sought to improve upon previously published statistical modeling strategies for binary classification of dyslipidemia for general population screening purposes based on the waist-to-hip circumference ratio and body mass index anthropometric measurements. Methods: Study subjects were participants in WHO-MONICA population-based surveys conducted in two Swiss regions. Outcome variables were based on the total serum cholesterol to high density lipoprotein cholesterol ratio. The other potential predictor variables were gender, age, current cigarette smoking, and hypertension. The models investigated were: (i) linear regression; (ii) logistic classification; (iii) regression trees; (iv) classification trees ((iii) and (iv) are collectively known as "CART"). Binary classification performance of the region-specific models was externally validated by classifying the subjects from the other region. Results: Waist-to-hip circumference ratio and body mass index remained modest predictors of dyslipidemia. Correct classification rates for all models were 60–80%, with marked gender differences. Gender-specific models provided only small gains in classification. The external validations provided assurance about the stability of the models. Conclusions: There were no striking differences between either the algebraic (i, ii) vs. non-algebraic (iii, iv), or the regression (i, iii) vs. classification (ii, iv) modeling approaches. Anticipated advantages of the CART vs. simple additive linear and logistic models were less than expected in this particular application with a relatively small set of predictor variables. CART models may be more useful when considering main effects and interactions between larger sets of predictor variables.

  1. ATLS Hypovolemic Shock Classification by Prediction of Blood Loss in Rats Using Regression Models.

    Science.gov (United States)

    Choi, Soo Beom; Choi, Joon Yul; Park, Jee Soo; Kim, Deok Won

    2016-07-01

    In our previous study, our input data set consisted of 78 rats, the blood loss in percent as a dependent variable, and 11 independent variables (heart rate, systolic blood pressure, diastolic blood pressure, mean arterial pressure, pulse pressure, respiration rate, temperature, perfusion index, lactate concentration, shock index, and new index (lactate concentration/perfusion)). The machine learning methods for multicategory classification were applied to a rat model in acute hemorrhage to predict the four Advanced Trauma Life Support (ATLS) hypovolemic shock classes for triage in our previous study. However, multicategory classification is much more difficult and complicated than binary classification. We introduce a simple approach for classifying ATLS hypovolaemic shock class by predicting blood loss in percent using support vector regression and multivariate linear regression (MLR). We also compared the performance of the classification models using absolute and relative vital signs. The accuracies of support vector regression and MLR models with relative values by predicting blood loss in percent were 88.5% and 84.6%, respectively. These were better than the best accuracy of 80.8% of the direct multicategory classification using the support vector machine one-versus-one model in our previous study for the same validation data set. Moreover, the simple MLR models with both absolute and relative values could provide possibility of the future clinical decision support system for ATLS classification. The perfusion index and new index were more appropriate with relative changes than absolute values.

  2. Using a binary logistic regression method and GIS for evaluating and mapping the groundwater spring potential in the Sultan Mountains (Aksehir, Turkey)

    Science.gov (United States)

    Ozdemir, Adnan

    2011-07-01

    Summary: The purpose of this study is to produce a groundwater spring potential map of the Sultan Mountains in central Turkey, based on a logistic regression method within a Geographic Information System (GIS) environment. Using field surveys, the locations of the springs (440 springs) were determined in the study area. In this study, 17 spring-related factors were used in the analysis: geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transport capacity index, distance to drainage, distance to fault, drainage density, and fault density map. The coefficients of the predictor variables were estimated using binary logistic regression analysis and were used to calculate the groundwater spring potential for the entire study area. The accuracy of the final spring potential map was evaluated based on the observed springs. The accuracy of the model was evaluated by calculating the relative operating characteristics. The area value of the relative operating characteristic curve model was found to be 0.82. These results indicate that the model is a good estimator of the spring potential in the study area. The spring potential map shows that the areas of very low, low, moderate and high groundwater spring potential classes are 105.586 km² (28.99%), 74.271 km² (19.906%), 101.203 km² (27.14%), and 90.05 km² (24.671%), respectively. The interpretations of the potential map showed that stream power index, relative permeability of lithologies, geology, elevation, aspect, wetness index, plan curvature, and drainage density play major roles in spring occurrence and distribution in the Sultan Mountains. The logistic regression approach has not yet been used to delineate groundwater potential zones. In this study, the logistic regression method was used to locate potential zones for groundwater springs in the Sultan Mountains. The evolved model
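
    The area under a relative (receiver) operating characteristic curve, such as the 0.82 reported above, equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The short Python sketch below computes it by direct pairwise comparison. The function name and the toy scores are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count one half).

    scores: predicted probabilities or any ranking score
    labels: 1 for an observed event (e.g. a spring), 0 otherwise
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy example: two cells with springs, two without.
print(roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

    The pairwise loop is quadratic in the number of cases, which is fine for a sketch; for large rasters a rank-based computation would be the more usual choice.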

  3. Interpretation of commonly used statistical regression models.

    Science.gov (United States)

    Kasza, Jessica; Wolfe, Rory

    2014-01-01

    A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
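
    For logistic regression, the usual reading of a coefficient is that its exponential is the odds ratio associated with a one-unit increase in the predictor. The Python sketch below illustrates that reading under stated assumptions: the function names, the coefficient value and the baseline probability are hypothetical and do not come from the reviewed study.

```python
import math


def odds_ratio(beta, delta=1.0):
    """Odds ratio implied by a logistic coefficient for a change of `delta` units."""
    return math.exp(beta * delta)


def probability_after_change(p0, beta, delta=1.0):
    """Probability after the predictor rises by `delta`, starting from baseline p0."""
    odds0 = p0 / (1.0 - p0)
    odds1 = odds0 * odds_ratio(beta, delta)
    return odds1 / (1.0 + odds1)


# Hypothetical coefficient: log(2), i.e. the odds double per unit increase.
beta = math.log(2.0)
print(odds_ratio(beta))
print(probability_after_change(0.5, beta))
```

    The second function shows why an odds ratio is not a ratio of probabilities: the same coefficient moves the probability by different amounts depending on the baseline.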

  4. ACOUSTIC EFFECTS ON BINARY AEROELASTICITY MODEL

    Directory of Open Access Journals (Sweden)

    Kok Hwa Yu

    2011-10-01

    Acoustics is the science concerned with the study of sound. The effects of sound on structures attract overwhelming interest, and numerous studies have been carried out in this particular area. Many of the preliminary investigations show that acoustic pressure produces significant influences on structures such as thin plates, membranes and also high-impedance media like water (and other similar fluids). Thus, it is useful to investigate the structural response in the presence of acoustics on aircraft, especially on aircraft wings, tails and control surfaces, which are vulnerable to flutter phenomena. The present paper describes the modeling of structural-acoustic interactions to simulate the external acoustic effect on a binary flutter model. Here, the binary flutter model, which is illustrated as a rectangular wing, is constructed using strip theory with simplified unsteady aerodynamics involving flap and pitch degree of freedom terms. The external acoustic excitation, on the other hand, is modeled using a four-node quadrilateral isoparametric element via a finite element approach. Both equations are then carefully coupled and solved using an eigenvalue solution. The approach is implemented in MATLAB, and the simulated results are described, analyzed and illustrated in this paper.

  5. Thermodynamic modeling of saturated liquid compositions and densities for asymmetric binary systems composed of carbon dioxide, alkanes and alkanols

    International Nuclear Information System (INIS)

    Bayestehparvin, Bita; Nourozieh, Hossein; Kariznovi, Mohammad; Abedi, Jalal

    2015-01-01

    Highlights: • Phase behavior of the binary systems containing largely different components. • Equation of state modeling of binary polar and non-polar systems by utilizing different mixing rules. • Three different mixing rules (one-parameter, two-parameters and Wong–Sandler) coupled with Peng–Robinson equation of state. • Two-parameter mixing rule shows promoting results compared to one-parameter mixing rule. • Wong–Sandler mixing rule is unable to predict saturated liquid densities with sufficient accuracy. - Abstract: The present study mainly focuses on the phase behavior modeling of asymmetric binary mixtures. Capability of different mixing rules and volume shift in the prediction of solubility and saturated liquid density has been investigated. Different binary systems of (alkane + alkanol), (alkane + alkane), (carbon dioxide + alkanol), and (carbon dioxide + alkane) are considered. The composition and the density of saturated liquid phase at equilibrium condition are the properties of interest. Considering composition and saturated liquid density of different binary systems, three main objectives are investigated. First, three different mixing rules (one-parameter, two parameters and Wong–Sandler) coupled with Peng–Robinson equation of state were used to predict the equilibrium properties. The Wong–Sandler mixing rule was utilized with the non-random two-liquid (NRTL) model. Binary interaction coefficients and NRTL model parameters were optimized using the Levenberg–Marquardt algorithm. Second, to improve the density prediction, the volume translation technique was applied. Finally, two different approaches were considered to tune the equation of state; regression of experimental equilibrium compositions and densities separately and spontaneously. The modeling results show that there is no superior mixing rule which can predict the equilibrium properties for different systems. Two-parameter and Wong–Sandler mixing rule show promoting

  6. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    Science.gov (United States)

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected

  7. Photometric Analysis and Modeling of Five Mass-Transferring Binary Systems

    Science.gov (United States)

    Geist, Emily; Beaky, Matthew; Jamison, Kate

    2018-01-01

    In overcontact eclipsing binary systems, both stellar components have overfilled their Roche lobes, resulting in a dumbbell-shaped shared envelope. Mass transfer is common in overcontact binaries, which can be observed as a slow change in the rotation period of the system. We studied five overcontact eclipsing binary systems with evidence of period change, and thus likely mass transfer between the components, identified by Nelson (2014): V0579 Lyr, KN Vul, V0406 Lyr, V2240 Cyg, and MS Her. We used the 31-inch NURO telescope at Lowell Observatory in Flagstaff, Arizona to obtain images in B, V, R, and I filters for V0579 Lyr, and the 16-inch Meade LX200GPS telescope with attached SBIG ST-8XME CCD camera at Juniata College in Huntingdon, Pennsylvania to image KN Vul, V0406 Lyr, V2240 Cyg, and MS Her, also in B, V, R, and I. After data reduction, we created light curves for each of the systems and modeled the eclipsing binaries using the BinaryMaker3 and PHOEBE programs to determine their fundamental physical parameters for the first time. Complete light curves and preliminary models for each of these neglected eclipsing binary systems will be presented.

  8. [Logistic regression model of noninvasive prediction for portal hypertensive gastropathy in patients with hepatitis B associated cirrhosis].

    Science.gov (United States)

    Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo

    2015-05-12

    To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a Logistic regression model of noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary Logistic analysis. Multivariate Logistic regression was used for further analysis of significant noninvasive independent variables. Logistic regression model was established and odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of model were evaluated by the curve of receiver operating characteristic (ROC). According to univariate Logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major occurring factors for PHG. The established regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4. The accuracy of model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predicted Logistic regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4.
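    As a hedged illustration of how the reported equation could be applied, the sketch below converts the logit into a predicted probability with the inverse-logit function. The coefficients are those quoted in the abstract; the abstract does not say how X1 to X4 are coded (units, binary or continuous), so the input values used here are purely hypothetical.

```python
import math

# Coefficients quoted in the abstract:
# Logit P = -2.667 + 2.186*X1 - 2.167*X2 + 0.725*X3 + 0.976*X4
INTERCEPT = -2.667
COEFFS = (2.186, -2.167, 0.725, 0.976)  # X1..X4 in the abstract's order


def phg_logit(x):
    """Linear predictor (log-odds) for a tuple (X1, X2, X3, X4)."""
    return INTERCEPT + sum(b * xi for b, xi in zip(COEFFS, x))


def phg_probability(x):
    """Inverse-logit transform of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-phg_logit(x)))


if __name__ == "__main__":
    # Hypothetical coding: the abstract does not define the variable scales.
    example = (1, 0, 1, 1)
    print(f"logit = {phg_logit(example):.3f}")
    print(f"P(PHG) = {phg_probability(example):.3f}")
```

    A predicted probability would then be compared with a cut-off chosen from the ROC curve; the abstract does not report which cut-off was used.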

  9. Binary model for the coma cluster of galaxies

    International Nuclear Information System (INIS)

    Valtonen, M.J.; Byrd, G.G.

    1979-01-01

    We study the dynamics of galaxies in the Coma cluster and find that the cluster is probably dominated by a central binary of galaxies, NGC 4874-NGC 4889. We estimate their total mass to be about 3 × 10^14 M_sun by two independent methods (assuming a Hubble constant of 100 km s^-1 Mpc^-1). This binary is efficient in dynamically ejecting smaller galaxies, some of which are seen in projection against the inner 3° radius of the cluster and which, if erroneously considered as bound members, cause a serious overestimate of the mass of the entire cluster. Taking account of the ejected galaxies, we estimate the total cluster mass to be 4-9 × 10^14 M_sun, with a corresponding mass-to-light ratio for a typical galaxy in the range of 20-120 solar units. The origin of the secondary maximum observed in the radial surface density profile is studied. We consider it to be a remnant of a shell of galaxies which formed around the central binary. This shell expanded, then collapsed into the binary, and is now reexpanding. This is supported by the coincidence of the minimum in the cluster eccentricity and radial velocity dispersion at the same radial distance as the secondary maximum. Numerical simulations of a cluster model with a massive central binary and a spherical shell of test particles are performed, and they reproduce the observed shape, galaxy density, and radial velocity distributions in the Coma cluster fairly well. Consequences of extending the model to other clusters are discussed.

  10. Complete waveform model for compact binaries on eccentric orbits

    Science.gov (United States)

    Huerta, E. A.; Kumar, Prayush; Agarwal, Bhanu; George, Daniel; Schive, Hsi-Yu; Pfeiffer, Harald P.; Haas, Roland; Ren, Wei; Chu, Tony; Boyle, Michael; Hemberger, Daniel A.; Kidder, Lawrence E.; Scheel, Mark A.; Szilagyi, Bela

    2017-01-01

    We present a time domain waveform model that describes the inspiral, merger and ringdown of compact binary systems whose components are nonspinning, and which evolve on orbits with low to moderate eccentricity. The inspiral evolution is described using third-order post-Newtonian equations both for the equations of motion of the binary, and its far-zone radiation field. This latter component also includes instantaneous, tails and tails-of-tails contributions, and a contribution due to nonlinear memory. This framework reduces to the post-Newtonian approximant TaylorT4 at third post-Newtonian order in the zero-eccentricity limit. To improve phase accuracy, we also incorporate higher-order post-Newtonian corrections for the energy flux of quasicircular binaries and gravitational self-force corrections to the binding energy of compact binaries. This enhanced prescription for the inspiral evolution is combined with a fully analytical prescription for the merger-ringdown evolution constructed using a catalog of numerical relativity simulations. We show that this inspiral-merger-ringdown waveform model reproduces the effective-one-body model of Ref. [Y. Pan et al., Phys. Rev. D 89, 061501 (2014), doi:10.1103/PhysRevD.89.061501] for quasicircular black hole binaries with mass ratios between 1 and 15 in the zero-eccentricity limit over a wide range of the parameter space under consideration. Using a set of eccentric numerical relativity simulations, not used during calibration, we show that our new eccentric model reproduces the true features of eccentric compact binary coalescence throughout merger. We use this model to show that the gravitational-wave transients GW150914 and GW151226 can be effectively recovered with template banks of quasicircular, spin-aligned waveforms if the eccentricity e0 of these systems when they enter the aLIGO band at a gravitational-wave frequency of 14 Hz satisfies e0 ≤ 0.15 for GW150914 and e0 ≤ 0.1 for GW151226. We also find that varying the spin

  11. Modeling of formation of binary-phase hollow nanospheres from metallic solid nanospheres

    International Nuclear Information System (INIS)

    Svoboda, J.; Fischer, F.D.; Vollath, D.

    2009-01-01

    Spontaneous formation of binary-phase hollow nanospheres by reaction of a metallic nanosphere with a non-metallic component in the surrounding atmosphere is observed for many systems. The kinetic model describing this phenomenon is derived by application of the thermodynamic extremal principle. The necessary condition for formation of the binary-phase hollow nanospheres is that the diffusion coefficient of the metallic component in the binary phase is higher than that of the non-metallic component (i.e., the Kirkendall effect occurs in the correct direction). The model predictions of the time to formation of the binary-phase hollow nanospheres agree with the experimental observations.

  12. Discriminative Elastic-Net Regularized Linear Regression.

    Science.gov (United States)

    Zhang, Zheng; Lai, Zhihui; Xu, Yong; Shao, Ling; Wu, Jian; Xie, Guo-Sen

    2017-03-01

    In this paper, we aim at learning compact and discriminative linear regression models. Linear regression has been widely used in different problems. However, most of the existing linear regression methods exploit the conventional zero-one matrix as the regression targets, which greatly narrows the flexibility of the regression model. Another major limitation of these methods is that the learned projection matrix fails to precisely project the image features to the target space due to their weak discriminative capability. To this end, we present an elastic-net regularized linear regression (ENLR) framework, and develop two robust linear regression models which possess the following special characteristics. First, our methods exploit two particular strategies to enlarge the margins of different classes by relaxing the strict binary targets into a more feasible variable matrix. Second, a robust elastic-net regularization of singular values is introduced to enhance the compactness and effectiveness of the learned projection matrix. Third, the resulting optimization problem of ENLR has a closed-form solution in each iteration, which can be solved efficiently. Finally, rather than directly exploiting the projection matrix for recognition, our methods employ the transformed features as the new discriminative representations to make the final image classification. Compared with the traditional linear regression model and some of its variants, our method is much more accurate in image classification. Extensive experiments conducted on publicly available data sets demonstrate that the proposed framework can outperform the state-of-the-art methods. The MATLAB codes of our methods are available at http://www.yongxu.org/lunwen.html.

  13. Determination of osteoporosis risk factors using a multiple logistic regression model in postmenopausal Turkish women.

    Science.gov (United States)

    Akkus, Zeki; Camdeviren, Handan; Celik, Fatma; Gur, Ali; Nas, Kemal

    2005-09-01

    To determine the risk factors of osteoporosis using a multiple binary logistic regression method and to assess the risk variables for osteoporosis, which is a major and growing health problem in many countries. We presented a case-control study, consisting of 126 postmenopausal healthy women as the control group and 225 postmenopausal osteoporotic women as the case group. The study was carried out in the Department of Physical Medicine and Rehabilitation, Dicle University, Diyarbakir, Turkey between 1999 and 2002. The data from the 351 participants were collected using a standard questionnaire that contains 43 variables. A multiple logistic regression model was then used to evaluate the data and to find the best regression model. The regression model correctly classified 80.1% (281/351) of the participants. Furthermore, the specificity of the model was 67% (84/126 of the control group) while the sensitivity was 88% (197/225 of the case group). We found the distribution of standardized residual values for the final model to be exponential using the Kolmogorov-Smirnov test (p=0.193). The receiver operating characteristic curve was found successful in predicting patients at risk for osteoporosis. This study suggests that low levels of dietary calcium intake, physical activity, education, and longer duration of menopause are independent predictors of the risk of low bone density in our population. Adequate dietary calcium intake in combination with maintaining a daily physical activity, increasing educational level, decreasing birth rate, and duration of breast-feeding may contribute to healthy bones and play a role in practical prevention of osteoporosis in Southeast Anatolia. In addition, the findings of the present study indicate that the use of a multivariate statistical method such as multiple logistic regression in osteoporosis, which may be influenced by many variables, is better than univariate statistical evaluation.
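    The percentages in this abstract follow directly from the counts it reports. As a small worked check, the sketch below recomputes sensitivity, specificity and overall accuracy from those counts; the function and variable names are illustrative and not taken from the study.

```python
# Counts reported in the abstract.
CASES_TOTAL = 225        # postmenopausal osteoporotic women (case group)
CASES_CORRECT = 197      # cases classified as cases (true positives)
CONTROLS_TOTAL = 126     # postmenopausal healthy women (control group)
CONTROLS_CORRECT = 84    # controls classified as controls (true negatives)


def classification_summary(tp, n_cases, tn, n_controls):
    """Return sensitivity, specificity and accuracy as fractions."""
    return {
        "sensitivity": tp / n_cases,
        "specificity": tn / n_controls,
        "accuracy": (tp + tn) / (n_cases + n_controls),
    }


if __name__ == "__main__":
    summary = classification_summary(
        CASES_CORRECT, CASES_TOTAL, CONTROLS_CORRECT, CONTROLS_TOTAL
    )
    for name, value in summary.items():
        print(f"{name}: {100 * value:.1f}%")
```

    Rounded, these reproduce the 88% sensitivity, 67% specificity and 80.1% (281/351) overall classification rate quoted above.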

  14. Logistic Regression: Concept and Application

    Science.gov (United States)

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  15. Poisson Mixture Regression Models for Heart Disease Prediction.

    Science.gov (United States)

    Mufudza, Chipo; Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criterion value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart disease prediction over all models, as it both clusters individuals into a high or low risk category and predicts the rate of heart disease componentwise given the available clusters. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using a Poisson mixture regression model.

  16. Poisson Mixture Regression Models for Heart Disease Prediction

    Science.gov (United States)

    Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criterion value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart disease prediction over all models, as it both clusters individuals into a high or low risk category and predicts the rate of heart disease componentwise given the available clusters. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using a Poisson mixture regression model. PMID:27999611

  17. Evolutionary models of early-type contact binary SV Centauri

    Energy Technology Data Exchange (ETDEWEB)

    Nakamura, Y; Saio, H [Tohoku Univ., Sendai (Japan). Faculty of Science; Sugimoto, Daiichiro

    1978-12-01

    Models of the early-type contact binary system SV Centauri are computed with a binary-star evolution program. The effects of mass exchange, i.e., the effects of mass acceptance as well as mass loss, are properly included. With the initial masses of the component stars as 12.4 and 8.0 M_sun, the following observed configurations are well reproduced: the component stars are definitely in contact and the rate of mass exchange is 4 × 10^-4 M_sun yr^-1. The more massive component is less luminous and has a lower effective temperature. Such features are also reproduced quantitatively. Agreement of the computed models with observation indicates that the binary system SV Cen is actually in the phase of rapid mass exchange preceding the mass-ratio reversal.

  18. Completing the Remedial Sequence and College-Level Credit-Bearing Math: Comparing Binary, Cumulative, and Continuation Ratio Logistic Regression Models

    Science.gov (United States)

    Davidson, J. Cody

    2016-01-01

    Mathematics is the most common subject area of remedial need and the majority of remedial math students never pass a college-level credit-bearing math class. The majority of studies that investigate this phenomenon are conducted at community colleges and use some type of regression model; however, none have used a continuation ratio model. The…

  19. Marginal and Random Intercepts Models for Longitudinal Binary Data with Examples from Criminology

    Science.gov (United States)

    Long, Jeffrey D.; Loeber, Rolf; Farrington, David P.

    2009-01-01

    Two models for the analysis of longitudinal binary data are discussed: the marginal model and the random intercepts model. In contrast to the linear mixed model (LMM), the two models for binary data are not subsumed under a single hierarchical model. The marginal model provides group-level information whereas the random intercepts model provides…

  20. Mixture of Regression Models with Single-Index

    OpenAIRE

    Xiang, Sijia; Yao, Weixin

    2016-01-01

    In this article, we propose a class of semiparametric mixture regression models with single-index. We argue that many recently proposed semiparametric/nonparametric mixture regression models can be considered special cases of the proposed model. However, unlike existing semiparametric mixture regression models, the new proposed model can easily incorporate multivariate predictors into the nonparametric components. Backfitting estimates and the corresponding algorithms have been proposed for...

  1. An Additive-Multiplicative Cox-Aalen Regression Model

    DEFF Research Database (Denmark)

    Scheike, Thomas H.; Zhang, Mei-Jie

    2002-01-01

    Aalen model; additive risk model; counting processes; Cox regression; survival analysis; time-varying effects

  2. Modeling the dynamics of urban growth using multinomial logistic regression: a case study of Jiayu County, Hubei Province, China

    Science.gov (United States)

    Nong, Yu; Du, Qingyun; Wang, Kun; Miao, Lei; Zhang, Weiwei

    2008-10-01

    Urban growth modeling, one of the most important aspects of land use and land cover change study, has attracted substantial attention because it helps to explain the mechanisms of land use change and thus supports the making of relevant policies. This study applied multinomial logistic regression to model urban growth in Jiayu County of Hubei Province, China, to discover the relationship between urban growth and its driving forces, for which biophysical and socio-economic factors are selected as independent variables. This type of regression is similar to binary logistic regression, but it is more general because the dependent variable is not restricted to two categories, as it was in previous studies. The multinomial model can simulate the competition among multiple land uses: urban land, bare land, cultivated land and orchard land. Taking the urban land use type as the reference category, parameters can be estimated and expressed as odds ratios. A probability map is generated from the model to predict where urban growth will occur.
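    To make the reference-category idea concrete, the following is a minimal sketch of how a multinomial logit turns linear predictors into class probabilities and odds ratios, with urban land as the reference category as in the abstract. The coefficients and the single predictor (distance to the nearest road, in km) are hypothetical and are not taken from the study.

```python
import math

REFERENCE = "urban"  # reference category, as in the abstract

# Hypothetical coefficients per non-reference class:
# [intercept, slope for distance to the nearest road in km]
COEFFS = {
    "bare": [-0.5, 0.30],
    "cultivated": [0.2, 0.45],
    "orchard": [-1.0, 0.25],
}


def class_probabilities(x, coeffs=COEFFS, reference=REFERENCE):
    """Multinomial-logit probabilities for a predictor vector x.

    Each non-reference class k has log-odds eta_k relative to the
    reference class, whose linear predictor is fixed at zero.
    """
    etas = {
        k: b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
        for k, b in coeffs.items()
    }
    denom = 1.0 + sum(math.exp(e) for e in etas.values())
    probs = {k: math.exp(e) / denom for k, e in etas.items()}
    probs[reference] = 1.0 / denom
    return probs


def odds_ratios(coeffs=COEFFS):
    """exp(slope): multiplicative change in odds versus the reference
    class for a one-unit increase in each predictor."""
    return {k: [math.exp(bj) for bj in b[1:]] for k, b in coeffs.items()}


if __name__ == "__main__":
    for land_use, p in class_probabilities([2.0]).items():
        print(f"{land_use}: {p:.3f}")
    print(odds_ratios())
```

    Evaluating `class_probabilities` for every grid cell of a study area is one way a probability map of the kind mentioned in the abstract could be produced.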

  3. The Binary System Laboratory Activities Based on Students Mental Model

    Science.gov (United States)

    Albaiti, A.; Liliasari, S.; Sumarna, O.; Martoprawiro, M. A.

    2017-09-01

    Generic science skills (GSS) are required to develop student conception in learning the binary system. The aim of this research was to determine the improvement of students' GSS through binary system laboratory activities based on their mental model, using a hypothetical-deductive learning cycle. It used a mixed methods embedded experimental model research design. This research involved 15 students of a university in Papua, Indonesia. An essay test of 7 items was used to analyze the improvement of students' GSS. Each item was designed to interconnect macroscopic, sub-microscopic and symbolic levels. A student worksheet was used to explore students' mental models during investigation in the laboratory. The increase in students' GSS could be seen in their N-Gain for each GSS indicator. The results were then analyzed descriptively. Students' mental models and GSS improved in this study. They interconnected macroscopic and symbolic levels to explain binary system phenomena. Furthermore, they reconstructed their mental model by interconnecting the three levels of representation in Physical Chemistry. It is necessary to integrate the Physical Chemistry Laboratory into a Physical Chemistry course for effectiveness and efficiency.

  4. [From clinical judgment to linear regression model].

    Science.gov (United States)

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bX, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "X" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R^2) indicates the importance of independent variables in the outcome.
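    The quantities named in this abstract (intercept a, slope b, and the coefficient of determination) can be computed with ordinary least squares. The sketch below is a plain-Python illustration on made-up data; the function name and the example values are assumptions, not material from the article.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of Y = a + b*X.

    Returns (a, b, r2): intercept, slope and coefficient of determination.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                 # slope: change in Y per unit change in X
    a = mean_y - b * mean_x       # intercept: value of Y when X equals 0
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot    # share of the variance in Y explained
    return a, b, r2


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    xs = [1, 2, 3, 4, 5]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    a, b, r2 = fit_line(xs, ys)
    print(f"a = {a:.3f}, b = {b:.3f}, R^2 = {r2:.3f}")
```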

  5. Regression models of reactor diagnostic signals

    International Nuclear Information System (INIS)

    Vavrin, J.

    1989-01-01

    The application of an autoregression model, as the simplest regression model of diagnostic signals, is described for the experimental analysis of diagnostic systems and for in-service monitoring and diagnostics of normal and anomalous conditions. A diagnostic method using a regression-type diagnostic database and regression spectral diagnostics is described. The diagnostics of neutron noise signals from anomalous modes in the experimental fuel assembly of a reactor is also described. (author)

  6. PHYSICS OF ECLIPSING BINARIES. II. TOWARD THE INCREASED MODEL FIDELITY

    Energy Technology Data Exchange (ETDEWEB)

    Prša, A.; Conroy, K. E.; Horvat, M.; Kochoska, A.; Hambleton, K. M. [Villanova University, Dept. of Astrophysics and Planetary Sciences, 800 E Lancaster Avenue, Villanova PA 19085 (United States); Pablo, H. [Université de Montréal, Pavillon Roger-Gaudry, 2900, boul. Édouard-Montpetit Montréal QC H3T 1J4 (Canada); Bloemen, S. [Radboud University Nijmegen, Department of Astrophysics, IMAPP, P.O. Box 9010, 6500 GL, Nijmegen (Netherlands); Giammarco, J. [Eastern University, Dept. of Astronomy and Physics, 1300 Eagle Road, St. Davids, PA 19087 (United States); Degroote, P. [KU Leuven, Instituut voor Sterrenkunde, Celestijnenlaan 200D, B-3001 Heverlee (Belgium)

    2016-12-01

    The precision of photometric and spectroscopic observations has been systematically improved in the last decade, mostly thanks to space-borne photometric missions and ground-based spectrographs dedicated to finding exoplanets. The field of eclipsing binary stars strongly benefited from this development. Eclipsing binaries serve as critical tools for determining fundamental stellar properties (masses, radii, temperatures, and luminosities), yet the models are not capable of reproducing observed data well, either because of the missing physics or because of insufficient precision. This led to a predicament where radiative and dynamical effects, insofar buried in noise, started showing up routinely in the data, but were not accounted for in the models. PHOEBE (PHysics Of Eclipsing BinariEs; http://phoebe-project.org) is an open source modeling code for computing theoretical light and radial velocity curves that addresses both problems by incorporating missing physics and by increasing the computational fidelity. In particular, we discuss triangulation as a superior surface discretization algorithm, meshing of rotating single stars, light travel time effects, advanced phase computation, volume conservation in eccentric orbits, and improved computation of local intensity across the stellar surfaces that includes the photon-weighted mode, the enhanced limb darkening treatment, the better reflection treatment, and Doppler boosting. Here we present the concepts on which PHOEBE is built and proofs of concept that demonstrate the increased model fidelity.

  7. Forecasting with Dynamic Regression Models

    CERN Document Server

    Pankratz, Alan

    2012-01-01

    One of the most widely used tools in statistical forecasting, the single-equation regression model, is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the autocorrelation patterns of regression disturbances. It also includes six case studies.

  8. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    Science.gov (United States)

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from the 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. The five predictor variables included in the model were: a) race, b) frequency of…

  9. First Higher-Multipole Model of Gravitational Waves from Spinning and Coalescing Black-Hole Binaries.

    Science.gov (United States)

    London, Lionel; Khan, Sebastian; Fauchon-Jones, Edward; García, Cecilio; Hannam, Mark; Husa, Sascha; Jiménez-Forteza, Xisco; Kalaghatgi, Chinmay; Ohme, Frank; Pannarale, Francesco

    2018-04-20

    Gravitational-wave observations of binary black holes currently rely on theoretical models that predict the dominant multipoles (ℓ=2,|m|=2) of the radiation during inspiral, merger, and ringdown. We introduce a simple method to include the subdominant multipoles to binary black hole gravitational waveforms, given a frequency-domain model for the dominant multipoles. The amplitude and phase of the original model are appropriately stretched and rescaled using post-Newtonian results (for the inspiral), perturbation theory (for the ringdown), and a smooth transition between the two. No additional tuning to numerical-relativity simulations is required. We apply a variant of this method to the nonprecessing PhenomD model. The result, PhenomHM, constitutes the first higher-multipole model of spinning and coalescing black-hole binaries, and currently includes the (ℓ,|m|)=(2,2),(3,3),(4,4),(2,1),(3,2),(4,3) radiative moments. Comparisons with numerical-relativity waveforms demonstrate that PhenomHM is more accurate than dominant-multipole-only models for all binary configurations, and typically improves the measurement of binary properties.

  10. First Higher-Multipole Model of Gravitational Waves from Spinning and Coalescing Black-Hole Binaries

    Science.gov (United States)

    London, Lionel; Khan, Sebastian; Fauchon-Jones, Edward; García, Cecilio; Hannam, Mark; Husa, Sascha; Jiménez-Forteza, Xisco; Kalaghatgi, Chinmay; Ohme, Frank; Pannarale, Francesco

    2018-04-01

    Gravitational-wave observations of binary black holes currently rely on theoretical models that predict the dominant multipoles (ℓ=2,|m|=2) of the radiation during inspiral, merger, and ringdown. We introduce a simple method to include the subdominant multipoles to binary black hole gravitational waveforms, given a frequency-domain model for the dominant multipoles. The amplitude and phase of the original model are appropriately stretched and rescaled using post-Newtonian results (for the inspiral), perturbation theory (for the ringdown), and a smooth transition between the two. No additional tuning to numerical-relativity simulations is required. We apply a variant of this method to the nonprecessing PhenomD model. The result, PhenomHM, constitutes the first higher-multipole model of spinning and coalescing black-hole binaries, and currently includes the (ℓ,|m|)=(2,2),(3,3),(4,4),(2,1),(3,2),(4,3) radiative moments. Comparisons with numerical-relativity waveforms demonstrate that PhenomHM is more accurate than dominant-multipole-only models for all binary configurations, and typically improves the measurement of binary properties.

  11. Categorical regression dose-response modeling

    Science.gov (United States)

    The goal of this training is to provide participants with training on the use of the U.S. EPA’s Categorical Regression software (CatReg) and its application to risk assessment. Categorical regression fits mathematical models to toxicity data that have been assigned ord...

  12. Modeling and analysis of periodic orbits around a contact binary asteroid

    NARCIS (Netherlands)

    Feng, J.; Noomen, R.; Visser, P.N.A.M.; Yuan, J.

    2015-01-01

    The existence and characteristics of periodic orbits (POs) in the vicinity of a contact binary asteroid are investigated with an averaged spherical harmonics model. A contact binary asteroid consists of two components connected to each other, resulting in a highly bifurcated shape. Here, it is

  13. Mixed-effects regression models in linguistics

    CERN Document Server

    Heylen, Kris; Geeraerts, Dirk

    2018-01-01

    When data consist of grouped observations or clusters, and there is a risk that measurements within the same group are not independent, group-specific random effects can be added to a regression model in order to account for such within-group associations. Regression models that contain such group-specific random effects are called mixed-effects regression models, or simply mixed models. Mixed models are a versatile tool that can handle both balanced and unbalanced datasets and that can also be applied when several layers of grouping are present in the data; these layers can either be nested or crossed.  In linguistics, as in many other fields, the use of mixed models has gained ground rapidly over the last decade. This methodological evolution enables us to build more sophisticated and arguably more realistic models, but, due to its technical complexity, also introduces new challenges. This volume brings together a number of promising new evolutions in the use of mixed models in linguistics, but also addres...

  14. Moderation analysis using a two-level regression model.

    Science.gov (United States)

    Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

    2014-10-01

    Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.

  15. A new model for predicting thermodynamic properties of ternary metallic solution from binary components

    International Nuclear Information System (INIS)

    Fang Zheng; Zhang Quanru

    2006-01-01

    A model has been derived to predict thermodynamic properties of ternary metallic systems from those of their three binaries. In the model, the excess Gibbs free energies and the interaction parameter $\omega_{123}$ for the three components of a ternary are expressed as a simple sum of those of the three sub-binaries, and the mole fractions of the components of the ternary are identical with those of the sub-binaries. This model is greatly simplified compared with the current symmetrical and asymmetrical models. It is able to overcome some shortcomings of the current models, such as the arrangement of the components in the Gibbs triangle, the conversion of mole fractions between ternary and corresponding binaries, and some necessary processes for optimizing the various parameters of these models. Two ternary systems, Mg-Cu-Ni and Cd-Bi-Pb, are recalculated to demonstrate the validity and precision of the present model. The calculated results on the Mg-Cu-Ni system are better than those in the literature. New parameters in the Margules equations expressing the excess Gibbs free energies of three binary systems of the Cd-Bi-Pb ternary system are also given.

  16. Percolation of binary disk systems: Modeling and theory

    International Nuclear Information System (INIS)

    Meeks, Kelsey; Pantoya, Michelle L.

    2017-01-01

    The dispersion and connectivity of particles with a high degree of polydispersity are relevant to problems involving composite material properties and reaction decomposition prediction and have been the subject of much study in the literature. This paper utilizes Monte Carlo models to predict percolation thresholds for two-dimensional systems containing disks of two different radii. Monte Carlo simulations and spanning probability are used to extend prior models into regions of higher polydispersity than those previously considered. A correlation to predict the percolation threshold for binary disk systems is proposed based on the extended dataset presented in this work and compared to previously published correlations. Finally, a set of boundary conditions necessary for a good fit is presented, and a condition for maximizing percolation threshold for binary disk systems is suggested.
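
    As a rough illustration of what a Monte Carlo spanning-probability estimate for a binary disk system can look like, here is a minimal sketch. It is not the authors' code: the unit-square domain, the overlap rule (two disks connect when their centre distance is at most the sum of their radii), the left-to-right spanning criterion, and all function and parameter names are assumptions made for this example.

```python
import random


def find(parent, i):
    """Return the root of i in a union-find forest (with path halving)."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def union(parent, i, j):
    """Merge the clusters containing i and j."""
    ri, rj = find(parent, i), find(parent, j)
    if ri != rj:
        parent[ri] = rj


def spans(disks):
    """True if overlapping disks (x, y, r) link the left and right edges
    of the unit square. The spanning criterion is an assumption."""
    n = len(disks)
    left, right = n, n + 1          # two virtual nodes for the edges
    parent = list(range(n + 2))
    for i, (xi, yi, ri) in enumerate(disks):
        if xi - ri <= 0.0:
            union(parent, i, left)
        if xi + ri >= 1.0:
            union(parent, i, right)
        for j in range(i):
            xj, yj, rj = disks[j]
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= (ri + rj) ** 2:
                union(parent, i, j)
    return find(parent, left) == find(parent, right)


def spanning_probability(n_disks, r_small, r_large, frac_large, trials, seed=0):
    """Fraction of random configurations that span (hypothetical helper)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        disks = [
            (rng.random(), rng.random(),
             r_large if rng.random() < frac_large else r_small)
            for _ in range(n_disks)
        ]
        hits += spans(disks)
    return hits / trials
```

    Sweeping `n_disks` for a fixed radius ratio and locating where the spanning probability crosses one half is one common way to estimate a threshold; whether the paper uses that criterion is not stated in the abstract.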

  17. Variable selection and model choice in geoadditive regression models.

    Science.gov (United States)

    Kneib, Thomas; Hothorn, Torsten; Tutz, Gerhard

    2009-06-01

    Model choice and variable selection are issues of major concern in practical regression analyses, arising in many biometric applications such as habitat suitability analyses, where the aim is to identify the influence of potentially many environmental conditions on certain species. We describe regression models for breeding bird communities that facilitate both model choice and variable selection, by a boosting algorithm that works within a class of geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, and varying coefficients. The major modeling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a smooth component with one degree of freedom to obtain a fair comparison between the model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that automatically performs model choice and variable selection.

  18. Waveform model for an eccentric binary black hole based on the effective-one-body-numerical-relativity formalism

    Science.gov (United States)

    Cao, Zhoujian; Han, Wen-Biao

    2017-08-01

    Binary black hole systems are among the most important sources for gravitational wave detection. They are also good objects for theoretical research in general relativity. A gravitational waveform template is important to data analysis. An effective-one-body-numerical-relativity (EOBNR) model has played an essential role in the LIGO data analysis. For future space-based gravitational wave detection, many binary systems will have some orbital eccentricity. At the same time, the eccentric binary is also an interesting topic for theoretical study in general relativity. In this paper, we construct the first eccentric binary waveform model based on an effective-one-body-numerical-relativity framework. Our basic assumption in the model construction is that the involved eccentricity is small. We have compared our eccentric EOBNR model to the circular one used in the LIGO data analysis. We have also tested our eccentric EOBNR model against another recently proposed eccentric binary waveform model; against numerical relativity simulation results; and against perturbation approximation results for extreme mass ratio binary systems. Compared to numerical relativity simulations with an eccentricity as large as about 0.2, the overlap factor for our eccentric EOBNR model is better than 0.98 for all tested cases, including spinless binary and spinning binary, equal mass binary, and unequal mass binary. Hopefully, our eccentric model can be the starting point to develop a faithful template for future space-based gravitational wave detectors.

  19. Experimental vapor-liquid equilibria data for binary mixtures of xylene isomers

    Directory of Open Access Journals (Sweden)

    W.L. Rodrigues

    2005-09-01

    Separation of aromatic C8 compounds by distillation is a difficult task due to the low relative volatilities of the compounds and to the high degree of purity required of the final commercial products. For rigorous simulation and optimization of this separation, the use of a model capable of describing vapor-liquid equilibria accurately is necessary. Nevertheless, experimental data are not available for all binaries at atmospheric pressure. Vapor-liquid equilibria data for binary mixtures were isobarically obtained with a modified Fischer cell at 100.65 kPa. The vapor and liquid phase compositions were analyzed with a gas chromatograph. The methodology was initially tested for cyclo-hexane+n-heptane data; results obtained are similar to other data in the literature. Data for xylene binary mixtures were then obtained, and after testing, were considered to be thermodynamically consistent. Experimental data were regressed with Aspen Plus® 10.1 and binary interaction parameters were reported for the most frequently used activity coefficient models and for the classic mixing rules of two cubic equations of state.

  20. Confounding of three binary-variables counterfactual model

    OpenAIRE

    Liu, Jingwei; Hu, Shuang

    2011-01-01

    Confounding of three binary-variables counterfactual model is discussed in this paper. According to the effect between the control variable and the covariate variable, we investigate three counterfactual models: the control variable is independent of the covariate variable, the control variable has the effect on the covariate variable and the covariate variable affects the control variable. Using the ancillary information based on conditional independence hypotheses, the sufficient conditions...

  1. The MIDAS Touch: Mixed Data Sampling Regression Models

    OpenAIRE

    Ghysels, Eric; Santa-Clara, Pedro; Valkanov, Rossen

    2004-01-01

    We introduce Mixed Data Sampling (henceforth MIDAS) regression models. The regressions involve time series data sampled at different frequencies. Technically speaking, MIDAS models specify conditional expectations as a distributed lag of regressors recorded at some higher sampling frequencies. We examine the asymptotic properties of MIDAS regression estimation and compare it with traditional distributed lag models. MIDAS regressions have wide applicability in macroeconomics and finance.

  2. Physics Of Eclipsing Binaries. II. Towards the Increased Model Fidelity

    OpenAIRE

    Prša, Andrej; Conroy, Kyle E.; Horvat, Martin; Pablo, Herbert; Kochoska, Angela; Bloemen, Steven; Giammarco, Joseph; Hambleton, Kelly M.; Degroote, Pieter

    2016-01-01

    The precision of photometric and spectroscopic observations has been systematically improved in the last decade, mostly thanks to space-borne photometric missions and ground-based spectrographs dedicated to finding exoplanets. The field of eclipsing binary stars strongly benefited from this development. Eclipsing binaries serve as critical tools for determining fundamental stellar properties (masses, radii, temperatures and luminosities), yet the models are not capable of reproducing observed...

  3. A binary genetic programing model for teleconnection identification between global sea surface temperature and local maximum monthly rainfall events

    Science.gov (United States)

    Danandeh Mehr, Ali; Nourani, Vahid; Hrnjica, Bahrudin; Molajou, Amir

    2017-12-01

    The effectiveness of genetic programming (GP) for solving regression problems in hydrology has been recognized in recent studies. However, its capability to solve classification problems has not been sufficiently explored so far. This study develops and applies a novel classification-forecasting model, namely Binary GP (BGP), for teleconnection studies between sea surface temperature (SST) variations and maximum monthly rainfall (MMR) events. The BGP integrates certain types of data pre-processing and post-processing methods with a conventional GP engine to enhance its ability to solve both regression and classification problems simultaneously. The model was trained and tested using SST series of the Black Sea, Mediterranean Sea, and Red Sea as potential predictors as well as classified MMR events at two locations in Iran as predictand. The skill of the model was measured with regard to different rainfall thresholds and SST lags and compared to that of the hybrid decision tree-association rule (DTAR) model available in the literature. The results indicated that the proposed model can identify potential teleconnection signals of surrounding seas beneficial to long-term forecasting of the occurrence of the classified MMR events.

  4. Real estate value prediction using multivariate regression models

    Science.gov (United States)

    Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav

    2017-11-01

    The real estate market is one of the most competitive in terms of pricing, and prices tend to vary significantly based on many factors; hence it is one of the prime fields in which to apply the concepts of machine learning to optimize and predict prices with high accuracy. In this paper, we therefore present various important features to use while predicting housing prices with good accuracy. We describe regression models that use various features to achieve a lower residual sum of squares error. Using features in a regression model requires some feature engineering for better prediction. Often a set of features (multiple regression) or polynomial regression (applying various powers of the features) is used to obtain a better model fit. Because these models are expected to be susceptible to overfitting, ridge regression is used to reduce it. This paper thus points to the best application of regression models, in addition to other techniques, to optimize the result.
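
    The combination of polynomial features and a ridge penalty mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the single input feature, the choice to penalize every coefficient including the intercept, and the helper names `polynomial_features`, `solve` and `ridge_fit` are all hypothetical. The fit solves the penalized normal equations $(X^\top X + \lambda I)\beta = X^\top y$.

```python
def polynomial_features(x, degree):
    """Rows [1, x, x^2, ..., x^degree] for each scalar input."""
    return [[xi ** d for d in range(degree + 1)] for xi in x]


def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):                    # back substitution
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


def ridge_fit(x, y, degree, lam):
    """Coefficients minimizing RSS + lam * ||beta||^2 (intercept included
    in the penalty, a simplification made for this sketch)."""
    X = polynomial_features(x, degree)
    p = degree + 1
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)
```

    Increasing `lam` shrinks the coefficient vector toward zero, which is the mechanism by which ridge regression trades a little bias for less variance in an over-flexible polynomial fit.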

  5. Targeting: Logistic Regression, Special Cases and Extensions

    Directory of Open Access Journals (Sweden)

    Helmut Schaeben

    2014-12-01

    Logistic regression is a classical linear model for logit-transformed conditional probabilities of a binary target variable. It recovers the true conditional probabilities if the joint distribution of predictors and the target is of log-linear form. Weights-of-evidence is an ordinary logistic regression with parameters equal to the differences of the weights of evidence if all predictor variables are discrete and conditionally independent given the target variable. The hypothesis of conditional independence can be tested in terms of log-linear models. If the assumption of conditional independence is violated, the application of weights-of-evidence corrupts not only the predicted conditional probabilities, but also their rank transform. Logistic regression models that include interaction terms can account for the lack of conditional independence; appropriate interaction terms compensate exactly for violations of conditional independence. Multilayer artificial neural nets may be seen as nested regression-like models, with some sigmoidal activation function. Most often, the logistic function is used as the activation function. If the net topology, i.e., its control, is sufficiently versatile to mimic interaction terms, artificial neural nets are able to account for violations of conditional independence and yield very similar results. Weights-of-evidence cannot reasonably include interaction terms; subsequent modifications of the weights, as often suggested, cannot emulate the effect of interaction terms.
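
    The stated equivalence between weights-of-evidence and logistic regression under conditional independence can be checked numerically. The sketch below assumes binary predictors and uses the usual definitions $W^+ = \ln[P(B \mid D)/P(B \mid \bar D)]$ and $W^- = \ln[P(\bar B \mid D)/P(\bar B \mid \bar D)]$; the function names and the example probabilities are made up for illustration and do not come from the paper. With these definitions the logit of the posterior is the prior logit plus the sum of the $W^-$ terms, plus a slope $W^+ - W^-$ for each predictor that is present.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def weights_of_evidence(p_b_given_d, p_b_given_not_d):
    """Return (W+, W-) for one binary predictor."""
    w_plus = math.log(p_b_given_d / p_b_given_not_d)
    w_minus = math.log((1.0 - p_b_given_d) / (1.0 - p_b_given_not_d))
    return w_plus, w_minus


def posterior_by_bayes(prior, likelihoods, b):
    """Exact P(D | b) when predictors are conditionally independent given D.
    likelihoods[i] = (P(B_i=1 | D), P(B_i=1 | not D))."""
    num, den = prior, 1.0 - prior
    for (p_d, p_nd), b_i in zip(likelihoods, b):
        num *= p_d if b_i else 1.0 - p_d
        den *= p_nd if b_i else 1.0 - p_nd
    return num / (num + den)


def logistic_from_woe(prior, likelihoods):
    """Intercept and slopes of the logistic model implied by the weights."""
    weights = [weights_of_evidence(p_d, p_nd) for p_d, p_nd in likelihoods]
    intercept = logit(prior) + sum(w_minus for _, w_minus in weights)
    slopes = [w_plus - w_minus for w_plus, w_minus in weights]
    return intercept, slopes
```

    The identity holds only because the likelihoods factorize given the target; once predictors are conditionally dependent, the Bayes posterior no longer has this additive form without interaction terms, which is the point the abstract makes.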

  6. Model-based Quantile Regression for Discrete Data

    KAUST Repository

    Padellini, Tullia

    2018-04-10

    Quantile regression is a class of methods devoted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Although this leads to a proper posterior for the regression coefficients, the resulting posterior variance is affected by an unidentifiable parameter, hence any inferential procedure besides point estimation is unreliable. We propose a model-based approach for quantile regression that considers quantiles of the generating distribution directly, and thus allows for a proper uncertainty quantification. We then create a link between quantile regression and generalised linear models by mapping the quantiles to the parameter of the response variable, and we exploit it to fit the model with R-INLA. We also extend it to the case of discrete responses, where there is no 1-to-1 relationship between quantiles and the distribution's parameter, by introducing continuous generalisations of the most common discrete variables (Poisson, Binomial and Negative Binomial) to be exploited in the fitting.

  7. Evolutionary model of the subdwarf binary system LB3459

    International Nuclear Information System (INIS)

    Paczynski, B.; Dearborn, D.S.

    1980-01-01

    An evolutionary model is proposed for the eclipsing binary system LB 3459 (= CPD-60°389 = HDE 269696). The two stars are hot subdwarfs with degenerate helium cores, hydrogen burning shell sources and low mass hydrogen rich envelopes. The system probably evolved through two common envelope phases. After the first such phase it might look like the semi-detached binary AS Eri. Soon after the second common envelope phase the system might look like UU Sge, an eclipsing binary nucleus of a planetary nebula. The present mass of the optical (spectroscopic) primary is probably close to 0.24 solar mass, and the predicted radial velocity amplitude of the primary is about 150 km/s. The optical secondary should be hotter and bolometrically brighter, with a mass of 0.32 solar mass. The primary eclipse is an occultation. (author)

  8. Modeling the rate of HIV testing from repeated binary data amidst potential never-testers.

    Science.gov (United States)

    Rice, John D; Johnson, Brent A; Strawderman, Robert L

    2018-01-04

    Many longitudinal studies with a binary outcome measure involve a fraction of subjects with a homogeneous response profile. In our motivating data set, a study on the rate of human immunodeficiency virus (HIV) self-testing in a population of men who have sex with men (MSM), a substantial proportion of the subjects did not self-test during the follow-up study. The observed data in this context consist of a binary sequence for each subject indicating whether or not that subject experienced any events between consecutive observation time points, so subjects who never self-tested were observed to have a response vector consisting entirely of zeros. Conventional longitudinal analysis is not equipped to handle questions regarding the rate of events (as opposed to the odds, as in the classical logistic regression model). With the exception of discrete mixture models, such methods are also not equipped to handle settings in which there may exist a group of subjects for whom no events will ever occur, i.e. a so-called "never-responder" group. In this article, we model the observed data assuming that events occur according to some unobserved continuous-time stochastic process. In particular, we consider the underlying subject-specific processes to be Poisson conditional on some unobserved frailty, leading to a natural focus on modeling event rates. Specifically, we propose to use the power variance function (PVF) family of frailty distributions, which contains both the gamma and inverse Gaussian distributions as special cases and allows for the existence of a class of subjects having zero frailty. We generalize a computational algorithm developed for a log-gamma random intercept model (Conaway, 1990. A random effects model for binary data. Biometrics 46, 317-328) to compute the exact marginal likelihood, which is then maximized to obtain estimates of model parameters. We conduct simulation studies, exploring the performance of the proposed method in comparison with

  9. Variable importance in latent variable regression models

    NARCIS (Netherlands)

    Kvalheim, O.M.; Arneberg, R.; Bleie, O.; Rajalahti, T.; Smilde, A.K.; Westerhuis, J.A.

    2014-01-01

    The quality and practical usefulness of a regression model are a function of both interpretability and prediction performance. This work presents some new graphical tools for improved interpretation of latent variable regression models that can also assist in improved algorithms for variable

  10. A simple approach to power and sample size calculations in logistic regression and Cox regression models.

    Science.gov (United States)

    Vaeth, Michael; Skovlund, Eva

    2004-06-15

    For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
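
    For logistic regression, the equivalent two-sample construction described in this abstract can be sketched as below. The sketch reads "parameters" as log-odds, so the two groups differ in log-odds by the slope times twice the covariate standard deviation, while the average event probability is held at the overall response probability. The final sample-size step uses the common normal-approximation formula for comparing two proportions, which is a choice made here and not necessarily the formula the authors use; all function names are hypothetical.

```python
import math
from statistics import NormalDist


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def equivalent_two_sample(slope, sd_x, p_overall):
    """Return (p1, p2) with logit(p2) - logit(p1) = 2 * slope * sd_x and
    (p1 + p2) / 2 = p_overall, found by bisection on the midpoint log-odds."""
    delta = 2.0 * slope * sd_x
    lo, hi = -50.0, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        mean_p = 0.5 * (expit(mid - delta / 2) + expit(mid + delta / 2))
        if mean_p < p_overall:
            lo = mid
        else:
            hi = mid
    centre = 0.5 * (lo + hi)
    return expit(centre - delta / 2), expit(centre + delta / 2)


def total_sample_size(slope, sd_x, p_overall, alpha=0.05, power=0.8):
    """Total n from a two-sided two-proportion z-test (assumed formula)."""
    p1, p2 = equivalent_two_sample(slope, sd_x, p_overall)
    z_a = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_b = NormalDist().inv_cdf(power)
    p_bar = 0.5 * (p1 + p2)
    n_group = (z_a * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
               + z_b * math.sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))) ** 2
    n_group /= (p2 - p1) ** 2
    return 2 * math.ceil(n_group)
```

    A call such as `total_sample_size(0.5, 1.0, 0.3)` then gives an approximate total sample size for a log-odds slope of 0.5 per unit of a covariate with unit standard deviation and an overall response probability of 0.3. A slope of zero makes the two groups identical, so the function is only meaningful for a nonzero alternative.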

  11. BANK FAILURE PREDICTION WITH LOGISTIC REGRESSION

    Directory of Open Access Journals (Sweden)

    Taha Zaghdoudi

    2013-04-01

    In recent years the economic and financial world has been shaken by a wave of financial crises that resulted in fairly huge bank losses. Several authors have focused on the study of the crises in order to develop an early warning model. Our work takes its inspiration from the same path. Indeed, we have tried to develop a predictive model of Tunisian bank failures using the binary logistic regression method. The specificity of our prediction model is that it takes into account microeconomic indicators of bank failures. The results obtained using our provisional model show that a bank's ability to repay its debt, the coefficient of banking operations, bank profitability per employee and the financial leverage ratio have a negative impact on the probability of failure.

  12. Regression modeling methods, theory, and computation with SAS

    CERN Document Server

    Panik, Michael

    2009-01-01

    Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs. The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,

  13. A complete waveform model for compact binaries on eccentric orbits

    Science.gov (United States)

    George, Daniel; Huerta, Eliu; Kumar, Prayush; Agarwal, Bhanu; Schive, Hsi-Yu; Pfeiffer, Harald; Chu, Tony; Boyle, Michael; Hemberger, Daniel; Kidder, Lawrence; Scheel, Mark; Szilagyi, Bela

    2017-01-01

    We present a time domain waveform model that describes the inspiral, merger and ringdown of compact binary systems whose components are non-spinning, and which evolve on orbits with low to moderate eccentricity. We show that this inspiral-merger-ringdown waveform model reproduces the effective-one-body model for black hole binaries with mass ratios between 1 and 15 in the zero eccentricity limit over a wide range of the parameter space under consideration. We use this model to show that the gravitational wave transients GW150914 and GW151226 can be effectively recovered with template banks of quasicircular, spin-aligned waveforms if the eccentricity $e_0$ of these systems when they enter the aLIGO band at a gravitational wave frequency of 14 Hz satisfies $e_0^{\mathrm{GW150914}} \leq 0.15$ and $e_0^{\mathrm{GW151226}} \leq 0.1$.

  14. Regression Models For Multivariate Count Data.

    Science.gov (United States)

    Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

    2017-01-01

    Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.

  15. Predictors of the number of under-five malnourished children in Bangladesh: application of the generalized poisson regression model.

    Science.gov (United States)

    Islam, Mohammad Mafijul; Alam, Morshed; Tariquzaman, Md; Kabir, Mohammad Alamgir; Pervin, Rokhsona; Begum, Munni; Khan, Md Mobarak Hossain

    2013-01-08

    Malnutrition is one of the principal causes of child mortality in developing countries including Bangladesh. According to our knowledge, most of the available studies, that addressed the issue of malnutrition among under-five children, considered the categorical (dichotomous/polychotomous) outcome variables and applied logistic regression (binary/multinomial) to find their predictors. In this study malnutrition variable (i.e. outcome) is defined as the number of under-five malnourished children in a family, which is a non-negative count variable. The purposes of the study are (i) to demonstrate the applicability of the generalized Poisson regression (GPR) model as an alternative of other statistical methods and (ii) to find some predictors of this outcome variable. The data is extracted from the Bangladesh Demographic and Health Survey (BDHS) 2007. Briefly, this survey employs a nationally representative sample which is based on a two-stage stratified sample of households. A total of 4,460 under-five children is analysed using various statistical techniques namely Chi-square test and GPR model. The GPR model (as compared to the standard Poisson regression and negative Binomial regression) is found to be justified to study the above-mentioned outcome variable because of its under-dispersion (variance variable namely mother's education, father's education, wealth index, sanitation status, source of drinking water, and total number of children ever born to a woman. Consistencies of our findings in light of many other studies suggest that the GPR model is an ideal alternative of other statistical models to analyse the number of under-five malnourished children in a family. Strategies based on significant predictors may improve the nutritional status of children in Bangladesh.

  16. Nonparametric Mixture of Regression Models.

    Science.gov (United States)

    Huang, Mian; Li, Runze; Wang, Shaoli

    2013-07-01

    Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.

  17. The True Ultracool Binary Fraction Using Spectral Binaries

    Science.gov (United States)

    Bardalez Gagliuffi, Daniella; Burgasser, Adam J.; Schmidt, Sarah J.; Gagné, Jonathan; Faherty, Jacqueline K.; Cruz, Kelle; Gelino, Chris

    2018-01-01

    Brown dwarfs bridge the gap between stars and giant planets. While the essential mechanisms governing their formation are not well constrained, binary statistics are a direct outcome of the formation process, and thus provide a means to test formation theories. Observational constraints on the brown dwarf binary fraction place it at 10 ‑ 20%, dominated by imaging studies (85% of systems) with the most common separation at 4 AU. This coincides with the resolution limit of state-of-the-art imaging techniques, suggesting that the binary fraction is underestimated. We have developed a separation-independent method to identify and characterize tightly-separated (dwarfs as spectral binaries by identifying traces of methane in the spectra of late-M and early-L dwarfs. Imaging follow-up of 17 spectral binaries yielded 3 (18%) resolved systems, corroborating the observed binary fraction, but 5 (29%) known binaries were missed, reinforcing the hypothesis that the short-separation systems are undercounted. In order to find the true binary fraction of brown dwarfs, we have compiled a volume-limited, spectroscopic sample of M7-L5 dwarfs and searched for T dwarf companions. In the 25 pc volume, 4 candidates were found, three of which are already confirmed, leading to a spectral binary fraction of 0.95 ± 0.50%, albeit for a specific combination of spectral types. To extract the true binary fraction and determine the biases of the spectral binary method, we have produced a binary population simulation based on different assumptions of the mass function, age distribution, evolutionary models and mass ratio distribution. Applying the correction fraction resulting from this method to the observed spectral binary fraction yields a true binary fraction of 27 ± 4%, which is roughly within 1σ of the binary fraction obtained from high resolution imaging studies, radial velocity and astrometric monitoring. This method can be extended to identify giant planet companions to young brown

  18. Interfacing modeling suite Physics Of Eclipsing Binaries 2.0 with a Virtual Reality Platform

    Science.gov (United States)

    Harriett, Edward; Conroy, Kyle; Prša, Andrej; Klassner, Frank

    2018-01-01

    To explore alternate methods for modeling eclipsing binary stars, we extend PHOEBE’s (PHysics Of Eclipsing BinariEs) capabilities into a virtual reality (VR) environment to create an immersive and interactive experience for users. The application used is Vizard, a python-scripted VR development platform for environments such as Cave Automatic Virtual Environment (CAVE) and other off-the-shelf VR headsets. Vizard allows the freedom for all modeling to be precompiled without compromising functionality or usage on its part. The system requires five arguments to be precomputed using PHOEBE’s python front-end: the effective temperature, flux, relative intensity, vertex coordinates, and orbits; the user can opt to implement other features from PHOEBE to be accessed within the simulation as well. Here we present the method for making the data observables accessible in real time. An Oculus Rift will be available for a live showcase of various cases of VR rendering of PHOEBE binary systems including detached and contact binary stars.

  19. Gaussian Process Regression Model in Spatial Logistic Regression

    Science.gov (United States)

    Sofro, A.; Oktaviarina, A.

    2018-01-01

    Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.

  20. Time variability of X-ray binaries: observations with INTEGRAL. Modeling

    International Nuclear Information System (INIS)

    Cabanac, Clement

    2007-01-01

    The exact origin of the observed X and Gamma ray variability in X-ray binaries is still an open debate in high energy astrophysics. Among other things, these objects show aperiodic and quasi-periodic luminosity variations on timescales as small as the millisecond. This erratic behavior must put constraints on the proposed emission processes occurring in the vicinity of the neutron star or the stellar mass black hole held by these objects. We propose here to study their behavior in 3 different ways: first we examine the evolution of a particular X-ray source discovered by INTEGRAL, IGR J19140+0951. Using timing and spectral data given by different instruments, we show that the source type is plausibly consistent with a High Mass X-ray Binary hosting a neutron star. Subsequently, we propose a new method dedicated to the study of timing data coming from coded mask aperture instruments. Using it on INTEGRAL/ISGRI real data, we detect the presence of periodic and quasi-periodic features in some pulsars and micro-quasars at energies as high as a hundred keV. Finally, we suggest a model designed to describe the low frequency variability of X-ray binaries in their hardest state. This model is based on thermal comptonization of soft photons by a warm corona in which a pressure wave is propagating in cylindrical geometry. By computing both numerical simulations and an analytical solution, we show that this model should be suitable to describe some of the typical features observed in X-ray binaries power spectra in their hard state and their evolution, such as aperiodic noise and low frequency quasi-periodic oscillations. (author)

  1. Regression and regression analysis time series prediction modeling on climate data of Quetta, Pakistan

    International Nuclear Information System (INIS)

    Jafri, Y.Z.; Kamal, L.

    2007-01-01

    Various statistical techniques were used on five-year data (1998-2002) of average humidity, rainfall, and maximum and minimum temperatures. The relationships to regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters, on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit for our polynomial regression analysis time series (PRATS). The correlations to multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, and both yielded homoscedasticity. We also employed Bartlett's test for homogeneity of variances on five-year data of rainfall and humidity, which showed that the variances in the rainfall data were not homogeneous, while those in the humidity data were homogeneous. Our results on regression and regression analysis time series show the best fit to prediction modeling on climatic data of Quetta, Pakistan. (author)

  2. Robust mislabel logistic regression without modeling mislabel probabilities.

    Science.gov (United States)

    Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun

    2018-03-01

    Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes mislabeled responses into consideration. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.

  3. Star formation history: Modeling of visual binaries

    Science.gov (United States)

    Gebrehiwot, Y. M.; Tessema, S. B.; Malkov, O. Yu.; Kovaleva, D. A.; Sytov, A. Yu.; Tutukov, A. V.

    2018-05-01

    Most stars form in binary or multiple systems. Their evolution is defined by masses of components, orbital separation and eccentricity. In order to understand star formation and evolutionary processes, it is vital to find distributions of physical parameters of binaries. We have carried out Monte Carlo simulations in which we simulate different pairing scenarios: random pairing, primary-constrained pairing, split-core pairing, and total and primary pairing in order to get distributions of binaries over physical parameters at birth. Next, for comparison with observations, we account for stellar evolution and selection effects. Brightness, radius, temperature, and other parameters of components are assigned or calculated according to approximate relations for stars in different evolutionary stages (main-sequence stars, red giants, white dwarfs, relativistic objects). Evolutionary stage is defined as a function of system age and component masses. We compare our results with the observed IMF, binarity rate, and binary mass-ratio distributions for field visual binaries to find initial distributions and pairing scenarios that produce observed distributions.

  4. Mixed Frequency Data Sampling Regression Models: The R Package midasr

    Directory of Open Access Journals (Sweden)

    Eric Ghysels

    2016-08-01

    When modeling economic relationships it is increasingly common to encounter data sampled at different frequencies. We introduce the R package midasr, which enables estimating regression models with variables sampled at different frequencies within a MIDAS regression framework put forward in work by Ghysels, Santa-Clara, and Valkanov (2002). In this article we define a general autoregressive MIDAS regression model with multiple variables of different frequencies and show how it can be specified using the familiar R formula interface and estimated using various optimization methods chosen by the researcher. We discuss how to check the validity of the estimated model both in terms of numerical convergence and statistical adequacy of a chosen regression specification, how to perform model selection based on an information criterion, how to assess forecasting accuracy of the MIDAS regression model and how to obtain a forecast aggregation of different MIDAS regression models. We illustrate the capabilities of the package with a simulated MIDAS regression model and give two empirical examples of application of MIDAS regression.

  5. Physics of Eclipsing Binaries: Motivation for the New-Age Modeling Suite

    OpenAIRE

    Pavlovski, K.; Prša, A.; Degroote, P.; Conroy, K.; Bloemen, S.; Hambleton, Kelly; Giammarco, J.; Pablo, H.; Tkachenko, A.; Torres, G.

    2013-01-01

    Recent ultra-high precision observations of eclipsing binaries, especially data acquired by the Kepler satellite, have made accurate light curve modelling increasingly challenging but also more rewarding. In this contribution, we discuss low-amplitude signals in light curves that can now be used to derive physical information about eclipsing binaries but that were inaccessible before the Kepler era. A notable example is the detection of Doppler beaming, which leads to an increase in flux when...

  6. A joint logistic regression and covariate-adjusted continuous-time Markov chain model.

    Science.gov (United States)

    Rubin, Maria Laura; Chan, Wenyaw; Yamal, Jose-Miguel; Robertson, Claudia Sue

    2017-12-10

    The use of longitudinal measurements to predict a categorical outcome is an increasingly common goal in research studies. Joint models are commonly used to describe two or more models simultaneously by considering the correlated nature of their outcomes and the random error present in the longitudinal measurements. However, there is limited research on joint models with longitudinal predictors and categorical cross-sectional outcomes. Perhaps the most challenging task is how to model the longitudinal predictor process such that it represents the true biological mechanism that dictates the association with the categorical response. We propose a joint logistic regression and Markov chain model to describe a binary cross-sectional response, where the unobserved transition rates of a two-state continuous-time Markov chain are included as covariates. We use the method of maximum likelihood to estimate the parameters of our model. In a simulation study, coverage probabilities of about 95%, standard deviations close to standard errors, and low biases for the parameter values show that our estimation method is adequate. We apply the proposed joint model to a dataset of patients with traumatic brain injury to describe and predict a 6-month outcome based on physiological data collected post-injury and admission characteristics. Our analysis indicates that the information provided by physiological changes over time may help improve prediction of long-term functional status of these severely ill subjects. Copyright © 2017 John Wiley & Sons, Ltd.

  7. A Bayesian model for binary Markov chains

    Directory of Open Access Journals (Sweden)

    Belkheir Essebbar

    2004-02-01

    This note is concerned with Bayesian estimation of the transition probabilities of a binary Markov chain observed from heterogeneous individuals. The model is founded on the Jeffreys' prior, which allows for transition probabilities to be correlated. The Bayesian estimator is approximated by means of Markov chain Monte Carlo (MCMC) techniques. The performance of the Bayesian estimates is illustrated by analyzing a small simulated data set.
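
As a rough illustration of Bayesian estimation of binary transition probabilities, the sketch below counts transitions in a single 0/1 sequence and returns posterior means. It is a deliberate simplification and not the model of the record above: it assumes one homogeneous chain and independent Beta(1/2, 1/2) priors on each row of the transition matrix (the Jeffreys prior for a single binomial proportion), so the posterior is available in closed form and no MCMC is needed. The function names are hypothetical.

```python
def transition_counts(seq):
    """Count transitions i -> j in a 0/1 sequence; counts[i][j]."""
    counts = [[0, 0], [0, 0]]
    for current, following in zip(seq, seq[1:]):
        counts[current][following] += 1
    return counts


def posterior_mean_transitions(seq, a=0.5, b=0.5):
    """Posterior means of P(0 -> 1) and P(1 -> 1).

    Assumption: independent Beta(a, b) priors per row, so the posterior
    for row i is Beta(a + n_i1, b + n_i0).
    """
    counts = transition_counts(seq)
    means = []
    for state in (0, 1):
        n_to_0, n_to_1 = counts[state]
        means.append((a + n_to_1) / (a + b + n_to_0 + n_to_1))
    return means


if __name__ == "__main__":
    print(posterior_mean_transitions([0, 0, 1, 1, 1, 0, 1]))
```

With heterogeneous individuals or correlated transition probabilities, as in the abstract, this closed form no longer applies, and simulation-based methods become necessary.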

  8. Impact of multicollinearity on small sample hydrologic regression models

    Science.gov (United States)

    Kroll, Charles N.; Song, Peter

    2013-06-01

    Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R²). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely on model predictions, it is recommended that OLS be employed, since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
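
To make the variance inflation factor mentioned above concrete, here is a minimal sketch for the special case of two explanatory variables, where VIF = 1 / (1 - r²) and r is their Pearson correlation. The function names and the toy data are illustrative assumptions, not taken from the study; with more than two predictors, r² would be replaced by the R² from regressing each predictor on all the others.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor regression."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


if __name__ == "__main__":
    # Toy data with correlation 0.8 between the two predictors.
    print(vif_two_predictors([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

A VIF of 1 means the predictors are uncorrelated; the value grows without bound as the correlation approaches ±1, which is the inflation of estimator variance that the abstract describes.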

  9. (Liquid plus liquid) equilibria of binary polymer solutions using a free-volume UNIQUAC-NRF model

    DEFF Research Database (Denmark)

    Radfarnia, H.R.; Ghotbi, C.; Taghikhani, V.

    2006-01-01

    + liquid) equilibria (LLE) for a number of binary polymer solutions at various temperatures. The values for the binary characteristic energy parameters for the proposed model and the FV-UNIQUAC model along with their average relative deviations from the experimental data were reported. It should be stated...

  10. Applied Regression Modeling A Business Approach

    CERN Document Server

    Pardoe, Iain

    2012-01-01

    An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculus. Regression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a

  11. Testing homogeneity in Weibull-regression models.

    Science.gov (United States)

    Bolfarine, Heleno; Valença, Dione M

    2005-10-01

    In survival studies with families or geographical units it may be of interest to test whether such groups are homogeneous for given explanatory variables. In this paper we consider score type tests for group homogeneity based on a mixing model in which the group effect is modelled as a random variable. As opposed to hazard-based frailty models, this model presents survival times that, conditioned on the random effect, have an accelerated failure time representation. The test statistic requires only estimation of the conventional regression model without the random effect and does not require specifying the distribution of the random effect. The tests are derived for a Weibull regression model and, in the uncensored situation, a closed form is obtained for the test statistic. A simulation study is used for comparing the power of the tests. The proposed tests are applied to real data sets with censored data.

  12. Model-based Quantile Regression for Discrete Data

    KAUST Repository

    Padellini, Tullia; Rue, Haavard

    2018-01-01

    Quantile regression is a class of methods devoted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite

  13. Detection of epistatic effects with logic regression and a classical linear regression model.

    Science.gov (United States)

    Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata

    2014-02-01

    To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, methods such as Cockerham's model are usually applied. Within this framework, interactions are understood as the part of the joint effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (such as disease) is caused by Boolean combinations of genotypes of several QTLs, Cockerham's approach is often not capable of identifying them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though a larger number of models has to be considered with the logic regression approach (requiring more stringent multiple testing correction), the efficient representation of higher order logic interactions in logic regression models leads to a significant increase of power to detect such interactions as compared to Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with a simulation study and real data analysis.

  14. A binary neutron star GRB model

    International Nuclear Information System (INIS)

    Wilson, J.R.; Salmonson, J.D.; Wilson, J.R.; Mathews, G.J.

    1998-01-01

    In this paper we present the preliminary results of a model for the production of gamma-ray bursts (GRBs) through the compressional heating of binary neutron stars near their last stable orbit prior to merger. Recent numerical studies of the general relativistic (GR) hydrodynamics in three spatial dimensions of close neutron star binaries (NSBs) have uncovered evidence for the compression and heating of the individual neutron stars (NSs) prior to merger 12. This effect will have a significant effect on the production of gravitational waves, neutrinos and, ultimately, energetic photons. The study of the production of these photons in close NSBs and, in particular, its correspondence to observed GRBs is the subject of this paper. The gamma-rays arise as follows. Compressional heating causes the neutron stars to emit neutrino pairs which, in turn, annihilate to produce a hot electron-positron pair plasma. This pair-photon plasma expands rapidly until it becomes optically thin, at which point the photons are released. We show that this process can indeed satisfy three basic requirements of a model for cosmological gamma-ray bursts: (1) sufficient gamma-ray energy release (>10^51 ergs) to produce observed fluxes, (2) a time-scale of the primary burst duration consistent with that of a 'classical' GRB (∼10 seconds), and (3) a peak of the photon number spectrum that matches that of a 'classical' GRB (∼300 keV). © 1998 American Institute of Physics

  15. Modeling the effects of binary mixtures on survival in time.

    NARCIS (Netherlands)

    Baas, J.; van Houte, B.P.P.; van Gestel, C.A.M.; Kooijman, S.A.L.M.

    2007-01-01

    In general, effects of mixtures are difficult to describe, and most of the models in use are descriptive in nature and lack a strong mechanistic basis. The aim of this experiment was to develop a process-based model for the interpretation of mixture toxicity measurements, with effects of binary

  16. AN APPLICATION OF FUNCTIONAL MULTIVARIATE REGRESSION MODEL TO MULTICLASS CLASSIFICATION

    OpenAIRE

    Krzyśko, Mirosław; Smaga, Łukasz

    2017-01-01

    In this paper, the scale response functional multivariate regression model is considered. By using the basis functions representation of functional predictors and regression coefficients, this model is rewritten as a multivariate regression model. This representation of the functional multivariate regression model is used for multiclass classification for multivariate functional data. Computational experiments performed on real labelled data sets demonstrate the effectiveness of the proposed ...

  17. Modeling and analysis of advanced binary cycles

    Energy Technology Data Exchange (ETDEWEB)

    Gawlik, K.

    1997-12-31

    A computer model (Cycle Analysis Simulation Tool, CAST) and a methodology have been developed to perform value analysis for small, low- to moderate-temperature binary geothermal power plants. The value analysis method allows for incremental changes in the levelized electricity cost (LEC) to be determined between a baseline plant and a modified plant. Thermodynamic cycle analyses and component sizing are carried out in the model followed by economic analysis which provides LEC results. The emphasis of the present work is on evaluating the effect of mixed working fluids instead of pure fluids on the LEC of a geothermal binary plant that uses a simple Organic Rankine Cycle. Four resources were studied spanning the range of 265°F to 375°F. A variety of isobutane and propane based mixtures, in addition to pure fluids, were used as working fluids. This study shows that the use of propane mixtures at a 265°F resource can reduce the LEC by 24% when compared to a base case value that utilizes commercial isobutane as its working fluid. The cost savings drop to 6% for a 375°F resource, where an isobutane mixture is favored. Supercritical cycles were found to have the lowest cost at all resources.

  18. Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model

    Science.gov (United States)

    Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami

    2017-06-01

    A regression model represents the relationship between an independent variable and a dependent variable. When the dependent variable is categorical, a logistic regression model is used to calculate the odds. When the categories of the dependent variable are ordered levels, the logistic regression model is ordinal. The GWOLR model is an ordinal logistic regression model influenced by the geographical location of the observation site. Parameter estimation in the model is needed to determine population values from a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The research yields a local GWOLR model for each village and the probabilities of the categories of the number of dengue fever patients.
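
The ordinal logistic building block described above can be sketched with the standard cumulative logit (proportional odds) form, P(Y ≤ j | x) = 1 / (1 + exp(-(θ_j - x·β))), with category probabilities obtained by differencing the cumulative ones. The sign convention, the function names, and the example numbers are assumptions for illustration. The abstract does not give the GWOLR equations; in a geographically weighted version, θ and β would be estimated separately for each location.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(x, beta, thresholds):
    """Category probabilities under a cumulative logit model.

    thresholds must be increasing; J - 1 thresholds give J categories.
    """
    eta = sum(b * v for b, v in zip(beta, x))
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for c in cumulative:
        probabilities.append(c - previous)
        previous = c
    return probabilities


if __name__ == "__main__":
    # Hypothetical single covariate and two thresholds (three ordered categories).
    print(category_probabilities([0.5], [2.0], [-1.0, 1.0]))
```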

  19. Frequency-domain reduced order models for gravitational waves from aligned-spin compact binaries

    International Nuclear Information System (INIS)

    Pürrer, Michael

    2014-01-01

    Black-hole binary coalescences are one of the most promising sources for the first detection of gravitational waves. Fast and accurate theoretical models of the gravitational radiation emitted from these coalescences are highly important for the detection and extraction of physical parameters. Spinning effective-one-body models for binaries with aligned spins have been shown to be highly faithful, but are slow to generate and thus have not yet been used for parameter estimation (PE) studies. I provide a frequency-domain singular value decomposition-based surrogate reduced order model that is thousands of times faster for typical system masses and has a faithfulness mismatch of better than ∼0.1% with the original SEOBNRv1 model for advanced LIGO detectors. This model enables PE studies up to signal-to-noise ratios (SNRs) of 20 and even up to 50 for total masses below 50 M⊙. This paper discusses various choices for approximations and interpolation over the parameter space that can be made for reduced order models of spinning compact binaries, provides a detailed discussion of errors arising in the construction and assesses the fidelity of such models. (paper)

  20. A model of two-stream non-radial accretion for binary X-ray pulsars

    International Nuclear Information System (INIS)

    Lipunov, V.M.

    1982-01-01

    The general case of non-radial accretion is assumed to occur in real binary systems containing X-ray pulsars. The structure and the stability of the magnetosphere, the interaction between the magnetosphere and accreted matter, as well as the evolution of a neutron star in a close binary system are examined within the framework of the two-stream model of non-radial accretion onto a magnetized neutron star. Observable parameters of X-ray pulsars are explained in terms of the model considered. (orig.)

  1. Trojan Binaries

    Science.gov (United States)

    Noll, K. S.

    2017-12-01

    The Jupiter Trojans, in the context of giant planet migration models, can be thought of as an extension of the small body populations found beyond Neptune in the Kuiper Belt. Binaries are a distinctive feature of small body populations in the Kuiper Belt, with an especially high fraction apparent among the brightest Cold Classicals. The binary fraction, relative sizes, and separations in the dynamically excited populations (Scattered, Resonant) reflect processes that may have eroded a more abundant initial population. This trend continues in the Centaurs and Trojans, where few binaries have been found. We review new evidence, including a third resolved Trojan binary and lightcurve studies, to understand how the Trojans are related to the small body populations that originated in the outer protoplanetary disk.

  2. Alternative regression models to assess increase in childhood BMI.

    Science.gov (United States)

    Beyerlein, Andreas; Fahrmeir, Ludwig; Mansmann, Ulrich; Toschke, André M

    2008-09-08

    Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 children participating in the school entry health examination in Bavaria, Germany, from 2001 to 2002. TV watching, meal frequency, breastfeeding, smoking in pregnancy, maternal obesity, parental social class and weight gain in the first 2 years of life were considered as risk factors for obesity. GAMLSS showed a much better fit regarding the estimation of risk factors effects on transformed and untransformed BMI data than common GLMs with respect to the generalized Akaike information criterion. In comparison with GAMLSS, quantile regression allowed for additional interpretation of prespecified distribution quantiles, such as quantiles referring to overweight or obesity. The variables TV watching, maternal BMI and weight gain in the first 2 years were directly, and meal frequency was inversely significantly associated with body composition in any model type examined. In contrast, smoking in pregnancy was not directly, and breastfeeding and parental social class were not inversely significantly associated with body composition in GLM models, but in GAMLSS and partly in quantile regression models. Risk factor specific BMI percentile curves could be estimated from GAMLSS and quantile regression models. GAMLSS and quantile regression seem to be more appropriate than common GLMs for risk factor modeling of BMI data.
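
Quantile regression, one of the approaches compared above, minimizes the check (pinball) loss ρ_τ(u) = u · (τ - 1[u < 0]) instead of squared error. The sketch below shows the simplest case, a constant-only model, where the minimizer is a sample quantile and can be found by searching over the observed values. The function names and toy data are illustrative assumptions and unrelated to the BMI study.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant_quantile(values, tau):
    """Observed value that minimizes the total pinball loss.

    For a constant-only model the minimum is attained at a data point,
    so searching over the data is sufficient.
    """
    return min(values, key=lambda q: sum(pinball_loss(v - q, tau) for v in values))


if __name__ == "__main__":
    skewed = [1, 2, 3, 4, 100]  # right-skewed toy data
    print(best_constant_quantile(skewed, 0.5))
    print(best_constant_quantile(skewed, 0.25))
```

Because the loss depends on the residuals only through their sign and absolute size, the fitted quantiles are not dragged by the extreme value 100 the way a mean would be, which is why the approach suits skewed outcomes.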

  3. Alternative regression models to assess increase in childhood BMI

    OpenAIRE

    Beyerlein, Andreas; Fahrmeir, Ludwig; Mansmann, Ulrich; Toschke, André M

    2008-01-01

    Background: Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Methods: Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 childre...

  4. The use of continuous data versus binary data in MTC models: a case study in rheumatoid arthritis.

    Science.gov (United States)

    Schmitz, Susanne; Adams, Roisin; Walsh, Cathal

    2012-11-06

    Estimates of relative efficacy between alternative treatments are crucial for decision making in health care. When sufficient head to head evidence is not available Bayesian mixed treatment comparison models provide a powerful methodology to obtain such estimates. While models can be fit to a broad range of efficacy measures, this paper illustrates the advantages of using continuous outcome measures compared to binary outcome measures. Using a case study in rheumatoid arthritis a Bayesian mixed treatment comparison model is fit to estimate the relative efficacy of five anti-TNF agents currently licensed in Europe. The model is fit for the continuous HAQ improvement outcome measure and a binary version thereof as well as for the binary ACR response measure and the underlying continuous effect. Results are compared regarding their power to detect differences between treatments. Sixteen randomized controlled trials were included for the analysis. For both analyses, based on the HAQ improvement as well as based on the ACR response, differences between treatments detected by the binary outcome measures are subsets of the differences detected by the underlying continuous effects. The information lost when transforming continuous data into a binary response measure translates into a loss of power to detect differences between treatments in mixed treatment comparison models. Binary outcome measures are therefore less sensitive to change than continuous measures. Furthermore the choice of cut-off point to construct the binary measure also impacts the relative efficacy estimates.

  5. The use of continuous data versus binary data in MTC models: A case study in rheumatoid arthritis

    Directory of Open Access Journals (Sweden)

    Schmitz Susanne

    2012-11-01

    Background: Estimates of relative efficacy between alternative treatments are crucial for decision making in health care. When sufficient head to head evidence is not available Bayesian mixed treatment comparison models provide a powerful methodology to obtain such estimates. While models can be fit to a broad range of efficacy measures, this paper illustrates the advantages of using continuous outcome measures compared to binary outcome measures. Methods: Using a case study in rheumatoid arthritis a Bayesian mixed treatment comparison model is fit to estimate the relative efficacy of five anti-TNF agents currently licensed in Europe. The model is fit for the continuous HAQ improvement outcome measure and a binary version thereof as well as for the binary ACR response measure and the underlying continuous effect. Results are compared regarding their power to detect differences between treatments. Results: Sixteen randomized controlled trials were included for the analysis. For both analyses, based on the HAQ improvement as well as based on the ACR response, differences between treatments detected by the binary outcome measures are subsets of the differences detected by the underlying continuous effects. Conclusions: The information lost when transforming continuous data into a binary response measure translates into a loss of power to detect differences between treatments in mixed treatment comparison models. Binary outcome measures are therefore less sensitive to change than continuous measures. Furthermore the choice of cut-off point to construct the binary measure also impacts the relative efficacy estimates.

  6. Thermal Efficiency Degradation Diagnosis Method Using Regression Model

    International Nuclear Information System (INIS)

    Jee, Chang Hyun; Heo, Gyun Young; Jang, Seok Won; Lee, In Cheol

    2011-01-01

    This paper proposes an idea for thermal efficiency degradation diagnosis in turbine cycles, which is based on turbine cycle simulation under abnormal conditions and a linear regression model. The correlation between the inputs representing degradation conditions (normally unmeasured but intrinsic states) and the simulation outputs (normally measured but superficial states) was analyzed with the linear regression model. The regression models can inversely determine the intrinsic state associated with a superficial state observed at a power plant. The diagnosis method proposed herein consists of three processes: 1) simulations of degradation conditions to obtain measured states (referred to as the what-if method), 2) development of the linear model correlating intrinsic and superficial states, and 3) determination of an intrinsic state using the superficial states of the current plant and the linear regression model (referred to as the inverse what-if method). The what-if method generates the outputs for inputs that include various root causes and/or boundary conditions, whereas the inverse what-if method calculates the inverse matrix with the given superficial states, that is, component degradation modes. The method suggested in this paper was validated using the turbine cycle model for an operating power plant.
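
The what-if and inverse what-if steps described above can be sketched for a toy case with two intrinsic states and two measured states. Everything here is a hypothetical illustration: the linear model y = baseline + A·x, the 2x2 sensitivity matrix, and the numbers are assumptions, not values from the paper, and a real application would estimate A by regression on simulation runs rather than take it as given.

```python
def forward_what_if(sensitivity, baseline, intrinsic):
    """What-if step: predicted measured states for given intrinsic states."""
    return [
        baseline[i]
        + sensitivity[i][0] * intrinsic[0]
        + sensitivity[i][1] * intrinsic[1]
        for i in range(2)
    ]


def inverse_what_if(sensitivity, baseline, measured):
    """Inverse step: recover intrinsic states by inverting the 2x2 linear model."""
    (a11, a12), (a21, a22) = sensitivity
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("sensitivity matrix is singular; states are not identifiable")
    d0 = measured[0] - baseline[0]
    d1 = measured[1] - baseline[1]
    return [(a22 * d0 - a12 * d1) / det, (a11 * d1 - a21 * d0) / det]


if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 3.0]]    # hypothetical sensitivities
    y0 = [10.0, 20.0]               # hypothetical healthy-plant readings
    y = forward_what_if(A, y0, [0.5, 1.5])
    print(y, inverse_what_if(A, y0, y))
```

The singular-matrix check reflects a practical limit of any such inversion: if two degradation modes shift the measurements in proportional ways, they cannot be told apart from those measurements alone.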

  7. A wide low-mass binary model for the origin of axially symmetric non-thermal radio sources

    International Nuclear Information System (INIS)

    Kool, M. de; Heuvel, E.P.J. van den

    1985-01-01

    An accreting binary model has been proposed by recent workers to account for the origin of the axially symmetric non-thermal radio sources. The authors show that the only type of binary system that can produce the observed structural properties is a relatively wide neutron star binary, in which the companion of the neutron star is a low-mass giant. Binaries of this type are expected to resemble closely the eight brightest galactic bulge X-ray sources as well as the progenitors of the two wide radio pulsar binaries. (U.K.)

  8. Random regression models for detection of gene by environment interaction

    Directory of Open Access Journals (Sweden)

    Meuwissen Theo HE

    2007-02-01

    Two random regression models, where the effect of a putative QTL was regressed on an environmental gradient, are described. The first model estimates the correlation between intercept and slope of the random regression, while the other model restricts this correlation to 1 or -1, which is expected under a bi-allelic QTL model. The random regression models were compared to a model assuming no gene by environment interactions. The comparison was done with regard to the models' ability to detect QTL, to position them accurately and to detect possible QTL by environment interactions. A simulation study based on a granddaughter design was conducted, and QTL were assumed, either by assigning an effect independent of the environment or as a linear function of a simulated environmental gradient. It was concluded that the random regression models were suitable for detection of QTL effects, in the presence and absence of interactions with environmental gradients. Fixing the correlation between intercept and slope of the random regression had a positive effect on power when the QTL effects re-ranked between environments.

  9. Alternative regression models to assess increase in childhood BMI

    Directory of Open Access Journals (Sweden)

    Mansmann Ulrich

    2008-09-01

    Background: Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Methods: Different regression approaches to predicting childhood BMI were compared by goodness-of-fit measures and means of interpretation, including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 children participating in the school entry health examination in Bavaria, Germany, from 2001 to 2002. TV watching, meal frequency, breastfeeding, smoking in pregnancy, maternal obesity, parental social class and weight gain in the first 2 years of life were considered as risk factors for obesity. Results: GAMLSS showed a much better fit regarding the estimation of risk factors' effects on transformed and untransformed BMI data than common GLMs with respect to the generalized Akaike information criterion. In comparison with GAMLSS, quantile regression allowed for additional interpretation of prespecified distribution quantiles, such as quantiles referring to overweight or obesity. The variables TV watching, maternal BMI and weight gain in the first 2 years were directly, and meal frequency was inversely, significantly associated with body composition in any model type examined. In contrast, smoking in pregnancy was not directly, and breastfeeding and parental social class were not inversely, significantly associated with body composition in GLMs, but they were in GAMLSS and partly in quantile regression models. Risk factor specific BMI percentile curves could be estimated from GAMLSS and quantile regression models. Conclusion: GAMLSS and quantile regression seem to be more appropriate than common GLMs for risk factor modeling of BMI data.

  10. The microcomputer scientific software series 2: general linear model--regression.

    Science.gov (United States)

    Harold M. Rauscher

    1983-01-01

    The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...

  11. Wavelet regression model in forecasting crude oil price

    Science.gov (United States)

    Hamid, Mohd Helmie; Shabri, Ani

    2017-05-01

    This study presents the performance of the wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. The WMLR model was developed by integrating the discrete wavelet transform (DWT) and the multiple linear regression (MLR) model. The original time series was decomposed into sub-time series with different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series has been used in this study to test the prediction capability of the proposed model. The forecasting performance of the WMLR model was also compared with regular multiple linear regression (MLR), Autoregressive Integrated Moving Average (ARIMA) and Generalized Autoregressive Conditional Heteroscedasticity (GARCH) models using root mean square errors (RMSE) and mean absolute errors (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting techniques tested in this study.
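
    The two error measures named in this abstract can be sketched as follows. This is a minimal illustration, not code from the study; the function names and the price values are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n

# Hypothetical daily prices and forecasts (illustrative only, not study data).
actual = [50.0, 52.0, 51.0, 53.0]
forecast = [49.0, 52.5, 52.0, 51.0]

print(rmse(actual, forecast))  # 1.25
print(mae(actual, forecast))   # 1.125
```

    RMSE penalizes large misses more heavily than MAE, which is one reason forecasting comparisons often report both.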

  12. MESA models of the evolutionary state of the interacting binary epsilon Aurigae

    Science.gov (United States)

    Gibson, Justus L.; Stencel, Robert E.

    2018-06-01

    Using the MESA code (Modules for Experiments in Stellar Astrophysics, version 9575), an evaluation was made of the evolutionary state of the epsilon Aurigae binary system (HD 31964, F0Iap + disc). We sought to satisfy several observational constraints: (1) requiring evolutionary tracks to pass close to the current temperature and luminosity of the primary star; (2) obtaining a period near the observed value of 27.1 years; (3) matching a mass function of 3.0; (4) concurrent Roche lobe overflow and mass transfer; (5) an isotopic ratio 12C/13C = 5; and (6) matching the interferometrically determined angular diameter. A MESA model starting with binary masses of 9.85 + 4.5 M⊙, with a 100 d initial period, produces a 1.2 + 10.6 M⊙ result having a 547 d period, and a single digit 12C/13C ratio. These values were reached near an age of 20 Myr, when the donor star comes close to the observed luminosity and temperature for epsilon Aurigae A, as a post-RGB/pre-AGB star. Contemporaneously, the accretor then appears as an upper main-sequence, early B-type star. This benchmark model can provide a basis for further exploration of this interacting binary, and other long-period binary stars.

  13. Intermediate and advanced topics in multilevel logistic regression analysis.

    Science.gov (United States)

    Austin, Peter C; Merlo, Juan

    2017-09-10

    Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio, which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
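
    Two of the heterogeneity measures mentioned in this abstract have simple closed forms for a random-intercept logistic model. The sketch below assumes the commonly used latent-variable formulation, in which the level-1 residual variance is fixed at π²/3; the function names and the example variance are illustrative and are not taken from the article.

```python
import math
from statistics import NormalDist

def variance_partition_coefficient(sigma2):
    """VPC under the latent-variable formulation: sigma2 / (sigma2 + pi^2 / 3).

    sigma2 is the between-cluster variance of the random intercept.
    """
    return sigma2 / (sigma2 + math.pi ** 2 / 3)

def median_odds_ratio(sigma2):
    """MOR = exp(sqrt(2 * sigma2) * z_0.75), with z_0.75 the 75th normal percentile."""
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2 * sigma2) * z75)

# Illustrative between-cluster variance (an assumption, not a published estimate).
sigma2 = 1.0
print(round(variance_partition_coefficient(sigma2), 3))
print(round(median_odds_ratio(sigma2), 3))
```

    With no between-cluster variance the VPC is 0 and the MOR is 1; both grow as clusters become more heterogeneous.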

  14. Predicting and Modelling of Survival Data when Cox's Regression Model does not hold

    DEFF Research Database (Denmark)

    Scheike, Thomas H.; Zhang, Mei-Jie

    2002-01-01

    Aalen model; additive risk model; counting processes; competing risk; Cox regression; flexible modeling; goodness of fit; prediction of survival; survival analysis; time-varying effects

  15. Spatial stochastic regression modelling of urban land use

    International Nuclear Information System (INIS)

    Arshad, S H M; Jaafar, J; Abiden, M Z Z; Latif, Z A; Rasam, A R A

    2014-01-01

    Urbanization is very closely linked to industrialization, commercialization or overall economic growth and development. This results in innumerable benefits to the quantity and quality of the urban environment and lifestyle but on the other hand contributes to unbounded development, urban sprawl, overcrowding and a decreasing standard of living. Regulation and observation of urban development activities is crucial. An understanding of the urban systems that promote urban growth is also essential for the purpose of policy making, formulating development strategies as well as development plan preparation. This study aims to compare two different stochastic regression modeling techniques for spatial structure models of urban growth in the same specific study area. Both techniques will utilize the same datasets and their results will be analyzed. The work starts by producing an urban growth model by using stochastic regression modeling techniques, namely Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR). The two techniques are compared, and it is found that GWR seems to be the more significant stochastic regression model: compared to OLS, it gives a smaller AICc (corrected Akaike information criterion) value and its output is more spatially explainable.

  16. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    Science.gov (United States)

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  17. Physics constrained nonlinear regression models for time series

    International Nuclear Information System (INIS)

    Majda, Andrew J; Harlim, John

    2013-01-01

    A central issue in contemporary science is the development of data driven statistical nonlinear dynamical models for time series of partial observations of nature or a complex physical model. It has been established recently that ad hoc quadratic multi-level regression (MLR) models can have finite-time blow up of statistical solutions and/or pathological behaviour of their invariant measure. Here a new class of physics constrained multi-level quadratic regression models are introduced, analysed and applied to build reduced stochastic models from data of nonlinear systems. These models have the advantages of incorporating memory effects in time as well as the nonlinear noise from energy conserving nonlinear interactions. The mathematical guidelines for the performance and behaviour of these physics constrained MLR models as well as filtering algorithms for their implementation are developed here. Data driven applications of these new multi-level nonlinear regression models are developed for test models involving a nonlinear oscillator with memory effects and the difficult test case of the truncated Burgers–Hopf model. These new physics constrained quadratic MLR models are proposed here as process models for Bayesian estimation through Markov chain Monte Carlo algorithms of low frequency behaviour in complex physical data. (paper)

  18. Logistic regression modelling: procedures and pitfalls in developing and interpreting prediction models

    Directory of Open Access Journals (Sweden)

    Nataša Šarlija

    2017-01-01

    This study sheds light on the most common issues related to applying logistic regression in prediction models for company growth. The purpose of the paper is (1) to provide a detailed demonstration of the steps in developing a growth prediction model based on logistic regression analysis, (2) to discuss common pitfalls and methodological errors in developing a model, and (3) to provide solutions and possible ways of overcoming these issues. Special attention is devoted to the question of satisfying logistic regression assumptions, selecting and defining dependent and independent variables, using classification tables and ROC curves for reporting model strength, interpreting odds ratios as effect measures and evaluating performance of the prediction model. Development of a logistic regression model in this paper focuses on a prediction model of company growth. The analysis is based on predominantly financial data from a sample of 1471 small and medium-sized Croatian companies active between 2009 and 2014. The financial data is presented in the form of financial ratios divided into nine main groups depicting the following areas of business: liquidity, leverage, activity, profitability, research and development, investing and export. The growth prediction model indicates aspects of a business critical for achieving high growth. In that respect, the contribution of this paper is twofold: first, methodological, in terms of pointing out pitfalls and potential solutions in logistic regression modelling, and second, theoretical, in terms of identifying factors responsible for high growth of small and medium-sized companies.

  19. A 3D dynamical model of the colliding winds in binary systems

    Science.gov (United States)

    Parkin, E. R.; Pittard, J. M.

    2008-08-01

    We present a three-dimensional (3D) dynamical model of the orbital-induced curvature of the wind-wind collision region in binary star systems. Momentum balance equations are used to determine the position and shape of the contact discontinuity between the stars, while further downstream the gas is assumed to behave ballistically. An Archimedean spiral structure is formed by the motion of the stars, with clear resemblance to high-resolution images of the so-called `pinwheel nebulae'. A key advantage of this approach over grid or smoothed particle hydrodynamic models is its significantly reduced computational cost, while it also allows the study of the structure obtained in an eccentric orbit. The model is relevant to symbiotic systems and γ-ray binaries, as well as systems with O-type and Wolf-Rayet stars. As an example application, we simulate the X-ray emission from hypothetical O+O and WR+O star binaries, and describe a method of ray tracing through the 3D spiral structure to account for absorption by the circumstellar material in the system. Such calculations may be easily adapted to study observations at wavelengths ranging from the radio to γ-ray.

  20. Fitting Markovian binary trees using global and individual demographic data

    OpenAIRE

    Hautphenne, Sophie; Massaro, Melanie; Turner, Katharine

    2017-01-01

    We consider a class of branching processes called Markovian binary trees, in which the individual's lifetime and reproduction epochs are modeled using a transient Markovian arrival process (TMAP). We estimate the parameters of the TMAP based on population data containing information on age-specific fertility and mortality rates. Depending on the degree of detail of the available data, a weighted non-linear regression method or a maximum likelihood method is applied. We discuss the optimal choi...

  1. Detecting nonsense for Chinese comments based on logistic regression

    Science.gov (United States)

    Zhuolin, Ren; Guang, Chen; Shu, Chen

    2016-07-01

    To understand cyber citizens' opinions accurately from Chinese news comments, a clear definition of nonsense is presented, and a detection model based on logistic regression (LR) is proposed. The detection of nonsense can be treated as a binary-classification problem. Besides traditional lexical features, we propose three kinds of features in terms of emotion, structure and relevance. Using these features, we train an LR model and demonstrate its effect in understanding Chinese news comments. We find that each of the proposed features can significantly improve the result. In our experiments, we achieve a prediction accuracy of 84.3%, which improves on the baseline of 77.3% by 7 percentage points.

  2. Regression Models for Repairable Systems

    Czech Academy of Sciences Publication Activity Database

    Novák, Petr

    2015-01-01

    Vol. 17, No. 4 (2015), pp. 963-972 ISSN 1387-5841 Institutional support: RVO:67985556 Keywords: Reliability analysis * Repair models * Regression Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.782, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/novak-0450902.pdf

  3. Modelling the effect of absorption from the interstellar medium on transient black hole X-ray binaries

    Science.gov (United States)

    Eckersall, A. J.; Vaughan, S.; Wynn, G. A.

    2017-10-01

    All observations of Galactic X-ray binaries are affected by absorption from gas and dust in the interstellar medium (ISM), which imprints narrow (line) and broad (photoelectric edge) features on the continuum emission spectrum of the binary. Any spectral model used to fit data from a Galactic X-ray binary must therefore take account of these features; when the absorption is strong (as for most Galactic sources) it becomes important to accurately model the ISM absorption in order to obtain unbiased estimates of the parameters of the (emission) spectrum of the binary system. In this paper, we present analysis of some of the best spectroscopic data from the XMM-Newton RGS instrument using the most up-to-date photoabsorption model of the gaseous ISM, ISMabs. We calculate column densities for H, O, Ne and Fe for seven transient black hole X-ray binary systems. We find that the hydrogen column densities in particular can vary greatly from those presented elsewhere in the literature. We assess the impact of using inaccurate column densities and older X-ray absorption models on spectral analysis using simulated data. We find that poor treatment of absorption can lead to large biases in inferred disc properties and that an independent analysis of absorption parameters can be used to alleviate such issues.

  4. A globally accurate theory for a class of binary mixture models

    Science.gov (United States)

    Dickman, Adriana G.; Stell, G.

    The self-consistent Ornstein-Zernike approximation results for the 3D Ising model are used to obtain phase diagrams for binary mixtures described by decorated models, yielding the plait point, binodals, and closed-loop coexistence curves for the models proposed by Widom, Clark, Neece, and Wheeler. The results are in good agreement with series expansions and experiments.

  5. Spatio-kinematic modelling: Testing the link between planetary nebulae and close binaries

    OpenAIRE

    Jones, David; Tyndall, Amy A.; Huckvale, Leo; Prouse, Barnabas; Lloyd, Myfanwy

    2011-01-01

    It is widely believed that central star binarity plays an important role in the formation and evolution of aspherical planetary nebulae; however, observational support for this hypothesis is lacking. Here, we present the most recent results of a continuing programme to model the morphologies of all planetary nebulae known to host a close binary central star. Initially, this programme allows us to compare the inclination of the nebular symmetry axis to that of the binary plane, testing the theo...

  6. Theoretical Models of Protostellar Binary and Multiple Systems with AMR Simulations

    Science.gov (United States)

    Matsumoto, Tomoaki; Tokuda, Kazuki; Onishi, Toshikazu; Inutsuka, Shu-ichiro; Saigo, Kazuya; Takakuwa, Shigehisa

    2017-05-01

    We present theoretical models for protostellar binary and multiple systems based on high-resolution numerical simulations with an adaptive mesh refinement (AMR) code, SFUMATO. Recent ALMA observations have revealed early phases of binary and multiple star formation with high spatial resolution. These observations should be compared with theoretical models of high spatial resolution. We present two theoretical models for (1) a high density molecular cloud core, MC27/L1521F, and (2) a protobinary system, L1551 NE. For the model for MC27, we performed numerical simulations of the gravitational collapse of a turbulent cloud core. The cloud core exhibits fragmentation during the collapse, and dynamical interaction between the fragments produces an arc-like structure, which is one of the prominent structures observed by ALMA. For the model for L1551 NE, we performed numerical simulations of gas accretion onto a protobinary. The simulations exhibit asymmetry of a circumbinary disk. Such asymmetry has also been observed by ALMA in the circumbinary disk of L1551 NE.

  7. Close binary stars

    International Nuclear Information System (INIS)

    Larsson-Leander, G.

    1979-01-01

    Studies of close binary stars are being pursued more vigorously than ever, with about 3000 research papers and notes pertaining to the field being published during the triennium 1976-1978. Many major advances and spectacular discoveries were made, mostly due to increased observational efficiency and precision, especially in the X-ray, radio, and ultraviolet domains. Progress reports are presented in the following areas: observational techniques, methods of analyzing light curves, observational data, physical data, structure and models of close binaries, statistical investigations, and origin and evolution of close binaries. Reports from the Coordinates Programs Committee, the Committee for Extra-Terrestrial Observations and the Working Group on RS CVn binaries are included. (Auth./C.F.)

  8. Geographically weighted regression model on poverty indicator

    Science.gov (United States)

    Slamet, I.; Nugroho, N. F. T. A.; Muslich

    2017-12-01

    In this research, we applied geographically weighted regression (GWR) to analyze poverty in Central Java, using a Gaussian kernel as the weighting function. GWR uses the diagonal matrix obtained from evaluating the Gaussian kernel function as the weight matrix in the regression model. The kernel weights handle spatial effects in the data so that a model can be obtained for each location. The purpose of this paper is to model poverty percentage data in Central Java province using GWR with a Gaussian kernel weighting function and to determine the influencing factors in each regency/city in Central Java province. Based on the research, we obtained a geographically weighted regression model with a Gaussian kernel weighting function for the poverty percentage data in Central Java province. We found that the percentage of the population working as farmers, the population growth rate, the percentage of households with regular sanitation, and BPJS beneficiaries are the variables that affect the percentage of poverty in Central Java province. The coefficient of determination R² is 68.64%. There are two categories of district, which are influenced by different significant factors.
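
    The idea of fitting one locally weighted model per location can be sketched as follows. This is a minimal illustration under stated assumptions: it uses one common form of the Gaussian kernel, w = exp(-0.5 (d/h)²), a single predictor, and made-up coordinates and values; the abstract does not state the exact kernel form or bandwidth used by the authors.

```python
import math

def gaussian_kernel_weights(target, coords, bandwidth):
    """Weight each observation by exp(-0.5 * (d / h)^2), d = distance to target."""
    weights = []
    for (x, y) in coords:
        d = math.hypot(x - target[0], y - target[1])
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))
    return weights

def local_weighted_fit(x, y, weights):
    """Weighted least squares for y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(weights)
    xbar = sum(w * xi for w, xi in zip(weights, x)) / sw
    ybar = sum(w * yi for w, yi in zip(weights, y)) / sw
    sxx = sum(w * (xi - xbar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - xbar) * (yi - ybar) for w, xi, yi in zip(weights, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

# Hypothetical locations, predictor and response (illustrative only).
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
predictor = [1.0, 2.0, 3.0, 4.0]
response = [2.0 + 3.0 * v for v in predictor]

# One local model per location: nearby observations get larger weights.
for site in coords:
    w = gaussian_kernel_weights(site, coords, bandwidth=1.5)
    print(site, local_weighted_fit(predictor, response, w))
```

    In a full GWR the same weights form the diagonal weight matrix for a multi-predictor regression, and the bandwidth is usually chosen by cross-validation or an information criterion.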

  9. New limb-darkening coefficients for modeling binary star light curves

    Science.gov (United States)

    Van Hamme, W.

    1993-01-01

    We present monochromatic, passband-specific, and bolometric limb-darkening coefficients for a linear law as well as nonlinear logarithmic and square root limb-darkening laws. These coefficients, including the bolometric ones, are needed when modeling binary star light curves with the latest version of the Wilson-Devinney light curve program. We base our calculations on the most recent ATLAS stellar atmosphere models for solar chemical composition stars with a wide range of effective temperatures and surface gravities. We examine how well various limb-darkening approximations represent the variation of the emerging specific intensity across a stellar surface as computed according to the model. For binary star light curve modeling purposes, we propose the use of a logarithmic or a square root law. We design our tables in such a manner that the relative quality of either law with respect to another can be easily compared. Since the computation of bolometric limb-darkening coefficients first requires monochromatic coefficients, we also offer tables of these coefficients (at 1221 wavelength values between 9.09 nm and 160 micrometer) and tables of passband-specific coefficients for commonly used photometric filters.
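
    The three laws named in this abstract are usually written in terms of μ, the cosine of the angle between the line of sight and the surface normal. The sketch below uses the standard textbook forms with coefficients x and y; the function names and coefficient values are illustrative and are not taken from the published tables.

```python
import math

def linear_law(mu, x):
    """I(mu) / I(1) = 1 - x * (1 - mu)."""
    return 1.0 - x * (1.0 - mu)

def logarithmic_law(mu, x, y):
    """I(mu) / I(1) = 1 - x * (1 - mu) - y * mu * ln(mu), for 0 < mu <= 1."""
    return 1.0 - x * (1.0 - mu) - y * mu * math.log(mu)

def square_root_law(mu, x, y):
    """I(mu) / I(1) = 1 - x * (1 - mu) - y * (1 - sqrt(mu))."""
    return 1.0 - x * (1.0 - mu) - y * (1.0 - math.sqrt(mu))

# Illustrative coefficients only (assumed values, not from the tables).
for mu in (1.0, 0.5, 0.25):
    print(mu, linear_law(mu, 0.6), logarithmic_law(mu, 0.6, 0.2),
          square_root_law(mu, 0.2, 0.4))
```

    All three laws return 1 at disc centre (μ = 1); they differ in how quickly the intensity falls toward the limb.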

  10. Use of generalized ordered logistic regression for the analysis of multidrug resistance data.

    Science.gov (United States)

    Agga, Getahun E; Scott, H Morgan

    2015-10-01

    Statistical analysis of antimicrobial resistance data largely focuses on individual antimicrobial's binary outcome (susceptible or resistant). However, bacteria are becoming increasingly multidrug resistant (MDR). Statistical analysis of MDR data is mostly descriptive often with tabular or graphical presentations. Here we report the applicability of generalized ordinal logistic regression model for the analysis of MDR data. A total of 1,152 Escherichia coli, isolated from the feces of weaned pigs experimentally supplemented with chlortetracycline (CTC) and copper, were tested for susceptibilities against 15 antimicrobials and were binary classified into resistant or susceptible. The 15 antimicrobial agents tested were grouped into eight different antimicrobial classes. We defined MDR as the number of antimicrobial classes to which E. coli isolates were resistant ranging from 0 to 8. Proportionality of the odds assumption of the ordinal logistic regression model was violated only for the effect of treatment period (pre-treatment, during-treatment and post-treatment); but not for the effect of CTC or copper supplementation. Subsequently, a partially constrained generalized ordinal logistic model was built that allows for the effect of treatment period to vary while constraining the effects of treatment (CTC and copper supplementation) to be constant across the levels of MDR classes. Copper (Proportional Odds Ratio [Prop OR]=1.03; 95% CI=0.73-1.47) and CTC (Prop OR=1.1; 95% CI=0.78-1.56) supplementation were not significantly associated with the level of MDR adjusted for the effect of treatment period. MDR generally declined over the trial period. In conclusion, generalized ordered logistic regression can be used for the analysis of ordinal data such as MDR data when the proportionality assumptions for ordered logistic regression are violated. Published by Elsevier B.V.
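
    The difference between the proportional odds model and its generalized form can be sketched with cumulative exceedance probabilities. The example assumes the common parameterization P(Y > j | x) = logistic(a_j + b_j x) with a single covariate; the function names and the numeric values are hypothetical and are not estimates from the study.

```python
import math

def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(x, intercepts, slopes):
    """P(Y = k), k = 0..K, from P(Y > j) = logistic(a_j + b_j * x), j = 0..K-1.

    Equal slopes give the proportional odds model; letting a slope differ
    across thresholds gives the generalized (partially constrained) model.
    """
    exceed = [logistic(a + b * x) for a, b in zip(intercepts, slopes)]
    bounds = [1.0] + exceed + [0.0]  # P(Y > -1) = 1 and P(Y > K) = 0
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]

# Hypothetical thresholds for a 4-level outcome (illustrative values only).
intercepts = [1.0, 0.0, -1.0]
proportional = category_probabilities(0.0, intercepts, [0.5, 0.5, 0.5])
generalized = category_probabilities(1.0, intercepts, [0.5, 0.3, 0.1])
print(proportional)
print(generalized)
```

    When slopes are unconstrained, the exceedance probabilities must still decrease with j for every x of interest; otherwise some category probabilities become negative.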

  11. Numerical simulation of binary collisions using a modified surface tension model with particle method

    International Nuclear Information System (INIS)

    Sun Zhongguo; Xi Guang; Chen Xi

    2009-01-01

    The binary collision of liquid droplets is of both practical importance and fundamental value in computational fluid mechanics. We present a modified surface tension model within the moving particle semi-implicit (MPS) method, and carry out two-dimensional simulations to investigate the mechanisms of coalescence and separation of the droplets during binary collision. The modified surface tension model improves accuracy and convergence. A mechanism map is established for various possible deformation pathways encountered during binary collision, as the impact speed is varied; a new pathway is reported when the collision speed is critical. In addition, eccentric collisions are simulated and the effect of the rotation of coalesced particle is explored. The results qualitatively agree with experiments and the numerical protocol may find applications in studying free surface flows and interface deformation

  12. On a Robust MaxEnt Process Regression Model with Sample-Selection

    Directory of Open Access Journals (Sweden)

    Hea-Jung Kim

    2018-04-01

    In a regression analysis, a sample-selection bias arises when a dependent variable is partially observed as a result of the sample selection. This study introduces a Maximum Entropy (MaxEnt) process regression model that assumes a MaxEnt prior distribution for its nonparametric regression function and finds that the MaxEnt process regression model includes the well-known Gaussian process regression (GPR) model as a special case. Then, this special MaxEnt process regression model, i.e., the GPR model, is generalized to obtain a robust sample-selection Gaussian process regression (RSGPR) model that deals with non-normal data in the sample selection. Various properties of the RSGPR model are established, including the stochastic representation, distributional hierarchy, and magnitude of the sample-selection bias. These properties are used in the paper to develop a hierarchical Bayesian methodology to estimate the model. This involves a simple and computationally feasible Markov chain Monte Carlo algorithm that avoids analytical or numerical derivatives of the log-likelihood function of the model. The performance of the RSGPR model in terms of the sample-selection bias correction, robustness to non-normality, and prediction is demonstrated through simulation results that attest to its good finite-sample performance.

  13. On concurvity in nonlinear and nonparametric regression models

    Directory of Open Access Journals (Sweden)

    Sonia Amodio

    2014-12-01

    When data are affected by multicollinearity in the linear regression framework, concurvity will be present in fitting a generalized additive model (GAM). The term concurvity describes nonlinear dependencies among the predictor variables. Just as collinearity results in inflated variance of the estimated regression coefficients in the linear regression model, the presence of concurvity leads to instability of the estimated coefficients in GAMs. Even if the backfitting algorithm will always converge to a solution, in case of concurvity the final solution of the backfitting procedure in fitting a GAM is influenced by the starting functions. While exact concurvity is highly unlikely, approximate concurvity, the analogue of multicollinearity, is of practical concern as it can lead to upwardly biased estimates of the parameters and to underestimation of their standard errors, increasing the risk of committing a type I error. We compare the existing approaches to detect concurvity, pointing out their advantages and drawbacks, using simulated and real data sets. As a result, this paper will provide a general criterion to detect concurvity in nonlinear and nonparametric regression models.

  14. Simple model of surface roughness for binary collision sputtering simulations

    Energy Technology Data Exchange (ETDEWEB)

    Lindsey, Sloan J. [Institute of Solid-State Electronics, TU Wien, Floragasse 7, A-1040 Wien (Austria); Hobler, Gerhard, E-mail: gerhard.hobler@tuwien.ac.at [Institute of Solid-State Electronics, TU Wien, Floragasse 7, A-1040 Wien (Austria); Maciążek, Dawid; Postawa, Zbigniew [Institute of Physics, Jagiellonian University, ul. Lojasiewicza 11, 30348 Kraków (Poland)

    2017-02-15

    Highlights: • A simple model of surface roughness is proposed. • Its key feature is a linearly varying target density at the surface. • The model can be used in 1D/2D/3D Monte Carlo binary collision simulations. • The model fits well experimental glancing incidence sputtering yield data. - Abstract: It has been shown that surface roughness can strongly influence the sputtering yield – especially at glancing incidence angles where the inclusion of surface roughness leads to an increase in sputtering yields. In this work, we propose a simple one-parameter model (the “density gradient model”) which imitates surface roughness effects. In the model, the target’s atomic density is assumed to vary linearly between the actual material density and zero. The layer width is the sole model parameter. The model has been implemented in the binary collision simulator IMSIL and has been evaluated against various geometric surface models for 5 keV Ga ions impinging an amorphous Si target. To aid the construction of a realistic rough surface topography, we have performed MD simulations of sequential 5 keV Ga impacts on an initially crystalline Si target. We show that our new model effectively reproduces the sputtering yield, with only minor variations in the energy and angular distributions of sputtered particles. The success of the density gradient model is attributed to a reduction of the reflection coefficient – leading to increased sputtering yields, similar in effect to surface roughness.
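
    The density profile described in this abstract can be sketched as a piecewise-linear function of depth. The coordinate convention (depth 0 at the outer edge of the transition layer, increasing into the target) and all names are assumptions made for illustration; this is not code from IMSIL.

```python
def density_gradient(depth, layer_width, bulk_density):
    """Atomic density at a given depth for a linear density-gradient surface.

    Assumed convention: density is zero at depth <= 0, rises linearly across
    the transition layer, and equals the bulk value at depth >= layer_width.
    """
    if layer_width <= 0.0:
        raise ValueError("layer_width must be positive")
    if depth <= 0.0:
        return 0.0
    if depth >= layer_width:
        return bulk_density
    return bulk_density * depth / layer_width

# Illustrative values in arbitrary units (not parameters from the paper).
for z in (-1.0, 0.5, 1.0, 2.0, 3.0):
    print(z, density_gradient(z, layer_width=2.0, bulk_density=1.0))
```

    The layer width is the only free parameter, which matches the abstract's description of a one-parameter model.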

  15. Simple model of surface roughness for binary collision sputtering simulations

    International Nuclear Information System (INIS)

    Lindsey, Sloan J.; Hobler, Gerhard; Maciążek, Dawid; Postawa, Zbigniew

    2017-01-01

    Highlights: • A simple model of surface roughness is proposed. • Its key feature is a linearly varying target density at the surface. • The model can be used in 1D/2D/3D Monte Carlo binary collision simulations. • The model fits well experimental glancing incidence sputtering yield data. - Abstract: It has been shown that surface roughness can strongly influence the sputtering yield – especially at glancing incidence angles where the inclusion of surface roughness leads to an increase in sputtering yields. In this work, we propose a simple one-parameter model (the “density gradient model”) which imitates surface roughness effects. In the model, the target’s atomic density is assumed to vary linearly between the actual material density and zero. The layer width is the sole model parameter. The model has been implemented in the binary collision simulator IMSIL and has been evaluated against various geometric surface models for 5 keV Ga ions impinging an amorphous Si target. To aid the construction of a realistic rough surface topography, we have performed MD simulations of sequential 5 keV Ga impacts on an initially crystalline Si target. We show that our new model effectively reproduces the sputtering yield, with only minor variations in the energy and angular distributions of sputtered particles. The success of the density gradient model is attributed to a reduction of the reflection coefficient – leading to increased sputtering yields, similar in effect to surface roughness.

  16. GALAXY ROTATION AND RAPID SUPERMASSIVE BINARY COALESCENCE

    Energy Technology Data Exchange (ETDEWEB)

    Holley-Bockelmann, Kelly [Vanderbilt University, Nashville, TN (United States)]; Khan, Fazeel Mahmood, E-mail: k.holley@vanderbilt.edu [Institute of Space Technology (IST), Islamabad (Pakistan)]

    2015-09-10

    Galaxy mergers usher the supermassive black hole (SMBH) in each galaxy to the center of the potential, where they form an SMBH binary. The binary orbit shrinks by ejecting stars via three-body scattering, but ample work has shown that in spherical galaxy models, the binary separation stalls after ejecting all the stars in its loss cone—this is the well-known final parsec problem. However, it has been shown that SMBH binaries in non-spherical galactic nuclei harden at a nearly constant rate until reaching the gravitational wave regime. Here we use a suite of direct N-body simulations to follow SMBH binary evolution in both corotating and counterrotating flattened galaxy models. For N > 500 K, we find that the evolution of the SMBH binary is convergent and is independent of the particle number. Rotation in general increases the hardening rate of SMBH binaries even more effectively than galaxy geometry alone. SMBH binary hardening rates are similar for co- and counterrotating galaxies. In the corotating case, the center of mass of the SMBH binary settles into an orbit that is in corotation resonance with the background rotating model, and the coalescence time is roughly a few 100 Myr faster than a non-rotating flattened model. We find that counterrotation drives SMBHs to coalesce on a nearly radial orbit promptly after forming a hard binary. We discuss the implications for gravitational wave astronomy, hypervelocity star production, and the effect on the structure of the host galaxy.

  17. GALAXY ROTATION AND RAPID SUPERMASSIVE BINARY COALESCENCE

    International Nuclear Information System (INIS)

    Holley-Bockelmann, Kelly; Khan, Fazeel Mahmood

    2015-01-01

    Galaxy mergers usher the supermassive black hole (SMBH) in each galaxy to the center of the potential, where they form an SMBH binary. The binary orbit shrinks by ejecting stars via three-body scattering, but ample work has shown that in spherical galaxy models, the binary separation stalls after ejecting all the stars in its loss cone—this is the well-known final parsec problem. However, it has been shown that SMBH binaries in non-spherical galactic nuclei harden at a nearly constant rate until reaching the gravitational wave regime. Here we use a suite of direct N-body simulations to follow SMBH binary evolution in both corotating and counterrotating flattened galaxy models. For N > 500 K, we find that the evolution of the SMBH binary is convergent and is independent of the particle number. Rotation in general increases the hardening rate of SMBH binaries even more effectively than galaxy geometry alone. SMBH binary hardening rates are similar for co- and counterrotating galaxies. In the corotating case, the center of mass of the SMBH binary settles into an orbit that is in corotation resonance with the background rotating model, and the coalescence time is roughly a few 100 Myr faster than a non-rotating flattened model. We find that counterrotation drives SMBHs to coalesce on a nearly radial orbit promptly after forming a hard binary. We discuss the implications for gravitational wave astronomy, hypervelocity star production, and the effect on the structure of the host galaxy

  18. Semiparametric Mixtures of Regressions with Single-index for Model Based Clustering

    OpenAIRE

    Xiang, Sijia; Yao, Weixin

    2017-01-01

    In this article, we propose two classes of semiparametric mixture regression models with single-index for model based clustering. Unlike many semiparametric/nonparametric mixture regression models that can only be applied to low dimensional predictors, the new semiparametric models can easily incorporate high dimensional predictors into the nonparametric components. The proposed models are very general, and many of the recently proposed semiparametric/nonparametric mixture regression models a...

  19. A model for the massive binary V340 Muscae

    Science.gov (United States)

    Hauck, Norbert

    2016-02-01

    A synthetic light curve has been fitted to photometric data from the ASAS-3 database. The parameters of the best solution are well consistent with those derived from stellar models for both components for an initial metallicity Z=0.020 and a common age of 5 Myr. Therefore, we can reliably estimate the absolute dimensions of this close eclipsing binary system. Apparently, the O-type primary star has a mass of about 22.65 Msun and a radius of 10.35 Rsun. For the secondary star, likely a late B-type dwarf, we obtain about 3.1 Msun and 2.1 Rsun. Their mass ratio of about 0.138 might be the lowest found so far in O-type binaries. [English and German online-version of this paper available under www.bav-astro.eu/rb/rb2016-2/1.html].

  20. Reconstruction of binary geological images using analytical edge and object models

    Science.gov (United States)

    Abdollahifard, Mohammad J.; Ahmadi, Sadegh

    2016-04-01

    Reconstruction of fields using partial measurements is of vital importance in different applications in geosciences. Solving such an ill-posed problem requires a well-chosen model. In recent years, training images (TI) are widely employed as strong prior models for solving these problems. However, in the absence of enough evidence it is difficult to find an adequate TI which is capable of describing the field behavior properly. In this paper a very simple and general model is introduced which is applicable to a fairly wide range of binary images without any modifications. The model is motivated by the fact that nearly all binary images are composed of simple linear edges in micro-scale. The analytic essence of this model allows us to formulate the template matching problem as a convex optimization problem having efficient and fast solutions. The model has the potential to incorporate the qualitative and quantitative information provided by geologists. The image reconstruction problem is also formulated as an optimization problem and solved using an iterative greedy approach. The proposed method is capable of recovering the image unknown values with accuracies about 90% given samples representing as few as 2% of the original image.

  1. Short-term electricity prices forecasting based on support vector regression and Auto-regressive integrated moving average modeling

    International Nuclear Information System (INIS)

    Che Jinxing; Wang Jianzhou

    2010-01-01

    In this paper, we present the use of different mathematical models to forecast electricity prices in deregulated power markets. A successful electricity price prediction tool can help both power producers and consumers plan their bidding strategies. Motivated by the fact that the support vector regression (SVR) model, with its ε-insensitive loss function, tolerates residuals that lie within the ε-tube, we propose a hybrid model, called SVRARIMA, that combines SVR and auto-regressive integrated moving average (ARIMA) models to take advantage of their respective strengths in nonlinear and linear modeling. A nonlinear analysis of the time series indicates that nonlinear modeling is appropriate, so SVR is applied to capture the nonlinear patterns. ARIMA models have been successfully applied to the regression estimation of the residuals. The experimental results demonstrate that the proposed model outperforms the existing neural-network approaches, the traditional ARIMA models and other hybrid models in terms of root mean square error and mean absolute percentage error.

  2. Quantifying relative importance: Computing standardized effects in models with binary outcomes

    Science.gov (United States)

    Grace, James B.; Johnson, Darren; Lefcheck, Jonathan S.; Byrnes, Jarrett E.K.

    2018-01-01

    Scientists commonly ask questions about the relative importance of processes, and then turn to statistical models for answers. Standardized coefficients are typically used in such situations, with the goal being to compare effects on a common scale. Traditional approaches to obtaining standardized coefficients were developed with idealized Gaussian variables in mind. When responses are binary, complications arise that impact standardization methods. In this paper, we review, evaluate, and propose new methods for standardizing coefficients from models that contain binary outcomes. We first consider the interpretability of unstandardized coefficients and then examine two main approaches to standardization. One approach, which we refer to as the Latent-Theoretical or LT method, assumes that underlying binary observations there exists a latent, continuous propensity linearly related to the coefficients. A second approach, which we refer to as the Observed-Empirical or OE method, assumes responses are purely discrete and estimates error variance empirically via reference to a classical R² estimator. We also evaluate the standard formula for calculating standardized coefficients based on standard deviations. Criticisms of this practice have been persistent, leading us to propose an alternative formula that is based on user-defined “relevant ranges”. Finally, we implement all of the above in an open-source package for the statistical software R.
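    A minimal sketch of latent-variable standardization for a logit model, assuming the usual convention that the latent error follows a standard logistic distribution with variance π²/3. The function name and the toy data are hypothetical, the code is written in Python rather than R, and it is not the package mentioned in the abstract, whose LT implementation may differ in detail.

```python
import math
from statistics import pstdev, pvariance


def latent_standardized_coefs(coefs, columns):
    """Standardize logit slopes on an assumed latent continuous scale.

    coefs   -- fitted slope coefficients (intercept excluded)
    columns -- one list of observed values per predictor, same order as coefs
    """
    n = len(columns[0])
    # Linear predictor without the intercept; a constant shift does not
    # change its variance.
    linear_predictor = [
        sum(b * col[i] for b, col in zip(coefs, columns)) for i in range(n)
    ]
    # Assumed latent variance: explained part plus logistic error pi^2 / 3.
    latent_sd = math.sqrt(pvariance(linear_predictor) + math.pi ** 2 / 3)
    return [b * pstdev(col) / latent_sd for b, col in zip(coefs, columns)]


if __name__ == "__main__":
    # Hypothetical fitted slopes and predictor values.
    x1 = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0]
    x2 = [2.5, 3.1, 4.0, 1.2, 2.2, 3.3]
    print(latent_standardized_coefs([0.8, -0.4], [x1, x2]))
```

    Population (rather than sample) variances are used throughout so that the numerator and denominator are computed on the same footing; either choice works as long as it is applied consistently.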

  3. Modeling oil production based on symbolic regression

    International Nuclear Information System (INIS)

    Yang, Guangfei; Li, Xianneng; Wang, Jianliang; Lian, Lian; Ma, Tieju

    2015-01-01

    Numerous models have been proposed to forecast the future trends of oil production and almost all of them are based on some predefined assumptions with various uncertainties. In this study, we propose a novel data-driven approach that uses symbolic regression to model oil production. We validate our approach on both synthetic and real data, and the results prove that symbolic regression could effectively identify the true models beneath the oil production data and also make reliable predictions. Symbolic regression indicates that world oil production will peak in 2021, which broadly agrees with other techniques used by researchers. Our results also show that the rate of decline after the peak is almost half the rate of increase before the peak, and it takes nearly 12 years to drop 4% from the peak. These predictions are more optimistic than those in several other reports, and the smoother decline will provide the world, especially the developing countries, with more time to orchestrate mitigation plans. -- Highlights: •A data-driven approach has been shown to be effective at modeling the oil production. •The Hubbert model could be discovered automatically from data. •The peak of world oil production is predicted to appear in 2021. •The decline rate after peak is half of the increase rate before peak. •Oil production projected to decline 4% post-peak

  4. Models for the formation of binary and millisecond radio pulsars

    International Nuclear Information System (INIS)

    van den Heuvel, E.P.J.

    1984-01-01

    The peculiar combination of a relatively short pulse period and a relatively weak surface dipole magnetic field strength of binary radio pulsars finds a consistent explanation in terms of: (i) decay of the surface dipole component of neutron star magnetic fields on a timescale of (2-5) × 10^6 yr, in combination with: (ii) spin up of the rotation of the neutron star during a subsequent mass-transfer phase. The two observed classes of binary radio pulsars (very close and very wide systems, respectively) are expected to have been formed by the later evolution of binaries consisting of a neutron star and a normal companion star, in which the companion was (considerably) more massive than the neutron star, or less massive than the neutron star, respectively. In the first case the companion of the neutron star in the final system will be a fairly massive white dwarf, in a circular orbit, or a neutron star in an eccentric orbit. In the second case the final companion to the neutron star will be a low-mass (approx. 0.3 Msun) helium white dwarf in a wide and nearly circular orbit. In systems of the second type the neutron star was most probably formed by the accretion-induced collapse of a white dwarf. This explains why PSR 1953+29 has a millisecond rotation period and why PSR 0820+02 has not. Binary coalescence models for the formation of the 1.5 millisecond pulsar appear to be viable. The companion to the neutron star may have been a low-mass red dwarf, a neutron star, or a massive (> 0.7 Msun) white dwarf. In the red-dwarf case the progenitor system probably was a CV binary in which the white dwarf collapsed by accretion. 66 references, 6 figures, 1 table

  5. Modeling diffusion coefficients in binary mixtures of polar and non-polar compounds

    DEFF Research Database (Denmark)

    Medvedev, Oleg; Shapiro, Alexander

    2005-01-01

    The theory of transport coefficients in liquids, developed previously, is tested on a description of the diffusion coefficients in binary polar/non-polar mixtures, by applying advanced thermodynamic models. Comparison to a large set of experimental data shows good performance of the model. Only f...

  6. Factors associated with trait anger level of juvenile offenders in Hubei province: A binary logistic regression analysis.

    Science.gov (United States)

    Tang, Li-Na; Ye, Xiao-Zhou; Yan, Qiu-Ge; Chang, Hong-Juan; Ma, Yu-Qiao; Liu, De-Bin; Li, Zhi-Gen; Yu, Yi-Zhen

    2017-02-01

    The risk factors of high trait anger of juvenile offenders were explored through questionnaire study in a youth correctional facility of Hubei province, China. A total of 1090 juvenile offenders in Hubei province were investigated by self-compiled social-demographic questionnaire, Childhood Trauma Questionnaire (CTQ), and State-Trait Anger Expression Inventory-II (STAXI-II). The risk factors were analyzed by chi-square tests, correlation analysis, and binary logistic regression analysis with SPSS 19.0. A total of 1082 copies of valid questionnaires were collected. High trait anger group (n=316) was defined as those who scored in the upper 27th percentile of STAXI-II trait anger scale (TAS), and the rest were defined as low trait anger group (n=766). The risk factors associated with high level of trait anger included: childhood emotional abuse, childhood sexual abuse, step family, frequent drug abuse, and frequent internet using (P0.05). It was suggested that traumatic experience in childhood and unhealthy life style may significantly increase the level of trait anger in adulthood. The risk factors of high trait anger and their effects should be taken into consideration seriously.

  7. Model performance analysis and model validation in logistic regression

    Directory of Open Access Journals (Sweden)

    Rosa Arboretti Giancristofaro

    2007-10-01

    In this paper a new model validation procedure for a logistic regression model is presented. First, we briefly review different techniques of model validation. Next, we define a number of properties required for a model to be considered "good", and a number of quantitative performance measures. Lastly, we describe a methodology for the assessment of the performance of a given model by using an example taken from a management study.

  8. Model building strategy for logistic regression: purposeful selection.

    Science.gov (United States)

    Zhang, Zhongheng

    2016-03-01

    Logistic regression is one of the most commonly used models to account for confounders in the medical literature. The article introduces how to perform the purposeful selection model building strategy with R. I stress the use of the likelihood ratio test to see whether deleting a variable has a significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment of the remaining covariates. Interactions should be checked to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness-of-fit (GOF), that is, how well the fitted model reflects the real data. The Hosmer-Lemeshow GOF test is the most widely used for logistic regression models.
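    The likelihood ratio test mentioned above can be sketched as follows for the case where exactly one variable is dropped (one degree of freedom). The article itself works in R; this Python sketch is only an illustration, and the function name and log-likelihood values are hypothetical.

```python
import math


def likelihood_ratio_test_1df(loglik_full, loglik_reduced):
    """Compare nested models that differ by a single parameter.

    Returns the test statistic 2 * (ll_full - ll_reduced) and its p-value
    under a chi-square distribution with 1 degree of freedom.
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # For 1 df, P(chi2 > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical maximized log-likelihoods of the full and reduced models.
    stat, p = likelihood_ratio_test_1df(-210.4, -214.9)
    print(f"LR statistic = {stat:.2f}, p = {p:.4f}")
```

    A small p-value suggests that deleting the variable worsens the fit noticeably, so the variable would be kept. For more than one dropped parameter the general chi-square survival function is needed instead of the closed form used here.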

  9. The APT model as reduced-rank regression

    NARCIS (Netherlands)

    Bekker, P.A.; Dobbelstein, P.; Wansbeek, T.J.

    Integrating the two steps of an arbitrage pricing theory (APT) model leads to a reduced-rank regression (RRR) model. So the results on RRR can be used to estimate APT models, making estimation very simple. We give a succinct derivation of estimation of RRR, derive the asymptotic variance of RRR

  10. Mediation analysis for logistic regression with interactions: Application of a surrogate marker in ophthalmology

    DEFF Research Database (Denmark)

    Jensen, Signe Marie; Hauger, Hanne; Ritz, Christian

    2018-01-01

    Mediation analysis is often based on fitting two models, one including and another excluding a potential mediator, and subsequently quantify the mediated effects by combining parameter estimates from these two models. Standard errors of such derived parameters may be approximated using the delta...... method. For a study evaluating a treatment effect on visual acuity, a binary outcome, we demonstrate how mediation analysis may conveniently be carried out by means of marginally fitted logistic regression models in combination with the delta method. Several metrics of mediation are estimated and results...

  11. A template bank to search for gravitational waves from inspiralling compact binaries: I. Physical models

    International Nuclear Information System (INIS)

    Babak, S; Balasubramanian, R; Churches, D; Cokelaer, T; Sathyaprakash, B S

    2006-01-01

    Gravitational waves from coalescing compact binaries are searched for using the matched filtering technique. As the model waveform depends on a number of parameters, it is necessary to filter the data through a template bank covering the astrophysically interesting region of the parameter space. The choice of templates is defined by the maximum allowed drop in signal-to-noise ratio due to the discreteness of the template bank. In this paper we describe the template-bank algorithm that was used in the analysis of data from the Laser Interferometer Gravitational Wave Observatory (LIGO) and GEO 600 detectors to search for signals from binaries consisting of non-spinning compact objects. Using Monte Carlo simulations, we study the efficiency of the bank and show that its performance is satisfactory for the design sensitivity curves of ground-based interferometric gravitational wave detectors GEO 600, initial LIGO, advanced LIGO and Virgo. The bank is efficient in searching for various compact binaries such as binary primordial black holes, binary neutron stars, binary black holes, as well as a mixed binary consisting of a non-spinning black hole and a neutron star

  12. Modelling subject-specific childhood growth using linear mixed-effect models with cubic regression splines.

    Science.gov (United States)

    Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William

    2016-01-01

    Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and account for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept (p modeled with a first order continuous autoregressive error term as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19

  13. Influence diagnostics in meta-regression model.

    Science.gov (United States)

    Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua

    2017-09-01

    This paper studies influence diagnostics in the meta-regression model, including case deletion diagnostics and local influence analysis. We derive the subset deletion formulae for the estimation of the regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measures are defined. The local influence analysis based on the case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme is explored. We introduce a method that simultaneously perturbs responses, covariates, and within-variance to obtain the local influence measure, which has the advantage of being able to compare the influence magnitude of influential studies across different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.

  14. Logistic Regression Modeling of Diminishing Manufacturing Sources for Integrated Circuits

    National Research Council Canada - National Science Library

    Gravier, Michael

    1999-01-01

    .... The research identified logistic regression as a powerful tool for analysis of DMSMS and further developed twenty models attempting to identify the "best" way to model and predict DMSMS using logistic regression...

  15. Analysis of dental caries using generalized linear and count regression models

    Directory of Open Access Journals (Sweden)

    Javali M. Phil

    2013-11-01

    Generalized linear models (GLM) are generalizations of linear regression models, which allow fitting regression models to response data, in all the sciences and especially the medical and dental sciences, that follow a general exponential family. They are a flexible and widely used class of models that can accommodate a variety of response variables. Count data are frequently characterized by overdispersion and excess zeros. Zero-inflated count models provide a parsimonious yet powerful way to model this type of situation. Such models assume that the data are a mixture of two separate data generation processes: one generates only zeros, and the other is either a Poisson or a negative binomial data-generating process. Zero-inflated count regression models such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) regression models have been used to handle dental caries count data with many zeros. We present a framework for evaluating the suitability of applying the GLM, Poisson, NB, ZIP and ZINB models to a dental caries data set where the count data may exhibit evidence of many zeros and overdispersion. Estimation of the model parameters using the method of maximum likelihood is provided. Based on the Vuong test statistic and the goodness-of-fit measure for dental caries data, the NB and ZINB regression models perform better than other count regression models.
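    The two-process mixture behind a zero-inflated Poisson model can be written down directly. The sketch below gives the probability mass function only, without covariates or fitting; the function names and parameter values are illustrative assumptions rather than anything taken from the paper.

```python
import math


def zip_pmf(k, lam, pi_zero):
    """P(Y = k) under a zero-inflated Poisson model.

    lam     -- mean of the Poisson component
    pi_zero -- probability that an observation comes from the
               process that generates only zeros
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from either process.
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


def zip_mean(lam, pi_zero):
    """Expected value of a zero-inflated Poisson variable."""
    return (1.0 - pi_zero) * lam


if __name__ == "__main__":
    # Hypothetical parameters: 30% structural zeros, Poisson mean 2.
    for k in range(5):
        print(k, round(zip_pmf(k, 2.0, 0.3), 4))
    print("mean:", zip_mean(2.0, 0.3))
```

    Setting pi_zero to 0 recovers the ordinary Poisson distribution, which is why Poisson and ZIP fits are natural candidates for comparison on the same data.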

  16. Formation and Evolution of X-ray Binaries

    Science.gov (United States)

    Fragkos, Anastasios

    X-ray binaries - mass-transferring binary stellar systems with compact object accretors - are unique astrophysical laboratories. They carry information about many complex physical processes such as star formation, compact object formation, and evolution of interacting binaries. My thesis work involves the study of the formation and evolution of Galactic and extra-galactic X-ray binaries using both detailed and realistic simulation tools, and population synthesis techniques. I applied an innovative analysis method that allows the reconstruction of the full evolutionary history of known black hole X-ray binaries back to the time of compact object formation. This analysis takes into account all the available observationally determined properties of a system, and models in detail four of its evolutionary phases: mass transfer through the ongoing X-ray phase, tidal evolution before the onset of Roche-lobe overflow, motion through the Galactic potential after the formation of the black hole, and binary orbital dynamics at the time of core collapse. Motivated by deep extra-galactic Chandra survey observations, I worked on population synthesis models of low-mass X-ray binaries in the two elliptical galaxies NGC3379 and NGC4278. These simulations were targeted at understanding the origin of the shape and normalization of the observed X-ray luminosity functions. In a follow-up study, I proposed a physically motivated prescription for the modeling of transient neutron star low-mass X-ray binary properties, such as duty cycle, outburst duration and recurrence time. This prescription enabled the direct comparison of transient low-mass X-ray binary population synthesis models to the Chandra X-ray survey of the two ellipticals NGC3379 and NGC4278. Finally, I worked on population synthesis models of black hole X-ray binaries in the Milky Way. This work was motivated by recent developments in observational techniques for the measurement of black hole spin magnitudes in

  17. AIRLINE ACTIVITY FORECASTING BY REGRESSION MODELS

    Directory of Open Access Journals (Sweden)

    Н. Білак

    2012-04-01

    The paper proposes linear and nonlinear regression models that take into account the trend equation and seasonality indices in order to analyze and reconstruct passenger traffic volumes over past periods and to forecast them for future years, together with an algorithm for building these models from statistical analysis of data over the years. The resulting model is a first step toward the synthesis of more complex models that will enable forecasting of an airline's passenger traffic (income level) with the highest accuracy and timeliness.

  18. [Application of detecting and taking overdispersion into account in Poisson regression model].

    Science.gov (United States)

    Bouche, G; Lepage, B; Migeot, V; Ingrand, P

    2009-08-01

    Researchers often use the Poisson regression model to analyze count data. Overdispersion can occur when a Poisson regression model is used, resulting in an underestimation of the variance of the regression model parameters. Our objective was to take overdispersion into account and assess its impact with an illustration based on the data of a study investigating the relationship between use of the Internet to seek health information and number of primary care consultations. Three methods, overdispersed Poisson, a robust estimator, and negative binomial regression, were used to take overdispersion into account in explaining variation in the number (Y) of primary care consultations. We tested for overdispersion in the Poisson regression model using the ratio of the sum of squared Pearson residuals to the number of degrees of freedom (χ²/df). We then fitted the three models and compared their parameter estimates to those given by the Poisson regression model. The variance of the number of primary care consultations (Var[Y]=21.03) was greater than the mean (E[Y]=5.93) and the χ²/df ratio was 3.26, which confirmed overdispersion. Standard errors of the parameters varied greatly between the Poisson regression model and the three other regression models. Interpretation of estimates for two variables (using the Internet to seek health information and single-parent family) would have changed according to the model retained, with significance levels of 0.06 and 0.002 (Poisson), 0.29 and 0.09 (overdispersed Poisson), 0.29 and 0.13 (use of a robust estimator) and 0.45 and 0.13 (negative binomial), respectively. Different methods exist to solve the problem of underestimating variance in the Poisson regression model when overdispersion is present. The negative binomial regression model seems to be particularly accurate because of its theoretical distribution; in addition, this regression is easy to perform with ordinary statistical software packages.
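    The dispersion check described in this abstract (Pearson chi-square divided by the residual degrees of freedom) can be sketched as follows. The counts are made up for illustration and the fitted means come from an intercept-only model, which is an assumption chosen to keep the example self-contained; in practice the fitted means would come from the full Poisson regression.

```python
def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square statistic divided by residual degrees of freedom.

    Values well above 1 point to overdispersion relative to the Poisson
    assumption that the variance equals the mean.
    """
    # Squared Pearson residuals for a Poisson model: (y - mu)^2 / mu.
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return chi2 / (len(counts) - n_params)


if __name__ == "__main__":
    # Hypothetical consultation counts; intercept-only fit, so every
    # fitted mean equals the sample mean and one parameter is estimated.
    counts = [0, 1, 2, 3, 14]
    mean = sum(counts) / len(counts)
    ratio = pearson_dispersion(counts, [mean] * len(counts), n_params=1)
    print(f"chi2/df = {ratio:.3f}")
```

    A ratio near 1 is what the Poisson model expects; the abstract's value of 3.26 is the kind of result that motivates switching to an overdispersed Poisson, robust, or negative binomial analysis.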

  19. Variable Selection for Regression Models of Percentile Flows

    Science.gov (United States)

    Fouad, G.

    2017-12-01

    Percentile flows describe the flow magnitude equaled or exceeded for a given percent of time, and are widely used in water resource management. However, these statistics are normally unavailable since most basins are ungauged. Percentile flows of ungauged basins are often predicted using regression models based on readily observable basin characteristics, such as mean elevation. The number of these independent variables is too large to evaluate all possible models. A subset of models is typically evaluated using automatic procedures, like stepwise regression. This ignores a large variety of methods from the field of feature (variable) selection and physical understanding of percentile flows. A study of 918 basins in the United States was conducted to compare an automatic regression procedure to the following variable selection methods: (1) principal component analysis, (2) correlation analysis, (3) random forests, (4) genetic programming, (5) Bayesian networks, and (6) physical understanding. The automatic regression procedure only performed better than principal component analysis. Poor performance of the regression procedure was due to a commonly used filter for multicollinearity, which rejected the strongest models because they had cross-correlated independent variables. Multicollinearity did not decrease model performance in validation because of a representative set of calibration basins. Variable selection methods based strictly on predictive power (numbers 2-5 from above) performed similarly, likely indicating a limit to the predictive power of the variables. Similar performance was also reached using variables selected based on physical understanding, a finding that substantiates recent calls to emphasize physical understanding in modeling for predictions in ungauged basins. The strongest variables highlighted the importance of geology and land cover, whereas widely used topographic variables were the weakest predictors. Variables suffered from a high
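    The definition of a percentile flow given at the start of this abstract (the flow equaled or exceeded for a given percent of time) can be illustrated with a simple empirical calculation on a gauged record. The function name and the flow values are hypothetical, and no interpolation or plotting-position formula is applied, which real hydrological software may handle differently.

```python
import math


def percentile_flow(flows, exceedance_percent):
    """Largest observed flow that is equaled or exceeded at least
    `exceedance_percent` percent of the time (0 < percent <= 100)."""
    if not 0 < exceedance_percent <= 100:
        raise ValueError("exceedance_percent must be in (0, 100]")
    ordered = sorted(flows, reverse=True)
    # Number of observations that must be at or above the returned flow.
    needed = math.ceil(exceedance_percent * len(ordered) / 100)
    return ordered[max(needed, 1) - 1]


if __name__ == "__main__":
    # Hypothetical daily flows in arbitrary units.
    daily_flows = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    print("Q90:", percentile_flow(daily_flows, 90))  # low-flow statistic
    print("Q50:", percentile_flow(daily_flows, 50))  # median-type flow
```

    For ungauged basins no such record exists, which is why the study predicts these statistics from basin characteristics by regression instead.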

  20. Classifying machinery condition using oil samples and binary logistic regression

    Science.gov (United States)

    Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.

    2015-08-01

    The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time-consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding and hence some trust in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically, "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) are difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight into the drivers of the human classification process and into the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real-world oil analysis data set from engines on mining trucks is presented and, using cross-validation, we demonstrate that logistic regression outperforms the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.

  1. [Evaluation of estimation of prevalence ratio using Bayesian log-binomial regression model].

    Science.gov (United States)

    Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L

    2017-03-10

    To evaluate the estimation of the prevalence ratio (PR) with a Bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking associated with caregivers' recognition of risk signs of diarrhea in their infants, fitting the Bayesian log-binomial regression model in OpenBUGS. Caregivers' recognition of their infant's risk signs of diarrhea was significantly associated with a 13% increase in medical care-seeking. We also compared the point and interval estimates of this PR, and the convergence of three models (model 1: no covariate adjustment; model 2: adjusted for duration of caregivers' education; model 3: model 2 further adjusted for distance between village and township and child age in months), between the Bayesian and the conventional log-binomial regression models. All three Bayesian log-binomial regression models converged, with estimated PRs of 1.130 (95% CI: 1.005-1.265), 1.128 (95% CI: 1.001-1.264) and 1.132 (95% CI: 1.004-1.267), respectively. Conventional log-binomial regression models 1 and 2 converged, with PRs of 1.130 (95% CI: 1.055-1.206) and 1.126 (95% CI: 1.051-1.203), respectively, but model 3 failed to converge, so the COPY method was used to estimate the PR, which was 1.125 (95% CI: 1.051-1.200). The point and interval estimates of the PRs from the three Bayesian models differed slightly from those of the conventional models, but the two approaches were consistent in estimating the PR. The Bayesian log-binomial regression model can therefore estimate the PR effectively with fewer convergence failures, and offers advantages in application over the conventional log-binomial regression model.
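
    For the unadjusted case (model 1 in the abstract above), a prevalence ratio and a Wald-type interval can be sketched directly from a 2x2 table, since a log-binomial model with a single binary exposure reproduces the ratio of the two sample prevalences. The function name and the counts in the usage note are hypothetical and are not taken from the study; the Bayesian OpenBUGS fit and the COPY method are not reproduced here.

```python
import math

def prevalence_ratio(exposed_cases, exposed_total,
                     unexposed_cases, unexposed_total, z=1.96):
    """Unadjusted prevalence ratio with a Wald interval on the log scale.

    All argument names are illustrative. z=1.96 gives an approximate
    95% confidence interval.
    """
    p1 = exposed_cases / exposed_total        # prevalence among exposed
    p0 = unexposed_cases / unexposed_total    # prevalence among unexposed
    pr = p1 / p0
    # Standard error of log(PR) for two independent binomial samples.
    se = math.sqrt(1 / exposed_cases - 1 / exposed_total
                   + 1 / unexposed_cases - 1 / unexposed_total)
    lower = math.exp(math.log(pr) - z * se)
    upper = math.exp(math.log(pr) + z * se)
    return pr, lower, upper
```

    With hypothetical counts of 60 care-seekers among 100 caregivers who recognised risk signs and 50 among 100 who did not, `prevalence_ratio(60, 100, 50, 100)` returns a point estimate of 1.2 together with its interval. Covariate adjustment, as in models 2 and 3, would need an actual regression fit.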

  2. The cross-validated AUC for MCP-logistic regression with high-dimensional data.

    Science.gov (United States)

    Jiang, Dingfeng; Huang, Jian; Zhang, Ying

    2013-10-01

    We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and to compare it with existing methods including the Akaike information criterion (AIC), Bayesian information criterion (BIC) and Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.
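
    The idea of choosing a tuning parameter by cross-validated AUC can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes the out-of-fold scores for each candidate tuning value have already been produced by some penalized logistic fit (the MCP coordinate descent itself is not shown), and it pools all held-out scores before computing one AUC, which is only one of several possible conventions. The names `auc`, `select_by_cv_auc` and `held_out_scores` are hypothetical.

```python
def auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs in which the
    positive case gets the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def select_by_cv_auc(held_out_scores, labels):
    """Return the tuning value whose out-of-fold scores give the largest AUC.

    held_out_scores maps each candidate tuning value to a list holding,
    for every observation, the score predicted while that observation
    was in the held-out fold.
    """
    return max(held_out_scores,
               key=lambda value: auc(held_out_scores[value], labels))
```

    The pairwise count is quadratic in the sample size, which is acceptable for a sketch; a rank-based formula would be preferable for large data sets.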

  3. RED GIANTS IN ECLIPSING BINARY AND MULTIPLE-STAR SYSTEMS: MODELING AND ASTEROSEISMIC ANALYSIS OF 70 CANDIDATES FROM KEPLER DATA

    International Nuclear Information System (INIS)

    Gaulme, P.; McKeever, J.; Rawls, M. L.; Jackiewicz, J.; Mosser, B.; Guzik, J. A.

    2013-01-01

    Red giant stars are proving to be an incredible source of information for testing models of stellar evolution, as asteroseismology has opened up a window into their interiors. Such insights are a direct result of the unprecedented data from space missions CoRoT and Kepler as well as recent theoretical advances. Eclipsing binaries are also fundamental astrophysical objects, and when coupled with asteroseismology, binaries provide two independent methods to obtain masses and radii and exciting opportunities to develop highly constrained stellar models. The possibility of discovering pulsating red giants in eclipsing binary systems is therefore an important goal that could potentially offer very robust characterization of these systems. Until recently, only one case has been discovered with Kepler. We cross-correlate the detected red giant and eclipsing-binary catalogs from Kepler data to find possible candidate systems. Light-curve modeling and mean properties measured from asteroseismology are combined to yield specific measurements of periods, masses, radii, temperatures, eclipse timing variations, core rotation rates, and red giant evolutionary state. After using three different techniques to eliminate false positives, out of the 70 systems common to the red giant and eclipsing-binary catalogs we find 13 strong candidates (12 previously unknown) to be eclipsing binaries, one to be a non-eclipsing binary with tidally induced oscillations, and 10 more to be hierarchical triple systems, all of which include a pulsating red giant. The systems span a range of orbital eccentricities, periods, and spectral types F, G, K, and M for the companion of the red giant. One case even suggests an eclipsing binary composed of two red giant stars and another of a red giant with a δ-Scuti star. 
    The discovery of multiple pulsating red giants in eclipsing binaries provides an exciting test bed for precise astrophysical modeling, and follow-up spectroscopic observations of many of the

  4. Geographically Weighted Logistic Regression Applied to Credit Scoring Models

    Directory of Open Access Journals (Sweden)

    Pedro Henrique Melo Albuquerque

    This study used real data from a Brazilian financial institution on transactions involving Consumer Direct Credit (CDC), granted to clients residing in the Distrito Federal (DF), to construct credit scoring models via Logistic Regression and Geographically Weighted Logistic Regression (GWLR) techniques. The aims were: to verify whether the factors that influence credit risk differ according to the borrower’s geographic location; to compare the set of models estimated via GWLR with the global model estimated via Logistic Regression, in terms of predictive power and financial losses for the institution; and to verify the viability of using the GWLR technique to develop credit scoring models. The metrics used to compare the models developed via the two techniques were the AICc informational criterion, the accuracy of the models, the percentage of false positives, the sum of the value of false positive debt, and the expected monetary value of portfolio default compared with the monetary value of defaults observed. The models estimated for each region in the DF were distinct in their variables and coefficients (parameters), leading to the conclusion that credit risk was influenced differently in each region in the study. The Logistic Regression and GWLR methodologies presented very close results in terms of predictive power and financial losses for the institution, and the study demonstrated the viability of using the GWLR technique to develop credit scoring models for the target population in the study.

  5. Modeling maximum daily temperature using a varying coefficient regression model

    Science.gov (United States)

    Han Li; Xinwei Deng; Dong-Yum Kim; Eric P. Smith

    2014-01-01

    Relationships between stream water and air temperatures are often modeled using linear or nonlinear regression methods. Despite a strong relationship between water and air temperatures and a variety of models that are effective for data summarized on a weekly basis, such models did not yield consistently good predictions for summaries such as daily maximum temperature...

  6. Modeling binary correlated responses using SAS, SPSS and R

    CERN Document Server

    Wilson, Jeffrey R

    2015-01-01

    Statistical tools to analyze correlated binary data are spread out in the existing literature. This book makes these tools accessible to practitioners in a single volume. Chapters cover recently developed statistical tools and statistical packages that are tailored to analyzing correlated binary data. The authors showcase both traditional and new methods for application to health-related research. Data and computer programs will be publicly available in order for readers to replicate model development, but learning a new statistical language is not necessary with this book. The inclusion of code for R, SAS, and SPSS allows for easy implementation by readers. For readers interested in learning more about the languages, though, there are short tutorials in the appendix. Accompanying data sets are available for download through the book's website. Data analysis presented in each chapter will provide step-by-step instructions so these new methods can be readily applied to projects. Researchers and graduate stu...

  7. Multiple regression and beyond: an introduction to multiple regression and structural equation modeling

    CERN Document Server

    Keith, Timothy Z

    2014-01-01

    Multiple Regression and Beyond offers a conceptually oriented introduction to multiple regression (MR) analysis and structural equation modeling (SEM), along with analyses that flow naturally from those methods. By focusing on the concepts and purposes of MR and related methods, rather than the derivation and calculation of formulae, this book introduces material to students more clearly, and in a less threatening way. In addition to illuminating content necessary for coursework, the accessibility of this approach means students are more likely to be able to conduct research using MR or SEM--and more likely to use the methods wisely. Covers both MR and SEM, while explaining their relevance to one another. Also includes path analysis, confirmatory factor analysis, and latent growth modeling. Figures and tables throughout provide examples and illustrate key concepts and techniques. For additional resources, please visit: http://tzkeith.com/.

  8. Time series regression model for infectious disease and weather.

    Science.gov (United States)

    Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro

    2015-10-01

    Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  9. Linear regression crash prediction models : issues and proposed solutions.

    Science.gov (United States)

    2010-05-01

    The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novice data aggregation to satisfy linear regression assumptions; namely error structure normality ...

  10. Astronomy of binary and multiple stars

    International Nuclear Information System (INIS)

    Tokovinin, A.A.

    1984-01-01

    Various types of binary stars and methods for their observation are described in a popular form. Some models of formation and evolution of binary and multiple star systems are presented. It is concluded that formation of binary and multiple stars is a regular stage in the process of star production.

  11. Variance in binary stellar population synthesis

    Science.gov (United States)

    Breivik, Katelyn; Larson, Shane L.

    2016-03-01

    In the years preceding LISA, Milky Way compact binary population simulations can be used to inform the science capabilities of the mission. Galactic population simulation efforts generally focus on high fidelity models that require extensive computational power to produce a single simulated population for each model. Each simulated population represents an incomplete sample of the functions governing compact binary evolution, thus introducing variance from one simulation to another. We present a rapid Monte Carlo population simulation technique that can simulate thousands of populations in less than a week, thus allowing a full exploration of the variance associated with a binary stellar evolution model.

  12. Identification of Influential Points in a Linear Regression Model

    Directory of Open Access Journals (Sweden)

    Jan Grosz

    2011-03-01

    The article deals with the detection and identification of influential points in the linear regression model. Three methods of detection of outliers and leverage points are described. These procedures can also be used for one-sample (independent) datasets. This paper briefly describes theoretical aspects of several robust methods as well. Robust statistics is a powerful tool to increase the reliability and accuracy of statistical modelling and data analysis. A simulation model of the simple linear regression is presented.
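
    As a rough illustration of how leverage points and influential points are flagged in simple linear regression, the sketch below computes hat values and Cook's distances in closed form. These two diagnostics are standard choices assumed here; the abstract does not say which three detection methods the article describes, and the function name is hypothetical.

```python
def influence_diagnostics(x, y):
    """Hat values (leverage) and Cook's distances for the least-squares
    fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    p = 2                                     # intercept and slope
    s2 = sum(e * e for e in residuals) / (n - p)   # residual variance
    # Leverage depends only on how far x_i lies from the mean of x.
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Cook's distance combines residual size and leverage.
    cooks = [e * e / (p * s2) * h / (1 - h) ** 2
             for e, h in zip(residuals, leverage)]
    return leverage, cooks
```

    A common rule of thumb, used here only as an assumption, flags observations with leverage above 2p/n or a Cook's distance above 4/n for closer inspection. The hat values always sum to p, which gives a quick sanity check.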

  13. Isothermal vapour–liquid equilibrium of binary systems containing polyoxyethylene dodecanoate and alcohols

    International Nuclear Information System (INIS)

    Khoiroh, Ianatul; Lee, Ming-Jer

    2013-01-01

    Highlights: ► An autoclave apparatus was used to measure binary vapor-liquid equilibrium data. ► The studied systems are polyoxyethylene dodecanoate with 2-butanol, tert-butanol, and 1-pentanol. ► The saturated pressure data were fitted accurately to the Antoine equation. ► The UNIQUAC, the NRTL, and the Flory–Huggins models correlated the phase equilibrium data well. ► The solvent activities have been calculated. - Abstract: Isothermal vapour–liquid equilibrium (VLE) data have been measured with a static method for three binary systems of polyoxyethylene dodecanoate {(POEDDA) + butan-2-ol} at T = (333.4 to 424.5) K, (POEDDA + tert-butanol) at (321.1 to 401.5) K, and (POEDDA + pentan-1-ol) at (340.2 to 419.4) K. Four feed compositions were studied over the concentration range of 0.099 to 0.432 of POEDDA mole fractions. The experimental results were fitted to the Antoine equation to regress the Antoine constants. These VLE data were further treated by using the Barker method to obtain the best fit of binary interaction parameters from the UNIQUAC, the NRTL, and the Flory–Huggins models. The results showed good agreement between the experimental and calculated values. The Flory–Huggins model yielded the best result with an overall average absolute relative deviation (AARD) of 2.1%. The solvent activities were also calculated and agreed well with the values calculated from those three activity coefficient models.
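
    The two quantities named in this abstract, the Antoine equation and the average absolute relative deviation (AARD), can be written down in a few lines. This is a sketch under assumptions: the common base-10 form log10(P) = A - B / (T + C) is used, the function names are hypothetical, and no constants or data from the study are reproduced.

```python
def antoine_pressure(temperature, a, b, c):
    """Saturated pressure from log10(P) = A - B / (T + C).

    The units of P and T are whatever the constants were regressed in.
    """
    return 10 ** (a - b / (temperature + c))

def aard_percent(calculated, experimental):
    """Average absolute relative deviation, in percent."""
    n = len(experimental)
    return 100.0 / n * sum(abs(pc - pe) / pe
                           for pc, pe in zip(calculated, experimental))
```

    As a plausibility check with commonly tabulated constants for water (A = 8.07131, B = 1730.63, C = 233.426, for P in mmHg and T in degrees Celsius; these are not from the paper), `antoine_pressure(100.0, 8.07131, 1730.63, 233.426)` comes out close to 760 mmHg.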

  14. Physics of Eclipsing Binaries: Modelling in the new era of ultra-high precision photometry

    OpenAIRE

    Pavlovski, K.; Bloemen, S.; Degroote, P.; Conroy, K.; Hambleton, Kelly; Giammarco, J.M.; Pablo, H.; Prša, A.; Tkachenko, A.; Torres, G.

    2013-01-01

    Recent ultra-high precision observations of eclipsing binaries, especially data acquired by the Kepler satellite, have made accurate light curve modelling increasingly challenging but also more rewarding. In this contribution, we discuss low-amplitude signals in light curves that can now be used to derive physical information about eclipsing binaries but that were inaccessible before the Kepler era. A notable example is the detection of Doppler beaming, which leads to an increase in flux when...

  15. The art of regression modeling in road safety

    CERN Document Server

    Hauer, Ezra

    2015-01-01

    This unique book explains how to fashion useful regression models from commonly available data to erect models essential for evidence-based road safety management and research. Composed from techniques and best practices presented over many years of lectures and workshops, The Art of Regression Modeling in Road Safety illustrates that fruitful modeling cannot be done without substantive knowledge about the modeled phenomenon. Class-tested in courses and workshops across North America, the book is ideal for professionals, researchers, university professors, and graduate students with an interest in, or responsibilities related to, road safety. This book also: · Presents for the first time a powerful analytical tool for road safety researchers and practitioners · Includes problems and solutions in each chapter as well as data and spreadsheets for running models and PowerPoint presentation slides · Features pedagogy well-suited for graduate courses and workshops including problems, solutions, and PowerPoint p...

  16. Support Vector Regression Model Based on Empirical Mode Decomposition and Auto Regression for Electric Load Forecasting

    Directory of Open Access Journals (Sweden)

    Hong-Juan Li

    2013-04-01

    Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by the strong non-linear learning capability of support vector regression (SVR), this paper presents an SVR model hybridized with the empirical mode decomposition (EMD) method and auto regression (AR) for electric load forecasting. The electric load data of the New South Wales (Australia) market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.

  17. Robust geographically weighted regression of modeling the Air Polluter Standard Index (APSI)

    Science.gov (United States)

    Warsito, Budi; Yasin, Hasbi; Ispriyanti, Dwi; Hoyyi, Abdul

    2018-05-01

    The Geographically Weighted Regression (GWR) model has been widely applied in many practical fields to explore the spatial heterogeneity of a regression model. However, this method is inherently not robust to outliers. Outliers commonly exist in data sets and may lead to a distorted estimate of the underlying regression model. One solution for handling outliers in a regression model is to use robust models; the resulting model is called Robust Geographically Weighted Regression (RGWR). This research aims to aid the government in the policy-making process related to air pollution mitigation by developing a standard index model for air polluters (Air Polluter Standard Index - APSI) based on the RGWR approach. We consider seven variables that are directly related to the air pollution level: traffic velocity, population density, the business center aspect, air humidity, wind velocity, air temperature, and the area of urban forest. The best model is the one with the smallest AIC value. There are significant differences between regression and RGWR in this case, but basic GWR with the Gaussian kernel is the best model for APSI because it has the smallest AIC.
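
    The core of basic GWR with a Gaussian kernel is a weighted least-squares fit repeated at each location, with weights that decay with distance. The sketch below shows one such local fit for a single predictor. It is an illustration under assumptions: the kernel form exp(-0.5 (d / h)^2), the function names and the bandwidth argument are chosen here, and neither the robust (RGWR) variant nor the AIC comparison from the abstract is implemented.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel weight: 1 at distance 0, decaying with distance."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_fit(x, y, distances, bandwidth):
    """Weighted least-squares intercept and slope at one target location.

    distances[i] is the distance from observation i to the target
    location; nearer observations get more weight.
    """
    w = [gaussian_weight(d, bandwidth) for d in distances]
    total = sum(w)
    x_w = sum(wi * xi for wi, xi in zip(w, x)) / total   # weighted means
    y_w = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_w) * (yi - y_w) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_w) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_w - slope * x_w, slope
```

    Calling `local_fit` once per location, each time with that location's distance vector, yields the spatially varying coefficients that distinguish GWR from a single global regression.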

  18. Flexible competing risks regression modeling and goodness-of-fit

    DEFF Research Database (Denmark)

    Scheike, Thomas; Zhang, Mei-Jie

    2008-01-01

    In this paper we consider different approaches for estimation and assessment of covariate effects for the cumulative incidence curve in the competing risks model. The classic approach is to model all cause-specific hazards and then estimate the cumulative incidence curve based on these cause...... models that is easy to fit and contains the Fine-Gray model as a special case. One advantage of this approach is that our regression modeling allows for non-proportional hazards. This leads to a new simple goodness-of-fit procedure for the proportional subdistribution hazards assumption that is very easy...... of the flexible regression models to analyze competing risks data when non-proportionality is present in the data....

  19. Regression trees for predicting mortality in patients with cardiovascular disease: What improvement is achieved by using ensemble-based methods?

    Science.gov (United States)

    Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V

    2012-01-01

    In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999

  20. Measuring and modeling of binary mixture effects of pharmaceuticals and nickel on cell viability/cytotoxicity in the human hepatoma derived cell line HepG2

    International Nuclear Information System (INIS)

    Rudzok, S.; Schlink, U.; Herbarth, O.; Bauer, M.

    2010-01-01

    The interaction of drugs and non-therapeutic xenobiotics plays a central role in human health risk assessment. Still, available data are rare. Two different models have been established to predict mixture toxicity from single dose data, namely the concentration addition (CA) and independent action (IA) models. However, chemicals can also act synergistically or antagonistically, or deviate in a dose-level-dependent or dose-ratio-dependent manner. In the present study we used the MIXTOX model (EU project ENV4-CT97-0507), which incorporates these algorithms, to assess effects of the binary mixtures in the human hepatoma cell line HepG2. These cells possess a liver-like enzyme pattern and a variety of xenobiotic-metabolizing enzymes (phases I and II). We tested binary mixtures of the metal nickel, the anti-inflammatory drug diclofenac, and the antibiotic agent irgasan and compared the experimental data to the mathematical models. Cell viability was determined by three different methods: the MTT, AlamarBlue® and NRU assays. The compounds were tested separately and in combinations. We could show that the metal nickel is the dominant component in the mixture, producing antagonism at low dose levels and synergism at high dose levels in combination with diclofenac or irgasan, when using the NRU and the AlamarBlue assays. The dose-response surface of irgasan and diclofenac indicated concentration addition. The experimental data could be described by the algorithms with a regression of up to 90%, revealing the HepG2 cell line and the MIXTOX model as valuable tools for risk assessment of binary mixtures for cytotoxic endpoints. However, the model failed to predict a specific mode of action, the CYP1A1 enzyme activity.
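
    The two reference models named above have simple textbook forms, sketched here for orientation. This is not the MIXTOX implementation: the deviation terms for synergism, antagonism and dose-level or dose-ratio dependence are omitted, and the function and argument names are hypothetical.

```python
def independent_action(effects):
    """Independent action (IA): combined fractional effect of components
    whose individual effects (each between 0 and 1) act independently."""
    unaffected = 1.0
    for effect in effects:
        unaffected *= (1.0 - effect)
    return 1.0 - unaffected

def concentration_addition_ec50(fractions, ec50s):
    """Concentration addition (CA): EC50 of a mixture in which component
    i makes up fractions[i] of the total concentration."""
    return 1.0 / sum(p / ec50 for p, ec50 in zip(fractions, ec50s))
```

    For example, two components that alone cause 20% and 50% loss of viability give `independent_action([0.2, 0.5])`, which is 1 - 0.8 * 0.5 = 0.6. Measured mixture effects above or below such predictions are what the abstract describes as synergism or antagonism.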

  1. Quantifying sediment-associated metal dispersal using Pb isotopes: Application of binary and multivariate mixing models at the catchment-scale

    International Nuclear Information System (INIS)

    Bird, Graham; Brewer, Paul A.; Macklin, Mark G.; Nikolova, Mariyana; Kotsev, Tsvetan; Mollov, Mihail; Swain, Catherine

    2010-01-01

    In this study Pb isotope signatures were used to identify the provenance of contaminant metals and establish patterns of downstream sediment dispersal within the River Maritsa catchment, which is impacted by the mining of polymetallic ores. A two-fold modelling approach was undertaken to quantify sediment-associated metal delivery to the Maritsa catchment; employing binary mixing models in tributary systems and a composite fingerprinting and mixing model approach in the wider Maritsa catchment. Composite fingerprints were determined using Pb isotopic and multi-element geochemical data to characterize sediments delivered from tributary catchments. Application of a mixing model allowed a quantification of the percentage contribution of tributary catchments to the sediment load of the River Maritsa. Sediment delivery from tributaries directly affected by mining activity contributes 42-63% to the sediment load of the River Maritsa, with best-fit regression relationships indicating that sediments originating from mining-affected tributaries are being dispersed over 200 km downstream. - Pb isotopic evidence used to quantify sediment-associated metal delivery within a mining-affected river catchment.
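
    A two end-member (binary) mixing model of the kind mentioned in this abstract reduces to a linear interpolation between source signatures. The sketch below assumes that the isotope ratio mixes linearly, which holds approximately when both sources carry similar Pb concentrations; the function name and the ratio values in the usage note are hypothetical and are not data from the study.

```python
def mixing_fraction(ratio_mixture, ratio_source_a, ratio_source_b):
    """Fraction of the mixture attributed to source A in a binary
    mixing model: f_A = (R_mix - R_B) / (R_A - R_B)."""
    if ratio_source_a == ratio_source_b:
        raise ValueError("end-members must have distinct signatures")
    return (ratio_mixture - ratio_source_b) / (ratio_source_a - ratio_source_b)
```

    With hypothetical isotope ratios of 1.20 for a mining-affected tributary and 1.16 for the background, a downstream sediment with a ratio of 1.18 would be attributed half to each source. A result outside the range 0 to 1 indicates that the sample cannot be explained by these two end-members alone.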

  2. New approach in modeling Cr(VI) sorption onto biomass from metal binary mixtures solutions

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Chang [College of Environmental Science and Engineering, Anhui Normal University, South Jiuhua Road, 189, 241002 Wuhu (China); Chemical Engineering Department, Escola Politècnica Superior, Universitat de Girona, Ma Aurèlia Capmany, 61, 17071 Girona (Spain); Fiol, Núria [Chemical Engineering Department, Escola Politècnica Superior, Universitat de Girona, Ma Aurèlia Capmany, 61, 17071 Girona (Spain); Villaescusa, Isabel, E-mail: Isabel.Villaescusa@udg.edu [Chemical Engineering Department, Escola Politècnica Superior, Universitat de Girona, Ma Aurèlia Capmany, 61, 17071 Girona (Spain); Poch, Jordi [Applied Mathematics Department, Escola Politècnica Superior, Universitat de Girona, Ma Aurèlia Capmany, 61, 17071 Girona (Spain)

    2016-01-15

    In the last decades Cr(VI) sorption equilibrium and kinetic studies have been carried out using several types of biomasses. However, few researchers consider all the simultaneous processes that take place during Cr(VI) sorption (i.e., sorption/reduction of Cr(VI) and simultaneous formation and binding of reduced Cr(III)) when formulating a model that describes the overall sorption process. On the other hand, Cr(VI) scarcely exists alone in wastewaters; it is usually found in mixtures with divalent metals. Therefore, the simultaneous removal of Cr(VI) and divalent metals in binary mixtures and the interactive mechanism governing Cr(VI) elimination have gained more and more attention. In the present work, kinetics of Cr(VI) sorption onto exhausted coffee from Cr(VI)–Cu(II) binary mixtures has been studied in a stirred batch reactor. A model including Cr(VI) sorption and reduction, Cr(III) sorption and the effect of the presence of Cu(II) in these processes has been developed and validated. This study constitutes an important advance in modeling Cr(VI) sorption kinetics especially when chromium sorption is in part based on the sorbent capacity of reducing hexavalent chromium and a metal cation is present in the binary mixture. - Highlights: • A kinetic model including Cr(VI) reduction, Cr(VI) and Cr(III) sorption/desorption • Synergistic effect of Cu(II) on Cr(VI) elimination included in the model • Model validation by checking it against independent sets of data.

  3. New approach in modeling Cr(VI) sorption onto biomass from metal binary mixtures solutions

    International Nuclear Information System (INIS)

    Liu, Chang; Fiol, Núria; Villaescusa, Isabel; Poch, Jordi

    2016-01-01

    In the last decades Cr(VI) sorption equilibrium and kinetic studies have been carried out using several types of biomasses. However, few researchers consider all the simultaneous processes that take place during Cr(VI) sorption (i.e., sorption/reduction of Cr(VI) and simultaneous formation and binding of reduced Cr(III)) when formulating a model that describes the overall sorption process. On the other hand, Cr(VI) scarcely exists alone in wastewaters; it is usually found in mixtures with divalent metals. Therefore, the simultaneous removal of Cr(VI) and divalent metals in binary mixtures and the interactive mechanism governing Cr(VI) elimination have gained more and more attention. In the present work, kinetics of Cr(VI) sorption onto exhausted coffee from Cr(VI)–Cu(II) binary mixtures has been studied in a stirred batch reactor. A model including Cr(VI) sorption and reduction, Cr(III) sorption and the effect of the presence of Cu(II) in these processes has been developed and validated. This study constitutes an important advance in modeling Cr(VI) sorption kinetics especially when chromium sorption is in part based on the sorbent capacity of reducing hexavalent chromium and a metal cation is present in the binary mixture. - Highlights: • A kinetic model including Cr(VI) reduction, Cr(VI) and Cr(III) sorption/desorption • Synergistic effect of Cu(II) on Cr(VI) elimination included in the model • Model validation by checking it against independent sets of data

  4. Maximum Entropy Discrimination Poisson Regression for Software Reliability Modeling.

    Science.gov (United States)

    Chatzis, Sotirios P; Andreou, Andreas S

    2015-11-01

    Reliably predicting software defects is one of the most significant tasks in software engineering. Two of the major components of modern software reliability modeling approaches are: 1) extraction of salient features for software system representation, based on appropriately designed software metrics and 2) development of intricate regression models for count data, to allow effective software reliability data modeling and prediction. Surprisingly, research in the latter frontier of count data regression modeling has been rather limited. More specifically, a lack of simple and efficient algorithms for posterior computation has made the Bayesian approaches appear unattractive, and thus underdeveloped in the context of software reliability modeling. In this paper, we try to address these issues by introducing a novel Bayesian regression model for count data, based on the concept of max-margin data modeling, effected in the context of a fully Bayesian model treatment with simple and efficient posterior distribution updates. Our novel approach yields a more discriminative learning technique, making more effective use of our training data during model inference. In addition, it allows better handling of uncertainty in the modeled data, which can be a significant problem when the training data are limited. We derive elegant inference algorithms for our model under the mean-field paradigm and exhibit its effectiveness using publicly available benchmark data sets.

  5. Modelling fourier regression for time series data- a case study: modelling inflation in foods sector in Indonesia

    Science.gov (United States)

    Prahutama, Alan; Suparti; Wahyu Utami, Tiani

    2018-03-01

    Regression analysis models the relationship between response variables and predictor variables. The parametric approach to regression imposes strict assumptions on the model, whereas nonparametric regression does not require an assumed model form. Time series data are observations of a variable recorded over time, so to model a time series by regression, the response and predictor variables must first be determined. The response variable is the value at time t (y_t), while the predictor variables are the significant lags. In nonparametric regression modeling, one developing approach is the Fourier series approach. One advantage of nonparametric regression using Fourier series is its ability to handle data with a trigonometric pattern. Modeling with Fourier series requires the parameter K, whose value can be determined by the Generalized Cross Validation method. In inflation modeling for the transportation, communication and financial services sector, the Fourier series approach yields an optimal K of 120 parameters with an R-square of 99%, whereas multiple linear regression yields an R-square of 90%.
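    The selection of K by Generalized Cross Validation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the design matrix (intercept, linear trend, and K cosine/sine pairs), the use of a single predictor t instead of significant lags, the simulated data, and the helper names `fourier_design`, `gcv_score` and `select_K` are all hypothetical.

```python
import numpy as np

def fourier_design(t, K):
    """Design matrix with an intercept, a linear term, and K cos/sin pairs.

    The inclusion of the linear term is an assumption of this sketch.
    """
    cols = [np.ones_like(t), t]
    for k in range(1, K + 1):
        cols.append(np.cos(k * t))
        cols.append(np.sin(k * t))
    return np.column_stack(cols)

def gcv_score(y, X):
    """GCV = (RSS / n) / (1 - trace(H) / n)^2 for a least-squares fit."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)
    # For ordinary least squares, trace(H) equals the rank of X.
    trace_h = np.linalg.matrix_rank(X)
    return (rss / n) / (1.0 - trace_h / n) ** 2

def select_K(y, t, K_max):
    """Return the K in 1..K_max with the smallest GCV score, plus all scores."""
    scores = {K: gcv_score(y, fourier_design(t, K)) for K in range(1, K_max + 1)}
    return min(scores, key=scores.get), scores

# Simulated series (hypothetical data): trend plus one periodic component.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 200)
y = 1.0 + 0.5 * t + 2.0 * np.sin(3.0 * t) + rng.normal(scale=0.3, size=t.size)

best_K, scores = select_K(y, t, K_max=8)
print("selected K:", best_K)
```

    Because the simulated signal contains a sin(3t) term, values of K below 3 leave a large residual and receive a poor GCV score; the penalty in the denominator then discourages unnecessarily large K.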

  6. Bayesian Inference of a Multivariate Regression Model

    Directory of Open Access Journals (Sweden)

    Marick S. Sinay

    2014-01-01

    Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.

  7. General regression and representation model for classification.

    Directory of Open Access Journals (Sweden)

    Jianjun Qian

    Full Text Available Recently, the regularized coding-based classification methods (e.g. SRC and CRC) show a great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has advantages of CRC, but also makes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients) and the specific information (weight matrix of image pixels) to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: basic general regression and representation classifier (B-GRR) and robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of the proposed methods over state-of-the-art algorithms.

  8. Modeling AGN outbursts from supermassive black hole binaries

    Directory of Open Access Journals (Sweden)

    Tanaka T.

    2012-12-01

    Full Text Available When galaxies merge to assemble more massive galaxies, their nuclear supermassive black holes (SMBHs) should form bound binaries. As these interact with their stellar and gaseous environments, they will become increasingly compact, culminating in inspiral and coalescence through the emission of gravitational radiation. Because galaxy mergers and interactions are also thought to fuel star formation and nuclear black hole activity, it is plausible that such binaries would lie in gas-rich environments and power active galactic nuclei (AGN). The primary difference is that these binaries have gravitational potentials that vary – through their orbital motion as well as their orbital evolution – on humanly tractable timescales, and are thus excellent candidates to give rise to coherent AGN variability in the form of outbursts and recurrent transients. Although such electromagnetic signatures would be ideally observed concomitantly with the binary’s gravitational-wave signatures, they are also likely to be discovered serendipitously in wide-field, high-cadence surveys; some may even be confused for stellar tidal disruption events. I discuss several types of possible “smoking gun” AGN signatures caused by the peculiar geometry predicted for accretion disks around SMBH binaries.

  9. A test for the parameters of multiple linear regression models ...

    African Journals Online (AJOL)

    A test for the parameters of multiple linear regression models is developed for conducting tests simultaneously on all the parameters of multiple linear regression models. The test is robust relative to the assumptions of homogeneity of variances and absence of serial correlation of the classical F-test. Under certain null and ...

  10. Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.

    Science.gov (United States)

    Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko

    2016-03-01

    In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Theoretical model of the density of states of random binary alloys

    International Nuclear Information System (INIS)

    Zekri, N.; Brezini, A.

    1991-09-01

    A theoretical formulation of the density of states for random binary alloys is examined based on a mean field treatment. The present model includes both diagonal and off-diagonal disorder and also short-range order. Extensive results are reported for various concentrations and compared to other calculations. (author). 22 refs, 6 figs

  12. Soot modeling of counterflow diffusion flames of ethylene-based binary mixture fuels

    KAUST Repository

    Wang, Yu; Raj, Abhijeet Dhayal; Chung, Suk-Ho

    2015-01-01

    of ethylene and its binary mixtures with methane, ethane and propane based on the method of moments. The soot model has 36 soot nucleation reactions from 8 PAH molecules including pyrene and larger PAHs. Soot surface growth reactions were based on a modified

  13. Economic modeling using evolutionary algorithms : the effect of binary encoding of strategies

    NARCIS (Netherlands)

    Waltman, L.R.; Eck, van N.J.; Dekker, Rommert; Kaymak, U.

    2011-01-01

    We are concerned with evolutionary algorithms that are employed for economic modeling purposes. We focus in particular on evolutionary algorithms that use a binary encoding of strategies. These algorithms, commonly referred to as genetic algorithms, are popular in agent-based computational economics

  14. MESA models for the evolutionary status of the epsilon Aurigae disk-eclipsed binary system

    Science.gov (United States)

    Stencel, Robert E.; Gibson, Justus

    2018-06-01

    The brightest member of the class of disk-eclipsed binary stars is the Algol-like long-period binary, epsilon Aurigae (HD 31964, F0Iap + disk, http://adsabs.harvard.edu/abs/2016SPIE.9907E..17S ). Using MESA (Modules for Experiments in Stellar Astrophysics, version 9575), we have made an evaluation of its evolutionary state. We sought to satisfy several observational constraints, including: (1) requiring evolutionary tracks to pass close to the current temperature and luminosity of the primary star; (2) obtaining a period near the observed value of 27.1 years; (3) matching a mass function of 3.0; (4) concurrent Roche lobe overflow and mass transfer; (5) an isotopic ratio 12C / 13C = 5 and, (6) matching the interferometrically determined angular diameter. A MESA model starting with binary masses of 9.85 + 4.5 solar masses, with a 100 day initial period, produces a 1.2 + 10.6 solar masses result having a 547 day period, plus a single digit 12C / 13C ratio. These values were reached near an age of 20 Myr, when the donor star comes close to the observed luminosity and temperature for epsilon Aurigae A, as a post-RGB/pre-AGB star. Contemporaneously, the accretor then appears as an upper main sequence, early B-type star. This benchmark model can provide a basis for further exploration of this interacting binary, and other long period binary stars. This report has been submitted to MNRAS, along with a parallel investigation of mass transfer stream and disk sub-structure. The authors are grateful to the estate of William Herschel Womble for the support of astronomy at the University of Denver.

  15. Massive Black Hole Binary Evolution

    Directory of Open Access Journals (Sweden)

    Merritt David

    2005-11-01

    Full Text Available Coalescence of binary supermassive black holes (SBHs) would constitute the strongest sources of gravitational waves to be observed by LISA. While the formation of binary SBHs during galaxy mergers is almost inevitable, coalescence requires that the separation between binary components first drop by a few orders of magnitude, due presumably to interaction of the binary with stars and gas in a galactic nucleus. This article reviews the observational evidence for binary SBHs and discusses how they would evolve. No completely convincing case of a bound, binary SBH has yet been found, although a handful of systems (e.g. interacting galaxies; remnants of galaxy mergers) are now believed to contain two SBHs at projected separations of <~ 1kpc. N-body studies of binary evolution in gas-free galaxies have reached large enough particle numbers to reproduce the slow, “diffusive” refilling of the binary’s loss cone that is believed to characterize binary evolution in real galactic nuclei. While some of the results of these simulations - e.g. the binary hardening rate and eccentricity evolution - are strongly N-dependent, others - e.g. the “damage” inflicted by the binary on the nucleus - are not. Luminous early-type galaxies often exhibit depleted cores with masses of ~ 1-2 times the mass of their nuclear SBHs, consistent with the predictions of the binary model. Studies of the interaction of massive binaries with gas are still in their infancy, although much progress is expected in the near future. Binary coalescence has a large influence on the spins of SBHs, even for mass ratios as extreme as 10:1, and evidence of spin-flips may have been observed.

  16. Modelling of volumetric properties of binary and ternary mixtures by CEOS, CEOS/GE and empirical models

    Directory of Open Access Journals (Sweden)

    BOJAN D. DJORDJEVIC

    2007-12-01

    Full Text Available Although many cubic equations of state coupled with van der Waals-one fluid mixing rules including temperature dependent interaction parameters are sufficient for representing phase equilibria and excess properties (excess molar enthalpy HE, excess molar volume VE, etc.), difficulties appear in the correlation and prediction of thermodynamic properties of complex mixtures at various temperature and pressure ranges. Great progress has been made by a new approach based on CEOS/GE models. This paper reviews the progress achieved over the last six years in modelling of the volumetric properties for complex binary and ternary systems of non-electrolytes by the CEOS and CEOS/GE approaches. In addition, the vdW1 and TCBT models were used to estimate the excess molar volume VE of ternary systems methanol + chloroform + benzene and 1-propanol + chloroform + benzene, as well as the corresponding binaries methanol + chloroform, chloroform + benzene, 1-propanol + chloroform and 1-propanol + benzene at 288.15–313.15 K and atmospheric pressure. Also, prediction of VE for both ternaries by empirical models (Radojković, Kohler, Jackob–Fitzner, Colinet, Tsao–Smith, Toop, Scatchard, Rastogi) was performed.

  17. Theoretical studies of binaries in astrophysics

    Science.gov (United States)

    Dischler, Johann Sebastian

    This thesis introduces and summarizes four papers dealing with computer simulations of astrophysical processes involving binaries. The first part gives the rationale and theoretical background to these papers. In papers I and II a statistical approach to studying eclipsing binaries is described. By using population synthesis models for binaries the probabilities for eclipses are calculated for different luminosity classes of binaries. These are compared with Hipparcos data and they agree well if one uses a standard input distribution for the orbit sizes. If one uses a random pairing model, where both companions are independently picked from an IMF, one finds too few eclipsing binaries by an order of magnitude. In paper III we investigate a possible scenario for the origin of the stars observed close to the centre of our galaxy, called S stars. We propose that a cluster falls radially towards the central black hole. The binaries within the cluster can then, if they have small impact parameters, be broken up by the black hole's tidal field and one of the components of the binary will be captured by the black hole. Paper IV investigates how the onset of mass transfer in eccentric binaries depends on the eccentricity. To do this we have developed a new two-phase SPH scheme where very light particles are at the outer edge of our simulated star. This enables us to get a much better resolution of the very small mass that is transferred in close binaries. Our simulations show that the minimum required distance between the stars to have mass transfer decreases with the eccentricity.

  18. Application of two-dimensional binary fingerprinting methods for the design of selective Tankyrase I inhibitors.

    Science.gov (United States)

    Muddukrishna, B S; Pai, Vasudev; Lobo, Richard; Pai, Aravinda

    2017-11-22

    In the present study, five important binary fingerprinting techniques were used to model novel flavones for the selective inhibition of Tankyrase I. Of the fingerprints used, the atom-pairs fingerprint resulted in a statistically significant 2D QSAR model using a kernel-based partial least square regression method. This model indicates that the presence of electron-donating groups positively contributes to activity, whereas the presence of electron-withdrawing groups negatively contributes to activity. This model could be used to develop more potent as well as selective analogues for the inhibition of Tankyrase I. Schematic representation of 2D QSAR work flow.

  19. Logistic regression models

    CERN Document Server

    Hilbe, Joseph M

    2009-01-01

    This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect: great clarity. The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … .-Annette J. Dobson, Biometric...

  20. Linking Simple Economic Theory Models and the Cointegrated Vector AutoRegressive Model

    DEFF Research Database (Denmark)

    Møller, Niels Framroze

    This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector-Auto-Regressive model (CVAR). By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail, how the theoretical model and its stru... ..., it is demonstrated how other controversial hypotheses such as Rational Expectations can be formulated directly as restrictions on the CVAR-parameters. A simple example of a "Neoclassical synthetic" AS-AD model is also formulated. Finally, the partial- general equilibrium distinction is related to the CVAR as well... Further fundamental extensions and advances to more sophisticated theory models, such as those related to dynamics and expectations (in the structural relations) are left for future papers...

  1. Multiple Response Regression for Gaussian Mixture Models with Known Labels.

    Science.gov (United States)

    Lee, Wonyul; Du, Ying; Sun, Wei; Hayes, D Neil; Liu, Yufeng

    2012-12-01

    Multiple response regression is a useful regression technique to model multiple response variables using the same set of predictor variables. Most existing methods for multiple response regression are designed for modeling homogeneous data. In many applications, however, one may have heterogeneous data where the samples are divided into multiple groups. Our motivating example is a cancer dataset where the samples belong to multiple cancer subtypes. In this paper, we consider modeling the data coming from a mixture of several Gaussian distributions with known group labels. A naive approach is to split the data into several groups according to the labels and model each group separately. Although it is simple, this approach ignores potential common structures across different groups. We propose new penalized methods to model all groups jointly in which the common and unique structures can be identified. The proposed methods estimate the regression coefficient matrix, as well as the conditional inverse covariance matrix of response variables. Asymptotic properties of the proposed methods are explored. Through numerical examples, we demonstrate that both estimation and prediction can be improved by modeling all groups jointly using the proposed methods. An application to a glioblastoma cancer dataset reveals some interesting common and unique gene relationships across different cancer subtypes.

  2. Extending the linear model with R generalized linear, mixed effects and nonparametric regression models

    CERN Document Server

    Faraway, Julian J

    2005-01-01

    Linear models are central to the practice of statistics and form the foundation of a vast range of statistical methodologies. Julian J. Faraway's critically acclaimed Linear Models with R examined regression and analysis of variance, demonstrated the different methods available, and showed in which situations each one applies. Following in those footsteps, Extending the Linear Model with R surveys the techniques that grow from the regression model, presenting three extensions to that framework: generalized linear models (GLMs), mixed effect models, and nonparametric regression models. The author's treatment is thoroughly modern and covers topics that include GLM diagnostics, generalized linear mixed models, trees, and even the use of neural networks in statistics. To demonstrate the interplay of theory and practice, throughout the book the author weaves the use of the R software environment to analyze the data of real examples, providing all of the R commands necessary to reproduce the analyses. All of the ...

  3. Bayesian approach to errors-in-variables in regression models

    Science.gov (United States)

    Rozliman, Nur Aainaa; Ibrahim, Adriana Irawati Nur; Yunus, Rossita Mohammad

    2017-05-01

    In many applications and experiments, data sets are often contaminated with error or contain mismeasured covariates. When at least one of the covariates in a model is measured with error, an Errors-in-Variables (EIV) model can be used. Measurement error, when not corrected, would cause misleading statistical inferences and analysis. Therefore, our goal is to examine the relationship of the outcome variable and the unobserved exposure variable given the observed mismeasured surrogate by applying the Bayesian formulation to the EIV model. We shall extend the flexible parametric method proposed by Hossain and Gustafson (2009) to another nonlinear regression model, namely the Poisson regression model. We shall then illustrate the application of this approach via a simulation study using Markov chain Monte Carlo sampling methods.

  4. Interacting binaries

    International Nuclear Information System (INIS)

    Eggleton, P.P.; Pringle, J.E.

    1985-01-01

    This volume contains 15 review articles in the field of binary stars. The subjects reviewed span considerably, from the shortest period of interacting binaries to the longest, symbiotic stars. Also included are articles on Algols, X-ray binaries and Wolf-Rayet stars (single and binary). Contents: Preface. List of Participants. Activity of Contact Binary Systems. Wolf-Rayet Stars and Binarity. Symbiotic Stars. Massive X-ray Binaries. Stars that go Hump in the Night: The SU UMa Stars. Interacting Binaries - Summing Up

  5. Thermodynamic properties of binary mixtures containing dimethyl carbonate+2-alkanol: Experimental data, correlation and prediction by ERAS model and cubic EOS

    International Nuclear Information System (INIS)

    Almasi, Mohammad

    2013-01-01

    Densities and viscosities for binary mixtures of dimethyl carbonate with 2-propanol up to 2-heptanol were measured at various temperatures and ambient pressure. From experimental data, excess molar volumes, V_m^E, were calculated and correlated by the Redlich–Kister equation to obtain the binary coefficients and the standard deviations. Excess molar volumes, V_m^E, are positive for all studied mixtures over the entire range of the mole fraction. The ERAS-model has been applied for describing the binary excess molar volumes and also the Peng–Robinson–Stryjek–Vera (PRSV) equation of state (EOS) has been used to predict the binary excess molar volumes and viscosities. Also several semi-empirical models were used to correlate the viscosity of binary mixtures.
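    For orientation, the quantities named in this abstract are commonly written as below. This is the standard textbook form, given as an assumption; the abstract does not state the exact expressions or the number of coefficients the author used. Here x_1, x_2 are mole fractions, M_1, M_2 molar masses, \rho, \rho_1, \rho_2 the densities of the mixture and of the pure components, A_k the fitted binary coefficients, N the number of data points and p the number of fitted coefficients.

```latex
% Excess molar volume computed from measured densities
\[
V_m^E = \frac{x_1 M_1 + x_2 M_2}{\rho} - \frac{x_1 M_1}{\rho_1} - \frac{x_2 M_2}{\rho_2}
\]

% Redlich–Kister correlation with binary coefficients A_k
\[
V_m^E = x_1 x_2 \sum_{k=0}^{n} A_k \,(x_1 - x_2)^k
\]

% Standard deviation of the fit (p = n + 1 coefficients)
\[
\sigma = \left[ \frac{\sum_{i=1}^{N} \left( V^E_{m,\mathrm{exp},i} - V^E_{m,\mathrm{cal},i} \right)^2}{N - p} \right]^{1/2}
\]
```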

  6. The crux of the method: assumptions in ordinary least squares and logistic regression.

    Science.gov (United States)

    Long, Rebecca G

    2008-10-01

    Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.
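    One commonly cited reason for preferring logistic regression with a binary dependent variable can be illustrated with a small simulation. The sketch below is not taken from the paper; the simulated data, the coefficient values and the helper names `fit_ols` and `fit_logistic` are assumptions made for illustration. It shows that ordinary least squares applied to a 0/1 outcome can produce fitted "probabilities" outside [0, 1], while the logistic fit cannot.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares coefficients (linear probability model)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson on the log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)                      # Bernoulli variance weights
        grad = X.T @ (y - p)                   # score vector
        hess = X.T @ (X * w[:, None])          # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Hypothetical data: one predictor, true model is logistic.
rng = np.random.default_rng(1)
x = rng.uniform(-4.0, 4.0, size=500)
X = np.column_stack([np.ones_like(x), x])
p_true = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * x)))
y = rng.binomial(1, p_true).astype(float)

ols_pred = X @ fit_ols(X, y)
beta_logit = fit_logistic(X, y)
logit_pred = 1.0 / (1.0 + np.exp(-(X @ beta_logit)))

print("OLS fitted range:     ", ols_pred.min(), ols_pred.max())
print("logistic fitted range:", logit_pred.min(), logit_pred.max())
```

    The OLS residuals for a 0/1 outcome are also necessarily heteroscedastic, since the variance of a Bernoulli response, p(1 - p), changes with the predictor.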

  7. PREDICTION OF THE MIXING ENTHALPIES OF BINARY LIQUID ALLOYS BY MOLECULAR INTERACTION VOLUME MODEL

    Institute of Scientific and Technical Information of China (English)

    H.W.Yang; D.P.Tao; Z.H.Zhou

    2008-01-01

    The mixing enthalpies of 23 binary liquid alloys are calculated by the molecular interaction volume model (MIVM), which is a two-parameter model with the partial molar infinite dilute mixing enthalpies. The predicted values are in agreement with the experimental data, which indicates that the model is reliable and convenient.

  8. A re-examination of thermodynamic modelling of U-Ru binary phase diagram

    Energy Technology Data Exchange (ETDEWEB)

    Wang, L.C.; Kaye, M.H., E-mail: matthew.kaye@uoit.ca [University of Ontario Institute of Technology, Oshawa, ON (Canada)

    2015-07-01

    Ruthenium (Ru) is one of the more abundant fission products (FPs) both in fast breeder reactors and thermal reactors. Post irradiation examinations (PIE) show that both 'the white metallic phase' (Mo-Tc-Ru-Rh-Pd) and 'the other metallic phase' (U(Pd-Rh-Ru)3) are present in spent nuclear fuels. To describe this quaternary system, binary subsystems of uranium (U) with Pd, Rh, and Ru are necessary. Presently, only the U-Ru system has been thermodynamically described but with some problems. As part of research on U-Ru-Rh-Pd quaternary system, an improved consistent thermodynamic model describing the U-Ru binary phase diagram has been obtained. (author)

  9. Modeling Fire Occurrence at the City Scale: A Comparison between Geographically Weighted Regression and Global Linear Regression.

    Science.gov (United States)

    Song, Chao; Kwan, Mei-Po; Zhu, Jiping

    2017-04-08

    An increasing number of fires are occurring with the rapid development of cities, resulting in increased risk for human beings and the environment. This study compares geographically weighted regression-based models, namely geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), which integrates spatial and temporal effects, with global linear regression models (LM) for modeling fire risk at the city scale. The results show that the road density and the spatial distribution of enterprises have the strongest influences on fire risk, which implies that we should focus on areas where roads and enterprises are densely clustered. In addition, locations with a large number of enterprises have fewer fire ignition records, probably because of strict management and prevention measures. A changing number of significant variables across space indicates that heterogeneity mainly exists in the northern and eastern rural and suburban areas of Hefei city, where human-related facilities or road construction are only clustered in the city sub-centers. GTWR can capture small changes in the spatiotemporal heterogeneity of the variables while GWR and LM cannot. An approach that integrates space and time enables us to better understand the dynamic changes in fire risk. Thus governments can use the results to manage fire safety at the city scale.

  10. Direction of Effects in Multiple Linear Regression Models.

    Science.gov (United States)

    Wiedermann, Wolfgang; von Eye, Alexander

    2015-01-01

    Previous studies analyzed asymmetric properties of the Pearson correlation coefficient using higher than second order moments. These asymmetric properties can be used to determine the direction of dependence in a linear regression setting (i.e., establish which of two variables is more likely to be on the outcome side) within the framework of cross-sectional observational data. Extant approaches are restricted to the bivariate regression case. The present contribution extends the direction of dependence methodology to a multiple linear regression setting by analyzing distributional properties of residuals of competing multiple regression models. It is shown that, under certain conditions, the third central moments of estimated regression residuals can be used to decide upon direction of effects. In addition, three different approaches for statistical inference are discussed: a combined D'Agostino normality test, a skewness difference test, and a bootstrap difference test. Type I error and power of the procedures are assessed using Monte Carlo simulations, and an empirical example is provided for illustrative purposes. In the discussion, issues concerning the quality of psychological data, possible extensions of the proposed methods to the fourth central moment of regression residuals, and potential applications are addressed.
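    The idea of comparing residual skewness between competing regression models can be sketched with a simulation. This is only an illustration under stated assumptions, not the authors' procedure: the data-generating model (a skewed predictor x, a normal covariate z, normal errors), the coefficient values and the helper names `residuals` and `skewness` are hypothetical, and none of the three inference approaches named in the abstract is implemented.

```python
import numpy as np

def residuals(y, X):
    """Least-squares residuals of y regressed on X plus an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return y - X1 @ beta

def skewness(r):
    """Sample skewness: third central moment over variance^(3/2)."""
    c = r - r.mean()
    return float(np.mean(c ** 3) / np.mean(c ** 2) ** 1.5)

# Hypothetical data in which x (skewed) truly affects y; z is a covariate.
rng = np.random.default_rng(2)
n = 5000
x = rng.exponential(1.0, n)
z = rng.normal(size=n)
e = rng.normal(size=n)
y = 0.7 * x + 0.5 * z + e

# Competing models: y ~ x + z (correct direction) and x ~ y + z (reversed).
skew_correct = skewness(residuals(y, np.column_stack([x, z])))
skew_reversed = skewness(residuals(x, np.column_stack([y, z])))

print("residual skewness, y ~ x + z:", skew_correct)
print("residual skewness, x ~ y + z:", skew_reversed)
```

    In this setup the residuals of the correctly directed model are close to symmetric, while the residuals of the reversed model inherit skewness from x, so the model with the smaller absolute residual skewness points to the more plausible direction.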

  11. Relativistic Binaries in Globular Clusters

    Directory of Open Access Journals (Sweden)

    Matthew J. Benacquista

    2013-03-01

    Full Text Available Galactic globular clusters are old, dense star systems typically containing 10^4 – 10^6 stars. As an old population of stars, globular clusters contain many collapsed and degenerate objects. As a dense population of stars, globular clusters are the scene of many interesting close dynamical interactions between stars. These dynamical interactions can alter the evolution of individual stars and can produce tight binary systems containing one or two compact objects. In this review, we discuss theoretical models of globular cluster evolution and binary evolution, techniques for simulating this evolution that leads to relativistic binaries, and current and possible future observational evidence for this population. Our discussion of globular cluster evolution will focus on the processes that boost the production of tight binary systems and the subsequent interaction of these binaries that can alter the properties of both bodies and can lead to exotic objects. Direct N-body integrations and Fokker–Planck simulations of the evolution of globular clusters that incorporate tidal interactions and lead to predictions of relativistic binary populations are also discussed. We discuss the current observational evidence for cataclysmic variables, millisecond pulsars, and low-mass X-ray binaries as well as possible future detection of relativistic binaries with gravitational radiation.

  12. A Monte Carlo simulation study comparing linear regression, beta regression, variable-dispersion beta regression and fractional logit regression at recovering average difference measures in a two sample design.

    Science.gov (United States)

    Meaney, Christopher; Moineddin, Rahim

    2014-01-24

    In biomedical research, response variables are often encountered which have bounded support on the open unit interval--(0,1). Traditionally, researchers have attempted to estimate covariate effects on these types of response data using linear regression. Alternative modelling strategies may include: beta regression, variable-dispersion beta regression, and fractional logit regression models. This study employs a Monte Carlo simulation design to compare the statistical properties of the linear regression model to that of the more novel beta regression, variable-dispersion beta regression, and fractional logit regression models. In the Monte Carlo experiment we assume a simple two sample design. We assume observations are realizations of independent draws from their respective probability models. The randomly simulated draws from the various probability models are chosen to emulate average proportion/percentage/rate differences of pre-specified magnitudes. Following simulation of the experimental data we estimate average proportion/percentage/rate differences. We compare the estimators in terms of bias, variance, type-1 error and power. Estimates of Monte Carlo error associated with these quantities are provided. If response data are beta distributed with constant dispersion parameters across the two samples, then all models are unbiased and have reasonable type-1 error rates and power profiles. If the response data in the two samples have different dispersion parameters, then the simple beta regression model is biased. When the sample size is small (N0 = N1 = 25) linear regression has superior type-1 error rates compared to the other models. Small sample type-1 error rates can be improved in beta regression models using bias correction/reduction methods. In the power experiments, variable-dispersion beta regression and fractional logit regression models have slightly elevated power compared to linear regression models. Similar results were observed if the

  13. Tutorial on Using Regression Models with Count Outcomes Using R

    Directory of Open Access Journals (Sweden)

    A. Alexander Beaujean

    2016-02-01

    Full Text Available Education researchers often study count variables, such as times a student reached a goal, discipline referrals, and absences. Most researchers that study these variables use typical regression methods (i.e., ordinary least-squares), either with or without transforming the count variables. In either case, using typical regression for count data can produce parameter estimates that are biased, thus diminishing any inferences made from such data. As count-variable regression models are seldom taught in training programs, we present a tutorial to help educational researchers use such methods in their own research. We demonstrate analyzing and interpreting count data using Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial regression models. The count regression methods are introduced through an example using the number of times students skipped class. The data for this example are freely available and the R syntax used to run the example analyses is included in the Appendix.

  14. Ternary and binary LLE measurements for solvent (2-methyltetrahydrofuran and cyclopentyl methyl ether) + furfural + water between 298 and 343 K

    International Nuclear Information System (INIS)

    Männistö, Mikael; Pokki, Juha-Pekka; Fournis, Ludivine; Alopaeus, Ville

    2017-01-01

    Highlights: • Novel LLE of 2-methyltetrahydrofuran or cyclopentyl methyl ether + furfural + water. • High performance solvents for liquid-liquid extraction exhibited. • Modelled with UNIQUAC-HOC activity coefficient model. • Comparison to other industrial solvents with distribution coefficient and selectivity. - Abstract: The suitability of two solvents for the extraction of furfural from aqueous streams is assessed through novel ternary and binary liquid-liquid equilibria data for mixtures of solvent (2-methyltetrahydrofuran or cyclopentyl methyl ether) + furfural + water. The measured data are reported along with regressed binary interaction parameters for UNIQUAC-HOC activity coefficient model and further analyzed through distribution coefficients and selectivity for furfural. Out of the two solvents, cyclopentyl methyl ether presents a very high selectivity along with good distribution coefficient in the entire temperature range.

  15. Statistical approach for selection of regression model during validation of bioanalytical method

    Directory of Open Access Journals (Sweden)

    Natalija Nakov

    2014-06-01

    Full Text Available The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during bioanalytical method validation. Given the wide concentration range frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit using nonparametric statistical tests: One sample rank test and Wilcoxon signed rank test for two independent groups of samples. The results obtained with the One sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because only slight differences between the errors (presented through the relative residuals, RR) were obtained. Estimation of the significance of the differences in the RR was achieved using the Wilcoxon signed rank test, where linear and quadratic regression models were treated as two independent groups. The application of this simple non-parametric statistical test provides statistical confirmation of the choice of an adequate regression model.

  16. Linear regression models for quantitative assessment of left ...

    African Journals Online (AJOL)

    Changes in left ventricular structures and function have been reported in cardiomyopathies. No prediction models have been established in this environment. This study established regression models for prediction of left ventricular structures in normal subjects. A sample of normal subjects was drawn from a large urban ...

  17. Coevolution of Binaries and Circumbinary Gaseous Disks

    Science.gov (United States)

    Fleming, David; Quinn, Thomas R.

    2018-04-01

    The recent discoveries of circumbinary planets by Kepler raise questions for contemporary planet formation models. Understanding how these planets form requires characterizing their formation environment, the circumbinary protoplanetary disk, and how the disk and binary interact. The central binary excites resonances in the surrounding protoplanetary disk that drive evolution in both the binary orbital elements and in the disk. To probe how these interactions impact both binary eccentricity and disk structure evolution, we ran N-body smooth particle hydrodynamics (SPH) simulations of gaseous protoplanetary disks surrounding binaries based on Kepler 38 for 10^4 binary orbital periods for several initial binary eccentricities. We find that nearly circular binaries weakly couple to the disk via a parametric instability and excite disk eccentricity growth. Eccentric binaries strongly couple to the disk causing eccentricity growth for both the disk and binary. Disks around sufficiently eccentric binaries strongly couple to the disk and develop an m = 1 spiral wave launched from the 1:3 eccentric outer Lindblad resonance (EOLR). This wave corresponds to an alignment of gas particle longitude of periastrons. We find that in all simulations, the binary semi-major axis decays due to dissipation from the viscous disk.

  18. Reduced Rank Regression

    DEFF Research Database (Denmark)

    Johansen, Søren

    2008-01-01

    The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...

  19. Blind Separation of Acoustic Signals Combining SIMO-Model-Based Independent Component Analysis and Binary Masking

    Directory of Open Access Journals (Sweden)

    Hiekata Takashi

    2006-01-01

    Full Text Available A new two-stage blind source separation (BSS) method for convolutive mixtures of speech is proposed, in which a single-input multiple-output (SIMO)-model-based independent component analysis (ICA) and a new SIMO-model-based binary masking are combined. SIMO-model-based ICA enables us to separate the mixed signals, not into monaural source signals but into SIMO-model-based signals from independent sources in their original form at the microphones. Thus, the separated signals of SIMO-model-based ICA can maintain the spatial qualities of each sound source. Owing to this attractive property, our novel SIMO-model-based binary masking can be applied to efficiently remove the residual interference components after SIMO-model-based ICA. The experimental results reveal that the separation performance can be considerably improved by the proposed method compared with that achieved by conventional BSS methods. In addition, the real-time implementation of the proposed BSS is illustrated.

  20. Poisson regression for modeling count and frequency outcomes in trauma research.

    Science.gov (United States)

    Gagnon, David R; Doron-LaMarca, Susan; Bell, Margret; O'Farrell, Timothy J; Taft, Casey T

    2008-10-01

    The authors describe how the Poisson regression method for analyzing count or frequency outcome variables can be applied in trauma studies. The outcome of interest in trauma research may represent a count of the number of incidents of behavior occurring in a given time interval, such as acts of physical aggression or substance abuse. Traditional linear regression approaches assume a normally distributed outcome variable with equal variances over the range of predictor variables, and may not be optimal for modeling count outcomes. An application of Poisson regression is presented using data from a study of intimate partner aggression among male patients in an alcohol treatment program and their female partners. Results of Poisson regression and linear regression models are compared.
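
    To make the method concrete, here is a minimal sketch of a Poisson regression with a log link and a single predictor, fitted by Newton-Raphson. The function name, the starting values, and the toy counts are assumptions for illustration; they are not the authors' data or software, and a real analysis would normally rely on an established statistics package.

```python
import math


def fit_poisson(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix entries.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical incident counts for two groups (x = 0 and x = 1).
group = [0, 0, 0, 1, 1, 1]
counts = [1, 2, 3, 4, 6, 8]
b0, b1 = fit_poisson(group, counts)
rate_ratio = math.exp(b1)  # multiplicative change in the expected count
print(round(math.exp(b0), 3), round(rate_ratio, 3))
```

    With a single binary predictor the fitted means reproduce the two group means, so exp(b1) is simply the ratio of the mean counts. That interpretation as a rate ratio is what distinguishes the Poisson coefficients from the additive slope of a linear regression.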

  1. Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate

    Directory of Open Access Journals (Sweden)

    Minh Vu Trieu

    2017-03-01

    Full Text Available This paper presents statistical analyses of rock engineering properties and the measured penetration rate of tunnel boring machine (TBM) based on the data of an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), rock brittleness index (BI), the distance between planes of weakness (DPW), and the alpha angle (Alpha) between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP). Four (4) statistical regression models (two linear and two nonlinear) are built to predict the ROP of TBM. Finally a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict the TBM performance. The R-squared value (R2) of the fuzzy logic model scores the highest value of 0.714 over the second runner-up of 0.667 from the multiple variables nonlinear regression model.

  2. Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate

    Science.gov (United States)

    Minh, Vu Trieu; Katushin, Dmitri; Antonov, Maksim; Veinthal, Renno

    2017-03-01

    This paper presents statistical analyses of rock engineering properties and the measured penetration rate of tunnel boring machine (TBM) based on the data of an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), rock brittleness index (BI), the distance between planes of weakness (DPW), and the alpha angle (Alpha) between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP). Four (4) statistical regression models (two linear and two nonlinear) are built to predict the ROP of TBM. Finally a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict the TBM performance. The R-squared value (R2) of the fuzzy logic model scores the highest value of 0.714 over the second runner-up of 0.667 from the multiple variables nonlinear regression model.

  3. Deep ensemble learning of sparse regression models for brain disease diagnosis.

    Science.gov (United States)

    Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang

    2017-04-01

    Recent studies on brain imaging analysis witnessed the core roles of machine learning techniques in computer-assisted intervention for brain disease diagnosis. Of various machine-learning techniques, sparse regression models have proved their effectiveness in handling high-dimensional data but with a small number of training samples, especially in medical problems. In the meantime, deep learning methods have been making great successes by outperforming the state-of-the-art performances in various applications. In this paper, we propose a novel framework that combines the two conceptually different methods of sparse regression and deep learning for Alzheimer's disease/mild cognitive impairment diagnosis and prognosis. Specifically, we first train multiple sparse regression models, each of which is trained with different values of a regularization control parameter. Thus, our multiple sparse regression models potentially select different feature subsets from the original feature set; thereby they have different powers to predict the response values, i.e., clinical label and clinical scores in our work. By regarding the response values from our sparse regression models as target-level representations, we then build a deep convolutional neural network for clinical decision making, which thus we call 'Deep Ensemble Sparse Regression Network.' To our best knowledge, this is the first work that combines sparse regression models with deep neural network. In our experiments with the ADNI cohort, we validated the effectiveness of the proposed method by achieving the highest diagnostic accuracies in three classification tasks. We also rigorously analyzed our results and compared with the previous studies on the ADNI cohort in the literature. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. A computational approach to compare regression modelling strategies in prediction research.

    Science.gov (United States)

    Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H

    2016-08-25

    It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
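
    The Brier score used above to assess model performance is the mean squared difference between predicted probabilities and observed binary outcomes. A minimal sketch follows; the function name and the example values are illustrative assumptions, not taken from the study.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted risks and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predicted risks and observed outcomes.
predicted = [0.9, 0.2, 0.5, 0.5]
observed = [1, 0, 1, 0]
print(brier_score(predicted, observed))
```

    Lower values indicate better predictions. A model that always predicts 0.5 scores 0.25 regardless of the outcomes, which is a common reference point.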

  5. 3D Modeling of Accretion Disks and Circumbinary Envelopes in Close Binaries

    Science.gov (United States)

    Bisikalo, D.

    2010-12-01

    A number of observations prove the complex flow structure in close binary stars. The gas dynamic structure of the flow is governed by the stream of matter from the inner Lagrange point, the accretion disk, the circum-disk halo, and the circumbinary envelope. Observations reflect the current state of a binary system and for their interpretation one should consider the gas dynamics of flow patterns. Three-dimensional numerical gasdynamical modeling is used to study the gaseous flow structure and dynamics in close binaries. It is shown that the periodic variations of the positions of the disk and the bow shock formed when the inner parts of the circumbinary envelope flow around the disk result in variations in both the rate of angular-momentum transfer to the disk and the flow structure near the Lagrange point L3. All these factors lead to periodic ejections of matter from the accretion disk and circum-disk halo into the outer layers of the circumbinary envelope. The results of simulations are used to estimate the physical parameters of the circumbinary envelope, including 3D matter distribution in it, and the matter-flow configuration and dynamics. The envelope becomes optically thick for systems with high mass-exchange rates, Ṁ = 10^-8 Msun/year, and has a significant influence on the binary's observed features. The uneven phase distributions of the matter and density variations due to periodic injections of matter into the envelope are important for interpretations of observations of CBSs.

  6. A generalized right truncated bivariate Poisson regression model with applications to health data.

    Science.gov (United States)

    Islam, M Ataharul; Chowdhury, Rafiqul I

    2017-01-01

    A generalized right truncated bivariate Poisson regression model is proposed in this paper. Estimation and tests for goodness of fit and over or under dispersion are illustrated for both untruncated and right truncated bivariate Poisson regression models using marginal-conditional approach. Estimation and test procedures are illustrated for bivariate Poisson regression models with applications to Health and Retirement Study data on number of health conditions and the number of health care services utilized. The proposed test statistics are easy to compute and it is evident from the results that the models fit the data very well. A comparison between the right truncated and untruncated bivariate Poisson regression models using the test for nonnested models clearly shows that the truncated model performs significantly better than the untruncated model.

  7. Learning to assign binary weights to binary descriptor

    Science.gov (United States)

    Huang, Zhoudi; Wei, Zhenzhong; Zhang, Guangjun

    2016-10-01

    Robust binary local feature descriptors are receiving increasing interest because their binary nature enables fast processing while requiring significantly less memory than floating-point competitors. To bridge the performance gap between binary and floating-point descriptors without increasing the cost of computing and matching, optimal binary weights are learned and assigned to the binary descriptor, on the premise that each bit may contribute differently to distinctiveness and robustness. Technically, a large-scale regularized optimization method is applied to learn float weights for each bit of the binary descriptor. A binary approximation of the float weights is then obtained with an efficient alternating greedy strategy, which can significantly improve discriminative power while preserving the fast-matching advantage. Extensive experimental results on two challenging datasets (Brown dataset and Oxford dataset) demonstrate the effectiveness and efficiency of the proposed method.
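
    One plausible reading of why binary weights keep matching fast is that a 0/1 weight vector acts as a bit mask, so the weighted Hamming distance reduces to an XOR, an AND and a bit count. The abstract does not spell out the matching formula, so the sketch below is an assumption; the descriptors and masks are made up, and the learning step is not shown.

```python
def weighted_hamming(descriptor_a, descriptor_b, weight_mask):
    """Count differing bits, keeping only bits whose binary weight is 1."""
    differing = descriptor_a ^ descriptor_b
    return bin(differing & weight_mask).count("1")


# Hypothetical 4-bit descriptors stored as Python integers.
a = 0b1011
b = 0b0010
print(weighted_hamming(a, b, 0b1111))  # all bits weighted 1: plain Hamming distance
print(weighted_hamming(a, b, 0b0001))  # only the lowest bit is kept
```

    Because the mask is applied with a single bitwise operation, the per-pair cost stays essentially the same as for the unweighted Hamming distance.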

  8. Numerical model for dendritic solidification of binary alloys

    Science.gov (United States)

    Felicelli, S. D.; Heinrich, J. C.; Poirier, D. R.

    1993-01-01

    A finite element model capable of simulating solidification of binary alloys and the formation of freckles is presented. It uses a single system of equations to deal with the all-liquid region, the dendritic region, and the all-solid region. The dendritic region is treated as an anisotropic porous medium. The algorithm uses the bilinear isoparametric element, with a penalty function approximation and a Petrov-Galerkin formulation. Numerical simulations are shown in which an NH4Cl-H2O mixture and a Pb-Sn alloy melt are cooled. The solidification process is followed in time. Instabilities in the process can be clearly observed and the final compositions obtained.

  9. Structural classification and a binary structure model for superconductors

    Institute of Scientific and Technical Information of China (English)

    Dong Cheng

    2006-01-01

    Based on structural and bonding features, a new classification scheme of superconductors is proposed. Superconductors can be partitioned into two parts, a superconducting active component and a supplementary component. Partially metallic covalent bonding is found to be a common feature in all superconducting active components, and the electron states of the atoms in the active components usually make a dominant contribution to the energy band near the Fermi surface. Possible directions to explore new superconductors are discussed based on the structural classification and the binary structure model.

  10. Linear regression metamodeling as a tool to summarize and present simulation model results.

    Science.gov (United States)

    Jalal, Hawre; Dowd, Bryan; Sainfort, François; Kuntz, Karen M

    2013-10-01

    Modelers lack a tool to systematically and clearly present complex model results, including those from sensitivity analyses. The objective was to propose linear regression metamodeling as a tool to increase transparency of decision analytic models and better communicate their results. We used a simplified cancer cure model to demonstrate our approach. The model computed the lifetime cost and benefit of 3 treatment options for cancer patients. We simulated 10,000 cohorts in a probabilistic sensitivity analysis (PSA) and regressed the model outcomes on the standardized input parameter values in a set of regression analyses. We used the regression coefficients to describe measures of sensitivity analyses, including threshold and parameter sensitivity analyses. We also compared the results of the PSA to deterministic full-factorial and one-factor-at-a-time designs. The regression intercept represented the estimated base-case outcome, and the other coefficients described the relative parameter uncertainty in the model. We defined simple relationships that compute the average and incremental net benefit of each intervention. Metamodeling produced outputs similar to traditional deterministic 1-way or 2-way sensitivity analyses but was more reliable since it used all parameter values. Linear regression metamodeling is a simple, yet powerful, tool that can assist modelers in communicating model characteristics and sensitivity analyses.
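
    The approach can be sketched with a toy example: draw inputs as in a probabilistic sensitivity analysis, run a model, and regress the outcome on the standardized inputs. Everything named here is a hypothetical stand-in (the two-parameter `toy_model`, the input ranges, the function names); it is not the cancer cure model from the paper.

```python
import random
import statistics


def toy_model(p_cure, cost):
    # Hypothetical stand-in for a decision model: net benefit of one strategy.
    return 1000.0 * p_cure - 0.5 * cost


def standardize(values):
    """Centre on the mean and scale by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def fit_metamodel(input_a, input_b, outcomes):
    """Least-squares fit of outcomes on two standardized inputs."""
    za, zb = standardize(input_a), standardize(input_b)
    # With mean-zero predictors the intercept equals the mean outcome.
    intercept = statistics.mean(outcomes)
    centered = [o - intercept for o in outcomes]
    saa = sum(a * a for a in za)
    sbb = sum(b * b for b in zb)
    sab = sum(a * b for a, b in zip(za, zb))
    say = sum(a * c for a, c in zip(za, centered))
    sby = sum(b * c for b, c in zip(zb, centered))
    det = saa * sbb - sab * sab
    coef_a = (sbb * say - sab * sby) / det
    coef_b = (saa * sby - sab * say) / det
    return intercept, coef_a, coef_b


# Probabilistic sensitivity analysis with made-up input distributions.
random.seed(1)
p_draws = [random.uniform(0.4, 0.6) for _ in range(1000)]
cost_draws = [random.uniform(100.0, 300.0) for _ in range(1000)]
outcomes = [toy_model(p, c) for p, c in zip(p_draws, cost_draws)]
intercept, coef_p, coef_cost = fit_metamodel(p_draws, cost_draws, outcomes)
print(round(intercept, 2), round(coef_p, 2), round(coef_cost, 2))
```

    Because the inputs are standardized, the intercept estimates the outcome at the mean input values, and each coefficient is the change in outcome per one standard deviation of that input. This puts parameters measured in different units on a comparable scale.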

  11. Hierarchical Neural Regression Models for Customer Churn Prediction

    Directory of Open Access Journals (Sweden)

    Golshan Mohammadi

    2013-01-01

    Full Text Available As customers are the main assets of each industry, customer churn prediction is becoming a major task for companies to remain in competition with competitors. In the literature, the better applicability and efficiency of hierarchical data mining techniques has been reported. This paper considers three hierarchical models by combining four different data mining techniques for churn prediction, which are backpropagation artificial neural networks (ANN), self-organizing maps (SOM), alpha-cut fuzzy c-means (α-FCM), and Cox proportional hazards regression model. The hierarchical models are ANN + ANN + Cox, SOM + ANN + Cox, and α-FCM + ANN + Cox. In particular, the first component of the models aims to cluster data in two churner and nonchurner groups and also filter out unrepresentative data or outliers. Then, the clustered data as the outputs are used to assign customers to churner and nonchurner groups by the second technique. Finally, the correctly classified data are used to create Cox proportional hazards model. To evaluate the performance of the hierarchical models, an Iranian mobile dataset is considered. The experimental results show that the hierarchical models outperform the single Cox regression baseline model in terms of prediction accuracy, Types I and II errors, RMSE, and MAD metrics. In addition, the α-FCM + ANN + Cox model significantly performs better than the two other hierarchical models.

  12. LINEAR REGRESSION MODEL ESTIMATION FOR RIGHT CENSORED DATA

    Directory of Open Access Journals (Sweden)

    Ersin Yılmaz

    2016-05-01

    Full Text Available In this study, we first define right-censored data. Briefly, right-censored data arise when values above a certain limit are censored, which may be related to the measuring device. We then use a response variable obtained from right-censored explanatory variables, and the linear regression model is estimated. Because the data are censored, Kaplan-Meier weights are used to estimate the model; with these weights, the regression estimates are consistent and unbiased. Semiparametric regression is another method for censored data and also gives useful results. This study may also be useful for health studies, since censored data are common in medical research.
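
    A minimal sketch of Kaplan-Meier weighting follows, assuming the common setting in which the response is right-censored and each observation receives the jump of the Kaplan-Meier estimator as its weight. The abstract does not give the exact estimator, so the weight formula, the function names, and the toy data are assumptions for illustration.

```python
def kaplan_meier_weights(times, observed):
    """Kaplan-Meier jump for each observation.

    observed[i] is 1 if times[i] is an exact value and 0 if it is
    right-censored. Censored observations receive weight 0.
    """
    n = len(times)
    # Sort by time; for tied times, uncensored observations come first.
    order = sorted(range(n), key=lambda i: (times[i], -observed[i]))
    weights = [0.0] * n
    product = 1.0  # running product over earlier uncensored observations
    for rank, idx in enumerate(order, start=1):
        weights[idx] = observed[idx] * product / (n - rank + 1)
        if observed[idx]:
            product *= (n - rank) / (n - rank + 1)
    return weights


def weighted_least_squares(x, y, w):
    """Intercept and slope minimising sum(w * (y - a - b * x) ** 2)."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Toy data (assumption): the second response is right-censored.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
observed = [1, 0, 1]
w = kaplan_meier_weights(y, observed)
intercept, slope = weighted_least_squares(x, y, w)
print(w, intercept, slope)
```

    When nothing is censored every weight equals 1/n and the fit reduces to ordinary least squares. With censoring, the mass of each censored point is redistributed to the larger uncensored values.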

  13. Developing and testing a global-scale regression model to quantify mean annual streamflow

    Science.gov (United States)

    Barbarossa, Valerio; Huijbregts, Mark A. J.; Hendriks, A. Jan; Beusen, Arthur H. W.; Clavreul, Julie; King, Henry; Schipper, Aafke M.

    2017-01-01

    Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF based on a dataset unprecedented in size, using observations of discharge and catchment characteristics from 1885 catchments worldwide, measuring between 2 and 10^6 km2. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area and catchment averaged mean annual precipitation and air temperature, slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error (RMSE) values were lower (0.29-0.38 compared to 0.49-0.57) and the modified index of agreement (d) was higher (0.80-0.83 compared to 0.72-0.75). Our regression model can be applied globally to estimate MAF at any point of the river network, thus providing a feasible alternative to spatially explicit process-based global hydrological models.

  14. Radial Velocities of 41 Kepler Eclipsing Binaries

    Science.gov (United States)

    Matson, Rachel A.; Gies, Douglas R.; Guo, Zhao; Williams, Stephen J.

    2017-12-01

    Eclipsing binaries are vital for directly determining stellar parameters without reliance on models or scaling relations. Spectroscopically derived parameters of detached and semi-detached binaries allow us to determine component masses that can inform theories of stellar and binary evolution. Here we present moderate resolution ground-based spectra of stars in close binary systems with and without (detected) tertiary companions observed by NASA’s Kepler mission and analyzed for eclipse timing variations. We obtain radial velocities and spectroscopic orbits for five single-lined and 35 double-lined systems, and confirm one false positive eclipsing binary. For the double-lined spectroscopic binaries, we also determine individual component masses and examine the mass ratio M2/M1 distribution, which is dominated by binaries with like-mass pairs and semi-detached classical Algol systems that have undergone mass transfer. Finally, we constrain the mass of the tertiary component for five double-lined binaries with previously detected companions.

  15. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    Science.gov (United States)

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  16. Binary Linear-Time Erasure Decoding for Non-Binary LDPC codes

    OpenAIRE

    Savin, Valentin

    2009-01-01

    In this paper, we first introduce the extended binary representation of non-binary codes, which corresponds to a covering graph of the bipartite graph associated with the non-binary code. Then we show that non-binary codewords correspond to binary codewords of the extended representation that further satisfy some simplex-constraint: that is, bits lying over the same symbol-node of the non-binary graph must form a codeword of a simplex code. Applied to the binary erasure channel, this descript...

  17. Multiple logistic regression model of signalling practices of drivers on urban highways

    Science.gov (United States)

    Puan, Othman Che; Ibrahim, Muttaka Na'iya; Zakaria, Rozana

    2015-05-01

    Signalling is a way of informing other road users, especially conflicting drivers, of a driver's intention to change his/her course. Other users are exposed to hazardous situations and risk of accident if the driver who changes course fails to signal as required. This paper describes the application of a logistic regression model to the analysis of drivers' signalling practices on multilane highways based on possible factors affecting the driver's decision, such as the driver's gender, vehicle type, vehicle speed and traffic flow intensity. Data on these factors were collected manually. More than 2000 drivers who performed a lane-changing manoeuvre while driving on two sections of multilane highways were observed. The findings show that a relatively large proportion of drivers failed to give any signal when changing lanes. The analysis indicates that, although the proportion of drivers who failed to signal prior to a lane-changing manoeuvre is high, compliance among female drivers is better than among male drivers. A binary logistic model was developed to represent the probability that a driver signals prior to a lane-changing manoeuvre. The model indicates that the driver's gender, the type of vehicle driven, vehicle speed and traffic volume influence the driver's decision to signal prior to a lane-changing manoeuvre on a multilane urban highway. In terms of vehicle type, about 97% of motorcyclists failed to comply with the signalling requirement. The proportion of non-compliant drivers under stable traffic flow conditions is much higher than when the flow is relatively heavy. This is consistent with the data, which indicate a high degree of non-compliance when the average speed of the traffic stream is relatively high.

  18. Adaptive regression for modeling nonlinear relationships

    CERN Document Server

    Knafl, George J

    2016-01-01

    This book presents methods for investigating whether relationships are linear or nonlinear and for adaptively fitting appropriate models when they are nonlinear. Data analysts will learn how to incorporate nonlinearity in one or more predictor variables into regression models for different types of outcome variables. Such nonlinear dependence is often not considered in applied research, yet nonlinear relationships are common and so need to be addressed. A standard linear analysis can produce misleading conclusions, while a nonlinear analysis can provide novel insights into data, not otherwise possible. A variety of examples of the benefits of modeling nonlinear relationships are presented throughout the book. Methods are covered using what are called fractional polynomials based on real-valued power transformations of primary predictor variables combined with model selection based on likelihood cross-validation. The book covers how to formulate and conduct such adaptive fractional polynomial modeling in the s...

  19. Confidence bands for inverse regression models

    International Nuclear Information System (INIS)

    Birke, Melanie; Bissantz, Nicolai; Holzmann, Hajo

    2010-01-01

    We construct uniform confidence bands for the regression function in inverse, homoscedastic regression models with convolution-type operators. Here, the convolution is between two non-periodic functions on the whole real line rather than between two periodic functions on a compact interval, since the former situation arguably arises more often in applications. First, following Bickel and Rosenblatt (1973 Ann. Stat. 1 1071–95) we construct asymptotic confidence bands which are based on strong approximations and on a limit theorem for the supremum of a stationary Gaussian process. Further, we propose bootstrap confidence bands based on the residual bootstrap and prove consistency of the bootstrap procedure. A simulation study shows that the bootstrap confidence bands perform reasonably well for moderate sample sizes. Finally, we apply our method to data from a gel electrophoresis experiment with genetically engineered neuronal receptor subunits incubated with rat brain extract

  20. A binary mixture operated heat pump

    International Nuclear Information System (INIS)

    Hihara, E.; Saito, T.

    1991-01-01

    This paper evaluates the performance of possible binary mixtures as working fluids in high-temperature heat pump applications. The binary mixtures, which are potential alternatives to fully halogenated hydrocarbons, include HCFC142b/HCFC22, HFC152a/HCFC22 and HFC134a/HCFC22. The performance of the mixtures is estimated by a thermodynamic model and a practical model in which the heat transfer is considered in heat exchangers. One of the advantages of binary mixtures is a higher coefficient of performance, which is caused by the small temperature difference between the heat-sink/-source fluid and the refrigerant. The mixture HCFC142b/HCFC22 is promising from the standpoint of thermodynamic performance.

  1. Electricity consumption forecasting in Italy using linear regression models

    Energy Technology Data Exchange (ETDEWEB)

    Bianco, Vincenzo; Manca, Oronzio; Nardini, Sergio [DIAM, Seconda Universita degli Studi di Napoli, Via Roma 29, 81031 Aversa (CE) (Italy)

    2009-09-15

    The influence of economic and demographic variables on the annual electricity consumption in Italy has been investigated with the intention to develop a long-term consumption forecasting model. The time period considered for the historical data is from 1970 to 2007. Different regression models were developed, using historical electricity consumption, gross domestic product (GDP), gross domestic product per capita (GDP per capita) and population. A first part of the paper considers the estimation of GDP, price and GDP per capita elasticities of domestic and non-domestic electricity consumption. The domestic and non-domestic short run price elasticities are found to be both approximately equal to -0.06, while long run elasticities are equal to -0.24 and -0.09, respectively. On the contrary, the elasticities of GDP and GDP per capita present higher values. In the second part of the paper, different regression models, based on co-integrated or stationary data, are presented. Different statistical tests are employed to check the validity of the proposed models. A comparison with national forecasts, based on complex econometric models, such as Markal-Time, was performed, showing that the developed regressions are congruent with the official projections, with deviations of ±1% for the best case and ±11% for the worst. These deviations are to be considered acceptable in relation to the time span taken into account. (author)

  2. Electricity consumption forecasting in Italy using linear regression models

    International Nuclear Information System (INIS)

    Bianco, Vincenzo; Manca, Oronzio; Nardini, Sergio

    2009-01-01

    The influence of economic and demographic variables on the annual electricity consumption in Italy has been investigated with the intention to develop a long-term consumption forecasting model. The time period considered for the historical data is from 1970 to 2007. Different regression models were developed, using historical electricity consumption, gross domestic product (GDP), gross domestic product per capita (GDP per capita) and population. A first part of the paper considers the estimation of GDP, price and GDP per capita elasticities of domestic and non-domestic electricity consumption. The domestic and non-domestic short run price elasticities are found to be both approximately equal to -0.06, while long run elasticities are equal to -0.24 and -0.09, respectively. On the contrary, the elasticities of GDP and GDP per capita present higher values. In the second part of the paper, different regression models, based on co-integrated or stationary data, are presented. Different statistical tests are employed to check the validity of the proposed models. A comparison with national forecasts, based on complex econometric models, such as Markal-Time, was performed, showing that the developed regressions are congruent with the official projections, with deviations of ±1% for the best case and ±11% for the worst. These deviations are to be considered acceptable in relation to the time span taken into account. (author)

  3. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Directory of Open Access Journals (Sweden)

    Drzewiecki Wojciech

    2016-12-01

    In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.

  4. Computation of infinite dilute activity coefficients of binary liquid alloys using complex formation model

    Energy Technology Data Exchange (ETDEWEB)

    Awe, O.E., E-mail: draweoe2004@yahoo.com; Oshakuade, O.M.

    2016-04-15

    A new method for calculating infinite dilute activity coefficients (γ^∞) of binary liquid alloys has been developed. The method computes γ^∞ values from experimental thermodynamic integral free energy of mixing data using the complex formation model. The new method was first used to theoretically compute the γ^∞ values of 10 binary alloys whose γ^∞ values have been determined by experiments. The significant agreement between the computed values and the available experimental values served as impetus for applying the new method to 22 selected binary liquid alloys whose γ^∞ values are either nonexistent or incomplete. In order to verify the reliability of the computed γ^∞ values of the 22 selected alloys, we recomputed them using three other existing methods of computing or estimating γ^∞ and then used the γ^∞ values obtained from each of the four methods (the new method inclusive) to compute thermodynamic activities of components of each of the binary systems. The computed activities were compared with available experimental activities. It is observed that the results from the proposed method, in most of the selected alloys, showed better agreement with experimental activity data. Thus, the new method is an alternative and, in certain instances, more reliable approach to computing γ^∞ values of binary liquid alloys.

  5. Modelling infant mortality rate in Central Java, Indonesia use generalized poisson regression method

    Science.gov (United States)

    Prahutama, Alan; Sudarno

    2018-05-01

    The infant mortality rate is the number of deaths under one year of age occurring among the live births in a given geographical area during a given year, per 1,000 live births occurring among the population of the given geographical area during the same year. This problem needs to be addressed because it is an important element of a country’s economic development. A high infant mortality rate will disrupt the stability of a country as it relates to the sustainability of the population in the country. One regression model that can be used to analyze the relationship between a dependent variable Y in the form of discrete data and an independent variable X is the Poisson regression model. Regression models used for data with a discrete dependent variable include, among others, Poisson regression, negative binomial regression and generalized Poisson regression. In this research, generalized Poisson regression modeling gives a better AIC value than Poisson regression. The most significant variable is the number of health facilities (X1), while the variable that has the greatest influence on the infant mortality rate is the average breastfeeding (X9).
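    The rate definition and the AIC comparison mentioned in this abstract can be illustrated with a minimal Python sketch. The function names and the numbers in the usage note are hypothetical and are not taken from the paper; the AIC formula used is the standard 2k - 2 ln L, where a lower value indicates the preferred model.

```python
def infant_mortality_rate(infant_deaths: int, live_births: int) -> float:
    """Deaths under one year of age per 1,000 live births (same area, same year)."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return 1000.0 * infant_deaths / live_births


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L. Lower is better."""
    return 2.0 * n_params - 2.0 * log_likelihood


def preferred_model(candidates: dict) -> str:
    """Return the name of the candidate with the lowest AIC.

    `candidates` maps a model name to a (log_likelihood, n_params) pair.
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))
```

    For instance, with made-up values, `preferred_model({"poisson": (-120.0, 3), "generalized_poisson": (-110.0, 4)})` selects `"generalized_poisson"`, since its AIC (228) is lower than that of the Poisson fit (246).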

  6. Augmented Beta rectangular regression models: A Bayesian perspective.

    Science.gov (United States)

    Wang, Jue; Luo, Sheng

    2016-01-01

    Mixed effects Beta regression models based on Beta distributions have been widely used to analyze longitudinal percentage or proportional data ranging between zero and one. However, Beta distributions are not flexible to extreme outliers or excessive events around tail areas, and they do not account for the presence of the boundary values zeros and ones because these values are not in the support of the Beta distributions. To address these issues, we propose a mixed effects model using Beta rectangular distribution and augment it with the probabilities of zero and one. We conduct extensive simulation studies to assess the performance of mixed effects models based on both the Beta and Beta rectangular distributions under various scenarios. The simulation studies suggest that the regression models based on Beta rectangular distributions improve the accuracy of parameter estimates in the presence of outliers and heavy tails. The proposed models are applied to the motivating Neuroprotection Exploratory Trials in Parkinson's Disease (PD) Long-term Study-1 (LS-1 study, n = 1741), developed by The National Institute of Neurological Disorders and Stroke Exploratory Trials in Parkinson's Disease (NINDS NET-PD) network. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Bayesian semiparametric regression models to characterize molecular evolution

    Directory of Open Access Journals (Sweden)

    Datta Saheli

    2012-10-01

    Background: Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Dirichlet process prior on the distribution of the regression coefficients that describes the relationship between the changes in amino acid distances and natural selection in protein-coding DNA sequence alignments. Results: The Bayesian semiparametric approach is illustrated with simulated data and the abalone lysin sperm data. Our method identifies groups of properties which, for this particular dataset, have a similar effect on evolution. The model also provides nonparametric site-specific estimates for the strength of conservation of these properties. Conclusions: The model described here is distinguished by its ability to handle a large number of amino acid properties simultaneously, while taking into account that such data can be correlated. The multi-level clustering ability of the model allows for appealing interpretations of the results in terms of properties that are roughly equivalent from the standpoint of molecular evolution.

  8. Rotational properties of the binary and non-binary populations in the Trans-Neptunian belt

    Science.gov (United States)

    Thirouin, Audrey; Noll, Keith S.; Ortiz Moreno, Jose Luis; Morales, Nicolas

    2014-11-01

    An exhaustive study about short-term variability as well as derived properties from lightcurves allowed us to draw some conclusions for the Trans-Neptunian belt binary population. Based on Maxwellian fit distributions of the spin rate, we suggested that the binary population rotates slower than the non-binary one. This slowing-down can be attributed to tidal effects between the satellite and the primary, as expected. We showed that no system in this work is tidally locked, but the primary despinning process may have already affected the primary rate (as well as the satellite rotational rate). We used the Gladman et al. (1996) formula to compute the time required to tidally lock the systems, but this formula is based on several assumptions and approximations that do not always hold. The computed times are reasonable in most cases and confirm that none of the systems is tidally locked, assuming that the satellite densities are low and have a high rigidity or have a higher dissipation than usually assumed. The rotational properties of small bodies provide information about important physical properties, such as shape, density, and cohesion (Pravec & Harris 2000; Holsapple 2001, 2004; Thirouin et al. 2010, 2012). For binaries it is also possible to derive several physical parameters of the system components, such as diameters of the primary/secondary and albedo under some assumptions. We compare our results as well as our technique for deriving this information from the lightcurve with other methods, such as: i) thermal or thermophysical modeling, ii) from the mutual orbit of the binary component, iii) from direct imaging or iv) from stellar occultation by Trans-Neptunian Objects (TNOs). Finally, by studying the specific angular momentum of the sample, we proposed possible formation models for several binary TNOs. In several cases, we obtained hints of the formation mechanism from the angular momentum, but for other cases we do not have enough information about the

  9. Regression analysis of a chemical reaction fouling model

    International Nuclear Information System (INIS)

    Vasak, F.; Epstein, N.

    1996-01-01

    A previously reported mathematical model for the initial chemical reaction fouling of a heated tube is critically examined in the light of the experimental data for which it was developed. A regression analysis of the model with respect to that data shows that the reference point upon which the two adjustable parameters of the model were originally based was well chosen, albeit fortuitously. (author). 3 refs., 2 tabs., 2 figs

  10. Correlation-regression model for physico-chemical quality of ...

    African Journals Online (AJOL)

    abusaad

    areas, suggesting that groundwater quality in urban areas is closely related with land use ... the ground water, with correlation and regression model is also presented. ...... WHO (World Health Organization) (1985). Health hazards from nitrates.

  11. Comparison of the binary logistic and skewed logistic (Scobit) models of injury severity in motor vehicle collisions.

    Science.gov (United States)

    Tay, Richard

    2016-03-01

    The binary logistic model has been extensively used to analyze traffic collision and injury data where the outcome of interest has two categories. However, the assumption of a symmetric distribution may not be a desirable property in some cases, especially when there is a significant imbalance in the two categories of outcome. This study compares the standard binary logistic model with the skewed logistic model in two cases in which the symmetry assumption is violated in one but not the other case. The differences in the estimates, and thus the marginal effects obtained, are significant when the assumption of symmetry is violated. Copyright © 2015 Elsevier Ltd. All rights reserved.
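    The contrast between the two link functions can be sketched in a few lines of Python. The abstract does not state the functional form, so the skewed logistic expression below is an assumption: it is the form commonly given for the scobit model, P(y=1) = 1 - (1 + exp(xb))^(-alpha), which reduces to the ordinary logistic model when alpha = 1. The names `logit_prob`, `scobit_prob`, `xb` and `alpha` are illustrative.

```python
import math


def logit_prob(xb: float) -> float:
    """Standard logistic response: symmetric around probability 0.5."""
    return 1.0 / (1.0 + math.exp(-xb))


def scobit_prob(xb: float, alpha: float) -> float:
    """Skewed logistic (scobit) response, assumed form.

    alpha > 0 is the skewness parameter; alpha == 1 recovers the logit.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 1.0 - (1.0 + math.exp(xb)) ** (-alpha)
```

    At a linear predictor of zero the logit gives 0.5, whereas `scobit_prob(0.0, 2.0)` gives 0.75, so the point of greatest sensitivity to the covariates is no longer tied to a probability of one half.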

  12. Binary effectivity rules

    DEFF Research Database (Denmark)

    Keiding, Hans; Peleg, Bezalel

    2006-01-01

    is binary if it is rationalized by an acyclic binary relation. The foregoing result motivates our definition of a binary effectivity rule as the effectivity rule of some binary SCR. A binary SCR is regular if it satisfies unanimity, monotonicity, and independence of infeasible alternatives. A binary...

  13. Multiple linear regression and regression with time series error models in forecasting PM10 concentrations in Peninsular Malaysia.

    Science.gov (United States)

    Ng, Kar Yong; Awang, Norhashidah

    2018-01-06

    Frequent haze occurrences in Malaysia have made the management of PM10 (particulate matter with aerodynamic diameter less than 10 μm) pollution a critical task. This requires knowledge of the factors associated with PM10 variation and good forecasts of PM10 concentrations. Hence, this paper demonstrates the prediction of 1-day-ahead daily average PM10 concentrations based on predictor variables including meteorological parameters and gaseous pollutants. Three different models were built: a multiple linear regression (MLR) model with lagged predictor variables (MLR1), an MLR model with lagged predictor variables and PM10 concentrations (MLR2), and a regression with time series error (RTSE) model. The findings revealed that humidity, temperature, wind speed, wind direction, carbon monoxide and ozone were the main factors explaining the PM10 variation in Peninsular Malaysia. Comparison among the three models showed that the MLR2 model was on the same level as the RTSE model in terms of forecasting accuracy, while the MLR1 model was the worst.

  14. Multivariate Frequency-Severity Regression Models in Insurance

    Directory of Open Access Journals (Sweden)

    Edward W. Frees

    2016-02-01

    In insurance and related industries including healthcare, it is common to have several outcome measures that the analyst wishes to understand using explanatory variables. For example, in automobile insurance, an accident may result in payments for damage to one’s own vehicle, damage to another party’s vehicle, or personal injury. It is also common to be interested in the frequency of accidents in addition to the severity of the claim amounts. This paper synthesizes and extends the literature on multivariate frequency-severity regression modeling with a focus on insurance industry applications. Regression models for understanding the distribution of each outcome continue to be developed yet there now exists a solid body of literature for the marginal outcomes. This paper contributes to this body of literature by focusing on the use of a copula for modeling the dependence among these outcomes; a major advantage of this tool is that it preserves the body of work established for marginal models. We illustrate this approach using data from the Wisconsin Local Government Property Insurance Fund. This fund offers insurance protection for (i) property; (ii) motor vehicle; and (iii) contractors’ equipment claims. In addition to several claim types and frequency-severity components, outcomes can be further categorized by time and space, requiring complex dependency modeling. We find significant dependencies for these data; specifically, we find that dependencies among lines are stronger than the dependencies between the frequency and average severity within each line.

  15. Quantile Regression Methods

    DEFF Research Database (Denmark)

    Fitzenberger, Bernd; Wilke, Ralf Andreas

    2015-01-01

    Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based...
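    The idea of modeling conditional quantiles rather than the mean can be illustrated with the check (pinball) loss, which is the standard objective minimized in quantile regression. The following is a minimal Python sketch for the intercept-only case, not the chapter's own code; the function names and the sample data are hypothetical.

```python
def pinball_loss(y, predictions, tau):
    """Average check loss at quantile level tau (0 < tau < 1)."""
    total = 0.0
    for observed, predicted in zip(y, predictions):
        residual = observed - predicted
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(y)


def best_constant_quantile(y, tau):
    """Sample value that minimizes the pinball loss (intercept-only model)."""
    return min(sorted(y), key=lambda c: pinball_loss(y, [c] * len(y), tau))
```

    With the made-up sample `[1, 2, 3, 4, 100]`, the minimizer at `tau=0.5` is the median, 3, while the sample mean, 22, is pulled upward by the outlier. Raising `tau` to 0.9 moves the minimizer to 100, which shows how different quantile levels describe different parts of the distribution.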

  16. An odor interaction model of binary odorant mixtures by a partial differential equation method.

    Science.gov (United States)

    Yan, Luchun; Liu, Jiemin; Wang, Guihua; Wu, Chuandong

    2014-07-09

    A novel odor interaction model was proposed for binary mixtures of benzene and substituted benzenes by a partial differential equation (PDE) method. Based on the measurement method (tangent-intercept method) of partial molar volume, original parameters of corresponding formulas were reasonably displaced by perceptual measures. By these substitutions, it was possible to relate a mixture's odor intensity to the individual odorant's relative odor activity value (OAV). Several binary mixtures of benzene and substituted benzenes were respectively tested to establish the PDE models. The obtained results showed that the PDE model provided an easily interpretable method relating individual components to their joint odor intensity. Besides, both the predictive performance and the feasibility of the PDE model were demonstrated through a series of odor intensity matching tests. If the PDE model is combined with portable gas detectors or on-line monitoring systems, olfactory evaluation of odor intensity could be achieved by instruments instead of odor assessors. Many disadvantages (e.g., expense on a fixed number of odor assessors) would also be avoided. Thus, the PDE model is predicted to be helpful to the monitoring and management of odor pollution.

  17. An Odor Interaction Model of Binary Odorant Mixtures by a Partial Differential Equation Method

    Directory of Open Access Journals (Sweden)

    Luchun Yan

    2014-07-01

    A novel odor interaction model was proposed for binary mixtures of benzene and substituted benzenes by a partial differential equation (PDE) method. Based on the measurement method (tangent-intercept method) of partial molar volume, original parameters of corresponding formulas were reasonably displaced by perceptual measures. By these substitutions, it was possible to relate a mixture’s odor intensity to the individual odorant’s relative odor activity value (OAV). Several binary mixtures of benzene and substituted benzenes were respectively tested to establish the PDE models. The obtained results showed that the PDE model provided an easily interpretable method relating individual components to their joint odor intensity. Besides, both the predictive performance and the feasibility of the PDE model were demonstrated through a series of odor intensity matching tests. If the PDE model is combined with portable gas detectors or on-line monitoring systems, olfactory evaluation of odor intensity could be achieved by instruments instead of odor assessors. Many disadvantages (e.g., expense on a fixed number of odor assessors) would also be avoided. Thus, the PDE model is predicted to be helpful to the monitoring and management of odor pollution.

  18. Application of random regression models to the genetic evaluation ...

    African Journals Online (AJOL)

    The model included fixed regression on AM (range from 30 to 138 mo) and the effect of herd-measurement date concatenation. Random parts of the model were RRM coefficients for additive and permanent environmental effects, while residual effects were modelled to account for heterogeneity of variance by AY. Estimates ...

  19. Multitask Quantile Regression under the Transnormal Model.

    Science.gov (United States)

    Fan, Jianqing; Xue, Lingzhou; Zou, Hui

    2016-01-01

    We consider estimating multi-task quantile regression under the transnormal model, with a focus on the high-dimensional setting. We derive a surprisingly simple closed-form solution through rank-based covariance regularization. In particular, we propose the rank-based ℓ1 penalization with positive definite constraints for estimating sparse covariance matrices, and the rank-based banded Cholesky decomposition regularization for estimating banded precision matrices. By taking advantage of the alternating direction method of multipliers, nearest correlation matrix projection is introduced that inherits sampling properties of the unprojected one. Our work combines strengths of quantile regression and rank-based covariance regularization to simultaneously deal with nonlinearity and nonnormality for high-dimensional regression. Furthermore, the proposed method strikes a good balance between robustness and efficiency, achieves the "oracle"-like convergence rate, and provides the provable prediction interval under the high-dimensional setting. The finite-sample performance of the proposed method is also examined. The performance of our proposed rank-based method is demonstrated in a real application to analyze the protein mass spectroscopy data.

  20. Approximating prediction uncertainty for random forest regression models

    Science.gov (United States)

    John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne

    2016-01-01

    Machine learning approaches such as random forest have increased for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches there is no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as...

  1. Modeling long correlation times using additive binary Markov chains: Applications to wind generation time series

    Science.gov (United States)

    Weber, Juliane; Zachow, Christopher; Witthaut, Dirk

    2018-03-01

    Wind power generation exhibits a strong temporal variability, which is crucial for system integration in highly renewable power systems. Different methods exist to simulate wind power generation but they often cannot represent the crucial temporal fluctuations properly. We apply the concept of additive binary Markov chains to model a wind generation time series consisting of two states: periods of high and low wind generation. The only input parameter for this model is the empirical autocorrelation function. The two-state model is readily extended to stochastically reproduce the actual generation per period. To evaluate the additive binary Markov chain method, we introduce a coarse model of the electric power system to derive backup and storage needs. We find that the temporal correlations of wind power generation, the backup need as a function of the storage capacity, and the resting time distribution of high and low wind events for different shares of wind generation can be reconstructed.
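    A two-state additive Markov chain of the kind named in this abstract can be sketched as follows. The sketch assumes the usual additive form, in which the probability of the high state is the mean plus a weighted sum of past deviations from the mean, P(x_t = 1 | past) = mean + sum_r F(r) (x_{t-r} - mean). The memory weights F(r) are taken as given here; deriving them from the empirical autocorrelation function, as the paper does, is not shown, and all names and values are hypothetical.

```python
import random


def simulate_additive_binary_chain(memory, mean, n_steps, seed=0):
    """Simulate a binary series (1 = high wind, 0 = low wind).

    memory  -- list of weights F(1), ..., F(N) applied to the last N states
    mean    -- long-run probability of the high state
    n_steps -- number of states to generate after the warm-up history
    """
    rng = random.Random(seed)
    order = len(memory)
    # Warm-up history: independent draws with the target mean.
    states = [1 if rng.random() < mean else 0 for _ in range(order)]
    for _ in range(n_steps):
        prob_high = mean + sum(
            weight * (states[-lag] - mean)
            for lag, weight in enumerate(memory, start=1)
        )
        # Clip so that an arbitrary choice of weights still yields a probability.
        prob_high = min(1.0, max(0.0, prob_high))
        states.append(1 if rng.random() < prob_high else 0)
    return states[order:]
```

    Calling `simulate_additive_binary_chain([0.5, 0.3], mean=0.4, n_steps=1000)` returns a 0/1 list in which long runs of equal states are more likely than under independent draws, because positive weights make each state favor a repeat of recent ones.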

  2. Modeling long correlation times using additive binary Markov chains: Applications to wind generation time series.

    Science.gov (United States)

    Weber, Juliane; Zachow, Christopher; Witthaut, Dirk

    2018-03-01

    Wind power generation exhibits a strong temporal variability, which is crucial for system integration in highly renewable power systems. Different methods exist to simulate wind power generation but they often cannot represent the crucial temporal fluctuations properly. We apply the concept of additive binary Markov chains to model a wind generation time series consisting of two states: periods of high and low wind generation. The only input parameter for this model is the empirical autocorrelation function. The two-state model is readily extended to stochastically reproduce the actual generation per period. To evaluate the additive binary Markov chain method, we introduce a coarse model of the electric power system to derive backup and storage needs. We find that the temporal correlations of wind power generation, the backup need as a function of the storage capacity, and the resting time distribution of high and low wind events for different shares of wind generation can be reconstructed.

  3. Predictive thermodynamic models for liquid--liquid extraction of single, binary and ternary lanthanides and actinides

    International Nuclear Information System (INIS)

    Hoh, Y.C.

    1977-03-01

    Chemically based thermodynamic models to predict the distribution coefficients and the separation factors for the liquid--liquid extraction of lanthanides-organophosphorus compounds were developed by assuming that the quotient of the activity coefficients of each species varies slightly with its concentrations, by using aqueous lanthanide or actinide complexes stoichiometric stability constants expressed as its degrees of formation, by making use of the extraction mechanism and the equilibrium constant for the extraction reaction. For a single component system, the thermodynamic model equations which predict the distribution coefficients, are dependent on the free organic concentration, the equilibrated ligand and hydrogen ion concentrations, the degree of formation, and on the extraction mechanism. For a binary component system, the thermodynamic model equation which predicts the separation factors is the same for all cases. This model equation is dependent on the degrees of formation of each species in their binary system and can be used in a ternary component system to predict the separation factors for the solutes relative to each other

  4. Profile-driven regression for modeling and runtime optimization of mobile networks

    DEFF Research Database (Denmark)

    McClary, Dan; Syrotiuk, Violet; Kulahci, Murat

    2010-01-01

    Computer networks often display nonlinear behavior when examined over a wide range of operating conditions. There are few strategies available for modeling such behavior and optimizing such systems as they run. Profile-driven regression is developed and applied to modeling and runtime optimization...... of throughput in a mobile ad hoc network, a self-organizing collection of mobile wireless nodes without any fixed infrastructure. The intermediate models generated in profile-driven regression are used to fit an overall model of throughput, and are also used to optimize controllable factors at runtime. Unlike...

  5. Performance of Firth- and logF-type penalized methods in risk prediction for small or sparse binary data.

    Science.gov (United States)

    Rahman, M Shafiqur; Sultana, Mahbuba

    2017-02-23

    When developing risk models for binary data with small or sparse data sets, the standard maximum likelihood estimation (MLE) based logistic regression faces several problems, including biased or infinite estimates of the regression coefficients and frequent convergence failure of the likelihood due to separation. The problem of separation occurs commonly even if the sample size is large but there is a sufficient number of strong predictors. In the presence of separation, even if one develops the model, it produces an overfitted model with poor predictive performance. Firth- and logF-type penalized regression methods are popular alternatives to MLE, particularly for solving the separation problem. Despite their attractive advantages, their use in risk prediction is very limited. This paper evaluated these methods in risk prediction in comparison with MLE and other commonly used penalized methods such as ridge. The predictive performance of the methods was evaluated through assessing calibration, discrimination and overall predictive performance using an extensive simulation study. Further, an illustration of the methods was provided using a real data example with low prevalence of outcome. The MLE showed poor performance in risk prediction in small or sparse data sets. All penalized methods offered some improvements in calibration, discrimination and overall predictive performance. Although the Firth- and logF-type methods showed an almost equal amount of improvement, Firth-type penalization produces some bias in the average predicted probability, and the amount of bias is even larger than that produced by MLE. Of the logF(1,1) and logF(2,2) penalizations, logF(2,2) produces slight bias in the estimate of the regression coefficient of a binary predictor, and logF(1,1) performed better in all aspects. Similarly, ridge performed well in discrimination and overall predictive performance but it often produces an underfitted model and has a high rate of convergence failure (even the rate is higher than that
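
    A small worked case may help show why penalization matters under separation. The sketch below is restricted to the intercept-only logistic model, which is a simplifying assumption and not the setting studied in the paper. In that special case, maximizing the Firth-penalized likelihood gives the closed form p = (y + 0.5) / (n + 1) for y events in n observations, so the estimated log-odds stays finite even when no events are observed. The function names are hypothetical.

```python
import math


def mle_logit(events, n):
    """Maximum likelihood log-odds for an intercept-only logistic model."""
    if events == 0:
        return -math.inf  # no events observed: the estimate diverges
    if events == n:
        return math.inf   # only events observed: the estimate diverges
    return math.log(events / (n - events))


def firth_logit(events, n):
    """Firth-penalized log-odds for the intercept-only case.

    Equivalent to adding half an event and half a non-event to the data.
    """
    return math.log((events + 0.5) / (n - events + 0.5))


for events in (0, 1, 5):
    print(events, mle_logit(events, 20), round(firth_logit(events, 20), 3))
```

    With covariates there is no such closed form and the penalized likelihood has to be maximized iteratively, so this example only illustrates the direction of the correction.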

  6. Accounting for measurement error in log regression models with applications to accelerated testing.

    Science.gov (United States)

    Richardson, Robert; Tolley, H Dennis; Evenson, William E; Lunt, Barry M

    2018-01-01

    In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.

  7. Accounting for measurement error in log regression models with applications to accelerated testing.

    Directory of Open Access Journals (Sweden)

    Robert Richardson

    In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.

  8. Fast and Accurate Prediction of Numerical Relativity Waveforms from Binary Black Hole Coalescences Using Surrogate Models.

    Science.gov (United States)

    Blackman, Jonathan; Field, Scott E; Galley, Chad R; Szilágyi, Béla; Scheel, Mark A; Tiglio, Manuel; Hemberger, Daniel A

    2015-09-18

    Simulating a binary black hole coalescence by solving Einstein's equations is computationally expensive, requiring days to months of supercomputing time. Using reduced order modeling techniques, we construct an accurate surrogate model, which is evaluated in a millisecond to a second, for numerical relativity (NR) waveforms from nonspinning binary black hole coalescences with mass ratios in [1, 10] and durations corresponding to about 15 orbits before merger. We assess the model's uncertainty and show that our modeling strategy predicts NR waveforms not used for the surrogate's training with errors nearly as small as the numerical error of the NR code. Our model includes all spherical-harmonic ${}_{-2}Y_{\ell m}$ waveform modes resolved by the NR code up to $\ell = 8$. We compare our surrogate model to effective one body waveforms from $50\,M_{\odot}$ to $300\,M_{\odot}$ for advanced LIGO detectors and find that the surrogate is always more faithful (by at least an order of magnitude in most cases).

  9. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    Science.gov (United States)

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparable applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation in the first week of admission and again six months later. All data were primarily analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.

  10. Hidden slow pulsars in binaries

    Science.gov (United States)

    Tavani, Marco; Brookshaw, Leigh

    1993-01-01

    The recent discovery of the binary containing the slow pulsar PSR 1718-19 orbiting around a low-mass companion star adds new light on the characteristics of binary pulsars. The properties of the radio eclipses of PSR 1718-19 are the most striking observational characteristics of this system. The surface of the companion star produces a mass outflow which leaves only a small 'window' in orbital phase for the detection of PSR 1718-19 around 400 MHz. At this observing frequency, PSR 1718-19 is clearly observable only for about 1 hr out of the total 6.2 hr orbital period. The aim of this Letter is twofold: (1) to model the hydrodynamical behavior of the eclipsing material from the companion star of PSR 1718-19 and (2) to argue that a population of binary slow pulsars might have escaped detection in pulsar surveys carried out at 400 MHz. The possible existence of a population of partially or totally hidden slow pulsars in binaries will have a strong impact on current theories of binary evolution of neutron stars.

  11. A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield

    Science.gov (United States)

    Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan

    2018-04-01

    In this paper, we propose a hybrid model that combines a multiple linear regression model with the fuzzy c-means method. This research involved the relationship between 20 topsoil variates, analyzed prior to planting, and paddy yields at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granaries in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model and a combination of the multiple linear regression model and the fuzzy c-means method. Analyses of normality and multicollinearity indicate that the data are normally distributed without multicollinearity among the independent variables. Fuzzy c-means analysis clusters the paddy yield into two clusters before the multiple linear regression model is applied. The comparison between the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model, with a lower mean square error.

  12. Formation of planetary nebulae with close binary nuclei

    Energy Technology Data Exchange (ETDEWEB)

    Livio, M; Salzman, J; Shaviv, G [Tel Aviv Univ. (Israel). Dept. of Physics and Astronomy

    1979-07-01

    A model for the formation of planetary nebulae with a close binary as a nucleus is presented. The model is based on mass loss instability at L₂. The instability is demonstrated. The conditions on the mass loss are formulated and analysed. The observational consequence of the model is described briefly and its relation to symbiotic stars and cataclysmic binaries discussed.

  13. Observational properties of models of semidetached close binaries. Pt. 2

    International Nuclear Information System (INIS)

    Giannone, P.; Giannuzzi, M.A.; Pucillo, M.

    1975-01-01

    Binaries of Cases A and B with intermediate and small masses have been studied. Synthetic light curves are shown to be affected mainly by the assumption concerning the shape of the components. The comparison between synthetic light curves and observed data can give further information on the reliability of the hypotheses assumed in the computations of binary star evolution. The calculated properties lead to useful indications about the evolutionary stages of observed binaries. The detection of systems evolving according to Case A appears to be favoured in comparison with that of systems of Case B. Systems with undersize subgiants prove comparatively difficult to observe.

  14. Direct modeling of regression effects for transition probabilities in the progressive illness-death model

    DEFF Research Database (Denmark)

    Azarang, Leyla; Scheike, Thomas; de Uña-Álvarez, Jacobo

    2017-01-01

    In this work, we present direct regression analysis for the transition probabilities in the possibly non-Markov progressive illness–death model. The method is based on binomial regression, where the response is the indicator of the occupancy for the given state along time. Randomly weighted score...

  15. An epidemiological survey on road traffic crashes in Iran: application of the two logistic regression models.

    Science.gov (United States)

    Bakhtiyari, Mahmood; Mehmandar, Mohammad Reza; Mirbagheri, Babak; Hariri, Gholam Reza; Delpisheh, Ali; Soori, Hamid

    2014-01-01

    Risk factors of human-related traffic crashes are the most important and preventable challenges for community health due to their noteworthy burden in developing countries in particular. The present study aims to investigate the role of human risk factors of road traffic crashes in Iran. Through a cross-sectional study using the COM 114 data collection forms, the police records of almost 600,000 crashes that occurred in 2010 are investigated. The binary logistic regression and proportional odds regression models are used. The odds ratio for each risk factor is calculated. These models are adjusted for known confounding factors including age, sex and driving time. The traffic crash reports of 537,688 men (90.8%) and 54,480 women (9.2%) are analysed. The mean age is 34.1 ± 14 years. Not maintaining eyes on the road (53.7%) and losing control of the vehicle (21.4%) are the main causes of drivers' deaths in traffic crashes within cities. Not maintaining eyes on the road is also the most frequent human risk factor for road traffic crashes outside cities. Sudden lane excursion (OR = 9.9, 95% CI: 8.2-11.9) and seat belt non-compliance (OR = 8.7, CI: 6.7-10.1), exceeding authorised speed (OR = 17.9, CI: 12.7-25.1) and exceeding safe speed (OR = 9.7, CI: 7.2-13.2) are the most significant human risk factors for traffic crashes in Iran. The high mortality rate of 39 people for every 100,000 population emphasises the importance of traffic crashes in Iran. Considering the important role of human risk factors in traffic crashes, sustained efforts are required to control dangerous driving behaviours such as exceeding speed, illegal overtaking and not maintaining eyes on the road.

  16. Automatic feed phase identification in multivariate bioprocess profiles by sequential binary classification.

    Science.gov (United States)

    Nikzad-Langerodi, Ramin; Lughofer, Edwin; Saminger-Platz, Susanne; Zahel, Thomas; Sagmeister, Patrick; Herwig, Christoph

    2017-08-22

    In this paper, we propose a new strategy for retrospective identification of feed phases from online sensor-data enriched feed profiles of an Escherichia coli (E. coli) fed-batch fermentation process. In contrast to conventional (static), data-driven multi-class machine learning (ML), we exploit process knowledge in order to constrain our classification system, yielding more parsimonious models compared to static ML approaches. In particular, we enforce unidirectionality on a set of binary, multivariate classifiers trained to discriminate between adjacent feed phases by linking the classifiers through a one-way switch. The switch is activated when the output of the current classifier changes. As a consequence, the next binary classifier in the classifier chain is used for the discrimination between the next feed phase pair, and so on. We allow activation of the switch only after a predefined number of consecutive predictions of a transition event in order to prevent premature activation of the switch, and undertake a sensitivity analysis regarding the optimal choice of the (time) lag parameter. From a complexity/parsimony perspective the benefit of our approach is three-fold: i) the multi-class learning task is broken down into binary subproblems which usually have simpler decision surfaces and tend to be less susceptible to the class-imbalance problem; ii) we exploit the fact that the process follows a rigid feed cycle structure (i.e. batch-feed-batch-feed), which allows us to focus on the subproblems involving phase transitions as they occur during the process while discarding off-transition classifiers; and iii) only one binary classifier is active at a time, which keeps effective model complexity low. We further use a combination of logistic regression and Lasso (i.e. regularized logistic regression, RLR) as a wrapper to extract the most relevant features for individual subproblems from the whole set of high-dimensional sensor data. We train different soft computing classifiers
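
    The one-way switch described in this abstract can be sketched as a short loop. This is a hedged illustration rather than the authors' code: the function name `identify_phases`, the threshold classifiers and the sample values are all hypothetical. Only one binary classifier is consulted at a time, and the chain advances to the next phase only after `lag` consecutive predictions of a transition.

```python
def identify_phases(samples, classifiers, lag=3):
    """Label each sample with a phase index using a one-way classifier chain.

    classifiers[k](x) returns True when sample x looks like the transition
    from phase k to phase k + 1. The switch never moves backwards.
    """
    phase = 0
    streak = 0  # consecutive transition predictions from the active classifier
    labels = []
    for x in samples:
        if phase < len(classifiers):
            if classifiers[phase](x):
                streak += 1
            else:
                streak = 0  # an interrupted streak does not count
            if streak >= lag:
                phase += 1  # activate the switch and move to the next classifier
                streak = 0
        labels.append(phase)
    return labels


# Hypothetical threshold "classifiers" on a single sensor value.
chain = [lambda x: x > 1.0, lambda x: x < 0.5]
signal = [0.2, 1.5, 0.3, 1.2, 1.3, 1.4, 1.1, 0.4, 0.3, 0.2, 0.1]
print(identify_phases(signal, chain, lag=3))
```

    In this example the isolated spike at the second sample does not trigger the switch, because the streak is reset before it reaches the lag of three.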

  17. Targeted maximum likelihood estimation for a binary treatment: A tutorial.

    Science.gov (United States)

    Luque-Fernandez, Miguel Angel; Schomaker, Michael; Rachet, Bernard; Schnitzer, Mireille E

    2018-04-23

    When estimating the average effect of a binary treatment (or exposure) on an outcome, methods that incorporate propensity scores, the G-formula, or targeted maximum likelihood estimation (TMLE) are preferred over naïve regression approaches, which are biased under misspecification of a parametric outcome model. In contrast propensity score methods require the correct specification of an exposure model. Double-robust methods only require correct specification of either the outcome or the exposure model. Targeted maximum likelihood estimation is a semiparametric double-robust method that improves the chances of correct model specification by allowing for flexible estimation using (nonparametric) machine-learning methods. It therefore requires weaker assumptions than its competitors. We provide a step-by-step guided implementation of TMLE and illustrate it in a realistic scenario based on cancer epidemiology where assumptions about correct model specification and positivity (ie, when a study participant had 0 probability of receiving the treatment) are nearly violated. This article provides a concise and reproducible educational introduction to TMLE for a binary outcome and exposure. The reader should gain sufficient understanding of TMLE from this introductory tutorial to be able to apply the method in practice. Extensive R-code is provided in easy-to-read boxes throughout the article for replicability. Stata users will find a testing implementation of TMLE and additional material in the Appendix S1 and at the following GitHub repository: https://github.com/migariane/SIM-TMLE-tutorial. © 2018 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

  18. SPSS macros to compare any two fitted values from a regression model.

    Science.gov (United States)

    Weaver, Bruce; Dubois, Sacha

    2012-12-01

    In regression models with first-order terms only, the coefficient for a given variable is typically interpreted as the change in the fitted value of Y for a one-unit increase in that variable, with all other variables held constant. Therefore, each regression coefficient represents the difference between two fitted values of Y. But the coefficients represent only a fraction of the possible fitted value comparisons that might be of interest to researchers. For many fitted value comparisons that are not captured by any of the regression coefficients, common statistical software packages do not provide the standard errors needed to compute confidence intervals or carry out statistical tests-particularly in more complex models that include interactions, polynomial terms, or regression splines. We describe two SPSS macros that implement a matrix algebra method for comparing any two fitted values from a regression model. The !OLScomp and !MLEcomp macros are for use with models fitted via ordinary least squares and maximum likelihood estimation, respectively. The output from the macros includes the standard error of the difference between the two fitted values, a 95% confidence interval for the difference, and a corresponding statistical test with its p-value.
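
    The matrix algebra behind such a comparison is compact enough to sketch. The Python below is an illustration, not a port of the SPSS macros: for two covariate vectors x1 and x2, the difference in fitted values is (x1 - x2)'b and its variance is (x1 - x2)'V(x1 - x2), where b holds the coefficients and V their covariance matrix. The coefficients, the covariance matrix and the normal critical value 1.96 are assumptions for the example; an OLS analysis would normally use a t quantile instead.

```python
import math


def compare_fitted_values(x1, x2, coefs, cov, crit=1.96):
    """Difference between two fitted values, its standard error and a CI."""
    d = [a - b for a, b in zip(x1, x2)]
    diff = sum(di * bi for di, bi in zip(d, coefs))
    # Quadratic form d' V d gives the variance of the difference.
    var = sum(
        d[i] * cov[i][j] * d[j] for i in range(len(d)) for j in range(len(d))
    )
    se = math.sqrt(var)
    return diff, se, (diff - crit * se, diff + crit * se)


# Hypothetical quadratic model y = b0 + b1*x + b2*x^2, comparing x = 3 with x = 1.
coefs = [1.0, 2.0, 0.5]
cov = [
    [0.04, 0.0, 0.0],
    [0.0, 0.01, -0.002],
    [0.0, -0.002, 0.001],
]
diff, se, ci = compare_fitted_values([1, 3, 9], [1, 1, 1], coefs, cov)
print(round(diff, 3), round(se, 3), tuple(round(v, 3) for v in ci))
```

    Because the model contains a polynomial term, this difference is not equal to any single coefficient, which is the situation the abstract describes.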

  19. Parameter estimation and statistical test of geographically weighted bivariate Poisson inverse Gaussian regression models

    Science.gov (United States)

    Amalia, Junita; Purhadi; Otok, Bambang Widjanarko

    2017-11-01

    The Poisson distribution is a discrete distribution for count data, with a single parameter that defines both the mean and the variance. Poisson regression therefore assumes that the mean and variance are equal (equidispersion). Nonetheless, some count data violate this assumption because the variance exceeds the mean (over-dispersion). Ignoring over-dispersion leads to underestimated standard errors and, in turn, to incorrect decisions in statistical tests. Paired count data that are correlated follow a bivariate Poisson distribution. If there is over-dispersion, simple bivariate Poisson regression is not sufficient for modeling paired count data. The Bivariate Poisson Inverse Gaussian Regression (BPIGR) model is a mixed Poisson regression for modeling over-dispersed paired count data. The BPIGR model produces a single global model for all locations. On the other hand, each location has different geographic, social, cultural and economic conditions, so Geographically Weighted Regression (GWR) is needed. The weighting function of each location in GWR generates a different local model. The Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (GWBPIGR) model is used to handle over-dispersion and to generate local models. Parameters of the GWBPIGR model are estimated by the Maximum Likelihood Estimation (MLE) method, and hypotheses are tested with the Maximum Likelihood Ratio Test (MLRT) method.
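
    The equidispersion assumption can be checked informally before any model is fitted. The following Python sketch computes the ratio of sample variance to sample mean, which is close to 1 for Poisson-like counts and well above 1 under over-dispersion. The counts are made up for illustration, and this simple index is a common diagnostic rather than the test used in the paper.

```python
def dispersion_index(counts):
    """Ratio of the sample variance to the sample mean of count data."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


# Hypothetical counts with a few large values, as over-dispersed data often have.
counts = [0, 1, 0, 2, 9, 0, 1, 12, 0, 5]
print(round(dispersion_index(counts), 2))
```

    A value far above 1, as here, suggests that a plain Poisson model would understate the standard errors.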

  20. Formation and Evolution of X-ray Binaries

    Science.gov (United States)

    Shao, Y.

    2017-07-01

    X-ray binaries are a class of binary systems, in which the accretor is a compact star (i.e., black hole, neutron star, or white dwarf). They are one of the most important objects in the universe, which can be used to study not only binary evolution but also accretion disks and compact stars. Statistical investigations of these binaries help to understand the formation and evolution of galaxies, and sometimes provide useful constraints on the cosmological models. The goal of this thesis is to investigate the formation and evolution processes of X-ray binaries including Be/X-ray binaries, low-mass X-ray binaries (LMXBs), ultraluminous X-ray sources (ULXs), and cataclysmic variables. In Chapter 1 we give a brief review on the basic knowledge of the binary evolution. In Chapter 2 we discuss the formation of Be stars through binary interaction. In this chapter we investigate the formation of Be stars resulting from mass transfer in binaries in the Galaxy. Using binary evolution and population synthesis calculations, we find that in Be/neutron star binaries the Be stars have a lower limit of mass ~8 M⊙ if they are formed by a stable (i.e., without the occurrence of common envelope evolution) and nonconservative mass transfer. We demonstrate that the isolated Be stars may originate from both mergers of two main-sequence stars and disrupted Be binaries during the supernova explosions of the primary stars, but mergers seem to play a much more important role. Finally the fraction of Be stars produced by binary interactions in all B type stars can be as high as ~13%-30%, implying that most of Be stars may result from binary interaction. In Chapter 3 we show the evolution of intermediate- and low-mass X-ray binaries (I/LMXBs) and the formation of millisecond pulsars. Comparing the calculated results with the observations of binary radio pulsars, we report the following results: (1) The allowed parameter space for forming binary pulsars in the initial orbital period

  1. A Statistical Model for Misreported Binary Outcomes in Clustered RCTs of Education Interventions

    Science.gov (United States)

    Schochet, Peter Z.

    2013-01-01

    In education randomized control trials (RCTs), the misreporting of student outcome data could lead to biased estimates of average treatment effects (ATEs) and their standard errors. This article discusses a statistical model that adjusts for misreported binary outcomes for two-level, school-based RCTs, where it is assumed that misreporting could…

  2. Can We Use Regression Modeling to Quantify Mean Annual Streamflow at a Global-Scale?

    Science.gov (United States)

    Barbarossa, V.; Huijbregts, M. A. J.; Hendriks, J. A.; Beusen, A.; Clavreul, J.; King, H.; Schipper, A.

    2016-12-01

    Quantifying mean annual flow of rivers (MAF) at ungauged sites is essential for a number of applications, including assessments of global water supply, ecosystem integrity and water footprints. MAF can be quantified with spatially explicit process-based models, which might be overly time-consuming and data-intensive for this purpose, or with empirical regression models that predict MAF based on climate and catchment characteristics. Yet, regression models have mostly been developed at a regional scale and the extent to which they can be extrapolated to other regions is not known. In this study, we developed a global-scale regression model for MAF using observations of discharge and catchment characteristics from 1,885 catchments worldwide, ranging from 2 to 10⁶ km² in size. In addition, we compared the performance of the regression model with the predictive ability of the spatially explicit global hydrological model PCR-GLOBWB [van Beek et al., 2011] by comparing results from both models to independent measurements. We obtained a regression model explaining 89% of the variance in MAF based on catchment area, mean annual precipitation and air temperature, average slope and elevation. The regression model performed better than PCR-GLOBWB for the prediction of MAF, as root-mean-square error values were lower (0.29 - 0.38 compared to 0.49 - 0.57) and the modified index of agreement was higher (0.80 - 0.83 compared to 0.72 - 0.75). Our regression model can be applied globally at any point of the river network, provided that the input parameters are within the range of values employed in the calibration of the model. The performance is reduced for water scarce regions and further research should focus on improving such an aspect for regression-based global hydrological models.

  3. Modeling Governance KB with CATPCA to Overcome Multicollinearity in the Logistic Regression

    Science.gov (United States)

    Khikmah, L.; Wijayanto, H.; Syafitri, U. D.

    2017-04-01

    A problem often encountered in logistic regression modeling is multicollinearity. Multicollinearity between explanatory variables biases the parameter estimates and also leads to classification errors. In general, stepwise regression is used to overcome multicollinearity in regression. Another method, which involves all variables in the prediction, is Principal Component Analysis (PCA). However, classical PCA applies only to numeric data. When the data are categorical, one method to solve the problem is Categorical Principal Component Analysis (CATPCA). The data used in this research were part of the 2012 Demographic and Population Survey Indonesia (IDHS). This research focuses on the characteristics of women using contraceptive methods. Classification results were evaluated using Area Under Curve (AUC) values; the higher the AUC value, the better. Based on AUC values, the classification of the contraceptive method using the stepwise method (58.66%) is better than the logistic regression model (57.39%) and CATPCA (57.39%). Evaluating the results using sensitivity shows the opposite: the CATPCA method (99.79%) is better than the logistic regression method (92.43%) and stepwise (92.05%). Because this study focuses on classification of the major class (using a contraceptive method), the selected model is CATPCA, since it can raise the accuracy for the major class.
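
    The two evaluation measures named in this abstract answer different questions, which is why they can rank models differently. The Python sketch below computes both on made-up scores and labels (all values are hypothetical): AUC is taken as the share of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half, and sensitivity is the share of positives at or above a chosen threshold.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity(scores, labels, threshold=0.5):
    """Share of true positives among all positive cases at a threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    detected = sum(1 for s in positives if s >= threshold)
    return detected / len(positives)


scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 1]
print(auc(scores, labels), sensitivity(scores, labels))
```

    AUC is independent of the threshold while sensitivity depends on it, so a model that classifies nearly everything as the major class can reach very high sensitivity with only a modest AUC.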

  4. Time series modeling by a regression approach based on a latent process.

    Science.gov (United States)

    Chamroukhi, Faicel; Samé, Allou; Govaert, Gérard; Aknin, Patrice

    2009-01-01

    Time series are used in many domains including finance, engineering, economics and bioinformatics generally to represent the change of a measurement over time. Modeling techniques may then be used to give a synthetic representation of such data. A new approach for time series modeling is proposed in this paper. It consists of a regression model incorporating a discrete hidden logistic process allowing for activating smoothly or abruptly different polynomial regression models. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The M step of the EM algorithm uses a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm to estimate the hidden process parameters. To evaluate the proposed approach, an experimental study on simulated data and real world data was performed using two alternative approaches: a heteroskedastic piecewise regression model using a global optimization algorithm based on dynamic programming, and a Hidden Markov Regression Model whose parameters are estimated by the Baum-Welch algorithm. Finally, in the context of the remote monitoring of components of the French railway infrastructure, and more particularly the switch mechanism, the proposed approach has been applied to modeling and classifying time series representing the condition measurements acquired during switch operations.

  5. Introduction to generalized linear models

    CERN Document Server

    Dobson, Annette J

    2008-01-01

    Introduction: Background; Scope; Notation; Distributions Related to the Normal Distribution; Quadratic Forms; Estimation. Model Fitting: Introduction; Examples; Some Principles of Statistical Modeling; Notation and Coding for Explanatory Variables. Exponential Family and Generalized Linear Models: Introduction; Exponential Family of Distributions; Properties of Distributions in the Exponential Family; Generalized Linear Models; Examples. Estimation: Introduction; Example: Failure Times for Pressure Vessels; Maximum Likelihood Estimation; Poisson Regression Example. Inference: Introduction; Sampling Distribution for Score Statistics; Taylor Series Approximations; Sampling Distribution for MLEs; Log-Likelihood Ratio Statistic; Sampling Distribution for the Deviance; Hypothesis Testing. Normal Linear Models: Introduction; Basic Results; Multiple Linear Regression; Analysis of Variance; Analysis of Covariance; General Linear Models. Binary Variables and Logistic Regression: Probability Distributions ...

  6. Regression to Causality : Regression-style presentation influences causal attribution

    DEFF Research Database (Denmark)

    Bordacconi, Mats Joe; Larsen, Martin Vinæs

    2014-01-01

    of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...... models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results...... more likely. Our experiment drew on a sample of 235 university students from three different social science degree programs (political science, sociology and economics), all of whom had received substantial training in statistics. The subjects were asked to compare and evaluate the validity...

  7. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

    Science.gov (United States)

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-12-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

  8. Modeling and analysis of periodic orbits around a contact binary asteroid

    Science.gov (United States)

    Feng, Jinglang; Noomen, Ron; Visser, Pieter N. A. M.; Yuan, Jianping

    2015-06-01

    The existence and characteristics of periodic orbits (POs) in the vicinity of a contact binary asteroid are investigated with an averaged spherical harmonics model. A contact binary asteroid consists of two components connected to each other, resulting in a highly bifurcated shape. Here, it is represented by a combination of an ellipsoid and a sphere. The gravitational field of this configuration is for the first time expanded into a spherical harmonics model up to degree and order 8. Compared with the exact potential, the truncation at degree and order 4 is found to introduce an error of less than 10 % at the circumscribing sphere and less than 1 % at a distance of twice the reference radius. The Hamiltonian taking into account harmonics up to degree and order 4 is developed. After double averaging of this Hamiltonian, the model is reduced to include zonal harmonics only and frozen orbits are obtained. The tesseral terms are found to introduce significant variations on the frozen orbits and distort the frozen situation. Applying the method of Poincaré sections, phase space structures of the single-averaged model are generated for different energy levels and rotation rates of the asteroid, from which the dynamics driven by the 4×4 harmonics model is identified and POs are found. It is found that the disturbing effect of the highly irregular gravitational field on orbital motion is weakened around the polar region, and also for an asteroid with a fast rotation rate. Starting with initial conditions from this averaged model, families of exact POs in the original non-averaged system are obtained employing a numerical search method and a continuation technique. Some of these POs are stable and are candidates for future missions.

  9. Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.

    Science.gov (United States)

    Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H

    2016-01-01

    Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias and confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite better performance of disease risk score methods than logistic regression and propensity score models in small events per coefficient settings, bias and coverage still deviated from nominal.

  10. Hadronic model for the non-thermal radiation from the binary system AR Scorpii

    Science.gov (United States)

    Bednarek, W.

    2018-05-01

    AR Scorpii is a close binary system containing a rotation powered white dwarf and a low-mass M type companion star. This system shows non-thermal emission extending up to the X-ray energy range. We consider hybrid (lepto-hadronic) and pure hadronic models for the high energy non-thermal processes in this binary system. Relativistic electrons and hadrons are assumed to be accelerated in a strongly magnetised, turbulent region formed in collision of a rotating white dwarf magnetosphere and a magnetosphere/dense atmosphere of the M-dwarf star. We propose that the non-thermal X-ray emission is produced either by the primary electrons or the secondary e± pairs from decay of charged pions created in collisions of hadrons with the companion star atmosphere. We show that the accompanying γ-ray emission from decay of neutral pions, which are produced by these same protons, is expected to be on the detectability level of the present and/or the future satellite and Cherenkov telescopes. The γ-ray observations of the binary system AR Sco should allow us to constrain the efficiency of hadron and electron acceleration and also the details of the radiation processes.

  11. APPLICATION OF A LATTICE GAS MODEL FOR SUBPIXEL PROCESSING OF LOW-RESOLUTION IMAGES OF BINARY STRUCTURES

    Directory of Open Access Journals (Sweden)

    Zbisław Tabor

    2011-05-01

    In the study an algorithm based on a lattice gas model is proposed as a tool for enhancing the quality of low-resolution images of binary structures. Analyzed low-resolution gray-level images are replaced with binary images, in which pixel size is decreased. The intensity in the pixels of these new images is determined by corresponding gray-level intensities in the original low-resolution images. Then the white phase pixels in the binary images are assumed to be particles interacting with one another, interacting with a properly defined external field and allowed to diffuse. The evolution is driven towards a state with maximal energy by the Metropolis algorithm. This state is used to estimate the imaged object. The performance of the proposed algorithm is compared with that of local and global thresholding methods.

  12. (Liquid + liquid) equilibrium for binary systems of N-formylmorpholine with alkanes

    International Nuclear Information System (INIS)

    Wang Zhengrong; Xia Shuqian; Ma Peisheng; Liu Tao; Han Kewei

    2012-01-01

    Highlights: ► The LLE data of four binary systems containing N-formylmorpholine were measured. ► Both NRTL and UNIQUAC models can fit the experimental data well. ► The new group interaction parameters of UNIFAC (Do) were regressed from the LLE data. ► The estimated result shows that the group interaction parameters and methods are reliable. - Abstract: (Liquid + liquid) equilibrium (LLE) data were determined for four binary systems containing N-formylmorpholine (NFM) and alkanes (3-methylpentane, heptane, nonane, and 2,2,4-trimethylpentane) over the temperature range from around 300 K to near 420 K using a set of newly designed equilibrium equipment. The compositions of both light and heavy phases were analyzed by gas chromatography. The mutual solubility increased as the temperature increased for all these systems. The binary (liquid + liquid) equilibrium data were correlated by the NRTL and UNIQUAC equations with temperature-dependent parameters. Both models correlate the experimental results well. Furthermore, the UNIFAC (Do) group contribution model was used to correlate and estimate the LLE data for NFM containing systems. Two methods of group division for NFM were used. NFM is treated as a single group: NFM group (method I) or divided into two groups: CHO and C4H8NO (method II), respectively. The group interaction parameters for CH2–NFM, or CH2–CHO and CH2–C4H8NO were fitted from the experimental LLE data. The UNIFAC (Do) model correlates the experimental data well. In addition, in order to develop UNIFAC (Do) group contribution model to estimate the LLE data of (NFM + cycloalkane) systems, some literature LLE data were used. The group interaction parameters for c-CH2–NFM, c-CH2–CHO and c-CH2–C4H8NO were correlated. Then these group interaction parameters were used to estimate the phase equilibrium data of binary systems in the literature by the UNIFAC (Do) model. The results showed that the estimated values are in

  13. Regression Model to Predict Global Solar Irradiance in Malaysia

    Directory of Open Access Journals (Sweden)

    Hairuniza Ahmed Kutty

    2015-01-01

    A novel regression model is developed to estimate the monthly global solar irradiance in Malaysia. The model is developed based on different available meteorological parameters, including temperature, cloud cover, rain precipitate, relative humidity, wind speed, pressure, and gust speed, by implementing regression analysis. This paper reports on the details of the analysis of the effect of each prediction parameter to identify the parameters that are relevant to estimating global solar irradiance. In addition, the proposed model is compared in terms of the root mean square error (RMSE), mean bias error (MBE), and the coefficient of determination (R2) with other models available from literature studies. Seven models based on single parameters (PM1 to PM7) and five multiple-parameter models (PM7 to PM12) are proposed. The new models perform well, with RMSE ranging from 0.429% to 1.774%, R2 ranging from 0.942 to 0.992, and MBE ranging from −0.1571% to 0.6025%. In general, cloud cover significantly affects the estimation of global solar irradiance. However, cloud cover in Malaysia lacks sufficient influence when included into multiple-parameter models although it performs fairly well in single-parameter prediction models.
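
    The three comparison metrics named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the abstract reports RMSE and MBE in percent without stating the normalization, so the sketch returns the plain (unnormalized) values.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

def mbe(observed, predicted):
    """Mean bias error: positive when the model over-predicts on average."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

    Taken together, the three values separate random scatter (RMSE), systematic over- or under-prediction (MBE), and the share of variance explained (R2).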

  14. Polynomial regression analysis and significance test of the regression function

    International Nuclear Information System (INIS)

    Gao Zhengming; Zhao Juan; He Shengping

    2012-01-01

    In order to analyze the decay heating power of a certain radioactive isotope per kilogram with polynomial regression method, the paper firstly demonstrated the broad usage of polynomial function and deduced its parameters with ordinary least squares estimate. Then significance test method of polynomial regression function is derived considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and significance test of the polynomial function are done to the decay heating power of the isotope per kilogram in accord with the authors' real work. (authors)

  15. Logistic regression for risk factor modelling in stuttering research.

    Science.gov (United States)

    Reed, Phil; Wu, Yaqionq

    2013-06-01

    To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed is demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article you will: (a) Summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) Follow the steps in performing a logistic regression analysis; (c) Describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) Be able to summarize its advantages over other techniques like estimation of group differences and simple regression. Copyright © 2012 Elsevier Inc. All rights reserved.

  16. Efficient estimation of an additive quantile regression model

    NARCIS (Netherlands)

    Cheng, Y.; de Gooijer, J.G.; Zerom, D.

    2011-01-01

    In this paper, two non-parametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a more viable alternative to existing kernel-based approaches. The second estimator

  17. Thermodynamic models for determination of solid–liquid equilibrium of the 6-benzyladenine in pure and binary organic solvents

    International Nuclear Information System (INIS)

    Li, Tao; Deng, Renlun; Wu, Gang; Gu, Pengfei; Hu, Yonghong; Yang, Wenge; Yu, Yemin; Zhang, Yuhao; Yang, Chen

    2017-01-01

    Highlights: • The solubility increased with increasing temperature. • Data were fitted using the modified Apelblat equation and other models in pure solvents. • Data were fitted using the modified Apelblat equation and other models in binary solvent mixture. - Abstract: Data on corresponding solid–liquid equilibrium of 6-benzyladenine in different solvents are essential for a preliminary study of industrial applications. In this paper, the solid–liquid equilibrium of 6-benzyladenine in methanol, ethanol, 1-butanol, acetone, acetonitrile, ethyl acetate, dimethyl formamide and tetrahydrofuran pure solvents and (dimethyl formamide + acetone) mixture solvents was explored within the temperature range from (278.15 to 333.15) K under 0.1 MPa. For the temperature range investigated, the solubility of 6-benzyladenine in the solvents increased with increasing temperature. The solubility of 6-benzyladenine in dimethyl formamide is superior to other selected pure solvents. The modified Apelblat model, the Buchowski-Ksiazczak λh model, and the ideal model were adopted to describe and predict the change tendency of solubility. Computational results showed that the modified Apelblat model has more advantages than the other two models. The solubility results were fitted using a modified Apelblat equation, a variant of the combined nearly ideal binary solvent/Redlich-Kister (CNIBS/R-K) model, Jouyban-Acree model and Ma model in (dimethyl formamide + acetone) binary solvent mixture. Computational results showed that the modified Apelblat model is superior to the other equations.

  18. Fault diagnosis and comparing risk for the steel coil manufacturing process using statistical models for binary data

    International Nuclear Information System (INIS)

    Debón, A.; Carlos Garcia-Díaz, J.

    2012-01-01

    Advanced statistical models can help industry to design more economical and rational investment plans. Fault detection and diagnosis is an important problem in continuous hot dip galvanizing. Increasingly stringent quality requirements in the automotive industry also require ongoing efforts in process control to make processes more robust. Robust methods for estimating the quality of galvanized steel coils are an important tool for the comprehensive monitoring of the performance of the manufacturing process. This study applies different statistical regression models: generalized linear models, generalized additive models and classification trees to estimate the quality of galvanized steel coils on the basis of short time histories. The data, consisting of 48 galvanized steel coils, was divided into sets of conforming and nonconforming coils. Five variables were selected for monitoring the process: steel strip velocity and four bath temperatures. The present paper reports a comparative evaluation of statistical models for binary data using Receiver Operating Characteristic (ROC) curves. A ROC curve is a graph or a technique for visualizing, organizing and selecting classifiers based on their performance. The purpose of this paper is to examine their use in research to obtain the best model to predict defective steel coil probability. In relation to the work of other authors who only propose goodness of fit statistics, we should highlight one distinctive feature of the methodology presented here, which is the possibility of comparing the different models with ROC graphs which are based on model classification performance. Finally, the results are validated by bootstrap procedures.
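
    The ROC-based model comparison described above is often summarized by the area under the ROC curve (AUC). The following is a minimal sketch of that summary, assuming labels coded as 1 (nonconforming) and 0 (conforming) and a predicted defect probability per coil; the function name and the coding are illustrative assumptions, not taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 1/2).
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

    Computing this value for each candidate model on the same coils gives a single number per model; 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.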

  19. Advanced statistics: linear regression, part I: simple linear regression.

    Science.gov (United States)

    Marill, Keith A

    2004-01-01

    Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
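
    The method of least squares mentioned above has a closed-form solution for a single predictor. A minimal sketch, with a hypothetical function name and made-up data in the usage note:

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and cross-products of x and y about their means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

    For example, `fit_simple_linear_regression([0, 1, 2, 3], [1, 3, 5, 7])` recovers the line y = 1 + 2x, since the points lie exactly on it.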

  20. Advanced statistics: linear regression, part II: multiple linear regression.

    Science.gov (United States)

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.

  1. Sample size adjustments for varying cluster sizes in cluster randomized trials with binary outcomes analyzed with second-order PQL mixed logistic regression.

    Science.gov (United States)

    Candel, Math J J M; Van Breukelen, Gerard J P

    2010-06-30

    Adjustments of sample size formulas are given for varying cluster sizes in cluster randomized trials with a binary outcome when testing the treatment effect with mixed effects logistic regression using second-order penalized quasi-likelihood estimation (PQL). Starting from first-order marginal quasi-likelihood (MQL) estimation of the treatment effect, the asymptotic relative efficiency of unequal versus equal cluster sizes is derived. A Monte Carlo simulation study shows this asymptotic relative efficiency to be rather accurate for realistic sample sizes, when employing second-order PQL. An approximate, simpler formula is presented to estimate the efficiency loss due to varying cluster sizes when planning a trial. In many cases sampling 14 per cent more clusters is sufficient to repair the efficiency loss due to varying cluster sizes. Since current closed-form formulas for sample size calculation are based on first-order MQL, planning a trial also requires a conversion factor to obtain the variance of the second-order PQL estimator. In a second Monte Carlo study, this conversion factor turned out to be 1.25 at most. (c) 2010 John Wiley & Sons, Ltd.

  2. Studying Variance in the Galactic Ultra-compact Binary Population

    Science.gov (United States)

    Larson, Shane; Breivik, Katelyn

    2017-01-01

    In the years preceding LISA, Milky Way compact binary population simulations can be used to inform the science capabilities of the mission. Galactic population simulation efforts generally focus on high fidelity models that require extensive computational power to produce a single simulated population for each model. Each simulated population represents an incomplete sample of the functions governing compact binary evolution, thus introducing variance from one simulation to another. We present a rapid Monte Carlo population simulation technique that can simulate thousands of populations on week-long timescales, thus allowing a full exploration of the variance associated with a binary stellar evolution model.

  3. Crime Modeling using Spatial Regression Approach

    Science.gov (United States)

    Saleh Ahmar, Ansari; Adiatma; Kasim Aidid, M.

    2018-01-01

    Criminal acts in Indonesia increase in both variety and quantity every year. Murder, rape, assault, vandalism, theft, fraud, fencing, and other crimes make people feel unsafe. The risk of society being exposed to crime is measured by the number of cases reported to the police: the more cases reported to the police, the higher the number of crimes in the region. This research models criminality in South Sulawesi, Indonesia, using society's exposure to the risk of crime as the dependent variable. The modelling follows an area approach using the Spatial Autoregressive (SAR) and Spatial Error Model (SEM) methods. The independent variables are population density, the number of poor people, GDP per capita, unemployment and the human development index (HDI). The spatial regression analysis shows that there is no spatial dependence, in either lag or error, in South Sulawesi.

  4. Performance analysis and binary working fluid selection of combined flash-binary geothermal cycle

    International Nuclear Information System (INIS)

    Zeyghami, Mehdi

    2015-01-01

    Performance of the combined flash-binary geothermal power cycle for geofluid temperatures between 150 and 250 °C is studied. A thermodynamic model is developed, and the suitable binary working fluids for different geofluid temperatures are identified from a list of thirty working fluid candidates, consisting of environmentally friendly refrigerants and hydrocarbons. The overall system exergy destruction and Vapor Expansion Ratio across the binary cycle turbine are selected as key performance indicators. The results show that for low-temperature heat sources, using refrigerants as binary working fluids results in higher overall cycle efficiency, and for medium and high-temperature resources, hydrocarbons are more suitable. For the combined flash-binary cycle, the secondary working fluids R-152a, Butane and Cis-butane show the best performances at geofluid temperatures of 150, 200 and 250 °C respectively. The overall second law efficiency is calculated as high as 0.48, 0.55 and 0.58 for geofluid temperatures equal to 150, 200 and 250 °C respectively. The flash separator pressure is found to have important effects on cycle operation and performance. Separator pressure dictates the work production share of the steam and binary parts of the system, and there is an optimal separator pressure at which the overall exergy destruction of the cycle achieves its minimum value. - Highlights: • Performance of the combined flash-binary geothermal cycle is investigated. • Thirty different fluids are screened to find the most suitable ORC working fluid. • Optimum cycle operation conditions presented for geofluids between 150 °C and 250 °C. • Refrigerants are more suitable for the ORC at geothermal sources temperature ≤200 °C. • Hydrocarbons are more suitable for the ORC at geothermal sources temperature >200 °C

  5. Determining factors influencing survival of breast cancer by fuzzy logistic regression model.

    Science.gov (United States)

    Nikbakht, Roya; Bahrampour, Abbas

    2017-01-01

    The fuzzy logistic regression model can be used for determining influential factors of disease. This study explores the important predictive factors for survival of patients with breast cancer. We used breast cancer data collected by the cancer registry of Kerman University of Medical Sciences during the period 2000-2007. Variables such as morphology, grade, age, and treatments (surgery, radiotherapy, and chemotherapy) were applied in the fuzzy logistic regression model. Performance of the model was determined in terms of mean degree of membership (MDM). The study results showed that almost 41% of patients were in the neoplasm and malignant group and more than two-thirds of them were still alive after 5-year follow-up. Based on the fuzzy logistic model, the most important factors influencing survival were chemotherapy, morphology, and radiotherapy, respectively. Furthermore, the MDM criterion shows that the fuzzy logistic regression has a good fit to the data (MDM = 0.86). The fuzzy logistic regression model showed that chemotherapy is more important than radiotherapy in survival of patients with breast cancer. In addition, another ability of this model is calculating possibilistic odds of survival in cancer patients. The results of this study can be applied in clinical research. Furthermore, there are few studies which have applied fuzzy logistic models, and we recommend using this model in various research areas.

  6. Efficient estimation of an additive quantile regression model

    NARCIS (Netherlands)

    Cheng, Y.; de Gooijer, J.G.; Zerom, D.

    2009-01-01

    In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By

  7. Efficient estimation of an additive quantile regression model

    NARCIS (Netherlands)

    Cheng, Y.; de Gooijer, J.G.; Zerom, D.

    2010-01-01

    In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By

  8. Boosted beta regression.

    Directory of Open Access Journals (Sweden)

    Matthias Schmid

    Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
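
    The classical maximum likelihood fit mentioned above maximizes a beta log-likelihood whose mean is tied to the covariates through a logit link. The sketch below evaluates that log-likelihood for one covariate. It assumes the common mean/precision parameterization (shape parameters mu*phi and (1 - mu)*phi), which the abstract does not spell out; all names are illustrative, and the boosting procedure itself is not shown.

```python
import math

def logistic(eta):
    """Inverse logit link: maps a linear predictor to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def beta_loglik(y, mu, phi):
    """Log-density of Beta(mu * phi, (1 - mu) * phi) at y, with 0 < y < 1."""
    a = mu * phi
    b = (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

def total_loglik(ys, xs, intercept, slope, phi):
    """Sum of log-densities with mean logistic(intercept + slope * x)."""
    return sum(beta_loglik(y, logistic(intercept + slope * x), phi)
               for y, x in zip(ys, xs))
```

    Under this parameterization a larger phi means a smaller variance for a given mean, which is why the variance can be modeled alongside the mean by letting phi depend on covariates.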

  9. Using the classical linear regression model in analysis of the dependences of conveyor belt life

    Directory of Open Access Journals (Sweden)

    Miriam Andrejiová

    2013-12-01

    The paper deals with the classical linear regression model of the dependence of conveyor belt life on some selected parameters: thickness of paint layer, width and length of the belt, conveyor speed and quantity of transported material. The first part of the article is about regression model design, point and interval estimation of parameters, verification of statistical significance of the model, and about the parameters of the proposed regression model. The second part of the article deals with identification of influential and extreme values that can have an impact on estimation of regression model parameters. The third part focuses on assumptions of the classical regression model, i.e. on verification of independence assumptions, normality and homoscedasticity of residuals.

  10. Modeling the frequency of opposing left-turn conflicts at signalized intersections using generalized linear regression models.

    Science.gov (United States)

    Zhang, Xin; Liu, Pan; Chen, Yuguang; Bai, Lu; Wang, Wei

    2014-01-01

    The primary objective of this study was to identify whether the frequency of traffic conflicts at signalized intersections can be modeled. The opposing left-turn conflicts were selected for the development of conflict predictive models. Using data collected at 30 approaches at 20 signalized intersections, the underlying distributions of the conflicts under different traffic conditions were examined. Different conflict-predictive models were developed to relate the frequency of opposing left-turn conflicts to various explanatory variables. The models considered include a linear regression model, a negative binomial model, and separate models developed for four traffic scenarios. The prediction performance of different models was compared. The frequency of traffic conflicts follows a negative binomial distribution. The linear regression model is not appropriate for the conflict frequency data. In addition, drivers behaved differently under different traffic conditions. Accordingly, the effects of conflicting traffic volumes on conflict frequency vary across different traffic conditions. The occurrences of traffic conflicts at signalized intersections can be modeled using generalized linear regression models. The use of conflict predictive models has potential to expand the uses of surrogate safety measures in safety estimation and evaluation.
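    The abstract above reports that conflict counts follow a negative binomial rather than a linear-regression-friendly distribution. A quick way to see why such a model may be needed is to check for overdispersion, i.e. a variance well above the mean. The sketch below is only an illustration under stated assumptions: the counts are made up, the function name `nb_dispersion` is hypothetical, and the NB2 variance form Var = mu + alpha * mu^2 with a method-of-moments estimate is assumed rather than taken from the paper.

```python
from statistics import mean, variance


def nb_dispersion(counts):
    """Method-of-moments estimate of alpha in Var = mu + alpha * mu**2.

    alpha near 0 suggests a Poisson-like variance; alpha clearly above 0
    suggests overdispersion. Assumes the mean of the counts is non-zero.
    """
    mu = mean(counts)
    var = variance(counts)  # sample variance (n - 1 denominator)
    return max(0.0, (var - mu) / mu ** 2)


# Made-up conflict counts per observation period (not data from the study).
counts = [0, 1, 0, 3, 7, 2, 0, 12, 1, 4]
print("mean:", mean(counts))
print("variance:", variance(counts))
print("alpha estimate:", nb_dispersion(counts))
```

    A full analysis would fit the negative binomial model by maximum likelihood with covariates; the moment estimate here is only a first diagnostic.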

  11. Use of empirical likelihood to calibrate auxiliary information in partly linear monotone regression models.

    Science.gov (United States)

    Chen, Baojiang; Qin, Jing

    2014-05-10

    In statistical analysis, a regression model is needed if one is interested in finding the relationship between a response variable and covariates. When the response depends on the covariate, then it may also depend on the function of this covariate. If one has no knowledge of this functional form but expects it to be monotonically increasing or decreasing, then the isotonic regression model is preferable. Estimation of parameters for isotonic regression models is based on the pool-adjacent-violators algorithm (PAVA), where the monotonicity constraints are built in. With missing data, people often employ the augmented estimating method to improve estimation efficiency by incorporating auxiliary information through a working regression model. However, under the framework of the isotonic regression model, the PAVA does not work as the monotonicity constraints are violated. In this paper, we develop an empirical likelihood-based method for isotonic regression model to incorporate the auxiliary information. Because the monotonicity constraints still hold, the PAVA can be used for parameter estimation. Simulation studies demonstrate that the proposed method can yield more efficient estimates, and in some situations, the efficiency improvement is substantial. We apply this method to a dementia study. Copyright © 2013 John Wiley & Sons, Ltd.
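    The pool-adjacent-violators algorithm mentioned above can be sketched in a few lines. This is a generic textbook version for a non-decreasing least-squares fit, not the empirical-likelihood method of the paper; the function name `pava` and the sample values are illustrative assumptions.

```python
def pava(y, w=None):
    """Least-squares non-decreasing fit to y via pool-adjacent-violators."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand each pooled block back to one fitted value per observation.
    fit = []
    for block_mean, _, n in blocks:
        fit.extend([block_mean] * n)
    return fit


print(pava([1, 3, 2, 4, 3.5]))  # prints [1.0, 2.5, 2.5, 3.75, 3.75]
```

    Adjacent values that break the ordering are replaced by their weighted mean, which is how the monotonicity constraint is built into the estimate.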

  12. Binary Black Hole Mergers from Globular Clusters: Implications for Advanced LIGO.

    Science.gov (United States)

    Rodriguez, Carl L; Morscher, Meagan; Pattabiraman, Bharath; Chatterjee, Sourav; Haster, Carl-Johan; Rasio, Frederic A

    2015-07-31

    The predicted rate of binary black hole mergers from galactic fields can vary over several orders of magnitude and is extremely sensitive to the assumptions of stellar evolution. But in dense stellar environments such as globular clusters, binary black holes form by well-understood gravitational interactions. In this Letter, we study the formation of black hole binaries in an extensive collection of realistic globular cluster models. By comparing these models to observed Milky Way and extragalactic globular clusters, we find that the mergers of dynamically formed binaries could be detected at a rate of ∼100 per year, potentially dominating the binary black hole merger rate. We also find that a majority of cluster-formed binaries are more massive than their field-formed counterparts, suggesting that Advanced LIGO could identify certain binaries as originating from dense stellar environments.

  13. Isothermal (vapour + liquid) equilibrium data for binary systems of (n-hexane + CO2 or CHF3)

    International Nuclear Information System (INIS)

    Williams-Wynn, Mark D.; Naidoo, Paramespri; Ramjugernath, Deresh

    2016-01-01

    Highlights: • (Static-analytic + static-synthetic) phase equilibrium measurements. • Binary VLE data for (CO2 + n-hexane) and (trifluoromethane + n-hexane). • Thermodynamic models were fitted to the experimental data. • Liquid–liquid immiscibility occurred with (trifluoromethane + n-hexane) system. - Abstract: The (vapour + liquid) equilibrium (VLE) was measured for the (carbon dioxide + n-hexane) binary system at temperatures between T = (303.1 and 323.1) K. In addition, VLE and (vapour + liquid + liquid) equilibria (VLLE) were determined for the (trifluoromethane + n-hexane) binary system at temperatures between T = (272.9 and 313.3) K and pressures in the range of P = (1.0 to 5.7) MPa. Measurements were undertaken in a static-analytic apparatus, with verification of experimental values undertaken using a static-synthetic equilibrium cell to measure bubble point pressures at several compositions. The phase equilibrium results were modelled with the Peng–Robinson equation of state with the Mathias–Copeman alpha function, coupled with the Wong–Sandler mixing rules. Regression of the data was performed with the NRTL and the UNIQUAC activity coefficient models with the Wong–Sandler mixing rules, and the performance of the models was compared. Critical loci for both systems were estimated, using the calculation procedures of Ungerer et al. and Heidemann and Khalil. For the (trifluoromethane + n-hexane) system, liquid–liquid immiscibility was experienced at the lowest temperature measured (T = 272.9 K). At higher temperatures, no immiscibility was visible during the measurements; however, the models continued to predict a miscibility gap.

  14. Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data

    Science.gov (United States)

    Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.

    2014-01-01

    In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified. Unplanned adjusted analyses should be considered secondary. Results suggest that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438

  15. Financial performance monitoring of the technical efficiency of critical access hospitals: a data envelopment analysis and logistic regression modeling approach.

    Science.gov (United States)

    Wilson, Asa B; Kerr, Bernard J; Bastian, Nathaniel D; Fulton, Lawrence V

    2012-01-01

    From 1980 to 1999, rural designated hospitals closed at a disproportionally high rate. In response to this emergent threat to healthcare access in rural settings, the Balanced Budget Act of 1997 made provisions for the creation of a new rural hospital--the critical access hospital (CAH). The conversion to CAH and the associated cost-based reimbursement scheme significantly slowed the closure rate of rural hospitals. This work investigates which methods can ensure the long-term viability of small hospitals. This article uses a two-step design to focus on a hypothesized relationship between technical efficiency of CAHs and a recently developed set of financial monitors for these entities. The goal is to identify the financial performance measures associated with efficiency. The first step uses data envelopment analysis (DEA) to differentiate efficient from inefficient facilities within a data set of 183 CAHs. Determining DEA efficiency is an a priori categorization of hospitals in the data set as efficient or inefficient. In the second step, DEA efficiency is the categorical dependent variable (efficient = 0, inefficient = 1) in the subsequent binary logistic regression (LR) model. A set of six financial monitors selected from the array of 20 measures were the LR independent variables. We use a binary LR to test the null hypothesis that recently developed CAH financial indicators had no predictive value for categorizing a CAH as efficient or inefficient, (i.e., there is no relationship between DEA efficiency and fiscal performance).

  16. Fractional Gaussian noise-enhanced information capacity of a nonlinear neuron model with binary signal input

    Science.gov (United States)

    Gao, Feng-Yin; Kang, Yan-Mei; Chen, Xi; Chen, Guanrong

    2018-05-01

    This paper reveals the effect of fractional Gaussian noise with Hurst exponent H ∈ (1/2, 1) on the information capacity of a general nonlinear neuron model with binary signal input. The fGn and its corresponding fractional Brownian motion exhibit long-range, strong-dependent increments. It extends standard Brownian motion to many types of fractional processes found in nature, such as the synaptic noise. In the paper, for the subthreshold binary signal, sufficient conditions are given based on the "forbidden interval" theorem to guarantee the occurrence of stochastic resonance, while for the suprathreshold binary signal, the simulated results show that additive fGn with Hurst exponent H ∈ (1/2, 1) could increase the mutual information or bits count. The investigation indicated that the synaptic noise with the characters of long-range dependence and self-similarity might be the driving factor for the efficient encoding and decoding of the nervous system.

  17. Predictive market segmentation model: An application of logistic regression model and CHAID procedure

    Directory of Open Access Journals (Sweden)

    Soldić-Aleksić Jasna

    2009-01-01

    Market segmentation is one of the key concepts of modern marketing. The main goal of market segmentation is to create groups (segments) of customers that have similar characteristics, needs, wishes and/or similar behavior regarding the purchase of a concrete product/service. Companies can create a specific marketing plan for each of these segments and thereby gain a short- or long-term competitive advantage on the market. Depending on the concrete marketing goal, different segmentation schemes and techniques may be applied. This paper presents a predictive market segmentation model based on the application of a logistic regression model and CHAID analysis. The logistic regression model was used to select the variables (from the initial pool of eleven variables) which are statistically significant for explaining the dependent variable. The selected variables were afterwards included in the CHAID procedure that generated the predictive market segmentation model. The model results are presented on a concrete empirical example in the following form: summary model results, CHAID tree, Gain chart, Index chart, risk and classification tables.

  18. Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

    Science.gov (United States)

    Ulbrich, Norbert Manfred

    2013-01-01

    A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.

  19. Evaluation of accuracy of linear regression models in predicting urban stormwater discharge characteristics.

    Science.gov (United States)

    Madarang, Krish J; Kang, Joo-Hyon

    2014-06-01

    Stormwater runoff has been identified as a source of pollution for the environment, especially for receiving waters. In order to quantify and manage the impacts of stormwater runoff on the environment, predictive models and mathematical models have been developed. Predictive tools such as regression models have been widely used to predict stormwater discharge characteristics. Storm event characteristics, such as antecedent dry days (ADD), have been related to response variables, such as pollutant loads and concentrations. However, it has been a controversial issue among many studies to consider ADD as an important variable in predicting stormwater discharge characteristics. In this study, we examined the accuracy of general linear regression models in predicting discharge characteristics of roadway runoff. A total of 17 storm events were monitored in two highway segments, located in Gwangju, Korea. Data from the monitoring were used to calibrate United States Environmental Protection Agency's Storm Water Management Model (SWMM). The calibrated SWMM was simulated for 55 storm events, and the results of total suspended solid (TSS) discharge loads and event mean concentrations (EMC) were extracted. From these data, linear regression models were developed. R² and p-values of the regression of ADD for both TSS loads and EMCs were investigated. Results showed that pollutant loads were better predicted than pollutant EMC in the multiple regression models. Regression may not provide the true effect of site-specific characteristics, due to uncertainty in the data. Copyright © 2014 The Research Centre for Eco-Environmental Sciences, Chinese Academy of Sciences. Published by Elsevier B.V. All rights reserved.
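    The abstract above judges regressions of pollutant load on antecedent dry days by their coefficient of determination. As a reminder of what that quantity measures, here is a minimal sketch of a one-predictor least-squares fit with its R². The function name `simple_ols` and the numbers are illustrative assumptions, not data or code from the study.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; return (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Made-up example: x could stand for dry days, y for a pollutant load.
slope, intercept, r2 = simple_ols([1, 2, 3, 4], [1, 3, 2, 4])
print(slope, intercept, r2)
```

    A low R² for a single predictor such as ADD indicates that it explains little of the variation in the response on its own, which is the kind of evidence the study weighs.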

  20. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Science.gov (United States)

    Drzewiecki, Wojciech

    2016-12-01

    In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques. The results proved that in case of sub-pixel evaluation the most accurate prediction of change may not necessarily be based on the most accurate individual assessments. When single methods are considered, based on obtained results Cubist algorithm may be advised for Landsat based mapping of imperviousness for single dates. However, Random Forest may be endorsed when the most reliable evaluation of imperviousness change is the primary goal. It gave lower accuracies for individual assessments, but better prediction of change due to more correlated errors of individual predictions. Heterogeneous model ensembles performed for individual time points assessments at least as well as the best individual models. In case of imperviousness change assessment the ensembles always outperformed single model approaches. It means that it is possible to improve the accuracy of sub-pixel imperviousness change assessment using ensembles of heterogeneous non-linear regression models.

  1. Solubility determination and thermodynamic modelling of allisartan isoproxil in different binary solvent mixtures from T = (278.15 to 313.15) K and mixing properties of solutions

    International Nuclear Information System (INIS)

    Yang, Yaoyao; Yang, Peng; Du, Shichao; Li, Kangli; Zhao, Kaifei; Xu, Shijie; Hou, Baohong; Gong, Junbo

    2016-01-01

    Highlights: • The solubility of allisartan isoproxil in binary solvent mixtures was determined. • Apelblat, CNIBS/R-K and Jouyban-Acree models were used to correlate the solubility. • Solubility parameter theory was used to explain the co-solvency phenomenon. • Regular mixing rules were used to calculate the solubility parameter of binary solvents. • The mixing thermodynamics were calculated and discussed based on the NRTL model. - Abstract: In this work, the solubility of allisartan isoproxil in binary solvent mixtures, including (acetone + water), (acetonitrile + water) and (methanol + water), was determined by a gravimetric method at temperatures ranging from (278.15 to 313.15) K at atmospheric pressure (p = 0.1 MPa). The solubility of allisartan isoproxil in all three binary solvent mixtures increased with rising temperature at a constant solvent composition. For the binary solvent mixtures of (methanol + water), the solubility increased with increasing methanol fraction, while it reached a maximum at a certain solvent composition in the other two binary solvent mixtures (acetone + water and acetonitrile + water). Based on the theory of the solubility parameter, the Fedors method and two mixing rules were employed to calculate the solubility parameters, and the proximity of the solubility parameters of allisartan isoproxil and the binary solvent mixtures explained the co-solvency phenomenon. Additionally, the modified Apelblat equation, the CNIBS/R-K model and the Jouyban-Acree model were used to correlate the solubility data in binary solvent mixtures, and all three correlation models gave satisfactory results. Furthermore, the mixing thermodynamic properties were calculated based on the NRTL model, which indicated that the mixing process was spontaneous and exothermic.
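    The modified Apelblat equation named in the abstract above is commonly written as ln x = A + B/T + C ln T, where x is the mole-fraction solubility and T the absolute temperature in kelvin. The sketch below evaluates that form. The function name and the coefficients A, B and C are made-up placeholders for illustration, not values fitted in the paper.

```python
import math


def apelblat_mole_fraction(T, A, B, C):
    """Mole-fraction solubility from ln x = A + B / T + C * ln(T), T in kelvin."""
    return math.exp(A + B / T + C * math.log(T))


# Hypothetical coefficients, chosen only so the curve rises with temperature.
A, B, C = -50.0, 500.0, 7.5
for T in (278.15, 298.15, 313.15):
    print(T, apelblat_mole_fraction(T, A, B, C))
```

    In practice A, B and C are obtained by regressing measured solubilities against temperature at a fixed solvent composition.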

  2. Modeling adsorption of binary and ternary mixtures on microporous media

    DEFF Research Database (Denmark)

    Monsalvo, Matias Alfonso; Shapiro, Alexander

    2007-01-01

    The goal of this work is to analyze the adsorption of binary and ternary mixtures on the basis of the multicomponent potential theory of adsorption (MPTA). In the MPTA, the adsorbate is considered as a segregated mixture in the external potential field emitted by the solid adsorbent. This makes it possible to use the same equation of state to describe the thermodynamic properties of the segregated and the bulk phases. For comparison, we also used the ideal adsorbed solution theory (IAST) to describe adsorption equilibria. The main advantage of these two models is their capability to predict multicomponent adsorption equilibria on the basis of single-component adsorption data. We compare the MPTA and IAST models to a large set of experimental data, obtaining reasonably good agreement with experimental data and a high degree of predictability. Some limitations of both models are also discussed.

  3. AN X-RAY AND OPTICAL LIGHT CURVE MODEL OF THE ECLIPSING SYMBIOTIC BINARY SMC3

    International Nuclear Information System (INIS)

    Kato, Mariko; Hachisu, Izumi; Mikołajewska, Joanna

    2013-01-01

    Some binary evolution scenarios for Type Ia supernovae (SNe Ia) include long-period binaries that evolve to symbiotic supersoft X-ray sources in their late stage of evolution. However, symbiotic stars with steady hydrogen burning on the white dwarf's (WD) surface are very rare, and the X-ray characteristics are not well known. SMC3 is one such rare example and a key object for understanding the evolution of symbiotic stars to SNe Ia. SMC3 is an eclipsing symbiotic binary, consisting of a massive WD and red giant (RG), with an orbital period of 4.5 years in the Small Magellanic Cloud. The long-term V light curve variations are reproduced as orbital variations in the irradiated RG, whose atmosphere fills its Roche lobe, thus supporting the idea that the RG supplies matter to the WD at rates high enough to maintain steady hydrogen burning on the WD. We also present an eclipse model in which an X-ray-emitting region around the WD is almost totally occulted by the RG swelling over the Roche lobe on the trailing side, although it is always partly obscured by a long spiral tail of neutral hydrogen surrounding the binary in the orbital plane.

  4. No tension between assembly models of super massive black hole binaries and pulsar observations.

    Science.gov (United States)

    Middleton, Hannah; Chen, Siyuan; Del Pozzo, Walter; Sesana, Alberto; Vecchio, Alberto

    2018-02-08

    Pulsar timing arrays are presently the only means to search for the gravitational wave stochastic background from super massive black hole binary populations, considered to be within the grasp of current or near-future observations. The stringent upper limit from the Parkes Pulsar Timing Array has been interpreted as excluding (>90% confidence) the current paradigm of binary assembly through galaxy mergers and hardening via stellar interaction, suggesting evolution is accelerated or stalled. Using Bayesian hierarchical modelling we consider implications of this upper limit for a range of astrophysical scenarios, without invoking stalling, nor more exotic physical processes. All scenarios are fully consistent with the upper limit, but (weak) bounds on population parameters can be inferred. Recent upward revisions of the black hole-galaxy bulge mass relation are disfavoured at 1.6σ against lighter models. Once sensitivity improves by an order of magnitude, a non-detection will disfavour the most optimistic scenarios at 3.9σ.

  5. Semiparametric Allelic Tests for Mapping Multiple Phenotypes: Binomial Regression and Mahalanobis Distance.

    Science.gov (United States)

    Majumdar, Arunabha; Witte, John S; Ghosh, Saurabh

    2015-12-01

    Binary phenotypes commonly arise due to multiple underlying quantitative precursors, and genetic variants may impact multiple traits in a pleiotropic manner. Hence, simultaneously analyzing such correlated traits may be more powerful than analyzing individual traits. Various genotype-level methods, e.g., MultiPhen (O'Reilly et al.), have been developed to identify genetic factors underlying a multivariate phenotype. For univariate phenotypes, the usefulness and applicability of allele-level tests have been investigated. The test of allele frequency difference among cases and controls is commonly used for mapping case-control association. However, allelic methods for multivariate association mapping have not been studied much. In this article, we explore two allelic tests of multivariate association: one using a Binomial regression model based on inverted regression of genotype on phenotype (Binomial regression-based Association of Multivariate Phenotypes [BAMP]), and the other employing the Mahalanobis distance between two sample means of the multivariate phenotype vector for two alleles at a single-nucleotide polymorphism (Distance-based Association of Multivariate Phenotypes [DAMP]). These methods can incorporate both discrete and continuous phenotypes. Some theoretical properties for BAMP are studied. Using simulations, the power of the methods for detecting multivariate association is compared with that of the genotype-level test MultiPhen. The allelic tests yield marginally higher power than MultiPhen for multivariate phenotypes. For one/two binary traits under recessive mode of inheritance, allelic tests are found to be substantially more powerful. All three tests are applied to two different real data sets and the results offer some support for the simulation study. We propose a hybrid approach for testing multivariate association that implements MultiPhen when Hardy-Weinberg Equilibrium (HWE) is violated and BAMP otherwise, because the allelic approaches assume HWE.
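    The distance-based idea described above rests on the Mahalanobis distance between two group means. The sketch below computes the squared distance for two traits using a pooled covariance matrix. It is a generic illustration and not the authors' DAMP test statistic; the function names, the restriction to two traits and the sample values are assumptions made here.

```python
def mean_vec(rows):
    """Component-wise mean of a list of (trait1, trait2) pairs."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_cov(a, b):
    """Pooled 2x2 sample covariance of two groups (n_a + n_b - 2 denominator)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (a, b):
        m = mean_vec(rows)
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(a) + len(b) - 2
    return [[v / dof for v in row] for row in s]


def mahalanobis_sq(a, b):
    """Squared Mahalanobis distance between the mean vectors of groups a and b."""
    ma, mb = mean_vec(a), mean_vec(b)
    (s11, s12), (s21, s22) = pooled_cov(a, b)
    det = s11 * s22 - s12 * s21
    d0, d1 = ma[0] - mb[0], ma[1] - mb[1]
    # d^T S^{-1} d written out with the explicit inverse of a 2x2 matrix.
    return (s22 * d0 * d0 - (s12 + s21) * d0 * d1 + s11 * d1 * d1) / det


# Made-up trait values for carriers of two alleles.
allele_1 = [(0, 0), (2, 0), (0, 2), (2, 2)]
allele_2 = [(3, 0), (5, 0), (3, 2), (5, 2)]
print(mahalanobis_sq(allele_1, allele_2))
```

    Unlike a plain Euclidean distance, the result accounts for the variances of the traits and their correlation, which is why it suits correlated multivariate phenotypes.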

  6. Regularized multivariate regression models with skew-t error distributions

    KAUST Repository

    Chen, Lianfu; Pourahmadi, Mohsen; Maadooliat, Mehdi

    2014-01-01

    We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both

  7. Thermodynamic equilibrium of hydroxyacetic acid in pure and binary solvent systems

    International Nuclear Information System (INIS)

    Huang, Qiaoyin; Xie, Chuang; Li, Yang; Su, Nannan; Lou, Yajing; Hu, Xiaoxue; Wang, Yongli; Bao, Ying; Hou, Baohong

    2017-01-01

    Highlights: • Solubility of hydroxyacetic acid in mono-solvents and binary solvent mixtures was measured. • Modified Apelblat, NRTL and Wilson models were used to correlate the solubility data in pure solvents. • CNIBS/R-K and Jouyban-Acree models were used to correlate the solubility in binary solvent mixtures. • The mixing properties were calculated based on the NRTL model. - Abstract: The solubility of hydroxyacetic acid (HA) in five pure organic solvents and two binary solvent mixtures was experimentally measured from 273.15 K to 313.15 K at atmospheric pressure (p = 0.1 MPa) by using a dynamic method. The order of solubility in pure organic solvents is ethanol > isopropanol > n-butanol > acetonitrile > ethyl acetate within the investigated temperature range, except for temperatures lower than 278 K, where the solubility of HA in ethyl acetate is slightly larger than that in acetonitrile. Furthermore, the solubility data in pure solvents were correlated with the modified Apelblat model, NRTL model, and Wilson model, and those in the binary solvent mixtures were fitted to the CNIBS/R-K model and Jouyban-Acree model. Finally, the mixing thermodynamic properties of hydroxyacetic acid in pure and binary solvent systems were calculated and discussed.

  8. Evaluation of a multiple linear regression model and SARIMA model in forecasting heat demand for district heating system

    International Nuclear Information System (INIS)

    Fang, Tingting; Lahdelma, Risto

    2016-01-01

    Highlights: • Social factor is considered for the linear regression models besides weather file. • Simultaneously optimize all the coefficients for linear regression models. • SARIMA combined with linear regression is used to forecast the heat demand. • The accuracy for both linear regression and time series models are evaluated. - Abstract: Forecasting heat demand is necessary for production and operation planning of district heating (DH) systems. In this study we first propose a simple regression model where the hourly outdoor temperature and wind speed forecast the heat demand. Weekly rhythm of heat consumption as a social component is added to the model to significantly improve the accuracy. The other type of model is the seasonal autoregressive integrated moving average (SARIMA) model with exogenous variables as a combination to take weather factors, and the historical heat consumption data as depending variables. One outstanding advantage of the model is that it pursues high accuracy for both long-term and short-term forecasts by considering both exogenous factors and time series. The forecasting performance of both linear regression models and time series model are evaluated based on real-life heat demand data for the city of Espoo in Finland by out-of-sample tests for the last 20 full weeks of the year. The results indicate that the proposed linear regression model (T168h) using 168-h demand pattern with midweek holidays classified as Saturdays or Sundays gives the highest accuracy and strong robustness among all the tested models based on the tested forecasting horizon and corresponding data. Considering the parsimony of the input, the ease of use and the high accuracy, the proposed T168h model is the best in practice. The heat demand forecasting model can also be developed for individual buildings if automated meter reading customer measurements are available. This would allow forecasting the heat demand based on more accurate heat consumption

  9. An Application of WD Model to EB Type Contact Binary System

    Directory of Open Access Journals (Sweden)

    Su-Yeon Oh

    2000-12-01

    The EB type contact binaries show a large temperature difference (ΔT on the order of 1,000 K) between the two components. Thus we have modified the mode 3 of the WD program to adjust albedos, limb darkening coefficients and gravity darkening exponents for both components of such binaries, while the values for those parameters should be the same for both components in the original WD program. Both the modified and the original versions have been applied to EB type contact binaries such as DO Cas, GO Cyg, and FS Lup. The light curves computed with the modified version fit the observations better.

  10. Modeling of columnar and equiaxed solidification of binary mixtures

    International Nuclear Information System (INIS)

    Roux, P.

    2005-12-01

    This work deals with the modelling of dendritic solidification in binary mixtures. Large scale phenomena are represented by volume averaging of the local conservation equations. This method allows the rigorous derivation of the partial differential equations of the averaged fields and of the closure problems associated with the deviations. Such problems can be resolved numerically on periodic cells, representative of dendritic structures, in order to give a precise evaluation of macroscopic transfer coefficients (drag coefficients, exchange coefficients, diffusion-dispersion tensors...). The method had already been applied to a model of the columnar dendritic mushy zone, and it is extended to the case of equiaxed dendritic solidification, where solid grains can move. The two-phase flow is modelled with an Eulerian-Eulerian approach, and the novelty is to account for the dispersion of solid velocity through the kinetic agitation of the particles. A coupling of the two models is proposed thanks to an original adaptation of the columnar model, allowing for undercooling calculation: a solid-liquid interfacial area density is introduced and calculated. Finally, direct numerical simulations of crystal growth are proposed with a diffuse interface method for a representation of local phenomena. (author)

  11. EREM: Parameter Estimation and Ancestral Reconstruction by Expectation-Maximization Algorithm for a Probabilistic Model of Genomic Binary Characters Evolution.

    Science.gov (United States)

    Carmel, Liran; Wolf, Yuri I; Rogozin, Igor B; Koonin, Eugene V

    2010-01-01

    Evolutionary binary characters are features of species or genes, indicating the absence (value zero) or presence (value one) of some property. Examples include eukaryotic gene architecture (the presence or absence of an intron in a particular locus), gene content, and morphological characters. In many studies, the acquisition of such binary characters is assumed to represent a rare evolutionary event, and consequently, their evolution is analyzed using various flavors of parsimony. However, when gain and loss of the character are not rare enough, a probabilistic analysis becomes essential. Here, we present a comprehensive probabilistic model to describe the evolution of binary characters on a bifurcating phylogenetic tree. A fast software tool, EREM, is provided, using maximum likelihood to estimate the parameters of the model and to reconstruct ancestral states (presence and absence in internal nodes) and events (gain and loss events along branches).

  12. Solubility Modeling of the Binary Systems Fe(NO3)3–H2O, Co(NO3)2–H2O and the Ternary System Fe(NO3)3–Co(NO3)2–H2O with the Extended Universal Quasichemical (UNIQUAC) Model

    DEFF Research Database (Denmark)

    Arrad, Mouad; Kaddami, Mohammed; Goundali, Bahija El

    2016-01-01

    Solubility modeling in the binary system Fe(NO3)3–H2O, Co(NO3)2–H2O and the ternary system Fe(NO3)3–Co(NO3)2–H2O is presented. The extended UNIQUAC model was applied to the thermodynamic assessment of the investigated systems. The model parameters obtained were regressed simultaneously using...... the available databank but with more experimental points, recently published in the open literature. A revision of previously published parameters for the cobalt ion and new parameters for the iron(III) nitrate system are presented. Based on this set of parameters, the equilibrium constants of hydrates...

  13. Validation of regression models for nitrate concentrations in the upper groundwater in sandy soils

    International Nuclear Information System (INIS)

    Sonneveld, M.P.W.; Brus, D.J.; Roelsma, J.

    2010-01-01

    For Dutch sandy regions, linear regression models have been developed that predict nitrate concentrations in the upper groundwater on the basis of residual nitrate contents in the soil in autumn. The objective of our study was to validate these regression models for one particular sandy region dominated by dairy farming. No data from this area were used for calibrating the regression models. The model was validated by additional probability sampling. This sample was used to estimate errors in 1) the predicted areal fractions where the EU standard of 50 mg l⁻¹ is exceeded for farms with low N surpluses (ALT) and farms with higher N surpluses (REF); 2) predicted cumulative frequency distributions of nitrate concentration for both groups of farms. Both the errors in the predicted areal fractions as well as the errors in the predicted cumulative frequency distributions indicate that the regression models are invalid for the sandy soils of this study area. - This study indicates that linear regression models that predict nitrate concentrations in the upper groundwater using residual soil N contents should be applied with care.

  14. PERIODIC SIGNALS IN BINARY MICROLENSING EVENTS

    International Nuclear Information System (INIS)

    Guo, Xinyi; Stefano, Rosanne Di; Esin, Ann; Taylor, Jeffrey

    2015-01-01

    Gravitational microlensing events are powerful tools for the study of stellar populations. In particular, they can be used to discover and study a variety of binary systems. A large number of binary lenses have already been found through microlensing surveys and a few of these systems show strong evidence of orbital motion on the timescale of the lensing event. We expect that more binary lenses of this kind will be detected in the future. For binaries whose orbital period is comparable to the event duration, the orbital motion can cause the lensing signal to deviate drastically from that of a static binary lens. The most striking property of such light curves is the presence of quasi-periodic features, which are produced as the source traverses the same regions in the rotating lens plane. These repeating features contain information about the orbital period of the lens. If this period can be extracted, then much can be learned about the lensing system even without performing time-consuming, detailed light-curve modeling. However, the relative transverse motion between the source and the lens significantly complicates the problem of period extraction. To resolve this difficulty, we present a modification of the standard Lomb–Scargle periodogram analysis. We test our method for four representative binary lens systems and demonstrate its efficiency in correctly extracting binary orbital periods
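As an illustration of the starting point named in this abstract, the sketch below computes the classical Lomb-Scargle periodogram for unevenly sampled data. It is not the authors' modified method, which additionally handles the relative transverse motion of source and lens; the function name `lomb_scargle` and its arguments are assumptions made for this example.

```python
import math

def lomb_scargle(times, values, angular_freqs):
    """Classical (unnormalized) Lomb-Scargle power at each angular frequency."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]  # the periodogram works on mean-subtracted data
    powers = []
    for w in angular_freqs:
        # The time offset tau makes the sine and cosine terms orthogonal.
        s2 = sum(math.sin(2 * w * t) for t in times)
        c2 = sum(math.cos(2 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2 * w)
        cos_terms = [math.cos(w * (t - tau)) for t in times]
        sin_terms = [math.sin(w * (t - tau)) for t in times]
        yc = sum(a * b for a, b in zip(y, cos_terms))
        ys = sum(a * b for a, b in zip(y, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        powers.append(0.5 * (yc * yc / cc + ys * ys / ss))
    return powers
```

Scanning a grid of trial periods and taking the one with the highest power gives a period estimate; a real light curve would need the detrending step that the abstract describes before this becomes reliable.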

  15. Emission-line diagnostics of nearby H II regions including interacting binary populations

    Science.gov (United States)

    Xiao, Lin; Stanway, Elizabeth R.; Eldridge, J. J.

    2018-06-01

    We present numerical models of the nebular emission from H II regions around young stellar populations over a range of compositions and ages. The synthetic stellar populations include both single stars and interacting binary stars. We compare these models to the observed emission lines of 254 H II regions of 13 nearby spiral galaxies and 21 dwarf galaxies drawn from archival data. The models are created using the combination of the BPASS (Binary Population and Spectral Synthesis) code with the photoionization code CLOUDY to study the differences caused by the inclusion of interacting binary stars in the stellar population. We obtain agreement with the observed emission line ratios from the nearby star-forming regions and discuss the effect of binary-star evolution pathways on the nebular ionization of H II regions. We find that at population ages above 10 Myr, single-star models rapidly decrease in flux and ionization strength, while binary-star models still produce strong flux and high [O III]/H β ratios. Our models can reproduce the metallicity of H II regions from spiral galaxies, but we find higher metallicities than previously estimated for the H II regions from dwarf galaxies. Comparing the equivalent width of H β emission between models and observations, we find that accounting for ionizing photon leakage can affect age estimates for H II regions. When it is included, the typical age derived for H II regions is 5 Myr from single-star models, and up to 10 Myr with binary-star models. This is due to the existence of binary-star evolution pathways, which produce more hot Wolf-Rayet and helium stars at older ages. For future reference, we calculate new BPASS binary maximal starburst lines as a function of metallicity, and for the total model population, and present these in Appendix A.

  16. Bi-dimensional null model analysis of presence-absence binary matrices.

    Science.gov (United States)

    Strona, Giovanni; Ulrich, Werner; Gotelli, Nicholas J

    2018-01-01

    Comparing the structure of presence/absence (i.e., binary) matrices with those of randomized counterparts is a common practice in ecology. However, differences in the randomization procedures (null models) can affect the results of the comparisons, leading matrix structural patterns to appear either "random" or not. Subjectivity in the choice of one particular null model over another makes it often advisable to compare the results obtained using several different approaches. Yet, available algorithms to randomize binary matrices differ substantially in respect to the constraints they impose on the discrepancy between observed and randomized row and column marginal totals, which complicates the interpretation of contrasting patterns. This calls for new strategies both to explore intermediate scenarios of restrictiveness in-between extreme constraint assumptions, and to properly synthesize the resulting information. Here we introduce a new modeling framework based on a flexible matrix randomization algorithm (named the "Tuning Peg" algorithm) that addresses both issues. The algorithm consists of a modified swap procedure in which the discrepancy between the row and column marginal totals of the target matrix and those of its randomized counterpart can be "tuned" in a continuous way by two parameters (controlling, respectively, row and column discrepancy). We show how combining the Tuning Peg with a wise random walk procedure makes it possible to explore the complete null space embraced by existing algorithms. This exploration allows researchers to visualize matrix structural patterns in an innovative bi-dimensional landscape of significance/effect size. We demonstrate the rationale and potential of our approach with a set of simulated and real matrices, showing how the simultaneous investigation of a comprehensive and continuous portion of the null space can be extremely informative, and possibly key to resolving longstanding debates in the analysis of ecological
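For readers unfamiliar with swap procedures, the following is a minimal sketch of the most restrictive extreme mentioned in this abstract: a plain 2x2 "checkerboard" swap that keeps every row and column total exactly fixed. It is not the Tuning Peg algorithm, whose two tuning parameters relax these constraints; the function name and arguments are hypothetical.

```python
import random

def checkerboard_swaps(matrix, n_attempts, seed=0):
    """Randomize a 0/1 matrix with 2x2 swaps that keep row and column totals."""
    rng = random.Random(seed)
    m = [row[:] for row in matrix]  # work on a copy, leave the input untouched
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_attempts):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        # A swappable submatrix is a checkerboard: 10/01 or 01/10.
        diagonal_equal = m[r1][c1] == m[r2][c2]
        antidiagonal_equal = m[r1][c2] == m[r2][c1]
        if diagonal_equal and antidiagonal_equal and m[r1][c1] != m[r1][c2]:
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m
```

Because each accepted swap exchanges a 1 and a 0 within both affected rows and both affected columns, all marginal totals are preserved by construction.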

  17. Bivariate least squares linear regression: Towards a unified analytic formalism. I. Functional models

    Science.gov (United States)

    Caimmi, R.

    2011-08-01

    Concerning bivariate least squares linear regression, the classical approach pursued for functional models in earlier attempts (York, 1966, 1969) is reviewed using a new formalism in terms of deviation (matrix) traces which, for unweighted data, reduce to usual quantities leaving aside an unessential (but dimensional) multiplicative factor. Within the framework of classical error models, the dependent variable relates to the independent variable according to the usual additive model. The classes of linear models considered are regression lines in the general case of correlated errors in X and in Y for weighted data, and in the opposite limiting situations of (i) uncorrelated errors in X and in Y, and (ii) completely correlated errors in X and in Y. The special case of (C) generalized orthogonal regression is considered in detail together with well known subcases, namely: (Y) errors in X negligible (ideally null) with respect to errors in Y; (X) errors in Y negligible (ideally null) with respect to errors in X; (O) genuine orthogonal regression; (R) reduced major-axis regression. In the limit of unweighted data, the results determined for functional models are compared with their counterparts related to extreme structural models i.e. the instrumental scatter is negligible (ideally null) with respect to the intrinsic scatter (Isobe et al., 1990; Feigelson and Babu, 1992). While regression line slope and intercept estimators for functional and structural models necessarily coincide, the contrary holds for related variance estimators even if the residuals obey a Gaussian distribution, with the exception of Y models. An example of astronomical application is considered, concerning the [O/H]-[Fe/H] empirical relations deduced from five samples related to different stars and/or different methods of oxygen abundance determination. For selected samples and assigned methods, different regression models yield consistent results within the errors (∓ σ) for both
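The four subcases labelled (Y), (X), (O) and (R) in this abstract have well-known closed-form slope estimators in the unweighted limit. The sketch below computes them from the sample sums of squares; it does not reproduce the paper's trace formalism or its variance estimators, and the function name is an assumption.

```python
import math

def regression_slopes(xs, ys):
    """Unweighted slope estimators for the Y, X, O and R regression lines."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return {
        "Y": sxy / sxx,  # errors in X negligible: ordinary regression of Y on X
        "X": syy / sxy,  # errors in Y negligible: regression of X on Y, as dY/dX
        "O": (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy),
        "R": math.copysign(math.sqrt(syy / sxx), sxy),  # reduced major axis
    }
```

For perfectly collinear data all four slopes coincide; with scatter, the (Y) slope is the shallowest in magnitude and the (X) slope the steepest, with (R) in between.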

  18. THE POTENTIAL IMPORTANCE OF BINARY EVOLUTION IN ULTRAVIOLET-OPTICAL SPECTRAL FITTING OF EARLY-TYPE GALAXIES

    International Nuclear Information System (INIS)

    Li, Zhongmu; Mao, Caiyan; Chen, Li; Zhang, Qian; Li, Maocai

    2013-01-01

    Most galaxies possibly contain some binaries, and more than half of Galactic hot subdwarf stars, which are thought to be a possible origin of the UV-upturn of old stellar populations, are found in binaries. However, the effect of binary evolution has not been taken into account in most works on the spectral fitting of galaxies. This paper studies the role of binary evolution in the spectral fitting of early-type galaxies, via a stellar population synthesis model including both single and binary star populations. Spectra from ultraviolet to optical bands are fitted to determine a few galaxy parameters. The results show that the inclusion of binaries in stellar population models may lead to obvious change in the determination of some parameters of early-type galaxies and therefore it is potentially important for spectral studies. In particular, the ages of young components of composite stellar populations become much older when using binary star population models instead of single star population models. This implies that binary star population models will measure significantly different star formation histories for early-type galaxies compared to single star population models. In addition, stellar population models with binary interactions on average measure larger dust extinctions than single star population models. This suggests that when binary star population models are used, negative extinctions are possibly no longer necessary in the spectral fitting of galaxies (see previous works, e.g., Cid Fernandes et al. for comparison). Furthermore, it is shown that optical spectra have strong constraints on stellar age while UV spectra have strong constraints on binary fraction. Finally, our results suggest that binary star population models can provide new insight into the stellar properties of globular clusters

  19. Modeling and prediction of flotation performance using support vector regression

    Directory of Open Access Journals (Sweden)

    Despotović Vladimir

    2017-01-01

    Continuous efforts have been made in recent years to improve the process of paper recycling, as it is of critical importance for saving wood, water and energy resources. Flotation deinking is considered to be one of the key methods for separation of ink particles from the cellulose fibres. Attempts to model the flotation deinking process have often resulted in complex models that are difficult to implement and use. In this paper a model for prediction of flotation performance based on Support Vector Regression (SVR) is presented. Representative data samples were created in the laboratory, under a variety of practical control variables for the flotation deinking process, including different reagents, pH values and flotation residence time. A predictive model was trained on these data samples, and the flotation performance was assessed, showing that Support Vector Regression is a promising method even when the dataset used for training the model is limited.

  20. Forecasting daily meteorological time series using ARIMA and regression models

    Science.gov (United States)

    Murat, Małgorzata; Malinowska, Iwona; Gos, Magdalena; Krzyszczak, Jaromir

    2018-04-01

    The daily air temperature and precipitation time series recorded between January 1, 1980 and December 31, 2010 in four European sites (Jokioinen, Dikopshof, Lleida and Lublin) from different climatic zones were modeled and forecasted. In our forecasting we used the methods of the Box-Jenkins and Holt-Winters seasonal auto regressive integrated moving-average, the autoregressive integrated moving-average with external regressors in the form of Fourier terms and the time series regression, including trend and seasonality components methodology with R software. It was demonstrated that the obtained models are able to capture the dynamics of the time series data and to produce sensible forecasts.

  1. Replica analysis of overfitting in regression models for time-to-event data

    Science.gov (United States)

    Coolen, A. C. C.; Barrett, J. E.; Paga, P.; Perez-Vicente, C. J.

    2017-09-01

    Overfitting, which happens when the number of parameters in a model is too large compared to the number of data points available for determining these parameters, is a serious and growing problem in survival analysis. While modern medicine presents us with data of unprecedented dimensionality, these data cannot yet be used effectively for clinical outcome prediction. Standard error measures in maximum likelihood regression, such as p-values and z-scores, are blind to overfitting, and even for Cox’s proportional hazards model (the main tool of medical statisticians), one finds in literature only rules of thumb on the number of samples required to avoid overfitting. In this paper we present a mathematical theory of overfitting in regression models for time-to-event data, which aims to increase our quantitative understanding of the problem and provide practical tools with which to correct regression outcomes for the impact of overfitting. It is based on the replica method, a statistical mechanical technique for the analysis of heterogeneous many-variable systems that has been used successfully for several decades in physics, biology, and computer science, but not yet in medical statistics. We develop the theory initially for arbitrary regression models for time-to-event data, and verify its predictions in detail for the popular Cox model.

  2. Cross-validation pitfalls when selecting and assessing regression and classification models.

    Science.gov (United States)

    Krstajic, Damjan; Buturovic, Ljubomir J; Leahy, David E; Thomas, Simon

    2014-03-29

    We address the problem of selecting and assessing classification and regression models using cross-validation. Current state-of-the-art methods can yield models with high variance, rendering them unsuitable for a number of practical applications including QSAR. In this paper we describe and evaluate best practices which improve reliability and increase confidence in selected models. A key operational component of the proposed methods is cloud computing which enables routine use of previously infeasible approaches. We describe in detail an algorithm for repeated grid-search V-fold cross-validation for parameter tuning in classification and regression, and we define a repeated nested cross-validation algorithm for model assessment. As regards variable selection and parameter tuning we define two algorithms (repeated grid-search cross-validation and double cross-validation), and provide arguments for using the repeated grid-search in the general case. We show results of our algorithms on seven QSAR datasets. The variation of the prediction performance, which is the result of choosing different splits of the dataset in V-fold cross-validation, needs to be taken into account when selecting and assessing classification and regression models. We demonstrate the importance of repeating cross-validation when selecting an optimal model, as well as the importance of repeating nested cross-validation when assessing a prediction error.
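The repeated grid-search V-fold cross-validation described in this abstract can be sketched as follows. The toy model (a one-parameter ridge slope through the origin) and all function names are assumptions chosen to keep the example self-contained; the paper applies the idea to QSAR models and adds a nested loop for model assessment, which is not shown here.

```python
import random

def v_fold_indices(n, v, rng):
    """Shuffle the indices 0..n-1 and deal them into v folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::v] for i in range(v)]

def ridge_slope(xs, ys, lam):
    """Slope of a no-intercept ridge fit; lam is the tuning parameter."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def cv_error(xs, ys, lam, folds):
    """Mean squared prediction error over all held-out points."""
    sq_err = 0.0
    for fold in folds:
        held = set(fold)
        train = [i for i in range(len(xs)) if i not in held]
        b = ridge_slope([xs[i] for i in train], [ys[i] for i in train], lam)
        sq_err += sum((ys[i] - b * xs[i]) ** 2 for i in fold)
    return sq_err / len(xs)

def repeated_grid_search(xs, ys, grid, v=5, repeats=20, seed=0):
    """Average the CV error of each grid value over several random splits."""
    rng = random.Random(seed)
    totals = {lam: 0.0 for lam in grid}
    for _ in range(repeats):
        folds = v_fold_indices(len(xs), v, rng)  # a fresh split for every repeat
        for lam in grid:
            totals[lam] += cv_error(xs, ys, lam, folds)
    mean_err = {lam: totals[lam] / repeats for lam in grid}
    best = min(grid, key=lambda lam: mean_err[lam])
    return best, mean_err
```

Averaging over repeats is what damps the split-to-split variation that the abstract warns about; a single call to `cv_error` with one set of folds would be the unrepeated procedure.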

  3. Kepler eclipsing binary stars. IV. Precise eclipse times for close binaries and identification of candidate three-body systems

    International Nuclear Information System (INIS)

    Conroy, Kyle E.; Stassun, Keivan G.; Prša, Andrej; Orosz, Jerome A.; Welsh, William F.; Fabrycky, Daniel C.

    2014-01-01

    We present a catalog of precise eclipse times and analysis of third-body signals among 1279 close binaries in the latest Kepler Eclipsing Binary Catalog. For these short-period binaries, Kepler's 30 minute exposure time causes significant smearing of light curves. In addition, common astrophysical phenomena such as chromospheric activity, as well as imperfections in the light curve detrending process, can create systematic artifacts that may produce fictitious signals in the eclipse timings. We present a method to measure precise eclipse times in the presence of distorted light curves, such as in contact and near-contact binaries which exhibit continuously changing light levels in and out of eclipse. We identify 236 systems for which we find a timing variation signal compatible with the presence of a third body. These are modeled for the light travel time effect and the basic properties of the third body are derived. This study complements J. A. Orosz et al. (in preparation), which focuses on eclipse timing variations of longer period binaries with flat out-of-eclipse regions. Together, these two papers provide comprehensive eclipse timings for all binaries in the Kepler Eclipsing Binary Catalog, as an ongoing resource freely accessible online to the community.

  4. A LATENT CLASS POISSON REGRESSION-MODEL FOR HETEROGENEOUS COUNT DATA

    NARCIS (Netherlands)

    WEDEL, M; DESARBO, WS; BULT; RAMASWAMY

    1993-01-01

    In this paper an approach is developed that accommodates heterogeneity in Poisson regression models for count data. The model developed assumes that heterogeneity arises from a distribution of both the intercept and the coefficients of the explanatory variables. We assume that the mixing

  5. Meta-analysis of studies with bivariate binary outcomes: a marginal beta-binomial model approach.

    Science.gov (United States)

    Chen, Yong; Hong, Chuan; Ning, Yang; Su, Xiao

    2016-01-15

    When conducting a meta-analysis of studies with bivariate binary outcomes, challenges arise when the within-study correlation and between-study heterogeneity should be taken into account. In this paper, we propose a marginal beta-binomial model for the meta-analysis of studies with binary outcomes. This model is based on the composite likelihood approach and has several attractive features compared with the existing models such as bivariate generalized linear mixed model (Chu and Cole, 2006) and Sarmanov beta-binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities in the original scale, not requiring any transformation of probabilities or any link function, having closed-form expression of likelihood function, and no constraints on the correlation parameter. More importantly, because the marginal beta-binomial model is only based on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of bivariate study-specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using current methods. We compare the performance of the marginal beta-binomial model with the bivariate generalized linear mixed model and the Sarmanov beta-binomial model by simulation studies. Interestingly, the results show that the marginal beta-binomial model performs better than the Sarmanov beta-binomial model, whether or not the true model is Sarmanov beta-binomial, and the marginal beta-binomial model is more robust than the bivariate generalized linear mixed model under model misspecifications. Two meta-analyses of diagnostic accuracy studies and a meta-analysis of case-control studies are conducted for illustration. Copyright © 2015 John Wiley & Sons, Ltd.
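The closed-form likelihood mentioned in this abstract rests on the beta-binomial probability mass function. The sketch below evaluates that function and, as an assumption about how a marginal composite likelihood could be assembled, sums the two marginal log-likelihoods per study while ignoring their correlation. The parametrization by `alpha` and `beta` and the function names are illustrative and may differ from the paper's.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(k, n, alpha, beta):
    """P(K = k) for n trials whose success probability is Beta(alpha, beta)."""
    log_ratio = log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)
    return math.comb(n, k) * math.exp(log_ratio)

def composite_loglik(studies, params1, params2):
    """Sum of marginal log-likelihoods over studies (working independence).

    studies: iterable of (k1, n1, k2, n2) counts for the two binary outcomes.
    params1, params2: (alpha, beta) pairs for outcome 1 and outcome 2.
    """
    total = 0.0
    for k1, n1, k2, n2 in studies:
        total += math.log(beta_binomial_pmf(k1, n1, *params1))
        total += math.log(beta_binomial_pmf(k2, n2, *params2))
    return total
```

No link function or transformation of the probabilities appears anywhere, which is the practical meaning of "modeling the probabilities in the original scale".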

  6. Continuous validation of ASTEC containment models and regression testing

    International Nuclear Information System (INIS)

    Nowack, Holger; Reinke, Nils; Sonnenkalb, Martin

    2014-01-01

    The focus of the ASTEC (Accident Source Term Evaluation Code) development at GRS is primarily on the containment module CPA (Containment Part of ASTEC), whose modelling is to a large extent based on the GRS containment code COCOSYS (COntainment COde SYStem). Validation is usually understood as the approval of the modelling capabilities by calculations of appropriate experiments done by external users different from the code developers. During the development process of ASTEC CPA, bugs and unintended side effects may occur, which leads to changes in the results of the initially conducted validation. Due to the involvement of a considerable number of developers in the coding of ASTEC modules, validation of the code alone, even if executed repeatedly, is not sufficient. Therefore, a regression testing procedure has been implemented in order to ensure that the initially obtained validation results are still valid with succeeding code versions. Within the regression testing procedure, calculations of experiments and plant sequences are performed with the same input deck but applying two different code versions. For every test-case the up-to-date code version is compared to the preceding one on the basis of physical parameters deemed to be characteristic for the test-case under consideration. In the case of post-calculations of experiments also a comparison to experimental data is carried out. Three validation cases from the regression testing procedure are presented within this paper. The very good post-calculation of the HDR E11.1 experiment shows the high quality modelling of thermal-hydraulics in ASTEC CPA. Aerosol behaviour is validated on the BMC VANAM M3 experiment, and the results show also a very good agreement with experimental data. Finally, iodine behaviour is checked in the validation test-case of the THAI IOD-11 experiment. Within this test-case, the comparison of the ASTEC versions V2.0r1 and V2.0r2 shows how an error was detected by the regression testing
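The version-to-version comparison described in this abstract can be pictured with a small sketch: two runs of the same input deck are compared on a few characteristic parameters, and any parameter that drifts beyond a tolerance is reported. This is a generic illustration only, not the GRS procedure; the function name, the dictionary layout, the parameter names in the usage note and the 1 % tolerance are all assumptions.

```python
def compare_versions(reference, candidate, rel_tol=0.01):
    """Return {parameter: worst relative deviation} for parameters beyond rel_tol.

    reference, candidate: dicts mapping a parameter name to a list of values
    sampled at the same time points by the preceding and the up-to-date version.
    """
    deviations = {}
    for name, ref_series in reference.items():
        new_series = candidate[name]
        worst = 0.0
        for ref, new in zip(ref_series, new_series):
            scale = max(abs(ref), abs(new), 1e-12)  # guard against division by zero
            worst = max(worst, abs(new - ref) / scale)
        if worst > rel_tol:
            deviations[name] = worst
    return deviations
```

A call such as `compare_versions({"pressure_bar": [1.0, 2.1]}, {"pressure_bar": [1.0, 2.4]})` would flag the hypothetical `pressure_bar` series, prompting a check of whether the change is an intended model improvement or a newly introduced error.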

  7. Linear Multivariable Regression Models for Prediction of Eddy Dissipation Rate from Available Meteorological Data

    Science.gov (United States)

    McKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.

    2005-01-01

    Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables: 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
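Read literally, the model form described in this abstract can be written as below. The symbols are assumptions introduced for illustration: x_j are the six regression variables listed above, p_j and q_j are the fixed powers, and the coefficients are fitted separately for the day and night models. Whether the powers differ between day and night is not stated in the abstract.

```latex
\begin{align}
\text{day:}   \quad \mathrm{EDR}     &= \beta_0  + \sum_{j=1}^{6} \beta_j  \, x_j^{p_j}, \\
\text{night:} \quad \ln \mathrm{EDR} &= \gamma_0 + \sum_{j=1}^{6} \gamma_j \, x_j^{q_j}.
\end{align}
```

Both forms are linear in the coefficients once the powers are fixed, which is why they can be fitted by ordinary linear regression.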

  8. The Reflection Effect on the Eclipsing Binary by the Wilson and Devinney's Model and Russell and Merrill's Model

    Directory of Open Access Journals (Sweden)

    Seong Hee Choea

    1992-06-01

    The reflection effect on three types of eclipsing binaries has been analyzed using Wilson and Devinney's model and Russell and Merrill's model. The reflection effect was displayed on the theoretical light curves for various conditions using Wilson and Devinney's light curve program. The two models were compared after rectifying the theoretical light curves, including the reflection effect, with Russell and Merrill's method. The result shows that the two models agree on the reflection effect only in cases of a small difference in temperature and albedo between the two stars in the system.

  9. Asteroseismic effects in close binary stars

    Science.gov (United States)

    Springer, Ofer M.; Shaviv, Nir J.

    2013-09-01

    Turbulent processes in the convective envelopes of the Sun and stars have been shown to be a source of internal acoustic excitations. In single stars, acoustic waves having frequencies below a certain cut-off frequency propagate nearly adiabatically and are effectively trapped below the photosphere where they are internally reflected. This reflection essentially occurs where the local wavelength becomes comparable to the pressure scale height. In close binary stars, the sound speed is a constant on equipotentials, while the pressure scale height, which depends on the local effective gravity, varies on equipotentials and may be much greater near the inner Lagrangian point (L1). As a result, waves reaching the vicinity of L1 may propagate unimpeded into low-density regions, where they tend to dissipate quickly due to non-linear and radiative effects. We study the three-dimensional propagation and enhanced damping of such waves inside a set of close binary stellar models using a WKB approximation of the acoustic field. We find that these waves can have much higher damping rates in close binaries, compared to their non-binary counterparts. We also find that the relative distribution of acoustic energy density at the visible surface of close binaries develops a ring-like feature at specific acoustic frequencies and binary separations.

  10. Reconstruction of missing daily streamflow data using dynamic regression models

    Science.gov (United States)

    Tencaliec, Patricia; Favre, Anne-Catherine; Prieur, Clémentine; Mathevet, Thibault

    2015-12-01

    River discharge is one of the most important quantities in hydrology. It provides fundamental records for water resources management and climate change monitoring. Even very short data-gaps in this information can cause extremely different analysis outputs. Therefore, reconstructing missing data of incomplete data sets is an important step regarding the performance of the environmental models, engineering, and research applications, thus it presents a great challenge. The objective of this paper is to introduce an effective technique for reconstructing missing daily discharge data when one has access to only daily streamflow data. The proposed procedure uses a combination of regression and autoregressive integrated moving average models (ARIMA) called dynamic regression model. This model uses the linear relationship between neighbor and correlated stations and then adjusts the residual term by fitting an ARIMA structure. Application of the model to eight daily streamflow data for the Durance river watershed showed that the model yields reliable estimates for the missing data in the time series. Simulation studies were also conducted to evaluate the performance of the procedure.
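The two-stage idea in this abstract (a linear relationship with a neighbouring station, then a time-series model on the residuals) can be sketched in a few lines. The example below is a simplification under stated assumptions: it uses a single neighbour and an AR(1) residual model, the simplest member of the ARIMA family, whereas the paper fits a general ARIMA structure. All function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares intercept and slope."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def fill_gaps(neighbor, target):
    """Fill None entries of target using a complete neighbor series."""
    pairs = [(x, y) for x, y in zip(neighbor, target) if y is not None]
    a, b = fit_line([x for x, _ in pairs], [y for _, y in pairs])
    resid = [None if y is None else y - (a + b * x)
             for x, y in zip(neighbor, target)]
    # AR(1) coefficient estimated from consecutive observed residuals only.
    lagged = [(resid[t - 1], resid[t]) for t in range(1, len(resid))
              if resid[t] is not None and resid[t - 1] is not None]
    den = sum(prev * prev for prev, _ in lagged)
    phi = sum(prev * cur for prev, cur in lagged) / den if den > 0 else 0.0
    filled, last = [], 0.0
    for x, y, e in zip(neighbor, target, resid):
        # Carry the last residual forward, damped by phi, across a gap.
        last = e if e is not None else phi * last
        filled.append(y if y is not None else a + b * x + last)
    return filled
```

Observed values pass through unchanged; only missing days receive the regression estimate plus the propagated residual.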

  11. Detection of Outliers in Regression Model for Medical Data

    Directory of Open Access Journals (Sweden)

    Stephen Raj S

    2017-07-01

    In regression analysis, an outlier is an observation for which the residual is large in magnitude compared to other observations in the data set. The detection of outliers and influential points is an important step of the regression analysis. Outlier detection methods have been used to detect and remove anomalous values from data. In this paper, we detect the presence of outliers in simple linear regression models for a medical data set. Chatterjee and Hadi mentioned that the ordinary residuals are not appropriate for diagnostic purposes; a transformed version of them is preferable. First, we investigate the presence of outliers based on existing procedures of residuals and standardized residuals. Next, we use a new approach of standardized scores for detecting outliers without the use of predicted values. The performance of the new approach was verified with real-life data.
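The "existing procedure" of standardized residuals referred to in this abstract can be sketched for simple linear regression as follows. Each ordinary residual is divided by its estimated standard deviation, which depends on the leverage of the point. The paper's new standardized-score approach is not reproduced here, and the function name and the cut-off of 2 in the usage note are assumptions.

```python
import math

def standardized_residuals(xs, ys):
    """Internally standardized residuals of a simple linear regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))  # residual standard error
    out = []
    for x, e in zip(xs, resid):
        leverage = 1.0 / n + (x - mx) ** 2 / sxx  # diagonal of the hat matrix
        out.append(e / (s * math.sqrt(1.0 - leverage)))
    return out
```

A common rule of thumb flags observations whose standardized residual exceeds about 2 in absolute value, for example `[i for i, r in enumerate(standardized_residuals(xs, ys)) if abs(r) > 2]`.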

  12. EREM: Parameter Estimation and Ancestral Reconstruction by Expectation-Maximization Algorithm for a Probabilistic Model of Genomic Binary Characters Evolution

    Directory of Open Access Journals (Sweden)

    Liran Carmel

    2010-01-01

    Evolutionary binary characters are features of species or genes, indicating the absence (value zero) or presence (value one) of some property. Examples include eukaryotic gene architecture (the presence or absence of an intron in a particular locus), gene content, and morphological characters. In many studies, the acquisition of such binary characters is assumed to represent a rare evolutionary event, and consequently, their evolution is analyzed using various flavors of parsimony. However, when gain and loss of the character are not rare enough, a probabilistic analysis becomes essential. Here, we present a comprehensive probabilistic model to describe the evolution of binary characters on a bifurcating phylogenetic tree. A fast software tool, EREM, is provided, using maximum likelihood to estimate the parameters of the model and to reconstruct ancestral states (presence and absence in internal nodes) and events (gain and loss events along branches).

  13. Simple model of surface roughness for binary collision sputtering simulations

    Science.gov (United States)

    Lindsey, Sloan J.; Hobler, Gerhard; Maciążek, Dawid; Postawa, Zbigniew

    2017-02-01

    It has been shown that surface roughness can strongly influence the sputtering yield - especially at glancing incidence angles where the inclusion of surface roughness leads to an increase in sputtering yields. In this work, we propose a simple one-parameter model (the "density gradient model") which imitates surface roughness effects. In the model, the target's atomic density is assumed to vary linearly between the actual material density and zero. The layer width is the sole model parameter. The model has been implemented in the binary collision simulator IMSIL and has been evaluated against various geometric surface models for 5 keV Ga ions impinging an amorphous Si target. To aid the construction of a realistic rough surface topography, we have performed MD simulations of sequential 5 keV Ga impacts on an initially crystalline Si target. We show that our new model effectively reproduces the sputtering yield, with only minor variations in the energy and angular distributions of sputtered particles. The success of the density gradient model is attributed to a reduction of the reflection coefficient - leading to increased sputtering yields, similar in effect to surface roughness.
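The one-parameter density profile described in this abstract can be written down directly. In the sketch below the atomic density rises linearly from zero to the bulk value across a layer of width `layer_width`; measuring `depth` from the outer edge of that layer is an assumed convention, and the function is an illustration rather than the IMSIL implementation.

```python
def density_gradient(depth, layer_width, bulk_density):
    """Atomic density at a given depth: linear ramp from 0 to bulk_density.

    depth is measured inward from the outer edge of the transition layer
    (an assumed convention); layer_width is the sole model parameter.
    """
    if depth <= 0.0:
        return 0.0  # outside the target
    if depth >= layer_width:
        return bulk_density  # fully inside the bulk material
    return bulk_density * depth / layer_width
```

With a normalized bulk density of 1.0 and a layer width of 2.0, the density halfway through the layer is `density_gradient(1.0, 2.0, 1.0)`, that is 0.5.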

  14. Spectroscopic and Chemometric Analysis of Binary and Ternary Edible Oil Mixtures: Qualitative and Quantitative Study.

    Science.gov (United States)

    Jović, Ozren; Smolić, Tomislav; Primožič, Ines; Hrenar, Tomica

    2016-04-19

    The aim of this study was to investigate the feasibility of FTIR-ATR spectroscopy coupled with the multivariate numerical methodology for qualitative and quantitative analysis of binary and ternary edible oil mixtures. Four pure oils (extra virgin olive oil, high oleic sunflower oil, rapeseed oil, and sunflower oil), as well as their 54 binary and 108 ternary mixtures, were analyzed using FTIR-ATR spectroscopy in combination with principal component and discriminant analysis, partial least-squares, and principal component regression. It was found that the composition of all 166 samples can be excellently represented using only the first three principal components describing 98.29% of total variance in the selected spectral range (3035-2989, 1170-1140, 1120-1100, 1093-1047, and 930-890 cm^-1). Factor scores in 3D space spanned by these three principal components form a tetrahedral-like arrangement: pure oils being at the vertices, binary mixtures at the edges, and ternary mixtures on the faces of a tetrahedron. To confirm the validity of results, we applied several cross-validation methods. Quantitative analysis was performed by minimization of root-mean-square error of cross-validation values regarding the spectral range, derivative order, and choice of method (partial least-squares or principal component regression), which resulted in excellent predictions for test sets (R^2 > 0.99 in all cases). Additionally, experimentally more demanding gas chromatography analysis of fatty acid content was carried out for all specimens, confirming the results obtained by FTIR-ATR coupled with principal component analysis. However, FTIR-ATR provided a considerably better model for prediction of mixture composition than gas chromatography, especially for high oleic sunflower oil.

  15. Effect of binary stars on the dynamical evolution of stellar clusters. II. Analytic evolutionary models

    International Nuclear Information System (INIS)

    Hills, J.G.

    1975-01-01

    We use analytic models to compute the evolution of the core of a stellar system due simultaneously to stellar evaporation, which causes the system (core) to contract, and to its binaries, which cause it to expand by progressively decreasing its binding energy. The evolution of the system is determined by two parameters: the initial number of stars in the system, N_0, and the fraction f_b of its stars which are binaries. For a fixed f_b, stellar evaporation initially dominates the dynamical evolution if N_0 is sufficiently large, because the rate of evaporation is determined chiefly by long-range encounters, which increase in importance as the number of stars in the system increases. If stellar evaporation initially dominates, the system first contracts, but as N_c, the number of remaining stars in the system, decreases by evaporation, the system reaches a minimum radius and a maximum density and then expands monotonically as N_c decreases further. Open clusters expand monotonically from the beginning if they have anything approaching average Population I binary frequencies. Globular clusters must be highly deficient in binaries in order to have formed and retained the high-density stellar cores observed in most of them. We estimate that for these systems f_b ≤ 0.15.

  16. Estimasi Model Seemingly Unrelated Regression (SUR) dengan Metode Generalized Least Square (GLS)

    Directory of Open Access Journals (Sweden)

    Ade Widyaningsih

    2015-04-01

    Regression analysis is a statistical tool that is used to determine the relationship between two or more quantitative variables so that one variable can be predicted from the other variables. A method that can be used to obtain a good estimate in regression analysis is the ordinary least squares (OLS) method. The least squares method is used to estimate the parameters of one or more regressions, but relationships among the errors of the different equations are not allowed. One way to overcome this problem is the Seemingly Unrelated Regression (SUR) model, in which parameters are estimated using Generalized Least Squares (GLS). In this study, the author applies the SUR model using the GLS method to world gasoline demand data. The author finds that SUR using GLS is better than OLS because SUR produces smaller errors than OLS.
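
    For orientation, the textbook form of the SUR estimator is sketched below. The notation (M equations, T observations each, stacked vectors y and ε, block-diagonal X, error covariance Σ) is an assumption introduced here and is not taken from the article.

```latex
% M equations, each with T observations (notation assumed for this sketch)
y_i = X_i \beta_i + \varepsilon_i, \qquad i = 1, \dots, M,
\qquad \mathrm{E}\!\left[\varepsilon_i \varepsilon_j^{\top}\right] = \sigma_{ij} I_T .

% Stacked system with X = \mathrm{diag}(X_1, \dots, X_M)
y = X\beta + \varepsilon, \qquad
\Omega = \mathrm{Var}(\varepsilon) = \Sigma \otimes I_T, \qquad
\Sigma = [\sigma_{ij}] .

% GLS estimator
\hat{\beta}_{\mathrm{GLS}}
  = \left( X^{\top} \Omega^{-1} X \right)^{-1} X^{\top} \Omega^{-1} y,
\qquad \Omega^{-1} = \Sigma^{-1} \otimes I_T .

% Feasible version: estimate \Sigma from equation-by-equation OLS residuals e_i
\hat{\sigma}_{ij} = \frac{e_i^{\top} e_j}{T} .
```

    When Σ is diagonal, that is, when the errors of different equations are uncorrelated, the GLS estimator reduces to equation-by-equation OLS, so any gain from SUR comes from the cross-equation error correlation.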

  17. Estimasi Model Seemingly Unrelated Regression (SUR) dengan Metode Generalized Least Square (GLS)

    Directory of Open Access Journals (Sweden)

    Ade Widyaningsih

    2014-06-01

    Regression analysis is a statistical tool that is used to determine the relationship between two or more quantitative variables so that one variable can be predicted from the other variables. A method that can be used to obtain a good estimate in regression analysis is the ordinary least squares (OLS) method. The least squares method is used to estimate the parameters of one or more regressions, but relationships among the errors of the different equations are not allowed. One way to overcome this problem is the Seemingly Unrelated Regression (SUR) model, in which parameters are estimated using Generalized Least Squares (GLS). In this study, the author applies the SUR model using the GLS method to world gasoline demand data. The author finds that SUR using GLS is better than OLS because SUR produces smaller errors than OLS.

  18. Binary Star Fractions from the LAMOST DR4

    Science.gov (United States)

    Tian, Zhi-Jia; Liu, Xiao-Wei; Yuan, Hai-Bo; Chen, Bing-Qiu; Xiang, Mao-Sheng; Huang, Yang; Wang, Chun; Zhang, Hua-Wei; Guo, Jin-Cheng; Ren, Juan-Juan; Huo, Zhi-Ying; Yang, Yong; Zhang, Meng; Bi, Shao-Lan; Yang, Wu-Ming; Liu, Kang; Zhang, Xian-Fei; Li, Tan-Da; Wu, Ya-Qian; Zhang, Jing-Hua

    2018-05-01

    Stellar systems composed of single, double, triple or higher-order systems are rightfully regarded as the fundamental building blocks of the Milky Way. Binary stars play an important role in the formation and evolution of the Galaxy. By comparing the radial velocity variations from multi-epoch observations, we analyze the binary fraction of dwarf stars observed with LAMOST. Effects of different model assumptions, such as orbital period distributions, on the estimate of binary fractions are investigated. The results based on a log-normal distribution of orbital periods reproduce the previous complete analyses better than the power-law distribution. We find that the binary fraction increases with T_eff and decreases with [Fe/H]. We first investigate the relation between α-elements and binary fraction in such a large sample as provided by LAMOST. The old stars with high [α/Fe] have a higher binary fraction than young stars with low [α/Fe]. At the same mass, earlier forming stars possess a higher binary fraction than newly forming ones, which may be related to the evolution of the Galaxy.

  19. Two-dimensional model of laser alloying of binary alloy powder with interval of melting temperature

    Science.gov (United States)

    Knyzeva, A. G.; Sharkeev, Yu. P.

    2017-10-01

    The paper presents a two-dimensional model of laser beam melting of binary alloy powders. The model takes into account that the alloy melts over a temperature interval between the solidus and liquidus temperatures. The external source corresponds to a laser beam whose energy density follows a Gaussian distribution. The source moves along the treated surface according to a given trajectory. The model allows the temperature distribution and the thickness of the powder layer to be investigated as functions of the technological parameters.

  20. Population of Nuclei Via 7Li-Induced Binary Reactions

    International Nuclear Information System (INIS)

    Clark, Rodney M.; Phair, Larry W.; Descovich, M.; Cromaz, Mario; Deleplanque, M.A.; Fallon, Paul; Lee, I-Yang; Macchiavelli, A.O.; McMahan, Margaret A.; Moretto, Luciano G.; Rodriguez-Vieitez, E.; Sinha, Shrabani; Stephens, Frank S.; Ward, David; Wiedeking, Mathis

    2005-01-01

    The authors have investigated the population of nuclei formed in binary reactions involving ^7Li beams on targets of ^160Gd and ^184W. The ^7Li + ^184W data were taken in the first experiment using the LIBERACE Ge-array in combination with the STARS Si ΔE-E telescope system at the 88-Inch Cyclotron of the Lawrence Berkeley National Laboratory. By using the Wilczynski binary transfer model, in combination with a standard evaporation model, they are able to reproduce the experimental results. This is a useful method for predicting the population of neutron-rich heavy nuclei formed in binary reactions involving beams of weakly bound nuclei and will be of use in future spectroscopic studies.

  1. COSMIC probes into compact binary formation and evolution

    Science.gov (United States)

    Breivik, Katelyn

    2018-01-01

    The population of compact binaries in the galaxy represents the final state of all binaries that have lived up to the present epoch. Compact binaries present a unique opportunity to probe binary evolution since many of the interactions binaries experience can be imprinted on the compact binary population. By combining binary evolution simulations with catalogs of observable compact binary systems, we can distill the dominant physical processes that govern binary star evolution, as well as predict the abundance and variety of their end products. The next decades herald a previously unseen opportunity to study compact binaries. Multi-messenger observations from telescopes across all wavelengths and gravitational-wave observatories spanning several decades of frequency will give an unprecedented view into the structure of these systems and the composition of their components. Observations will not always be coincident and in some cases may be separated by several years, providing an avenue for simulations to better constrain binary evolution models in preparation for future observations. I will present the results of three population synthesis studies of compact binary populations carried out with the Compact Object Synthesis and Monte Carlo Investigation Code (COSMIC). I will first show how binary-black-hole formation channels can be understood with LISA observations. I will then show how the population of double white dwarfs observed with LISA and Gaia could provide a detailed view of mass transfer and accretion. Finally, I will show that Gaia could discover thousands of black holes in the Milky Way through astrometric observations, yielding a view into black-hole astrophysics that is complementary to and independent from both X-ray and gravitational-wave astronomy.

  2. A Conditional Curie-Weiss Model for Stylized Multi-group Binary Choice with Social Interaction

    Science.gov (United States)

    Opoku, Alex Akwasi; Edusei, Kwame Owusu; Ansah, Richard Kwame

    2018-04-01

    This paper proposes a conditional Curie-Weiss model as a model for decision making in a stylized society made up of binary decision makers that face a particular dichotomous choice between two options. Following Brock and Durlauf (Discrete choice with social interaction I: theory, 1955), we set up both socio-economic and statistical mechanical models for the choice problem. We point out when both the socio-economic and statistical mechanical models give rise to the same self-consistent equilibrium mean choice level(s). The phase diagram of the associated statistical mechanical model and its socio-economic implications are discussed.
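
    To make "self-consistent equilibrium mean choice level" concrete, the sketch below iterates the mean-field equation of the standard single-group Curie-Weiss model, m = tanh(β(Jm + h)). The conditional multi-group model of the paper is more general; the function name, parameter names, and starting point here are assumptions for illustration only.

```python
import math


def mean_choice(beta, coupling, field, m0=0.5, tol=1e-12, max_iter=10000):
    """Fixed-point iteration for m = tanh(beta * (coupling * m + field)).

    beta: inverse "temperature" (strength of choice determinism),
    coupling: social interaction strength J,
    field: private bias h toward one option.
    All names are hypothetical; this is the single-group textbook case.
    """
    m = m0
    for _ in range(max_iter):
        m_next = math.tanh(beta * (coupling * m + field))
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    return m
```

    With no private bias, the iteration settles at zero when `beta * coupling` is below one and at a nonzero value when it is above one, which is the kind of transition a phase diagram summarizes.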

  3. On pseudo-values for regression analysis in competing risks models

    DEFF Research Database (Denmark)

    Graw, F; Gerds, Thomas Alexander; Schumacher, M

    2009-01-01

    For regression on state and transition probabilities in multi-state models Andersen et al. (Biometrika 90:15-27, 2003) propose a technique based on jackknife pseudo-values. In this article we analyze the pseudo-values suggested for competing risks models and prove some conjectures regarding their...

  4. High-energy gamma-ray emission in compact binaries

    International Nuclear Information System (INIS)

    Cerutti, Benoit

    2010-01-01

    Four gamma-ray sources have been associated with binary systems in our Galaxy: the micro-quasar Cygnus X-3 and the gamma-ray binaries LS I +61 degrees 303, LS 5039 and PSR B1259-63. These systems are composed of a massive companion star and a compact object of unknown nature, except in PSR B1259-63 where there is a young pulsar. I propose a comprehensive theoretical model for the high-energy gamma-ray emission and variability in gamma-ray emitting binaries. In this model, the high-energy radiation is produced by inverse Compton scattering of stellar photons on ultra-relativistic electron-positron pairs injected by a young pulsar in gamma-ray binaries and in a relativistic jet in micro-quasars. Considering anisotropic inverse Compton scattering, pair production and pair cascade emission, the TeV gamma-ray emission is well explained in LS 5039. Nevertheless, this model cannot account for the gamma-ray emission in LS I +61 degrees 303 and PSR B1259-63. Other processes should dominate in these complex systems. In Cygnus X-3, the gamma-ray radiation is convincingly reproduced by Doppler-boosted Compton emission of pairs in a relativistic jet. Gamma-ray binaries and micro-quasars provide a novel environment for the study of pulsar winds and relativistic jets at very small spatial scales. (author)

  5. Modeling and prediction of Turkey's electricity consumption using Support Vector Regression

    International Nuclear Information System (INIS)

    Kavaklioglu, Kadir

    2011-01-01

    Support Vector Regression (SVR) methodology is used to model and predict Turkey's electricity consumption. Among various SVR formalisms, ε-SVR method was used since the training pattern set was relatively small. Electricity consumption is modeled as a function of socio-economic indicators such as population, Gross National Product, imports and exports. In order to facilitate future predictions of electricity consumption, a separate SVR model was created for each of the input variables using their current and past values; and these models were combined to yield consumption prediction values. A grid search for the model parameters was performed to find the best ε-SVR model for each variable based on Root Mean Square Error. Electricity consumption of Turkey is predicted until 2026 using data from 1975 to 2006. The results show that electricity consumption can be modeled using Support Vector Regression and the models can be used to predict future electricity consumption. (author)
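
    Two quantities mentioned above can be written down directly: the ε-insensitive loss that defines ε-SVR training, and the Root Mean Square Error used to rank candidate models in a grid search. The sketch below is a minimal illustration with hypothetical function names; it does not reproduce the study's models or data.

```python
import math


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Loss that ignores errors smaller than epsilon and grows
    linearly beyond that tube."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

    In a grid search, each candidate parameter combination would be fitted and scored with `rmse` on held-out values, and the combination with the lowest score retained.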

  6. Cluster regression model and level fluctuation features of Van Lake, Turkey

    Directory of Open Access Journals (Sweden)

    Z. Şen

    1999-02-01

    Lake water levels change under the influences of natural and/or anthropogenic environmental conditions. Among these influences are the climate change, greenhouse effects and ozone layer depletions which are reflected in the hydrological cycle features over the lake drainage basins. Lake levels are among the most significant hydrological variables that are influenced by different atmospheric and environmental conditions. Consequently, lake level time series in many parts of the world include nonstationarity components such as shifts in the mean value, apparent or hidden periodicities. On the other hand, many lake level modeling techniques have a stationarity assumption. The main purpose of this work is to develop a cluster regression model for dealing with nonstationarity especially in the form of shifting means. The basis of this model is the combination of transition probability and classical regression technique. Both parts of the model are applied to monthly level fluctuations of Lake Van in eastern Turkey. It is observed that the cluster regression procedure does preserve the statistical properties and the transitional probabilities that are indistinguishable from the original data. Key words. Hydrology (hydrologic budget; stochastic processes) · Meteorology and atmospheric dynamics (ocean-atmosphere interactions)

  7. Cluster regression model and level fluctuation features of Van Lake, Turkey

    Directory of Open Access Journals (Sweden)

    Z. Şen

    Lake water levels change under the influences of natural and/or anthropogenic environmental conditions. Among these influences are the climate change, greenhouse effects and ozone layer depletions which are reflected in the hydrological cycle features over the lake drainage basins. Lake levels are among the most significant hydrological variables that are influenced by different atmospheric and environmental conditions. Consequently, lake level time series in many parts of the world include nonstationarity components such as shifts in the mean value, apparent or hidden periodicities. On the other hand, many lake level modeling techniques have a stationarity assumption. The main purpose of this work is to develop a cluster regression model for dealing with nonstationarity especially in the form of shifting means. The basis of this model is the combination of transition probability and classical regression technique. Both parts of the model are applied to monthly level fluctuations of Lake Van in eastern Turkey. It is observed that the cluster regression procedure does preserve the statistical properties and the transitional probabilities that are indistinguishable from the original data.

    Key words. Hydrology (hydrologic budget; stochastic processes) · Meteorology and atmospheric dynamics (ocean-atmosphere interactions)

  8. Improved model of the retardance in citric acid coated ferrofluids using stepwise regression

    Science.gov (United States)

    Lin, J. F.; Qiu, X. R.

    2017-06-01

    Citric acid (CA) coated Fe3O4 ferrofluids (FFs) have been prepared for biomedical applications. The magneto-optical retardance of CA coated FFs was measured by a Stokes polarimeter. Optimization and multiple regression of retardance in FFs were previously carried out using the Taguchi method and Microsoft Excel, and the F value of the regression model was large enough. However, the model executed by Excel was not systematic. Instead, we adopted stepwise regression to model the retardance of CA coated FFs. From the results of stepwise regression in MATLAB, the developed model had high predictive ability, with an F value of 2.55897e+7 and a correlation coefficient of one. The average absolute error of predicted retardances relative to measured retardances was just 0.0044%. Using the genetic algorithm (GA) in MATLAB, the optimized parametric combination was determined as [4.709 0.12 39.998 70.006], corresponding to the pH of suspension, molar ratio of CA to Fe3O4, CA volume, and coating temperature. The maximum retardance was found to be 31.712°, close to that obtained by the evolutionary solver in Excel, with a relative error of -0.013%. Above all, the stepwise regression method was successfully used to model the retardance of CA coated FFs, and the maximum global retardance was determined by the use of GA.

  9. Black Hole/Pulsar Binaries in the Galaxy

    Science.gov (United States)

    Shao, Yong; Li, Xiang-Dong

    2018-04-01

    We have performed population synthesis calculations on the formation of binaries containing a black hole (BH) and a neutron star (NS) in the Galactic disk. Some important input parameters, especially for the treatment of common envelope evolution, are updated in the calculation. We have discussed the uncertainties from the star formation rate of the Galaxy and the velocity distribution of NS kicks on the birthrate (~0.6-13 Myr^{-1}) of BH/NS binaries. From incident BH/NS binaries, by modelling the orbital evolution due to gravitational wave radiation and the NS evolution as radio pulsars, we obtain the distributions of the observable parameters such as the orbital period, eccentricity and pulse period of the BH/pulsar binaries. We estimate that there may be ~3-80 BH/pulsar binaries in the Galactic disk and around 10% of them could be detected by the Five-hundred-meter Aperture Spherical radio Telescope.

  10. Black hole/pulsar binaries in the Galaxy

    Science.gov (United States)

    Shao, Yong; Li, Xiang-Dong

    2018-06-01

    We have performed population synthesis calculations on the formation of binaries containing a black hole (BH) and a neutron star (NS) in the Galactic disc. Some important input parameters, especially for the treatment of common envelope evolution, are updated in the calculation. We have discussed the uncertainties from the star formation rate of the Galaxy and the velocity distribution of NS kicks on the birthrate (~0.6-13 Myr^{-1}) of BH/NS binaries. From incident BH/NS binaries, by modelling the orbital evolution due to gravitational wave radiation and the NS evolution as radio pulsars, we obtain the distributions of the observable parameters such as the orbital period, eccentricity, and pulse period of the BH/pulsar binaries. We estimate that there may be ~3-80 BH/pulsar binaries in the Galactic disc and around 10 per cent of them could be detected by the Five-hundred-metre Aperture Spherical radio Telescope.

  11. A general equation to obtain multiple cut-off scores on a test from multinomial logistic regression.

    Science.gov (United States)

    Bersabé, Rosa; Rivas, Teresa

    2010-05-01

    The authors derive a general equation to compute multiple cut-offs on a total test score in order to classify individuals into more than two ordinal categories. The equation is derived from the multinomial logistic regression (MLR) model, which is an extension of the binary logistic regression (BLR) model to accommodate polytomous outcome variables. From this analytical procedure, cut-off scores are established at the test score (the predictor variable) at which an individual is as likely to be in category j as in category j+1 of an ordinal outcome variable. The application of the complete procedure is illustrated by an example with data from an actual study on eating disorders. In this example, two cut-off scores on the Eating Attitudes Test (EAT-26) scores are obtained in order to classify individuals into three ordinal categories: asymptomatic, symptomatic and eating disorder. Diagnoses were made from the responses to a self-report (Q-EDD) that operationalises DSM-IV criteria for eating disorders. Alternatives to the MLR model to set multiple cut-off scores are discussed.
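
    The idea of placing a cut-off where two adjacent categories are equally likely can be worked through for a single predictor. The sketch assumes a baseline-category parametrization, log(P_j / P_ref) = a_j + b_j * x, in which equating categories j and j+1 gives x = (a_j - a_{j+1}) / (b_{j+1} - b_j). The function and parameter names are hypothetical, and the article's own general equation may be written differently.

```python
def cutoff_score(a_j, b_j, a_next, b_next):
    """Test score at which categories j and j+1 are equally likely.

    Assumes log(P_j / P_ref) = a_j + b_j * x for each category, with the
    reference category having intercept 0 and slope 0.
    """
    if b_next == b_j:
        raise ValueError("equal slopes: the two categories never cross")
    return (a_j - a_next) / (b_next - b_j)
```

    With three ordinal categories, calling the function once for categories (0, 1) and once for (1, 2) yields the two cut-off scores; the coefficients would come from a fitted MLR model.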

  12. Binary classification posed as a quadratically constrained quadratic ...

    Indian Academy of Sciences (India)

    Binary classification is posed as a quadratically constrained quadratic problem and solved using the proposed method. Each class in the binary classification problem is modeled as a multidimensional ellipsoid to form a quadratic constraint in the problem. Particle swarms help in determining the optimal hyperplane or ...

  13. Accounting for spatial effects in land use regression for urban air pollution modeling.

    Science.gov (United States)

    Bertazzon, Stefania; Johnson, Markey; Eccles, Kristin; Kaplan, Gilaad G

    2015-01-01

    In order to accurately assess air pollution risks, health studies require spatially resolved pollution concentrations. Land-use regression (LUR) models estimate ambient concentrations at a fine spatial scale. However, spatial effects such as spatial non-stationarity and spatial autocorrelation can reduce the accuracy of LUR estimates by increasing regression errors and uncertainty; and statistical methods for resolving these effects, e.g., spatially autoregressive (SAR) and geographically weighted regression (GWR) models, may be difficult to apply simultaneously. We used an alternate approach to address spatial non-stationarity and spatial autocorrelation in LUR models for nitrogen dioxide. Traditional models were re-specified to include a variable capturing wind speed and direction, and re-fit as GWR models. Mean R^2 values for the resulting GWR-wind models (summer: 0.86, winter: 0.73) showed a 10-20% improvement over traditional LUR models. GWR-wind models effectively addressed both spatial effects and produced meaningful predictive models. These results suggest a useful method for improving spatially explicit models. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  14. Regression analysis understanding and building business and economic models using Excel

    CERN Document Server

    Wilson, J Holton

    2012-01-01

    The technique of regression analysis is used so often in business and economics today that an understanding of its use is necessary for almost everyone engaged in the field. This book will teach you the essential elements of building and understanding regression models in a business/economic context in an intuitive manner. The authors take a non-theoretical treatment that is accessible even if you have a limited statistical background. It is specifically designed to teach the correct use of regression, while advising you of its limitations and teaching about common pitfalls. This book describe

  15. Measurement and modelling of hydrogen bonding in 1-alkanol plus n-alkane binary mixtures

    DEFF Research Database (Denmark)

    von Solms, Nicolas; Jensen, Lars; Kofod, Jonas L.

    2007-01-01

    Two equations of state (simplified PC-SAFT and CPA) are used to predict the monomer fraction of 1-alkanols in binary mixtures with n-alkanes. It is found that the choice of parameters and association schemes significantly affects the ability of a model to predict hydrogen bonding in mixtures, even though pure-component liquid densities and vapour pressures are predicted equally accurately for the associating compound. As was the case in the study of pure components, there exists some confusion in the literature about the correct interpretation and comparison of experimental data and theoretical studies, which is clarified in the present work. New hydrogen bonding data based on infrared spectroscopy are reported for seven binary mixtures of alcohols and alkanes. © 2007 Elsevier B.V. All rights reserved.

  16. The prediction of intelligence in preschool children using alternative models to regression.

    Science.gov (United States)

    Finch, W Holmes; Chang, Mei; Davis, Andrew S; Holden, Jocelyn E; Rothlisberg, Barbara A; McIntosh, David E

    2011-12-01

    Statistical prediction of an outcome variable using multiple independent variables is a common practice in the social and behavioral sciences. For example, neuropsychologists are sometimes called upon to provide predictions of preinjury cognitive functioning for individuals who have suffered a traumatic brain injury. Typically, these predictions are made using standard multiple linear regression models with several demographic variables (e.g., gender, ethnicity, education level) as predictors. Prior research has shown conflicting evidence regarding the ability of such models to provide accurate predictions of outcome variables such as full-scale intelligence (FSIQ) test scores. The present study had two goals: (1) to demonstrate the utility of a set of alternative prediction methods that have been applied extensively in the natural sciences and business but have not been frequently explored in the social sciences and (2) to develop models that can be used to predict premorbid cognitive functioning in preschool children. Predictions of Stanford-Binet 5 FSIQ scores for preschool-aged children are used to compare the performance of a multiple regression model with several of these alternative methods. Results demonstrate that classification and regression trees provided more accurate predictions of FSIQ scores than did the more traditional regression approach. Implications of these results are discussed.

  17. A primer for biomedical scientists on how to execute model II linear regression analysis.

    Science.gov (United States)

    Ludbrook, John

    2012-04-01

    1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
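
    The point estimates for ordinary least products regression are simple enough to compute by hand, which the sketch below illustrates. It assumes the usual definition of OLP (also called geometric mean or reduced major axis regression), in which the slope is the ratio of the standard deviations of y and x, signed by the covariance. The function name is hypothetical, and confidence intervals are deliberately omitted, since the abstract notes that simple calculations of them are inaccurate and that smatr obtains them by bootstrapping.

```python
import math


def olp_fit(x, y):
    """Ordinary least products (geometric mean) regression.

    Returns (slope, intercept). Only point estimates are computed;
    no confidence intervals are attempted here.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Slope magnitude is sd(y) / sd(x); its sign follows the covariance.
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

    Unlike the Model I least squares slope, this estimate is symmetric in the two variables: swapping x and y gives the reciprocal slope.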

  18. Interaction of Massive Black Hole Binaries with Their Stellar Environment. II. Loss Cone Depletion and Binary Orbital Decay

    Science.gov (United States)

    Sesana, Alberto; Haardt, Francesco; Madau, Piero

    2007-05-01

    We study the long-term evolution of massive black hole binaries (MBHBs) at the centers of galaxies using detailed scattering experiments to solve the full three-body problem. Ambient stars drawn from an isotropic Maxwellian distribution unbound to the binary are ejected by the gravitational slingshot. We construct a minimal, hybrid model for the depletion of the loss cone and the orbital decay of the binary and show that secondary slingshots (stars returning on small-impact-parameter orbits to have a second superelastic scattering with the MBHB) may considerably help the shrinking of the pair in the case of large binary mass ratios. In the absence of loss cone refilling by two-body relaxation or other processes, the mass ejected before the stalling of a MBHB is half the binary reduced mass. About 50% of the ejected stars are expelled in a "burst" lasting ~10^4 yr M_6^{1/4}, where M_6 is the binary mass in units of 10^6 M_sun. The loss cone is completely emptied in a few bulge crossing timescales, ~10^7 yr M_6^{1/4}. Even in the absence of two-body relaxation or gas dynamical processes, unequal mass and/or eccentric binaries with M_6 ≳ 0.1 can shrink to the gravitational wave emission regime in less than a Hubble time and are therefore "safe" targets for the planned Laser Interferometer Space Antenna.

  19. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    Science.gov (United States)

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
    The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity

  20. Applied logistic regression

    CERN Document Server

    Hosmer, David W; Sturdivant, Rodney X

    2013-01-01

     A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-

  1. Evaluation of Logistic Regression and Multivariate Adaptive Regression Spline Models for Groundwater Potential Mapping Using R and GIS

    Directory of Open Access Journals (Sweden)

    Soyoung Park

    2017-07-01

    This study mapped and analyzed groundwater potential using two different models, logistic regression (LR) and multivariate adaptive regression splines (MARS), and compared the results. A spatial database was constructed for groundwater well data and groundwater influence factors. Groundwater well data with a high potential yield of ≥70 m³/d were extracted, and 859 locations (70%) were used for model training, whereas the other 365 locations (30%) were used for model validation. We analyzed 16 groundwater influence factors including altitude, slope degree, slope aspect, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport index, distance from drainage, drainage density, lithology, distance from fault, fault density, distance from lineament, lineament density, and land cover. Groundwater potential maps (GPMs) were constructed using LR and MARS models and tested using a receiver operating characteristics curve. Based on this analysis, the area under the curve (AUC) for the success rate curve of GPMs created using the MARS and LR models was 0.867 and 0.838, and the AUC for the prediction rate curve was 0.836 and 0.801, respectively. This implies that the MARS model is useful and effective for groundwater potential analysis in the study area.
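
    The AUC comparison reported above can be illustrated with a minimal sketch. It computes the area under the ROC curve through the pairwise (Mann-Whitney) identity: the fraction of positive/negative pairs in which the positive case receives the higher score. The labels and the two score lists are hypothetical and are not data from the study; the function name `auc` is likewise an assumption.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical validation set: 1 = productive well, 0 = non-productive well.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]  # scores from a hypothetical model A
scores_b = [0.9, 0.3, 0.4, 0.5, 0.6, 0.2]  # scores from a hypothetical model B

print(auc(labels, scores_a), auc(labels, scores_b))
```

    The model with the larger AUC ranks productive locations above non-productive ones more often, which is the sense in which the abstract compares MARS and LR.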

  2. BINARY CENTRAL STARS OF PLANETARY NEBULAE DISCOVERED THROUGH PHOTOMETRIC VARIABILITY. II. MODELING THE CENTRAL STARS OF NGC 6026 AND NGC 6337

    International Nuclear Information System (INIS)

    Hillwig, Todd C.; Bond, Howard E.; Afsar, Melike; De Marco, Orsola

    2010-01-01

    Close-binary central stars of planetary nebulae (CSPNe) provide an opportunity to explore the evolution of PNe, their shaping, and the evolution of binary systems undergoing a common-envelope phase. Here, we present the results of time-resolved photometry of the binary central stars (CSs) of the PNe NGC 6026 and NGC 6337 as well as time-resolved spectroscopy of the CS of NGC 6026. The results of a period analysis give an orbital period of 0.528086(4) days for NGC 6026 and a photometric period of 0.1734742(5) days for NGC 6337. In the case of NGC 6337, it appears that the photometric period reflects the orbital period and that the variability is the result of the irradiated hemisphere of a cool companion. The inclination of the thin PN ring is nearly face-on. Our modeled inclination range for the close central binary includes nearly face-on alignments and provides evidence for a direct binary-nebular shaping connection. For NGC 6026, however, the radial-velocity curve shows that the orbital period is twice the photometric period. In this case, the photometric variability is due to an ellipsoidal effect in which the CS nearly fills its Roche lobe and the companion is most likely a hot white dwarf. NGC 6026 then is the third PN with a confirmed central binary where the companion is compact. Based on the data and modeling using a Wilson-Devinney code, we discuss the physical parameters of the two systems and how they relate to the known sample of close-binary CSs, which comprise 15%-20% of all PNe.

  3. Detection of Cutting Tool Wear using Statistical Analysis and Regression Model

    Science.gov (United States)

    Ghani, Jaharah A.; Rizal, Muhammad; Nuawi, Mohd Zaki; Haron, Che Hassan Che; Ramli, Rizauddin

    2010-10-01

    This study presents a new method for detecting cutting tool wear based on the measured cutting force signals. A statistical method, the Integrated Kurtosis-based Algorithm for Z-Filter technique (I-kaz), was used to develop a regression model and a 3D graphic presentation of the I-kaz 3D coefficient during the machining process. The machining tests were carried out using a CNC turning machine Colchester Master Tornado T4 in dry cutting condition. A Kistler 9255B dynamometer was used to measure the cutting force signals, which were transmitted, analyzed, and displayed in the DasyLab software. Various force signals from the machining operation were analyzed, and each has its own I-kaz 3D coefficient. This coefficient was examined and its relationship with flank wear land (VB) was determined. A regression model was developed from this relationship, and the results of the regression model show that the I-kaz 3D coefficient value decreases as tool wear increases. The result is then used for real-time tool wear monitoring.

  4. Support vector regression and artificial neural network models for stability indicating analysis of mebeverine hydrochloride and sulpiride mixtures in pharmaceutical preparation: A comparative study

    Science.gov (United States)

    Naguib, Ibrahim A.; Darwish, Hany W.

    2012-02-01

    A comparison between support vector regression (SVR) and Artificial Neural Networks (ANNs) multivariate regression methods is established showing the underlying algorithm for each and making a comparison between them to indicate the inherent advantages and limitations. In this paper we compare SVR to ANN with and without variable selection procedure (genetic algorithm (GA)). To project the comparison in a sensible way, the methods are used for the stability indicating quantitative analysis of mixtures of mebeverine hydrochloride and sulpiride in binary mixtures as a case study in presence of their reported impurities and degradation products (summing up to 6 components) in raw materials and pharmaceutical dosage form via handling the UV spectral data. For proper analysis, a 6 factor 5 level experimental design was established resulting in a training set of 25 mixtures containing different ratios of the interfering species. An independent test set consisting of 5 mixtures was used to validate the prediction ability of the suggested models. The proposed methods (linear SVR (without GA) and linear GA-ANN) were successfully applied to the analysis of pharmaceutical tablets containing mebeverine hydrochloride and sulpiride mixtures. The results manifest the problem of nonlinearity and how models like the SVR and ANN can handle it. The methods indicate the ability of the mentioned multivariate calibration models to deconvolute the highly overlapped UV spectra of the 6 components' mixtures, yet using cheap and easy to handle instruments like the UV spectrophotometer.

  5. Thermodynamics of Binary Mixed Crystals in the Sub-quasi-chemical/Debye Approximation

    Science.gov (United States)

    van der Kemp, W. J. M.; Verdonk, M. L.

    1995-03-01

    A new statistical model for the description of the thermodynamic properties of binary mixed crystals is discussed. The model is based on an asymmetrical analogue of the quasi-chemical approximation and the Debye model of a solid. With two interchange-energy parameters and two interchange-Debye-temperature parameters, all important thermodynamic functions, at constant volume, of the binary mixed crystal can be calculated as a function of temperature and composition. The binary system {(1 - x)NaI + xKI}(s) is used for illustration of the model.

  6. Using the Logistic Regression model in supporting decisions of establishing marketing strategies

    Directory of Open Access Journals (Sweden)

    Cristinel CONSTANTIN

    2015-12-01

    This paper presents instrumental research on the use of the Logistic Regression model for data analysis in marketing research. Decision makers inside different organisations need relevant information to support their decisions regarding marketing strategies. The data provided by marketing research can be processed in various ways, but multivariate data analysis models can enhance the utility of the information. Among these models is the Logistic Regression model, which is used for dichotomous variables. Our research explains the utility of this model and the interpretation of the resulting information in order to help practitioners and researchers use it in their future investigations

  7. Vector regression introduced

    Directory of Open Access Journals (Sweden)

    Mok Tik

    2014-06-01

    This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable) is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables) and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables) also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to represent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
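
    The idea of encoding 2-D vectors as complex numbers and estimating vector-valued coefficients by least squares can be sketched as follows. This is a simplified illustration, not the paper's procedure: it solves the complex system directly with NumPy rather than first mapping it to a real system through the isomorphism the abstract mentions, and the data, the coefficient values, and the function name `fit_vector_regression` are all hypothetical.

```python
import numpy as np


def fit_vector_regression(X, y):
    """Least-squares fit of y ≈ X @ b where X, y and b are complex-valued."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# 2-D vectors (east, north) encoded as east + 1j * north (made-up values).
x = np.array([1 + 0j, 0 + 1j, 1 + 1j, 2 - 1j])

# A complex coefficient rotates and scales its input vector, so the
# estimated coefficient is itself a vector quantity.
true_slope = 0.5 + 0.5j
true_offset = 1 - 2j
y = true_slope * x + true_offset

# Design matrix with an intercept column and one explanatory vector variable.
X = np.column_stack([np.ones_like(x), x])
b0, b1 = fit_vector_regression(X, y)
print(b0, b1)
```

    With noise-free data the fit recovers the offset and slope used to generate `y`; with real observations the same call returns the least-squares estimates.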

  8. Multiple regression models for energy use in air-conditioned office buildings in different climates

    International Nuclear Information System (INIS)

    Lam, Joseph C.; Wan, Kevin K.W.; Liu Dalong; Tsang, C.L.

    2010-01-01

    An attempt was made to develop multiple regression models for office buildings in the five major climates in China - severe cold, cold, hot summer and cold winter, mild, and hot summer and warm winter. A total of 12 key building design variables were identified through parametric and sensitivity analysis, and considered as inputs in the regression models. The coefficient of determination R² varies from 0.89 in Harbin to 0.97 in Kunming, indicating that 89-97% of the variations in annual building energy use can be explained by the changes in the 12 parameters. A pseudo-random number generator based on three simple multiplicative congruential generators was employed to generate random designs for evaluation of the regression models. The differences between regression-predicted and DOE-simulated annual building energy use are largely within 10%. It is envisaged that the regression models developed can be used to estimate the likely energy savings/penalty during the initial design stage when different building schemes and design concepts are being considered.
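
    The interpretation of the coefficient of determination given above (the share of variation explained by the model) can be made concrete with a short sketch. The energy-use figures below are invented for illustration and are not taken from the study; the function name `r_squared` is an assumption.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical annual energy use: simulated values vs regression predictions.
simulated = [100.0, 120.0, 140.0, 160.0]
predicted = [102.0, 118.0, 141.0, 159.0]

r2 = r_squared(simulated, predicted)
print(f"{100 * r2:.1f}% of the variation is explained")
```

    A value of 0.89 would be read the same way: 89% of the variation in the simulated values is accounted for by the regression.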

  9. CICAAR - Convolutive ICA with an Auto-Regressive Inverse Model

    DEFF Research Database (Denmark)

    Dyrholm, Mads; Hansen, Lars Kai

    2004-01-01

    We invoke an auto-regressive IIR inverse model for convolutive ICA and derive expressions for the likelihood and its gradient. We argue that optimization will give a stable inverse. When there are more sensors than sources the mixing model parameters are estimated in a second step by least square...... estimation. We demonstrate the method on synthetic data and finally separate speech and music in a real room recording....

  10. Two levels ARIMAX and regression models for forecasting time series data with calendar variation effects

    Science.gov (United States)

    Suhartono; Lee, Muhammad Hisyam; Prastyo, Dedy Dwi

    2015-12-01

    The aim of this research is to develop a calendar variation model for forecasting retail sales data with the Eid ul-Fitr effect. The proposed model is based on two methods, namely two levels ARIMAX and regression methods. Two levels ARIMAX and regression models are built by using ARIMAX for the first level and regression for the second level. Monthly men's jeans and women's trousers sales in a retail company for the period January 2002 to September 2009 are used as case study. In general, two levels of calendar variation model yields two models, namely the first model to reconstruct the sales pattern that already occurred, and the second model to forecast the effect of increasing sales due to Eid ul-Fitr that affected sales at the same and the previous months. The results show that the proposed two level calendar variation model based on ARIMAX and regression methods yields better forecast compared to the seasonal ARIMA model and Neural Networks.

  11. Pulsars in binary systems: probing binary stellar evolution and general relativity.

    Science.gov (United States)

    Stairs, Ingrid H

    2004-04-23

    Radio pulsars in binary orbits often have short millisecond spin periods as a result of mass transfer from their companion stars. They therefore act as very precise, stable, moving clocks that allow us to investigate a large set of otherwise inaccessible astrophysical problems. The orbital parameters derived from high-precision binary pulsar timing provide constraints on binary evolution, characteristics of the binary pulsar population, and the masses of neutron stars with different mass-transfer histories. These binary systems also test gravitational theories, setting strong limits on deviations from general relativity. Surveys for new pulsars yield new binary systems that increase our understanding of all these fields and may open up whole new areas of physics, as most spectacularly evidenced by the recent discovery of an extremely relativistic double-pulsar system.

  12. Testing and Modeling Fuel Regression Rate in a Miniature Hybrid Burner

    Directory of Open Access Journals (Sweden)

    Luciano Fanton

    2012-01-01

    Ballistic characterization of an extended group of innovative HTPB-based solid fuel formulations for hybrid rocket propulsion was performed in a lab-scale burner. An optical time-resolved technique was used to assess the quasisteady regression history of single perforation, cylindrical samples. The effects of metalized additives and radiant heat transfer on the regression rate of such formulations were assessed. Under the investigated operating conditions and based on phenomenological models from the literature, analyses of the collected experimental data show an appreciable influence of the radiant heat flux from burnt gases and soot for both unloaded and loaded fuel formulations. Pure HTPB regression rate data are satisfactorily reproduced, while the impressive initial regression rates of metalized formulations require further assessment.

  13. The role of multicollinearity in landslide susceptibility assessment by means of Binary Logistic Regression: comparison between VIF and AIC stepwise selection

    Science.gov (United States)

    Cama, Mariaelena; Cristi Nicu, Ionut; Conoscenti, Christian; Quénéhervé, Geraldine; Maerker, Michael

    2016-04-01

    Landslide susceptibility can be defined as the likelihood of a landslide occurring in a given area on the basis of local terrain conditions. In recent decades much research has focused on its evaluation by means of stochastic approaches under the assumption that 'the past is the key to the future', which means that if a model is able to reproduce a known landslide spatial distribution, it will be able to predict the future locations of new (i.e. unknown) slope failures. Among the various stochastic approaches, Binary Logistic Regression (BLR) is one of the most used because it calculates the susceptibility in probabilistic terms and its results are easily interpretable from a geomorphological point of view. However, little importance is often given to the assessment of multicollinearity, whose effect is that the coefficient estimates are unstable, may have the opposite sign, and are therefore difficult to interpret. It should therefore be evaluated every time in order to obtain a model whose results are geomorphologically correct. In this study the effects of multicollinearity on the predictive performance and robustness of landslide susceptibility models are analyzed. In particular, multicollinearity is estimated by means of the Variance Inflation Factor (VIF), which is also used as the selection criterion for the independent variables (VIF Stepwise Selection) and compared to the more commonly used AIC Stepwise Selection. The robustness of the results is evaluated through 100 replicates of the dataset. The study area selected to perform this analysis is the Moldavian Plateau, where landslides are among the most frequent geomorphological processes. This area has an increasing trend of urbanization and a very high potential regarding cultural heritage, being the place of discovery of the largest settlement belonging to the Cucuteni Culture of Eastern Europe (which led to the development of the great Cucuteni-Trypillia complex). Therefore, identifying the areas susceptible to

  14. A numerical analysis of an anisotropic phase-field model for binary-fluid mixtures in the presence of magnetic-field

    OpenAIRE

    Belmiloudi, Aziz; Rasheed, Amer

    2015-01-01

    In this paper we propose a numerical scheme and perform its numerical analysis devoted to an anisotropic phase-field model with convection under the influence of magnetic field for the isothermal solidification of binary mixtures in two-dimensional geometry. Precisely, the numerical stability and error analysis of this approximation scheme which is based on mixed finite-element method are performed. The particular application of a nickel-copper (NiCu) binary alloy, with real physical paramete...

  15. A metallic solution model with adjustable parameter for describing ternary thermodynamic properties from its binary constituents

    International Nuclear Information System (INIS)

    Fang Zheng; Qiu Guanzhou

    2007-01-01

    A metallic solution model with adjustable parameter k has been developed to predict thermodynamic properties of ternary systems from those of its constituent three binaries. In the present model, the excess Gibbs free energy for a ternary mixture is expressed as a weighted probability sum of those of binaries and the k value is determined based on an assumption that the ternary interaction generally strengthens the mixing effects for metallic solutions with weak interaction, making the Gibbs free energy of mixing of the ternary system more negative than that before considering the interaction. This point is never considered in the models currently reported, where the only difference in a geometrical definition of molar values of components is considered that do not involve thermodynamic principles but are completely empirical. The current model describes the results of experiments very well, and by adjusting the k value also agrees with those from models used widely in the literature. Three ternary systems, Mg-Cu-Ni, Zn-In-Cd, and Cd-Bi-Pb are recalculated to demonstrate the method of determining k and the precision of the model. The results of the calculations, especially those in Mg-Cu-Ni system, are better than those predicted by the current models in the literature

  16. Two-step variable selection in quantile regression models

    Directory of Open Access Journals (Sweden)

    FAN Yali

    2015-06-01

    We propose a two-step variable selection procedure for high dimensional quantile regressions, in which the dimension of the covariates, p_n, is much larger than the sample size n. In the first step, we apply an ℓ1 penalty, and we demonstrate that the first-step penalized estimator with the LASSO penalty can reduce the model from an ultra-high dimensional one to a model whose size has the same order as that of the true model, and the selected model can cover the true model. The second step excludes the remaining irrelevant covariates by applying the adaptive LASSO penalty to the reduced model obtained from the first step. Under some regularity conditions, we show that our procedure enjoys model selection consistency. We conduct a simulation study and a real data analysis to evaluate the finite sample performance of the proposed approach.
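
    The two steps described above can be written out in the textbook form of LASSO-penalized and adaptive-LASSO-penalized quantile regression. The notation is an assumption made for illustration (quantile level τ, tuning parameters λ_n and μ_n, weight exponent γ); the abstract does not state the paper's exact weights or tuning choices, so this should be read as a sketch of the general technique rather than the authors' formulation.

```latex
% Check loss for quantile level \tau
\rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr)

% Step 1: LASSO-penalized quantile regression over all p_n covariates
\tilde{\beta} = \arg\min_{\beta \in \mathbb{R}^{p_n}}
  \sum_{i=1}^{n} \rho_\tau\bigl(y_i - x_i^{\top}\beta\bigr)
  + \lambda_n \sum_{j=1}^{p_n} \lvert \beta_j \rvert ,
\qquad
\hat{S} = \{\, j : \tilde{\beta}_j \neq 0 \,\}

% Step 2: adaptive LASSO restricted to the covariates selected in step 1
\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^{\lvert \hat{S} \rvert}}
  \sum_{i=1}^{n} \rho_\tau\bigl(y_i - x_{i,\hat{S}}^{\top}\beta\bigr)
  + \mu_n \sum_{j \in \hat{S}} w_j \lvert \beta_j \rvert ,
\qquad
w_j = \lvert \tilde{\beta}_j \rvert^{-\gamma}, \quad \gamma > 0
```

    The weights w_j penalize covariates with small first-step estimates more heavily, which is how the second step can discard irrelevant covariates that survived the first.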

  17. Regression Models for Predicting Force Coefficients of Aerofoils

    Directory of Open Access Journals (Sweden)

    Mohammed ABDUL AKBAR

    2015-09-01

    Renewable sources of energy are attractive and advantageous in a lot of different ways. Among the renewable energy sources, wind energy is the fastest growing type. Among wind energy converters, vertical axis wind turbines (VAWTs) have received renewed interest in the past decade due to some of the advantages they possess over their horizontal axis counterparts. VAWTs have evolved into complex 3-D shapes. A key component in predicting the output of VAWTs through analytical studies is obtaining the values of the lift and drag coefficients, which are a function of the shape of the aerofoil, the 'angle of attack' of the wind and the Reynolds number of the flow. Sandia National Laboratories has carried out extensive experiments on aerofoils for Reynolds numbers in the range of those experienced by VAWTs. The volume of experimental data thus obtained is huge. The current paper discusses three regression analysis models in which lift and drag coefficients can be found using simple formulae without having to deal with the bulk of the data. Drag and lift coefficients were successfully estimated by regression models with R² values as high as 0.98.

  18. The Relationship between Economic Growth and Money Laundering – a Linear Regression Model

    Directory of Open Access Journals (Sweden)

    Daniel Rece

    2009-09-01

    This study provides an overview of the relationship between economic growth and money laundering modeled by a least squares function. The report statistically analyzes data collected from the USA, Russia, Romania and eleven other European countries, rendering a linear regression model. The study illustrates that 23.7% of the total variance in the regressand (level of money laundering) is “explained” by the linear regression model. In our opinion, this model will provide critical auxiliary judgment and decision support for anti-money laundering service systems.

  19. Regression analysis of informative current status data with the additive hazards model.

    Science.gov (United States)

    Zhao, Shishun; Hu, Tao; Ma, Ling; Wang, Peijie; Sun, Jianguo

    2015-04-01

    This paper discusses regression analysis of current status failure time data arising from the additive hazards model in the presence of informative censoring. Many methods have been developed for regression analysis of current status data under various regression models if the censoring is noninformative, and also there exists a large literature on parametric analysis of informative current status data in the context of tumorigenicity experiments. In this paper, a semiparametric maximum likelihood estimation procedure is presented and in the method, the copula model is employed to describe the relationship between the failure time of interest and the censoring time. Furthermore, I-splines are used to approximate the nonparametric functions involved and the asymptotic consistency and normality of the proposed estimators are established. A simulation study is conducted and indicates that the proposed approach works well for practical situations. An illustrative example is also provided.

  20. Poisson regression approach for modeling fatal injury rates amongst Malaysian workers

    International Nuclear Information System (INIS)

    Kamarulzaman Ibrahim; Heng Khai Theng

    2005-01-01

    Many safety studies are based on the analysis carried out on injury surveillance data. The injury surveillance data gathered for the analysis include information on the number of employees at risk of injury in each of several strata, where the strata are defined in terms of a series of important predictor variables. Further insight into the relationship between fatal injury rates and predictor variables may be obtained by the Poisson regression approach. Poisson regression is widely used in analyzing count data. In this study, Poisson regression is used to model the relationship between fatal injury rates and predictor variables, which are year (1995-2002), gender, recording system and industry type. Data for the analysis were obtained from PERKESO and Jabatan Perangkaan Malaysia. It is found that the assumption that the data follow a Poisson distribution has been violated. After correction for the problem of overdispersion, the predictor variables that are found to be significant in the model are gender, system of recording, industry type, and two interaction effects (interaction between recording system and industry type and between year and industry type). Introduction Regression analysis is one of the most popular

  1. Measurement and modeling of osmotic coefficients of binary mixtures (alcohol + 1,3-dimethylpyridinium methylsulfate) at T = 323.15 K

    International Nuclear Information System (INIS)

    Gomez, Elena; Calvar, Noelia; Dominguez, Angeles; Macedo, Eugenia A.

    2011-01-01

    Research highlights: → The osmotic coefficients of binary mixtures (alcohol + ionic liquid) were determined. → The measurements were carried out with a vapor pressure osmometer at 323.15 K. → The Pitzer-Archer, and the MNRTL models were used to correlate the experimental data. → Mean molal activity coefficients and excess Gibbs free energies were calculated. - Abstract: Measurement of osmotic coefficients of binary mixtures containing several primary and secondary alcohols (1-propanol, 2-propanol, 1-butanol, 2-butanol, and 1-pentanol) and the pyridinium-based ionic liquid 1,3-dimethylpyridinium methylsulfate were performed at T = 323.15 K using the vapor pressure osmometry technique, and from experimental data, vapor pressure, and activity coefficients were determined. The extended Pitzer model modified by Archer, and the NRTL model modified by Jaretun and Aly (MNRTL) were used to correlate the experimental osmotic coefficients, obtaining standard deviations lower than 0.017 and 0.054, respectively. From the parameters obtained with the extended Pitzer model modified by Archer, the mean molal activity coefficients and the excess Gibbs free energy for the studied binary mixtures were calculated. The effect of the cation is studied comparing the experimental results with those obtained for the ionic liquid 1,3-dimethylimidazolium methylsulfate.

  2. Accounting for Zero Inflation of Mussel Parasite Counts Using Discrete Regression Models

    Directory of Open Access Journals (Sweden)

    Emel Çankaya

    2017-06-01

    In many ecological applications, absences of species are inevitable due to either detection faults in samples or uninhabitable conditions for their existence, resulting in a high number of zero counts or abundances. The usual practice for modelling such data is regression modelling of log(abundance+1), and it is well known that the resulting model is inadequate for prediction purposes. New discrete models accounting for zero abundances, namely zero-inflated regression (ZIP and ZINB), Hurdle-Poisson (HP) and Hurdle-Negative Binomial (HNB), amongst others, are widely preferred to the classical regression models. Because mussels are one of the economically most important aquatic products of Turkey, the purpose of this study is to examine the performance of these four models in determining the significant biotic and abiotic factors for the occurrence of the Nematopsis legeri parasite, which harms Mediterranean mussels (Mytilus galloprovincialis L.). The data collected from the three coastal regions of Sinop city in Turkey showed that more than 50% of parasite counts are, on average, zero-valued, and model comparisons were based on information criteria. The results showed that the probability of occurrence of this parasite is best formulated by the ZINB or HNB models, and the influential factors in the models were found to correspond with ecological differences between the regions.

  3. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis.

    Science.gov (United States)

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.
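
    The comparison indicators named above (sensitivity, specificity, and the percentage of correct predictions) can be computed from predicted and actual classes as in the following sketch. The ten cases are invented for illustration and do not come from the study; the function name `classification_metrics` is an assumption.

```python
def classification_metrics(actual, predicted):
    """Sensitivity, specificity and accuracy for binary labels coded 0/1."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical outcomes: 1 = unwanted pregnancy, 0 = wanted pregnancy.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = classification_metrics(actual, predicted)
print(metrics)
```

    Sensitivity and specificity depend on the probability cutoff used to turn model output into a class, whereas the area under the ROC curve summarizes performance across all cutoffs.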

  4. Transpiration of glasshouse rose crops: evaluation of regression models

    NARCIS (Netherlands)

    Baas, R.; Rijssel, van E.

    2006-01-01

    Regression models of transpiration (T) based on global radiation inside the greenhouse (G), with or without energy input from heating pipes (Eh) and/or vapor pressure deficit (VPD) were parameterized. Therefore, data on T, G, temperatures from air, canopy and heating pipes, and VPD from both a

  5. Probabilistic seismic history matching using binary images

    Science.gov (United States)

    Davolio, Alessandra; Schiozer, Denis Jose

    2018-02-01

    Currently, the goal of history-matching procedures is not only to provide a model matching any observed data but also to generate multiple matched models to properly handle uncertainties. One such approach is a probabilistic history-matching methodology based on the discrete Latin Hypercube sampling algorithm, proposed in previous works, which was particularly efficient for matching well data (production rates and pressure). 4D seismic (4DS) data have been increasingly included into history-matching procedures. A key issue in seismic history matching (SHM) is to transfer data into a common domain: impedance, amplitude or pressure, and saturation. In any case, seismic inversions and/or modeling are required, which can be time consuming. An alternative to avoid these procedures is using binary images in SHM as they allow the shape, rather than the physical values, of observed anomalies to be matched. This work presents the incorporation of binary images in SHM within the aforementioned probabilistic history matching. The application was performed with real data from a segment of the Norne benchmark case that presents strong 4D anomalies, including softening signals due to pressure build up. The binary images are used to match the pressurized zones observed in time-lapse data. Three history matchings were conducted using: only well data, well and 4DS data, and only 4DS. The methodology is very flexible and successfully utilized the addition of binary images for seismic objective functions. Results proved the good convergence of the method in few iterations for all three cases. The matched models of the first two cases provided the best results, with similar well matching quality. The second case provided models presenting pore pressure changes according to the expected dynamic behavior (pressurized zones) observed on 4DS data. The use of binary images in SHM is relatively new with few examples in the literature. 
This work enriches this discussion by presenting a new
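
    The idea of matching the shape of an anomaly rather than its physical values can be sketched as follows. This is only an illustration: the threshold, the toy attribute maps, and the function names are assumptions, and the abstract does not state which seismic objective function the authors actually used.

```python
from typing import List

Grid = List[List[float]]

def to_binary(grid: Grid, threshold: float) -> List[List[int]]:
    """Mark cells whose value is at or above the threshold as anomaly (1)."""
    return [[1 if v >= threshold else 0 for v in row] for row in grid]

def binary_mismatch(observed: Grid, simulated: Grid, threshold: float) -> float:
    """Fraction of cells where the two binary images disagree (0 = same shape)."""
    obs = to_binary(observed, threshold)
    sim = to_binary(simulated, threshold)
    total = sum(len(row) for row in obs)
    differing = sum(o != s for ro, rs in zip(obs, sim) for o, s in zip(ro, rs))
    return differing / total

# Toy maps (hypothetical values on a 2 x 3 grid).
observed = [[0.1, 0.8, 0.9], [0.0, 0.7, 0.2]]
simulated = [[0.2, 0.6, 0.4], [0.1, 0.9, 0.1]]
mismatch = binary_mismatch(observed, simulated, threshold=0.5)
print(mismatch)
```

    A mismatch of zero means the simulated anomaly has the same outline as the observed one, regardless of how the underlying amplitudes compare.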

  6. MEMPREDIKSI FINANCIAL DISTRESS DENGAN BINARY LOGIT REGRESSION PERUSAHAAN TELEKOMUNIKASI

    Directory of Open Access Journals (Sweden)

    Tiara Widya Antikasari

    2017-04-01

    Full Text Available In this era of globalization, the telecommunication sub-sector has developed rapidly as its customer base has grown. However, this growth has not been matched by growth in operating revenue. Therefore, it is important to analyze financial distress in telecommunication companies in order to avoid bankruptcy. This research aimed to investigate the effect of financial ratios in predicting the probability of financial distress. The financial ratio indicators used were the profitability, liquidity, activity, and leverage ratios. The population in this research was the telecommunication companies listed on the Indonesia Stock Exchange during 2009-2016. The sample was selected by purposive sampling, financial distress was defined as negative net operating income for two years, and the statistical analysis used was logistic regression with a significance level of 10%. The result was that the liquidity ratio (current ratio) and the activity ratio (total asset turnover ratio) had a significant negative effect, and the profitability ratio (return on assets) and the leverage ratio (debt to total assets) had a significant positive effect, in predicting financial distress.
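
    A minimal sketch of how a binary logit model turns financial ratios into a distress probability is given below. Only the signs of the coefficients follow the abstract; the magnitudes, the intercept, and the example firm are invented for illustration and are not estimates from the study.

```python
import math

# Hypothetical coefficients: only the signs follow the abstract
# (negative for liquidity and activity, positive for profitability and leverage).
COEFFICIENTS = {
    "current_ratio": -0.8,
    "total_asset_turnover": -1.2,
    "return_on_assets": 0.5,
    "debt_to_total_assets": 2.0,
}
INTERCEPT = -0.5  # invented value

def distress_probability(ratios: dict) -> float:
    """Binary logit: P(distress) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = INTERCEPT + sum(COEFFICIENTS[name] * ratios[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-linear))

firm = {"current_ratio": 1.5, "total_asset_turnover": 0.6,
        "return_on_assets": 0.02, "debt_to_total_assets": 0.7}
p = distress_probability(firm)

# Raising leverage while holding the other ratios fixed raises the probability.
more_leveraged = dict(firm, debt_to_total_assets=0.9)
p_leveraged = distress_probability(more_leveraged)
print(p, p_leveraged)
```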

  7. Binary Classification Method of Social Network Users

    Directory of Open Access Journals (Sweden)

    I. A. Poryadin

    2017-01-01

    Full Text Available The subject of this research is a method for the binary classification of social network users based on analysis of the data they have posted. The relevance of gaining information about a person by examining the content of his/her social network pages is illustrated by examples. The most common approach to this task is visual browsing. An order of the regional authority in our country illustrates that its use is needed in school education. The article shows the limitations of visual browsing of pupils' pages in social networks as a tool for the teacher and the school psychologist, and justifies automating the analysis of social network users' data. It reviews publications that describe methods for acquiring, processing, and analysing such data and considers their advantages and disadvantages. The article also gives arguments in support of studying a classification method for social network users. One such method is credit scoring, which is used in banks and credit institutions to assess the solvency of clients. Given the high efficiency of the method, a significant expansion of its use to other areas of society is proposed. The use of logistic regression as the mathematical apparatus of the proposed binary classification method is justified. Such an approach makes it possible to take into account the different types of data extracted from social networks, among them personal user data, information about hobbies and friends, graphic and text information, and behaviour characteristics. The article describes a number of existing data transformation methods that can be applied to solve the problem. An experiment on binary gender-based classification of social network users is described. The logistic model obtained for this example includes multiple logical variables obtained by transforming the user surnames. This experiment confirms the feasibility of the proposed method. Further work is to define a system

  8. Goal-oriented error estimation for Cahn-Hilliard models of binary phase transition

    KAUST Repository

    van der Zee, Kristoffer G.

    2010-10-27

    A posteriori estimates of errors in quantities of interest are developed for the nonlinear system of evolution equations embodied in the Cahn-Hilliard model of binary phase transition. These involve the analysis of wellposedness of dual backward-in-time problems and the calculation of residuals. Mixed finite element approximations are developed and used to deliver numerical solutions of representative problems in one- and two-dimensional domains. Estimated errors are shown to be quite accurate in these numerical examples. © 2010 Wiley Periodicals, Inc.

  9. Convergent Time-Varying Regression Models for Data Streams: Tracking Concept Drift by the Recursive Parzen-Based Generalized Regression Neural Networks.

    Science.gov (United States)

    Duda, Piotr; Jaworski, Maciej; Rutkowski, Leszek

    2018-03-01

    One of the greatest challenges in data mining is related to processing and analysis of massive data streams. Contrary to traditional static data mining problems, data streams require that each element is processed only once, that the amount of allocated memory is constant, and that the models incorporate changes in the investigated streams. A vast majority of available methods have been developed for data stream classification and only a few of them attempted to solve regression problems, using various heuristic approaches. In this paper, we develop mathematically justified regression models working in a time-varying environment. More specifically, we study incremental versions of generalized regression neural networks, called IGRNNs, and we prove their tracking properties - weak (in probability) and strong (with probability one) convergence assuming various concept drift scenarios. First, we present the IGRNNs, based on the Parzen kernels, for modeling stationary systems under nonstationary noise. Next, we extend our approach to modeling time-varying systems under nonstationary noise. We present several types of concept drifts to be handled by our approach in such a way that weak and strong convergence holds under certain conditions. In a series of simulations, we compare our method with commonly used heuristic approaches, based on forgetting mechanisms or sliding windows, to deal with concept drift. Finally, we apply our approach in a real-life scenario, solving the problem of currency exchange rate prediction.
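
    The one-pass, constant-memory requirement described above can be illustrated with a recursive Parzen-kernel (Nadaraya-Watson type) estimator evaluated at fixed query points. This is a simplified sketch, not the authors' IGRNN: the Gaussian kernel, the fixed bandwidth, the step sequence a_n = 1/n, and all names are assumptions made for the example.

```python
import math

class RecursiveKernelRegressor:
    """Running kernel regression estimate at fixed query points, one pass over the stream."""

    def __init__(self, query_points, bandwidth):
        self.query_points = list(query_points)
        self.bandwidth = bandwidth
        self.numerator = [0.0] * len(self.query_points)    # running mean of y * K
        self.denominator = [0.0] * len(self.query_points)  # running mean of K
        self.n = 0

    def _kernel(self, u):
        # Gaussian Parzen kernel (assumed choice)
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    def update(self, x, y):
        """Process one stream element; memory use does not grow with the stream."""
        self.n += 1
        step = 1.0 / self.n  # assumed step sequence; suited to a stationary target
        for j, q in enumerate(self.query_points):
            k = self._kernel((q - x) / self.bandwidth) / self.bandwidth
            self.numerator[j] += step * (y * k - self.numerator[j])
            self.denominator[j] += step * (k - self.denominator[j])

    def predict(self):
        return [num / den if den > 0 else float("nan")
                for num, den in zip(self.numerator, self.denominator)]

model = RecursiveKernelRegressor(query_points=[0.0, 0.5, 1.0], bandwidth=0.1)
for i in range(2001):
    x = i / 2000.0
    model.update(x, 2.0 * x + 1.0)  # noiseless toy stream, y = 2x + 1
estimates = model.predict()
print(estimates)
```

    With a_n = 1/n the estimator weights all past elements equally, so it cannot follow a drifting target; tracking concept drift would call for a different step sequence, which is the kind of condition the paper analyses.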

  10. The COBAIN (COntact Binary Atmospheres with INterpolation) Code for Radiative Transfer

    Science.gov (United States)

    Kochoska, Angela; Prša, Andrej; Horvat, Martin

    2018-01-01

    Standard binary star modeling codes make use of pre-existing solutions of the radiative transfer equation in stellar atmospheres. The various model atmospheres available today are consistently computed for single stars, under different assumptions - plane-parallel or spherical atmosphere approximation, local thermodynamical equilibrium (LTE) or non-LTE (NLTE), etc. However, they are nonetheless being applied to contact binary atmospheres by populating the surface corresponding to each component separately and neglecting any mixing that would typically occur at the contact boundary. In addition, single stellar atmosphere models do not take into account irradiance from a companion star, which can pose a serious problem when modeling close binaries. 1D atmosphere models are also solved under the assumption of an atmosphere in hydrodynamical equilibrium, which is not necessarily the case for contact atmospheres, as the potentially different densities and temperatures can give rise to flows that play a key role in the heat and radiation transfer. To resolve the issue of erroneous modeling of contact binary atmospheres using single star atmosphere tables, we have developed a generalized radiative transfer code for computation of the normal emergent intensity of a stellar surface, given its geometry and internal structure. The code uses a regular mesh of equipotential surfaces in a discrete set of spherical coordinates, which are then used to interpolate the values of the structural quantities (density, temperature, opacity) at any given point inside the mesh. The radiative transfer equation is numerically integrated in a set of directions spanning the unit sphere around each point and iterated until the intensity values for all directions and all mesh points converge within a given tolerance.
We have found that this approach, albeit computationally expensive, is the only one that can reproduce the intensity distribution of the non-symmetric contact binary atmosphere and

  11. Predicting Antitumor Activity of Peptides by Consensus of Regression Models Trained on a Small Data Sample

    Directory of Open Access Journals (Sweden)

    Ivanka Jerić

    2011-11-01

    Full Text Available Predicting antitumor activity of compounds using regression models trained on a small number of compounds with measured biological activity is an ill-posed inverse problem. Yet, it occurs very often within the academic community. To counteract, to some extent, the overfitting problems caused by a small training data set, we propose to use a consensus of six regression models for prediction of the biological activity of a virtual library of compounds. The QSAR descriptors of 22 compounds related to the opioid growth factor (OGF, Tyr-Gly-Gly-Phe-Met) with known antitumor activity were used to train the regression models: the feed-forward artificial neural network, the k-nearest neighbor, sparseness constrained linear regression, and the linear and nonlinear (with polynomial and Gaussian kernel) support vector machine. The regression models were applied to a virtual library of 429 compounds, which resulted in six lists of candidate compounds ranked by predicted antitumor activity. The highly ranked candidate compounds were synthesized, characterized and tested for antiproliferative activity. Some of the prepared peptides showed more pronounced activity compared with the native OGF; however, they were less active than highly ranked compounds selected previously by the radial basis function support vector machine (RBF SVM) regression model. The ill-posedness of the related inverse problem causes unstable behavior of trained regression models on test data. These results point to the high complexity of prediction based on regression models trained on a small data sample.

  12. A simulation study on Bayesian Ridge regression models for several collinearity levels

    Science.gov (United States)

    Efendi, Achmad; Effrihan

    2017-12-01

    When analyzing data with a multiple regression model in the presence of collinearity, one or several predictor variables are usually omitted from the model. However, there are sometimes reasons, for instance medical or economic ones, why all the predictors are important and should be included in the model. Ridge regression is commonly used in research to cope with collinearity. In this approach, weights on the predictor variables are used when estimating the parameters. Estimation can then follow the likelihood concept. Nowadays, a Bayesian version of the estimation is an alternative. This estimation method has not matched the likelihood method in popularity because of some difficulties, computation among them. Nevertheless, with recent improvements in computational methodology, this caveat should no longer be a problem. This paper discusses a simulation process for evaluating the characteristics of Bayesian ridge regression parameter estimates. There are several simulation settings based on a variety of collinearity levels and sample sizes. The results show that the Bayesian method performs better for relatively small sample sizes, and that for the other settings it performs similarly to the likelihood method.
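
    For orientation, the standard textbook link between the ridge estimator and a Bayesian formulation is written out below. The notation (design matrix X, response y, penalty lambda, error variance sigma^2, prior variance tau^2) is assumed here and is not taken from the paper, whose exact prior specification the abstract does not give.

```latex
% Ridge estimator with penalty \lambda \ge 0
\hat{\beta}_{\text{ridge}} = \left(X^{\top}X + \lambda I\right)^{-1} X^{\top} y

% Bayesian view: Gaussian likelihood and independent Gaussian prior
y \mid \beta \sim \mathcal{N}\!\left(X\beta,\ \sigma^{2} I\right), \qquad
\beta \sim \mathcal{N}\!\left(0,\ \tau^{2} I\right)

% Posterior mean coincides with the ridge estimator for \lambda = \sigma^{2}/\tau^{2}
\mathbb{E}\left[\beta \mid y\right]
  = \left(X^{\top}X + \frac{\sigma^{2}}{\tau^{2}}\, I\right)^{-1} X^{\top} y
```

    Under collinearity X^T X is close to singular, and adding the positive term on the diagonal is what stabilizes the estimate.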

  13. Photovoltaic Array Condition Monitoring Based on Online Regression of Performance Model

    DEFF Research Database (Denmark)

    Spataru, Sergiu; Sera, Dezso; Kerekes, Tamas

    2013-01-01

    regression modeling, from PV array production, plane-of-array irradiance, and module temperature measurements, acquired during an initial learning phase of the system. After the model has been parameterized automatically, the condition monitoring system enters the normal operation phase, where...

  14. Extended Cox regression model: The choice of time function

    Science.gov (United States)

    Isik, Hatice; Tutkun, Nihal Ata; Karasoy, Durdu

    2017-07-01

    The Cox regression model (CRM), which takes into account the effect of censored observations, is one of the most applicable and widely used models in survival analysis for evaluating the effects of covariates. Proportional hazards (PH), which requires a constant hazard ratio over time, is the assumption of the CRM. The extended CRM provides a test of the PH assumption by including a time-dependent covariate, or an alternative model in the case of nonproportional hazards. In this study, different types of real data sets are used to choose the time function, and the differences between time functions are analyzed and discussed.
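
    The role of the time function can be made explicit with the usual single-covariate form of the extended model. The symbols below (baseline hazard h_0, coefficients beta and gamma, time function g) are assumed notation, and the candidate choices of g listed are common textbook options rather than the specific set compared in the study.

```latex
% Standard Cox model: the hazard ratio for a unit change in x is e^{\beta}, constant in t
h(t \mid x) = h_{0}(t)\, \exp(\beta x)

% Extended Cox model: product of the covariate with a time function g(t)
h(t \mid x) = h_{0}(t)\, \exp\bigl(\beta x + \gamma\, x\, g(t)\bigr)

% Hazard ratio for a unit change in x now depends on t
\mathrm{HR}(t) = \exp\bigl(\beta + \gamma\, g(t)\bigr)

% PH holds when \gamma = 0; common choices of g include
g(t) = t, \qquad g(t) = \log t, \qquad g(t) = \mathbf{1}\{t \ge t_{0}\}
```

    Testing gamma = 0 is then a test of the PH assumption, and a nonzero gamma gives the alternative model for nonproportional hazards.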

  15. A magnetic model for low/hard state of black hole binaries

    Science.gov (United States)

    Ye, Yong-Chun; Wang, Ding-Xiong; Huang, Chang-Yin; Cao, Xiao-Feng

    2016-03-01

    A magnetic model for the low/hard state (LHS) of two black hole X-ray binaries (BHXBs), H1743-322 and GX 339-4, is proposed based on transport of the magnetic field from a companion into an accretion disk around a black hole (BH). This model consists of a truncated thin disk with an inner advection-dominated accretion flow (ADAF). The spectral profiles of the sources are fitted in agreement with the data observed at four different dates corresponding to the rising phase of the LHS. In addition, the association of the LHS with a quasi-steady jet is modeled based on transport of magnetic field, where the Blandford-Znajek (BZ) and Blandford-Payne (BP) processes are invoked to drive the jets from BH and inner ADAF. It turns out that the steep radio/X-ray correlations observed in H1743-322 and GX 339-4 can be interpreted based on our model.

  16. INVESTIGATION OF E-MAIL TRAFFIC BY USING ZERO-INFLATED REGRESSION MODELS

    Directory of Open Access Journals (Sweden)

    Yılmaz KAYA

    2012-06-01

    Full Text Available Count data may contain more zero values than anticipated. Such data sets should be analyzed with regression methods that take the zero values into account. Zero-Inflated Poisson (ZIP), Zero-Inflated negative binomial (ZINB), Poisson Hurdle (PH), and negative binomial Hurdle (NBH) are the more common approaches for modeling dependent variables with more zeros than expected. In the present study, the e-mail traffic of Yüzüncü Yıl University in the 2009 spring semester was investigated. The ZIP, ZINB, PH, and NBH regression methods were applied to the data set because it contained more zero counts (78.9%) than expected. ZINB and NBH regression, which account for both excess zeros and overdispersion, were found to give more accurate results because both were present in the e-mail sending data. ZINB was determined to be the best model according to Vuong statistics and information criteria.
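
    To show how a zero-inflated model produces more zeros than a plain Poisson model, the sketch below evaluates the zero-inflated Poisson probability mass function. The parameter values are hypothetical and are not estimates from the e-mail data.

```python
import math

def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson: a structural zero with probability pi, else Poisson(lam)."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

lam, pi = 2.0, 0.7  # hypothetical values chosen for illustration
p_zero = zip_pmf(0, lam, pi)
plain_poisson_zero = math.exp(-lam)
total = sum(zip_pmf(k, lam, pi) for k in range(60))
print(p_zero, plain_poisson_zero, total)
```

    With these values the zero-inflated model puts far more mass on zero than the Poisson model with the same rate, which is the situation the abstract describes.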

  17. Study on the control mechanism of China aerospace enterprises' binary multinational operation

    Institute of Scientific and Technical Information of China (English)

    Wang Jian; Li Hanling; Wu Weiwei

    2008-01-01

    China's aerospace enterprises carry out multinational operations and actively participate in international competition and in the international division of labor and cooperation. This article first analyzes the control objectives of China aerospace enterprises' binary multinational business and constructs a model of them. The article then analyzes the tangible and intangible control mechanisms of China aerospace enterprises' binary multinational operation respectively. Finally, the article constructs a model of the mechanisms of China aerospace enterprises' binary multinational operation.

  18. Focused information criterion and model averaging based on weighted composite quantile regression

    KAUST Repository

    Xu, Ganggang; Wang, Suojin; Huang, Jianhua Z.

    2013-01-01

    We study the focused information criterion and frequentist model averaging and their application to post-model-selection inference for weighted composite quantile regression (WCQR) in the context of the additive partial linear models. With the non

  19. Dynamical evolution of a fictitious population of binary Neptune Trojans

    Science.gov (United States)

    Brunini, Adrián

    2018-03-01

    We present numerical simulations of the evolution of a synthetic population of Binary Neptune Trojans, under the influence of the solar perturbations and tidal friction (the so-called Kozai cycles and tidal friction evolution). Our model includes the dynamical influence of the four giant planets on the heliocentric orbit of the binary centre of mass. In this paper, we explore the evolution of initially tight binaries around the Neptune L4 Lagrange point. We found that the variation of the heliocentric orbital elements due to the libration around the Lagrange point introduces significant changes in the orbital evolution of the binaries. Collisional processes would not play a significant role in the dynamical evolution of Neptune Trojans. After 4.5 × 10^9 yr of evolution, ~50 per cent of the synthetic systems end up separated as single objects, most of them with slow diurnal rotation rate. The final orbital distribution of the surviving binary systems is statistically similar to the one found for Kuiper Belt Binaries when collisional evolution is not included in the model. Systems composed of a primary and a small satellite are more fragile than the ones composed of components of similar sizes.

  20. Multiclass Prediction with Partial Least Square Regression for Gene Expression Data: Applications in Breast Cancer Intrinsic Taxonomy

    Directory of Open Access Journals (Sweden)

    Chi-Cheng Huang

    2013-01-01

    Full Text Available Multiclass prediction remains an obstacle for high-throughput data analysis such as microarray gene expression profiles. Despite recent advancements in machine learning and bioinformatics, most classification tools were limited to applications with binary responses. Our aim was to apply partial least square (PLS) regression for breast cancer intrinsic taxonomy, of which five distinct molecular subtypes were identified. The PAM50 signature genes were used as predictive variables in PLS analysis, and the latent gene component scores were used in binary logistic regression for each molecular subtype. The 139 prototypical arrays for PAM50 development were used as the training dataset, and three independent microarray studies of Han Chinese origin were used for independent validation (n=535). The agreement between PAM50 centroid-based single sample prediction (SSP) and PLS-regression was excellent (weighted Kappa: 0.988) within the training samples, but deteriorated substantially in independent samples, which could be attributed to many more samples being left unclassified by PLS-regression. If these unclassified samples were removed, the agreement between PAM50 SSP and PLS-regression improved enormously (weighted Kappa: 0.829 as opposed to 0.541 when unclassified samples were analyzed). Our study ascertained the feasibility of PLS-regression in multi-class prediction, and distinct clinical presentations and prognostic discrepancies were observed across breast cancer molecular subtypes.

  1. Binary Stochastic Representations for Large Multi-class Classification

    KAUST Repository

    Gerald, Thomas

    2017-10-23

    Classification with a large number of classes is a key problem in machine learning and corresponds to many real-world applications like tagging of images or textual documents in social networks. Although one-vs-all methods usually reach top performance in this context, they suffer from a high inference complexity, linear w.r.t. the number of categories. Different models based on the notion of binary codes have been proposed to overcome this limitation, achieving a sublinear inference complexity. But they need to decide a priori which binary code to associate with which category before learning, using more or less complex heuristics. We propose a new end-to-end model which aims at simultaneously learning to associate binary codes with categories and learning to map inputs to binary codes. This approach, called Deep Stochastic Neural Codes (DSNC), keeps the sublinear inference complexity but does not need any a priori tuning. Experimental results on different datasets show the effectiveness of the approach w.r.t. baseline methods.

  2. Model-based bootstrapping when correcting for measurement error with application to logistic regression.

    Science.gov (United States)

    Buonaccorsi, John P; Romeo, Giovanni; Thoresen, Magne

    2018-03-01

    When fitting regression models, measurement error in any of the predictors typically leads to biased coefficients and incorrect inferences. A plethora of methods have been proposed to correct for this. Obtaining standard errors and confidence intervals using the corrected estimators can be challenging and, in addition, there is concern about remaining bias in the corrected estimators. The bootstrap, which is one option to address these problems, has received limited attention in this context. It has usually been employed by simply resampling observations, which, while suitable in some situations, is not always formally justified. In addition, the simple bootstrap does not allow for estimating bias in non-linear models, including logistic regression. Model-based bootstrapping, which can potentially estimate bias in addition to being robust to the original sampling or whether the measurement error variance is constant or not, has received limited attention. However, it faces challenges that are not present in handling regression models with no measurement error. This article develops new methods for model-based bootstrapping when correcting for measurement error in logistic regression with replicate measures. The methodology is illustrated using two examples, and a series of simulations are carried out to assess and compare the simple and model-based bootstrap methods, as well as other standard methods. While not always perfect, the model-based approaches offer some distinct improvements over the other methods. © 2017, The International Biometric Society.

  3. Testing the Binary Black Hole Nature of a Compact Binary Coalescence.

    Science.gov (United States)

    Krishnendu, N V; Arun, K G; Mishra, Chandra Kant

    2017-09-01

    We propose a novel method to test the binary black hole nature of compact binaries detectable by gravitational wave (GW) interferometers and, hence, constrain the parameter space of other exotic compact objects. The spirit of the test lies in the "no-hair" conjecture for black holes where all properties of a Kerr black hole are characterized by its mass and spin. The method relies on observationally measuring the quadrupole moments of the compact binary constituents induced due to their spins. If the compact object is a Kerr black hole (BH), its quadrupole moment is expressible solely in terms of its mass and spin. Otherwise, the quadrupole moment can depend on additional parameters (such as the equation of state of the object). The higher order spin effects in phase and amplitude of a gravitational waveform, which explicitly contains the spin-induced quadrupole moments of compact objects, hence, uniquely encode the nature of the compact binary. Thus, we argue that an independent measurement of the spin-induced quadrupole moment of the compact binaries from GW observations can provide a unique way to distinguish binary BH systems from binaries consisting of exotic compact objects.

  4. Ordinal regression models to describe tourist satisfaction with Sintra's world heritage

    Science.gov (United States)

    Mouriño, Helena

    2013-10-01

    In Tourism Research, ordinal regression models are becoming a very powerful tool in modelling the relationship between an ordinal response variable and a set of explanatory variables. In August and September 2010, we conducted a pioneering Tourist Survey in Sintra, Portugal. The data were obtained by face-to-face interviews at the entrances of the Palaces and Parks of Sintra. The work developed in this paper focuses on two main points: tourists' perception of the entrance fees, and their overall level of satisfaction with this heritage site. To attain these goals, ordinal regression models were developed. We concluded that tourists' nationality was the only significant variable to describe the perception of the admission fees. Also, Sintra's image among tourists depends not only on their nationality, but also on previous knowledge about Sintra's World Heritage status.

  5. Combination of supervised and semi-supervised regression models for improved unbiased estimation

    DEFF Research Database (Denmark)

    Arenas-García, Jeronimo; Moriana-Varo, Carlos; Larsen, Jan

    2010-01-01

    In this paper we investigate the steady-state performance of semisupervised regression models adjusted using a modified RLS-like algorithm, identifying the situations where the new algorithm is expected to outperform standard RLS. By using an adaptive combination of the supervised and semisupervi...

  6. Binary Masking & Speech Intelligibility

    DEFF Research Database (Denmark)

    Boldt, Jesper

    The purpose of this thesis is to examine how binary masking can be used to increase intelligibility in situations where hearing impaired listeners have difficulties understanding what is being said. The major part of the experiments carried out in this thesis can be categorized as either experiments under ideal conditions or as experiments under more realistic conditions useful for real-life applications such as hearing aids. In the experiments under ideal conditions, the previously defined ideal binary mask is evaluated using hearing impaired listeners, and a novel binary mask -- the target binary mask -- is introduced. The target binary mask shows the same substantial increase in intelligibility as the ideal binary mask and is proposed as a new reference for binary masking. In the category of real-life applications, two new methods are proposed: a method for estimation of the ideal binary...

  7. Random regression models for daily feed intake in Danish Duroc pigs

    DEFF Research Database (Denmark)

    Strathe, Anders Bjerring; Mark, Thomas; Jensen, Just

    The objective of this study was to develop random regression models and estimate covariance functions for daily feed intake (DFI) in Danish Duroc pigs. A total of 476201 DFI records were available on 6542 Duroc boars between 70 and 160 days of age. The data originated from the National test station ... -year-season, permanent, and animal genetic effects. The functional form was based on Legendre polynomials. A total of 64 models for random regressions were initially ranked by BIC to identify the approximate order for the Legendre polynomials using AI-REML. The parsimonious model included Legendre polynomials of 2nd order for genetic and permanent environmental curves and a heterogeneous residual variance, allowing the daily residual variance to change along the age trajectory due to scale effects. The parameters of the model were estimated in a Bayesian framework, using the RJMC module of the DMU package, where...

  8. Longitudinal beta regression models for analyzing health-related quality of life scores over time

    Directory of Open Access Journals (Sweden)

    Hunger Matthias

    2012-09-01

    Full Text Available Abstract Background: Health-related quality of life (HRQL) has become an increasingly important outcome parameter in clinical trials and epidemiological research. HRQL scores are typically bounded at both ends of the scale and often highly skewed. Several regression techniques have been proposed to model such data in cross-sectional studies; however, methods applicable in longitudinal research are less well researched. This study examined the use of beta regression models for analyzing longitudinal HRQL data using two empirical examples with distributional features typically encountered in practice. Methods: We used SF-6D utility data from a German older age cohort study and stroke-specific HRQL data from a randomized controlled trial. We described the conceptual differences between mixed and marginal beta regression models and compared both models to the commonly used linear mixed model in terms of overall fit and predictive accuracy. Results: At any measurement time, the beta distribution fitted the SF-6D utility data and stroke-specific HRQL data better than the normal distribution. The mixed beta model showed better likelihood-based fit statistics than the linear mixed model and respected the boundedness of the outcome variable. However, it tended to underestimate the true mean at the upper part of the distribution. Adjusted group means from the marginal beta model and the linear mixed model were nearly identical, but differences could be observed with respect to standard errors. Conclusions: Understanding the conceptual differences between mixed and marginal beta regression models is important for their proper use in the analysis of longitudinal HRQL data. Beta regression fits the typical distribution of HRQL data better than linear mixed models; however, if the focus is on estimating group mean scores rather than making individual predictions, the two methods might not differ substantially.

  9. Interacting binary stars

    CERN Document Server

    Sahade, Jorge; Ter Haar, D

    1978-01-01

    Interacting Binary Stars deals with the development, ideas, and problems in the study of interacting binary stars. The book consolidates the information that is scattered over many publications and papers and gives an account of important discoveries with relevant historical background. Chapters are devoted to the presentation and discussion of the different facets of the field, such as historical account of the development in the field of study of binary stars; the Roche equipotential surfaces; methods and techniques in space astronomy; and enumeration of binary star systems that are studied

  10. A Gompertz regression model for fern spores germination

    Directory of Open Access Journals (Sweden)

    Gabriel y Galán, Jose María

    2015-06-01

    Full Text Available Germination is one of the most important biological processes for both seed and spore plants, and also for fungi. At present, mathematical models of germination have been developed for fungi, bryophytes and several plant species. However, ferns are the only group whose germination has never been modelled. In this work we develop a regression model of the germination of fern spores. We have found that for the species Blechnum serrulatum, Blechnum yungense, Cheilanthes pilosa, Niphidium macbridei and Polypodium feuillei the Gompertz growth model satisfactorily describes cumulative germination. An important result is that the regression parameters are independent of fern species and the model is not affected by intraspecific variation. Our results show that the Gompertz curve represents a general germination model for all the non-green spore leptosporangiate ferns; the paper also discusses the physiological and ecological meaning of the model.
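
    The shape of a Gompertz curve for cumulative germination can be sketched as below. The three-parameter form used here is one common parameterization and may differ from the one fitted in the paper; the parameter values are hypothetical and are not the published estimates.

```python
import math

def gompertz(t: float, a: float, b: float, c: float) -> float:
    """Gompertz curve a * exp(-b * exp(-c * t)); a is the upper asymptote."""
    return a * math.exp(-b * math.exp(-c * t))

# Hypothetical parameters: final germination of 90 %, b and c chosen for illustration.
a, b, c = 90.0, 8.0, 0.4
days = list(range(0, 31, 5))
curve = [gompertz(day, a, b, c) for day in days]

# The curve rises fastest at t = ln(b) / c, where it has reached a / e.
inflection_day = math.log(b) / c
print(list(zip(days, curve)), inflection_day)
```

    The asymmetric S-shape, with a slow start, a rapid rise, and a long approach to the asymptote, is what makes the curve a candidate for cumulative germination data.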

  11. Generic global regression models for growth prediction of Salmonella in ground pork and pork cuts

    DEFF Research Database (Denmark)

    Buschhardt, Tasja; Hansen, Tina Beck; Bahl, Martin Iain

    2017-01-01

    Introduction and Objectives: Models for the prediction of bacterial growth in fresh pork are primarily developed using two-step regression (i.e. primary models followed by secondary models). These models are also generally based on experiments in liquids or ground meat and neglect surface growth... It has been shown that one-step global regressions can result in more accurate models and that bacterial growth on intact surfaces can substantially differ from growth in liquid culture. Material and Methods: We used a global-regression approach to develop predictive models for the growth of Salmonella... One part of the obtained log-transformed cell counts was used for model development and another for model validation. The Ratkowsky square root model and the relative lag time (RLT) model were integrated into the logistic model with delay. Fitted parameter estimates were compared to investigate the effect...
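    The Ratkowsky square root model mentioned above is a secondary model that relates growth rate to temperature. A minimal sketch, assuming its suboptimal-temperature form sqrt(mu_max) = b·(T - T_min), is shown below. The coefficients are hypothetical, and the sketch does not reproduce the one-step global fit described in the record.

```python
def ratkowsky_growth_rate(temp_c, b, t_min):
    """Maximum specific growth rate from sqrt(mu_max) = b * (T - T_min).

    Returns 0 at or below the notional minimum temperature T_min.
    """
    if temp_c <= t_min:
        return 0.0
    return (b * (temp_c - t_min)) ** 2


# Hypothetical coefficients, for illustration only.
B_COEFF, T_MIN = 0.03, 5.0

# Growth rate at a few storage temperatures (degrees Celsius).
rates = {temp: ratkowsky_growth_rate(temp, B_COEFF, T_MIN) for temp in (4, 10, 20, 30)}
```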

  12. Symbiotic stars - a binary model with super-critical accretion

    Energy Technology Data Exchange (ETDEWEB)

    Bath, G T [National Radio Astronomy Observatory, Charlottesville, Va. (USA)

    1977-01-01

    The structure of symbiotic variables is discussed in terms of a binary model. Disc accretion by a main sequence star or white dwarf at rates close to the Eddington limit produces an ultraviolet continuum source near the accreting star surface. This generates a variable, radiatively-driven, out-flowing wind. The wind is optically thick and the disc luminosity is absorbed and scattered and thus degraded into the optical region. Variations in the rate of mass loss in the wind lead to optical eruptions through shifts in the position of, and conditions in, the last scattering surface. The behaviour of Z And determined by Boyarchuk is shown to be in agreement with such a model. The conditions in the out-flowing wind are discussed. Limits on the mass loss rate are derived from conditions at the surface of the accreting star. It is suggested that variable out-flow in the wind is generated by fluctuations in disc luminosity produced by changes in the giant companion's rate of mass transfer. The relation between symbiotic variables and classical and dwarf novae is discussed.

  13. Multifrequency Behaviour of the Gamma-Ray Binary System PSR B1259-63: Modelling the FERMI Flare

    Directory of Open Access Journals (Sweden)

    Brian van Soelen

    2014-12-01

    This paper presents a brief overview of the multifrequency properties of the gamma-ray binary system PSR B1259-63 from radio to very high energy gamma-rays. A summary is also presented of the various models put forward to explain the Fermi "flare" detected in 2011. Initial results are presented of a new turbulence driven model to explain the GeV observations.

  14. Application of multilinear regression analysis in modeling of soil ...

    African Journals Online (AJOL)

    The application of Multi-Linear Regression Analysis (MLRA) model for predicting soil properties in Calabar South offers a technical guide and solution in foundation designs problems in the area. Forty-five soil samples were collected from fifteen different boreholes at a different depth and 270 tests were carried out for CBR, ...

  15. EMD-regression for modelling multi-scale relationships, and application to weather-related cardiovascular mortality

    Science.gov (United States)

    Masselot, Pierre; Chebana, Fateh; Bélanger, Diane; St-Hilaire, André; Abdous, Belkacem; Gosselin, Pierre; Ouarda, Taha B. M. J.

    2018-01-01

    In a number of environmental studies, relationships between natural processes are often assessed through regression analyses, using time series data. Such data are often multi-scale and non-stationary, leading to a poor accuracy of the resulting regression models and therefore to results with moderate reliability. To deal with this issue, the present paper introduces the EMD-regression methodology consisting in applying the empirical mode decomposition (EMD) algorithm on data series and then using the resulting components in regression models. The proposed methodology presents a number of advantages. First, it accounts of the issues of non-stationarity associated to the data series. Second, this approach acts as a scan for the relationship between a response variable and the predictors at different time scales, providing new insights about this relationship. To illustrate the proposed methodology it is applied to study the relationship between weather and cardiovascular mortality in Montreal, Canada. The results shed new knowledge concerning the studied relationship. For instance, they show that the humidity can cause excess mortality at the monthly time scale, which is a scale not visible in classical models. A comparison is also conducted with state of the art methods which are the generalized additive models and distributed lag models, both widely used in weather-related health studies. The comparison shows that EMD-regression achieves better prediction performances and provides more details than classical models concerning the relationship.

  16. Coupled binary embedding for large-scale image retrieval.

    Science.gov (United States)

    Zheng, Liang; Wang, Shengjin; Tian, Qi

    2014-08-01

    Visual matching is a crucial step in image retrieval based on the bag-of-words (BoW) model. In the baseline method, two keypoints are considered as a matching pair if their SIFT descriptors are quantized to the same visual word. However, the SIFT visual word has two limitations. First, it loses most of its discriminative power during quantization. Second, SIFT only describes the local texture feature. Both drawbacks impair the discriminative power of the BoW model and lead to false positive matches. To tackle this problem, this paper proposes to embed multiple binary features at the indexing level. To model correlation between features, a multi-IDF scheme is introduced, through which different binary features are coupled into the inverted file. We show that matching verification methods based on binary features, such as Hamming embedding, can be effectively incorporated in our framework. As an extension, we explore the fusion of a binary color feature into image retrieval. The joint integration of the SIFT visual word and binary features greatly enhances the precision of visual matching, reducing the impact of false positive matches. Our method is evaluated through extensive experiments on four benchmark datasets (Ukbench, Holidays, DupImage, and MIR Flickr 1M). We show that our method significantly improves the baseline approach. In addition, large-scale experiments indicate that the proposed method requires acceptable memory usage and query time compared with other approaches. Further, when a global color feature is integrated, our method yields competitive performance with the state of the art.
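    The matching verification described above can be illustrated with a small sketch. It assumes each keypoint carries a visual word ID and a binary signature stored as an integer, and accepts a pair only when the words agree and the Hamming distance between signatures is within a threshold. The function names, the threshold and the signatures are hypothetical. The multi-IDF coupling from the paper is not reproduced.

```python
def hamming_distance(sig_a: int, sig_b: int) -> int:
    """Number of bit positions in which two binary signatures differ."""
    return bin(sig_a ^ sig_b).count("1")


def is_match(word_a: int, sig_a: int, word_b: int, sig_b: int, threshold: int = 2) -> bool:
    """Baseline BoW match (same visual word) refined by a binary-signature check."""
    return word_a == word_b and hamming_distance(sig_a, sig_b) <= threshold


# Two keypoints quantized to the same visual word (7) with 4-bit signatures.
same_word_close = is_match(7, 0b1011, 7, 0b0010)      # signatures differ in 2 bits
same_word_strict = is_match(7, 0b1011, 7, 0b0010, 1)  # rejected with a tighter threshold
different_word = is_match(7, 0b1011, 8, 0b1011)       # rejected by the baseline test
```

    The extra signature test is what removes false positives that the visual-word test alone would accept.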

  17. Discovery and characterization of 3000+ main-sequence binaries from APOGEE spectra

    Science.gov (United States)

    El-Badry, Kareem; Ting, Yuan-Sen; Rix, Hans-Walter; Quataert, Eliot; Weisz, Daniel R.; Cargile, Phillip; Conroy, Charlie; Hogg, David W.; Bergemann, Maria; Liu, Chao

    2018-05-01

    We develop a data-driven spectral model for identifying and characterizing spatially unresolved multiple-star systems and apply it to APOGEE DR13 spectra of main-sequence stars. Binaries and triples are identified as targets whose spectra can be significantly better fit by a superposition of two or three model spectra, drawn from the same isochrone, than any single-star model. From an initial sample of ~20,000 main-sequence targets, we identify ~2500 binaries in which both the primary and secondary stars contribute detectably to the spectrum, simultaneously fitting for the velocities and stellar parameters of both components. We additionally identify and fit ~200 triple systems, as well as ~700 velocity-variable systems in which the secondary does not contribute detectably to the spectrum. Our model simplifies the process of simultaneously fitting single- or multi-epoch spectra with composite models and does not depend on a velocity offset between the two components of a binary, making it sensitive to traditionally undetectable systems with periods of hundreds or thousands of years. In agreement with conventional expectations, almost all the spectrally identified binaries with measured parallaxes fall above the main sequence in the colour-magnitude diagram. We find excellent agreement between spectrally and dynamically inferred mass ratios for the ~600 binaries in which a dynamical mass ratio can be measured from multi-epoch radial velocities. We obtain full orbital solutions for 64 systems, including 14 close binaries within hierarchical triples. We make available catalogues of stellar parameters, abundances, mass ratios, and orbital parameters.

  18. The Application of Classical and Neural Regression Models for the Valuation of Residential Real Estate

    Directory of Open Access Journals (Sweden)

    Mach Łukasz

    2017-06-01

    This article presents a research process aimed at building regression models that help to value residential real estate. Two widely used computational tools, classical multiple regression and artificial neural network regression models, were used to build the models. The aim of the research is to assess the practical usefulness of these tools and to compare them. The data used in the analyses refer to the secondary transactional residential real estate market.

  19. A review of a priori regression models for warfarin maintenance dose prediction.

    Directory of Open Access Journals (Sweden)

    Ben Francis

    A number of a priori warfarin dosing algorithms, derived using linear regression methods, have been proposed. Although these dosing algorithms may have been validated using patients derived from the same centre, rarely have they been validated using a patient cohort recruited from another centre. In order to undertake external validation, two cohorts were utilised. One cohort was formed by patients from a prospective trial and the second by patients in the control arm of the EU-PACT trial. Of these, 641 patients were identified as having attained stable dosing and formed the dataset used for validation. Predicted maintenance doses from six criterion-fulfilling regression models were then compared to individual patient stable warfarin dose. Predictive ability was assessed with reference to several statistics including the R-square and mean absolute error. The six regression models explained different amounts of variability in the stable maintenance warfarin dose requirements of the patients in the two validation cohorts; adjusted R-squared values ranged from 24.2% to 68.6%. An overview of the summary statistics demonstrated that no one dosing algorithm could be considered optimal. The larger validation cohort from the prospective trial produced more consistent statistics across the six dosing algorithms. The study found that all the regression models performed worse in the validation cohort when compared to the derivation cohort. Further, there was little difference between regression models that contained pharmacogenetic coefficients and algorithms containing just non-pharmacogenetic coefficients. The inconsistency of results between the validation cohorts suggests that unaccounted population-specific factors cause variability in dosing algorithm performance. Better methods for dosing that take into account inter- and intra-individual variability, at the initiation and maintenance phases of warfarin treatment, are needed.

  20. A review of a priori regression models for warfarin maintenance dose prediction.

    Science.gov (United States)

    Francis, Ben; Lane, Steven; Pirmohamed, Munir; Jorgensen, Andrea

    2014-01-01

    A number of a priori warfarin dosing algorithms, derived using linear regression methods, have been proposed. Although these dosing algorithms may have been validated using patients derived from the same centre, rarely have they been validated using a patient cohort recruited from another centre. In order to undertake external validation, two cohorts were utilised. One cohort was formed by patients from a prospective trial and the second by patients in the control arm of the EU-PACT trial. Of these, 641 patients were identified as having attained stable dosing and formed the dataset used for validation. Predicted maintenance doses from six criterion-fulfilling regression models were then compared to individual patient stable warfarin dose. Predictive ability was assessed with reference to several statistics including the R-square and mean absolute error. The six regression models explained different amounts of variability in the stable maintenance warfarin dose requirements of the patients in the two validation cohorts; adjusted R-squared values ranged from 24.2% to 68.6%. An overview of the summary statistics demonstrated that no one dosing algorithm could be considered optimal. The larger validation cohort from the prospective trial produced more consistent statistics across the six dosing algorithms. The study found that all the regression models performed worse in the validation cohort when compared to the derivation cohort. Further, there was little difference between regression models that contained pharmacogenetic coefficients and algorithms containing just non-pharmacogenetic coefficients. The inconsistency of results between the validation cohorts suggests that unaccounted population-specific factors cause variability in dosing algorithm performance. Better methods for dosing that take into account inter- and intra-individual variability, at the initiation and maintenance phases of warfarin treatment, are needed.
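    The two headline statistics in the record above, R-squared and mean absolute error, can be computed from observed and predicted doses as in the following sketch. The dose values are invented for illustration and are not from the validation cohorts. The sketch computes the plain R-squared, not the adjusted version quoted in the abstract.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical stable doses (mg/day) and the doses an algorithm predicted.
observed_dose = [2.0, 4.0, 6.0, 8.0]
predicted_dose = [3.0, 3.0, 7.0, 7.0]

r2 = r_squared(observed_dose, predicted_dose)
mae = mean_absolute_error(observed_dose, predicted_dose)
```

    When the predictions come from a model fitted on a different cohort, as in external validation, this R-squared can be negative, which is one reason to report the mean absolute error alongside it.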

  1. Modeling of chemical exergy of agricultural biomass using improved general regression neural network

    International Nuclear Information System (INIS)

    Huang, Y.W.; Chen, M.Q.; Li, Y.; Guo, J.

    2016-01-01

    A comprehensive evaluation of the energy potential contained in agricultural biomass is a vital step for its energy utilization. The chemical exergy of typical agricultural biomass was evaluated based on the second law of thermodynamics. The chemical exergy was significantly influenced by the C and O elements rather than the H element. The standard entropy of the samples was also examined based on their element compositions. Two predictive models of the chemical exergy were developed: a general regression neural network model based upon the element composition, and a linear model based upon the high heat value. An auto-refinement algorithm was first developed to improve the performance of the regression neural network model. The developed general regression neural network model with K-fold cross-validation had a better ability for predicting the chemical exergy than the linear model, with lower predicted errors (±1.5%).
    Highlights:
    • Chemical exergies of agricultural biomass were evaluated based upon fifty samples.
    • Values for the standard entropy of agricultural biomass samples were calculated.
    • A linear relationship between chemical exergy and HHV of samples was detected.
    • An improved GRNN prediction model for the chemical exergy of biomass was developed.
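    A general regression neural network (GRNN) in its basic form predicts a Gaussian-kernel-weighted average of the training targets. The sketch below shows only that basic form. The auto-refinement algorithm and K-fold cross-validation from the record are not reproduced, and the element compositions, target values and smoothing parameter are hypothetical.

```python
import math


def grnn_predict(x, train_x, train_y, sigma):
    """Basic GRNN estimate: kernel-weighted average of the training targets.

    Each training sample is weighted by exp(-d^2 / (2 * sigma^2)), where d is
    its Euclidean distance to the query x. A query far from every sample,
    relative to sigma, can make all weights underflow to zero.
    """
    weights = []
    for xi in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Hypothetical (C, H, O) mass fractions and chemical exergy targets (MJ/kg).
train_x = [(0.45, 0.06, 0.42), (0.48, 0.06, 0.40), (0.50, 0.07, 0.38)]
train_y = [17.0, 18.5, 20.0]

estimate = grnn_predict((0.47, 0.06, 0.41), train_x, train_y, sigma=0.02)
```

    The smoothing parameter sigma is the only quantity to tune: small values reproduce the nearest training target, large values approach the overall mean.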

  2. Quantitative Structure-Relative Volatility Relationship Model for Extractive Distillation of Ethylbenzene/p-Xylene Mixtures: Application to Binary and Ternary Mixtures as Extractive Agents

    Energy Technology Data Exchange (ETDEWEB)

    Kang, Young-Mook; Oh, Kyunghwan; You, Hwan; No, Kyoung Tai [Bioinformatics and Molecular Design Research Center, Seoul (Korea, Republic of); Jeon, Yukwon; Shul, Yong-Gun; Hwang, Sung Bo; Shin, Hyun Kil; Kim, Min Sung; Kim, Namseok; Son, Hyoungjun [Yonsei University, Seoul (Korea, Republic of); Chu, Young Hwan [Sangji University, Wonju (Korea, Republic of); Cho, Kwang-Hwi [Soongsil University, Seoul (Korea, Republic of)

    2016-04-15

    Ethylbenzene (EB) and p-xylene (PX) are important chemicals for the production of industrial materials; accordingly, their efficient separation is desired, even though the difference in their boiling points is very small. This paper describes the efforts toward the identification of high-performance extractive agents for EB and PX separation by distillation. Most high-performance extractive agents contain halogen atoms, which present health hazards and are corrosive to distillation plates. To avoid this disadvantage of extractive agents, we developed a quantitative structure-relative volatility relationship (QSRVR) model for designing safe extractive agents. We have previously developed and reported QSRVR models for single extractive agents. In this study, we introduce extended QSRVR models for binary and ternary extractive agents. The QSRVR models accurately predict the relative volatilities of binary and ternary extractive agents. The service to predict the relative volatility for binary and ternary extractive agents is freely available from the Internet at http://qsrvr.opengsi.org/.

  3. Quantitative Structure-Relative Volatility Relationship Model for Extractive Distillation of Ethylbenzene/p-Xylene Mixtures: Application to Binary and Ternary Mixtures as Extractive Agents

    International Nuclear Information System (INIS)

    Kang, Young-Mook; Oh, Kyunghwan; You, Hwan; No, Kyoung Tai; Jeon, Yukwon; Shul, Yong-Gun; Hwang, Sung Bo; Shin, Hyun Kil; Kim, Min Sung; Kim, Namseok; Son, Hyoungjun; Chu, Young Hwan; Cho, Kwang-Hwi

    2016-01-01

    Ethylbenzene (EB) and p-xylene (PX) are important chemicals for the production of industrial materials; accordingly, their efficient separation is desired, even though the difference in their boiling points is very small. This paper describes the efforts toward the identification of high-performance extractive agents for EB and PX separation by distillation. Most high-performance extractive agents contain halogen atoms, which present health hazards and are corrosive to distillation plates. To avoid this disadvantage of extractive agents, we developed a quantitative structure-relative volatility relationship (QSRVR) model for designing safe extractive agents. We have previously developed and reported QSRVR models for single extractive agents. In this study, we introduce extended QSRVR models for binary and ternary extractive agents. The QSRVR models accurately predict the relative volatilities of binary and ternary extractive agents. The service to predict the relative volatility for binary and ternary extractive agents is freely available from the Internet at http://qsrvr.opengsi.org/.

  4. New robust statistical procedures for the polytomous logistic regression models.

    Science.gov (United States)

    Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro

    2018-05-17

    This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article are further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.

  5. Electricity demand loads modeling using AutoRegressive Moving Average (ARMA) models

    Energy Technology Data Exchange (ETDEWEB)

    Pappas, S.S. [Department of Information and Communication Systems Engineering, University of the Aegean, Karlovassi, 83 200 Samos (Greece); Ekonomou, L.; Chatzarakis, G.E. [Department of Electrical Engineering Educators, ASPETE - School of Pedagogical and Technological Education, N. Heraklion, 141 21 Athens (Greece); Karamousantas, D.C. [Technological Educational Institute of Kalamata, Antikalamos, 24100 Kalamata (Greece); Katsikas, S.K. [Department of Technology Education and Digital Systems, University of Piraeus, 150 Androutsou Srt., 18 532 Piraeus (Greece); Liatsis, P. [Division of Electrical Electronic and Information Engineering, School of Engineering and Mathematical Sciences, Information and Biomedical Engineering Centre, City University, Northampton Square, London EC1V 0HB (United Kingdom)

    2008-09-15

    This study addresses the problem of modeling the electricity demand loads in Greece. The provided actual load data is deseasonalized and an AutoRegressive Moving Average (ARMA) model is fitted on the data off-line, using the Akaike Corrected Information Criterion (AICC). The developed model fits the data successfully. Difficulties occur when the provided data includes noise or errors and also when on-line/adaptive modeling is required. In both cases, and under the assumption that the provided data can be represented by an ARMA model, simultaneous order and parameter estimation of ARMA models in the presence of noise is performed. The results indicate that the proposed method, which is based on the multi-model partitioning theory, successfully tackles the studied problem. For validation purposes the results are compared with three other established order selection criteria, namely AICC, Akaike's Information Criterion (AIC) and Schwarz's Bayesian Information Criterion (BIC). The developed model could be useful in studies that concern electricity consumption and electricity price forecasts.
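    The three order selection criteria named above share one structure: a goodness-of-fit term plus a penalty on the number of parameters. The sketch below uses the standard definitions (AIC = -2 ln L + 2k, AICC = AIC + 2k(k+1)/(n-k-1), BIC = -2 ln L + k ln n) and picks the candidate ARMA(p, q) order with the smallest AICC. The log-likelihood values are hypothetical, and counting k as p + q + 1 is an assumption of this sketch. The multi-model partitioning method from the record is not shown.

```python
import math


def aic(log_lik, k):
    """Akaike's Information Criterion."""
    return -2.0 * log_lik + 2.0 * k


def aicc(log_lik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    return aic(log_lik, k) + 2.0 * k * (k + 1) / (n - k - 1)


def bic(log_lik, k, n):
    """Schwarz's Bayesian Information Criterion."""
    return -2.0 * log_lik + k * math.log(n)


def n_params(order):
    """AR and MA coefficients plus the noise variance (an assumed convention)."""
    p, q = order
    return p + q + 1


# Hypothetical maximised log-likelihoods for candidate ARMA(p, q) orders.
candidates = {(1, 0): -112.4, (1, 1): -104.9, (2, 1): -104.1, (2, 2): -103.8}
n_obs = 50

best_by_aicc = min(candidates, key=lambda o: aicc(candidates[o], n_params(o), n_obs))
```

    With these made-up values the higher orders improve the likelihood only slightly, so the penalty term decides the outcome.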

  6. Full Ionisation In Binary-Binary Encounters With Small Positive Energies

    Science.gov (United States)

    Sweatman, W. L.

    2006-08-01

    Interactions between binary stars and single stars and binary stars and other binary stars play a key role in the dynamics of a dense stellar system. Energy can be transferred between the internal dynamics of a binary and the larger scale dynamics of the interacting objects. Binaries can be destroyed and created by the interaction. In a binary-binary encounter, full ionisation occurs when both of the binary stars are destroyed in the interaction to create four single stars. This is only possible when the total energy of the system is positive. For very small energies the probability of this occurring is very low and it tends towards zero as the total energy tends towards zero. Here the case is considered for which all the stars have equal masses. An asymptotic power law is predicted relating the probability of full ionisation with the total energy when this latter quantity is small. The exponent, which is approximately 2.31, is compared with the results from numerical scattering experiments. The theoretical approach taken is similar to one used previously in the three-body problem. It makes use of the fact that the most dramatic changes in scale and energies of a few-body system occur when its components pass near to a central configuration. The position, and number, of these configurations is not known for the general four-body problem, however, with equal masses there are known to be exactly five different cases. Separate consideration and comparison of the properties of orbits close to each of these five central configurations enables the prediction of the form of the cross-section for full ionisation for the case of small positive total energy. This is the relation between total energy and the probability of total ionisation described above.

  7. CALCULATING THE HABITABLE ZONE OF BINARY STAR SYSTEMS. I. S-TYPE BINARIES

    Energy Technology Data Exchange (ETDEWEB)

    Kaltenegger, Lisa [MPIA, Koenigstuhl 17, D-69117 Heidelberg (Germany); Haghighipour, Nader, E-mail: kaltenegger@mpia.de [Institute for Astronomy and NASA Astrobiology Institute, University of Hawaii-Manoa, Honolulu, HI 96822 (United States)

    2013-11-10

    We have developed a comprehensive methodology for calculating the boundaries of the habitable zone (HZ) of planet-hosting S-type binary star systems. Our approach is general and takes into account the contribution of both stars to the location and extent of the binary HZ with different stellar spectral types. We have studied how the binary eccentricity and stellar energy distribution affect the extent of the HZ. Results indicate that in binaries where the combination of mass-ratio and orbital eccentricity allows planet formation around a star of the system to proceed successfully, the effect of a less luminous secondary on the location of the primary's HZ is generally negligible. However, when the secondary is more luminous, it can influence the extent of the HZ. We present the details of the derivations of our methodology and discuss its application to the binary HZ around the primary and secondary main-sequence stars of an FF, MM, and FM binary, as well as two known planet-hosting binaries α Cen AB and HD 196886.

  8. CALCULATING THE HABITABLE ZONE OF BINARY STAR SYSTEMS. I. S-TYPE BINARIES

    International Nuclear Information System (INIS)

    Kaltenegger, Lisa; Haghighipour, Nader

    2013-01-01

    We have developed a comprehensive methodology for calculating the boundaries of the habitable zone (HZ) of planet-hosting S-type binary star systems. Our approach is general and takes into account the contribution of both stars to the location and extent of the binary HZ with different stellar spectral types. We have studied how the binary eccentricity and stellar energy distribution affect the extent of the HZ. Results indicate that in binaries where the combination of mass-ratio and orbital eccentricity allows planet formation around a star of the system to proceed successfully, the effect of a less luminous secondary on the location of the primary's HZ is generally negligible. However, when the secondary is more luminous, it can influence the extent of the HZ. We present the details of the derivations of our methodology and discuss its application to the binary HZ around the primary and secondary main-sequence stars of an FF, MM, and FM binary, as well as two known planet-hosting binaries α Cen AB and HD 196886

  9. Model many-body Stoner Hamiltonian for binary FeCr alloys

    Science.gov (United States)

    Nguyen-Manh, D.; Dudarev, S. L.

    2009-09-01

    We derive a model tight-binding many-body d-electron Stoner Hamiltonian for FeCr binary alloys and investigate the sensitivity of its mean-field solutions to the choice of hopping integrals and the Stoner exchange parameters. By applying the local charge-neutrality condition within a self-consistent treatment we show that the negative enthalpy-of-mixing anomaly characterizing the alloy in the low chromium concentration limit is due entirely to the presence of the on-site exchange Stoner terms and that the occurrence of this anomaly is not specifically related to the choice of hopping integrals describing conventional chemical bonding between atoms in the alloy. The Bain transformation pathway computed, using the proposed model Hamiltonian, for the Fe15Cr alloy configuration is in excellent agreement with ab initio total-energy calculations. Our investigation also shows how the parameters of a tight-binding many-body model Hamiltonian for a magnetic alloy can be derived from the comparison of its mean-field solutions with other, more accurate, mean-field approximations (e.g., density-functional calculations), hence stimulating the development of large-scale computational algorithms for modeling radiation damage effects in magnetic alloys and steels.

  10. Analysis of the influence of quantile regression model on mainland tourists' service satisfaction performance.

    Science.gov (United States)

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models.
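    Quantile regression, used in the record above, replaces the squared-error loss of least squares with the asymmetric "pinball" loss. The sketch below illustrates only that loss: for a constant prediction, the value minimising the total pinball loss at level tau is a sample quantile. The data and function names are hypothetical and unrelated to the survey in the paper.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(values, prediction, tau):
    """Sum of pinball losses for one constant prediction."""
    return sum(pinball_loss(v - prediction, tau) for v in values)


def best_constant(values, tau):
    """Search the observed values for the constant with the lowest total loss."""
    return min(sorted(values), key=lambda c: total_loss(values, c, tau))


# Hypothetical satisfaction scores with one extreme value.
scores = [1, 2, 3, 4, 100]

median_fit = best_constant(scores, 0.5)    # robust to the outlier, unlike the mean
lower_fit = best_constant(scores, 0.25)
```

    A full quantile regression minimises the same loss over linear functions of the predictors, which is why different quantiles (such as Q0.25) can yield different coefficients.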

  11. Analysis of the Influence of Quantile Regression Model on Mainland Tourists' Service Satisfaction Performance

    Science.gov (United States)

    Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916

  12. Analysis of the Influence of Quantile Regression Model on Mainland Tourists’ Service Satisfaction Performance

    Directory of Open Access Journals (Sweden)

    Wen-Cheng Wang

    2014-01-01

    It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models.

  13. truncSP: An R Package for Estimation of Semi-Parametric Truncated Linear Regression Models

    Directory of Open Access Journals (Sweden)

    Maria Karlsson

    2014-05-01

    Problems with truncated data occur in many areas, complicating estimation and inference. Regarding linear regression models, the ordinary least squares estimator is inconsistent and biased for these types of data and is therefore unsuitable for use. Alternative estimators, designed for the estimation of truncated regression models, have been developed. This paper presents the R package truncSP. The package contains functions for the estimation of semi-parametric truncated linear regression models using three different estimators: the symmetrically trimmed least squares, quadratic mode, and left truncated estimators, all of which have been shown to have good asymptotic and finite sample properties. The package also provides functions for the analysis of the estimated models. Data from the environmental sciences are used to illustrate the functions in the package.
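
    The claim that ordinary least squares is biased under truncation can be checked with a small simulation. The sketch below is an assumption-laden illustration in Python, not the truncSP package (which is written in R) and not any of its estimators: the data-generating values, the cutoff, and the helper `ols_line` are all hypothetical. It generates data from a known line, discards every observation whose response falls at or below a cutoff, and compares the least squares slope before and after truncation.

```python
import random


def ols_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(0)                 # fixed seed so the run is repeatable
TRUE_INTERCEPT, TRUE_SLOPE = 1.0, 2.0  # hypothetical data-generating values
CUTOFF = 1.0                           # pairs with y <= CUTOFF are never recorded

xs = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
ys = [TRUE_INTERCEPT + TRUE_SLOPE * x + rng.gauss(0.0, 1.0) for x in xs]

# Left truncation: keep only the (x, y) pairs whose response exceeds the cutoff.
kept = [(x, y) for x, y in zip(xs, ys) if y > CUTOFF]
xs_trunc = [x for x, _ in kept]
ys_trunc = [y for _, y in kept]

full_intercept, full_slope = ols_line(xs, ys)
trunc_intercept, trunc_slope = ols_line(xs_trunc, ys_trunc)

print(f"full sample     : slope = {full_slope:.3f}")
print(f"truncated sample: slope = {trunc_slope:.3f} (n = {len(kept)})")
```

    In this setup the slope from the truncated sample is pulled toward zero, because low responses are missing mainly where the regression line itself is low. Collecting more data does not remove the effect, which is the motivation for estimators designed for truncated samples.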

  14. Applied linear regression

    CERN Document Server

    Weisberg, Sanford

    2013-01-01

    Praise for the Third Edition: "...this is an excellent book which could easily be used as a course text..." (International Statistical Institute). The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus

  15. Testing for constant nonparametric effects in general semiparametric regression models with interactions

    KAUST Repository

    Wei, Jiawei; Carroll, Raymond J.; Maity, Arnab

    2011-01-01

    We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work

  16. Gravitational waves from double white dwarfs and AM CVn binaries

    International Nuclear Information System (INIS)

    Nelemans, Gijs

    2003-01-01

    I give a brief overview of our model for the galactic population of compact binaries that is used to predict the low-frequency gravitational wave signal from the galaxy, and discuss recent observational developments that will enable us to test and improve this model. The SPY project will discover some 150 new close double white dwarfs and, recently, two ROSAT sources turned out to be new AM CVn candidates, one with an orbital period of only 5 min. I give an update on the expected binaries that will be resolved by LISA and discuss what we can learn about the galactic population of compact binaries once LISA gives her first results.

  17. Building interpretable predictive models for pediatric hospital readmission using Tree-Lasso logistic regression.

    Science.gov (United States)

    Jovanovic, Milos; Radovanovic, Sandro; Vukicevic, Milan; Van Poucke, Sven; Delibasic, Boris

    2016-09-01

    Quantification and early identification of unplanned readmission risk have the potential to improve the quality of care during hospitalization and after discharge. However, the high dimensionality, sparsity, and class imbalance of electronic health data and the complexity of risk quantification challenge the development of accurate predictive models. Predictive models require a certain level of interpretability in order to be applicable in real settings and create actionable insights. This paper aims to develop accurate and interpretable predictive models for readmission in a general pediatric patient population by integrating a data-driven model (sparse logistic regression) and domain knowledge based on the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) hierarchy of diseases. Additionally, we propose a way to quantify the interpretability of a model and inspect the stability of alternative solutions. The analysis was conducted on more than 66,000 pediatric hospital discharge records from California, State Inpatient Databases, Healthcare Cost and Utilization Project between 2009 and 2011. We incorporated domain knowledge based on the ICD-9-CM hierarchy in a data-driven, Tree-Lasso regularized logistic regression model, providing the framework for model interpretation. This approach was compared with traditional Lasso logistic regression, resulting in models that are easier to interpret because they use fewer high-level diagnoses, with comparable prediction accuracy. The results revealed that the Tree-Lasso model was as competitive in terms of accuracy (measured by area under the receiver operating characteristic curve, AUC) as the traditional Lasso logistic regression, but integration with the ICD-9-CM hierarchy of diseases provided more interpretable models in terms of high-level diagnoses. Additionally, interpretations of models are in accordance with existing medical understanding of pediatric readmission. Best performing models have

  18. Regularized Label Relaxation Linear Regression.

    Science.gov (United States)

    Fang, Xiaozhao; Xu, Yong; Li, Xuelong; Lai, Zhihui; Wong, Wai Keung; Fang, Bingwu

    2018-04-01

    Linear regression (LR) and some of its variants have been widely used for classification problems. Most of these methods assume that during the learning phase, the training samples can be exactly transformed into a strict binary label matrix, which has too little freedom to fit the labels adequately. To address this problem, in this paper, we propose a novel regularized label relaxation LR method, which has the following notable characteristics. First, the proposed method relaxes the strict binary label matrix into a slack variable matrix by introducing a nonnegative label relaxation matrix into LR, which provides more freedom to fit the labels and simultaneously enlarges the margins between different classes as much as possible. Second, the proposed method constructs the class compactness graph based on manifold learning and uses it as the regularization term to avoid the problem of overfitting. The class compactness graph is used to ensure that the samples sharing the same labels can be kept close after they are transformed. Two different algorithms, which are, respectively, based on $\ell_2$-norm and $\ell_{2,1}$-norm loss functions, are devised. These two algorithms have compact closed-form solutions in each iteration so that they are easily implemented. Extensive experiments show that these two algorithms outperform the state-of-the-art algorithms in terms of the classification accuracy and running time.

  19. General simulation algorithm for autocorrelated binary processes.

    Science.gov (United States)

    Serinaldi, Francesco; Lombardo, Federico

    2017-02-01

    The apparent ubiquity of binary random processes in physics and many other fields has attracted considerable attention from the modeling community. However, generation of binary sequences with prescribed autocorrelation is a challenging task owing to the discrete nature of the marginal distributions, which makes the application of classical spectral techniques problematic. We show that such methods can effectively be used if we focus on the parent continuous process of beta distributed transition probabilities rather than on the target binary process. This change of paradigm results in a simulation procedure effectively embedding a spectrum-based iterative amplitude-adjusted Fourier transform method devised for continuous processes. The proposed algorithm is fully general, requires minimal assumptions, and can easily simulate binary signals with power-law and exponentially decaying autocorrelation functions corresponding, for instance, to Hurst-Kolmogorov and Markov processes. An application to rainfall intermittency shows that the proposed algorithm can also simulate surrogate data preserving the empirical autocorrelation.
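
    To make the notion of a binary sequence with prescribed autocorrelation concrete, the sketch below simulates only the simplest case named in the abstract, a Markov process with exponentially decaying autocorrelation. It is not the authors' algorithm (no beta-distributed transition probabilities, no amplitude-adjusted Fourier transform), and the function names and parameter values are assumptions chosen for illustration. For a stationary two-state chain with marginal probability p of a 1 and lag-1 autocorrelation rho, the transition probabilities are P(1 | 1) = p + rho (1 - p) and P(1 | 0) = p (1 - rho), and the lag-k autocorrelation is rho^k.

```python
import random


def simulate_binary_markov(n, p, rho, seed=0):
    """Simulate a stationary two-state Markov chain of 0/1 values.

    p   : marginal probability of a 1
    rho : lag-1 autocorrelation (restricted to 0 <= rho < 1 in this sketch)
    """
    if not (0.0 < p < 1.0 and 0.0 <= rho < 1.0):
        raise ValueError("need 0 < p < 1 and 0 <= rho < 1")
    rng = random.Random(seed)
    p_one_after_one = p + rho * (1.0 - p)   # P(X_t = 1 | X_{t-1} = 1)
    p_one_after_zero = p * (1.0 - rho)      # P(X_t = 1 | X_{t-1} = 0)
    state = 1 if rng.random() < p else 0    # draw the start from the stationary law
    series = [state]
    for _ in range(n - 1):
        prob = p_one_after_one if state == 1 else p_one_after_zero
        state = 1 if rng.random() < prob else 0
        series.append(state)
    return series


def sample_autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


# Hypothetical wet/dry indicator series: 30% wet steps, lag-1 autocorrelation 0.6.
wet_dry = simulate_binary_markov(n=50000, p=0.3, rho=0.6)
print("mean:", round(sum(wet_dry) / len(wet_dry), 3))
for k in (1, 2, 3):
    print(f"lag {k}: sample {sample_autocorrelation(wet_dry, k):.3f}, target {0.6 ** k:.3f}")
```

    A chain of this kind can only produce geometric decay of the autocorrelation. Power-law decay, as in Hurst-Kolmogorov processes, needs a more general construction such as the one the abstract describes.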

  20. General simulation algorithm for autocorrelated binary processes

    Science.gov (United States)

    Serinaldi, Francesco; Lombardo, Federico

    2017-02-01

    The apparent ubiquity of binary random processes in physics and many other fields has attracted considerable attention from the modeling community. However, generation of binary sequences with prescribed autocorrelation is a challenging task owing to the discrete nature of the marginal distributions, which makes the application of classical spectral techniques problematic. We show that such methods can effectively be used if we focus on the parent continuous process of beta distributed transition probabilities rather than on the target binary process. This change of paradigm results in a simulation procedure effectively embedding a spectrum-based iterative amplitude-adjusted Fourier transform method devised for continuous processes. The proposed algorithm is fully general, requires minimal assumptions, and can easily simulate binary signals with power-law and exponentially decaying autocorrelation functions corresponding, for instance, to Hurst-Kolmogorov and Markov processes. An application to rainfall intermittency shows that the proposed algorithm can also simulate surrogate data preserving the empirical autocorrelation.