WorldWideScience

Sample records for nonparametric rank percentile

  1. Tutorial: Calculating Percentile Rank and Percentile Norms Using SPSS

    Science.gov (United States)

    Baumgartner, Ted A.

    2009-01-01

    Practitioners can benefit from using norms, but they often have to develop their own percentile rank and percentile norms. This article is a tutorial on how to quickly and easily calculate percentile rank and percentile norms using SPSS, and this information is presented for a data set. Some issues in calculating percentile rank and percentile…
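
    As an illustration of the calculation this tutorial covers, here is a minimal Python sketch of percentile ranks and a percentile-norm table. It assumes the common mid-rank convention (scores below plus half of the tied scores, divided by N); the SPSS procedure in the article may use a different convention. The function names and sample data are hypothetical.

```python
def percentile_rank(scores, x):
    """Percentile rank of x within scores, using the mid-rank convention for ties."""
    if not scores:
        raise ValueError("scores must not be empty")
    below = sum(1 for s in scores if s < x)   # scores strictly below x
    equal = sum(1 for s in scores if s == x)  # scores tied with x
    return 100.0 * (below + 0.5 * equal) / len(scores)


def percentile_norms(scores):
    """Map each distinct score to its percentile rank, in ascending score order."""
    return {x: percentile_rank(scores, x) for x in sorted(set(scores))}


sample_scores = [10, 20, 20, 30, 40]  # hypothetical data set
norms = percentile_norms(sample_scores)
for score, rank in norms.items():
    print(score, rank)
```

    Counting half of the tied scores keeps the percentile rank of the median score at 50 in a symmetric sample, which is one reason this convention is often chosen for norm tables.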

  2. Nonparametric estimation of age-specific reference percentile curves with radial smoothing.

    Science.gov (United States)

    Wan, Xiaohai; Qu, Yongming; Huang, Yao; Zhang, Xiao; Song, Hanping; Jiang, Honghua

    2012-01-01

    Reference percentile curves represent the covariate-dependent distribution of a quantitative measurement and are often used to summarize and monitor dynamic processes such as human growth. We propose a new nonparametric method based on a radial smoothing (RS) technique to estimate age-specific reference percentile curves assuming the underlying distribution is relatively close to normal. We compared the RS method with both the LMS and the generalized additive models for location, scale and shape (GAMLSS) methods using simulated data and found that our method has smaller estimation error than the two existing methods. We also applied the new method to analyze height growth data from children being followed in a clinical observational study of growth hormone treatment, and compared the growth curves between those with growth disorders and the general population. Copyright © 2011 Elsevier Inc. All rights reserved.

  3. International Conference on Robust Rank-Based and Nonparametric Methods

    CERN Document Server

    McKean, Joseph

    2016-01-01

    The contributors to this volume include many of the distinguished researchers in this area. Many of these scholars have collaborated with Joseph McKean to develop underlying theory for these methods, obtain small sample corrections, and develop efficient algorithms for their computation. The papers cover the scope of the area, including robust nonparametric rank-based procedures through Bayesian and big data rank-based analyses. Areas of application include biostatistics and spatial areas. Over the last 30 years, robust rank-based and nonparametric methods have developed considerably. These procedures generalize traditional Wilcoxon-type methods for one- and two-sample location problems. Research into these procedures has culminated in complete analyses for many of the models used in practice including linear, generalized linear, mixed, and nonlinear models. Settings are both multivariate and univariate. With the development of R packages in these areas, computation of these procedures is easily shared with r...

  4. A comparison of average-based, percentile rank, and other citation impact indicators

    Energy Technology Data Exchange (ETDEWEB)

    Ruiz-Castillo, J.; Albarran, P.

    2016-07-01

    The main aim of this paper is to defend the view that, in spite of the broad agreement in favor of the MNCS and the percentile rank indicators, there are two other citation indicators with desirable properties that the above indicators do not possess: (i) a member of the family of high-impact indicators introduced in Albarrán et al. (2011), and (ii) a new indicator, based on the work of Herrero & Villar (2013), which measures the relative performance of the different research units in terms of a series of tournaments in which each research unit is confronted with all others repeatedly. We compare indicators from the point of view of their discriminatory power, measured by the range and the coefficient of variation. Using a large dataset indexed by Thomson Reuters, we consider 40 countries that have published at least 10,000 articles in all sciences in 1998-2003. There are two main findings. First, the new indicator exhibits a greater discriminatory power than percentile rank indicators. Second, the high-impact indicator exhibits the greatest discriminatory power. (Author)

  5. Use of percentile rank sum method in identifying repetitive high occupational radiation dose jobs in a nuclear power plant

    International Nuclear Information System (INIS)

    Cho, Y.H.; Ko, H.S.; Kim, S.H.; Kang, C.S.; Moon, J.H.; Kim, K.D.

    2004-01-01

    The cost-effective reduction of occupational radiation dose (ORD) at a nuclear power plant cannot be achieved without an extensive analysis of accumulated ORD data from existing plants. This analysis must identify the jobs with repetitive high ORD at the plant. The commonly used point value method generally overestimates the role of mean and median values in identifying high-ORD jobs, which can lead to misjudgment. In this study, the Percentile Rank Sum Method (PRSM), which is based on non-parametric statistical theory, is proposed to identify repetitive high-ORD jobs. As a case study, the method is applied to ORD data of maintenance and repair jobs at Kori units 3 and 4 in Korea, which are pressurized water reactors with 950 MWe capacity that have been in operation since 1986 and 1987, respectively. The results were verified and validated, and PRSM has been demonstrated to be an efficient method of analyzing the data. (authors)

  6. Digital image comparison by subtracting contextual transformations—percentile rank order differentiation

    Science.gov (United States)

    Wehde, M. E.

    1995-01-01

    The common method of digital image comparison by subtraction imposes various constraints on the image contents. Precise registration of images is required to assure proper evaluation of surface locations. The attribute being measured and the calibration and scaling of the sensor are also important to the validity and interpretability of the subtraction result. Influences of sensor gains and offsets complicate the subtraction process. The presence of any uniform systematic transformation component in one of two images to be compared distorts the subtraction results and requires analyst intervention to interpret or remove it. A new technique has been developed to overcome these constraints. Images to be compared are first transformed using the cumulative relative frequency as a transfer function. The transformed images represent the contextual relationship of each surface location with respect to all others within the image. The process of differentiating between the transformed images results in a percentile rank ordered difference. This process produces consistent terrain-change information even when the above requirements necessary for subtraction are relaxed. This technique may be valuable to an appropriately designed hierarchical terrain-monitoring methodology because it does not require human participation in the process.
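
    The transformation described in this abstract can be sketched in a few lines of Python. This is an illustrative reading, not the author's implementation: each image is mapped through its own cumulative relative frequency (here scaled to 0-100), and the transformed images are then subtracted. The function names and the tiny 2x2 images are hypothetical.

```python
from bisect import bisect_right


def percentile_transform(image):
    """Replace each pixel by the cumulative relative frequency (0-100) of its value."""
    flat = sorted(v for row in image for v in row)
    n = len(flat)
    # bisect_right counts how many pixels are <= v in the sorted pixel list
    return [[100.0 * bisect_right(flat, v) / n for v in row] for row in image]


def percentile_difference(img_a, img_b):
    """Pixel-wise difference of the percentile-transformed images (b minus a)."""
    ta, tb = percentile_transform(img_a), percentile_transform(img_b)
    return [[b - a for a, b in zip(ra, rb)] for ra, rb in zip(ta, tb)]


before = [[1, 2], [3, 4]]                               # hypothetical scene
rescaled = [[2 * v + 5 for v in row] for row in before]  # same scene, new gain and offset
changed = [[7, 9], [11, 1]]                              # last pixel now the darkest

diff_same = percentile_difference(before, rescaled)
diff_changed = percentile_difference(before, changed)
print(diff_same)
print(diff_changed)
```

    Because a positive gain and an offset preserve the ordering of pixel values, the rescaled image transforms to the same percentile image as the original, so their difference is zero everywhere; only a change in relative ordering shows up in the result.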

  7. Prior publication productivity, grant percentile ranking, and topic-normalized citation impact of NHLBI cardiovascular R01 grants.

    Science.gov (United States)

    Kaltman, Jonathan R; Evans, Frank J; Danthi, Narasimhan S; Wu, Colin O; DiMichele, Donna M; Lauer, Michael S

    2014-09-12

    We previously demonstrated absence of association between peer-review-derived percentile ranking and raw citation impact in a large cohort of National Heart, Lung, and Blood Institute cardiovascular R01 grants, but we did not consider pregrant investigator publication productivity. We also did not normalize citation counts for scientific field, type of article, and year of publication. To determine whether measures of investigator prior productivity predict a grant's subsequent scientific impact as measured by normalized citation metrics. We identified 1492 investigator-initiated de novo National Heart, Lung, and Blood Institute R01 grant applications funded between 2001 and 2008 and linked the publications from these grants to their InCites (Thomson Reuters) citation record. InCites provides a normalized citation count for each publication stratifying by year of publication, type of publication, and field of science. The coprimary end points for this analysis were the normalized citation impact per million dollars allocated and the number of publications per grant that has normalized citation rate in the top decile per million dollars allocated (top 10% articles). Prior productivity measures included the number of National Heart, Lung, and Blood Institute-supported publications each principal investigator published in the 5 years before grant review and the corresponding prior normalized citation impact score. After accounting for potential confounders, there was no association between peer-review percentile ranking and bibliometric end points (all adjusted P>0.5). However, prior productivity was predictive (P…). …citation counts, we confirmed a lack of association between peer-review grant percentile ranking and grant citation impact. However, prior investigator publication productivity was predictive of grant-specific citation impact. © 2014 American Heart Association, Inc.

  8. Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks

    Science.gov (United States)

    Gray-Davies, Tristan; Holmes, Chris C.; Caron, François

    2018-01-01

    We present a novel Bayesian nonparametric regression model for covariates X and continuous response variable Y ∈ ℝ. The model is parametrized in terms of marginal distributions for Y and X and a regression function which tunes the stochastic ordering of the conditional distributions F(y|x). By adopting an approximate composite likelihood approach, we show that the resulting posterior inference can be decoupled for the separate components of the model. This procedure can scale to very large datasets and allows standard, existing software from Bayesian nonparametric density estimation and Plackett-Luce ranking estimation to be applied. As an illustration, we show an application of our approach to a US Census dataset, with over 1,300,000 data points and more than 100 covariates. PMID:29623150

  9. How can journal impact factors be normalized across fields of science? An assessment in terms of percentile ranks and fractional counts

    NARCIS (Netherlands)

    Leydesdorff, L.; Zhou, P.; Bornmann, L.

    2013-01-01

    Using the CD-ROM version of the Science Citation Index 2010 (N = 3,705 journals), we study the (combined) effects of (a) fractional counting on the impact factor (IF) and (b) transformation of the skewed citation distributions into a distribution of 100 percentiles and six percentile rank classes

  10. Rank-based permutation approaches for non-parametric factorial designs.

    Science.gov (United States)

    Umlauft, Maria; Konietschke, Frank; Pauly, Markus

    2017-11-01

    Inference methods for null hypotheses formulated in terms of distribution functions in general non-parametric factorial designs are studied. The methods can be applied to continuous, ordinal or even ordered categorical data in a unified way, and are based only on ranks. In this set-up Wald-type statistics and ANOVA-type statistics are the current state of the art. The first method is asymptotically exact but a rather liberal statistical testing procedure for small to moderate sample size, while the latter is only an approximation which does not possess the correct asymptotic α level under the null. To bridge these gaps, a novel permutation approach is proposed which can be seen as a flexible generalization of the Kruskal-Wallis test to all kinds of factorial designs with independent observations. It is proven that the permutation principle is asymptotically correct while keeping its finite exactness property when data are exchangeable. The results of extensive simulation studies foster these theoretical findings. A real data set exemplifies its applicability. © 2017 The British Psychological Society.

  11. Performances of non-parametric statistics in sensitivity analysis and parameter ranking

    International Nuclear Information System (INIS)

    Saltelli, A.

    1987-01-01

    Twelve parametric and non-parametric sensitivity analysis techniques are compared in the case of non-linear model responses. The test models used are taken from the long-term risk analysis for the disposal of high level radioactive waste in a geological formation. They describe the transport of radionuclides through a set of engineered and natural barriers from the repository to the biosphere and to man. The output data from these models are the dose rates affecting the maximum exposed individual of a critical group at a given point in time. All the techniques are applied to the output from the same Monte Carlo simulations, where a modified version of the Latin Hypercube method is used for the sample selection. Hypothesis testing is systematically applied to quantify the degree of confidence in the results given by the various sensitivity estimators. The estimators are ranked according to their robustness and stability, on the basis of two test cases. The conclusions are that no estimator can be considered the best from all points of view, and that more than one estimator should be used in sensitivity analysis.

  12. Which percentile-based approach should be preferred for calculating normalized citation impact values? An empirical comparison of five approaches including a newly developed citation-rank approach (P100)

    NARCIS (Netherlands)

    Bornmann, L.; Leydesdorff, L.; Wang, J.

    2013-01-01

    For comparisons of citation impacts across fields and over time, bibliometricians normalize the observed citation counts with reference to an expected citation value. Percentile-based approaches have been proposed as a non-parametric alternative to parametric central-tendency statistics. Percentiles

  13. Percentile ranks and benchmark estimates of change for the Health Education Impact Questionnaire: Normative data from an Australian sample

    Directory of Open Access Journals (Sweden)

    Gerald R Elsworth

    2017-03-01

    Objective: Participant self-report data play an essential role in the evaluation of health education activities, programmes and policies. When questionnaire items do not have a clear mapping to a performance-based continuum, percentile norms are useful for communicating individual test results to users. Similarly, when assessing programme impact, the comparison of effect sizes for group differences or baseline to follow-up change with effect sizes observed in relevant normative data provides more directly useful information compared with statistical tests of mean differences and the evaluation of effect sizes for substantive significance using universal rules of thumb such as those for Cohen’s ‘d’. This article aims to assist managers, programme staff and clinicians of healthcare organisations who use the Health Education Impact Questionnaire to interpret their results using percentile norms for individual baseline and follow-up scores together with group effect sizes for change across the duration of typical chronic disease self-management and support programmes. Methods: Percentile norms for individual Health Education Impact Questionnaire scale scores and effect sizes for group change were calculated using freely available software for each of the eight Health Education Impact Questionnaire scales. Data used were archived responses of 2157 participants of chronic disease self-management programmes conducted by a wide range of organisations in Australia between July 2007 and March 2013. Results: Tables of percentile norms and three possible effect size benchmarks for baseline to follow-up change are provided together with two worked examples to assist interpretation. Conclusion: While the norms and benchmarks presented will be particularly relevant for Australian organisations and others using the English-language version of the Health Education Impact Questionnaire, they will also be useful for translated versions as a guide to the

  14. Inflation of type I error rates by unequal variances associated with parametric, nonparametric, and Rank-Transformation Tests

    Directory of Open Access Journals (Sweden)

    Donald W. Zimmerman

    2004-01-01

    It is well known that the two-sample Student t test fails to maintain its significance level when the variances of treatment groups are unequal, and, at the same time, sample sizes are unequal. However, introductory textbooks in psychology and education often maintain that the test is robust to variance heterogeneity when sample sizes are equal. The present study discloses that, for a wide variety of non-normal distributions, especially skewed distributions, the Type I error probabilities of both the t test and the Wilcoxon-Mann-Whitney test are substantially inflated by heterogeneous variances, even when sample sizes are equal. The Type I error rate of the t test performed on ranks replacing the scores (rank-transformed data) is inflated in the same way and always corresponds closely to that of the Wilcoxon-Mann-Whitney test. For many probability densities, the distortion of the significance level is far greater after transformation to ranks and, contrary to known asymptotic properties, the magnitude of the inflation is an increasing function of sample size. Although nonparametric tests of location also can be sensitive to differences in the shape of distributions apart from location, the Wilcoxon-Mann-Whitney test and rank-transformation tests apparently are influenced mainly by skewness that is accompanied by specious differences in the means of ranks.
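
    To make the two rank-based statistics in this abstract concrete, the following Python sketch pools two samples, assigns midranks, and computes both the Mann-Whitney U statistic and a pooled-variance Student t statistic on the ranks. It is only an illustration of the rank transformation, not the simulation design of the study; the function names and sample values are hypothetical, and no p-values are computed.

```python
import math
from statistics import mean, variance


def midranks(values):
    """Ranks 1..n of values, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_statistics(x, y):
    """Return (U for sample x, Student t computed on the pooled ranks)."""
    ranks = midranks(list(x) + list(y))
    rx, ry = ranks[:len(x)], ranks[len(x):]
    n1, n2 = len(rx), len(ry)
    u1 = sum(rx) - n1 * (n1 + 1) / 2  # Mann-Whitney U from the rank sum of x
    pooled = ((n1 - 1) * variance(rx) + (n2 - 1) * variance(ry)) / (n1 + n2 - 2)
    t = (mean(rx) - mean(ry)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return u1, t


u_stat, t_on_ranks = rank_statistics([1.2, 3.4, 2.2], [5.0, 4.1, 6.3])  # hypothetical data
print(u_stat, t_on_ranks)
```

    Both statistics are functions of the same rank sums, which is consistent with the abstract's observation that the t test on ranks tracks the Wilcoxon-Mann-Whitney test closely.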

  15. Theory of nonparametric tests

    CERN Document Server

    Dickhaus, Thorsten

    2018-01-01

    This textbook provides a self-contained presentation of the main concepts and methods of nonparametric statistical testing, with a particular focus on the theoretical foundations of goodness-of-fit tests, rank tests, resampling tests, and projection tests. The substitution principle is employed as a unified approach to the nonparametric test problems discussed. In addition to mathematical theory, it also includes numerous examples and computer implementations. The book is intended for advanced undergraduate, graduate, and postdoc students as well as young researchers. Readers should be familiar with the basic concepts of mathematical statistics typically covered in introductory statistics courses.

  16. Binding affinity toward human prion protein of some anti-prion compounds - Assessment based on QSAR modeling, molecular docking and non-parametric ranking.

    Science.gov (United States)

    Kovačević, Strahinja; Karadžić, Milica; Podunavac-Kuzmanović, Sanja; Jevrić, Lidija

    2018-01-01

    The present study is based on the quantitative structure-activity relationship (QSAR) analysis of binding affinity toward human prion protein (huPrP C ) of quinacrine, pyridine dicarbonitrile, diphenylthiazole and diphenyloxazole analogs applying different linear and non-linear chemometric regression techniques, including univariate linear regression, multiple linear regression, partial least squares regression and artificial neural networks. The QSAR analysis distinguished molecular lipophilicity as an important factor that contributes to the binding affinity. Principal component analysis was used in order to reveal similarities or dissimilarities among the studied compounds. The analysis of in silico absorption, distribution, metabolism, excretion and toxicity (ADMET) parameters was conducted. The ranking of the studied analogs on the basis of their ADMET parameters was done applying the sum of ranking differences, as a relatively new chemometric method. The main aim of the study was to reveal the most important molecular features whose changes lead to the changes in the binding affinities of the studied compounds. Another point of view on the binding affinity of the most promising analogs was established by application of molecular docking analysis. The results of the molecular docking were proven to be in agreement with the experimental outcome. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. 50th Percentile Rent Estimates

    Data.gov (United States)

    Department of Housing and Urban Development — Rent estimates at the 50th percentile (or median) are calculated for all Fair Market Rent areas. Fair Market Rents (FMRs) are primarily used to determine payment...

  18. Teaching Nonparametric Statistics Using Student Instrumental Values.

    Science.gov (United States)

    Anderson, Jonathan W.; Diddams, Margaret

    Nonparametric statistics are often difficult to teach in introduction to statistics courses because of the lack of real-world examples. This study demonstrated how teachers can use differences in the rankings and ratings of undergraduate and graduate values to discuss: (1) ipsative and normative scaling; (2) uses of the Mann-Whitney U-test; and…

  19. Nonparametric statistical inference

    CERN Document Server

    Gibbons, Jean Dickinson

    2010-01-01

    Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference. -Eugenia Stoimenova, Journal of Applied Statistics, June 2012 … one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students. -Biometrics, 67, September 2011 This excellently presente

  20. The application of non-parametric statistical method for an ALARA implementation

    International Nuclear Information System (INIS)

    Cho, Young Ho; Herr, Young Hoi

    2003-01-01

    The cost-effective reduction of Occupational Radiation Dose (ORD) at a nuclear power plant cannot be achieved without an extensive analysis of accumulated ORD data from existing plants. This analysis must identify the jobs with repetitive high ORD at the plant. In this study, the Percentile Rank Sum Method (PRSM), which is based on non-parametric statistical theory, is proposed to identify repetitive high-ORD jobs. As a case study, the method is applied to ORD data of maintenance and repair jobs at Kori units 3 and 4 in Korea, which are pressurized water reactors with 950 MWe capacity that have been in operation since 1986 and 1987, respectively. The results were verified and validated, and PRSM has been demonstrated to be an efficient method of analyzing the data.

  1. On Cooper's Nonparametric Test.

    Science.gov (United States)

    Schmeidler, James

    1978-01-01

    The basic assumption of Cooper's nonparametric test for trend (EJ 125 069) is questioned. It is contended that the proper assumption alters the distribution of the statistic and reduces its usefulness. (JKS)

  2. Nonparametric Transfer Function Models

    Science.gov (United States)

    Liu, Jun M.; Chen, Rong; Yao, Qiwei

    2009-01-01

    In this paper a class of nonparametric transfer function models is proposed to model nonlinear relationships between ‘input’ and ‘output’ time series. The transfer function is smooth with unknown functional forms, and the noise is assumed to be a stationary autoregressive-moving average (ARMA) process. The nonparametric transfer function is estimated jointly with the ARMA parameters. By modeling the correlation in the noise, the transfer function can be estimated more efficiently. The parsimonious ARMA structure improves the estimation efficiency in finite samples. The asymptotic properties of the estimators are investigated. The finite-sample properties are illustrated through simulations and one empirical example. PMID:20628584

  3. Bayesian nonparametric data analysis

    CERN Document Server

    Müller, Peter; Jara, Alejandro; Hanson, Tim

    2015-01-01

    This book reviews nonparametric Bayesian methods and models that have proven useful in the context of data analysis. Rather than providing an encyclopedic review of probability models, the book’s structure follows a data analysis perspective. As such, the chapters are organized by traditional data analysis problems. In selecting specific nonparametric models, simpler and more traditional models are favored over specialized ones. The discussed methods are illustrated with a wealth of examples, including applications ranging from stylized examples to case studies from recent literature. The book also includes an extensive discussion of computational methods and details on their implementation. R code for many examples is included in on-line software pages.

  4. Nonparametric statistics with applications to science and engineering

    CERN Document Server

    Kvam, Paul H

    2007-01-01

    A thorough and definitive book that fully addresses traditional and modern-day topics of nonparametric statistics. This book presents a practical approach to nonparametric statistical analysis and provides comprehensive coverage of both established and newly developed methods. With the use of MATLAB, the authors present information on theorems and rank tests in an applied fashion, with an emphasis on modern methods in regression and curve fitting, bootstrap confidence intervals, splines, wavelets, empirical likelihood, and goodness-of-fit testing. Nonparametric Statistics with Applications to Science and Engineering begins with succinct coverage of basic results for order statistics, methods of categorical data analysis, nonparametric regression, and curve fitting methods. The authors then focus on nonparametric procedures that are becoming more relevant to engineering researchers and practitioners. The important fundamental materials needed to effectively learn and apply the discussed methods are also provide...

  5. Bayesian nonparametric hierarchical modeling.

    Science.gov (United States)

    Dunson, David B

    2009-04-01

    In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.

  6. Quantal Response: Nonparametric Modeling

    Science.gov (United States)

    2017-01-01

    ...capture the behavior of observed phenomena. Higher-order polynomial and finite-dimensional spline basis models allow for more complicated responses as the ... flexibility as these are nonparametric (not constrained to any particular functional form). These should be useful in identifying nonstandard behavior via ... deviance ∆ = −2 log(L_reduced/L_full) is defined in terms of the likelihood function L. For normal error, L_full = 1, and based on Eq. A-2, we have log

  7. Statistical methods for ranking data

    CERN Document Server

    Alvo, Mayer

    2014-01-01

    This book introduces advanced undergraduate, graduate students and practitioners to statistical methods for ranking data. An important aspect of nonparametric statistics is oriented towards the use of ranking data. Rank correlation is defined through the notion of distance functions and the notion of compatibility is introduced to deal with incomplete data. Ranking data are also modeled using a variety of modern tools such as CART, MCMC, EM algorithm and factor analysis. This book deals with statistical methods used for analyzing such data and provides a novel and unifying approach for hypotheses testing. The techniques described in the book are illustrated with examples and the statistical software is provided on the authors’ website.

  8. Nonparametric statistical inference

    CERN Document Server

    Gibbons, Jean Dickinson

    2014-01-01

    Thoroughly revised and reorganized, the fourth edition presents in-depth coverage of the theory and methods of the most widely used nonparametric procedures in statistical analysis and offers example applications appropriate for all areas of the social, behavioral, and life sciences. The book presents new material on the quantiles, the calculation of exact and simulated power, multiple comparisons, additional goodness-of-fit tests, methods of analysis of count data, and modern computer applications using MINITAB, SAS, and STATXACT. It includes tabular guides for simplified applications of tests and finding P values and confidence interval estimates.

  9. Exact nonparametric inference for detection of nonlinear determinism

    OpenAIRE

    Luo, Xiaodong; Zhang, Jie; Small, Michael; Moroz, Irene

    2005-01-01

    We propose an exact nonparametric inference scheme for the detection of nonlinear determinism. The essential fact utilized in our scheme is that, for a linear stochastic process with jointly symmetric innovations, its ordinary least squares (OLS) linear prediction error is symmetric about zero. Based on this viewpoint, a class of linear signed rank statistics, e.g. the Wilcoxon signed rank statistic, can be derived with the known null distributions from the prediction error. Thus one of the ad...

  10. An Activity for Learning to Find Percentiles

    Science.gov (United States)

    Cox, Richard G.

    2016-01-01

    This classroom activity is designed to help students practice calculating percentiles. The approach of the activity involves physical sorting and full classroom participation in each calculation. The design encourages a more engaged approach than simply having students make a calculation with numbers on a paper.

  11. Nonparametric combinatorial sequence models.

    Science.gov (United States)

    Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa

    2011-11-01

    This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.

  12. Relation between body mass index percentile and muscle strength ...

    African Journals Online (AJOL)

    Relation between body mass index percentile and muscle strength and endurance. ... Egyptian Journal of Medical Human Genetics ... They were divided into three groups according to their body mass index percentile where group (a) is equal to or more than 5% percentile yet less than 85% percentile, group (b) is equal to ...

  13. Nonparametric tests for censored data

    CERN Document Server

    Bagdonavicus, Vilijandas; Nikulin, Mikhail

    2013-01-01

    This book concerns testing hypotheses in non-parametric models. Generalizations of many non-parametric tests to the case of censored and truncated data are considered. Most of the test results are proved and real applications are illustrated using examples. Theories and exercises are provided. The incorrect use of many tests applying most statistical software is highlighted and discussed.

  14. Blood Pressure Percentiles for School Children

    Directory of Open Access Journals (Sweden)

    İsmail Özanli

    2016-06-01

    Objective: The prevalence of hypertension in childhood and adolescence is gradually increasing. We aimed to investigate the blood pressure (BP) values of children aged 7-18 years. Methods: This study was conducted in a total of 3375 (1777 females, 1598 males) children from 27 schools. Blood pressures of children were measured using a sphygmomanometer appropriate to arm circumference. Results: A positive relationship was found between systolic blood pressure (SBP) and diastolic blood pressure (DBP) and the body weight, height, age and body mass index (BMI) in male and female children. SBP was higher in males than females after the age of 13. DBP was higher in males than females after the age of 14. The mean annual increase of SBP was 2.06 mmHg in males and 1.54 mmHg in females. The mean annual increase of DBP was 1.52 mmHg in males and 1.38 mmHg in females. Conclusion: In this study, we identified the threshold values for blood pressure in children between the age of 7 and 18 years in Erzurum province. It is necessary to combine and evaluate data obtained from various regions for the identification of BP percentiles according to the age, gender and height percentiles of Turkish children.

  15. Nonparametric identification of copula structures

    KAUST Repository

    Li, Bo; Genton, Marc G.

    2013-01-01

    We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric

  16. A nonparametric spatial scan statistic for continuous data.

    Science.gov (United States)

    Jung, Inkyung; Cho, Ho Jin

    2015-10-20

    Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compared the performance of the method with parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases under consideration in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
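
    For orientation, the Wilcoxon rank-sum statistic on which the proposed scan statistic is based can be sketched as below. This shows only the rank-sum computation for a single candidate window, with tied values given midranks; the scanning over windows and the inference step described in the article are not reproduced, and all names and data are hypothetical.

```python
def midranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1     # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def rank_sum(inside, outside):
    """Wilcoxon rank-sum statistic: sum of pooled ranks of the first sample."""
    ranks = midranks(list(inside) + list(outside))
    return sum(ranks[:len(inside)])

inside_window = [8.1, 9.4, 7.7]          # hypothetical values inside a scan window
outside_window = [3.2, 4.5, 8.1, 2.9]    # hypothetical values outside it
w = rank_sum(inside_window, outside_window)
```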

  17. Rank Dynamics

    Science.gov (United States)

    Gershenson, Carlos

    Studies of rank distributions have been popular for decades, especially since the work of Zipf. For example, if we rank words of a given language by use frequency (most used word in English is 'the', rank 1; second most common word is 'of', rank 2), the distribution can be approximated roughly with a power law. The same applies for cities (most populated city in a country ranks first), earthquakes, metabolism, the Internet, and dozens of other phenomena. We recently proposed "rank diversity" to measure how ranks change in time, using the Google Books Ngram dataset. Studying six languages between 1800 and 2009, we found that the rank diversity curves of languages are universal, adjusted with a sigmoid on log-normal scale. We are studying several other datasets (sports, economies, social systems, urban systems, earthquakes, artificial life). Rank diversity seems to be universal, independently of the shape of the rank distribution. I will present our work in progress towards a general description of the features of rank change in time, along with simple models which reproduce it.

  18. Using subjective percentiles and test data for estimating fragility functions

    International Nuclear Information System (INIS)

    George, L.L.; Mensing, R.W.

    1981-01-01

    Fragility functions are cumulative distribution functions (cdfs) of strengths at failure. They are needed for reliability analyses of systems such as power generation and transmission systems. Subjective opinions supplement sparse test data for estimating fragility functions. Often the opinions are opinions on the percentiles of the fragility function. Subjective percentiles are likely to be less biased than opinions on parameters of cdfs. Solutions to several problems in the estimation of fragility functions are found for subjective percentiles and test data. How subjective percentiles should be used to estimate subjective fragility functions, how subjective percentiles should be combined with test data, how fragility functions for several failure modes should be combined into a composite fragility function, and how inherent randomness and uncertainty due to lack of knowledge should be represented are considered. Subjective percentiles are treated as independent estimates of percentiles. The following are derived: least-squares parameter estimators for normal and lognormal cdfs, based on subjective percentiles (the method is applicable to any invertible cdf); a composite fragility function for combining several failure modes; estimators of variation within and between groups of experts for nonidentically distributed subjective percentiles; weighted least-squares estimators when subjective percentiles have higher variation at higher percents; and weighted least-squares and Bayes parameter estimators based on combining subjective percentiles and test data. 4 figures, 2 tables
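
    One way to read "least-squares parameter estimators for normal cdfs based on subjective percentiles" is as a regression of the stated percentile values on standard normal quantiles. The sketch below takes that reading as an assumption; it is not claimed to be the authors' estimator, and the function name and the expert opinions are hypothetical. For a lognormal cdf the same fit could be applied to the logarithms of the values.

```python
from statistics import NormalDist

def fit_normal_from_percentiles(points):
    """Least-squares fit of (mu, sigma) from (probability, value) pairs.

    For a normal cdf, value = mu + sigma * z, where z is the standard normal
    quantile of the probability, so mu and sigma are an ordinary regression
    intercept and slope.
    """
    zs = [NormalDist().inv_cdf(p) for p, _ in points]
    xs = [x for _, x in points]
    n = len(points)
    z_bar = sum(zs) / n
    x_bar = sum(xs) / n
    sxz = sum((z - z_bar) * (x - x_bar) for z, x in zip(zs, xs))
    szz = sum((z - z_bar) ** 2 for z in zs)
    sigma = sxz / szz
    mu = x_bar - sigma * z_bar
    return mu, sigma

# Hypothetical expert opinions: (cumulative probability, strength at failure).
opinions = [(0.10, 7.4), (0.50, 10.0), (0.90, 12.6)]
mu_hat, sigma_hat = fit_normal_from_percentiles(opinions)
```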

  19. Decision support using nonparametric statistics

    CERN Document Server

    Beatty, Warren

    2018-01-01

    This concise volume covers nonparametric statistics topics that are most likely to be seen and used from a practical decision support perspective. While many degree programs require a course in parametric statistics, these methods are often inadequate for real-world decision making in business environments. Much of the data collected today by business executives (for example, customer satisfaction opinions) requires nonparametric statistics for valid analysis, and this book provides the reader with a set of tools that can be used to validly analyze all data, regardless of type. Through numerous examples and exercises, this book explains why nonparametric statistics will lead to better decisions and how they are used to reach a decision, with a wide array of business applications. Online resources include exercise data, spreadsheets, and solutions.

  20. Analysis of small sample size studies using nonparametric bootstrap test with pooled resampling method.

    Science.gov (United States)

    Dwivedi, Alok Kumar; Mallawaarachchi, Indika; Alvarado, Luis A

    2017-06-30

    Experimental studies in biomedical research frequently pose analytical problems related to small sample size. In such studies, there are conflicting findings regarding the choice of parametric and nonparametric analysis, especially with non-normal data. In such instances, some methodologists questioned the validity of parametric tests and suggested nonparametric tests. In contrast, other methodologists found nonparametric tests to be too conservative and less powerful and thus preferred using parametric tests. Some researchers have recommended using a bootstrap test; however, this method also has small sample size limitation. We used a pooled method in nonparametric bootstrap test that may overcome the problem related with small samples in hypothesis testing. The present study compared nonparametric bootstrap test with pooled resampling method corresponding to parametric, nonparametric, and permutation tests through extensive simulations under various conditions and using real data examples. The nonparametric pooled bootstrap t-test provided equal or greater power for comparing two means as compared with unpaired t-test, Welch t-test, Wilcoxon rank sum test, and permutation test while maintaining type I error probability for any conditions except for Cauchy and extreme variable lognormal distributions. In such cases, we suggest using an exact Wilcoxon rank sum test. Nonparametric bootstrap paired t-test also provided better performance than other alternatives. Nonparametric bootstrap test provided benefit over exact Kruskal-Wallis test. We suggest using nonparametric bootstrap test with pooled resampling method for comparing paired or unpaired means and for validating the one way analysis of variance test results for non-normal data in small sample size studies. Copyright © 2017 John Wiley & Sons, Ltd.
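
    A rough sketch of a two-sample bootstrap t-test with pooled resampling follows. It assumes that "pooled resampling" means drawing both bootstrap groups, with replacement, from the combined sample so that the null hypothesis of a common distribution holds; the choice of an unequal-variance t statistic, the handling of zero-variance resamples, and all names and data are assumptions rather than details taken from the article.

```python
import random
from statistics import mean, variance

def t_statistic(a, b):
    """Unequal-variance (Welch-type) two-sample t statistic."""
    se2 = variance(a) / len(a) + variance(b) / len(b)
    if se2 == 0:
        # Degenerate resample: no spread in either group.
        return 0.0 if mean(a) == mean(b) else float("inf")
    return (mean(a) - mean(b)) / se2 ** 0.5

def pooled_bootstrap_p_value(a, b, n_boot=2000, seed=0):
    """Two-sided p-value from resampling both groups out of the pooled data."""
    rng = random.Random(seed)
    observed = abs(t_statistic(a, b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_boot):
        a_star = rng.choices(pooled, k=len(a))   # sampling with replacement
        b_star = rng.choices(pooled, k=len(b))
        if abs(t_statistic(a_star, b_star)) >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_boot + 1)

control = [5.1, 4.9, 5.6, 5.0, 5.3]   # hypothetical small samples
treated = [6.8, 7.1, 6.5, 7.4, 6.9]
p_value = pooled_bootstrap_p_value(control, treated)
```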

  1. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin

    2017-01-19

    In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13] (H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337, doi: 10.1214/aos/1018031100)

  2. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun

    2017-01-01

    In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13] (H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337, doi: 10.1214/aos/1018031100)

  3. EJSCREEN States Percentiles Lookup Table--2015 Public Release

    Data.gov (United States)

    U.S. Environmental Protection Agency — The States table provides percentile breaks of important EJSCREEN elements (demographic indicators and indexes, environmental indicators and indexes) at the state...

  4. EJSCREEN Regions Percentiles Lookup Table--2015 Public Release

    Data.gov (United States)

    U.S. Environmental Protection Agency — The Regions table provides percentile breaks of important EJSCREEN elements (demographic indicators and indexes, environmental indicators and indexes) at the EPA...

  5. EJSCREEN National Percentiles Lookup Table--2015 Public Release

    Data.gov (United States)

    U.S. Environmental Protection Agency — The USA table provides percentile breaks of important EJSCREEN elements (demographic indicators and indexes, environmental indicators and indexes) at the national...

  6. Nonparametric factor analysis of time series

    OpenAIRE

    Rodríguez-Poo, Juan M.; Linton, Oliver Bruce

    1998-01-01

    We introduce a nonparametric smoothing procedure for nonparametric factor analysis of multivariate time series. The asymptotic properties of the proposed procedures are derived. We present an application based on the residuals from the Fair macromodel.

  7. The Distribution of the Sum of Signed Ranks

    Science.gov (United States)

    Albright, Brian

    2012-01-01

    We describe the calculation of the distribution of the sum of signed ranks and develop an exact recursive algorithm for the distribution as well as an approximation of the distribution using the normal. The results have applications to the non-parametric Wilcoxon signed-rank test.
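
    An exact recursion of this kind can be sketched as below. It is a standard dynamic-programming construction and not necessarily the algorithm in the article; it assumes no ties and no zero differences, and the function names are hypothetical. The sketch tabulates W+, the sum of the positively signed ranks; the sum of signed ranks follows as W = 2 W+ - n(n+1)/2.

```python
def signed_rank_counts(n):
    """counts[k] = number of sign assignments to ranks 1..n with W+ = k.

    Built one rank at a time: rank r is either negative (sum unchanged)
    or positive (sum grows by r).
    """
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        # Iterate downward so each rank contributes at most once.
        for k in range(max_sum, r - 1, -1):
            counts[k] += counts[k - r]
    return counts

def signed_rank_pmf(n):
    """Null probabilities P(W+ = k) when each sign is + or - with probability 1/2."""
    total = 2 ** n
    return [c / total for c in signed_rank_counts(n)]

counts_3 = signed_rank_counts(3)
pmf_10 = signed_rank_pmf(10)
```

    The distribution is symmetric about n(n+1)/4, which is a quick sanity check on any implementation.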

  8. Nonparametric Inference for Periodic Sequences

    KAUST Repository

    Sun, Ying

    2012-02-01

    This article proposes a nonparametric method for estimating the period and values of a periodic sequence when the data are evenly spaced in time. The period is estimated by a "leave-out-one-cycle" version of cross-validation (CV) and complements the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator of integer periods. This estimator is investigated both theoretically and by simulation. We also propose a nonparametric test of the null hypothesis that the data have constant mean against the alternative that the sequence of means is periodic. Finally, our methodology is demonstrated on three well-known time series: the sunspots and lynx trapping data, and the El Niño series of sea surface temperatures. © 2012 American Statistical Association and the American Society for Quality.

  9. Nonparametric predictive inference in reliability

    International Nuclear Information System (INIS)

    Coolen, F.P.A.; Coolen-Schrijner, P.; Yan, K.J.

    2002-01-01

    We introduce a recently developed statistical approach, called nonparametric predictive inference (NPI), to reliability. Bounds for the survival function for a future observation are presented. We illustrate how NPI can deal with right-censored data, and discuss aspects of competing risks. We present possible applications of NPI for Bernoulli data, and we briefly outline applications of NPI for replacement decisions. The emphasis is on introduction and illustration of NPI in reliability contexts; detailed mathematical justifications are presented elsewhere.

  10. Assessing the value of customized birth weight percentiles.

    Science.gov (United States)

    Hutcheon, Jennifer A; Walker, Mark; Platt, Robert W

    2011-02-15

    Customized birth weight percentiles are weight-for-gestational-age percentiles that account for the influence of maternal characteristics on fetal growth. Although intuitively appealing, the incremental value they provide in the identification of intrauterine growth restriction (IUGR) over conventional birth weight percentiles is controversial. The objective of this study was to assess the value of customized birth weight percentiles in a simulated cohort of 100,000 infants aged 37 weeks whose IUGR status was known. A cohort of infants with a range of healthy birth weights was first simulated on the basis of the distributions of maternal/fetal characteristics observed in births at the Royal Victoria Hospital in Montreal, Canada, between 2000 and 2006. The occurrence of IUGR was re-created by reducing the observed birth weights of a small percentage of these infants. The value of customized percentiles was assessed by calculating true and false positive rates. Customizing birth weight percentiles for maternal characteristics added very little information to the identification of IUGR beyond that obtained from conventional weight-for-gestational-age percentiles (true positive rates of 61.8% and 61.1%, respectively, and false positive rates of 7.9% and 8.5%, respectively). For the process of customization to be worthwhile, maternal characteristics in the customization model were shown through simulation to require an unrealistically strong association with birth weight.

  11. Nonparametric identification of copula structures

    KAUST Repository

    Li, Bo

    2013-06-01

    We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric and based on the asymptotic distribution of the empirical copula process. We perform simulation experiments to evaluate our test and conclude that our method is reliable and powerful for assessing common assumptions on the structure of copulas, particularly when the sample size is moderately large. We illustrate our testing approach on two datasets. © 2013 American Statistical Association.

  12. Nonparametric Mixture of Regression Models.

    Science.gov (United States)

    Huang, Mian; Li, Runze; Wang, Shaoli

    2013-07-01

    Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.

  13. University Rankings: The Web Ranking

    Science.gov (United States)

    Aguillo, Isidro F.

    2012-01-01

    The publication in 2003 of the Ranking of Universities by Jiao Tong University of Shanghai has revolutionized not only academic studies on Higher Education, but has also had an important impact on the national policies and the individual strategies of the sector. The work gathers the main characteristics of this and other global university…

  14. Scale-Free Nonparametric Factor Analysis: A User-Friendly Introduction with Concrete Heuristic Examples.

    Science.gov (United States)

    Mittag, Kathleen Cage

    Most researchers using factor analysis extract factors from a matrix of Pearson product-moment correlation coefficients. A method is presented for extracting factors in a non-parametric way, by extracting factors from a matrix of Spearman rho (rank correlation) coefficients. It is possible to factor analyze a matrix of association such that…

  15. Nonparametric correlation models for portfolio allocation

    DEFF Research Database (Denmark)

    Aslanidis, Nektarios; Casas, Isabel

    2013-01-01

    This article proposes time-varying nonparametric and semiparametric estimators of the conditional cross-correlation matrix in the context of portfolio allocation. Simulation results show that the nonparametric and semiparametric models are best in DGPs with substantial variability or structural ...... currencies. Results show the nonparametric model generally dominates the others when evaluating in-sample. However, the semiparametric model is best for out-of-sample analysis....

  16. A contingency table approach to nonparametric testing

    CERN Document Server

    Rayner, JCW

    2000-01-01

    Most texts on nonparametric techniques concentrate on location and linear-linear (correlation) tests, with less emphasis on dispersion effects and linear-quadratic tests. Tests for higher moment effects are virtually ignored. Using a fresh approach, A Contingency Table Approach to Nonparametric Testing unifies and extends the popular, standard tests by linking them to tests based on models for data that can be presented in contingency tables. This approach unifies popular nonparametric statistical inference and makes the traditional, most commonly performed nonparametric analyses much more comp

  17. Nonparametric statistics for social and behavioral sciences

    CERN Document Server

    Kraska-Miller, M

    2013-01-01

    Introduction to Research in Social and Behavioral Sciences; Basic Principles of Research; Planning for Research; Types of Research Designs; Sampling Procedures; Validity and Reliability of Measurement Instruments; Steps of the Research Process; Introduction to Nonparametric Statistics; Data Analysis; Overview of Nonparametric Statistics and Parametric Statistics; Overview of Parametric Statistics; Overview of Nonparametric Statistics; Importance of Nonparametric Methods; Measurement Instruments; Analysis of Data to Determine Association and Agreement; Pearson Chi-Square Test of Association and Independence; Contingency

  18. Nonparametric Bayesian inference in biostatistics

    CERN Document Server

    Müller, Peter

    2015-01-01

    As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...

  19. Nonparametric e-Mixture Estimation.

    Science.gov (United States)

    Takano, Ken; Hino, Hideitsu; Akaho, Shotaro; Murata, Noboru

    2016-12-01

    This study considers the common situation in data analysis when there are few observations of the distribution of interest or the target distribution, while abundant observations are available from auxiliary distributions. In this situation, it is natural to compensate for the lack of data from the target distribution by using data sets from these auxiliary distributions; in other words, approximating the target distribution in a subspace spanned by a set of auxiliary distributions. Mixture modeling is one of the simplest ways to integrate information from the target and auxiliary distributions in order to express the target distribution as accurately as possible. There are two typical mixtures in the context of information geometry: the m- and e-mixtures. The m-mixture is applied in a variety of research fields because of the presence of the well-known expectation-maximization algorithm for parameter estimation, whereas the e-mixture is rarely used because of its difficulty of estimation, particularly for nonparametric models. The e-mixture, however, is a well-tempered distribution that satisfies the principle of maximum entropy. To model a target distribution with scarce observations accurately, this letter proposes a novel framework for a nonparametric modeling of the e-mixture and a geometrically inspired estimation algorithm. As numerical examples of the proposed framework, a transfer learning setup is considered. The experimental results show that this framework works well for three types of synthetic data sets, as well as an EEG real-world data set.

  20. On Locally Most Powerful Sequential Rank Tests

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2017-01-01

    Vol. 36, No. 1 (2017), pp. 111-125. ISSN 0747-4946. R&D Projects: GA ČR GA17-07384S. Grant - others: Nadační fond na podporu vědy (CZ) Neuron. Institutional support: RVO:67985807. Keywords: nonparametric tests * sequential ranks * stopping variable. Subject RIV: BA - General Mathematics. OECD field: Pure mathematics. Impact factor: 0.339, year: 2016

  1. Percentile curves for skinfold thickness for Canadian children and youth.

    Science.gov (United States)

    Kuhle, Stefan; Ashley-Martin, Jillian; Maguire, Bryan; Hamilton, David C

    2016-01-01

    Background. Skinfold thickness (SFT) measurements are a reliable and feasible method for assessing body fat in children but their use and interpretation is hindered by the scarcity of reference values in representative populations of children. The objective of the present study was to develop age- and sex-specific percentile curves for five SFT measures (biceps, triceps, subscapular, suprailiac, medial calf) in a representative population of Canadian children and youth. Methods. We analyzed data from 3,938 children and adolescents between 6 and 19 years of age who participated in the Canadian Health Measures Survey cycles 1 (2007/2009) and 2 (2009/2011). Standardized procedures were used to measure SFT. Age- and sex-specific centiles for SFT were calculated using the GAMLSS method. Results. Percentile curves were materially different in absolute value and shape for boys and girls. Percentile curves in girls steadily increased with age whereas percentile curves in boys were characterized by a pubertal centered peak. Conclusions. The current study has presented for the first time percentile curves for five SFT measures in a representative sample of Canadian children and youth.

  2. Percentile curves for skinfold thickness for Canadian children and youth

    Directory of Open Access Journals (Sweden)

    Stefan Kuhle

    2016-07-01

    Background. Skinfold thickness (SFT) measurements are a reliable and feasible method for assessing body fat in children but their use and interpretation is hindered by the scarcity of reference values in representative populations of children. The objective of the present study was to develop age- and sex-specific percentile curves for five SFT measures (biceps, triceps, subscapular, suprailiac, medial calf) in a representative population of Canadian children and youth. Methods. We analyzed data from 3,938 children and adolescents between 6 and 19 years of age who participated in the Canadian Health Measures Survey cycles 1 (2007/2009) and 2 (2009/2011). Standardized procedures were used to measure SFT. Age- and sex-specific centiles for SFT were calculated using the GAMLSS method. Results. Percentile curves were materially different in absolute value and shape for boys and girls. Percentile curves in girls steadily increased with age whereas percentile curves in boys were characterized by a pubertal centered peak. Conclusions. The current study has presented for the first time percentile curves for five SFT measures in a representative sample of Canadian children and youth.

  3. Percentile estimation using the normal and lognormal probability distribution

    International Nuclear Information System (INIS)

    Bement, T.R.

    1980-01-01

    Implicitly or explicitly, percentile estimation is an important aspect of the analysis of aerial radiometric survey data. Standard deviation maps are produced for quadrangles which are surveyed as part of the National Uranium Resource Evaluation. These maps show where variables differ from their mean values by more than one, two or three standard deviations. Data may or may not be log-transformed prior to analysis. These maps have specific percentile interpretations only when proper distributional assumptions are met. Monte Carlo results are presented in this paper which show the consequences of estimating percentiles by: (1) assuming normality when the data are really from a lognormal distribution; and (2) assuming lognormality when the data are really from a normal distribution.
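
    The point that standard-deviation thresholds carry a fixed percentile meaning only under the assumed distribution can be illustrated with a short calculation. The sketch below is an exact computation rather than the Monte Carlo study reported in the paper, and it uses population (not estimated) moments; the function name and the lognormal parameters are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

std_normal = NormalDist()

# Percentile reached by mean + k standard deviations when the data are normal.
normal_percentiles = {k: 100 * std_normal.cdf(k) for k in (1, 2, 3)}

def lognormal_percentile_of_sd_threshold(k, mu=0.0, sigma=1.0):
    """Percentile of mean + k*SD when the data are actually lognormal.

    mu and sigma are the mean and SD of log(X); the threshold is built from
    the mean and SD of X itself, as if X were normal.
    """
    mean_x = exp(mu + sigma ** 2 / 2)
    sd_x = mean_x * sqrt(exp(sigma ** 2) - 1)
    threshold = mean_x + k * sd_x
    return 100 * std_normal.cdf((log(threshold) - mu) / sigma)

lognormal_percentiles = {
    k: lognormal_percentile_of_sd_threshold(k) for k in (1, 2, 3)
}
```

    Comparing the two dictionaries shows how far the nominal percentile of a one, two or three standard deviation contour can drift when the normality assumption is wrong.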

  4. Percentile Curves for Anthropometric Measures for Canadian Children and Youth

    Science.gov (United States)

    Kuhle, Stefan; Maguire, Bryan; Ata, Nicole; Hamilton, David

    2015-01-01

    Body mass index (BMI) is commonly used to assess a child's weight status but it does not provide information about the distribution of body fat. Since the disease risks associated with obesity are related to the amount and distribution of body fat, measures that assess visceral or subcutaneous fat, such as waist circumference (WC), waist-to-height ratio (WHtR), or skinfold thickness may be more suitable. The objective of this study was to develop percentile curves for BMI, WC, WHtR, and sum of 5 skinfolds (SF5) in a representative sample of Canadian children and youth. The analysis used data from 4115 children and adolescents between 6 and 19 years of age who participated in the Canadian Health Measures Survey Cycles 1 (2007/2009) and 2 (2009/2011). BMI, WC, WHtR, and SF5 were measured using standardized procedures. Age- and sex-specific centiles were calculated using the LMS method and the percentiles that intersect the adult cutpoints for BMI, WC, and WHtR at age 18 years were determined. Percentile curves for all measures showed an upward shift compared to curves from the pre-obesity epidemic era. The adult cutoffs for overweight and obesity corresponded to the 72nd and 91st percentile, respectively, for both sexes. The current study has presented for the first time percentile curves for BMI, WC, WHtR, and SF5 in a representative sample of Canadian children and youth. The percentile curves presented are meant to be descriptive rather than prescriptive as associations with cardiovascular disease markers or outcomes were not assessed. PMID:26176769
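
    For readers unfamiliar with the LMS method, the sketch below shows how a centile value and a z-score are obtained once the skewness (L), median (M) and coefficient of variation (S) are known for one age and sex. The conversion formulas are the usual LMS ones; the L, M, S values are hypothetical and are not taken from the study, and fitting the smooth L, M, S curves themselves is not shown.

```python
from math import exp, log
from statistics import NormalDist

def lms_centile(L, M, S, percentile):
    """Measurement value at a given percentile for one age/sex stratum."""
    z = NormalDist().inv_cdf(percentile / 100)
    if L == 0:
        return M * exp(S * z)          # limiting case of the Box-Cox power
    return M * (1 + L * S * z) ** (1 / L)

def lms_z_score(y, L, M, S):
    """Z-score of measurement y given the stratum's L, M, S values."""
    if L == 0:
        return log(y / M) / S
    return ((y / M) ** L - 1) / (L * S)

# Hypothetical L, M, S values for a single age and sex (not from the study).
L_, M_, S_ = -1.6, 17.0, 0.12
p85 = lms_centile(L_, M_, S_, 85)
z_back = lms_z_score(p85, L_, M_, S_)   # should recover the 85th percentile z
```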

  5. Non-parametric correlative uncertainty quantification and sensitivity analysis: Application to a Langmuir bimolecular adsorption model

    Science.gov (United States)

    Feng, Jinchao; Lansford, Joshua; Mironenko, Alexander; Pourkargar, Davood Babaei; Vlachos, Dionisios G.; Katsoulakis, Markos A.

    2018-03-01

    We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.

  6. Non-parametric correlative uncertainty quantification and sensitivity analysis: Application to a Langmuir bimolecular adsorption model

    Directory of Open Access Journals (Sweden)

    Jinchao Feng

    2018-03-01

    We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.

  7. Bayesian Nonparametric Longitudinal Data Analysis.

    Science.gov (United States)

    Quintana, Fernando A; Johnson, Wesley O; Waetjen, Elaine; Gold, Ellen

    2016-01-01

    Practical Bayesian nonparametric methods have been developed across a wide variety of contexts. Here, we develop a novel statistical model that generalizes standard mixed models for longitudinal data that include flexible mean functions as well as combined compound symmetry (CS) and autoregressive (AR) covariance structures. AR structure is often specified through the use of a Gaussian process (GP) with covariance functions that allow longitudinal data to be more correlated if they are observed closer in time than if they are observed farther apart. We allow for AR structure by considering a broader class of models that incorporates a Dirichlet Process Mixture (DPM) over the covariance parameters of the GP. We are able to take advantage of modern Bayesian statistical methods in making full predictive inferences and inferences about characteristics of longitudinal profiles and their differences across covariate combinations. We also take advantage of the generality of our model, which provides for estimation of a variety of covariance structures. We observe that models that fail to incorporate CS or AR structure can result in very poor estimation of a covariance or correlation matrix. In our illustration using hormone data observed on women through the menopausal transition, biology dictates the use of a generalized family of sigmoid functions as a model for time trends across subpopulation categories.

  8. Nonparametric Bayesian Modeling of Complex Networks

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Mørup, Morten

    2013-01-01

    Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using an infinite mixture model as running example, we go through the steps of deriving the model as an infinite limit of a finite parametric model, inferring the model parameters by Markov chain Monte Carlo, and checking the model's fit and predictive performance. We explain how advanced nonparametric models...

  9. Nonparametric functional mapping of quantitative trait loci.

    Science.gov (United States)

    Yang, Jie; Wu, Rongling; Casella, George

    2009-03-01

    Functional mapping is a useful tool for mapping quantitative trait loci (QTL) that control dynamic traits. It incorporates mathematical aspects of biological processes into the mixture model-based likelihood setting for QTL mapping, thus increasing the power of QTL detection and the precision of parameter estimation. However, in many situations there is no obvious functional form and, in such cases, this strategy will not be optimal. Here we propose to use nonparametric function estimation, typically implemented with B-splines, to estimate the underlying functional form of phenotypic trajectories, and then construct a nonparametric test to find evidence of existing QTL. Using the representation of a nonparametric regression as a mixed model, the final test statistic is a likelihood ratio test. We consider two types of genetic maps: dense maps and general maps, and the power of nonparametric functional mapping is investigated through simulation studies and demonstrated by examples.

  10. Essays on nonparametric econometrics of stochastic volatility

    NARCIS (Netherlands)

    Zu, Y.

    2012-01-01

    Volatility is a concept that describes the variation of financial returns. Measuring and modelling volatility dynamics is an important aspect of financial econometrics. This thesis is concerned with nonparametric approaches to volatility measurement and volatility model validation.

  11. Nonparametric methods for volatility density estimation

    NARCIS (Netherlands)

    Es, van Bert; Spreij, P.J.C.; Zanten, van J.H.

    2009-01-01

    Stochastic volatility modelling of financial processes has become increasingly popular. The proposed models usually contain a stationary volatility process. We will motivate and review several nonparametric methods for estimation of the density of the volatility process. Both models based on

  12. Intergenerational Educational Rank Mobility in 20th Century United States

    DEFF Research Database (Denmark)

    Karlson, Kristian Bernt

    2015-01-01

    ... performance of historically disadvantaged groups. To reconcile these diverging trends, I propose examining educational mobility in terms of percentile ranks in the respective schooling distributions of parents and offspring. Using a novel estimator of educational rank, I compare patterns of mobility in the overall schooling distribution both over time and among population groups defined by race and gender. METHODS & DATA: To analyze educational rank mobility, I use quantile transition matrices known from studies on intergenerational income mobility. However, because schooling distributions are quite lumpy, particularly around 12 and 16 years of schooling, percentile ranks of interest may not always be defined among parents or offspring (e.g., the lower or upper quartile may not be given by the data). To deal with this issue, I use a cohort-adjustment that deflates the schooling distribution in proportion...

  13. ASSESSING PHYSICAL DEVELOPMENT OF CHILDREN WITH PERCENTILE DIAGRAMS

    Directory of Open Access Journals (Sweden)

    Rita R. Kildiyarova

    2017-01-01

    Full Text Available The results of the analysis of methods assessing anthropometric measures in children are presented. A method for visual examination of physical development using the author's percentile diagrams for height, body weight, and the harmony of development of children of different age groups is offered. The method can be performed quickly and is recommended for mass screening examination of children under outpatient treatment. To monitor the health of a specific child, a monitoring assessment of physical development is possible. The analysis of Z-score is of great clinical importance when determining anthropometric measures below the 3rd percentile, for the assessment of premature infants with congenital malformations and other diseases, in the presence of obesity. Graphical curves of body weight, height to age, body weight according to the height of boys and girls can be used by pediatricians.

  14. Variable Selection for Regression Models of Percentile Flows

    Science.gov (United States)

    Fouad, G.

    2017-12-01

    Percentile flows describe the flow magnitude equaled or exceeded for a given percent of time, and are widely used in water resource management. However, these statistics are normally unavailable since most basins are ungauged. Percentile flows of ungauged basins are often predicted using regression models based on readily observable basin characteristics, such as mean elevation. The number of these independent variables is too large to evaluate all possible models. A subset of models is typically evaluated using automatic procedures, like stepwise regression. This ignores a large variety of methods from the field of feature (variable) selection and physical understanding of percentile flows. A study of 918 basins in the United States was conducted to compare an automatic regression procedure to the following variable selection methods: (1) principal component analysis, (2) correlation analysis, (3) random forests, (4) genetic programming, (5) Bayesian networks, and (6) physical understanding. The automatic regression procedure only performed better than principal component analysis. Poor performance of the regression procedure was due to a commonly used filter for multicollinearity, which rejected the strongest models because they had cross-correlated independent variables. Multicollinearity did not decrease model performance in validation because of a representative set of calibration basins. Variable selection methods based strictly on predictive power (numbers 2-5 from above) performed similarly, likely indicating a limit to the predictive power of the variables. Similar performance was also reached using variables selected based on physical understanding, a finding that substantiates recent calls to emphasize physical understanding in modeling for predictions in ungauged basins. The strongest variables highlighted the importance of geology and land cover, whereas widely used topographic variables were the weakest predictors. Variables suffered from a high
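
The definition at the start of this abstract (the flow equaled or exceeded for a given percent of time) can be illustrated with a minimal sketch. The function name `percentile_flow` and the linear interpolation between order statistics are assumptions made for illustration; they are not taken from the study, which concerns regression models for ungauged basins rather than this direct calculation.

```python
def percentile_flow(flows, exceedance_pct):
    """Flow equaled or exceeded `exceedance_pct` percent of the time.

    Hypothetical helper: uses linear interpolation between order
    statistics, which is one of several common conventions.
    """
    if not flows:
        raise ValueError("flows must be non-empty")
    if not 0 <= exceedance_pct <= 100:
        raise ValueError("exceedance_pct must lie in [0, 100]")
    ordered = sorted(flows)  # ascending order
    # A flow exceeded p% of the time is the (100 - p)th percentile
    # of the ascending record.
    position = (100 - exceedance_pct) / 100 * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


daily_flows = [1.0, 2.0, 3.0, 4.0, 5.0]
median_flow = percentile_flow(daily_flows, 50)  # exceeded half the time
low_flow = percentile_flow(daily_flows, 90)     # exceeded 90% of the time
```

A high exceedance percentage therefore picks out a low flow, which is why such statistics are used to characterize dry-period conditions.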

  15. Sparse structure regularized ranking

    KAUST Repository

    Wang, Jim Jing-Yan; Sun, Yijun; Gao, Xin

    2014-01-01

    Learning ranking scores is critical for the multimedia database retrieval problem. In this paper, we propose a novel ranking score learning algorithm by exploring the sparse structure and using it to regularize ranking scores. To explore the sparse

  16. Exact p-values for pairwise comparison of Friedman rank sums, with application to comparing classifiers

    NARCIS (Netherlands)

    Eisinga, R.N.; Heskes, T.M.; Pelzer, B.J.; Grotenhuis, H.F. te

    2017-01-01

    Background: The Friedman rank sum test is a widely-used nonparametric method in computational biology. In addition to examining the overall null hypothesis of no significant difference among any of the rank sums, it is typically of interest to conduct pairwise comparison tests. Current approaches to

  17. Recent Advances and Trends in Nonparametric Statistics

    CERN Document Server

    Akritas, MG

    2003-01-01

    The advent of high-speed, affordable computers in the last two decades has given a new boost to the nonparametric way of thinking. Classical nonparametric procedures, such as function smoothing, suddenly lost their abstract flavour as they became practically implementable. In addition, many previously unthinkable possibilities became mainstream; prime examples include the bootstrap and resampling methods, wavelets and nonlinear smoothers, graphical methods, data mining, bioinformatics, as well as the more recent algorithmic approaches such as bagging and boosting. This volume is a collection o

  18. Reduced Rank Regression

    DEFF Research Database (Denmark)

    Johansen, Søren

    2008-01-01

    The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...

  19. On Locally Most Powerful Sequential Rank Tests

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2017-01-01

    Vol. 36, No. 1 (2017), pp. 111-125. ISSN 0747-4946. R&D Projects: GA ČR GA17-07384S. Grant - others: Nadační fond na podporu vědy (CZ) Neuron. Institutional support: RVO:67985556. Keywords: nonparametric tests * sequential ranks * stopping variable. Subject RIV: BA - General Mathematics. OECD field: Pure mathematics. Impact factor: 0.339, year: 2016. http://library.utia.cas.cz/separaty/2017/SI/kalina-0474065.pdf

  20. Nonparametric conditional predictive regions for time series

    NARCIS (Netherlands)

    de Gooijer, J.G.; Zerom Godefay, D.

    2000-01-01

    Several nonparametric predictors based on the Nadaraya-Watson kernel regression estimator have been proposed in the literature. They include the conditional mean, the conditional median, and the conditional mode. In this paper, we consider three types of predictive regions for these predictors — the
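
The conditional mean predictor named in this abstract rests on the Nadaraya-Watson estimator, a kernel-weighted average of the observed responses. The sketch below is illustrative only: the Gaussian kernel, the function name `nadaraya_watson`, and the fixed bandwidth are assumptions, and the predictive regions studied in the paper are not reproduced here.

```python
import math


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted estimate of E[Y | X = x0] (Gaussian kernel assumed)."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    # Weight each observation by its kernel distance from x0.
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Symmetric design: the estimate at the midpoint is the average response.
estimate = nadaraya_watson(0.0, [-1.0, 1.0], [0.0, 2.0], bandwidth=0.5)
```

For time series, `xs` would typically hold lagged values and `ys` the values that followed them.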

  1. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2000-01-01

    New methods for statistical process control are presented, where the inferences have a nonparametric predictive nature. We consider several problems in process control in terms of uncertainties about future observable random quantities, and we develop inferences for these random quantities based on

  2. Nonparametric predictive inference in statistical process control

    NARCIS (Netherlands)

    Arts, G.R.J.; Coolen, F.P.A.; Laan, van der P.

    2004-01-01

    Statistical process control (SPC) is used to decide when to stop a process as confidence in the quality of the next item(s) is low. Information to specify a parametric model is not always available, and as SPC is of a predictive nature, we present a control chart developed using nonparametric

  3. Non-Parametric Estimation of Correlation Functions

    DEFF Research Database (Denmark)

    Brincker, Rune; Rytter, Anders; Krenk, Steen

    In this paper three methods of non-parametric correlation function estimation are reviewed and evaluated: the direct method, estimation by the Fast Fourier Transform and finally estimation by the Random Decrement technique. The basic ideas of the techniques are reviewed, sources of bias are point...
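
As a rough illustration of the first technique listed, the direct method averages lagged products of the signal. The sketch below assumes a zero-mean signal and uses hypothetical names (`direct_correlation`, `normalize_by_overlap`); it is not the authors' implementation. Dividing by the number of overlapping terms removes the lag-dependent bias of dividing by the full record length.

```python
def direct_correlation(x, max_lag, normalize_by_overlap=True):
    """Direct estimate of the auto-correlation function R(k), k = 0..max_lag.

    Assumes `x` is a zero-mean signal. With normalize_by_overlap=True the
    lagged products are divided by the number of overlapping terms (N - k);
    otherwise by N, which biases the estimate toward zero at large lags.
    """
    n = len(x)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(x)")
    estimates = []
    for lag in range(max_lag + 1):
        lagged_sum = sum(x[i] * x[i + lag] for i in range(n - lag))
        divisor = (n - lag) if normalize_by_overlap else n
        estimates.append(lagged_sum / divisor)
    return estimates


signal = [1.0, -1.0, 1.0, -1.0]
overlap_normalized = direct_correlation(signal, 2)
length_normalized = direct_correlation(signal, 2, normalize_by_overlap=False)
```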

  4. Nonparametric estimation in models for unobservable heterogeneity

    OpenAIRE

    Hohmann, Daniel

    2014-01-01

    Nonparametric models which allow for data with unobservable heterogeneity are studied. The first publication introduces new estimators and their asymptotic properties for conditional mixture models. The second publication considers estimation of a function from noisy observations of its Radon transform in a Gaussian white noise model.

  5. Nonparametric estimation of location and scale parameters

    KAUST Repository

    Potgieter, C.J.; Lombard, F.

    2012-01-01

    Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal

  6. A Bayesian Nonparametric Approach to Factor Analysis

    DEFF Research Database (Denmark)

    Piatek, Rémi; Papaspiliopoulos, Omiros

    2018-01-01

    This paper introduces a new approach for the inference of non-Gaussian factor models based on Bayesian nonparametric methods. It relaxes the usual normality assumption on the latent factors, widely used in practice, which is too restrictive in many settings. Our approach, on the contrary, does no...

  7. Panel data specifications in nonparametric kernel regression

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...

  8. Ranking Operations Management conferences

    NARCIS (Netherlands)

    Steenhuis, H.J.; de Bruijn, E.J.; Gupta, Sushil; Laptaned, U

    2007-01-01

    Several publications have appeared in the field of Operations Management which rank Operations Management related journals. Several ranking systems exist for journals based on, for example, perceived relevance and quality, citation, and author affiliation. Many academics also publish at conferences

  9. Coronary calcium predicts events better with absolute calcium scores than age-sex-race/ethnicity percentiles: MESA (Multi-Ethnic Study of Atherosclerosis).

    Science.gov (United States)

    Budoff, Matthew J; Nasir, Khurram; McClelland, Robyn L; Detrano, Robert; Wong, Nathan; Blumenthal, Roger S; Kondos, George; Kronmal, Richard A

    2009-01-27

    In this study, we aimed to establish whether age-sex-specific percentiles of coronary artery calcium (CAC) predict cardiovascular outcomes better than the actual (absolute) CAC score. The presence and extent of CAC correlates with the overall magnitude of coronary atherosclerotic plaque burden and with the development of subsequent coronary events. MESA (Multi-Ethnic Study of Atherosclerosis) is a prospective cohort study of 6,814 asymptomatic participants followed for coronary heart disease (CHD) events including myocardial infarction, angina, resuscitated cardiac arrest, or CHD death. Time to incident CHD was modeled with Cox regression, and we compared models with percentiles based on age, sex, and/or race/ethnicity to categories commonly used (0, 1 to 100, 101 to 400, 400+ Agatston units). There were 163 (2.4%) incident CHD events (median follow-up 3.75 years). Expressing CAC in terms of age- and sex-specific percentiles had significantly lower area under the receiver-operating characteristic curve (AUC) than when using absolute scores (women: AUC 0.73 versus 0.76, p = 0.044; men: AUC 0.73 versus 0.77, p better model fit with the overall score. Both methods robustly predicted events (>90th percentile associated with a hazard ratio [HR] of 16.4, 95% confidence interval [CI]: 9.30 to 28.9, and score >400 associated with HR of 20.6, 95% CI: 11.8 to 36.0). Within groups based on age-, sex-, and race/ethnicity-specific percentiles there remains a clear trend of increasing risk across levels of the absolute CAC groups. In contrast, once absolute CAC category is fixed, there is no increasing trend across levels of age-, sex-, and race/ethnicity-specific categories. Patients with low absolute scores are low-risk, regardless of age-, sex-, and race/ethnicity-specific percentile rank. Persons with an absolute CAC score of >400 are high risk, regardless of percentile rank. Using absolute CAC in standard groups performed better than age-, sex-, and race

  10. Parametric and Non-Parametric System Modelling

    DEFF Research Database (Denmark)

    Nielsen, Henrik Aalborg

    1999-01-01

    the focus is on combinations of parametric and non-parametric methods of regression. This combination can be in terms of additive models where e.g. one or more non-parametric term is added to a linear regression model. It can also be in terms of conditional parametric models where the coefficients...... considered. It is shown that adaptive estimation in conditional parametric models can be performed by combining the well known methods of local polynomial regression and recursive least squares with exponential forgetting. The approach used for estimation in conditional parametric models also highlights how...... networks is included. In this paper, neural networks are used for predicting the electricity production of a wind farm. The results are compared with results obtained using an adaptively estimated ARX-model. Finally, two papers on stochastic differential equations are included. In the first paper, among...

  11. Nonparametric Bayes Modeling of Multivariate Categorical Data.

    Science.gov (United States)

    Dunson, David B; Xing, Chuanhua

    2012-01-01

    Modeling of multivariate unordered categorical (nominal) data is a challenging problem, particularly in high dimensions and cases in which one wishes to avoid strong assumptions about the dependence structure. Commonly used approaches rely on the incorporation of latent Gaussian random variables or parametric latent class models. The goal of this article is to develop a nonparametric Bayes approach, which defines a prior with full support on the space of distributions for multiple unordered categorical variables. This support condition ensures that we are not restricting the dependence structure a priori. We show this can be accomplished through a Dirichlet process mixture of product multinomial distributions, which is also a convenient form for posterior computation. Methods for nonparametric testing of violations of independence are proposed, and the methods are applied to model positional dependence within transcription factor binding motifs.

  12. Network structure exploration via Bayesian nonparametric models

    International Nuclear Information System (INIS)

    Chen, Y; Wang, X L; Xiang, X; Tang, B Z; Bu, J Z

    2015-01-01

    Complex networks provide a powerful mathematical representation of complex systems in nature and society. To understand complex networks, it is crucial to explore their internal structures, also called structural regularities. The task of network structure exploration is to determine how many groups there are in a complex network and how to group the nodes of the network. Most existing structure exploration methods need to specify either a group number or a certain type of structure when they are applied to a network. In the real world, however, the group number and also the certain type of structure that a network has are usually unknown in advance. To explore structural regularities in complex networks automatically, without any prior knowledge of the group number or the certain type of structure, we extend a probabilistic mixture model that can handle networks with any type of structure but needs to specify a group number using Bayesian nonparametric theory. We also propose a novel Bayesian nonparametric model, called the Bayesian nonparametric mixture (BNPM) model. Experiments conducted on a large number of networks with different structures show that the BNPM model is able to explore structural regularities in networks automatically with a stable, state-of-the-art performance. (paper)

  13. Portfolio optimization based on nonparametric estimation methods

    Directory of Open Access Journals (Sweden)

    Mahsa Ghandehari

    2017-03-01

    Full Text Available One of the major issues investors face in capital markets is deciding which stock exchange to invest in and selecting an optimal portfolio. This process is carried out by assessing risk and expected return. In the portfolio selection problem, if the assets' expected returns are normally distributed, variance and standard deviation are used as risk measures. However, expected returns on assets are not necessarily normal and sometimes differ dramatically from the normal distribution. This paper introduces conditional value at risk (CVaR) as a measure of risk in a nonparametric framework, derives the optimal portfolio for a given expected return, and compares this method with the linear programming method. The data used in this study consist of monthly returns of 15 companies selected from the top 50 companies on the Tehran Stock Exchange during the winter of 1392, covering the period from April of 1388 to June of 1393. The results show that the nonparametric method is superior to the linear programming method and is much faster.
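
To make the risk measure concrete, the sketch below computes an empirical CVaR from a return history as the average of the worst losses beyond a confidence level. The function name `empirical_cvar`, the sample returns, and the tail-size convention are assumptions for illustration; the paper's nonparametric estimator and its portfolio optimization step are not reproduced here.

```python
import math


def empirical_cvar(returns, alpha=0.95):
    """Average loss over the worst (1 - alpha) share of observations.

    Losses are defined as negated returns. The tail size is rounded up so
    that at least one observation is always included (an assumed convention).
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    losses = sorted((-r for r in returns), reverse=True)  # worst first
    # round() guards against floating-point noise such as 1.0000000000000009
    tail_size = max(1, math.ceil(round((1 - alpha) * len(losses), 9)))
    tail = losses[:tail_size]
    return sum(tail) / len(tail)


# Made-up monthly returns, for illustration only.
monthly_returns = [0.02, -0.10, 0.01, -0.04, 0.03, 0.00, 0.05, -0.01, 0.02, 0.01]
cvar_80 = empirical_cvar(monthly_returns, alpha=0.8)  # mean of the 2 worst losses
```

Unlike variance, this measure looks only at the loss tail, so it does not rely on returns being normally distributed.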

  14. Nonparametric Mixture Models for Supervised Image Parcellation.

    Science.gov (United States)

    Sabuncu, Mert R; Yeo, B T Thomas; Van Leemput, Koen; Fischl, Bruce; Golland, Polina

    2009-09-01

    We present a nonparametric, probabilistic mixture model for the supervised parcellation of images. The proposed model yields segmentation algorithms conceptually similar to the recently developed label fusion methods, which register a new image with each training image separately. Segmentation is achieved via the fusion of transferred manual labels. We show that in our framework various settings of a model parameter yield algorithms that use image intensity information differently in determining the weight of a training subject during fusion. One particular setting computes a single, global weight per training subject, whereas another setting uses locally varying weights when fusing the training data. The proposed nonparametric parcellation approach capitalizes on recently developed fast and robust pairwise image alignment tools. The use of multiple registrations allows the algorithm to be robust to occasional registration failures. We report experiments on 39 volumetric brain MRI scans with expert manual labels for the white matter, cerebral cortex, ventricles and subcortical structures. The results demonstrate that the proposed nonparametric segmentation framework yields significantly better segmentation than state-of-the-art algorithms.

  15. Robustifying Bayesian nonparametric mixtures for count data.

    Science.gov (United States)

    Canale, Antonio; Prünster, Igor

    2017-03-01

    Our motivating application stems from surveys of natural populations and is characterized by large spatial heterogeneity in the counts, which makes parametric approaches to modeling local animal abundance too restrictive. We adopt a Bayesian nonparametric approach based on mixture models and innovate with respect to popular Dirichlet process mixture of Poisson kernels by increasing the model flexibility at the level both of the kernel and the nonparametric mixing measure. This allows to derive accurate and robust estimates of the distribution of local animal abundance and of the corresponding clusters. The application and a simulation study for different scenarios yield also some general methodological implications. Adding flexibility solely at the level of the mixing measure does not improve inferences, since its impact is severely limited by the rigidity of the Poisson kernel with considerable consequences in terms of bias. However, once a kernel more flexible than the Poisson is chosen, inferences can be robustified by choosing a prior more general than the Dirichlet process. Therefore, to improve the performance of Bayesian nonparametric mixtures for count data one has to enrich the model simultaneously at both levels, the kernel and the mixing measure. © 2016, The International Biometric Society.

  16. Sparse structure regularized ranking

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-04-17

    Learning ranking scores is critical for the multimedia database retrieval problem. In this paper, we propose a novel ranking score learning algorithm by exploring the sparse structure and using it to regularize ranking scores. To explore the sparse structure, we assume that each multimedia object could be represented as a sparse linear combination of all other objects, and combination coefficients are regarded as a similarity measure between objects and used to regularize their ranking scores. Moreover, we propose to learn the sparse combination coefficients and the ranking scores simultaneously. A unified objective function is constructed with regard to both the combination coefficients and the ranking scores, and is optimized by an iterative algorithm. Experiments on two multimedia database retrieval data sets demonstrate the significant improvements of the proposed algorithm over state-of-the-art ranking score learning algorithms.

  17. Introduction to nonparametric statistics for the biological sciences using R

    CERN Document Server

    MacFarland, Thomas W

    2016-01-01

    This book contains a rich set of tools for nonparametric analyses, and the purpose of this supplemental text is to provide guidance to students and professional researchers on how R is used for nonparametric data analysis in the biological sciences: to introduce when nonparametric approaches to data analysis are appropriate; to introduce the leading nonparametric tests commonly used in biostatistics and how R is used to generate appropriate statistics for each test; and to introduce common figures typically associated with nonparametric data analysis and how R is used to generate appropriate figures in support of each data set. The book focuses on how R is used to distinguish between data that could be classified as nonparametric as opposed to data that could be classified as parametric, with both approaches to data classification covered extensively. Following an introductory lesson on nonparametric statistics for the biological sciences, the book is organized into eight self-contained lessons on various analyses a...

  18. A nonparametric approach to medical survival data: Uncertainty in the context of risk in mortality analysis

    International Nuclear Information System (INIS)

    Janurová, Kateřina; Briš, Radim

    2014-01-01

    Right-censored medical survival data from about 850 patients are evaluated to analyze the uncertainty related to the risk of mortality and to compare two basic surgical techniques in terms of that risk. The colorectal data come from patients who underwent colectomy at the University Hospital of Ostrava. Two basic operating techniques are used for colectomy: traditional (open) or minimally invasive (laparoscopic). The basic question arising for a colectomy is which type of operation to choose to ensure longer overall survival time. Two nonparametric approaches were used to quantify the probability of mortality with uncertainties. In fact, the complement of that probability, i.e. the survival function with corresponding confidence levels, is calculated and evaluated. The first approach considers standard nonparametric estimators: the Kaplan–Meier estimator of the survival function in connection with Greenwood's formula, and the Nelson–Aalen estimator of the cumulative hazard function, including a confidence interval for the survival function. The second, innovative approach, represented by Nonparametric Predictive Inference (NPI), uses lower and upper probabilities to quantify uncertainty and provides a model of the predictive survival function instead of the population survival function. The traditional log-rank test and the nonparametric predictive comparison of two groups of lifetime data were compared to evaluate the risk of mortality for the two surgical techniques. The size of the difference between the two groups of lifetime data was also considered and analyzed. Both nonparametric approaches led to the same conclusion: the minimally invasive operating technique guarantees the patient significantly longer survival time than the traditional operating technique.
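
The first approach mentioned in this abstract can be sketched as follows: a Kaplan-Meier product-limit estimate of the survival function, with its variance taken from Greenwood's formula. The function name `kaplan_meier` and the toy data are hypothetical, and the NPI approach and the Nelson-Aalen estimator are not covered by this sketch.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate with Greenwood variance.

    times  : observed times (event or censoring)
    events : 1 if the event (death) was observed, 0 if right-censored
    Returns a list of (time, survival, variance) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have equal length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0
    results = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == current_time:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            if at_risk > deaths:
                greenwood_sum += deaths / (at_risk * (at_risk - deaths))
            results.append((current_time, survival, survival ** 2 * greenwood_sum))
        at_risk -= leaving
    return results


# Toy data: four patients, the second one is censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

A pointwise confidence interval for the survival function can then be formed from the square root of the variance term, for example as the estimate plus or minus 1.96 standard errors.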

  19. A Rank Test on Equality of Population Medians

    OpenAIRE

    Pooi Ah Hin

    2012-01-01

    The Kruskal-Wallis test is a non-parametric test for the equality of K population medians. The test statistic involved is a measure of the overall closeness of the K average ranks in the individual samples to the average rank in the combined sample. The resulting acceptance region of the test however may not be the smallest region with the required acceptance probability under the null hypothesis. Presently an alternative acceptance region is constructed such that it has the smallest size, ap...
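
The test statistic described above, a measure of how far the K average ranks lie from the average rank of the combined sample, can be sketched as below. The function name `kruskal_wallis_h` is hypothetical, ties receive midranks, and no tie correction is applied. This is the classical statistic only, not the alternative acceptance region constructed in the paper.

```python
def kruskal_wallis_h(samples):
    """Kruskal-Wallis H statistic (midranks for ties, no tie correction)."""
    if any(len(sample) == 0 for sample in samples):
        raise ValueError("every sample must be non-empty")
    pooled = sorted(
        (value, group) for group, sample in enumerate(samples) for value in sample
    )
    n_total = len(pooled)
    rank_sums = [0.0] * len(samples)
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += midrank
        i = j
    grand_mean_rank = (n_total + 1) / 2
    weighted_spread = 0.0
    for sample, rank_sum in zip(samples, rank_sums):
        mean_rank = rank_sum / len(sample)
        weighted_spread += len(sample) * (mean_rank - grand_mean_rank) ** 2
    return 12 / (n_total * (n_total + 1)) * weighted_spread


h_statistic = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

Under the null hypothesis and with moderately large samples, H is usually compared against a chi-square distribution with K - 1 degrees of freedom.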

  20. Non-parametric smoothing of experimental data

    International Nuclear Information System (INIS)

    Kuketayev, A.T.; Pen'kov, F.M.

    2007-01-01

    Full text: Rapid processing of experimental data samples in nuclear physics often requires differentiation in order to find extrema. Therefore, even at the preliminary stage of data analysis, a range of noise reduction methods are used to smooth experimental data. There are many non-parametric smoothing techniques: interval averages, moving averages, exponential smoothing, etc. Nevertheless, it is more common to use a priori information about the behavior of the experimental curve in order to construct smoothing schemes based on the least squares techniques. The latter methodology's advantage is that the area under the curve can be preserved, which is equivalent to conservation of the total counting rate. The disadvantages of this approach include the lack of a priori information. For example, very often the sums of peaks not resolved by a detector are replaced with one peak during the processing of data, introducing uncontrolled errors in the determination of the physical quantities. The problem is solvable only by having experienced personnel, whose skills are much greater than the challenge. We propose a set of non-parametric techniques, which allows the use of any additional information on the nature of the experimental dependence. The method is based on the construction of a functional, which includes both experimental data and a priori information. The minimum of this functional is reached on a non-parametric smoothed curve. Euler (Lagrange) differential equations are constructed for these curves; then their solutions are obtained analytically or numerically. The proposed approach allows for automated processing of nuclear physics data, eliminating the need for highly skilled laboratory personnel. The proposed approach also makes it possible to obtain smoothing curves in a given confidence interval, e.g. according to the χ² distribution. This approach is applicable when constructing smooth solutions of ill-posed problems, in particular when solving
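
Of the simple techniques listed in this abstract, exponential smoothing is perhaps the shortest to write down. The sketch below is a generic textbook form with a hypothetical function name and smoothing weight; it is not the functional-minimization method the authors propose.

```python
def exponential_smoothing(values, weight):
    """Simple exponential smoothing: s[t] = weight * x[t] + (1 - weight) * s[t-1].

    `weight` in (0, 1] controls how strongly new points override the
    running estimate; the first smoothed value equals the first data point.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < weight <= 1:
        raise ValueError("weight must lie in (0, 1]")
    smoothed = [float(values[0])]
    for value in values[1:]:
        smoothed.append(weight * value + (1 - weight) * smoothed[-1])
    return smoothed


noisy_counts = [0.0, 10.0, 0.0, 10.0]
smoothed_counts = exponential_smoothing(noisy_counts, weight=0.5)
```

Schemes of this kind do not in general preserve the area under the curve, which is one of the concerns the abstract raises.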

  1. Decompounding random sums: A nonparametric approach

    DEFF Research Database (Denmark)

    Hansen, Martin Bøgsted; Pitts, Susan M.

    Observations from sums of random variables with a random number of summands, known as random, compound or stopped sums arise within many areas of engineering and science. Quite often it is desirable to infer properties of the distribution of the terms in the random sum. In the present paper we...... review a number of applications and consider the nonlinear inverse problem of inferring the cumulative distribution function of the components in the random sum. We review the existing literature on non-parametric approaches to the problem. The models amenable to the analysis are generalized considerably...

  2. A Nonparametric Test for Seasonal Unit Roots

    OpenAIRE

    Kunst, Robert M.

    2009-01-01

    Abstract: We consider a nonparametric test for the null of seasonal unit roots in quarterly time series that builds on the RUR (records unit root) test by Aparicio, Escribano, and Sipols. We find that the test concept is more promising than a formalization of visual aids such as plots by quarter. In order to cope with the sensitivity of the original RUR test to autocorrelation under its null of a unit root, we suggest an augmentation step by autoregression. We present some evidence on the siz...

  3. A convenient method of obtaining percentile norms and accompanying interval estimates for self-report mood scales (DASS, DASS-21, HADS, PANAS, and sAD).

    Science.gov (United States)

    Crawford, John R; Garthwaite, Paul H; Lawrie, Caroline J; Henry, Julie D; MacDonald, Marie A; Sutherland, Jane; Sinha, Priyanka

    2009-06-01

    A series of recent papers have reported normative data from the general adult population for commonly used self-report mood scales. To bring together and supplement these data in order to provide a convenient means of obtaining percentile norms for the mood scales. A computer program was developed that provides point and interval estimates of the percentile rank corresponding to raw scores on the various self-report scales. The program can be used to obtain point and interval estimates of the percentile rank of an individual's raw scores on the DASS, DASS-21, HADS, PANAS, and sAD mood scales, based on normative sample sizes ranging from 758 to 3822. The interval estimates can be obtained using either classical or Bayesian methods as preferred. The computer program (which can be downloaded at www.abdn.ac.uk/~psy086/dept/MoodScore.htm) provides a convenient and reliable means of supplementing existing cut-off scores for self-report mood scales.
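
    As a rough illustration of what a point and interval estimate of a percentile rank involves, the sketch below computes a percentile rank from a normative sample and attaches a Wilson score interval. It is not the MoodScore program described above: the tie-handling convention, the choice of the Wilson interval, and the `norms` data are all assumptions made for the example.

```python
import math


def percentile_rank(norm_scores, raw_score):
    """Percentile rank of raw_score, counting half of the tied scores as below."""
    n = len(norm_scores)
    below = sum(1 for s in norm_scores if s < raw_score)
    equal = sum(1 for s in norm_scores if s == raw_score)
    return 100.0 * (below + 0.5 * equal) / n


def wilson_interval(p, n, z=1.96):
    """Wilson score interval for a proportion p estimated from n observations."""
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical normative sample (illustration only, not real norms).
norms = [3, 5, 5, 7, 8, 10, 12, 12, 15, 20]
pr = percentile_rank(norms, 8)
low, high = wilson_interval(pr / 100.0, len(norms))
print(f"percentile rank = {pr:.1f}, interval = ({100 * low:.1f}, {100 * high:.1f})")
```

    With a normative sample of only ten scores the interval is very wide, which is the practical reason interval estimates matter alongside the point estimate.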

  4. Bayesian Nonparametric Clustering for Positive Definite Matrices.

    Science.gov (United States)

    Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos

    2016-05-01

    Symmetric Positive Definite (SPD) matrices emerge as data descriptors in several applications of computer vision such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these data matrices forms an integral part of these applications, for which soft-clustering algorithms (K-Means, expectation maximization, etc.) are generally used. As is well-known, these algorithms need the number of clusters to be specified, which is difficult when the dataset scales. To address this issue, we resort to the classical nonparametric Bayesian framework by modeling the data as a mixture model using the Dirichlet process (DP) prior. Since these matrices do not conform to the Euclidean geometry, but rather belong to a curved Riemannian manifold, existing DP models cannot be directly applied. Thus, in this paper, we propose a novel DP mixture model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure to compare these matrices, and further using the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that our model is scalable to the dataset size and at the same time achieves superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms.

  5. How to Rank Journals.

    Science.gov (United States)

    Bradshaw, Corey J A; Brook, Barry W

    2016-01-01

    There are now many methods available to assess the relative citation performance of peer-reviewed journals. Regardless of their individual faults and advantages, citation-based metrics are used by researchers to maximize the citation potential of their articles, and by employers to rank academic track records. The absolute value of any particular index is arguably meaningless unless compared to other journals, and different metrics result in divergent rankings. To provide a simple yet more objective way to rank journals within and among disciplines, we developed a κ-resampled composite journal rank incorporating five popular citation indices: Impact Factor, Immediacy Index, Source-Normalized Impact Per Paper, SCImago Journal Rank and Google 5-year h-index; this approach provides an index of relative rank uncertainty. We applied the approach to six sample sets of scientific journals from Ecology (n = 100 journals), Medicine (n = 100), Multidisciplinary (n = 50); Ecology + Multidisciplinary (n = 25), Obstetrics & Gynaecology (n = 25) and Marine Biology & Fisheries (n = 25). We then cross-compared the κ-resampled ranking for the Ecology + Multidisciplinary journal set to the results of a survey of 188 publishing ecologists who were asked to rank the same journals, and found a 0.68-0.84 Spearman's ρ correlation between the two rankings datasets. Our composite index approach therefore approximates relative journal reputation, at least for that discipline. Agglomerative and divisive clustering and multi-dimensional scaling techniques applied to the Ecology + Multidisciplinary journal set identified specific clusters of similarly ranked journals, with only Nature & Science separating out from the others. When comparing a selection of journals within or among disciplines, we recommend collecting multiple citation-based metrics for a sample of relevant and realistic journals to calculate the composite rankings and their relative uncertainty windows.
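
    The agreement between two rankings reported above is measured with Spearman's ρ, which is the Pearson correlation of the rank-transformed values. A minimal sketch follows, assuming tied values receive their average rank; the function names and the sample data are illustrative and are not taken from the paper.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Hypothetical composite scores and survey scores for five journals.
composite = [1, 2, 3, 4, 5]
survey = [5, 6, 7, 8, 7]
print(spearman_rho(composite, survey))
```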

  6. On Page Rank

    NARCIS (Netherlands)

    Hoede, C.

    In this paper the concept of page rank for the world wide web is discussed. The possibility of describing the distribution of page rank by an exponential law is considered. It is shown that the concept is essentially equal to that of status score, a centrality measure discussed already in 1953 by

  7. On Rank and Nullity

    Science.gov (United States)

    Dobbs, David E.

    2012-01-01

    This note explains how Emil Artin's proof that row rank equals column rank for a matrix with entries in a field leads naturally to the formula for the nullity of a matrix and also to an algorithm for solving any system of linear equations in any number of variables. This material could be used in any course on matrix theory or linear algebra.

  8. Hitting the Rankings Jackpot

    Science.gov (United States)

    Chapman, David W.

    2008-01-01

    Recently, Samford University was ranked 27th in the nation in a report released by "Forbes" magazine. In this article, the author relates how the people working at Samford University were surprised at its ranking. Although Samford is the largest private institution in Alabama, its distinguished academic achievements aren't even…

  9. Ranking Decision Making Units with Stochastic Data by Using Coefficient of Variation

    OpenAIRE

    Lotfi, F.; Nematollahi, N.; Behzadi, M.H.; Mirbolouki, M.

    2010-01-01

    Data Envelopment Analysis (DEA) is a non-parametric technique which is based on mathematical programming for evaluating the efficiency of a set of Decision Making Units (DMUs). In applications, managers encounter stochastic data, and the need for a method that is able to evaluate efficiency and rank efficient units has been under consideration. In this paper, considering the concept of coefficient of variation among efficient DMUs, two ranking methods have been proposed. ...

  10. Complete hazard ranking to analyze right-censored data: An ALS survival study.

    Directory of Open Access Journals (Sweden)

    Zhengnan Huang

    2017-12-01

    Survival analysis represents an important outcome measure in clinical research and clinical trials; further, survival ranking may offer additional advantages in clinical trials. In this study, we developed GuanRank, a non-parametric ranking-based technique to transform patients' survival data into a linear space of hazard ranks. The transformation enables the utilization of machine learning base-learners including Gaussian process regression, Lasso, and random forest on survival data. The method was submitted to the DREAM Amyotrophic Lateral Sclerosis (ALS) Stratification Challenge. Ranked first place, the model gave more accurate ranking predictions on the PRO-ACT ALS dataset in comparison to Cox proportional hazard model. By utilizing right-censored data in its training process, the method demonstrated its state-of-the-art predictive power in ALS survival ranking. Its feature selection identified multiple important factors, some of which conflict with previous studies.

  11. Complete hazard ranking to analyze right-censored data: An ALS survival study.

    Science.gov (United States)

    Huang, Zhengnan; Zhang, Hongjiu; Boss, Jonathan; Goutman, Stephen A; Mukherjee, Bhramar; Dinov, Ivo D; Guan, Yuanfang

    2017-12-01

    Survival analysis represents an important outcome measure in clinical research and clinical trials; further, survival ranking may offer additional advantages in clinical trials. In this study, we developed GuanRank, a non-parametric ranking-based technique to transform patients' survival data into a linear space of hazard ranks. The transformation enables the utilization of machine learning base-learners including Gaussian process regression, Lasso, and random forest on survival data. The method was submitted to the DREAM Amyotrophic Lateral Sclerosis (ALS) Stratification Challenge. Ranked first place, the model gave more accurate ranking predictions on the PRO-ACT ALS dataset in comparison to Cox proportional hazard model. By utilizing right-censored data in its training process, the method demonstrated its state-of-the-art predictive power in ALS survival ranking. Its feature selection identified multiple important factors, some of which conflict with previous studies.

  12. On Parametric (and Non-Parametric) Variation

    Directory of Open Access Journals (Sweden)

    Neil Smith

    2009-11-01

    This article raises the issue of the correct characterization of ‘Parametric Variation’ in syntax and phonology. After specifying their theoretical commitments, the authors outline the relevant parts of the Principles-and-Parameters framework, and draw a three-way distinction among Universal Principles, Parameters, and Accidents. The core of the contribution then consists of an attempt to provide identity criteria for parametric, as opposed to non-parametric, variation. Parametric choices must be antecedently known, and it is suggested that they must also satisfy seven individually necessary and jointly sufficient criteria. These are that they be cognitively represented, systematic, dependent on the input, deterministic, discrete, mutually exclusive, and irreversible.

  13. Nonparametric predictive pairwise comparison with competing risks

    International Nuclear Information System (INIS)

    Coolen-Maturi, Tahani

    2014-01-01

    In reliability, failure data often correspond to competing risks, where several failure modes can cause a unit to fail. This paper presents nonparametric predictive inference (NPI) for pairwise comparison with competing risks data, assuming that the failure modes are independent. These failure modes could be the same or different between the two groups, and these can be both observed and unobserved failure modes. NPI is a statistical approach based on few assumptions, with inferences strongly based on data and with uncertainty quantified via lower and upper probabilities. The focus is on the lower and upper probabilities for the event that the lifetime of a future unit from one group, say Y, is greater than the lifetime of a future unit from the second group, say X. The paper also shows how the two groups can be compared based on particular failure mode(s), and the comparison of the two groups when some of the competing risks are combined is discussed.

  14. Nonparametric estimation of location and scale parameters

    KAUST Repository

    Potgieter, C.J.

    2012-12-01

    Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal assumptions regarding the form of the distribution functions of X and Y. We discuss an approach to the estimation problem that is based on asymptotic likelihood considerations. Our results enable us to provide a methodology that can be implemented easily and which yields estimators that are often near optimal when compared to fully parametric methods. We evaluate the performance of the estimators in a series of Monte Carlo simulations. © 2012 Elsevier B.V. All rights reserved.
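
    To make the location-scale relationship concrete: if Y and μ+σX have the same distribution with σ > 0, then every quantile of Y equals μ plus σ times the matching quantile of X. The sketch below uses that fact with a simple quantile-matching estimator (interquartile ranges for σ, medians for μ). This is an assumed, deliberately simple stand-in and not the asymptotic-likelihood approach of the paper; the function name and the data are hypothetical.

```python
import statistics


def location_scale_estimate(x, y):
    """Estimate (mu, sigma) with Y ~ mu + sigma * X, assuming sigma > 0."""
    qx = statistics.quantiles(x, n=4)  # [Q1, median, Q3] of the X sample
    qy = statistics.quantiles(y, n=4)  # [Q1, median, Q3] of the Y sample
    sigma = (qy[2] - qy[0]) / (qx[2] - qx[0])  # ratio of interquartile ranges
    mu = qy[1] - sigma * qx[1]                 # align the medians
    return mu, sigma


# Hypothetical data in which Y is an exact affine transform of X.
x = [float(v) for v in range(1, 102)]
y = [2.0 + 3.0 * v for v in x]
mu_hat, sigma_hat = location_scale_estimate(x, y)
print(mu_hat, sigma_hat)
```

    With two independent samples the estimates would only be approximate, and a quantile-matching estimator of this kind is generally less efficient than likelihood-based alternatives.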

  15. Nonparametric inference of network structure and dynamics

    Science.gov (United States)

    Peixoto, Tiago P.

    The network structure of complex systems determines their function and serves as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among

  16. Recurrent fuzzy ranking methods

    Science.gov (United States)

    Hajjari, Tayebeh

    2012-11-01

    With the increasing development of fuzzy set theory in various scientific fields and the need to compare fuzzy numbers in different areas, ranking of fuzzy numbers plays a very important role in linguistic decision-making, engineering, business and some other fuzzy application systems. Several strategies have been proposed for ranking of fuzzy numbers. Each of these techniques has been shown to produce non-intuitive results in certain cases. In this paper, we review some recent ranking methods, which will be useful for researchers who are interested in this area.

  17. A NONPARAMETRIC HYPOTHESIS TEST VIA THE BOOTSTRAP RESAMPLING

    OpenAIRE

    Temel, Tugrul T.

    2001-01-01

    This paper adapts an already existing nonparametric hypothesis test to the bootstrap framework. The test utilizes the nonparametric kernel regression method to estimate a measure of distance between the models stated under the null hypothesis. The bootstrapped version of the test makes it possible to approximate the errors involved in the asymptotic hypothesis test. The paper also develops Mathematica code for the test algorithm.

  18. Simple nonparametric checks for model data fit in CAT

    NARCIS (Netherlands)

    Meijer, R.R.

    2005-01-01

    In this paper, the usefulness of several nonparametric checks is discussed in a computerized adaptive testing (CAT) context. Although there is no tradition of nonparametric scalability in CAT, it can be argued that scalability checks can be useful to investigate, for example, the quality of item

  19. Nonparametric Bayesian inference for multidimensional compound Poisson processes

    NARCIS (Netherlands)

    Gugushvili, S.; van der Meulen, F.; Spreij, P.

    2015-01-01

    Given a sample from a discretely observed multidimensional compound Poisson process, we study the problem of nonparametric estimation of its jump size density r0 and intensity λ0. We take a nonparametric Bayesian approach to the problem and determine posterior contraction rates in this context,

  20. Nonparametric analysis of blocked ordered categories data: some examples revisited

    Directory of Open Access Journals (Sweden)

    O. Thas

    2006-08-01

    Nonparametric analysis for general block designs can be given by using the Cochran-Mantel-Haenszel (CMH) statistics. We demonstrate this with four examples and note that several well-known nonparametric statistics are special cases of CMH statistics.

  1. A Structural Labor Supply Model with Nonparametric Preferences

    NARCIS (Netherlands)

    van Soest, A.H.O.; Das, J.W.M.; Gong, X.

    2000-01-01

    Nonparametric techniques are usually seen as a statistical device for data description and exploration, and not as a tool for estimating models with a richer economic structure, which are often required for policy analysis. This paper presents an example where nonparametric flexibility can be attained

  2. Ranking as parameter estimation

    Czech Academy of Sciences Publication Activity Database

    Kárný, Miroslav; Guy, Tatiana Valentine

    2009-01-01

    Vol. 4, No. 2 (2009), pp. 142-158 ISSN 1745-7645 R&D Projects: GA MŠk 2C06001; GA AV ČR 1ET100750401; GA MŠk 1M0572 Institutional research plan: CEZ:AV0Z10750506 Keywords: ranking * Bayesian estimation * negotiation * modelling Subject RIV: BB - Applied Statistics, Operational Research http://library.utia.cas.cz/separaty/2009/AS/karny- ranking as parameter estimation.pdf

  3. Hierarchical partial order ranking

    International Nuclear Information System (INIS)

    Carlsen, Lars

    2008-01-01

    Assessing the potential impact on environmental and human health from the production and use of chemicals or from polluted sites involves a multi-criteria evaluation scheme. A priori, several parameters have to be addressed, e.g., production tonnage, specific release scenarios, geographical and site-specific factors, in addition to various substance-dependent parameters. Further, socio-economic factors may be taken into consideration. The number of parameters to be included may well appear to be prohibitive for developing a sensible model. The study introduces hierarchical partial order ranking (HPOR), which remedies this problem. In HPOR the original parameters are initially grouped based on their mutual connection, and a set of meta-descriptors is derived representing the ranking corresponding to the single groups of descriptors, respectively. A second partial order ranking is carried out based on the meta-descriptors, the final ranking being disclosed through average ranks. An illustrative example on the prioritisation of polluted sites is given. - Hierarchical partial order ranking of polluted sites has been developed for prioritization based on a large number of parameters

  4. 2nd Conference of the International Society for Nonparametric Statistics

    CERN Document Server

    Manteiga, Wenceslao; Romo, Juan

    2016-01-01

    This volume collects selected, peer-reviewed contributions from the 2nd Conference of the International Society for Nonparametric Statistics (ISNPS), held in Cádiz (Spain) between June 11–16 2014, and sponsored by the American Statistical Association, the Institute of Mathematical Statistics, the Bernoulli Society for Mathematical Statistics and Probability, the Journal of Nonparametric Statistics and Universidad Carlos III de Madrid. The 15 articles are a representative sample of the 336 contributed papers presented at the conference. They cover topics such as high-dimensional data modelling, inference for stochastic processes and for dependent data, nonparametric and goodness-of-fit testing, nonparametric curve estimation, object-oriented data analysis, and semiparametric inference. The aim of the ISNPS 2014 conference was to bring together recent advances and trends in several areas of nonparametric statistics in order to facilitate the exchange of research ideas, promote collaboration among researchers...

  5. Age, Gender, and Race-Based Coronary Artery Calcium Score Percentiles in the Brazilian Longitudinal Study of Adult Health (ELSA-Brasil).

    Science.gov (United States)

    Pereira, Alexandre C; Gomez, Luz M; Bittencourt, Marcio Sommer; Staniak, Henrique Lane; Sharovsky, Rodolfo; Foppa, Murilo; Blaha, Michael J; Bensenor, Isabela M; Lotufo, Paulo A

    2016-06-01

    Coronary artery calcium (CAC) has been demonstrated to independently predict the risk of cardiovascular events and all-cause mortality, especially among White populations. Although the population distribution of CAC has been determined for several White populations, the distribution in ethnically admixed groups has not been well established. The CAC distribution, stratified for age, gender and race, is similar to the previously described distribution in the MESA study. The Brazilian Longitudinal Study of Adult Health (ELSA-Brasil) is a prospective cohort study designed to investigate subclinical cardiovascular disease in 6 different centers of Brazil. Similar to previous studies, individuals with self-reported coronary or cerebrovascular disease and those treated for diabetes mellitus were excluded from analysis. Percentiles of CAC distribution were estimated with nonparametric techniques. The analysis included 3616 individuals (54% female; mean age, 50 years). As expected, CAC prevalence and burden were steadily higher with increasing age, as well as increased in men and in White individuals. Our results revealed that for a given CAC score, the ELSA-derived CAC percentile would be lower in men compared with the Multi-Ethnic Study of Atherosclerosis (MESA) and would be higher in women compared with MESA. In our sample of the Brazilian population, we observed significant differences in CAC by sex, age, and race. Adjusted for age and sex, low-risk individuals from the Brazilian population present with significantly lower CAC prevalence and burden compared with other low-risk individuals from other worldwide populations. Using US-derived percentiles in Brazilian individuals may lead to overestimating relative CAC burden in men and underestimating relative CAC burden in women. © 2016 Wiley Periodicals, Inc.

  6. Waist circumference percentile curves for Malaysian children and adolescents aged 6.0-16.9 years.

    Science.gov (United States)

    Poh, Bee Koon; Jannah, Ahmad Nurul; Chong, Lai Khuen; Ruzita, Abd Talib; Ismail, Mohd Noor; McCarthy, David

    2011-08-01

    The prevalence of obesity is increasing rapidly and abdominal obesity especially is known to be a risk factor for metabolic syndrome and other non-communicable diseases. Waist circumference percentile curves are useful tools which can help to identify abdominal obesity among the childhood and adolescent populations. To develop age- and sex-specific waist circumference (WC) percentile curves for multi-ethnic Malaysian children and adolescents aged 6.0-16.9 years. Subjects and methods: A total of 16,203 participants comprising 8,093 boys and 8,110 girls recruited from all regions of Malaysia were involved in this study. Height, weight and WC were measured and BMI calculated. Smoothed WC percentile curves and values for the 3rd, 5th, 10th, 25th, 50th, 75th, 90th, 95th and 97th percentiles were constructed using the LMS Method. WC was found to increase with age in both sexes, but boys had higher WC values at every age and percentile. Z-scores generated using the UK reference data show that Chinese children had the highest WC compared to Malays, Indians and other ethnicities. Comparisons with other studies indicate that at the 50th percentile, Malaysian curves did not differ from the UK, Hong Kong and Turkish curves, but at the 90th percentile, Malaysian curves were higher compared with other countries, starting at 10 years of age. The 90th percentile was adopted as the cut-off point to indicate abdominal obesity in Malaysian children and adolescents. These curves represent the first WC percentiles reported for Malaysian children, and they can serve as a reference for future studies.
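
    For readers unfamiliar with the LMS Method, it summarizes the distribution at each age by three values: a Box-Cox power L, a median M, and a coefficient of variation S. A measurement x is converted to a z-score by z = ((x/M)^L - 1)/(L·S), or z = ln(x/M)/S when L = 0, and a centile is recovered by inverting that expression. The sketch below shows both directions; the L, M, S values are invented for illustration and are not taken from the Malaysian reference.

```python
import math
from statistics import NormalDist


def lms_zscore(x, L, M, S):
    """Convert a measurement x to a z-score given LMS parameters."""
    if abs(L) < 1e-12:  # limiting case L -> 0 is the log transform
        return math.log(x / M) / S
    return ((x / M) ** L - 1.0) / (L * S)


def lms_centile(z, L, M, S):
    """Measurement corresponding to z-score z (inverse of lms_zscore)."""
    if abs(L) < 1e-12:
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


# Hypothetical LMS values for one age and sex group (illustration only).
L, M, S = -1.5, 60.0, 0.1
z90 = NormalDist().inv_cdf(0.90)  # z-score of the 90th percentile
cutoff = lms_centile(z90, L, M, S)
print(f"90th percentile cut-off: {cutoff:.1f} cm")
print(f"z-score of 70 cm: {lms_zscore(70.0, L, M, S):.2f}")
```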

  7. Ranking Practice Variability in the Medical Student Performance Evaluation: So Bad, It's "Good".

    Science.gov (United States)

    Boysen Osborn, Megan; Mattson, James; Yanuck, Justin; Anderson, Craig; Tekian, Ara; Fox, John Christian; Harris, Ilene B

    2016-11-01

    To examine the variability among medical schools in ranking systems used in medical student performance evaluations (MSPEs). The authors reviewed MSPEs from U.S. MD-granting medical schools received by the University of California, Irvine emergency medicine and internal medicine residency programs during 2012-2013 and 2014-2015. They recorded whether the school used a ranking system, the type of ranking system used, the size and description of student categories, the location of the ranking statement and category legend, and whether nonranking schools used language suggestive of rank. Of the 134 medical schools in the study sample, the majority (n = 101; 75%) provided ranks for students in the MSPE. Most of the ranking schools (n = 63; 62%) placed students into named category groups, but the number and size of groups varied. The most common descriptors used for these 63 schools' top, second, third, and lowest groups were "outstanding," "excellent," "very good," and "good," respectively, but each of these terms was used across a broad range of percentile ranks. Student ranks and school category legends were found in various locations. Many of the 33 schools that did not rank students included language suggestive of rank. There is extensive variation in ranking systems used in MSPEs. Program directors may find it difficult to use MSPEs to compare applicants, which may diminish the MSPE's value in the residency application process and negatively affect high-achieving students. A consistent approach to ranking students would benefit program directors, students, and student affairs officers.

  8. Multiplex PageRank.

    Science.gov (United States)

    Halu, Arda; Mondragón, Raúl J; Panzarasa, Pietro; Bianconi, Ginestra

    2013-01-01

    Many complex systems can be described as multiplex networks in which the same nodes can interact with one another in different layers, thus forming a set of interacting and co-evolving networks. Examples of such multiplex systems are social networks where people are involved in different types of relationships and interact through various forms of communication media. The ranking of nodes in multiplex networks is one of the most pressing and challenging tasks that research on complex networks is currently facing. When pairs of nodes can be connected through multiple links and in multiple layers, the ranking of nodes should necessarily reflect the importance of nodes in one layer as well as their importance in other interdependent layers. In this paper, we draw on the idea of biased random walks to define the Multiplex PageRank centrality measure in which the effects of the interplay between networks on the centrality of nodes are directly taken into account. In particular, depending on the intensity of the interaction between layers, we define the Additive, Multiplicative, Combined, and Neutral versions of Multiplex PageRank, and show how each version reflects the extent to which the importance of a node in one layer affects the importance the node can gain in another layer. We discuss these measures and apply them to an online multiplex social network. Findings indicate that taking the multiplex nature of the network into account helps uncover the emergence of rankings of nodes that differ from the rankings obtained from one single layer. Results provide support in favor of the salience of multiplex centrality measures, like Multiplex PageRank, for assessing the prominence of nodes embedded in multiple interacting networks, and for shedding a new light on structural properties that would otherwise remain undetected if each of the interacting networks were analyzed in isolation.
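
    As background for the measure described above, the sketch below computes ordinary single-layer PageRank by power iteration. It is only the baseline that the multiplex variants build on: the Additive, Multiplicative, Combined, and Neutral versions are not implemented here, and the damping factor of 0.85 and the toy graph are assumptions for the example.

```python
def pagerank(adj, damping=0.85, tol=1e-10, max_iter=1000):
    """Single-layer PageRank; adj[i] lists the nodes that node i links to."""
    n = len(adj)
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Every node receives the teleportation share first.
        new = [(1.0 - damping) / n] * n
        for i, targets in enumerate(adj):
            if targets:
                share = damping * rank[i] / len(targets)
                for j in targets:
                    new[j] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                for j in range(n):
                    new[j] += damping * rank[i] / n
        if sum(abs(a - b) for a, b in zip(new, rank)) < tol:
            return new
        rank = new
    return rank


# Hypothetical four-node directed graph; node 3 has no incoming links.
graph = [[1, 2], [2], [0], [2]]
scores = pagerank(graph)
print([round(s, 4) for s in scores])
```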

  9. Multiplex PageRank.

    Directory of Open Access Journals (Sweden)

    Arda Halu

    Many complex systems can be described as multiplex networks in which the same nodes can interact with one another in different layers, thus forming a set of interacting and co-evolving networks. Examples of such multiplex systems are social networks where people are involved in different types of relationships and interact through various forms of communication media. The ranking of nodes in multiplex networks is one of the most pressing and challenging tasks that research on complex networks is currently facing. When pairs of nodes can be connected through multiple links and in multiple layers, the ranking of nodes should necessarily reflect the importance of nodes in one layer as well as their importance in other interdependent layers. In this paper, we draw on the idea of biased random walks to define the Multiplex PageRank centrality measure in which the effects of the interplay between networks on the centrality of nodes are directly taken into account. In particular, depending on the intensity of the interaction between layers, we define the Additive, Multiplicative, Combined, and Neutral versions of Multiplex PageRank, and show how each version reflects the extent to which the importance of a node in one layer affects the importance the node can gain in another layer. We discuss these measures and apply them to an online multiplex social network. Findings indicate that taking the multiplex nature of the network into account helps uncover the emergence of rankings of nodes that differ from the rankings obtained from one single layer. Results provide support in favor of the salience of multiplex centrality measures, like Multiplex PageRank, for assessing the prominence of nodes embedded in multiple interacting networks, and for shedding a new light on structural properties that would otherwise remain undetected if each of the interacting networks were analyzed in isolation.

  10. Groundwater contaminant plume ranking

    International Nuclear Information System (INIS)

    1988-08-01

    Contaminant plumes at Uranium Mill Tailings Remedial Action (UMTRA) Project sites were ranked to assist in Subpart B (i.e., restoration requirements of 40 CFR Part 192) compliance strategies for each site, to prioritize aquifer restoration, and to budget future requests and allocations. The rankings roughly estimate hazards to the environment and human health, and thus assist in determining for which sites cleanup, if appropriate, will provide the greatest benefits for funds available. The rankings are based on the scores that were obtained using the US Department of Energy's (DOE) Modified Hazard Ranking System (MHRS). The MHRS and HRS consider and score three hazard modes for a site: migration, fire and explosion, and direct contact. The migration hazard mode score reflects the potential for harm to humans or the environment from migration of a hazardous substance off a site by groundwater, surface water, and air; it is a composite of separate scores for each of these routes. For ranking the contaminant plumes at UMTRA Project sites, it was assumed that each site had been remediated in compliance with the EPA standards and that relict contaminant plumes were present. Therefore, only the groundwater route was scored, and the surface water and air routes were not considered. Section 2.0 of this document describes the assumptions and procedures used to score the groundwater route, and Section 3.0 provides the resulting scores for each site. 40 tabs

  11. Nonparametric methods in actigraphy: An update

    Directory of Open Access Journals (Sweden)

    Bruno S.B. Gonçalves

    2014-09-01

    Full Text Available Circadian rhythmicity in humans has been well studied using actigraphy, a method of measuring gross motor movement. As actigraphic technology continues to evolve, it is important for data analysis to keep pace with new variables and features. Our objective is to study the behavior of two variables, interdaily stability (IS) and intradaily variability (IV), to describe rest-activity rhythm. Simulated data and actigraphy data of humans, rats, and marmosets were used in this study. We modified the method of calculation for IV and IS by modifying the time intervals of analysis. For each variable, we calculated the average value (IVm and ISm) of the results for each time interval. Simulated data showed that (1) synchronization analysis depends on sample size, and (2) fragmentation is independent of the amplitude of the generated noise. We were able to obtain a significant difference in the fragmentation patterns of stroke patients using an IVm variable, while the variable IV60 was not identified. Rhythmic synchronization of activity and rest was significantly higher in young adults than in adults with Parkinson's when using the ISm variable; however, this difference was not seen using IS60. We propose an updated format to calculate rhythmic fragmentation, including two additional optional variables. These alternative methods of nonparametric analysis aim to more precisely detect sleep–wake cycle fragmentation and synchronization.
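
    As a rough illustration of the two variables discussed in this record, the sketch below computes interdaily stability and intradaily variability using the commonly cited variance-ratio definitions, which the abstract itself does not spell out. The function names, the `resample` helper, and the idea of averaging over several bin widths to obtain ISm and IVm are assumptions made for illustration, not the authors' code.

```python
from statistics import fmean


def interdaily_stability(x, bins_per_day):
    """IS: variance of the average 24-h profile relative to total variance."""
    n = len(x)
    if n % bins_per_day:
        raise ValueError("series must cover whole days")
    mean = fmean(x)
    total = sum((v - mean) ** 2 for v in x)
    # Average value of each time-of-day bin across all recorded days.
    profile = [fmean(x[h::bins_per_day]) for h in range(bins_per_day)]
    between = sum((m - mean) ** 2 for m in profile)
    return (n * between) / (bins_per_day * total)


def intradaily_variability(x):
    """IV: mean squared successive difference relative to total variance."""
    n = len(x)
    mean = fmean(x)
    total = sum((v - mean) ** 2 for v in x)
    diffs = sum((b - a) ** 2 for a, b in zip(x, x[1:]))
    return (n * diffs) / ((n - 1) * total)


def resample(x, factor):
    """Sum consecutive samples into coarser bins (hypothetical helper)."""
    return [sum(x[i:i + factor]) for i in range(0, len(x) - factor + 1, factor)]


def mean_over_intervals(x, samples_per_day, factors):
    """Average IS and IV over several bin widths (assumed reading of ISm/IVm).

    Each factor must divide samples_per_day.
    """
    is_vals, iv_vals = [], []
    for f in factors:
        coarse = resample(x, f)
        is_vals.append(interdaily_stability(coarse, samples_per_day // f))
        iv_vals.append(intradaily_variability(coarse))
    return fmean(is_vals), fmean(iv_vals)
```

    A series that repeats identically every day gives IS = 1 under this definition, which makes a convenient sanity check before applying the functions to real recordings.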

  12. Bayesian nonparametric adaptive control using Gaussian processes.

    Science.gov (United States)

    Chowdhary, Girish; Kingravi, Hassan A; How, Jonathan P; Vela, Patricio A

    2015-03-01

    Most current model reference adaptive control (MRAC) methods rely on parametric adaptive elements, in which the number of parameters of the adaptive element is fixed a priori, often through expert judgment. An example of such an adaptive element is radial basis function networks (RBFNs), with RBF centers preallocated based on the expected operating domain. If the system operates outside of the expected operating domain, this adaptive element can become noneffective in capturing and canceling the uncertainty, thus rendering the adaptive controller only semiglobal in nature. This paper investigates a Gaussian process-based Bayesian MRAC architecture (GP-MRAC), which leverages the power and flexibility of GP Bayesian nonparametric models of uncertainty. The GP-MRAC does not require the centers to be preallocated, can inherently handle measurement noise, and enables MRAC to handle a broader set of uncertainties, including those that are defined as distributions over functions. We use stochastic stability arguments to show that GP-MRAC guarantees good closed-loop performance with no prior domain knowledge of the uncertainty. Online implementable GP inference methods are compared in numerical simulations against RBFN-MRAC with preallocated centers and are shown to provide better tracking and improved long-term learning.

  13. Nonparametric tests for equality of psychometric functions.

    Science.gov (United States)

    García-Pérez, Miguel A; Núñez-Antón, Vicente

    2017-12-07

    Many empirical studies measure psychometric functions (curves describing how observers' performance varies with stimulus magnitude) because these functions capture the effects of experimental conditions. To assess these effects, parametric curves are often fitted to the data and comparisons are carried out by testing for equality of mean parameter estimates across conditions. This approach is parametric and, thus, vulnerable to violations of the implied assumptions. Furthermore, testing for equality of means of parameters may be misleading: Psychometric functions may vary meaningfully across conditions on an observer-by-observer basis with no effect on the mean values of the estimated parameters. Alternative approaches to assess equality of psychometric functions per se are thus needed. This paper compares three nonparametric tests that are applicable in all situations of interest: The existing generalized Mantel-Haenszel test, a generalization of the Berry-Mielke test that was developed here, and a split variant of the generalized Mantel-Haenszel test also developed here. Their statistical properties (accuracy and power) are studied via simulation and the results show that all tests are indistinguishable as to accuracy but they differ non-uniformly as to power. Empirical use of the tests is illustrated via analyses of published data sets and practical recommendations are given. The computer code in MATLAB and R to conduct these tests is available as Electronic Supplemental Material.

  14. Bayesian Plackett-Luce Mixture Models for Partially Ranked Data.

    Science.gov (United States)

    Mollica, Cristina; Tardella, Luca

    2017-06-01

    The elicitation of an ordinal judgment on multiple alternatives is often required in many psychological and behavioral experiments to investigate preference/choice orientation of a specific population. The Plackett-Luce model is one of the most popular and frequently applied parametric distributions to analyze rankings of a finite set of items. The present work introduces a Bayesian finite mixture of Plackett-Luce models to account for unobserved sample heterogeneity of partially ranked data. We describe an efficient way to incorporate the latent group structure in the data augmentation approach and the derivation of existing maximum likelihood procedures as special instances of the proposed Bayesian method. Inference can be conducted with the combination of the Expectation-Maximization algorithm for maximum a posteriori estimation and the Gibbs sampling iterative procedure. We additionally investigate several Bayesian criteria for selecting the optimal mixture configuration and describe diagnostic tools for assessing the fitness of ranking distributions conditionally and unconditionally on the number of ranked items. The utility of the novel Bayesian parametric Plackett-Luce mixture for characterizing sample heterogeneity is illustrated with several applications to simulated and real preference ranked data. We compare our method with the frequentist approach and a Bayesian nonparametric mixture model both assuming the Plackett-Luce model as a mixture component. Our analysis on real datasets reveals the importance of an accurate diagnostic check for an appropriate in-depth understanding of the heterogeneous nature of the partial ranking data.
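
    To make the building block of this record concrete, the sketch below evaluates the standard Plackett-Luce probability of a full or partial (top-k) ranking and a finite mixture of such components. It does not implement the Bayesian estimation described in the abstract; the function names and the dictionary-based support parameters are illustrative assumptions.

```python
def plackett_luce_prob(ranking, support):
    """Probability of an ordered (possibly partial, top-k) ranking.

    `support` maps every item in the choice set to a positive parameter.
    At each stage the next item is picked with probability proportional
    to its parameter among the items not yet ranked.
    """
    remaining = sum(support.values())
    prob = 1.0
    for item in ranking:
        prob *= support[item] / remaining
        remaining -= support[item]
    return prob


def mixture_prob(ranking, weights, supports):
    """Finite mixture: weighted sum of Plackett-Luce components."""
    return sum(w * plackett_luce_prob(ranking, s)
               for w, s in zip(weights, supports))
```

    For example, with parameters 3, 2 and 1 for items a, b and c, the ranking (a, b, c) has probability (3/6)(2/3)(1/1) = 1/3, and the partial ranking (a) has probability 1/2.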

  15. Ranking economic history journals

    DEFF Research Database (Denmark)

    Di Vaio, Gianfranco; Weisdorf, Jacob Louis

    2010-01-01

    This study ranks, for the first time, 12 international academic journals that have economic history as their main topic. The ranking is based on data collected for the year 2007. Journals are ranked using standard citation analysis where we adjust for age, size and self-citation of journals. We also compare the leading economic history journals with the leading journals in economics in order to measure the influence on economics of economic history, and vice versa. With a few exceptions, our results confirm the general idea about what economic history journals are the most influential for economic history, and that, although economic history is quite independent from economics as a whole, knowledge exchange between the two fields is indeed going on.

  16. Ranking Economic History Journals

    DEFF Research Database (Denmark)

    Di Vaio, Gianfranco; Weisdorf, Jacob Louis

    This study ranks, for the first time, 12 international academic journals that have economic history as their main topic. The ranking is based on data collected for the year 2007. Journals are ranked using standard citation analysis where we adjust for age, size and self-citation of journals. We also compare the leading economic history journals with the leading journals in economics in order to measure the influence on economics of economic history, and vice versa. With a few exceptions, our results confirm the general idea about what economic history journals are the most influential for economic history, and that, although economic history is quite independent from economics as a whole, knowledge exchange between the two fields is indeed going on.

  17. Dynamic Matrix Rank

    DEFF Research Database (Denmark)

    Frandsen, Gudmund Skovbjerg; Frandsen, Peter Frands

    2009-01-01

    We consider maintaining information about the rank of a matrix under changes of the entries. For n×n matrices, we show an upper bound of O(n^1.575) arithmetic operations and a lower bound of Ω(n) arithmetic operations per element change. The upper bound is valid when changing up to O(n^0.575) entries in a single column of the matrix. We also give an algorithm that maintains the rank using O(n^2) arithmetic operations per rank one update. These bounds appear to be the first nontrivial bounds for the problem. The upper bounds are valid for arbitrary fields, whereas the lower bound is valid for algebraically closed fields. The upper bound for element updates uses fast rectangular matrix multiplication, and the lower bound involves further development of an earlier technique for proving lower bounds for dynamic computation of rational functions.

  18. Menstruation disorders in adolescents with eating disorders-target body mass index percentiles for their resolution.

    Science.gov (United States)

    Vale, Beatriz; Brito, Sara; Paulos, Lígia; Moleiro, Pascoal

    2014-04-01

    To analyse the progression of body mass index in eating disorders and to determine the percentile for establishment and resolution of the disease. A retrospective descriptive cross-sectional study. Review of clinical files of adolescents with eating disorders. Of the 62 female adolescents studied with eating disorders, 51 presented with eating disorder not otherwise specified, 10 anorexia nervosa, and 1 bulimia nervosa. Twenty-one of these adolescents had menstrual disorders: 14 had secondary amenorrhea and 7 had menstrual irregularities (6 eating disorder not otherwise specified, and 1 bulimia nervosa). On average, in anorectic adolescents, the initial body mass index was at the 75th percentile; secondary amenorrhea was established 1 month after onset of the disease; minimum weight was 76.6% of ideal body mass index (at the 4th percentile) at 10.2 months of disease; and resolution of amenorrhea occurred at 24 months, with average weight recovery of 93.4% of the ideal. In eating disorder not otherwise specified with menstrual disorder (n=10), the mean initial body mass index was at the 85th percentile; minimal weight was on average 97.7% of the ideal value (minimum body mass index was at the 52nd percentile) at 14.9 months of disease; body mass index stabilization occurred at 1.6 years of disease; and mean body mass index was at the 73rd percentile. Considering eating disorder not otherwise specified with secondary amenorrhea (n=4), secondary amenorrhea occurred at 4 months, with resolution at 12 months of disease (mean 65th percentile body mass index). One-third of the eating disorder group had menstrual disorder, and two-thirds of those presented with amenorrhea. This study indicated that, for the resolution of their menstrual disturbance, the body mass index percentiles to be achieved by female adolescents with eating disorders were 25-50 in anorexia nervosa and 50-75 in eating disorder not otherwise specified.

  19. Two-stage meta-analysis of survival data from individual participants using percentile ratios

    Science.gov (United States)

    Barrett, Jessica K; Farewell, Vern T; Siannis, Fotios; Tierney, Jayne; Higgins, Julian P T

    2012-01-01

    Methods for individual participant data meta-analysis of survival outcomes commonly focus on the hazard ratio as a measure of treatment effect. Recently, Siannis et al. (2010, Statistics in Medicine 29:3030–3045) proposed the use of percentile ratios as an alternative to hazard ratios. We describe a novel two-stage method for the meta-analysis of percentile ratios that avoids distributional assumptions at the study level. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22825835

  20. Modified Moment, Maximum Likelihood and Percentile Estimators for the Parameters of the Power Function Distribution

    Directory of Open Access Journals (Sweden)

    Azam Zaka

    2014-10-01

    Full Text Available This paper is concerned with the modifications of maximum likelihood, moments and percentile estimators of the two parameter Power function distribution. Sampling behavior of the estimators is indicated by Monte Carlo simulation. For some combinations of parameter values, some of the modified estimators appear better than the traditional maximum likelihood, moments and percentile estimators with respect to bias, mean square error and total deviation.

  1. Analysis and Extension of the Percentile Method, Estimating a Noise Curve from a Single Image

    Directory of Open Access Journals (Sweden)

    Miguel Colom

    2013-12-01

    Full Text Available Given a white Gaussian noise signal on a sampling grid, its variance can be estimated from a small block sample. However, in natural images we observe the combination of the geometry of the scene being photographed and the added noise. In this case, estimating directly the standard deviation of the noise from block samples is not reliable since the measured standard deviation is not explained just by the noise but also by the geometry of the image. The Percentile method tries to estimate the standard deviation of the noise from blocks of a high-passed version of the image and a small p-percentile of these standard deviations. The idea behind it is that edges and textures in a block of the image increase the observed standard deviation but they never make it decrease. Therefore, a small percentile (0.5%, for example) in the list of standard deviations of the blocks is less likely to be affected by the edges and textures than a higher percentile (50%, for example). The 0.5%-percentile is empirically proven to be adequate for most natural, medical and microscopy images. The Percentile method is adapted to signal-dependent noise, which is realistic with the Poisson noise model obtained by a CCD device in a digital camera.
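
    The idea in this record can be sketched in a few lines: high-pass the image, compute the standard deviation of each block, and take a low percentile of those values. The code below is only a minimal sketch under stated assumptions. It uses a normalized horizontal difference as the high-pass filter, non-overlapping square blocks, and no bias correction or signal-dependent handling, none of which is confirmed by the abstract; all function names are hypothetical.

```python
import math
from statistics import stdev


def highpass_rows(img):
    """Normalized horizontal difference (assumed filter choice).

    Dividing by sqrt(2) keeps the standard deviation of white noise
    unchanged while removing smooth image content.
    """
    s = math.sqrt(2.0)
    return [[(row[j + 1] - row[j]) / s for j in range(len(row) - 1)]
            for row in img]


def block_stds(img, w):
    """Sample standard deviation of each non-overlapping w x w block."""
    height, width = len(img), len(img[0])
    out = []
    for r in range(0, height - w + 1, w):
        for c in range(0, width - w + 1, w):
            vals = [img[r + i][c + j] for i in range(w) for j in range(w)]
            out.append(stdev(vals))
    return out


def percentile_noise_estimate(img, w=8, p=0.005):
    """Low p-percentile of the block standard deviations."""
    stds = sorted(block_stds(highpass_rows(img), w))
    k = max(0, math.ceil(p * len(stds)) - 1)
    return stds[k]
```

    Because a very low percentile of the block statistics is used without any correction, this sketch tends to read somewhat below the true noise level on a flat image; its point is only that a strong edge leaves the estimate almost untouched, whereas a global standard deviation would be dominated by the edge.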

  2. Early efficacy of the ketogenic diet is not affected by initial body mass index percentile.

    Science.gov (United States)

    Shull, Shastin; Diaz-Medina, Gloria; Wong-Kisiel, Lily; Nickels, Katherine; Eckert, Susan; Wirrell, Elaine

    2014-05-01

    Predictors of the ketogenic diet's success in treating pediatric intractable epilepsy are not well understood. The aim of this study was to determine whether initial body mass index and weight percentile impact early efficacy of the traditional ketogenic diet in children initiating therapy for intractable epilepsy. This retrospective study included all children initiating the ketogenic diet at Mayo Clinic, Rochester from January 2001 to December 2010 who had body mass index (children ≥2 years of age) or weight percentile (those diet initiation and seizure frequency recorded at diet initiation and one month. Responders were defined as achieving a >50% seizure reduction from baseline. Our cohort consisted of 48 patients (20 male) with a median age of 3.1 years. There was no significant correlation between initial body mass index or weight percentile and seizure frequency reduction at one month (P = 0.72, r = 0.26 and P = 0.91, r = 0.03). There was no significant association between body mass index or weight percentile quartile and responder rates (P = 0.21 and P = 0.57). Children considered overweight or obese at diet initiation (body mass index or weight percentile ≥85) did not have lower responder rates than those with body mass index or weight percentiles ketogenic diet. Copyright © 2014 Elsevier Inc. All rights reserved.

  3. Diversifying customer review rankings.

    Science.gov (United States)

    Krestel, Ralf; Dokoohaki, Nima

    2015-06-01

    E-commerce Web sites owe much of their popularity to consumer reviews accompanying product descriptions. On-line customers spend hours and hours going through heaps of textual reviews to decide which products to buy. At the same time, each popular product has thousands of user-generated reviews, making it impossible for a buyer to read everything. Current approaches to display reviews to users or recommend an individual review for a product are based on the recency or helpfulness of each review. In this paper, we present a framework to rank product reviews by optimizing the coverage of the ranking with respect to sentiment or aspects, or by summarizing all reviews with the top-K reviews in the ranking. To accomplish this, we make use of the assigned star rating for a product as an indicator for a review's sentiment polarity and compare bag-of-words (language model) with topic models (latent Dirichlet allocation) as a means to represent aspects. Our evaluation on manually annotated review data from a commercial review Web site demonstrates the effectiveness of our approach, outperforming plain recency ranking by 30% and obtaining best results by combining language and topic model representations. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. College Rankings. ERIC Digest.

    Science.gov (United States)

    Holub, Tamara

    The popularity of college ranking surveys published by "U.S. News and World Report" and other magazines is indisputable, but the methodologies used to measure the quality of higher education institutions have come under fire by scholars and college officials. Criticisms have focused on methodological flaws, such as failure to consider…

  5. OutRank

    DEFF Research Database (Denmark)

    Müller, Emmanuel; Assent, Ira; Steinhausen, Uwe

    2008-01-01

    Outlier detection is an important data mining task for consistency checks, fraud detection, etc. Binary decision making on whether or not an object is an outlier is not appropriate in many applications and moreover hard to parametrize. Thus, recently, methods for outlier ranking have been proposed...

  6. Weak Disposability in Nonparametric Production Analysis with Undesirable Outputs

    NARCIS (Netherlands)

    Kuosmanen, T.K.

    2005-01-01

    Environmental Economics and Natural Resources Group at Wageningen University in The Netherlands. Weak disposability of outputs means that firms can abate harmful emissions by decreasing the activity level. Modeling weak disposability in nonparametric production analysis has caused some confusion.

  7. Multi-sample nonparametric treatments comparison in medical ...

    African Journals Online (AJOL)

    Multi-sample nonparametric treatments comparison in medical follow-up study with unequal observation processes through simulation and bladder tumour case study. P. L. Tan, N.A. Ibrahim, M.B. Adam, J. Arasan ...

  8. A nonparametric mixture model for cure rate estimation.

    Science.gov (United States)

    Peng, Y; Dear, K B

    2000-03-01

    Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.

  9. Speaker Linking and Applications using Non-Parametric Hashing Methods

    Science.gov (United States)

    2016-09-08

    nonparametric estimate of a multivariate density function,” The Annals of Mathematical Statistics, vol. 36, no. 3, pp. 1049–1051, 1965. [9] E. A. Patrick ... Speaker Linking and Applications using Non-Parametric Hashing Methods† Douglas Sturim and William M. Campbell MIT Lincoln Laboratory, Lexington, MA ... with many approaches [1, 2]. For this paper, we focus on using i-vectors [2], but the methods apply to any embedding. For the task of speaker QBE and

  10. Improving Ranking Using Quantum Probability

    OpenAIRE

    Melucci, Massimo

    2011-01-01

    The paper shows that ranking information units by quantum probability differs from ranking them by classical probability provided the same data are used for parameter estimation. As probability of detection (also known as recall or power) and probability of false alarm (also known as fallout or size) measure the quality of ranking, we point out and show that ranking by quantum probability yields higher probability of detection than ranking by classical probability provided a given probability of ...

  11. The importance of extreme weight percentile in postoperative morbidity in children.

    Science.gov (United States)

    Stey, Anne M; Moss, R Lawrence; Kraemer, Kari; Cohen, Mark E; Ko, Clifford Y; Lee Hall, Bruce

    2014-05-01

    Anthropometric data are important indicators of child health. This study sought to determine whether anthropometric data of extreme weight were significant predictors of perioperative morbidity in pediatric surgery. This was a cohort study of children 29 days up to 18 years of age undergoing surgical procedures at participating American College of Surgeons' NSQIP Pediatric hospitals in 2011 and 2012. The primary outcomes were composite morbidity and surgical site infection. The primary predictor of interest was weight percentile, which was divided into the following categories: ≤5th percentile, 6th to 94th, or ≥95th percentile. A hierarchical multivariate logistic model, adjusting for procedure case mix, demographic, and clinical patient characteristic variables, was used to quantify the relationship between weight percentile category and outcomes. Children in the ≤5th weight percentile had 1.19-fold higher odds of overall postoperative morbidity developing than children in the nonextreme range (95% CI, 1.10-1.30) when controlling for clinical variables. Yet these children did not have higher odds of surgical site infection developing. Children in the ≥95th weight percentile did not have a significant increase in overall postoperative morbidity. However, they were at 1.35-fold increased odds of surgical site infection compared with those in the nonextreme range when controlling for clinical variables (95% CI, 1.16-1.57). Both extremely high and extremely low weight percentile scores can be associated with increased postoperative complications after controlling for clinical variables. Copyright © 2014 American College of Surgeons. Published by Elsevier Inc. All rights reserved.

  12. Two non-parametric methods for derivation of constraints from radiotherapy dose–histogram data

    International Nuclear Information System (INIS)

    Ebert, M A; Kennedy, A; Joseph, D J; Gulliford, S L; Buettner, F; Foo, K; Haworth, A; Denham, J W

    2014-01-01

    Dose constraints based on histograms provide a convenient and widely-used method for informing and guiding radiotherapy treatment planning. Methods of derivation of such constraints are often poorly described. Two non-parametric methods for derivation of constraints are described and investigated in the context of determination of dose-specific cut-points—values of the free parameter (e.g., percentage volume of the irradiated organ) which best reflect resulting changes in complication incidence. A method based on receiver operating characteristic (ROC) analysis and one based on a maximally-selected standardized rank sum are described and compared using rectal toxicity data from a prostate radiotherapy trial. Multiple test corrections are applied using a free step-down resampling algorithm, which accounts for the large number of tests undertaken to search for optimal cut-points and the inherent correlation between dose–histogram points. Both methods provide consistent significant cut-point values, with the rank sum method displaying some sensitivity to the underlying data. The ROC method is simple to implement and can utilize a complication atlas, though an advantage of the rank sum method is the ability to incorporate all complication grades without the need for grade dichotomization. (note)

  13. 1991 Acceptance priority ranking

    International Nuclear Information System (INIS)

    1991-12-01

    The Standard Contract for Disposal of Spent Nuclear Fuel and/or High-Level Radioactive Waste (10 CFR Part 961) that the Department of Energy (DOE) has executed with the owners and generators of civilian spent nuclear fuel requires annual publication of the Acceptance Priority Ranking (APR). The 1991 APR details the order in which DOE will allocate Federal waste acceptance capacity. As required by the Standard Contract, the ranking is based on the age of permanently discharged spent nuclear fuel (SNF), with the owners of the oldest SNF, on an industry-wide basis, given the highest priority. The 1991 APR will be the basis for the annual allocation of waste acceptance capacity to the Purchasers in the 1991 Annual Capacity Report (ACR), to be issued later this year. This document is based on SNF discharges as of December 31, 1990, and reflects Purchaser comments and corrections, as appropriate, to the draft APR issued on May 15, 1991.

  14. Evaluation and projection of daily temperature percentiles from statistical and dynamical downscaling methods

    Directory of Open Access Journals (Sweden)

    A. Casanueva

    2013-08-01

    Full Text Available The study of extreme events has become of great interest in recent years due to their direct impact on society. Extremes are usually evaluated by using extreme indicators, based on order statistics on the tail of the probability distribution function (typically percentiles). In this study, we focus on the tail of the distribution of daily maximum and minimum temperatures. For this purpose, we analyse high (95th) and low (5th) percentiles in daily maximum and minimum temperatures on the Iberian Peninsula, respectively, derived from different downscaling methods (statistical and dynamical). First, we analyse the performance of reanalysis-driven downscaling methods in present climate conditions. The comparison among the different methods is performed in terms of the bias of seasonal percentiles, considering as observations the public gridded data sets E-OBS and Spain02, and obtaining an estimation of both the mean and spatial percentile errors. Secondly, we analyse the increments of future percentile projections under the SRES A1B scenario and compare them with those corresponding to the mean temperature, showing that their relative importance depends on the method, and stressing the need to consider an ensemble of methodologies.

  15. Use of Pearson's Chi-Square for Testing Equality of Percentile Profiles across Multiple Populations.

    Science.gov (United States)

    Johnson, William D; Beyl, Robbie A; Burton, Jeffrey H; Johnson, Callie M; Romer, Jacob E; Zhang, Lei

    2015-08-01

    In large sample studies where distributions may be skewed and not readily transformed to symmetry, it may be of greater interest to compare different distributions in terms of percentiles rather than means. For example, it may be more informative to compare two or more populations with respect to their within population distributions by testing the hypothesis that their corresponding respective 10th, 50th, and 90th percentiles are equal. As a generalization of the median test, the proposed test statistic is asymptotically distributed as Chi-square with degrees of freedom dependent upon the number of percentiles tested and constraints of the null hypothesis. Results from simulation studies are used to validate the nominal 0.05 significance level under the null hypothesis, and asymptotic power properties that are suitable for testing equality of percentile profiles against selected profile discrepancies for a variety of underlying distributions. A pragmatic example is provided to illustrate the comparison of the percentile profiles for four body mass index distributions.
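
    One plausible reading of "a generalization of the median test" is sketched below: cut the pooled data at the chosen percentiles, count how many observations from each population fall in each interval, and compute Pearson's chi-square on the resulting table. The cut-point rule, the function names, and the degrees of freedom shown (the usual unconstrained contingency-table value) are assumptions; the abstract notes that the actual degrees of freedom depend on the constraints of the null hypothesis.

```python
import math
from bisect import bisect_left


def pooled_cutpoints(samples, probs):
    """Empirical percentiles of the pooled data (simple order-statistic rule)."""
    pooled = sorted(v for s in samples for v in s)
    n = len(pooled)
    return [pooled[max(0, math.ceil(p * n) - 1)] for p in probs]


def percentile_profile_chi2(samples, probs=(0.10, 0.50, 0.90)):
    """Pearson chi-square on the groups x percentile-interval count table."""
    cuts = pooled_cutpoints(samples, probs)
    n_bins = len(cuts) + 1
    table = []
    for s in samples:
        row = [0] * n_bins
        for v in s:
            # Index = number of cut points strictly below v.
            row[bisect_left(cuts, v)] += 1
        table.append(row)
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    # Assumed: unconstrained contingency-table degrees of freedom.
    dof = (len(samples) - 1) * (n_bins - 1)
    return stat, dof
```

    With a single probability of 0.5 the table reduces to the classical median test, which is the sense in which this construction generalizes it.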

  16. REGRES: A FORTRAN-77 program to calculate nonparametric and "structural" parametric solutions to bivariate regression equations

    Science.gov (United States)

    Rock, N. M. S.; Duffy, T. R.

    REGRES allows a range of regression equations to be calculated for paired sets of data values in which both variables are subject to error (i.e. neither is the "independent" variable). Nonparametric regressions, based on medians of all possible pairwise slopes and intercepts, are treated in detail. Estimated slopes and intercepts are output, along with confidence limits, Spearman and Kendall rank correlation coefficients. Outliers can be rejected with user-determined stringency. Parametric regressions can be calculated for any value of λ (the ratio of the variances of the random errors for y and x), including: (1) major axis (λ = 1); (2) reduced major axis (λ = variance of y/variance of x); (3) Y on X (λ = infinity); or (4) X on Y (λ = 0) solutions. Pearson linear correlation coefficients also are output. REGRES provides an alternative to conventional isochron assessment techniques where bivariate normal errors cannot be assumed, or weighting methods are inappropriate.
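
    The nonparametric regression mentioned in this record, based on medians of all pairwise slopes and intercepts, can be sketched as follows. This is an illustrative Python version, not the FORTRAN-77 program itself; the function name is hypothetical, and confidence limits, rank correlations and outlier rejection are left out.

```python
from itertools import combinations
from statistics import median


def pairwise_median_line(x, y):
    """Slope and intercept as medians over all point pairs.

    Pairs with identical x values are skipped because their slope is
    undefined.
    """
    slopes, intercepts = [], []
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue
        slope = (y2 - y1) / (x2 - x1)
        slopes.append(slope)
        # Intercept of the line through this particular pair.
        intercepts.append(y1 - slope * x1)
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    return median(slopes), median(intercepts)
```

    For points lying on y = 2x + 1 with one gross outlier, for example x = 0..4 and y = 1, 3, 5, 7, 100, the medians still recover slope 2 and intercept 1, which is the robustness property that motivates this kind of fit.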

  17. Estimation of a monotone percentile residual life function under random censorship.

    Science.gov (United States)

    Franco-Pereira, Alba M; de Uña-Álvarez, Jacobo

    2013-01-01

    In this paper, we introduce a new estimator of a percentile residual life function with censored data under a monotonicity constraint. Specifically, it is assumed that the percentile residual life is a decreasing function. This assumption is useful when estimating the percentile residual life of units, which degenerate with age. We establish a law of the iterated logarithm for the proposed estimator, and its n-equivalence to the unrestricted estimator. The asymptotic normal distribution of the estimator and its strong approximation to a Gaussian process are also established. We investigate the finite sample performance of the monotone estimator in an extensive simulation study. Finally, data from a clinical trial in primary biliary cirrhosis of the liver are analyzed with the proposed methods. One of the conclusions of our work is that the restricted estimator may be much more efficient than the unrestricted one. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  18. Percentile of Serum Lipid Profile in Children and Adolescents of Birjand, Eastern Iran.

    Directory of Open Access Journals (Sweden)

    Fatemeh Taheri

    2014-11-01

    Full Text Available Abstract: Introduction: Racial and environmental differences between communities are a leading cause of differences in serum lipids. This study aimed to assess percentile curves of the serum lipid profile of 6-18-year-old students in Birjand. Method: The present cross-sectional study was done on 4168 students of Birjand aged 6-18 years. They were classified into three age groups: 6-10, 11-14, and 15-18 years. The 5th, 10th, 25th, 50th, 75th, 90th, and 95th percentiles of lipids (cholesterol, LDL, HDL and triglycerides) were determined by sex for different age groups. Result: The 5th, 10th, 25th, 50th, 75th, 90th and 95th percentiles for cholesterol, LDL, HDL, and TG were 114, 123, 138, 157, 176, 197, 210; 54, 59, 71, 86, 102, 119, 131; 33, 36, 41, 48, 56, 64, 68; and 43, 49, 61, 78, 103, 138, 164, respectively. Conclusion: Lipid percentiles in children of Birjand differ from the reference percentiles of the U.S. and also of Tehran. Triglycerides and HDL in children and adolescents of Birjand were higher and lower, respectively, than in Americans. This could be due to racial differences and environmental factors such as nutrition and a sedentary lifestyle. This should be considered in the interpretation of normal and abnormal values and the determination of dyslipidemia in children and adolescents. Determining regional percentiles of serum lipids for Iranian children and adolescents by examining a sufficient number of samples is recommended.

  19. Countermovement jump percentiles in schoolchildren from Bogotá, Colombia: The FUPRECOL study

    OpenAIRE

    Ferro Vargas, Martha

    2016-01-01

    Objective: To determine the percentile distribution of the countermovement jump (CMJ) in a school population from Bogotá, Colombia, belonging to the Fuprecol study. Methods: Cross-sectional study of 2846 children and 2754 adolescents, aged 9 to 17 years, from 18 public schools in Bogotá, Colombia. CMJ was assessed as established by the Fuprecol physical fitness battery. Percentiles were calculated (P3, P...

  20. Notes on the Implementation of Non-Parametric Statistics within the Westinghouse Realistic Large Break LOCA Evaluation Model (ASTRUM)

    International Nuclear Information System (INIS)

    Frepoli, Cesare; Oriani, Luca

    2006-01-01

    In recent years, non-parametric or order statistics methods have been widely used to assess the impact of the uncertainties within Best-Estimate LOCA evaluation models. The bounding of the uncertainties is achieved with a direct Monte Carlo sampling of the uncertainty attributes, with the minimum trial number selected to 'stabilize' the estimation of the critical output values (peak cladding temperature (PCT), local maximum oxidation (LMO), and core-wide oxidation (CWO)). A non-parametric order statistics uncertainty analysis was recently implemented within the Westinghouse Realistic Large Break LOCA evaluation model, also referred to as 'Automated Statistical Treatment of Uncertainty Method' (ASTRUM). The implementation or interpretation of order statistics in safety analysis is not fully consistent within the industry. This has led to an extensive public debate among regulators and researchers which can be found in the open literature. The USNRC-approved Westinghouse method follows a rigorous implementation of the order statistics theory, which leads to the execution of 124 simulations within a Large Break LOCA analysis. This is a solid approach which guarantees that a bounding value (at 95% probability) of the 95th percentile for each of the three 10 CFR 50.46 ECCS design acceptance criteria (PCT, LMO and CWO) is obtained. The objective of this paper is to provide additional insights on the ASTRUM statistical approach, with a more in-depth analysis of pros and cons of the order statistics and of the Westinghouse approach in the implementation of this statistical methodology. (authors)
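    The abstract does not spell out the formula behind the 124 simulations. As a minimal sketch, the standard order-statistics sample-size condition (Wilks' formula and its extension to several outputs) reproduces that figure when three outputs are bounded at 95% coverage and 95% confidence. The function name `min_runs` and its parameters are illustrative assumptions, not taken from the ASTRUM documentation.

```python
from math import comb


def min_runs(p=1, coverage=0.95, confidence=0.95):
    """Smallest number of random runs N such that the p largest results
    bound the `coverage` quantile with at least `confidence` probability.

    Assumption: p is the number of outputs bounded simultaneously
    (p = 1 is the classic one-sided Wilks case).
    """
    n = p
    while True:
        # Probability that fewer than p runs exceed the true quantile,
        # i.e. at least n - p + 1 runs fall below it and the bound fails.
        miss = sum(
            comb(n, j) * coverage**j * (1 - coverage) ** (n - j)
            for j in range(n - p + 1, n + 1)
        )
        if 1 - miss >= confidence:
            return n
        n += 1


if __name__ == "__main__":
    for outputs in (1, 2, 3):
        print(outputs, min_runs(p=outputs))
```

    Under these assumptions the search returns 59, 93 and 124 runs for one, two and three outputs, so the three acceptance criteria (PCT, LMO, CWO) are consistent with the 124 simulations quoted in the abstract.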

  1. Substantial injuries influence ranking position in young elite athletes of athletics, cross-country skiing and orienteering.

    Science.gov (United States)

    von Rosen, P; Heijne, A

    2018-04-01

    The relationship between injury and performance in young athletes is scarcely studied. The aim of this study was therefore to explore the association between injury prevalence and ranking position among adolescent elite athletes. One hundred and sixty-two male and female adolescent elite athletes (age range 15-19), competing in athletics (n = 59), cross-country skiing (n = 66), and orienteering (n = 37), were monitored weekly over 22-47 weeks using a web-based injury questionnaire. Ranking lists were collected. A significant (P = .003) difference was found in the seasonal substantial injury prevalence across the ranked athletes over the season, where the top-ranked (median 3.6%, 25-75th percentiles 0%-14.3%) and middle-ranked athletes (median 2.3%, 25-75th percentiles 0%-10.0%) had a lower substantial injury prevalence compared to the low-ranked athletes (median 11.3%, 25-75th percentiles 2.5%-27.1%), during both preseason (P = .002) and competitive season (P = .031). Athletes who improved their ranking position (51%, n = 51) reported a lower substantial injury prevalence (median 0%, 25-75th percentiles 0%-10.0%) compared to those who decreased (49%, n = 49) their ranking position (median 6.7%, 25-75th percentiles 0%-22.5%). In the top-ranked group, no athlete reported substantial injury more than 40% of all data collection time points compared to 9.6% (n = 5) in the middle-ranked, and 17.3% (n = 9) in the low-ranked group. Our results provide supporting evidence that substantial injuries, such as acute and overuse injuries leading to moderate or severe reductions in training or sports performance, influence ranking position in adolescent elite athletes. The findings are crucial to stakeholders involved in adolescent elite sports and support the value of designing effective preventive interventions for substantial injuries. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  2. Ranking Baltic States Researchers

    Directory of Open Access Journals (Sweden)

    Gyula Mester

    2017-10-01

    Full Text Available In this article, using the h-index and the total number of citations, the best 10 Lithuanian, Latvian and Estonian researchers from several disciplines are ranked. The list may be formed based on the h-index and the total number of citations, given in Web of Science, Scopus, Publish or Perish Program and Google Scholar database. Data for the first 10 researchers are presented. Google Scholar is the most complete. Therefore, to define a single indicator, h-index calculated by Google Scholar may be a good and simple one. The author chooses the Google Scholar database as it is the broadest one.
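    The ranking described above rests on two quantities, the h-index and the total citation count. A minimal sketch of how both can be computed and combined follows. The tie-breaking rule (h-index first, then total citations) and the function names are assumptions made for illustration, since the abstract does not say how the two indicators are combined.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def rank_researchers(records):
    """Order names by h-index, breaking ties by total citations (assumed rule).

    `records` maps a researcher name to a list of per-paper citation counts.
    """
    return sorted(
        records,
        key=lambda name: (h_index(records[name]), sum(records[name])),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical citation counts, not data from the article.
    sample = {"A": [10, 8, 5, 4, 3], "B": [25, 8, 5, 3, 3], "C": [1]}
    print(rank_researchers(sample))
```

    Note that the two indicators can disagree: in the hypothetical sample, researcher B has more total citations than A but a lower h-index, which is why the choice of a single indicator matters.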

  3. Fourth-rank cosmology

    International Nuclear Information System (INIS)

    Marrakchi, A.E.L.; Tapia, V.

    1992-05-01

    Some cosmological implications of the recently proposed fourth-rank theory of gravitation are studied. The model exhibits the possibility of being free from the horizon and flatness problems at the price of introducing a negative pressure. The field equations we obtain are compatible with k_obs = 0 and Ω_obs ... t_clas ≈ 10^20 t_Planck ≈ 10^-23 s. When interpreted in the light of General Relativity the treatment is shown to be almost equivalent to that of the standard model of cosmology combined with the inflationary scenario. Hence, an interpretation of the negative pressure hypothesis is provided. (author). 8 refs

  4. University Rankings and Social Science

    OpenAIRE

    Marginson, S.

    2014-01-01

    University rankings widely affect the behaviours of prospective students and their families, university executive leaders, academic faculty, governments and investors in higher education. Yet the social science foundations of global rankings receive little scrutiny. Rankings that simply recycle reputation without any necessary connection to real outputs are of no common value. It is necessary that rankings be soundly based in scientific terms if a virtuous relationship between performance and...

  5. Birth weight percentiles for Peruvian twins, according to gestational age and sex

    Directory of Open Access Journals (Sweden)

    Manuel Ticona Rendón

    2006-09-01

    Full Text Available ... gestational age and sex, we carried out a descriptive, cross-sectional, prospective study covering the years 1992 to 2004. We studied 282 live-born twins without risk factors for growth restriction, from Tacna, Peru. Means, standard deviations, and the 10th, 50th, and 90th percentiles of weight were calculated by sex and for gestational ages between 32 and 41 weeks. Percentiles and means were compared between the sexes and with studies conducted in Norway, Australia, and Japan, with p < 0.05 considered significant. Mean birth weight was 2,677 g ± 507 for boys and 2,615 g ± 461 for girls, with no significant difference. The modal gestational age was 38 weeks, and the difference in median birth weight by sex was 110 g. Birth weight for twins peaked at 39 weeks, after which the means declined. The mean birth weight of male twins was higher than that of females, but no significant differences were observed at any gestational age. No differences were found between the mean weights of Peruvian and Norwegian twins of either sex; however, highly significant differences were recorded in comparison with those of Australia and Japan, relative to which the Peruvian means were higher. The curves produced by the study provide birth weight percentiles for twins, by gestational age and sex, that can be used by Peruvian clinicians and researchers.

  6. University Rankings and Social Science

    Science.gov (United States)

    Marginson, Simon

    2014-01-01

    University rankings widely affect the behaviours of prospective students and their families, university executive leaders, academic faculty, governments and investors in higher education. Yet the social science foundations of global rankings receive little scrutiny. Rankings that simply recycle reputation without any necessary connection to real…

  7. Performance of blood pressure percentiles associated with risk factors in students

    Directory of Open Access Journals (Sweden)

    Javier Jesús Suárez Rivera

    2004-04-01

    Full Text Available A prospective, descriptive study was carried out on the full population of schoolchildren from preschool to 6th grade at the "Jesús Menéndez" primary school in Alamar, from September 2000 to February 2001, to estimate the behavior of blood pressure percentiles (pc) by age and sex, as well as the associated risk factors. The sample comprised 743 students, who underwent a physical examination that included weight, height, and blood pressure measurement, together with an open survey. Based on the data obtained, the population was divided into 4 study groups by blood pressure percentile, group I (... 95 pc), following the foreign literature consulted, and the groups were related to risk factors. Most of the schoolchildren studied had blood pressure values below the 50th percentile (88.83%), and the most frequent risk factor was a family history of arterial hypertension. Only 6 schoolchildren had blood pressure values above the 95th percentile.

  8. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    Directory of Open Access Journals (Sweden)

    Saerom Park

    Full Text Available Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models (neural networks, Bayesian neural network, Gaussian process, and support vector regression) to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single-transaction data from the US stock market via Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model such as the I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives for reducing transaction costs by considerably improving prediction performance.

  9. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    Science.gov (United States)

    Park, Saerom; Lee, Jaewook; Son, Youngdoo

    2016-01-01

    Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models (neural networks, Bayesian neural network, Gaussian process, and support vector regression) to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single-transaction data from the US stock market via Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model such as the I-star model in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives for reducing transaction costs by considerably improving prediction performance.

  10. Application of nonparametric statistic method for DNBR limit calculation

    International Nuclear Information System (INIS)

    Dong Bo; Kuang Bo; Zhu Xuenong

    2013-01-01

    Background: The nonparametric statistical method is a statistical inference method that does not depend on a particular distribution; it calculates tolerance limits at a given probability level and confidence level through sampling. The DNBR margin is an important parameter of NPP (Nuclear Power Plant) design that reflects the safety level of the plant. Purpose and Methods: This paper uses a nonparametric statistical method based on Wilks' formula and the VIPER-01 subchannel analysis code to calculate the DNBR design limit (DL) of a 300 MW NPP during a complete loss of flow accident, and compares it with the DNBR DL obtained by the ITDP methodology to determine the DNBR margin. Results: The results indicate that this method gains 2.96% more DNBR margin than the ITDP methodology. Conclusions: Because conservatism in the analysis process is reduced, the nonparametric statistical method can provide a greater DNBR margin, and the increased margin benefits upgrading of the core refueling scheme. (authors)

  11. Comparing parametric and nonparametric regression methods for panel data

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb-Douglas and Translog, are unsuitable for analysing the optimal firm size. We show that the Translog functional form implies an implausible linear relationship between the (logarithmic) firm size and the elasticity of scale, where the slope is artificially related to the substitutability between the inputs. The practical applicability of the parametric and non-parametric regression methods is scrutinised and compared by an empirical example: we analyse the production technology and investigate the optimal size of Polish crop farms based on a firm-level balanced panel data set. A nonparametric specification test...

  12. Median Growth Percentiles (MGPs): Assessment of Intertemporal Stability and Correlations with Observational Scores

    Science.gov (United States)

    Pivovarova, Margarita; Amrein-Beardsley, Audrey

    2018-01-01

    While states are no longer required to set up teacher evaluation systems based in significant part on student test scores, quite a few continue to use value-added (VAMs) or student growth percentile (SGP) models for that purpose. In this study, we analyzed three years of teacher data to illustrate the performance of teachers' median growth…

  13. Empirical Percentile Growth Curves with Z-scores Considering Seasonal Compensatory Growths for Japanese Thoroughbred Horses

    Science.gov (United States)

    ONODA, Tomoaki; YAMAMOTO, Ryuta; SAWAMURA, Kyohei; MURASE, Harutaka; NAMBO, Yasuo; INOUE, Yoshinobu; MATSUI, Akira; MIYAKE, Takeshi; HIRAI, Nobuhiro

    2013-01-01

    Percentile growth curves are often used as a clinical indicator to evaluate variations of children’s growth status. In this study, we propose empirical percentile growth curves using Z-scores adapted for Japanese Thoroughbred horses, with considerations of the seasonal compensatory growth that is a typical characteristic of seasonal breeding animals. We previously developed new growth curve equations for Japanese Thoroughbreds adjusting for compensatory growth. Individual horses and residual effects were included as random effects in the growth curve equation model and their variance components were estimated. Based on the Z-scores of the estimated variance components, empirical percentile growth curves were constructed. A total of 5,594 and 5,680 body weight and age measurements of male and female Thoroughbreds, respectively, and 3,770 withers height and age measurements were used in the analyses. The developed empirical percentile growth curves using Z-scores are computationally feasible and useful for monitoring individual growth parameters of body weight and withers height of young Thoroughbred horses, especially during compensatory growth periods. PMID:24834004
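    The link between Z-scores and percentile curves used above can be illustrated with a short sketch. It assumes normally distributed deviations around a fitted mean curve and a single standard deviation for all ages; both are simplifying assumptions for illustration, and the abstract does not give the authors' actual growth equation or variance components. The function names and the numbers in the usage lines are hypothetical.

```python
from statistics import NormalDist


def percentile_curve(mean_curve, sd, percentile):
    """Shift a fitted mean curve by the Z-score that matches `percentile`.

    Assumes normal deviations with one common standard deviation `sd`.
    """
    z = NormalDist().inv_cdf(percentile / 100.0)
    return [m + z * sd for m in mean_curve]


def percentile_of(value, mean, sd):
    """Percentile rank of one measurement under the same normal assumption."""
    return 100.0 * NormalDist(mean, sd).cdf(value)


if __name__ == "__main__":
    # Hypothetical mean body weights (kg) at three ages, not study data.
    means = [250.0, 330.0, 400.0]
    print(percentile_curve(means, sd=20.0, percentile=97))
    print(percentile_of(350.0, mean=330.0, sd=20.0))
```

    A measurement one standard deviation above the mean falls at roughly the 84th percentile under this assumption, which is the kind of reading the Z-score curves are meant to make quick.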

  14. Nonparametric regression using the concept of minimum energy

    International Nuclear Information System (INIS)

    Williams, Mike

    2011-01-01

    It has recently been shown that an unbinned distance-based statistic, the energy, can be used to construct an extremely powerful nonparametric multivariate two sample goodness-of-fit test. An extension to this method that makes it possible to perform nonparametric regression using multiple multivariate data sets is presented in this paper. The technique, which is based on the concept of minimizing the energy of the system, permits determination of parameters of interest without the need for parametric expressions of the parent distributions of the data sets. The application and performance of this new method is discussed in the context of some simple example analyses.
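    For orientation, a distance-based two-sample statistic of the kind referred to above can be sketched as follows. The sketch uses the plain Euclidean energy distance; the abstract does not state which distance weighting the author uses, so this choice is an assumption, and the regression extension itself is not reproduced here. Function names are illustrative.

```python
from math import dist


def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (one point from a, one from b)."""
    return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))


def energy_distance(x, y):
    """Energy distance between two samples of equal-dimension points.

    Zero when the samples coincide; grows as the samples separate.
    """
    return (
        2.0 * mean_pairwise_distance(x, y)
        - mean_pairwise_distance(x, x)
        - mean_pairwise_distance(y, y)
    )


if __name__ == "__main__":
    # Hypothetical two-dimensional samples.
    sample_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    sample_b = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
    print(energy_distance(sample_a, sample_a))
    print(energy_distance(sample_a, sample_b))
```

    Because the statistic needs only pairwise distances, no binning or parametric form of the parent distributions is required, which is the property the abstract emphasises.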

  15. It's all relative: ranking the diversity of aquatic bacterial communities.

    Science.gov (United States)

    Shaw, Allison K; Halpern, Aaron L; Beeson, Karen; Tran, Bao; Venter, J Craig; Martiny, Jennifer B H

    2008-09-01

    The study of microbial diversity patterns is hampered by the enormous diversity of microbial communities and the lack of resources to sample them exhaustively. For many questions about richness and evenness, however, one only needs to know the relative order of diversity among samples rather than total diversity. We used 16S libraries from the Global Ocean Survey to investigate the ability of 10 diversity statistics (including rarefaction, non-parametric, parametric, curve extrapolation and diversity indices) to assess the relative diversity of six aquatic bacterial communities. Overall, we found that the statistics yielded remarkably similar rankings of the samples for a given sequence similarity cut-off. This correspondence, despite the different underlying assumptions of the statistics, suggests that diversity statistics are a useful tool for ranking samples of microbial diversity. In addition, sequence similarity cut-off influenced the diversity ranking of the samples, demonstrating that diversity statistics can also be used to detect differences in phylogenetic structure among microbial communities. Finally, a subsampling analysis suggests that further sequencing from these particular clone libraries would not have substantially changed the richness rankings of the samples.

  16. Fractional cointegration rank estimation

    DEFF Research Database (Denmark)

    Lasak, Katarzyna; Velasco, Carlos

    the parameters of the model under the null hypothesis of the cointegration rank r = 1, 2, ..., p-1. This step provides consistent estimates of the cointegration degree, the cointegration vectors, the speed of adjustment to the equilibrium parameters and the common trends. In the second step we carry out a sup-likelihood ratio test of no-cointegration on the estimated p - r common trends that are not cointegrated under the null. The cointegration degree is re-estimated in the second step to allow for new cointegration relationships with different memory. We augment the error correction model in the second step to control for stochastic trend estimation effects from the first step. The critical values of the tests proposed depend only on the number of common trends under the null, p - r, and on the interval of the cointegration degrees b allowed, but not on the true cointegration degree b0. Hence, no additional...

  17. Rankings, creativity and urbanism

    Directory of Open Access Journals (Sweden)

    JOAQUÍN SABATÉ

    2008-08-01

    Full Text Available Competition among cities is one of the driving factors of urban renewal processes, and rankings have become instruments for measuring the quality of cities. We examine the case of a former industrial district now being transformed into a "creative" district through a large-scale urban intervention. Its analysis reveals three critical points. First, it obliges us to consider the definition of urban innovation and how the past, identity, and memory are integrated into the construction of the future. Second, it leads us to understand that innovation and knowledge do not arise by chance but are the fruit of a long and complex network involving knowledge, spaces, actors, and institutions that differ in nature, scale, and magnitude. Finally, it obliges us to reflect on the value given to the local in urban renewal processes.

  18. Ranking nodes in growing networks: When PageRank fails.

    Science.gov (United States)

    Mariani, Manuel Sebastian; Medo, Matúš; Zhang, Yi-Cheng

    2015-11-10

    PageRank is arguably the most popular ranking algorithm, applied in real systems ranging from information to biological and infrastructure networks. Despite its outstanding popularity and broad use in different areas of science, the relation between the algorithm's efficacy and the properties of the network on which it acts has not yet been fully understood. We study here PageRank's performance on a network model supported by real data, and show that realistic temporal effects make PageRank fail to identify the most valuable nodes for a broad range of model parameters. Results on real data are in qualitative agreement with our model-based findings. This failure of PageRank reveals that the static approach to information filtering is inappropriate for a broad class of growing systems, and suggests that time-dependent algorithms based on the temporal linking patterns of these systems are needed to better rank the nodes.
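    As background for the static algorithm discussed above, here is a minimal power-iteration sketch of standard PageRank. The damping factor 0.85, the handling of nodes without out-links, and the function name are conventional choices assumed for illustration; the time-dependent alternatives the authors argue for are not sketched here.

```python
def pagerank(links, damping=0.85, tol=1e-12, max_iter=1000):
    """Static PageRank by power iteration.

    `links` maps each node to the list of nodes it points to.
    Returns a dict of scores that sum to 1.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without out-links is spread evenly over all nodes.
        dangling = sum(score[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            for t in targets:
                new[t] += damping * score[v] / len(targets)
        change = sum(abs(new[v] - score[v]) for v in nodes)
        score = new
        if change < tol:
            break
    return score


if __name__ == "__main__":
    # Hypothetical three-node graph: a and b both point to c, c points to a.
    print(pagerank({"a": ["c"], "b": ["c"], "c": ["a"]}))
```

    The scores depend only on the current link structure, with no notion of when a link was created. That is the static assumption the abstract identifies as the source of failure in growing networks.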

  19. Neophilia Ranking of Scientific Journals.

    Science.gov (United States)

    Packalen, Mikko; Bhattacharya, Jay

    2017-01-01

    The ranking of scientific journals is important because of the signal it sends to scientists about what is considered most vital for scientific progress. Existing ranking systems focus on measuring the influence of a scientific paper (citations); these rankings do not reward journals for publishing innovative work that builds on new ideas. We propose an alternative ranking based on the proclivity of journals to publish papers that build on new ideas, and we implement this ranking via a text-based analysis of all published biomedical papers dating back to 1946. In addition, we compare our neophilia ranking to citation-based (impact factor) rankings; this comparison shows that the two ranking approaches are distinct. Prior theoretical work suggests an active role for our neophilia index in science policy. Absent an explicit incentive to pursue novel science, scientists underinvest in innovative work because of a coordination problem: for work on a new idea to flourish, many scientists must decide to adopt it in their work. Rankings that are based purely on influence thus do not provide sufficient incentives for publishing innovative work. By contrast, adoption of the neophilia index as part of journal-ranking procedures by funding agencies and university administrators would provide an explicit incentive for journals to publish innovative work and thus help solve the coordination problem by increasing scientists' incentives to pursue innovative work.

  20. Adaptive nonparametric Bayesian inference using location-scale mixture priors

    NARCIS (Netherlands)

    Jonge, de R.; Zanten, van J.H.

    2010-01-01

    We study location-scale mixture priors for nonparametric statistical problems, including multivariate regression, density estimation and classification. We show that a rate-adaptive procedure can be obtained if the prior is properly constructed. In particular, we show that adaptation is achieved if

  1. The nonparametric bootstrap for the current status model

    NARCIS (Netherlands)

    Groeneboom, P.; Hendrickx, K.

    2017-01-01

    It has been proved that direct bootstrapping of the nonparametric maximum likelihood estimator (MLE) of the distribution function in the current status model leads to inconsistent confidence intervals. We show that bootstrapping of functionals of the MLE can however be used to produce valid

  2. Non-Parametric Analysis of Rating Transition and Default Data

    DEFF Research Database (Denmark)

    Fledelius, Peter; Lando, David; Perch Nielsen, Jens

    2004-01-01

    We demonstrate the use of non-parametric intensity estimation - including construction of pointwise confidence sets - for analyzing rating transition data. We find that transition intensities away from the class studied here for illustration strongly depend on the direction of the previous move b...

  3. Bayesian nonparametric system reliability using sets of priors

    NARCIS (Netherlands)

    Walter, G.M.; Aslett, L.J.M.; Coolen, F.P.A.

    2016-01-01

    An imprecise Bayesian nonparametric approach to system reliability with multiple types of components is developed. This allows modelling partial or imperfect prior knowledge on component failure distributions in a flexible way through bounds on the functioning probability. Given component level test

  4. Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.

    Science.gov (United States)

    Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F

    2013-04-01

    In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.

  5. Nonparametric modeling of dynamic functional connectivity in fmri data

    DEFF Research Database (Denmark)

    Nielsen, Søren Føns Vind; Madsen, Kristoffer H.; Røge, Rasmus

    2015-01-01

    dynamic changes. The existing approaches modeling dynamic connectivity have primarily been based on time-windowing the data and k-means clustering. We propose a nonparametric generative model for dynamic FC in fMRI that does not rely on specifying window lengths and number of dynamic states. Rooted...

  6. Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.

    Science.gov (United States)

    Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen

    2011-04-01

    Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.

  7. Parametric vs. Nonparametric Regression Modelling within Clinical Decision Support

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan; Zvárová, Jana

    2017-01-01

    Vol. 5, No. 1 (2017), pp. 21-27 ISSN 1805-8698 R&D Projects: GA ČR GA17-01251S Institutional support: RVO:67985807 Keywords: decision support systems * decision rules * statistical analysis * nonparametric regression Subject RIV: IN - Informatics, Computer Science OECD field: Statistics and probability

  8. On the robust nonparametric regression estimation for a functional regressor

    OpenAIRE

    Azzedine, Nadjia; Laksaci, Ali; Ould-Saïd, Elias

    2009-01-01

    On the robust nonparametric regression estimation for a functional regressor. Correspondence: corresponding author (Ould-Said, Elias). (Azzedine, Nadjia) (Laksaci, Ali) (Ould-Said, Elias) Departement de Mathematiques, Univ. Djillali Liabes, BP 89, 22000 Sidi Bel Abbes, Algeria (Azzedine, Nadjia) Departement de Mathema...

  9. A general approach to posterior contraction in nonparametric inverse problems

    NARCIS (Netherlands)

    Knapik, Bartek; Salomond, Jean Bernard

    In this paper, we propose a general method to derive an upper bound for the contraction rate of the posterior distribution for nonparametric inverse problems. We present a general theorem that allows us to derive contraction rates for the parameter of interest from contraction rates of the related

  10. Non-parametric analysis of production efficiency of poultry egg ...

    African Journals Online (AJOL)

    Non-parametric analysis of production efficiency of poultry egg farmers in Delta ... analysis of factors affecting the output of poultry farmers showed that stock ... should be put in place for farmers to learn the best farm practices carried out on the ...

  11. Physical Activity, Sleep, and BMI Percentile in Rural and Urban Ugandan Youth.

    Science.gov (United States)

    Christoph, Mary J; Grigsby-Toussaint, Diana S; Baingana, Rhona; Ntambi, James M

    Uganda is experiencing a dual burden of over- and undernutrition, with overweight prevalence increasing while underweight remains common. Potential weight-related factors, particularly physical activity, sleep, and rural/urban status, are not currently well understood or commonly assessed in Ugandan youth. The purpose of this study was to pilot test a survey measuring weight-related factors in rural and urban Ugandan schoolchildren. A cross-sectional survey measured sociodemographics, physical activity, sleep patterns, and dietary factors in 148 rural and urban schoolchildren aged 11-16 in central Uganda. Height and weight were objectively measured. Rural and urban youth were compared on these factors using χ² and t tests. Regression was used to identify correlates of higher body mass index (BMI) percentile in the full sample and nonstunted youth. Youth were on average 12.1 ± 1.1 years old; underweight (10%) was more common than overweight (1.4%). Self-reported sleep duration and subjective sleep quality did not differ by rural/urban residence. Rural children overall had higher BMI percentile and marginally higher stunting prevalence. In adjusted analyses in both the full and nonstunted samples, higher BMI percentile was related to living in a rural area, higher frequency of physical activity, and higher subjective sleep quality; it was negatively related to being active on weekends. In the full sample, higher BMI percentile was also related to female gender, whereas in nonstunted youth, higher BMI was related to age. BMI percentile was unrelated to sedentary time, performance of active chores and sports, and dietary factors. This study is one of the first to pilot test a survey assessing weight-related factors, particularly physical activity and sleep, in Ugandan schoolchildren. BMI percentile was related to several sociodemographic, sleep, and physical activity factors among primarily normal-weight school children in Uganda, providing a basis for

  12. Low-rank coal research

    Energy Technology Data Exchange (ETDEWEB)

    Weber, G. F.; Laudal, D. L.

    1989-01-01

    This work is a compilation of reports on ongoing research at the University of North Dakota. Topics include: Control Technology and Coal Preparation Research (SOx/NOx control, waste management), Advanced Research and Technology Development (turbine combustion phenomena, combustion inorganic transformation, coal/char reactivity, liquefaction reactivity of low-rank coals, gasification ash and slag characterization, fine particulate emissions), Combustion Research (fluidized bed combustion, beneficiation of low-rank coals, combustion characterization of low-rank coal fuels, diesel utilization of low-rank coals), Liquefaction Research (low-rank coal direct liquefaction), and Gasification Research (hydrogen production from low-rank coals, advanced wastewater treatment, mild gasification, color and residual COD removal from Synfuel wastewaters, Great Plains Gasification Plant, gasifier optimization).

  13. Ranking Specific Sets of Objects.

    Science.gov (United States)

    Maly, Jan; Woltran, Stefan

    2017-01-01

    Ranking sets of objects based on an order between the single elements has been thoroughly studied in the literature. In particular, it has been shown that it is in general impossible to find a total ranking, jointly satisfying properties such as dominance and independence, on the whole power set of objects. However, in many applications certain elements of the power set might not be required and can be neglected in the ranking process. For instance, certain sets might be ruled out due to hard constraints or because they do not satisfy some background theory. In this paper, we treat the computational problem of whether an order satisfying different variants of dominance and independence can be found on a given subset of the power set of elements, given a ranking on the elements. We show that this problem is tractable for partial rankings and NP-complete for total rankings.

  14. Wikipedia ranking of world universities

    Science.gov (United States)

    Lages, José; Patt, Antoine; Shepelyansky, Dima L.

    2016-03-01

    We use the directed networks between articles of 24 Wikipedia language editions to produce the Wikipedia Ranking of World Universities (WRWU) using the PageRank, 2DRank, and CheiRank algorithms. This approach makes it possible to incorporate various cultural views on world universities using a mathematical statistical analysis independent of cultural preferences. The Wikipedia ranking of the top 100 universities has about 60% overlap with the Shanghai university ranking, demonstrating the reliable features of this approach. At the same time, WRWU incorporates all knowledge accumulated in 24 Wikipedia editions, giving stronger highlights for historically important universities and leading to a different estimation of the efficiency of world countries in university education. The historical development of university ranking is analyzed over ten centuries of their history.

  15. Relationships between walking and percentiles of adiposity in older and younger men

    Energy Technology Data Exchange (ETDEWEB)

    Williams, Paul T.

    2005-06-01

    To assess the relationship of weekly walking distance to percentiles of adiposity in elders (age ≥ 75 years), seniors (55 ≤ age < 75 years), middle-age men (35 ≤ age < 55 years), and younger men (18 ≤ age < 35 years old). Cross-sectional analyses of baseline questionnaires from 7,082 male participants of the National Walkers Health Study. The walkers' BMIs were inversely and significantly associated with walking distance (kg/m² per km/wk) in elders (slope ± SE: -0.032 ± 0.008), seniors (-0.045 ± 0.005), and middle-aged men (-0.037 ± 0.007), as were their waist circumferences (-0.091 ± 0.025, -0.045 ± 0.005, and -0.091 ± 0.015 cm per km/wk, respectively), and these slopes remained significant when adjusted statistically for reported weekly servings of meat, fish, fruit, and alcohol. The declines in BMI associated with walking distance were greater at the higher than lower percentiles of the BMI distribution. Specifically, compared to the decline at the 10th BMI percentile, the decline in BMI at the 90th percentile was 5.1-fold greater in elders, 5.9-fold greater in seniors, and 6.7-fold greater in middle-age men. The declines in waist circumference associated with walking distance were also greater among men with broader waistlines. Exercise-induced weight loss (or self-selection) causes an inverse relationship between adiposity and walking distance in men 35 and older that is substantially greater among fatter men.

  16. Smoothed Body Composition Percentiles Curves for Mexican Children Aged 6 to 12 Years

    Directory of Open Access Journals (Sweden)

    Melchor Alpizar

    2017-12-01

    Overweight children and childhood obesity are a public health problem in Mexico. Obesity is traditionally assessed using body mass index (BMI), but an excess of adiposity does not necessarily reflect a high BMI. Thus, body composition indexes are a better alternative. Our objective was to generate body composition percentile curves in children from Mexico City. A total of 2026 boys and 1488 girls aged 6 to 12 years old were studied in Mexico City. Body weight and height were measured, and BMI was calculated. Total body fat percentage (TBFP) was derived from the skinfold thicknesses, and fat mass (FMI) and fat-free mass (FFMI) indexes were calculated. Finally, age- and gender-specific smoothed percentile curves were generated with Cole's Lambda, Mu, and Sigma (LMS) method. In general, height, weight, waist circumference (WC), and TBFP were higher in boys, but FFM was higher in girls. TBFP appeared to increase significantly between ages 8 and 9 in boys (+2.9%) and between ages 10 and 11 in girls (+1.2%). In contrast, FFM% decreased noticeably between ages 8 and 9 until 12 years old in boys and girls. FMI values peaked in boys at age 12 (P97 = 14.1 kg/m²) and in girls at age 11 (P97 = 8.8 kg/m²). FFMI percentiles increased steadily, reaching a peak at age 12 in boys and girls. Smoothed body composition percentiles showed a different pattern in boys and girls. The use of TBFP, FMI, and FFMI along with BMI provides valuable information in epidemiological, nutritional, and clinical research.
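
    As an illustration of how LMS parameters translate into percentile values, the sketch below uses the standard Box-Cox relation of the LMS method, C = M(1 + L·S·z)^(1/L) for L ≠ 0 and C = M·exp(S·z) for L = 0, together with its inverse for z-scores. The function names and the numeric L, M, S values in the usage note are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist


def lms_centile(L, M, S, p):
    """Measurement value at percentile p (0 < p < 1) for given LMS parameters.

    L = Box-Cox power, M = median, S = coefficient of variation.
    """
    z = NormalDist().inv_cdf(p)  # standard normal quantile for p
    if abs(L) < 1e-12:           # limiting case L -> 0
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


def lms_zscore(y, L, M, S):
    """z-score of a measurement y for given LMS parameters (inverse of lms_centile)."""
    if abs(L) < 1e-12:
        return math.log(y / M) / S
    return ((y / M) ** L - 1.0) / (L * S)
```

    For example, with assumed parameters L = -1.2, M = 17.0 and S = 0.11 for one age and sex group, `lms_centile(-1.2, 17.0, 0.11, 0.97)` gives the 97th percentile, and `lms_zscore` maps that value back to the corresponding z-score.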

  17. [Physical activity patterns of school adolescents: Validity, reliability and percentiles proposal for their evaluation].

    Science.gov (United States)

    Cossío Bolaños, Marco; Méndez Cornejo, Jorge; Luarte Rocha, Cristian; Vargas Vitoria, Rodrigo; Canqui Flores, Bernabé; Gomez Campos, Rossana

    2017-02-01

    Regular physical activity (PA) during childhood and adolescence is important for the prevention of non-communicable diseases and their risk factors. The objectives were to validate a questionnaire for measuring patterns of PA, verify its reliability, compare the levels of PA aligned by chronological and biological age, and develop percentile curves to assess PA levels according to biological maturation. A descriptive cross-sectional study was performed on a non-probabilistic quota sample of 3,176 Chilean adolescents (1,685 males and 1,491 females) aged 10.0 to 18.9 years. Weight, standing height, and sitting height were measured. Biological age (through years from peak growth rate) and chronological age in years were determined. Body mass index was calculated and a PA survey was applied. The LMS method was used to develop percentiles. The values for the confirmatory analysis showed saturations between 0.517 and 0.653. The Kaiser-Meyer-Olkin (KMO) measure of adequacy was 0.879, with 70.8% of the variance explained. Cronbach's alpha values ranged from 0.81 to 0.86. There were differences between the genders when aligned by chronological age. There were no differences when aligned by biological age. Percentiles are proposed to classify the PA of adolescents of both genders according to biological age and sex. The questionnaire used was valid and reliable, and PA should be evaluated by biological age. These findings led to the development of percentiles to assess PA according to biological age and gender.

  18. Defect Detection of Steel Surfaces with Global Adaptive Percentile Thresholding of Gradient Image

    Science.gov (United States)

    Neogi, Nirbhar; Mohanta, Dusmanta K.; Dutta, Pranab K.

    2017-12-01

    Steel strips are used extensively for white goods, auto bodies and other purposes where surface defects are not acceptable. On-line surface inspection systems can effectively detect and classify defects and help in taking corrective actions. For defect detection, gradients are widely used to highlight and subsequently segment areas of interest in a surface inspection system. Most of the time, segmentation by a fixed value threshold leads to unsatisfactory results. As defects can be both very small and large in size, segmentation of a gradient image based on percentile thresholding can lead to inadequate or excessive segmentation of defective regions. A global adaptive percentile thresholding of the gradient image has been formulated for blister defects and water deposits (a pseudo defect) in steel strips. The developed method adaptively changes the percentile value used for thresholding depending on the number of pixels above some specific gray-level values of the gradient image. The method is able to segment defective regions selectively, preserving the characteristics of defects irrespective of their size. The developed method performs better than the Otsu method of thresholding and an adaptive thresholding method based on local properties.
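
    The idea of letting the percentile depend on how many gradient pixels exceed a gray-level value can be sketched as follows. This is only an illustration under assumed choices, not the authors' formulation: the forward-difference gradient, the linear rule that lowers the percentile as the fraction of strong-gradient pixels grows, and the parameters `strong_level`, `k`, `min_q`, `max_q` and `min_gradient` are all hypothetical.

```python
import math


def gradient_magnitude(img):
    """Forward-difference gradient magnitude (|dx| + |dy|) of a 2-D list of gray levels."""
    h, w = len(img), len(img[0])
    grad = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            dx = img[r][c + 1] - img[r][c] if c + 1 < w else 0
            dy = img[r + 1][c] - img[r][c] if r + 1 < h else 0
            grad[r][c] = abs(dx) + abs(dy)
    return grad


def nearest_rank_percentile(values, q):
    """q-th percentile (0-100) of values using the nearest-rank rule."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[min(rank, len(ordered)) - 1]


def adaptive_percentile_mask(grad, strong_level=50, k=0.5,
                             min_q=80.0, max_q=99.5, min_gradient=1):
    """Threshold a gradient image at a percentile that adapts to its content.

    Assumed rule: the more pixels at or above strong_level, the lower the
    percentile, so that large defects are not under-segmented.
    """
    flat = [g for row in grad for g in row]
    strong_fraction = sum(1 for g in flat if g >= strong_level) / len(flat)
    q = min(max_q, max(min_q, 100.0 * (1.0 - k * strong_fraction)))
    # min_gradient keeps a perfectly flat image from being segmented at all.
    threshold = max(nearest_rank_percentile(flat, q), min_gradient)
    mask = [[g >= threshold for g in row] for row in grad]
    return q, threshold, mask
```

    With this rule a small bright patch is thresholded at a high percentile and a large patch at a lower one, so in both cases the mask follows the defect boundary rather than a fixed share of the image.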

  19. Dynamic Contrast-Enhanced MRI of Cervical Cancers: Temporal Percentile Screening of Contrast Enhancement Identifies Parameters for Prediction of Chemoradioresistance

    International Nuclear Information System (INIS)

    Andersen, Erlend K.F.; Hole, Knut Håkon; Lund, Kjersti V.; Sundfør, Kolbein; Kristensen, Gunnar B.; Lyng, Heidi; Malinen, Eirik

    2012-01-01

    Purpose: To systematically screen the tumor contrast enhancement of locally advanced cervical cancers to assess the prognostic value of two descriptive parameters derived from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). Methods and Materials: This study included a prospectively collected cohort of 81 patients who underwent DCE-MRI with gadopentetate dimeglumine before chemoradiotherapy. The following descriptive DCE-MRI parameters were extracted voxel by voxel and presented as histograms for each time point in the dynamic series: normalized relative signal increase (nRSI) and normalized area under the curve (nAUC). The first to 100th percentiles of the histograms were included in a log-rank survival test, resulting in p value and relative risk maps of all percentile–time intervals for each DCE-MRI parameter. The maps were used to evaluate the robustness of the individual percentile–time pairs and to construct prognostic parameters. Clinical endpoints were locoregional control and progression-free survival. The study was approved by the institutional ethics committee. Results: The p value maps of nRSI and nAUC showed a large continuous region of percentile–time pairs that were significantly associated with locoregional control (p < 0.05). These parameters had prognostic impact independent of tumor stage, volume, and lymph node status on multivariate analysis. Only a small percentile–time interval of nRSI was associated with progression-free survival. Conclusions: The percentile–time screening identified DCE-MRI parameters that predict long-term locoregional control after chemoradiotherapy of cervical cancer.

  20. Ranking nodes in growing networks: When PageRank fails

    Science.gov (United States)

    Mariani, Manuel Sebastian; Medo, Matúš; Zhang, Yi-Cheng

    2015-11-01

    PageRank is arguably the most popular ranking algorithm and is applied in real systems ranging from information to biological and infrastructure networks. Despite its outstanding popularity and broad use in different areas of science, the relation between the algorithm's efficacy and properties of the network on which it acts has not yet been fully understood. We study here PageRank's performance on a network model supported by real data, and show that realistic temporal effects make PageRank fail in individuating the most valuable nodes for a broad range of model parameters. Results on real data are in qualitative agreement with our model-based findings. This failure of PageRank reveals that the static approach to information filtering is inappropriate for a broad class of growing systems, and suggests that time-dependent algorithms that are based on the temporal linking patterns of these systems are needed to better rank the nodes.
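
    For reference, the static PageRank score discussed here can be computed by power iteration, as in the minimal sketch below. It is the textbook algorithm with uniform teleportation and uniform redistribution of dangling-node mass, not the network model or the time-dependent alternatives studied in the paper; the function name and the toy graph in the usage note are assumptions.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank.

    links maps each node to the list of nodes it points to.
    Returns a dict of scores that sum to 1.
    """
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Mass held by nodes without outgoing links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            for v in targets:
                new[v] += damping * rank[u] / len(targets)
        converged = sum(abs(new[u] - rank[u]) for u in nodes) < tol
        rank = new
        if converged:
            break
    return rank
```

    On a toy growing network in which each new node can only link to nodes that already exist, for example `pagerank({"new": ["mid", "old"], "mid": ["old"], "old": []})`, the oldest node receives the highest score regardless of any intrinsic quality, which gives a feel for the kind of temporal effect the abstract refers to.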

  1. Testing for constant nonparametric effects in general semiparametric regression models with interactions

    KAUST Repository

    Wei, Jiawei; Carroll, Raymond J.; Maity, Arnab

    2011-01-01

    We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work

  2. Percentile reference values for anthropometric body composition indices in European children from the IDEFICS study.

    Science.gov (United States)

    Nagy, P; Kovacs, E; Moreno, L A; Veidebaum, T; Tornaritis, M; Kourides, Y; Siani, A; Lauria, F; Sioen, I; Claessens, M; Mårild, S; Lissner, L; Bammann, K; Intemann, T; Buck, C; Pigeot, I; Ahrens, W; Molnár, D

    2014-09-01

    To characterise the nutritional status in children with obesity or wasting conditions, European anthropometric reference values for body composition measures beyond the body mass index (BMI) are needed. Differentiated assessment of body composition in children has long been hampered by the lack of appropriate references. The aim of our study is to provide percentiles for body composition indices in normal weight European children, based on the IDEFICS cohort (Identification and prevention of Dietary- and lifestyle-induced health Effects in Children and infantS). Overall, 18,745 2.0-10.9-year-old children from eight countries participated in the study. Children classified as overweight/obese or underweight according to IOTF (N=5915) were excluded from the analysis. Anthropometric measurements (BMI (N=12,830); triceps, subscapular, fat mass and fat mass index (N=11,845-11,901); biceps, suprailiac skinfolds, sum of skinfolds calculated from skinfold thicknesses (N=8129-8205), neck circumference (N=12,241); waist circumference and waist-to-height ratio (N=12,381)) were analysed stratified by sex and smoothed 1st, 3rd, 10th, 25th, 50th, 75th, 90th, 97th and 99th percentile curves were calculated using GAMLSS. Percentile values of the most important anthropometric measures related to the degree of adiposity are depicted for European girls and boys. Age- and sex-specific differences were investigated for all measures. As an example, the 50th and 99th percentile values of waist circumference ranged from 50.7-59.2 cm and from 51.3-58.7 cm in 4.5- to <5.0-year-old girls and boys, respectively, to 60.6-74.5 cm in girls and to 59.9-76.7 cm in boys at the age of 10.5-10.9 years. The presented percentile curves may aid a differentiated assessment of total and abdominal adiposity in European children.

  3. PageRank tracker: from ranking to tracking.

    Science.gov (United States)

    Gong, Chen; Fu, Keren; Loza, Artur; Wu, Qiang; Liu, Jia; Yang, Jie

    2014-06-01

    Video object tracking is widely used in many real-world applications, and it has been extensively studied for over two decades. However, tracking robustness is still an issue in most existing methods, due to the difficulties with adaptation to environmental or target changes. In order to improve adaptability, this paper formulates the tracking process as a ranking problem, and the PageRank algorithm, which is a well-known webpage ranking algorithm used by Google, is applied. Labeled and unlabeled samples in tracking application are analogous to query webpages and the webpages to be ranked, respectively. Therefore, determining the target is equivalent to finding the unlabeled sample that is the most associated with existing labeled set. We modify the conventional PageRank algorithm in three aspects for tracking application, including graph construction, PageRank vector acquisition and target filtering. Our simulations with the use of various challenging public-domain video sequences reveal that the proposed PageRank tracker outperforms mean-shift tracker, co-tracker, semiboosting and beyond semiboosting trackers in terms of accuracy, robustness and stability.

  4. Nonparametric Regression Estimation for Multivariate Null Recurrent Processes

    Directory of Open Access Journals (Sweden)

    Biqing Cai

    2015-04-01

    This paper discusses nonparametric kernel regression with the regressor being a \(d\)-dimensional \(\beta\)-null recurrent process in presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate \(\sqrt{n(T)h^{d}}\), where \(n(T)\) is the number of regenerations for a \(\beta\)-null recurrent process and the limiting distribution (with proper normalization) is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross validation method is used for bandwidth selection. We apply the proposed method to study the relationship of Federal funds rate with 3-month and 5-year T-bill rates and discover the existence of nonlinearity of the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than the linear model.
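
    Leave-one-out cross validation for bandwidth selection can be illustrated with a plain Nadaraya-Watson estimator. The sketch below assumes a scalar regressor, a Gaussian kernel and a user-supplied grid of candidate bandwidths; it does not reproduce the paper's multivariate null recurrent setting or its two-step volatility estimator, and all function names are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Unnormalised Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def nw_estimate(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate of E[Y | X = x0]; observation `skip` is left out."""
    num = den = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = gaussian_kernel((x0 - x) / h)
        num += w * y
        den += w
    return num / den if den > 0.0 else float("nan")


def loo_cv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    total = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        pred = nw_estimate(x, xs, ys, h, skip=i)
        if math.isnan(pred):  # no neighbours carry weight: reject this bandwidth
            return math.inf
        total += (y - pred) ** 2
    return total / len(xs)


def select_bandwidth(xs, ys, candidates):
    """Candidate bandwidth with the smallest leave-one-out score."""
    return min(candidates, key=lambda h: loo_cv_score(xs, ys, h))
```

    A call such as `select_bandwidth(xs, ys, [0.05, 0.2, 1.0, 5.0])` then returns the grid value whose leave-one-out prediction error is smallest.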

  5. Nonparametric instrumental regression with non-convex constraints

    International Nuclear Information System (INIS)

    Grasmair, M; Scherzer, O; Vanhems, A

    2013-01-01

    This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition.

  6. Nonparametric instrumental regression with non-convex constraints

    Science.gov (United States)

    Grasmair, M.; Scherzer, O.; Vanhems, A.

    2013-03-01

    This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition.

  7. Comparing nonparametric Bayesian tree priors for clonal reconstruction of tumors.

    Science.gov (United States)

    Deshwar, Amit G; Vembu, Shankar; Morris, Quaid

    2015-01-01

    Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular to infer clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, to infer the phylogeny and genotype of major subclonal lineages represented in the population of cancer cells. We also propose new split-merge updates tailored to the subclonal reconstruction problem that improve the mixing time of Markov chains. In comparisons with the tree-structured stick breaking prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor…

  8. Single versus mixture Weibull distributions for nonparametric satellite reliability

    International Nuclear Information System (INIS)

    Castet, Jean-Francois; Saleh, Joseph H.

    2010-01-01

    Long recognized as a critical design attribute for space systems, satellite reliability has not yet received the proper attention as limited on-orbit failure data and statistical analyses can be found in the technical literature. To fill this gap, we recently conducted a nonparametric analysis of satellite reliability for 1584 Earth-orbiting satellites launched between January 1990 and October 2008. In this paper, we provide an advanced parametric fit, based on mixture of Weibull distributions, and compare it with the single Weibull distribution model obtained with the Maximum Likelihood Estimation (MLE) method. We demonstrate that both parametric fits are good approximations of the nonparametric satellite reliability, but that the mixture Weibull distribution provides significant accuracy in capturing all the failure trends in the failure data, as evidenced by the analysis of the residuals and their quasi-normal dispersion.

  9. Seismic Signal Compression Using Nonparametric Bayesian Dictionary Learning via Clustering

    Directory of Open Access Journals (Sweden)

    Xin Tian

    2017-06-01

    We introduce a seismic signal compression method based on nonparametric Bayesian dictionary learning via clustering. The seismic data are compressed patch by patch, and the dictionary is learned online. Clustering is introduced for dictionary learning: a set of dictionaries is generated, and each dictionary is used for one cluster's sparse coding. In this way, the signals in one cluster can be well represented by their corresponding dictionary. A nonparametric Bayesian dictionary learning method is used to learn the dictionaries, which naturally infers an appropriate dictionary size for each cluster. A uniform quantizer and an adaptive arithmetic coding algorithm are adopted to code the sparse coefficients. Experimental comparisons with other state-of-the-art approaches validate the effectiveness of the proposed method.

  10. Using non-parametric methods in econometric production analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    2012-01-01

    Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify a functional form of the production function of which the Cobb... parameter estimates, but also in biased measures which are derived from the parameters, such as elasticities. Therefore, we propose to use non-parametric econometric methods. First, these can be applied to verify the functional form used in parametric production analysis. Second, they can be directly used... by investigating the relationship between the elasticity of scale and the farm size. We use a balanced panel data set of 371 specialised crop farms for the years 2004-2007. A non-parametric specification test shows that neither the Cobb-Douglas function nor the Translog function is consistent with the "true...

  11. Nonparametric Bayesian models through probit stick-breaking processes.

    Science.gov (United States)

    Rodríguez, Abel; Dunson, David B

    2011-03-01

    We describe a novel class of Bayesian nonparametric priors based on stick-breaking constructions where the weights of the process are constructed as probit transformations of normal random variables. We show that these priors are extremely flexible, allowing us to generate a great variety of models while preserving computational simplicity. Particular emphasis is placed on the construction of rich temporal and spatial processes, which are applied to two problems in finance and ecology.
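
    The probit stick-breaking construction described above can be sketched as follows: each stick proportion is the standard normal CDF of a normal random variable, and the k-th weight is that proportion times the stick length left over by the earlier breaks, w_k = Φ(α_k) ∏_{j<k} (1 − Φ(α_j)). The truncation to a finite number of atoms, the default mean and standard deviation, and the function names are assumptions made for illustration only.

```python
import random
from statistics import NormalDist


def probit_stick_breaking_weights(alphas):
    """Weights w_k = Phi(alpha_k) * prod_{j<k} (1 - Phi(alpha_j)).

    The construction is truncated at len(alphas) atoms: the last atom takes
    whatever stick length remains, so its own alpha is not used.
    """
    phi = NormalDist().cdf
    weights, remaining = [], 1.0
    for a in alphas[:-1]:
        v = phi(a)                    # probit-transformed stick proportion
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


def sample_weights(k, mu=0.0, sigma=1.0, seed=0):
    """Draw k weights with alpha_j ~ Normal(mu, sigma^2) (assumed prior)."""
    rng = random.Random(seed)
    return probit_stick_breaking_weights([rng.gauss(mu, sigma) for _ in range(k)])
```

    Because the weights depend on the data only through normal random variables, temporal or spatial dependence can be introduced by replacing the independent draws with a correlated Gaussian process, which is the flexibility the abstract emphasises.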

  12. Non-parametric estimation of the individual's utility map

    OpenAIRE

    Noguchi, Takao; Sanborn, Adam N.; Stewart, Neil

    2013-01-01

    Models of risky choice have attracted much attention in behavioural economics. Previous research has repeatedly demonstrated that individuals' choices are not well explained by expected utility theory, and a number of alternative models have been examined using carefully selected sets of choice alternatives. The model performance, however, can depend on which choice alternatives are being tested. Here we develop a non-parametric method for estimating the utility map over the wide range of choi...

  13. Nonparametric Efficiency Testing of Asian Stock Markets Using Weekly Data

    OpenAIRE

    CORNELIS A. LOS

    2004-01-01

    The efficiency of speculative markets, as represented by Fama's 1970 fair game model, is tested on weekly price index data of six Asian stock markets - Hong Kong, Indonesia, Malaysia, Singapore, Taiwan and Thailand - using Sherry's (1992) non-parametric methods. These scientific testing methods were originally developed to analyze the information processing efficiency of nervous systems. In particular, the stationarity and independence of the price innovations are tested over ten years, from ...

  14. Investigation of MLE in nonparametric estimation methods of reliability function

    International Nuclear Information System (INIS)

    Ahn, Kwang Won; Kim, Yoon Ik; Chung, Chang Hyun; Kim, Kil Yoo

    2001-01-01

    There have been many attempts to estimate a reliability function. At the 20th ESReDA seminar, a new nonparametric method was proposed. The major point of that paper is how to use censored data efficiently. Generally, there are three approaches to estimating a reliability function nonparametrically: the Reduced Sample Method, the Actuarial Method, and the Product-Limit (PL) Method. These three methods have some limitations, so we suggest an advanced method that reflects censored information more efficiently. In many instances there will be a unique maximum likelihood estimator (MLE) of an unknown parameter, and often it may be obtained by differentiation. It is well known that the three methods generally used to estimate a reliability function nonparametrically have maximum likelihood estimators that exist uniquely. The MLE of the new method is therefore derived in this study. The procedure for calculating the MLE is similar to that of the PL estimator. The difference between the two is that in the new method the mass (or weight) of each observation is influenced by the others, whereas the mass in the PL estimator is not.
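
    For orientation, the Product-Limit (Kaplan-Meier) estimator that the new method is compared with can be written in a few lines. The sketch below is the standard PL estimator for right-censored data, not the new estimator proposed in the paper; the function name and the data in the usage note are made up.

```python
def product_limit(times, observed):
    """Product-Limit estimate of the reliability function.

    times[i] is a failure or censoring time; observed[i] is True for a
    failure and False for a right-censored unit.
    Returns [(t, R(t))] at each distinct failure time.
    """
    data = sorted(zip(times, observed))
    n_at_risk = len(data)
    reliability = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = censored = 0
        while i < len(data) and data[i][0] == t:  # group ties at time t
            if data[i][1]:
                failures += 1
            else:
                censored += 1
            i += 1
        if failures:
            reliability *= 1.0 - failures / n_at_risk
            curve.append((t, reliability))
        n_at_risk -= failures + censored  # censored units only leave the risk set
    return curve
```

    For instance, `product_limit([1, 2, 3, 4, 5], [True, False, True, True, False])` steps down at the failure times 1, 3 and 4, while the censored units at times 2 and 5 only shrink the number at risk.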

  15. Universal scaling in sports ranking

    International Nuclear Information System (INIS)

    Deng Weibing; Li Wei; Cai Xu; Bulou, Alain; Wang Qiuping A

    2012-01-01

    Ranking is a ubiquitous phenomenon in human society. On the web pages of Forbes, one may find all kinds of rankings, such as the world's most powerful people, the world's richest people, the highest-earning tennis players, and so on and so forth. Herewith, we study a specific kind: sports ranking systems in which players' scores and/or prize money are accrued based on their performances in different matches. By investigating 40 data samples which span 12 different sports, we find that the distributions of scores and/or prize money follow universal power laws, with exponents nearly identical for most sports. In order to understand the origin of this universal scaling we focus on the tennis ranking systems. By checking the data we find that, for any pair of players, the probability that the higher-ranked player tops the lower-ranked opponent is proportional to the rank difference between the pair. Such a dependence can be well fitted to a sigmoidal function. By using this feature, we propose a simple toy model which can simulate the competition of players in different matches. The simulations yield results consistent with the empirical findings. Extensive simulation studies indicate that the model is quite robust with respect to the modifications of some parameters.

  16. Physical Fitness Percentiles of German Children Aged 9-12 Years: Findings from a Longitudinal Study.

    Directory of Open Access Journals (Sweden)

    Kathleen Golle

    Generating percentile values is helpful for the identification of children with specific fitness characteristics (i.e., low or high fitness level) to set appropriate fitness goals (i.e., fitness/health promotion and/or long-term youth athlete development). Thus, the aim of this longitudinal study was to assess physical fitness development in healthy children aged 9-12 years and to compute sex- and age-specific percentile values. Two hundred and forty children (88 girls, 152 boys) participated in this study and were tested for their physical fitness. Physical fitness was assessed using the 50-m sprint test (i.e., speed), the 1-kg ball push test, the triple hop test (i.e., upper- and lower-extremity muscular power), the stand-and-reach test (i.e., flexibility), the star run test (i.e., agility), and the 9-min run test (i.e., endurance). Age- and sex-specific percentile values (i.e., P10 to P90) were generated using the Lambda, Mu, and Sigma method. Adjusted (for change in body weight, height, and baseline performance) age- and sex-differences as well as the interactions thereof were expressed by calculating effect sizes (Cohen's d). Significant main effects of Age were detected for all physical fitness tests (d = 0.40-1.34), whereas significant main effects of Sex were found for upper-extremity muscular power (d = 0.55), flexibility (d = 0.81), agility (d = 0.44), and endurance (d = 0.32) only. Further, significant Sex by Age interactions were observed for upper-extremity muscular power (d = 0.36), flexibility (d = 0.61), and agility (d = 0.27) in favor of girls. Both linear and curvilinear shaped curves were found for percentile values across the fitness tests. Accelerated (curvilinear) improvements were observed for upper-extremity muscular power (boys: 10-11 yrs; girls: 9-11 yrs), agility (boys: 9-10 yrs; girls: 9-11 yrs), and endurance (boys: 9-10 yrs; girls: 9-10 yrs). Tabulated percentiles for the 9-min run test indicated that running distances between 1

  17. Influence of population selection on the 99th percentile reference value for cardiac troponin assays.

    Science.gov (United States)

    Collinson, Paul O; Heung, Yen Ming; Gaze, David; Boa, Frances; Senior, Roxy; Christenson, Robert; Apple, Fred S

    2012-01-01

    We sought to determine the effect of patient selection on the 99th reference percentile of 2 sensitive and 1 high-sensitivity (hs) cardiac troponin assays in a well-defined reference population. Individuals >45 years old were randomly selected from 7 representative local community practices. Detailed information regarding the participants was collected via questionnaires. The healthy reference population was defined as individuals who had no history of vascular disease, hypertension, or heavy alcohol intake; were not receiving cardiac medication; and had blood pressure [...] 60 mL·min⁻¹·(1.73 m²)⁻¹, and normal cardiac function according to results of echocardiography. Samples were stored at -70 °C until analysis for cardiac troponin I (cTnI) and cardiac troponin T (cTnT) and N-terminal pro-B-type natriuretic peptide. Application of progressively more stringent population selection strategies to the initial baseline population of 545 participants until the only individuals who remained were completely healthy according to the study criteria reduced the number of outliers seen and led to a progressive decrease in the 99th-percentile value obtained for the Roche hs-cTnT assay and the sensitive Beckman cTnI assay but not for the sensitive Siemens Ultra cTnI assay. Furthermore, a sex difference found in the baseline population for the hs-cTnT (P=0.0018) and Beckman cTnI assays (P [...] strategy significantly influenced the 99th percentile reference values determined for troponin assays and the observed sex differences in troponin concentrations.
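
    A rank-based (nonparametric) percentile of the kind used for reference limits can be sketched as follows. The interpolation rule (position p·(n+1) in the sorted sample), the function name `rank_percentile` and the data are assumptions for illustration; the abstract does not say which percentile definition the authors used.

```python
def rank_percentile(values, p):
    """Nonparametric percentile from ranks, with 0 < p < 1.

    Uses the position h = p * (n + 1) in the sorted sample and
    interpolates linearly between neighbouring order statistics.
    Positions outside the sample are clamped to the extremes.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("empty sample")
    h = p * (n + 1)
    if h <= 1:
        return xs[0]
    if h >= n:
        return xs[-1]
    lower = int(h)          # 1-based rank just below the position
    frac = h - lower
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


# Hypothetical reference sample: the integers 1..199.
sample = list(range(1, 200))
p99 = rank_percentile(sample, 0.99)
# Adding one extreme value raises the upper reference limit.
p99_with_outlier = rank_percentile(sample + [10_000], 0.99)
```

    Because the 99th percentile is set by the few largest observations, a single extreme value moves it, which is consistent with the abstract's finding that stricter selection (fewer outliers) lowered the limit for two of the assays.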

  18. Evaluation of world's largest social welfare scheme: An assessment using non-parametric approach.

    Science.gov (United States)

    Singh, Sanjeet

    2016-08-01

    The Mahatma Gandhi National Rural Employment Guarantee Act (MGNREGA) is the world's largest social welfare scheme, aimed at poverty alleviation in India through rural employment generation. This paper aims to evaluate and rank the performance of the states in India under the MGNREGA scheme. A non-parametric approach, Data Envelopment Analysis (DEA), is used to calculate the overall technical, pure technical, and scale efficiencies of states in India. The sample data is drawn from the annual official reports published by the Ministry of Rural Development, Government of India. Based on three selected input parameters (expenditure indicators) and five output parameters (employment generation indicators), I apply both input- and output-oriented DEA models to estimate how well the states utilize their resources and generate outputs during the financial year 2013-14. The relative performance evaluation has been made under the assumption of constant returns and also under variable returns to scale to assess the impact of scale on performance. The results indicate that the main sources of inefficiency are both the technical and the managerial practices adopted. Eleven states are overall technically efficient and operate at the optimum scale, whereas 18 states are pure technically or managerially efficient. It has been found that for some states it is necessary to alter scheme size to perform on a par with the best performing states. For inefficient states, optimal input and output targets along with the resource savings and output gains are calculated. Analysis shows that if all inefficient states operated at optimal input and output levels, on average 17.89% of total expenditure and a total amount of $780 million could have been saved in a single year. Most of the inefficient states perform poorly when it comes to the participation of women and disadvantaged sections (SC&ST) in the scheme. In order to catch up with the performance of best performing states, inefficient states on an average need to enhance

  19. PageRank of integers

    International Nuclear Information System (INIS)

    Frahm, K M; Shepelyansky, D L; Chepelianskii, A D

    2012-01-01

    We build up a directed network tracing links from a given integer to its divisors and analyze the properties of the Google matrix of this network. The PageRank vector of this matrix is computed numerically and it is shown that its probability is approximately inversely proportional to the PageRank index, thus being similar to the Zipf law and the dependence established for the World Wide Web. The spectrum of the Google matrix of integers is characterized by a large gap and a relatively small number of nonzero eigenvalues. A simple semi-analytical expression for the PageRank of integers is derived that allows us to find this vector for matrices of billion size. This network provides a new PageRank order of integers. (paper)
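
    The construction described above can be sketched on a small scale in Python. The conventions here are assumptions, not taken from the paper: links go from n to divisors m with 1 < m < n, integers with no such divisor (1 and the primes) are treated as dangling nodes whose weight is spread uniformly, and the damping factor is 0.85. The PageRank vector is obtained by plain power iteration.

```python
def divisor_links(n_max):
    """links[n] = divisors m of n with 1 < m < n (assumed convention)."""
    links = {n: [] for n in range(1, n_max + 1)}
    for m in range(2, n_max + 1):
        for n in range(2 * m, n_max + 1, m):
            links[n].append(m)
    return links


def pagerank(links, alpha=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for the PageRank vector of a directed graph."""
    nodes = sorted(links)
    n = len(nodes)
    p = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Weight sitting on nodes without outgoing links.
        dangling = sum(p[v] for v in nodes if not links[v])
        base = (1.0 - alpha) / n + alpha * dangling / n
        new = {v: base for v in nodes}
        for v in nodes:
            out = links[v]
            if out:
                share = alpha * p[v] / len(out)
                for w in out:
                    new[w] += share
        diff = sum(abs(new[v] - p[v]) for v in nodes)
        p = new
        if diff < tol:
            break
    return p


ranks = pagerank(divisor_links(30))
# Integers ordered by decreasing PageRank probability.
order = sorted(ranks, key=ranks.get, reverse=True)
```

    For the integers up to 30 under these conventions, 2 comes first in the resulting order because every even composite links to it. The sketch stores the graph explicitly and so does not scale to the billion-size matrices the abstract mentions, for which the authors derive a semi-analytical expression.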

  20. Freudenthal ranks: GHZ versus W

    International Nuclear Information System (INIS)

    Borsten, L

    2013-01-01

    The Hilbert space of three-qubit pure states may be identified with a Freudenthal triple system. Every state has a unique Freudenthal rank ranging from 1 to 4, which is determined by a set of automorphism group covariants. It is shown here that the optimal success rates for winning a three-player non-local game, varying over all local strategies, are strictly ordered by the Freudenthal rank of the shared three-qubit resource. (paper)

  1. Ranking Queries on Uncertain Data

    CERN Document Server

    Hua, Ming

    2011-01-01

    Uncertain data is inherent in many important applications, such as environmental surveillance, market analysis, and quantitative economics research. Due to the importance of those applications and rapidly increasing amounts of uncertain data collected and accumulated, analyzing large collections of uncertain data has become an important task. Ranking queries (also known as top-k queries) are often natural and useful in analyzing uncertain data. Ranking Queries on Uncertain Data discusses the motivations/applications, challenging problems, the fundamental principles, and the evaluation algorith

  2. Ranking in evolving complex networks

    Science.gov (United States)

    Liao, Hao; Mariani, Manuel Sebastian; Medo, Matúš; Zhang, Yi-Cheng; Zhou, Ming-Yang

    2017-05-01

    Complex networks have emerged as a simple yet powerful framework to represent and analyze a wide range of complex systems. The problem of ranking the nodes and the edges in complex networks is critical for a broad range of real-world problems because it affects how we access online information and products, how success and talent are evaluated in human activities, and how scarce resources are allocated by companies and policymakers, among others. This calls for a deep understanding of how existing ranking algorithms perform, and of the possible biases that may impair their effectiveness. Many popular ranking algorithms (such as Google's PageRank) are static in nature and, as a consequence, they exhibit important shortcomings when applied to real networks that rapidly evolve in time. At the same time, recent advances in the understanding and modeling of evolving networks have enabled the development of a wide and diverse range of ranking algorithms that take the temporal dimension into account. The aim of this review is to survey the existing ranking algorithms, both static and time-aware, and their applications to evolving networks. We emphasize both the impact of network evolution on well-established static algorithms and the benefits from including the temporal dimension for tasks such as prediction of network traffic, prediction of future links, and identification of significant nodes.

  3. Positive School Climate Is Associated With Lower Body Mass Index Percentile Among Urban Preadolescents

    Science.gov (United States)

    Gilstad-Hayden, Kathryn; Carroll-Scott, Amy; Rosenthal, Lisa; Peters, Susan M.; McCaslin, Catherine; Ickovics, Jeannette R.

    2015-01-01

    BACKGROUND: Schools are an important environmental context in children’s lives and are part of the complex web of factors that contribute to childhood obesity. Increasingly, attention has been placed on the importance of school climate (connectedness, academic standards, engagement, and student autonomy) as 1 domain of school environment beyond health policies and education that may have implications for student health outcomes. The purpose of this study is to examine the association of school climate with body mass index (BMI) among urban preadolescents. METHODS: Health surveys and physical measures were collected among fifth- and sixth-grade students from 12 randomly selected public schools in a small New England city. School climate surveys were completed district-wide by students and teachers. Hierarchical linear modeling was used to test the association between students’ BMI and schools’ climate scores. RESULTS: After controlling for potentially confounding individual-level characteristics, a 1-unit increase in school climate score (indicating more positive climate) was associated with a 7-point decrease in students’ BMI percentile. CONCLUSIONS: Positive school climate is associated with lower student BMI percentile. More research is needed to understand the mechanisms behind this relationship and to explore whether interventions promoting positive school climate can effectively prevent and/or reduce obesity. PMID: 25040118

  4. Diagnostic performance of BMI percentiles to identify adolescents with metabolic syndrome.

    Science.gov (United States)

    Laurson, Kelly R; Welk, Gregory J; Eisenmann, Joey C

    2014-02-01

    To compare the diagnostic performance of the Centers for Disease Control and Prevention (CDC) and FITNESSGRAM (FGram) BMI standards for quantifying metabolic risk in youth. Adolescents in the NHANES (n = 3385) were measured for anthropometric variables and metabolic risk factors. BMI percentiles were calculated, and youth were categorized by weight status (using CDC and FGram thresholds). Participants were also categorized by presence or absence of metabolic syndrome. The CDC and FGram standards were compared by prevalence of metabolic abnormalities, various diagnostic criteria, and odds of metabolic syndrome. Receiver operating characteristic curves were also created to identify optimal BMI percentiles to detect metabolic syndrome. The prevalence of metabolic syndrome in obese youth was 19% to 35%, compared with <2% in the normal-weight groups. The odds of metabolic syndrome for obese boys and girls were 46 to 67 and 19 to 22 times greater, respectively, than for normal-weight youth. The receiver operating characteristic analyses identified optimal thresholds similar to the CDC standards for boys and the FGram standards for girls. Overall, BMI thresholds were more strongly associated with metabolic syndrome in boys than in girls. Both the CDC and FGram standards are predictive of metabolic syndrome. The diagnostic utility of the CDC thresholds outperformed the FGram values for boys, whereas FGram standards were slightly better thresholds for girls. The use of a common set of thresholds for school and clinical applications would provide advantages for public health and clinical research and practice.

  5. Physical fitness percentile charts for children aged 6-10 from Portugal.

    Science.gov (United States)

    Roriz De Oliveira, M S; Seabra, A; Freitas, D; Eisenmann, J C; Maia, J

    2014-12-01

    The present study aims (1) to provide reference percentile charts for the following measures of Physical Fitness (PF): the sit-and-reach, handgrip, standing long jump, 50 yards' dash, 4x10m shuttle run and 1-mile run/walk tests in children aged 6 to 10 years, and (2) to compare the performance of the Portuguese children with their age- and sex peers. A total of 3804 Portuguese children (1985 boys and 1819 girls) aged 6-10 years old participated in this study. The sample was stratified from 20 public elementary schools and children were randomly selected in each school. Charts were separately built for each sex using the LMS method. Boys showed better results than girls in handgrip, standing long jump, 50 yards' dash, 4x10 m shuttle run and 1-mile run/walk, while girls are better performers than boys in sit-and-reach. Age- and gender- percentiles for a set of physical fitness tests for 6-10 year old (primary school) Portuguese children have been established. Boys showed greater overall PF than girls, except in the flexibility test, in which girls performed better. The reported normative values provide ample opportunities to accurately detect individual changes during childhood. These reference values are especially important in healthcare and educational settings, and can be added to the worldwide literature on physical fitness values in children.
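
    The LMS method used to build these charts summarizes each age group by three values: L (Box-Cox power), M (median) and S (coefficient of variation). Given those, a centile follows from the standard LMS relation C = M(1 + L·S·z)^(1/L), or M·exp(S·z) when L = 0, where z is the normal deviate for the chosen percentile. A minimal Python sketch, with hypothetical L, M and S values that are not taken from the study:

```python
from math import exp, log
from statistics import NormalDist


def lms_centile(L, M, S, percentile):
    """Measurement value at a given percentile (0-100) under LMS."""
    z = NormalDist().inv_cdf(percentile / 100.0)
    if abs(L) < 1e-12:
        return M * exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


def lms_zscore(L, M, S, y):
    """Inverse relation: z-score of an observed measurement y."""
    if abs(L) < 1e-12:
        return log(y / M) / S
    return ((y / M) ** L - 1.0) / (L * S)


# Hypothetical parameters for one age/sex group (illustration only).
L, M, S = -1.2, 16.0, 0.10
p10, p50, p90 = (lms_centile(L, M, S, p) for p in (10, 50, 90))
percentile_of_p90 = 100.0 * NormalDist().cdf(lms_zscore(L, M, S, p90))
```

    The 50th centile equals M by construction, and converting the 90th centile back through `lms_zscore` recovers a percentile of 90, which is a quick consistency check on any tabulated L, M, S values.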

  6. Pediatric refugees in Rhode Island: increases in BMI percentile, overweight, and obesity following resettlement.

    Science.gov (United States)

    Heney, Jessica H; Dimock, Camia C; Friedman, Jennifer F; Lewis, Carol

    2014-01-05

    To evaluate BMI change among pediatric refugees resettling in Providence, RI. Retrospective chart review of pediatric refugees from the initial evaluation to year 3 post-resettlement at Hasbro Children's Hospital. Primary outcome of interest was within-person change in BMI percentile at each time point. From 2007-2012, 181 children visited the clinic. Initial prevalence of overweight and obesity was 14.1% and 3.2% versus 22.8% and 12.6% at year 3. From visit 1 and years 1-3, there was a positive mean within-person change in BMI percentile of 12.9% (95% CI 6.3-19.6%), 16.6% (95% CI 11.2-21.9%), and 14.4% (95% CI 9.1-19.7%). The prevalence of overweight and obesity increased from 17.3% at initial intake to 35.4% at 3 years post-resettlement to surpass that of American children (31.7-31.8% for 2007-2012). Refugee children have additional risk factors for obesity; multidisciplinary interventions must be designed to address nutrition at each visit.

  7. Waist Circumferences of Chilean Students: Comparison of the CDC-2012 Standard and Proposed Percentile Curves

    Directory of Open Access Journals (Sweden)

    Rossana Gómez-Campos

    2015-07-01

    The measurement of waist circumference (WC) is considered to be an important means to control overweight and obesity in children and adolescents. The objectives of the study were to (a) compare the WC measurements of Chilean students with the international CDC-2012 standard and other international standards, and (b) propose a specific measurement value for the WC of Chilean students based on age and sex. A total of 3892 students (6 to 18 years old) were assessed. Weight, height, body mass index (BMI), and WC were measured. WC was compared with the CDC-2012 international standard. Percentiles were constructed based on the LMS method. Chilean males had a greater WC during infancy. Subsequently, in late adolescence, males showed values lower than those of the international standards. Chilean females demonstrated values similar to the standards until the age of 12. Subsequently, females showed lower values. The 85th and 95th percentiles were adopted as cutoff points for evaluating overweight and obesity based on age and sex. The WC of Chilean students differs from the CDC-2012 curves. The regional norms proposed are a means to identify children and adolescents with a high risk of suffering from overweight and obesity disorders.

  8. Waist Circumferences of Chilean Students: Comparison of the CDC-2012 Standard and Proposed Percentile Curves

    Science.gov (United States)

    Gómez-Campos, Rossana; Lee Andruske, Cinthya; Hespanhol, Jefferson; Sulla Torres, Jose; Arruda, Miguel; Luarte-Rocha, Cristian; Cossio-Bolaños, Marco Antonio

    2015-01-01

    The measurement of waist circumference (WC) is considered to be an important means to control overweight and obesity in children and adolescents. The objectives of the study were to (a) compare the WC measurements of Chilean students with the international CDC-2012 standard and other international standards, and (b) propose a specific measurement value for the WC of Chilean students based on age and sex. A total of 3892 students (6 to 18 years old) were assessed. Weight, height, body mass index (BMI), and WC were measured. WC was compared with the CDC-2012 international standard. Percentiles were constructed based on the LMS method. Chilean males had a greater WC during infancy. Subsequently, in late adolescence, males showed values lower than those of the international standards. Chilean females demonstrated values similar to the standards until the age of 12. Subsequently, females showed lower values. The 85th and 95th percentiles were adopted as cutoff points for evaluating overweight and obesity based on age and sex. The WC of Chilean students differs from the CDC-2012 curves. The regional norms proposed are a means to identify children and adolescents with a high risk of suffering from overweight and obesity disorders. PMID:26184250

  9. Youth fitness testing: the effect of percentile-based evaluative feedback on intrinsic motivation.

    Science.gov (United States)

    Whitehead, J R; Corbin, C B

    1991-06-01

    This study was a test of Deci and Ryan's (1985) cognitive evaluation theory in a fitness testing situation. More specifically, it was a test of Proposition 2 of that theory, which posits that external events that increase or decrease perceived competence will increase or decrease intrinsic motivation. Seventh and eighth grade schoolchildren (N = 105) volunteered for an experiment that was ostensibly to collect data on a new youth fitness test (the Illinois Agility Run). After two untimed practice runs, a specially adapted version of the Intrinsic Motivation Inventory (IMI) was administered as a pretest of intrinsic motivation. Two weeks later when subjects ran again, they were apparently electronically timed. In reality, the subjects were given bogus feedback. Subjects in a positive feedback condition were told their scores were above the 80th percentile, while those in a negative feedback condition were told their scores were below the 20th percentile. Those in a control condition received no feedback. The IMI was again administered to the subjects after their runs. Multivariate and subsequent univariate tests were significant for all four subscale dependent variables (perceived interest-enjoyment, competence, effort, and pressure-tension). Positive feedback enhanced all aspects of intrinsic motivation, whereas negative feedback decreased them. In a further test of cognitive evaluation theory, path analysis results supported the prediction that perceived competence would mediate changes in the other IMI subscales. Taken together, these results clearly support cognitive evaluation theory and also may have important implications regarding motivation for those who administer youth fitness tests.

  10. Percentiles of the run-length distribution of the Exponentially Weighted Moving Average (EWMA) median chart

    Science.gov (United States)

    Tan, K. L.; Chong, Z. L.; Khoo, M. B. C.; Teoh, W. L.; Teh, S. Y.

    2017-09-01

    Quality control is crucial in a wide variety of fields, as it can help to satisfy customers’ needs and requirements by enhancing and improving the products and services to a superior quality level. The EWMA median chart was proposed as a useful alternative to the EWMA $\bar{X}$ chart because the median-type chart is robust against contamination, outliers or small deviation from the normality assumption compared to the traditional $\bar{X}$-type chart. To provide a complete understanding of the run-length distribution, the percentiles of the run-length distribution should be investigated rather than depending solely on the average run length (ARL) performance measure. This is because interpretation depending on the ARL alone can be misleading, as the skewness and shape of the run-length distribution change with the process mean shift, varying from almost symmetric when the magnitude of the mean shift is large, to highly right-skewed when the process is in-control (IC) or slightly out-of-control (OOC). Before computing the percentiles of the run-length distribution, optimal parameters of the EWMA median chart will be obtained by minimizing the OOC ARL, while retaining the IC ARL at a desired value.
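
    Run-length percentiles of an EWMA median chart can be approximated by simulation, as in the Python sketch below. This is a Monte Carlo illustration only, since the abstract does not say how the authors compute the percentiles. The smoothing constant `lam`, the control-limit `half_width`, the subgroup size and the standard normal in-control model are all hypothetical choices, not the optimal parameters referred to above.

```python
import math
import random
import statistics


def run_length(rng, shift, lam, half_width, n=5, max_len=100_000):
    """Number of subgroups until the EWMA of medians leaves the limits."""
    z = 0.0  # EWMA starts at the in-control target (0)
    for t in range(1, max_len + 1):
        median = statistics.median(rng.gauss(shift, 1.0) for _ in range(n))
        z = lam * median + (1.0 - lam) * z
        if abs(z) > half_width:
            return t
    return max_len  # truncated run (no signal within max_len)


def run_length_percentiles(shift, lam=0.2, half_width=0.5, reps=200,
                           seed=1, probs=(5, 25, 50, 75, 95)):
    """Nearest-rank percentiles of simulated run lengths."""
    rng = random.Random(seed)
    lengths = sorted(run_length(rng, shift, lam, half_width)
                     for _ in range(reps))
    return {p: lengths[max(0, math.ceil(p / 100 * reps) - 1)]
            for p in probs}


in_control = run_length_percentiles(shift=0.0)
out_of_control = run_length_percentiles(shift=1.0)
```

    Comparing the spread between the 5th and 95th percentiles for the two cases gives a direct view of the skewness the abstract describes, which a single ARL figure would hide.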

  11. Positive school climate is associated with lower body mass index percentile among urban preadolescents.

    Science.gov (United States)

    Gilstad-Hayden, Kathryn; Carroll-Scott, Amy; Rosenthal, Lisa; Peters, Susan M; McCaslin, Catherine; Ickovics, Jeannette R

    2014-08-01

    Schools are an important environmental context in children's lives and are part of the complex web of factors that contribute to childhood obesity. Increasingly, attention has been placed on the importance of school climate (connectedness, academic standards, engagement, and student autonomy) as 1 domain of school environment beyond health policies and education that may have implications for student health outcomes. The purpose of this study is to examine the association of school climate with body mass index (BMI) among urban preadolescents. Health surveys and physical measures were collected among fifth- and sixth-grade students from 12 randomly selected public schools in a small New England city. School climate surveys were completed district-wide by students and teachers. Hierarchical linear modeling was used to test the association between students' BMI and schools' climate scores. After controlling for potentially confounding individual-level characteristics, a 1-unit increase in school climate score (indicating more positive climate) was associated with a 7-point decrease in students' BMI percentile. Positive school climate is associated with lower student BMI percentile. More research is needed to understand the mechanisms behind this relationship and to explore whether interventions promoting positive school climate can effectively prevent and/or reduce obesity. © 2014, American School Health Association.

  12. RANK and RANK ligand expression in primary human osteosarcoma

    Directory of Open Access Journals (Sweden)

    Daniel Branstetter

    2015-09-01

    Our results demonstrate RANKL expression was observed in the tumor element in 68% of human OS using IHC. However, the staining intensity was relatively low and only 37% (29/79) of samples exhibited ≥10% RANKL positive tumor cells. RANK expression was not observed in OS tumor cells. In contrast, RANK expression was clearly observed in other cells within OS samples, including the myeloid osteoclast precursor compartment, osteoclasts and in giant osteoclast cells. The intensity and frequency of RANKL and RANK staining in OS samples were substantially less than that observed in GCTB samples. The observation that RANKL is expressed in OS cells themselves suggests that these tumors may mediate an osteoclastic response, and anti-RANKL therapy may potentially be protective against bone pathologies in OS. However, the absence of RANK expression in primary human OS cells suggests that any autocrine RANKL/RANK signaling in human OS tumor cells is not operative, and anti-RANKL therapy would not directly affect the tumor.

  13. Ranking structures and rank-rank correlations of countries: The FIFA and UEFA cases

    Science.gov (United States)

    Ausloos, Marcel; Cloots, Rudi; Gadomski, Adam; Vitanov, Nikolay K.

    2014-04-01

    Ranking of agents competing with each other in complex systems may lead to paradoxes according to the pre-chosen different measures. A discussion is presented on such rank-rank, similar or not, correlations based on the case of European countries ranked by UEFA and FIFA from different soccer competitions. The first question to be answered is whether an empirical and simple law is obtained for such (self-) organizations of complex sociological systems with such different measuring schemes. It is found that the power law form is not the best description contrary to many modern expectations. The stretched exponential is much more adequate. Moreover, it is found that the measuring rules lead to some inner structures in both cases.

  14. A Nonparametric Bayesian Approach For Emission Tomography Reconstruction

    International Nuclear Information System (INIS)

    Barat, Eric; Dautremer, Thomas

    2007-01-01

    We introduce a PET reconstruction algorithm following a nonparametric Bayesian (NPB) approach. In contrast with Expectation Maximization (EM), the proposed technique does not rely on any space discretization. Namely, the activity distribution (the normalized emission intensity of the spatial Poisson process) is considered as a spatial probability density and observations are the projections of random emissions whose distribution has to be estimated. This approach is nonparametric in the sense that the quantity of interest belongs to the set of probability measures on $\mathbb{R}^k$ (for reconstruction in k dimensions) and it is Bayesian in the sense that we define a prior directly on this spatial measure. In this context, we propose to model the nonparametric probability density as an infinite mixture of multivariate normal distributions. As a prior for this mixture we consider a Dirichlet Process Mixture (DPM) with a Normal-Inverse Wishart (NIW) model as base distribution of the Dirichlet Process. As in EM-family reconstruction, we use a data augmentation scheme where the set of hidden variables are the emission locations for each observed line of response in the continuous object space. Thanks to the data augmentation, we propose a Markov Chain Monte Carlo (MCMC) algorithm (Gibbs sampler) which is able to generate draws from the posterior distribution of the spatial intensity. A difference with EM is that one step of the Gibbs sampler corresponds to the generation of emission locations while only the expected number of emissions per pixel/voxel is used in EM. Another key difference is that the estimated spatial intensity is a continuous function such that there is no need to compute a projection matrix. Finally, draws from the intensity posterior distribution allow the estimation of posterior functionals like the variance or confidence intervals. Results are presented for simulated data based on a 2D brain phantom and compared to Bayesian MAP-EM

  15. Using non-parametric methods in econometric production analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify the functional form of the production function. Most often, the Cobb...... results—including measures that are of interest of applied economists, such as elasticities. Therefore, we propose to use nonparametric econometric methods. First, they can be applied to verify the functional form used in parametric estimations of production functions. Second, they can be directly used...

  16. STATCAT, Statistical Analysis of Parametric and Non-Parametric Data

    International Nuclear Information System (INIS)

    David, Hugh

    1990-01-01

    1 - Description of program or function: A suite of 26 programs designed to facilitate the appropriate statistical analysis and data handling of parametric and non-parametric data, using classical and modern univariate and multivariate methods. 2 - Method of solution: Data is read entry by entry, using a choice of input formats, and the resultant data bank is checked for out-of-range, rare, extreme or missing data. The completed STATCAT data bank can be treated by a variety of descriptive and inferential statistical methods, and modified, using other standard programs as required

  17. Panel data nonparametric estimation of production risk and risk preferences

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    approaches for obtaining firm-specific measures of risk attitudes. We found that Polish dairy farmers are risk averse regarding production risk and price uncertainty. According to our results, Polish dairy farmers perceive the production risk as being more significant than the risk related to output price......We apply nonparametric panel data kernel regression to investigate production risk, output price uncertainty, and risk attitudes of Polish dairy farms based on a firm-level unbalanced panel data set that covers the period 2004–2010. We compare different model specifications and different...

  18. Digital spectral analysis parametric, non-parametric and advanced methods

    CERN Document Server

    Castanié, Francis

    2013-01-01

    Digital Spectral Analysis provides a single source that offers complete coverage of the spectral analysis domain. This self-contained work includes details on advanced topics that are usually presented in scattered sources throughout the literature. The theoretical principles necessary for the understanding of spectral analysis are discussed in the first four chapters: fundamentals, digital signal processing, estimation in spectral analysis, and time-series models. An entire chapter is devoted to the non-parametric methods most widely used in industry. High resolution methods a

  19. A Bayesian nonparametric approach to causal inference on quantiles.

    Science.gov (United States)

    Xu, Dandan; Daniels, Michael J; Winterstein, Almut G

    2018-02-25

    We propose a Bayesian nonparametric approach (BNP) for causal inference on quantiles in the presence of many confounders. In particular, we define relevant causal quantities and specify BNP models to avoid bias from restrictive parametric assumptions. We first use Bayesian additive regression trees (BART) to model the propensity score and then construct the distribution of potential outcomes given the propensity score using a Dirichlet process mixture (DPM) of normals model. We thoroughly evaluate the operating characteristics of our approach and compare it to Bayesian and frequentist competitors. We use our approach to answer an important clinical question involving acute kidney injury using electronic health records. © 2018, The International Biometric Society.

  20. Categorical and nonparametric data analysis choosing the best statistical technique

    CERN Document Server

    Nussbaum, E Michael

    2014-01-01

    Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain

  1. Nonparametric statistics a step-by-step approach

    CERN Document Server

    Corder, Gregory W

    2014-01-01

    "…a very useful resource for courses in nonparametric statistics in which the emphasis is on applications rather than on theory.  It also deserves a place in libraries of all institutions where introductory statistics courses are taught."" -CHOICE This Second Edition presents a practical and understandable approach that enhances and expands the statistical toolset for readers. This book includes: New coverage of the sign test and the Kolmogorov-Smirnov two-sample test in an effort to offer a logical and natural progression to statistical powerSPSS® (Version 21) software and updated screen ca

  2. Evaluation of Nonparametric Probabilistic Forecasts of Wind Power

    DEFF Research Database (Denmark)

    Pinson, Pierre; Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg

    2008-07-31

    Predictions of wind power production for horizons up to 48-72 hours ahead comprise a highly valuable input to the methods for the daily management or trading of wind generation. Today, users of wind power predictions are not only provided with point predictions, which are estimates of the most...... likely outcome for each look-ahead time, but also with uncertainty estimates given by probabilistic forecasts. In order to avoid assumptions on the shape of predictive distributions, these probabilistic predictions are produced from nonparametric methods, and then take the form of a single or a set...

  3. Estimation of Stochastic Volatility Models by Nonparametric Filtering

    DEFF Research Database (Denmark)

    Kanaya, Shin; Kristensen, Dennis

    2016-01-01

    /estimated volatility process replacing the latent process. Our estimation strategy is applicable to both parametric and nonparametric stochastic volatility models, and can handle both jumps and market microstructure noise. The resulting estimators of the stochastic volatility model will carry additional biases...... and variances due to the first-step estimation, but under regularity conditions we show that these vanish asymptotically and our estimators inherit the asymptotic properties of the infeasible estimators based on observations of the volatility process. A simulation study examines the finite-sample properties...

  4. Ranking species in mutualistic networks

    Science.gov (United States)

    Domínguez-García, Virginia; Muñoz, Miguel A.

    2015-02-01

    Understanding the architectural subtleties of ecological networks, believed to confer on them enhanced stability and robustness, is a subject of utmost relevance. Mutualistic interactions have been profusely studied and their corresponding bipartite networks, such as plant-pollinator networks, have been reported to exhibit a characteristic "nested" structure. Assessing the importance of any given species in mutualistic networks is a key task when evaluating extinction risks and possible cascade effects. Inspired by a recently introduced algorithm (similar in spirit to Google's PageRank but with a built-in non-linearity), here we propose a method which, by exploiting their nested architecture, allows us to derive a sound ranking of species importance in mutualistic networks. This method clearly outperforms other existing ranking schemes and can become very useful for ecosystem management and biodiversity preservation, where decisions on what aspects of ecosystems to explicitly protect need to be made.

  5. Ranking Theory and Conditional Reasoning.

    Science.gov (United States)

    Skovgaard-Olsen, Niels

    2016-05-01

    Ranking theory is a formal epistemology that has been developed in over 600 pages in Spohn's recent book The Laws of Belief, which aims to provide a normative account of the dynamics of beliefs that presents an alternative to current probabilistic approaches. It has long been received in the AI community, but it has not yet found application in experimental psychology. The purpose of this paper is to derive clear, quantitative predictions by exploiting a parallel between ranking theory and a statistical model called logistic regression. This approach is illustrated by the development of a model for the conditional inference task using Spohn's (2013) ranking theoretic approach to conditionals. Copyright © 2015 Cognitive Science Society, Inc.

  6. University rankings in computer science

    DEFF Research Database (Denmark)

    Ehret, Philip; Zuccala, Alesia Ann; Gipp, Bela

    2017-01-01

    This is a research-in-progress paper concerning two types of institutional rankings, the Leiden and QS World ranking, and their relationship to a list of universities’ ‘geo-based’ impact scores, and Computing Research and Education Conference (CORE) participation scores in the field of computer...... science. A ‘geo-based’ impact measure examines the geographical distribution of incoming citations to a particular university’s journal articles for a specific period of time. It takes into account both the number of citations and the geographical variability in these citations. The CORE participation...... score is calculated on the basis of the number of weighted proceedings papers that a university has contributed to either an A*, A, B, or C conference as ranked by the Computing Research and Education Association of Australasia. In addition to calculating the correlations between the distinct university...

  7. Bayesian nonparametric dictionary learning for compressed sensing MRI.

    Science.gov (United States)

    Huang, Yue; Paisley, John; Lin, Qin; Ding, Xinghao; Fu, Xueyang; Zhang, Xiao-Ping

    2014-12-01

    We develop a Bayesian nonparametric model for reconstructing magnetic resonance images (MRIs) from highly undersampled k-space data. We perform dictionary learning as part of the image reconstruction process. To this end, we use the beta process as a nonparametric dictionary learning prior for representing an image patch as a sparse combination of dictionary elements. The size of the dictionary and patch-specific sparsity pattern are inferred from the data, in addition to other dictionary learning variables. Dictionary learning is performed directly on the compressed image, and so is tailored to the MRI being considered. In addition, we investigate a total variation penalty term in combination with the dictionary learning model, and show how the denoising property of dictionary learning removes dependence on regularization parameters in the noisy setting. We derive a stochastic optimization algorithm based on Markov chain Monte Carlo for the Bayesian model, and use the alternating direction method of multipliers for efficiently performing total variation minimization. We present empirical results on several MRIs, which show that the proposed regularization framework can improve reconstruction accuracy over other methods.

  8. 1st Conference of the International Society for Nonparametric Statistics

    CERN Document Server

    Lahiri, S; Politis, Dimitris

    2014-01-01

    This volume is composed of peer-reviewed papers that have developed from the First Conference of the International Society for NonParametric Statistics (ISNPS). This inaugural conference took place in Chalkidiki, Greece, June 15-19, 2012. It was organized with the co-sponsorship of the IMS, the ISI, and other organizations. M.G. Akritas, S.N. Lahiri, and D.N. Politis are the first executive committee members of ISNPS, and the editors of this volume. ISNPS has a distinguished Advisory Committee that includes Professors R. Beran, P. Bickel, R. Carroll, D. Cook, P. Hall, R. Johnson, B. Lindsay, E. Parzen, P. Robinson, M. Rosenblatt, G. Roussas, T. Subba Rao, and G. Wahba. The Charting Committee of ISNPS consists of more than 50 prominent researchers from all over the world. The chapters in this volume bring forth recent advances and trends in several areas of nonparametric statistics. In this way, the volume facilitates the exchange of research ideas, promotes collaboration among researchers from all over the wo...

  9. Nonparametric Analyses of Log-Periodic Precursors to Financial Crashes

    Science.gov (United States)

    Zhou, Wei-Xing; Sornette, Didier

    We apply two nonparametric methods to further test the hypothesis that log-periodicity characterizes the detrended price trajectory of large financial indices prior to financial crashes or strong corrections. The term "parametric" refers here to the use of the log-periodic power law formula to fit the data; in contrast, "nonparametric" refers to the use of general tools such as Fourier transform, and in the present case the Hilbert transform and the so-called (H, q)-analysis. The analysis using the (H, q)-derivative is applied to seven time series ending with the October 1987 crash, the October 1997 correction and the April 2000 crash of the Dow Jones Industrial Average (DJIA), the Standard & Poor 500 and Nasdaq indices. The Hilbert transform is applied to two detrended price time series in terms of the ln(tc-t) variable, where tc is the time of the crash. Taking all results together, we find strong evidence for a universal fundamental log-frequency f=1.02±0.05 corresponding to the scaling ratio λ=2.67±0.12. These values are in very good agreement with those obtained in earlier works with different parametric techniques. This note is extracted from a long unpublished report with 58 figures available at , which extensively describes the evidence we have accumulated on these seven time series, in particular by presenting all relevant details so that the reader can judge for himself or herself the validity and robustness of the results.

  10. A Bayesian nonparametric estimation of distributions and quantiles

    International Nuclear Information System (INIS)

    Poern, K.

    1988-11-01

    The report describes a Bayesian, nonparametric method for the estimation of a distribution function and its quantiles. The method, presupposing random sampling, is nonparametric, so the user has to specify a prior distribution on a space of distributions (and not on a parameter space). In the current application, where the method is used to estimate the uncertainty of a parametric calculational model, the Dirichlet prior distribution is to a large extent determined by the first batch of Monte Carlo realizations. In this case the results of the estimation technique are very similar to the conventional empirical distribution function. The resulting posterior distribution is also Dirichlet, and thus facilitates the determination of probability (confidence) intervals at any given point in the space of interest. Another advantage is that the posterior distribution of a specified quantile can also be derived and utilized to determine a probability interval for that quantile. The method was devised for use in the PROPER code package for uncertainty and sensitivity analysis. (orig.)
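
    The conjugacy mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Dirichlet process prior with a base distribution function and a total prior mass; under that assumption the posterior of F(x) at a fixed point x is a Beta distribution. The function names, the uniform prior on [0, 10], and the data below are hypothetical and are not taken from the report or from the PROPER package.

```python
import random


def dp_posterior_beta(x, data, prior_cdf, prior_mass):
    """Beta parameters (a, b) of the posterior of F(x) under a Dirichlet process prior.

    prior_cdf and prior_mass are illustrative assumptions: the prior guess
    for the distribution function and the weight given to that guess.
    """
    n_le = sum(1 for v in data if v <= x)          # observations at or below x
    a = prior_mass * prior_cdf(x) + n_le
    b = prior_mass * (1.0 - prior_cdf(x)) + (len(data) - n_le)
    return a, b


def posterior_mean(a, b):
    """Posterior mean of F(x)."""
    return a / (a + b)


def probability_interval(a, b, level=0.90, draws=20000, seed=0):
    """Equal-tailed probability interval for F(x), estimated by Monte Carlo."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = samples[int((1.0 - level) / 2.0 * draws)]
    upper = samples[int((1.0 + level) / 2.0 * draws) - 1]
    return lower, upper


# Hypothetical example: uniform prior guess on [0, 10] with prior mass 2.
uniform_cdf = lambda x: min(max(x / 10.0, 0.0), 1.0)
data = [1.0, 2.0, 3.0, 4.0, 8.0]
a, b = dp_posterior_beta(3.5, data, uniform_cdf, prior_mass=2.0)
mean = posterior_mean(a, b)
lower, upper = probability_interval(a, b)
```

    With a small prior mass the posterior mean stays close to the empirical distribution function, which matches the abstract's remark that the results resemble the conventional empirical estimate.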

  11. Genomic breeding value estimation using nonparametric additive regression models

    Directory of Open Access Journals (Sweden)

    Solberg Trygve

    2009-01-01

    Full Text Available Genomic selection refers to the use of genomewide dense markers for breeding value estimation and subsequently for selection. The main challenge of genomic breeding value estimation is the estimation of many effects from a limited number of observations. Bayesian methods have been proposed to successfully cope with these challenges. As an alternative class of models, non- and semiparametric models were recently introduced. The present study investigated the ability of nonparametric additive regression models to predict genomic breeding values. The genotypes were modelled for each marker or pair of flanking markers (i.e. the predictors) separately. The nonparametric functions for the predictors were estimated simultaneously using additive model theory, applying a binomial kernel. The optimal degree of smoothing was determined by bootstrapping. A mutation-drift-balance simulation was carried out. The breeding values of the last generation (genotyped) were predicted using data from the next-to-last generation (genotyped and phenotyped). The results show moderate to high accuracies of the predicted breeding values. Determining a predictor-specific degree of smoothing increased the accuracy.

  12. A non-parametric framework for estimating threshold limit values

    Directory of Open Access Journals (Sweden)

    Ulm Kurt

    2005-11-01

    Full Text Available Background: To estimate a threshold limit value for a compound known to have harmful health effects, an 'elbow' threshold model is usually applied. We are interested in flexible non-parametric alternatives. Methods: We describe how a step function model fitted by isotonic regression can be used to estimate threshold limit values. This method returns a set of candidate locations, and we discuss two algorithms to select the threshold among them: the reduced isotonic regression and an algorithm considering the closed family of hypotheses. We assess the performance of these two alternative approaches under different scenarios in a simulation study. We illustrate the framework by analysing the data from a study conducted by the German Research Foundation aiming to set a threshold limit value for exposure to total dust at the workplace, as a causal agent for developing chronic bronchitis. Results: In the paper we demonstrate the use and the properties of the proposed methodology along with the results from an application. The method appears to detect the threshold with satisfactory success. However, its performance can be compromised by the low power to reject the constant risk assumption when the true dose-response relationship is weak. Conclusion: The estimation of thresholds based on the isotonic framework is conceptually simple and sufficiently powerful. Given that there is no gold standard method in the threshold value estimation context, the proposed model provides a useful non-parametric alternative to the standard approaches and can corroborate or challenge their findings.
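
    The first step described in this abstract, fitting a non-decreasing step function and reading off candidate threshold locations, can be sketched with the pool-adjacent-violators algorithm. This is only an illustration under assumed data; the two selection algorithms named in the abstract (reduced isotonic regression and the closed family of hypotheses) are not implemented here, and the function names and dose-response values are hypothetical.

```python
def pava(y, weights=None):
    """Non-decreasing least-squares fit by the pool-adjacent-violators algorithm."""
    if weights is None:
        weights = [1.0] * len(y)
    blocks = []  # each block holds [weighted mean, total weight, number of points]
    for value, weight in zip(y, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def candidate_thresholds(x, fit):
    """Exposure levels at which the fitted step function jumps upward."""
    return [x[i] for i in range(1, len(fit)) if fit[i] > fit[i - 1]]


# Hypothetical exposure levels (sorted) and observed response proportions.
dose = [1, 2, 3, 4, 5, 6]
response = [0.10, 0.05, 0.08, 0.30, 0.25, 0.40]
fit = pava(response)
candidates = candidate_thresholds(dose, fit)
```

    Each jump of the fitted step function is a candidate location; choosing among them is the part the abstract delegates to its two selection algorithms.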

  13. Application of nonparametric statistics to material strength/reliability assessment

    International Nuclear Information System (INIS)

    Arai, Taketoshi

    1992-01-01

    Advanced material technology requires a database on a wide variety of material behavior, which needs to be established experimentally. Experiments are often practically limited in terms of reproducibility or the range of test parameters. Statistical methods can be applied to quantify the resulting uncertainties as required from the reliability point of view. Statistical assessment involves determining a most probable value and the maximum and/or minimum value as a one-sided or two-sided confidence limit. A scatter of test data can be approximated by a theoretical distribution only if the goodness of fit satisfies a test criterion. Alternatively, nonparametric statistics (NPS), or distribution-free statistics, can be applied. Mathematical procedures by NPS are well established for dealing with most reliability problems. They handle only order statistics of a sample. Mathematical formulas and some applications to engineering assessments are described. They include confidence limits of the median, population coverage of a sample, the required minimum sample size, and confidence limits of fracture probability. These applications demonstrate that nonparametric statistical estimation is useful in logical decision making in cases where a large uncertainty exists. (author)
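
    Two of the order-statistic calculations listed in this abstract can be sketched with standard distribution-free results. The formulas used here are textbook ones and are an assumption about what the report covers: the sample maximum exceeds the population quantile of a given coverage with probability 1 - coverage^n, and the interval between the k-th smallest and k-th largest observations covers the median with probability 1 - 2 P(Bin(n, 1/2) <= k - 1). The function names are hypothetical.

```python
from math import comb


def min_sample_size(coverage, confidence):
    """Smallest n such that the sample maximum bounds the given population
    coverage with at least the given confidence (one-sided, distribution-free)."""
    if not (0.0 < coverage < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("coverage and confidence must lie strictly between 0 and 1")
    n = 1
    while 1.0 - coverage ** n < confidence:
        n += 1
    return n


def median_ci_ranks(n, confidence):
    """Ranks (k, n - k + 1) of the order statistics forming the narrowest
    symmetric interval for the median with at least the given confidence.
    Returns None when the sample is too small for that confidence."""
    best = None
    for k in range(1, n // 2 + 1):
        lower_tail = sum(comb(n, j) for j in range(k)) / 2 ** n
        if 1.0 - 2.0 * lower_tail >= confidence:
            best = (k, n - k + 1)
    return best


n_95_95 = min_sample_size(0.95, 0.95)   # 59 specimens
n_90_95 = min_sample_size(0.90, 0.95)   # 29 specimens
ranks = median_ci_ranks(10, 0.95)       # (2, 9)
```

    Neither calculation assumes a distributional form for the test data, which is the point of the nonparametric approach the abstract describes.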

  14. Subtracting a best rank-1 approximation may increase tensor rank

    NARCIS (Netherlands)

    Stegeman, Alwin; Comon, Pierre

    2010-01-01

    It has been shown that a best rank-R approximation of an order-k tensor may not exist when R >= 2 and k >= 3. This poses a serious problem to data analysts using tensor decompositions. It has been observed numerically that, generally, this issue cannot be solved by consecutively computing and

  15. Consistent ranking of volatility models

    DEFF Research Database (Denmark)

    Hansen, Peter Reinhard; Lunde, Asger

    2006-01-01

    We show that the empirical ranking of volatility models can be inconsistent for the true ranking if the evaluation is based on a proxy for the population measure of volatility. For example, the substitution of a squared return for the conditional variance in the evaluation of ARCH-type models can...... variance in out-of-sample evaluations rather than the squared return. We derive the theoretical results in a general framework that is not specific to the comparison of volatility models. Similar problems can arise in comparisons of forecasting models whenever the predicted variable is a latent variable....

  16. Proposing a framework for airline service quality evaluation using Type-2 Fuzzy TOPSIS and non-parametric analysis

    Directory of Open Access Journals (Sweden)

    Navid Haghighat

    2017-12-01

    Full Text Available This paper focuses on evaluating airline service quality from the passengers' perspective. Much research on airline service quality evaluation has been performed worldwide, but little has yet been conducted in Iran. In this study, a framework for measuring airline service quality in Iran is proposed. After reviewing airline service quality criteria, the SSQAI model was selected because of its comprehensiveness in covering airline service quality dimensions. SSQAI questionnaire items were redesigned to adapt them to Iranian airlines' requirements and to the environmental circumstances of Iran's economic and cultural context. This study includes fuzzy decision-making theory, considering the possible fuzzy subjective judgment of the evaluators during airline service quality evaluation. Fuzzy TOPSIS was applied for ranking airlines' service quality performance. Three major Iranian airlines with the highest passenger volumes on domestic and foreign flights were chosen for evaluation in this research. Results demonstrated that Mahan airline has the best service quality performance rank among the three major Iranian airlines in satisfying passengers through the delivery of high-quality services. IranAir and Aseman airlines placed second and third, respectively, according to passengers' evaluation. Statistical analysis was used in analyzing passenger responses. Because the data were not normally distributed, non-parametric tests were applied. To demonstrate airline ranks in every criterion separately, the Friedman test was performed. Variance analysis and the Tukey test were applied to study the influence of increasing age and educational level of passengers on their degree of satisfaction with the airlines' service quality. Results showed that age has no significant relation to passenger satisfaction with airlines; however, an increase in educational level demonstrated a negative impact on

  17. CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions

    Science.gov (United States)

    Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.

  18. Consumer preference in ranking walking function utilizing the walking index for spinal cord injury II.

    Science.gov (United States)

    Patrick, M; Ditunno, P; Ditunno, J F; Marino, R J; Scivoletto, G; Lam, T; Loffree, J; Tamburella, F; Leiby, B

    2011-12-01

    Blinded rank ordering. To determine consumer preference in walking function utilizing the Walking Index for Spinal Cord Injury II (WISCI II) in individuals with spinal cord injury (SCI) from Canada, Italy and the United States of America. In all, 42 consumers with incomplete SCI (25 cervical, 12 thoracic, 5 lumbar) from Canada (12/42), Italy (14/42) and the United States of America (16/42) ranked the 20 levels of the WISCI II scale by their individual preference for walking. Subjects were blinded to the original ranking of the WISCI II scale by clinical scientists. Photographs of each WISCI II level used in a previous pilot study were randomly shuffled and rank ordered. Percentile, conjoint/cluster and graphic analyses were performed. All three analyses illustrated that consumer ranking followed a bimodal distribution. Ranking for two levels with physical assistance and two levels with a walker were bimodal with a difference of five to six ranks between consumer subgroups (quartile analysis). The larger cluster (N=20) showed preference for walking with assistance over the smaller cluster (N=12), whose preference was walking without assistance and more devices. In all, 64% (27/42) of consumers ranked WISCI II level with no devices or braces and 1 person assistance higher than multiple levels of the WISCI II requiring no assistance. These results were unexpected, as the hypothesis was that consumers would rank independent walking higher than walking with assistance. Consumer preference for walking function should be considered in addition to objective measures in designing SCI trials that use significant improvement in walking function as an outcome measure.

  19. Let Us Rank Journalism Programs

    Science.gov (United States)

    Weber, Joseph

    2014-01-01

    Unlike law, business, and medical schools, as well as universities in general, journalism schools and journalism programs have rarely been ranked. Publishers such as "U.S. News & World Report," "Forbes," "Bloomberg Businessweek," and "Washington Monthly" do not pay them much mind. What is the best…

  20. On Rank Driven Dynamical Systems

    Science.gov (United States)

    Veerman, J. J. P.; Prieto, F. J.

    2014-08-01

    We investigate a class of models related to the Bak-Sneppen (BS) model, initially proposed to study evolution. The BS model is extremely simple and yet captures some forms of "complex behavior" such as self-organized criticality that is often observed in physical and biological systems. In this model, random fitnesses in [0, 1] are associated with agents located at the vertices of a graph. Their fitnesses are ranked from worst (0) to best (1). At every time-step the agent with the worst fitness and some others with a priori given rank probabilities are replaced by new agents with random fitnesses. We consider two cases: the exogenous case, where the new fitnesses are taken from an a priori fixed distribution, and the endogenous case, where the new fitnesses are taken from the current distribution as it evolves. We approximate the dynamics by making a simplifying independence assumption. We use Order Statistics and Dynamical Systems to define a rank-driven dynamical system that approximates the evolution of the distribution of the fitnesses in these rank-driven models, as well as in the BS model. For this simplified model we can find the limiting marginal distribution as a function of the initial conditions. Agreement with experimental results of the BS model is excellent.
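
    The replacement rule described in this abstract can be illustrated with a small simulation of the exogenous case. This is a sketch under assumptions that are not taken from the paper: new fitnesses are drawn uniformly from [0, 1), exactly one additional agent is replaced per step, and that agent is chosen with equal probability among all ranks other than the worst. The function name and parameter values are hypothetical.

```python
import random


def simulate_rank_driven(n_agents=50, steps=5000, seed=1):
    """Simulate a simple rank-driven replacement process (exogenous case)."""
    rng = random.Random(seed)
    fitness = [rng.random() for _ in range(n_agents)]
    for _ in range(steps):
        # Agent indices ordered from worst (lowest fitness) to best.
        order = sorted(range(n_agents), key=fitness.__getitem__)
        worst = order[0]
        # Assumed rank probabilities: uniform over every rank except the worst.
        other = order[rng.randrange(1, n_agents)]
        # Both selected agents are replaced by new agents with fresh fitnesses.
        fitness[worst] = rng.random()
        fitness[other] = rng.random()
    return fitness


final = simulate_rank_driven()
mean_fitness = sum(final) / len(final)
```

    Because the worst agent is removed at every step, the fitness distribution drifts upward from its uniform starting point; other rank probabilities can be tried by changing how `other` is drawn.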

  1. PageRank (II): Mathematics

    African Journals Online (AJOL)

    maths/stats

    ... GAUSS SEIDEL'S. NUMERICAL ALGORITHMS IN PAGE RANK ANALYSIS. ... The convergence is guaranteed, if the absolute value of the largest eigen ... improved Gauss-Seidel iteration algorithm, based on the decomposition M = D + L + U. ... This corresponds to determining the eigenvector of T with eigenvalue 1.

  2. Multiple graph regularized protein domain ranking

    KAUST Repository

    Wang, Jim Jing-Yan

    2012-11-19

    Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. 2012 Wang et al; licensee BioMed Central Ltd.

  3. Multiple graph regularized protein domain ranking

    KAUST Repository

    Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

    2012-01-01

    Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. 2012 Wang et al; licensee BioMed Central Ltd.

  4. 14 CFR 1214.1105 - Final ranking.

    Science.gov (United States)

    2010-01-01

    ... 14 Aeronautics and Space 5 2010-01-01 2010-01-01 false Final ranking. 1214.1105 Section 1214.1105... Recruitment and Selection Program § 1214.1105 Final ranking. Final rankings will be based on a combination of... preference will be included in this final ranking in accordance with applicable regulations. ...

  5. Multiple graph regularized protein domain ranking.

    Science.gov (United States)

    Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

    2012-11-19

    Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.

  6. Multiple graph regularized protein domain ranking

    Directory of Open Access Journals (Sweden)

    Wang Jim

    2012-11-01

    Full Text Available Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.

  7. A Survey on PageRank Computing

    OpenAIRE

    Berkhin, Pavel

    2005-01-01

    This survey reviews the research related to PageRank computing. Components of a PageRank vector serve as authority weights for web pages independent of their textual content, solely based on the hyperlink structure of the web. PageRank is typically used as a web search ranking component. This defines the importance of the model and the data structures that underlie PageRank processing. Computing even a single PageRank is a difficult computational task. Computing many PageRanks is a much mor...
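
    As a concrete illustration of how a PageRank vector follows from hyperlink structure alone, the following minimal sketch uses plain power iteration. It is one common formulation rather than a method attributed to the survey: the damping factor 0.85, the uniform teleportation vector, the uniform redistribution of rank from pages without out-links, and the three-page graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank on a dict mapping page -> list of out-links."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by pages without out-links is spread uniformly over all pages.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            for t in targets:
                new[t] += damping * rank[v] / len(targets)
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
```

    In this small graph page C receives links from both other pages and ends up with the highest weight, followed by A and then B. At web scale the same iteration runs over sparse link structures, which is where the data-structure concerns raised in the abstract come in.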

  8. Time evolution of Wikipedia network ranking

    Science.gov (United States)

    Eom, Young-Ho; Frahm, Klaus M.; Benczúr, András; Shepelyansky, Dima L.

    2013-12-01

    We study the time evolution of ranking and spectral properties of the Google matrix of the English Wikipedia hyperlink network during the years 2003-2011. The statistical properties of the ranking of Wikipedia articles via PageRank and CheiRank probabilities, as well as the matrix spectrum, are shown to be stabilized for 2007-2011. Special emphasis is placed on the ranking of Wikipedia personalities and universities. We show that PageRank selection is dominated by politicians while 2DRank, which combines PageRank and CheiRank, gives more weight to personalities of the arts. The Wikipedia PageRank of universities recovers 80% of the top universities of the Shanghai ranking during the considered time period.

  9. Generative Temporal Modelling of Neuroimaging - Decomposition and Nonparametric Testing

    DEFF Research Database (Denmark)

    Hald, Ditte Høvenhoff

    The goal of this thesis is to explore two improvements for functional magnetic resonance imaging (fMRI) analysis; namely our proposed decomposition method and an extension to the non-parametric testing framework. Analysis of fMRI allows researchers to investigate the functional processes...... of the brain, and provides insight into neuronal coupling during mental processes or tasks. The decomposition method is a Gaussian process-based independent components analysis (GPICA), which incorporates a temporal dependency in the sources. A hierarchical model specification is used, featuring both...... instantaneous and convolutive mixing, and the inferred temporal patterns. Spatial maps are seen to capture smooth and localized stimuli-related components, and often identifiable noise components. The implementation is freely available as a GUI/SPM plugin, and we recommend using GPICA as an additional tool when...

  10. Nonparametric Estimation of Distributions in Random Effects Models

    KAUST Repository

    Hart, Jeffrey D.

    2011-01-01

    We propose using minimum distance to obtain nonparametric estimates of the distributions of components in random effects models. A main setting considered is equivalent to having a large number of small datasets whose locations, and perhaps scales, vary randomly, but which otherwise have a common distribution. Interest focuses on estimating the distribution that is common to all datasets, knowledge of which is crucial in multiple testing problems where a location/scale invariant test is applied to every small dataset. A detailed algorithm for computing minimum distance estimates is proposed, and the usefulness of our methodology is illustrated by a simulation study and an analysis of microarray data. Supplemental materials for the article, including R-code and a dataset, are available online. © 2011 American Statistical Association.

  11. Prior processes and their applications: nonparametric Bayesian estimation

    CERN Document Server

    Phadia, Eswar G

    2016-01-01

    This book presents a systematic and comprehensive treatment of various prior processes that have been developed over the past four decades for the Bayesian approach to solving selected nonparametric inference problems. This revised edition has been substantially expanded to reflect the current interest in this area. After an overview of different prior processes, it examines the now pre-eminent Dirichlet process and its variants including hierarchical processes, then addresses new processes such as dependent Dirichlet, local Dirichlet, time-varying and spatial processes, all of which exploit the countable mixture representation of the Dirichlet process. It subsequently discusses various neutral to right type processes, including gamma and extended gamma, beta and beta-Stacy processes, and then describes the Chinese Restaurant, Indian Buffet and infinite gamma-Poisson processes, which prove to be very useful in areas such as machine learning, information retrieval and featural modeling. Tailfree and P...

  12. Spurious Seasonality Detection: A Non-Parametric Test Proposal

    Directory of Open Access Journals (Sweden)

    Aurelio F. Bariviera

    2018-01-01

    This paper offers a general and comprehensive definition of the day-of-the-week effect. Using symbolic dynamics, we develop a unique test based on ordinal patterns in order to detect it. This test uncovers the fact that the so-called "day-of-the-week" effect is partly an artifact of the hidden correlation structure of the data. We also present simulations based on artificial time series. While time series generated with long memory are prone to exhibit daily seasonality, pure white noise signals exhibit no pattern preference. Since ours is a non-parametric test, it requires no assumptions about the distribution of returns, so it could be a practical alternative to conventional econometric tests. We also apply the proposed technique exhaustively to 83 stock indexes around the world. Finally, the paper highlights the relevance of symbolic analysis in economic time series studies.
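
    The ordinal patterns on which such a test is built can be sketched as follows. This is only an illustration of the symbolization step, not the authors' test statistic: the function names, the pattern length of 3, the tie-breaking rule, and the toy series are assumptions.

```python
from collections import Counter


def ordinal_pattern(window):
    """Return the index permutation that sorts the window (ties broken by position)."""
    return tuple(sorted(range(len(window)), key=lambda i: (window[i], i)))


def pattern_frequencies(series, order=3):
    """Relative frequency of each ordinal pattern over all sliding windows."""
    counts = Counter(
        ordinal_pattern(series[i:i + order])
        for i in range(len(series) - order + 1)
    )
    total = sum(counts.values())
    return {pattern: count / total for pattern, count in counts.items()}


# A strictly rising series produces a single pattern; a mixed one produces several.
rising = pattern_frequencies([1, 2, 3, 4, 5, 6])
mixed = pattern_frequencies([4, 7, 9, 10, 6, 11, 3])
```

    A series with no pattern preference would show roughly equal frequencies across all permutations, whereas a strong preference for some patterns points to structure in the data.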

  13. Nonparametric autocovariance estimation from censored time series by Gaussian imputation.

    Science.gov (United States)

    Park, Jung Wook; Genton, Marc G; Ghosh, Sujit K

    2009-02-01

    One of the most frequently used methods to model the autocovariance function of a second-order stationary time series is to use the parametric framework of autoregressive and moving average models developed by Box and Jenkins. However, such parametric models, though very flexible, may not always be adequate to model autocovariance functions with sharp changes. Furthermore, if the data do not follow the parametric model and are censored at a certain value, the estimation results may not be reliable. We develop a Gaussian imputation method to estimate an autocovariance structure via nonparametric estimation of the autocovariance function in order to address both censoring and incorrect model specification. We demonstrate the effectiveness of the technique in terms of bias and efficiency with simulations under various rates of censoring and underlying models. We describe its application to a time series of silicon concentrations in the Arctic.

  14. Nonparametric estimation of stochastic differential equations with sparse Gaussian processes.

    Science.gov (United States)

    García, Constantino A; Otero, Abraham; Félix, Paulo; Presedo, Jesús; Márquez, David G

    2017-08-01

    The application of stochastic differential equations (SDEs) to the analysis of temporal data has attracted increasing attention, due to their ability to describe complex dynamics with physically interpretable equations. In this paper, we introduce a nonparametric method for estimating the drift and diffusion terms of SDEs from a densely observed discrete time series. The use of Gaussian processes as priors permits working directly in a function-space view, so that the inference takes place directly in this space. To cope with the computational complexity that the use of Gaussian processes entails, a sparse Gaussian process approximation is provided. This approximation permits the efficient computation of predictions for the drift and diffusion terms by using a distribution over a small subset of pseudosamples. The proposed method has been validated using both simulated data and real data from economics and paleoclimatology. The application of the method to real data demonstrates its ability to capture the behavior of complex systems.

  15. Debt and growth: A non-parametric approach

    Science.gov (United States)

    Brida, Juan Gabriel; Gómez, David Matesanz; Seijas, Maria Nela

    2017-11-01

    In this study, we explore the dynamic relationship between public debt and economic growth by using a non-parametric approach based on data symbolization and clustering methods. The study uses annual data on the general government consolidated gross debt-to-GDP ratio and gross domestic product for sixteen countries between 1977 and 2015. Using symbolic sequences, we introduce a notion of distance between the dynamical paths of different countries. Then, a Minimal Spanning Tree and a Hierarchical Tree are constructed from the time series to help detect groups of countries sharing similar economic performance. The main finding of the study appears for the period 2008-2016 when several countries surpassed the 90% debt-to-GDP threshold. During this period, three groups (clubs) of countries are obtained: high, mid and low indebted countries, suggesting that the employed debt-to-GDP threshold drives economic dynamics for the selected countries.
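
    Once pairwise distances between countries are available, a minimal spanning tree can be built with a standard algorithm. The sketch below uses Kruskal's algorithm with a small union-find; the abstract does not say which algorithm the authors used, and the country labels and distances are invented for illustration.

```python
def minimal_spanning_tree(nodes, distances):
    """Kruskal's algorithm; `distances` maps (a, b) pairs to a distance."""
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the component root, compressing the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for (a, b), dist in sorted(distances.items(), key=lambda item: item[1]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two separate components
            parent[root_a] = root_b
            tree.append((a, b, dist))
    return tree


# Hypothetical distances between the symbolic paths of four countries.
countries = ["A", "B", "C", "D"]
path_distances = {
    ("A", "B"): 0.2, ("A", "C"): 0.9, ("A", "D"): 0.8,
    ("B", "C"): 0.7, ("B", "D"): 0.85, ("C", "D"): 0.3,
}
tree = minimal_spanning_tree(countries, path_distances)
```

    In this toy case the tree links A with B and C with D through short edges, and the longer B-C edge joins the two pairs, which is the kind of grouping a cluster analysis would then read off.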

  16. Nonparametric estimation of benchmark doses in environmental risk assessment

    Science.gov (United States)

    Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen

    2013-01-01

    Summary: An important statistical objective in environmental risk analysis is estimation of minimum exposure levels, called benchmark doses (BMDs), that induce a pre-specified benchmark response in a dose-response experiment. In such settings, representations of the risk are traditionally based on a parametric dose-response model. It is a well-known concern, however, that if the chosen parametric form is misspecified, inaccurate and possibly unsafe low-dose inferences can result. We apply a nonparametric approach for calculating benchmark doses, based on an isotonic regression method for dose-response estimation with quantal-response data (Bhattacharya and Kong, 2007). We determine the large-sample properties of the estimator, develop bootstrap-based confidence limits on the BMDs, and explore the confidence limits' small-sample properties via a short simulation study. An example from cancer risk assessment illustrates the calculations. PMID:23914133
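
    Isotonic regression for quantal-response data can be computed with the pool-adjacent-violators algorithm (PAVA). The following is a minimal sketch of that fitting step only; it does not reproduce the authors' BMD estimator or bootstrap limits, and the doses, responder counts and group sizes are made up for the example.

```python
def pava(values, weights):
    """Weighted pool-adjacent-violators fit of a non-decreasing sequence."""
    blocks = []  # each block is [weighted mean, total weight, number of points]
    for value, weight in zip(values, weights):
        blocks.append([value, weight, 1])
        # Merge backwards while the monotone ordering is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical quantal data: responders out of 20 subjects at each dose level.
doses = [0, 1, 2, 3, 4]
responders = [1, 4, 3, 8, 12]
group_sizes = [20, 20, 20, 20, 20]
observed = [r / n for r, n in zip(responders, group_sizes)]
fitted = pava(observed, group_sizes)
```

    Here the observed dip at dose 2 is pooled with dose 1, giving a non-decreasing estimate of the response probability without assuming a parametric dose-response form.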

  17. Indoor Positioning Using Nonparametric Belief Propagation Based on Spanning Trees

    Directory of Open Access Journals (Sweden)

    Savic Vladimir

    2010-01-01

    Nonparametric belief propagation (NBP) is one of the best-known methods for cooperative localization in sensor networks. It is capable of providing information about location estimation with appropriate uncertainty and of accommodating non-Gaussian distance measurement errors. However, the accuracy of NBP is questionable in loopy networks. Therefore, in this paper, we propose a novel approach, NBP based on spanning trees (NBP-ST) created by the breadth-first search (BFS) method. In addition, we propose a reliable indoor model based on measurements obtained in our lab. According to our simulation results, NBP-ST performs better than NBP in terms of accuracy and communication cost in networks with high connectivity (i.e., highly loopy networks). Furthermore, the computational and communication costs are nearly constant with respect to the transmission radius. However, the drawbacks of the proposed method are a slightly higher computational cost and poor performance in low-connected networks.
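
    The spanning-tree construction that NBP-ST relies on can be illustrated with a plain breadth-first search over a loopy network. This sketch covers only the tree-building step, not belief propagation itself; the function name and the four-node network are assumptions.

```python
from collections import deque


def bfs_spanning_tree(adjacency, root):
    """Return the tree edges (parent, child) found by breadth-first search."""
    visited = {root}
    queue = deque([root])
    tree_edges = []
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                tree_edges.append((node, neighbor))
                queue.append(neighbor)
    return tree_edges


# Hypothetical loopy sensor network: every node has at least two neighbours.
network = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}
tree = bfs_spanning_tree(network, "A")
```

    The resulting tree keeps all four nodes connected with three edges and drops the links B-C and C-D, so it contains no loops.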

  18. Multi-Directional Non-Parametric Analysis of Agricultural Efficiency

    DEFF Research Database (Denmark)

    Balezentis, Tomas

    This thesis seeks to develop methodologies for assessment of agricultural efficiency and apply them to Lithuanian family farms. In particular, we focus on three particular objectives throughout the research: (i) to perform a fully non-parametric analysis of efficiency effects, (ii) to extend...... to the Multi-Directional Efficiency Analysis approach when the proposed models were employed to analyse empirical data of Lithuanian family farm performance, we saw substantial differences in efficiencies associated with different inputs. In particular, assets appeared to be the least efficiently used input...... relative to labour, intermediate consumption and land (in some cases land was not treated as a discretionary input). These findings call for further research on relationships among financial structure, investment decisions, and efficiency in Lithuanian family farms. Application of different techniques...

  19. Regional regression models of percentile flows for the contiguous United States: Expert versus data-driven independent variable selection

    Directory of Open Access Journals (Sweden)

    Geoffrey Fouad

    2018-06-01

    New hydrological insights for the region: A set of three variables selected based on an expert assessment of factors that influence percentile flows performed similarly to larger sets of variables selected using a data-driven method. Expert assessment variables included mean annual precipitation, potential evapotranspiration, and baseflow index. Larger sets of up to 37 variables contributed little, if any, additional predictive information. Variables used to describe the distribution of basin data (e.g. standard deviation) were not useful, and average values were sufficient to characterize physical and climatic basin conditions. Effectiveness of the expert assessment variables may be due to the high degree of multicollinearity (i.e. cross-correlation) among additional variables. A tool is provided in the Supplementary material to predict percentile flows based on the three expert assessment variables. Future work should develop new variables with a strong understanding of the processes related to percentile flows.

  20. Exact nonparametric confidence bands for the survivor function.

    Science.gov (United States)

    Matthews, David

    2013-10-12

    A method to produce exact simultaneous confidence bands for the empirical cumulative distribution function that was first described by Owen, and subsequently corrected by Jager and Wellner, is the starting point for deriving exact nonparametric confidence bands for the survivor function of any positive random variable. We invert a nonparametric likelihood test of uniformity, constructed from the Kaplan-Meier estimator of the survivor function, to obtain simultaneous lower and upper bands for the function of interest with specified global confidence level. The method involves calculating a null distribution and associated critical value for each observed sample configuration. However, Noe recursions and the Van Wijngaarden-Decker-Brent root-finding algorithm provide the necessary tools for efficient computation of these exact bounds. Various aspects of the effect of right censoring on these exact bands are investigated, using as illustrations two observational studies of survival experience among non-Hodgkin's lymphoma patients and a much larger group of subjects with advanced lung cancer enrolled in trials within the North Central Cancer Treatment Group. Monte Carlo simulations confirm the merits of the proposed method of deriving simultaneous interval estimates of the survivor function across the entire range of the observed sample. This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada. It was begun while the author was visiting the Department of Statistics, University of Auckland, and completed during a subsequent sojourn at the Medical Research Council Biostatistics Unit in Cambridge. The support of both institutions, in addition to that of NSERC and the University of Waterloo, is greatly appreciated.

  1. Hyperspectral image segmentation using a cooperative nonparametric approach

    Science.gov (United States)

    Taher, Akar; Chehdi, Kacem; Cariou, Claude

    2013-10-01

    In this paper a new unsupervised nonparametric cooperative and adaptive hyperspectral image segmentation approach is presented. The hyperspectral images are partitioned band by band in parallel, and intermediate classification results are evaluated and fused to get the final segmentation result. Two unsupervised nonparametric segmentation methods are used in parallel cooperation, namely the Fuzzy C-means (FCM) method and the Linde-Buzo-Gray (LBG) algorithm, to segment each band of the image. The originality of the approach relies firstly on its local adaptation to the type of regions in an image (textured, non-textured), and secondly on the introduction of several levels of evaluation and validation of intermediate segmentation results before obtaining the final partitioning of the image. For the management of similar or conflicting results issued from the two classification methods, we gradually introduced various assessment steps that exploit the information of each spectral band and its adjacent bands, and finally the information of all the spectral bands. In our approach, the detected textured and non-textured regions are treated separately from the feature extraction step up to the final classification results. This approach was first evaluated on a large number of monocomponent images constructed from the Brodatz album. Then it was evaluated on two real applications using, respectively, a multispectral image for cedar tree detection in the region of Baabdat (Lebanon) and a hyperspectral image for identification of invasive and non-invasive vegetation in the region of Cieza (Spain). The correct classification rate (CCR) for the first application is over 97%, and for the second application the average correct classification rate (ACCR) is over 99%.

  2. Estimation from PET data of transient changes in dopamine concentration induced by alcohol: support for a non-parametric signal estimation method

    Energy Technology Data Exchange (ETDEWEB)

    Constantinescu, C C; Yoder, K K; Normandin, M D; Morris, E D [Department of Radiology, Indiana University School of Medicine, Indianapolis, IN (United States); Kareken, D A [Department of Neurology, Indiana University School of Medicine, Indianapolis, IN (United States); Bouman, C A [Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN (United States); O'Connor, S J [Department of Psychiatry, Indiana University School of Medicine, Indianapolis, IN (United States)], E-mail: emorris@iupui.edu

    2008-03-07

    We previously developed a model-independent technique (non-parametric ntPET) for extracting the transient changes in neurotransmitter concentration from paired (rest and activation) PET studies with a receptor ligand. To provide support for our method, we introduced three hypotheses of validation based on work by Endres and Carson (1998 J. Cereb. Blood Flow Metab. 18 1196-210) and Yoder et al (2004 J. Nucl. Med. 45 903-11), and tested them on experimental data. All three hypotheses describe relationships between the estimated free (synaptic) dopamine curves (F^DA(t)) and the change in binding potential (ΔBP). The veracity of the F^DA(t) curves recovered by nonparametric ntPET is supported when the data adhere to the following hypothesized behaviors: (1) ΔBP should decline with increasing DA peak time, (2) ΔBP should increase as the strength of the temporal correlation between F^DA(t) and the free raclopride (F^RAC(t)) curve increases, (3) ΔBP should decline linearly with the effective weighted availability of the receptor sites. We analyzed regional brain data from 8 healthy subjects who received two [11C]raclopride scans: one at rest, and one during which unanticipated IV alcohol was administered to stimulate dopamine release. For several striatal regions, nonparametric ntPET was applied to recover F^DA(t), and binding potential values were determined. Kendall rank-correlation analysis confirmed that the F^DA(t) data followed the expected trends for all three validation hypotheses. Our findings lend credence to our model-independent estimates of F^DA(t). Application of nonparametric ntPET may yield important insights into how alterations in timing of dopaminergic neurotransmission are involved in the pathologies of addiction and other psychiatric disorders.
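
    A Kendall rank-correlation check of a monotone trend, such as a decline in the change in binding potential with increasing dopamine peak time, can be sketched as below. The sketch computes Kendall's tau-a from pair counts; the abstract does not state which variant was used, and the peak times and binding potential changes are invented numbers, not study data.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # Tied pairs (product == 0) count as neither.
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical values: peak time (minutes) and change in binding potential.
peak_time = [1, 2, 3, 4, 5]
delta_bp = [0.30, 0.25, 0.27, 0.18, 0.10]
tau = kendall_tau(peak_time, delta_bp)
```

    A strongly negative tau, as in this toy case, is what the first validation hypothesis would lead one to expect.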

  3. Effect of Exclusive Breastfeeding Among Overweight and Obese Mothers on Infant Weight-for-Length Percentile at 1 Year.

    Science.gov (United States)

    Yeung, Hui; Leff, Michelle; Rhee, Kyung E

    Breastfeeding is associated with decreased risk of childhood obesity. However, there is a strong correlation between maternal weight status and childhood obesity, and it is unclear whether or not breastfeeding among overweight mothers could mitigate this risk. Our goal was to examine whether or not exclusive breastfeeding (compared to formula feeding) among overweight and obese mothers is associated with lower weight-for-length (W/L) percentile at 1 year. Data from the Infant Feeding Practices II study were used. Infants who were preterm or underweight at 1 year, and mothers who were underweight before pregnancy, were excluded from analysis. There was a significant interaction between exclusive breastfeeding for 4 months and maternal prepregnancy weight status (normal weight, overweight, obese) on infant W/L percentile at 1 year. Stratified linear mixed-effects growth modeling controlling for covariates was created to test the relationship between exclusive breastfeeding and infant W/L percentile within each maternal weight category. A total of 915 subjects met inclusion criteria. Normal weight and obese mothers who exclusively breastfed for 4 months had infants with a smaller rate of increase in W/L percentile during the first year compared with those who used formula. Infants of overweight and obese mothers who exclusively breastfed for 4 months had lower W/L percentile at 1 year than those who used formula. Exclusive breastfeeding for 4 months among normal weight and obese mothers resulted in less increase in W/L percentiles in the first year. Obese mothers often have a difficult time initiating and maintaining breastfeeding. Concerted efforts are needed to support this population with breastfeeding.

  4. The relationship between dietary patterns, body mass index percentile, and household food security in young urban children.

    Science.gov (United States)

    Trapp, Christine M; Burke, Georgine; Gorin, Amy A; Wiley, James F; Hernandez, Dominica; Crowell, Rebecca E; Grant, Autherene; Beaulieu, Annamarie; Cloutier, Michelle M

    2015-04-01

    The relationship between food insecurity and child obesity is unclear. Few studies have examined dietary patterns in children with regard to household food security and weight status. The aim of this study was to examine the association between household food security, dietary intake, and BMI percentile in low-income, preschool children. Low-income caregivers (n=222) with children ages 2-4 years were enrolled in a primary-care-based obesity prevention/reversal study (Steps to Growing Up Healthy) between October 2010 and December 2011. At baseline, demographic data, household food security status (US Household Food Security Instrument) and dietary intake (Children's Dietary Questionnaire; CDQ) were collected. BMI percentile was calculated from anthropometric data. Participating children were primarily Hispanic (90%), Medicaid insured (95%), 50% female, 35±8.7 months of age (mean±standard deviation), 19% overweight (BMI 85th-94th percentile), and 29% obese (≥95th percentile). Thirty-eight percent of interviews were conducted in Spanish. Twenty-five percent of households reported food insecurity. There was no association between household food insecurity and child BMI percentile. Dietary patterns of the children based on the CDQ did not differ by household food security status. Food group subscale scores (fruit and vegetable, fat from dairy, sweetened beverages, and noncore foods) on the CDQ did not differ between normal weight and overweight/obese children. Maternal depression and stress did not mediate the relationship between household food insecurity and child weight status. Hispanic children were more likely to be overweight or obese in both food-secure and food-insecure households. Household food insecurity was not associated with child BMI percentile in this study. Dietary intake patterns of children from food-insecure households were not different compared to those from food-secure households.

  5. Development and Evaluation of a Proposed Neck Shield for the 5th Percentile Hybrid III Female Dummy.

    Science.gov (United States)

    Banglmaier, Richard F; Pecoraro, Katie M; Feustel, Jim R; Scherer, Risa D; Rouhana, Stephen W

    2005-11-01

    Frontal airbag interaction with the head and neck of the Hybrid III family of dummies may involve a non-biofidelic interaction. Researchers have found that the deploying airbag may become entrapped in the hollow cavity behind the dummy chin. This study evaluated a prototype neck shield design, the Flap Neck Shield, for biofidelic response and the ability to prevent airbag entrapment in the chin/jaw cavity. Neck pendulum calibration tests were conducted for biofidelity evaluation. Static and dynamic airbag deployments were conducted to evaluate neck shield performance. Tests showed that the Flap Neck Shield behaved in a biofidelic manner with neck loads and head motion within established biofidelic limits. The Flap Neck Shield did not alter the neck loads during static or dynamic airbag interactions, but it did consistently prevent the airbag from penetrating the chin/jaw cavity. Use of the Flap Neck Shield with the 5th percentile Hybrid III female dummy is recommended for frontal airbag deployments given its acceptable biofidelic response and repeatable performance.

  6. Ranking Tehran’s Stock Exchange Top Fifty Stocks Using Fundamental Indexes and Fuzzy TOPSIS

    Directory of Open Access Journals (Sweden)

    E. S. Saleh

    2017-08-01

    Investment through the purchase of securities constitutes an important part of countries' economic exchange. Therefore, making decisions about investing in a particular stock has become one of the most controversial areas of economic and financial research, and various institutions have begun to rank companies' stocks and determine stock purchase priorities for investment. The current research, by determining the important indexes required for ranking companies based on their share value on the Tehran Stock Exchange, can greatly help the accurate ranking of the fifty premier listed companies. Initial ranking indicators are extracted, and then a decision-making group (exchange experts) determines the final indexes using the Delphi method and non-parametric statistical methods. Then, by using fuzzy ANP, criteria weights are obtained, taking into account their interaction with each other. Finally, the fifty premier companies listed on the Tehran Stock Exchange in 2014 are ranked using fuzzy TOPSIS and information extracted about them, with the software "Rahavard Novin". Sensitivity analysis of the criteria weights and presentation of the relevant analysis were conducted at the end of the study.

  7. Validating rankings in soccer championships

    Directory of Open Access Journals (Sweden)

    Annibal Parracho Sant'Anna

    2012-08-01

    The final ranking of a championship is determined by quality attributes combined with other factors which should be filtered out of any decision on relegation or draft for upper level tournaments. Factors like referees' mistakes and the difficulty of certain matches due to their accidental importance to the opponents should have their influence reduced. This work tests approaches to combine classification rules considering the imprecision of the number of points as a measure of quality and of the variables that provide reliable explanation for it. Two home-advantage variables are tested and shown to be apt to enter as explanatory variables. Independence between the criteria is checked against the hypothesis of maximal correlation. The importance of factors and of composition rules is evaluated on the basis of correlation between rank vectors, number of classes and number of clubs in tail classes. Data from five years of the Brazilian Soccer Championship are analyzed.

  8. Minkowski metrics in creating universal ranking algorithms

    Directory of Open Access Journals (Sweden)

    Andrzej Ameljańczyk

    2014-06-01

    The paper presents a general procedure for creating rankings of a set of objects, where the preference relation is based on any ranking function. The analysis of possible ranking functions begins by showing the fundamental drawbacks of the commonly used functions in the form of a weighted sum. As a special case of the ranking procedure in the space of a relation, a procedure based on the notion of an ideal element and the generalized Minkowski distance from that element is proposed. This procedure, presented as a universal ranking algorithm, eliminates most of the disadvantages of ranking functions in the form of a weighted sum. Keywords: ranking functions, preference relation, ranking clusters, categories, ideal point, universal ranking algorithm
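
    Ranking by Minkowski distance from an ideal element can be sketched as follows. This is an illustration under stated assumptions rather than the paper's algorithm: all criteria are assumed to be maximized, each criterion is rescaled to [0, 1], the ideal element is taken as the best observed value on every criterion, and the objects and scores are invented.

```python
def minkowski_distance(a, b, p):
    """Generalized Minkowski distance of order p between two vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def rank_by_ideal_point(objects, p=2.0):
    """Order objects by their distance from the ideal element (closest first)."""
    names = list(objects)
    columns = list(zip(*(objects[name] for name in names)))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]

    def normalise(vector):
        # Rescale each criterion to [0, 1]; a constant criterion maps to 1.
        return [
            (v - lo) / (hi - lo) if hi > lo else 1.0
            for v, lo, hi in zip(vector, lows, highs)
        ]

    ideal = [1.0] * len(columns)  # best value on every (maximized) criterion
    distances = {
        name: minkowski_distance(normalise(objects[name]), ideal, p)
        for name in names
    }
    return sorted(names, key=lambda name: distances[name]), distances


# Hypothetical objects scored on two criteria.
candidates = {"A": (8, 3), "B": (5, 9), "C": (9, 8), "D": (2, 2)}
order, distances = rank_by_ideal_point(candidates, p=2.0)
```

    Changing `p` changes how strongly a single poor criterion is penalized, which is one way such a procedure differs from a fixed weighted sum.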

  9. A nonparametric dynamic additive regression model for longitudinal data

    DEFF Research Database (Denmark)

    Martinussen, T.; Scheike, T. H.

    2000-01-01

    dynamic linear models, estimating equations, least squares, longitudinal data, nonparametric methods, partly conditional mean models, time-varying-coefficient models

  10. Nonparametric Estimation of Cumulative Incidence Functions for Competing Risks Data with Missing Cause of Failure

    DEFF Research Database (Denmark)

    Effraimidis, Georgios; Dahl, Christian Møller

    In this paper, we develop a fully nonparametric approach for the estimation of the cumulative incidence function with Missing At Random right-censored competing risks data. We obtain results on the pointwise asymptotic normality as well as the uniform convergence rate of the proposed nonparametric...

  11. Non-parametric tests of productive efficiency with errors-in-variables

    NARCIS (Netherlands)

    Kuosmanen, T.K.; Post, T.; Scholtes, S.

    2007-01-01

    We develop a non-parametric test of productive efficiency that accounts for errors-in-variables, following the approach of Varian [1985. Nonparametric analysis of optimizing behavior with measurement error. Journal of Econometrics 30(1/2), 445-458]. The test is based on the general Pareto-Koopmans

  12. Functional Multiplex PageRank

    Science.gov (United States)

    Iacovacci, Jacopo; Rahmede, Christoph; Arenas, Alex; Bianconi, Ginestra

    2016-10-01

    Recently it has been recognized that many complex social, technological and biological networks have a multilayer nature and can be described by multiplex networks. Multiplex networks are formed by a set of nodes connected by links having different connotations forming the different layers of the multiplex. Characterizing the centrality of the nodes in a multiplex network is a challenging task since the centrality of a node naturally depends on the importance associated with links of a certain type. Here we propose to assign to each node of a multiplex network a centrality called Functional Multiplex PageRank that is a function of the weights given to every different pattern of connections (multilinks) existing in the multiplex network between any two nodes. Since multilinks distinguish all the possible ways in which the links in different layers can overlap, the Functional Multiplex PageRank can describe important non-linear effects when large relevance or small relevance is assigned to multilinks with overlap. Here we apply the Functional Multiplex PageRank to the multiplex airport networks, to the neuronal network of the nematode C. elegans, and to social collaboration and citation networks between scientists. This analysis reveals important differences existing between the most central nodes of these networks, and the correlations between their so-called pattern to success.

  13. Low rank magnetic resonance fingerprinting.

    Science.gov (United States)

    Mazor, Gal; Weizman, Lior; Tal, Assaf; Eldar, Yonina C

    2016-08-01

    Magnetic Resonance Fingerprinting (MRF) is a relatively new approach that provides quantitative MRI using randomized acquisition. Extraction of physical quantitative tissue values is performed off-line, based on acquisition with varying parameters and a dictionary generated according to the Bloch equations. MRF uses hundreds of radio frequency (RF) excitation pulses for acquisition, and therefore a high under-sampling ratio in the sampling domain (k-space) is required. This under-sampling causes spatial artifacts that hamper the ability to accurately estimate the quantitative tissue values. In this work, we introduce a new approach for quantitative MRI using MRF, called Low Rank MRF. We exploit the low rank property of the temporal domain, on top of the well-known sparsity of the MRF signal in the generated dictionary domain. We present an iterative scheme that consists of a gradient step followed by a low rank projection using the singular value decomposition. Experiments on real MRI data demonstrate superior results compared to conventional implementation of compressed sensing for MRF at 15% sampling ratio.
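
    The "gradient step followed by a low rank projection" can be illustrated with a minimal sketch. It is not the authors' implementation: the real-valued matrix operator A, the least-squares objective, and the function names are assumptions made for illustration (actual MRF reconstruction works with under-sampled Fourier operators on complex data).

```python
import numpy as np

def low_rank_projection(X, rank):
    """Project X onto the matrices of rank <= `rank` via a truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s[rank:] = 0.0  # keep only the `rank` largest singular values
    return (U * s) @ Vt

def gradient_then_project(X, A, Y, step, rank):
    """One iteration: gradient step on 0.5 * ||A X - Y||_F^2, then projection.

    A is a hypothetical real-valued measurement matrix, Y the measured data.
    """
    grad = A.T @ (A @ X - Y)
    return low_rank_projection(X - step * grad, rank)
```

    Repeating gradient_then_project until the iterate stops changing gives a projected-gradient scheme of the kind the abstract describes.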

  14. Ranking Support Vector Machine with Kernel Approximation

    Directory of Open Access Journals (Sweden)

    Kai Chen

    2017-01-01

    Learning to rank algorithm has become important in recent years due to its successful application in information retrieval, recommender system, and computational biology, and so forth. Ranking support vector machine (RankSVM) is one of the state-of-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problem. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. Primal truncated Newton method is used to optimize the pairwise L2-loss (squared Hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method gets a much faster training speed than kernel RankSVM and achieves comparable or better performance over state-of-the-art ranking algorithms.

  15. Ranking Support Vector Machine with Kernel Approximation.

    Science.gov (United States)

    Chen, Kai; Li, Rongchun; Dou, Yong; Liang, Zhengfa; Lv, Qi

    2017-01-01

    Learning to rank algorithm has become important in recent years due to its successful application in information retrieval, recommender system, and computational biology, and so forth. Ranking support vector machine (RankSVM) is one of the state-of-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problem. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. Primal truncated Newton method is used to optimize the pairwise L2-loss (squared Hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method gets a much faster training speed than kernel RankSVM and achieves comparable or better performance over state-of-the-art ranking algorithms.
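
    As a rough illustration of one of the two kernel approximations named above, the sketch below builds random Fourier features for an RBF kernel. The kernel choice k(x, y) = exp(-gamma * ||x - y||^2), the function name, and the parameters are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def random_fourier_features(X, n_features, gamma, seed=0):
    """Map X (n_samples, d) to features Z such that Z @ Z.T approximates
    the RBF kernel matrix with entries exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies are drawn from the Fourier transform of the RBF kernel,
    # a Gaussian with variance 2 * gamma in each dimension.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)
```

    A linear ranking model trained on these features then stands in for the kernel model, so the full kernel matrix never has to be formed.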

  16. Blast Mitigation Sea Analysis - Evaluation of Lumbar Compression Data Trends in 5th Percentile Female Anthropomorphic Test Device Performance Compared to 50th Percentile Male Anthropomorphic Test Device in Drop Tower Testing

    Science.gov (United States)

    2016-08-21

    Kelly Bosch, PE Proceedings of the ASME 2016 International Design Engineering Technical Conferences & Computers and Information in Engineering...imparted on the occupant • Ideal EA device would reduce peak load and duration to reduce injury probability 5th Percentile Female – 200 g Pulse

  17. The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard

    and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences...... within a nonparametric panel data regression framework. The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric......This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...

  18. SibRank: Signed bipartite network analysis for neighbor-based collaborative ranking

    Science.gov (United States)

    Shams, Bita; Haratizadeh, Saman

    2016-09-01

    Collaborative ranking is an emerging field of recommender systems that utilizes users' preference data rather than rating values. Unfortunately, neighbor-based collaborative ranking has gained little attention despite its greater flexibility and justifiability. This paper proposes a novel framework, called SibRank, that seeks to improve the state-of-the-art neighbor-based collaborative ranking methods. SibRank represents users' preferences as a signed bipartite network, and finds similar users through a novel personalized ranking algorithm in signed networks.

  19. Parental Activity as Influence on Children's BMI Percentiles and Physical Activity

    Directory of Open Access Journals (Sweden)

    Nanette Erkelenz, Susanne Kobel, Sarah Kettner, Clemens Drenowatz, Jürgen M. Steinacker and the Research Group "Join the Healthy Boat - Primary School"

    2014-09-01

    Parents play a crucial role in the development of their children’s lifestyle and health behaviour. This study aims to examine associations between parental physical activity (PA) and children’s BMI percentiles (BMIPCT), moderate to vigorous PA (MVPA), as well as participation in organised sports. Height and body weight were measured in 1615 German children (7.1 ± 0.6 years, 50.3% male) and converted to BMIPCT. Parental BMI was calculated based on self-reported height and body weight. Children’s MVPA and sports participation as well as parental PA were assessed via parental questionnaire. Analysis of covariance (ANCOVA), controlling for age and family income, was used to examine the association between parental and children’s PA levels as well as BMIPCT. 39.7% of the parents classified themselves as physically active and 8.3% of children were classified as overweight or obese. Lower BMIPCT were observed with both parents being physically active (44.5 ± 26.3 vs. 50.2 ± 26.9 and 52.0 ± 28.4, respectively). There was no association between parental and children’s PA levels but children with at least one active parent displayed a higher participation in organised sports (102.0 ± 96.6 and 117.7 ± 123.6 vs. 73.7 ± 100.0, respectively). Children of active parents were less likely to be overweight and obese. The lack of association between subjectively assessed parental PA and child MVPA suggests that parental support for PA in children is more important than parents being a role model. More active parents, however, may be more likely to facilitate participation in organised sports. These results underline the importance of the inclusion of parents in health promotion and obesity prevention programmes in children.

  20. Glaucoma Monitoring in a Clinical Setting Glaucoma Progression Analysis vs Nonparametric Progression Analysis in the Groningen Longitudinal Glaucoma Study

    NARCIS (Netherlands)

    Wesselink, Christiaan; Heeg, Govert P.; Jansonius, Nomdo M.

    Objective: To compare prospectively 2 perimetric progression detection algorithms for glaucoma, the Early Manifest Glaucoma Trial algorithm (glaucoma progression analysis [GPA]) and a nonparametric algorithm applied to the mean deviation (MD) (nonparametric progression analysis [NPA]). Methods:

  1. The association of weight percentile and motor vehicle crash injury among 3 to 8 year old children.

    Science.gov (United States)

    Zonfrillo, Mark R; Nelson, Kyle A; Durbin, Dennis R; Kallan, Michael J

    2010-01-01

    The use of age-appropriate child restraint systems significantly reduces injury and death associated with motor vehicle crashes (MVCs). Pediatric obesity has become a global epidemic. Although recent evidence suggests a possible association between pediatric obesity and MVC-related injury, there are potential misclassifications of body mass index from under-estimated height in younger children. Given this limitation, age- and sex-specific weight percentiles can be used as a proxy of weight status. The specific aim of this study was to determine the association between weight percentile and the risk of significant injury for children 3-8 years in MVCs. This was a cross-sectional study of children aged 3-8 years in MVCs in 16 US states, with data collected via insurance claims records and a telephone survey from 12/1/98-11/30/07. Parent-reported injuries with an Abbreviated Injury Scale (AIS) score of 2+ indicated a clinically significant injury. Age- and sex-specific weight percentiles were calculated using pediatric norms. The study sample included 9,327 children aged 3-8 years (weighted to represent 157,878 children), of which 0.96% sustained clinically significant injuries. There was no association between weight percentiles and overall injury when adjusting for restraint type (p=0.71). However, increasing weight percentiles were associated with lower extremity injuries at a level that approached significance (p=0.053). Further research is necessary to describe mechanisms for weight-related differences in injury risk. Parents should continue to properly restrain their children in accordance with published guidelines.

  2. Rank Two Affine Manifolds in Genus 3

    OpenAIRE

    Aulicino, David; Nguyen, Duc-Manh

    2016-01-01

    We complete the classification of rank two affine manifolds in the moduli space of translation surfaces in genus three. Combined with a recent result of Mirzakhani and Wright, this completes the classification of higher rank affine manifolds in genus three.

  3. Population models and simulation methods: The case of the Spearman rank correlation.

    Science.gov (United States)

    Astivia, Oscar L Olvera; Zumbo, Bruno D

    2017-11-01

    The purpose of this paper is to highlight the importance of a population model in guiding the design and interpretation of simulation studies used to investigate the Spearman rank correlation. The Spearman rank correlation has been known for over a hundred years to applied researchers and methodologists alike and is one of the most widely used non-parametric statistics. Still, certain misconceptions can be found, either explicitly or implicitly, in the published literature because a population definition for this statistic is rarely discussed within the social and behavioural sciences. By relying on copula distribution theory, a population model is presented for the Spearman rank correlation, and its properties are explored both theoretically and in a simulation study. Through the use of the Iman-Conover algorithm (which allows the user to specify the rank correlation as a population parameter), simulation studies from previously published articles are explored, and it is found that many of the conclusions purported in them regarding the nature of the Spearman correlation would change if the data-generation mechanism better matched the simulation design. More specifically, issues such as small sample bias and lack of power of the t-test and r-to-z Fisher transformation disappear when the rank correlation is calculated from data sampled where the rank correlation is the population parameter. A proof for the consistency of the sample estimate of the rank correlation is shown as well as the flexibility of the copula model to encompass results previously published in the mathematical literature. © 2017 The British Psychological Society.
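
    For orientation, the sample statistic discussed above can be computed as the Pearson correlation of the ranks. The sketch below is a minimal illustration that assumes no tied values; the helper names are hypothetical, and it does not reproduce the copula-based population model or the Iman-Conover algorithm used in the paper.

```python
import numpy as np

def ranks(x):
    """Ranks 1..n of the entries of x (assumes no ties)."""
    x = np.asarray(x)
    order = np.argsort(x)
    r = np.empty(len(x), dtype=float)
    r[order] = np.arange(1, len(x) + 1)
    return r

def spearman(x, y):
    """Sample Spearman rank correlation: Pearson correlation of the ranks."""
    return np.corrcoef(ranks(x), ranks(y))[0, 1]
```

    Because only ranks enter the calculation, any strictly increasing transformation of x or y leaves the result unchanged.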

  4. Decision-Oriented Project Ranking for Asset Management System: Rail Net Denmark

    DEFF Research Database (Denmark)

    Salling, Kim Bang; Moshøj, Claus Rehfeld; Timm, Henrik

    2007-01-01

    is to apply a modified project ranking methodology: Asset Management System Priority Module (AMS-PM), which is a practical tool for assessing and ranking various project proposals in a straightforward manner. The methodology is set-out by a multi-criteria approach where weights are applied ultimately...... resulting in priority indices for the state-of-repair data. This paper is disposed as follows; firstly, a description of the Asset Management system is set-up including an overview of the state-of-repair data and the case study. Secondly, is the AMS-PM software model implemented through an exploratory case......The Danish rail net operator, Rail Net Denmark, has through the past years built up an Asset Management system, containing a certain percentile of all the company’s assets. This paper contains an elaborate overview on how to strengthen the system seen from a decision-support perspective. The focus...

  5. A local non-parametric model for trade sign inference

    Science.gov (United States)

    Blazejewski, Adam; Coggins, Richard

    2005-03-01

    We investigate a regularity in market order submission strategies for 12 stocks with large market capitalization on the Australian Stock Exchange. The regularity is evidenced by a predictable relationship between the trade sign (trade initiator), size of the trade, and the contents of the limit order book before the trade. We demonstrate this predictability by developing an empirical inference model to classify trades into buyer-initiated and seller-initiated. The model employs a local non-parametric method, k-nearest neighbor, which in the past was used successfully for chaotic time series prediction. The k-nearest neighbor with three predictor variables achieves an average out-of-sample classification accuracy of 71.40%, compared to 63.32% for the linear logistic regression with seven predictor variables. The result suggests that a non-linear approach may produce a more parsimonious trade sign inference model with a higher out-of-sample classification accuracy. Furthermore, for most of our stocks the observed regularity in market order submissions seems to have a memory of at least 30 trading days.
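
    The k-nearest neighbor classifier mentioned above can be sketched in a few lines. This is a generic majority-vote version under assumed conventions (Euclidean distance, labels +1 for buyer-initiated and -1 for seller-initiated, odd k); the abstract does not specify the predictors' scaling or the distance used in the paper.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k):
    """Majority vote among the k nearest training points.

    Labels are assumed to be +1 or -1; use an odd k to avoid tied votes.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    X_query = np.asarray(X_query, dtype=float)
    # Squared Euclidean distance between every query and training point.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest].sum(axis=1)
    return np.where(votes >= 0, 1, -1)
```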

  6. Efficient nonparametric n-body force fields from machine learning

    Science.gov (United States)

    Glielmo, Aldo; Zeni, Claudio; De Vita, Alessandro

    2018-05-01

    We provide a definition and explicit expressions for n-body Gaussian process (GP) kernels, which can learn any interatomic interaction occurring in a physical system, up to n-body contributions, for any value of n. The series is complete, as it can be shown that the "universal approximator" squared exponential kernel can be written as a sum of n-body kernels. These recipes enable the choice of optimally efficient force models for each target system, as confirmed by extensive testing on various materials. We furthermore describe how the n-body kernels can be "mapped" on equivalent representations that provide database-size-independent predictions and are thus crucially more efficient. We explicitly carry out this mapping procedure for the first nontrivial (three-body) kernel of the series, and we show that this reproduces the GP-predicted forces with meV/Å accuracy while being orders of magnitude faster. These results pave the way to using novel force models (here named "M-FFs") that are computationally as fast as their corresponding standard parametrized n-body force fields, while retaining the nonparametric character, the ease of training and validation, and the accuracy of the best recently proposed machine-learning potentials.

  7. Non-parametric Bayesian networks: Improving theory and reviewing applications

    International Nuclear Information System (INIS)

    Hanea, Anca; Morales Napoles, Oswaldo; Ababei, Dan

    2015-01-01

    Applications in various domains often lead to high dimensional dependence modelling. A Bayesian network (BN) is a probabilistic graphical model that provides an elegant way of expressing the joint distribution of a large number of interrelated variables. BNs have been successfully used to represent uncertain knowledge in a variety of fields. The majority of applications use discrete BNs, i.e. BNs whose nodes represent discrete variables. Integrating continuous variables in BNs is an area fraught with difficulty. Several methods that handle discrete-continuous BNs have been proposed in the literature. This paper concentrates only on one method called non-parametric BNs (NPBNs). NPBNs were introduced in 2004 and they have been or are currently being used in at least twelve professional applications. This paper provides a short introduction to NPBNs, a couple of theoretical advances, and an overview of applications. The aim of the paper is twofold: one is to present the latest improvements of the theory underlying NPBNs, and the other is to complement the existing overviews of BNs applications with the NPBNs applications. The latter opens the opportunity to discuss some difficulties that applications pose to the theoretical framework and in this way offers some NPBN modelling guidance to practitioners. - Highlights: • The paper gives an overview of the current NPBNs methodology. • We extend the NPBN methodology by relaxing the conditions of one of its fundamental theorems. • We propose improvements of the data mining algorithm for the NPBNs. • We review the professional applications of the NPBNs.

  8. Nonparametric predictive inference for combined competing risks data

    International Nuclear Information System (INIS)

    Coolen-Maturi, Tahani; Coolen, Frank P.A.

    2014-01-01

    The nonparametric predictive inference (NPI) approach for competing risks data has recently been presented, in particular addressing the question due to which of the competing risks the next unit will fail, and also considering the effects of unobserved, re-defined, unknown or removed competing risks. In this paper, we introduce how the NPI approach can be used to deal with situations where units are not all at risk from all competing risks. This may typically occur if one combines information from multiple samples, which can, e.g. be related to further aspects of units that define the samples or groups to which the units belong or to different applications where the circumstances under which the units operate can vary. We study the effect of combining the additional information from these multiple samples, so effectively borrowing information on specific competing risks from other units, on the inferences. Such combination of information can be relevant to competing risks scenarios in a variety of application areas, including engineering and medical studies

  9. Transition redshift: new constraints from parametric and nonparametric methods

    Energy Technology Data Exchange (ETDEWEB)

    Rani, Nisha; Mahajan, Shobhit; Mukherjee, Amitabha [Department of Physics and Astrophysics, University of Delhi, New Delhi 110007 (India); Jain, Deepak [Deen Dayal Upadhyaya College, University of Delhi, New Delhi 110015 (India); Pires, Nilza, E-mail: nrani@physics.du.ac.in, E-mail: djain@ddu.du.ac.in, E-mail: shobhit.mahajan@gmail.com, E-mail: amimukh@gmail.com, E-mail: npires@dfte.ufrn.br [Departamento de Física Teórica e Experimental, UFRN, Campus Universitário, Natal, RN 59072-970 (Brazil)

    2015-12-01

    In this paper, we use the cosmokinematics approach to study the accelerated expansion of the Universe. This is a model independent approach and depends only on the assumption that the Universe is homogeneous and isotropic and is described by the FRW metric. We parametrize the deceleration parameter, $q(z)$, to constrain the transition redshift ($z_t$) at which the expansion of the Universe goes from a decelerating to an accelerating phase. We use three different parametrizations of $q(z)$, namely $q_{I}(z) = q_1 + q_2 z$, $q_{II}(z) = q_3 + q_4 \ln(1 + z)$ and $q_{III}(z) = \frac{1}{2} + q_5/(1 + z)^2$. A joint analysis of the age of galaxies, strong lensing and supernovae Ia data indicates that the transition redshift is less than unity, i.e. $z_t < 1$. We also use a nonparametric approach (LOESS+SIMEX) to constrain $z_t$. This too gives $z_t < 1$, which is consistent with the value obtained by the parametric approach.
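
    Since the transition redshift is where the deceleration parameter changes sign, each of the three parametrizations gives a closed-form transition redshift by solving q = 0. The short derivation below is a sketch that assumes the parameters take values for which the root exists and is non-negative; the abstract reports no fitted values, so none are used here.

```latex
\begin{align*}
q_{I}(z_t) = q_1 + q_2 z_t = 0
  &\;\Longrightarrow\; z_t = -\frac{q_1}{q_2}, \\
q_{II}(z_t) = q_3 + q_4 \ln(1 + z_t) = 0
  &\;\Longrightarrow\; z_t = \exp\!\left(-\frac{q_3}{q_4}\right) - 1, \\
q_{III}(z_t) = \frac{1}{2} + \frac{q_5}{(1 + z_t)^2} = 0
  &\;\Longrightarrow\; z_t = \sqrt{-2 q_5} - 1 \quad (q_5 < 0).
\end{align*}
```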

  10. Discrete non-parametric kernel estimation for global sensitivity analysis

    International Nuclear Information System (INIS)

    Senga Kiessé, Tristan; Ventura, Anne

    2016-01-01

    This work investigates the discrete kernel approach for evaluating the contribution of the variance of discrete input variables to the variance of model output, via analysis of variance (ANOVA) decomposition. Until recently only the continuous kernel approach has been applied as a metamodeling approach within sensitivity analysis framework, for both discrete and continuous input variables. Now the discrete kernel estimation is known to be suitable for smoothing discrete functions. We present a discrete non-parametric kernel estimator of ANOVA decomposition of a given model. An estimator of sensitivity indices is also presented with its asymptotic convergence rate. Some simulations on a test function analysis and a real case study from agriculture have shown that the discrete kernel approach outperforms the continuous kernel one for evaluating the contribution of moderate or most influential discrete parameters to the model output. - Highlights: • We study a discrete kernel estimation for sensitivity analysis of a model. • A discrete kernel estimator of ANOVA decomposition of the model is presented. • Sensitivity indices are calculated for discrete input parameters. • An estimator of sensitivity indices is also presented with its convergence rate. • An application is realized for improving the reliability of environmental models.

  11. Nonparametric predictive inference for combining diagnostic tests with parametric copula

    Science.gov (United States)

    Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.

    2017-09-01

    Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests. The area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we are interested in developing strategies for combining test results in order to increase the diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results while taking their dependence structure into account using a parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations. NPI uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. A copula is a well-known statistical concept for modelling dependence of random variables: it is a joint distribution function whose marginals are all uniformly distributed, and it can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method, namely the maximum likelihood estimator (MLE). We investigate the performance of this proposed method via data sets from the literature and discuss results to show how our method performs for different families of copulas. Finally, we briefly outline related challenges and opportunities for future research.

  12. Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines

    Science.gov (United States)

    Malley, J. D.; Kruppa, J.; Dasgupta, A.; Malley, K. G.; Ziegler, A.

    2011-01-01

    Background: Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. Objectives: The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Methods: Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Results: Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Conclusions: Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications. PMID: 21915433

  13. Nonparametric Integrated Agrometeorological Drought Monitoring: Model Development and Application

    Science.gov (United States)

    Zhang, Qiang; Li, Qin; Singh, Vijay P.; Shi, Peijun; Huang, Qingzhong; Sun, Peng

    2018-01-01

    Drought is a major natural hazard that has massive impacts on society. How to monitor drought is critical for its mitigation and early warning. This study proposed a modified version of the multivariate standardized drought index (MSDI) based on precipitation, evapotranspiration, and soil moisture, i.e., modified multivariate standardized drought index (MMSDI). This study also used nonparametric joint probability distribution analysis. Comparisons were done between standardized precipitation evapotranspiration index (SPEI), standardized soil moisture index (SSMI), MSDI, and MMSDI, and real-world observed drought regimes. Results indicated that MMSDI detected droughts that SPEI and/or SSMI failed to detect. Also, MMSDI detected almost all droughts that were identified by SPEI and SSMI. Further, droughts detected by MMSDI were similar to real-world observed droughts in terms of drought intensity and drought-affected area. When compared to MMSDI, MSDI has the potential to overestimate drought intensity and drought-affected area across China, which should be attributed to exclusion of the evapotranspiration components from estimation of drought intensity. Therefore, MMSDI is proposed for drought monitoring that can detect agrometeorological droughts. Results of this study provide a framework for integrated drought monitoring in other regions of the world and can help to develop drought mitigation.

  14. Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.

    Science.gov (United States)

    Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A

    2018-01-30

    Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd.

  15. Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.

    Science.gov (United States)

    Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi

    2015-02-01

    We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.

  16. Nonparametric adaptive age replacement with a one-cycle criterion

    International Nuclear Information System (INIS)

    Coolen-Schrijner, P.; Coolen, F.P.A.

    2007-01-01

    Age replacement of technical units has received much attention in the reliability literature over the last four decades. Mostly, the failure time distribution for the units is assumed to be known, and minimal costs per unit of time is used as optimality criterion, where renewal reward theory simplifies the mathematics involved but requires the assumption that the same process and replacement strategy continues over a very large ('infinite') period of time. Recently, there has been increasing attention to adaptive strategies for age replacement, taking into account the information from the process. Although renewal reward theory can still be used to provide an intuitively and mathematically attractive optimality criterion, it is more logical to use minimal costs per unit of time over a single cycle as optimality criterion for adaptive age replacement. In this paper, we first show that in the classical age replacement setting, with known failure time distribution with increasing hazard rate, the one-cycle criterion leads to earlier replacement than the renewal reward criterion. Thereafter, we present adaptive age replacement with a one-cycle criterion within the nonparametric predictive inferential framework. We study the performance of this approach via simulations, which are also used for comparisons with the use of the renewal reward criterion within the same statistical framework

  17. Bayesian Nonparametric Model for Estimating Multistate Travel Time Distribution

    Directory of Open Access Journals (Sweden)

    Emmanuel Kidando

    2017-01-01

    Multistate models, that is, models with more than two distributions, are preferred over single-state probability models in modeling the distribution of travel time. Literature review indicated that the finite multistate modeling of travel time using lognormal distribution is superior to other probability functions. In this study, we extend the finite multistate lognormal model of estimating the travel time distribution to unbounded lognormal distribution. In particular, a nonparametric Dirichlet Process Mixture Model (DPMM) with stick-breaking process representation was used. The strength of the DPMM is that it can choose the number of components dynamically as part of the algorithm during parameter estimation. To reduce computational complexity, the modeling process was limited to a maximum of six components. Then, the Markov Chain Monte Carlo (MCMC) sampling technique was employed to estimate the parameters’ posterior distribution. Speed data from nine links of a freeway corridor, aggregated on a 5-minute basis, were used to calculate the corridor travel time. The results demonstrated that this model offers significant flexibility in modeling to account for complex mixture distributions of the travel time without specifying the number of components. The DPMM modeling further revealed that freeway travel time is characterized by multistate or single-state models depending on the inclusion of onset and offset of congestion periods.
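
    The truncated stick-breaking construction mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the concentration parameter `alpha = 1.0` and the function name `stick_breaking_weights` are assumptions, and only the cap of six components comes from the abstract.

```python
import random

def stick_breaking_weights(alpha, max_components, rng):
    """Draw mixture weights from a truncated stick-breaking construction."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for _ in range(max_components - 1):
        fraction = rng.betavariate(1.0, alpha)  # Beta(1, alpha) break point
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # the last component absorbs the leftover mass
    return weights

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# alpha = 1.0 is an assumed value; six components matches the stated cap.
weights = stick_breaking_weights(alpha=1.0, max_components=6, rng=rng)
```

    Components whose weight ends up near zero are effectively unused, which is how a model of this kind can settle on fewer than six states.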

  18. Nonparametric Bayes Classification and Hypothesis Testing on Manifolds

    Science.gov (United States)

    Bhattacharya, Abhishek; Dunson, David

    2012-01-01

    Our first focus is prediction of a categorical response variable using features that lie on a general manifold. For example, the manifold may correspond to the surface of a hypersphere. We propose a general kernel mixture model for the joint distribution of the response and predictors, with the kernel expressed in product form and dependence induced through the unknown mixing measure. We provide simple sufficient conditions for large support and weak and strong posterior consistency in estimating both the joint distribution of the response and predictors and the conditional distribution of the response. Focusing on a Dirichlet process prior for the mixing measure, these conditions hold using von Mises-Fisher kernels when the manifold is the unit hypersphere. In this case, Bayesian methods are developed for efficient posterior computation using slice sampling. Next we develop Bayesian nonparametric methods for testing whether there is a difference in distributions between groups of observations on the manifold having unknown densities. We prove consistency of the Bayes factor and develop efficient computational methods for its calculation. The proposed classification and testing methods are evaluated using simulation examples and applied to spherical data applications. PMID:22754028

  19. Bayesian nonparametric meta-analysis using Polya tree mixture models.

    Science.gov (United States)

    Branscum, Adam J; Hanson, Timothy E

    2008-09-01

    Summary. A common goal in meta-analysis is estimation of a single effect measure using data from several studies that are each designed to address the same scientific inquiry. Because studies are typically conducted in geographically disperse locations, recent developments in the statistical analysis of meta-analytic data involve the use of random effects models that account for study-to-study variability attributable to differences in environments, demographics, genetics, and other sources that lead to heterogeneity in populations. Stemming from asymptotic theory, study-specific summary statistics are modeled according to normal distributions with means representing latent true effect measures. A parametric approach subsequently models these latent measures using a normal distribution, which is strictly a convenient modeling assumption absent of theoretical justification. To eliminate the influence of overly restrictive parametric models on inferences, we consider a broader class of random effects distributions. We develop a novel hierarchical Bayesian nonparametric Polya tree mixture (PTM) model. We present methodology for testing the PTM versus a normal random effects model. These methods provide researchers a straightforward approach for conducting a sensitivity analysis of the normality assumption for random effects. An application involving meta-analysis of epidemiologic studies designed to characterize the association between alcohol consumption and breast cancer is presented, which together with results from simulated data highlight the performance of PTMs in the presence of nonnormality of effect measures in the source population.

  20. DPpackage: Bayesian Semi- and Nonparametric Modeling in R

    Directory of Open Access Journals (Sweden)

    Alejandro Jara

    2011-04-01

    Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.

  1. The Privilege of Ranking: Google Plays Ball.

    Science.gov (United States)

    Wiggins, Richard

    2003-01-01

    Discussion of ranking systems used in various settings, including college football and academic admissions, focuses on the Google search engine. Explains the PageRank mathematical formula that scores Web pages by connecting the number of links; limitations, including authenticity and accuracy of ranked Web pages; relevancy; adjusting algorithms;…

  2. A Comprehensive Analysis of Marketing Journal Rankings

    Science.gov (United States)

    Steward, Michelle D.; Lewis, Bruce R.

    2010-01-01

    The purpose of this study is to offer a comprehensive assessment of journal standings in Marketing from two perspectives. The discipline perspective of rankings is obtained from a collection of published journal ranking studies during the past 15 years. The studies in the published ranking stream are assessed for reliability by examining internal…

  3. An alternative approach to risk rank chemicals on the threat they pose to the aquatic environment.

    Science.gov (United States)

    Johnson, Andrew C; Donnachie, Rachel L; Sumpter, John P; Jürgens, Monika D; Moeckel, Claudia; Pereira, M Gloria

    2017-12-01

    This work presents a new and unbiased method of risk ranking chemicals based on the threat they pose to the aquatic environment. The study ranked 12 metals, 23 pesticides, 11 other persistent organic pollutants (POPs), 13 pharmaceuticals, 10 surfactants and similar compounds and 2 nanoparticles (total of 71) of concern against one another by comparing their median UK river water and median ecotoxicity effect concentrations. To complement this, by giving an assessment on potential wildlife impacts, risk ranking was also carried out by comparing the lowest 10th percentile of the effects data with the highest 90th percentile of the exposure data. In other words, risk was pared down to just toxicity versus exposure. Further modifications included incorporating bioconcentration factors, using only recent water measurements and excluding either lethal or sub-lethal effects. The top ten chemicals, based on the medians, which emerged as having the highest risk to organisms in UK surface waters using all the ecotoxicity data were copper, aluminium, zinc, ethinylestradiol (EE2), linear alkylbenzene sulfonate (LAS), triclosan, manganese, iron, methomyl and chlorpyrifos. By way of contrast, using current UK environmental quality standards as the comparator to median UK river water concentrations would have selected 6 different chemicals in the top ten. This approach revealed big differences in relative risk; for example, zinc presented a million times greater risk than metoprolol and LAS 550 times greater risk than nanosilver. With the exception of EE2, most pharmaceuticals were ranked as having a relatively low risk. Copyright © 2017 Elsevier B.V. All rights reserved.
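
    The two comparisons described above (median exposure against median effect, and 90th-percentile exposure against 10th-percentile effect) might be computed along these lines. Expressing each comparison as a ratio is an assumption, the function names are hypothetical, and the concentrations are made-up placeholders rather than values from the study.

```python
from statistics import median, quantiles

# Hypothetical concentrations in matching units; NOT data from the study.
chemicals = {
    "chemical_A": {"exposure": [1.0, 2.0, 3.0, 4.0, 5.0],
                   "effects": [10.0, 20.0, 30.0, 40.0, 50.0]},
    "chemical_B": {"exposure": [0.1, 0.2, 0.3],
                   "effects": [100.0, 200.0, 300.0]},
}

def median_risk(exposure, effects):
    """Median river concentration relative to median effect concentration."""
    return median(exposure) / median(effects)

def tail_risk(exposure, effects):
    """90th percentile of exposure relative to 10th percentile of effects."""
    high_exposure = quantiles(exposure, n=10, method="inclusive")[8]
    low_effect = quantiles(effects, n=10, method="inclusive")[0]
    return high_exposure / low_effect

# Higher ratio means exposure sits closer to (or above) effect levels.
ranking = sorted(chemicals,
                 key=lambda name: median_risk(**chemicals[name]),
                 reverse=True)
```

    The tail-based ratio is the more precautionary of the two, since it pairs the most sensitive effects with the highest exposures.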

  4. Birthweight percentiles by gestational age for births following assisted reproductive technology in Australia and New Zealand, 2002-2010.

    Science.gov (United States)

    Li, Zhuoyang; Wang, Yueping A; Ledger, William; Sullivan, Elizabeth A

    2014-08-01

    What is the standard of birthweight for gestational age for babies following assisted reproductive technology (ART) treatment? Birthweight for gestational age percentile charts were developed for singleton births following ART treatment using population-based data. Small for gestational age (SGA) and large for gestational age (LGA) births are at increased risks of perinatal morbidity and mortality. A birthweight percentile chart allows the detection of neonates at high risk, and can help inform the need for special care if required. This population study used data from the Australian and New Zealand Assisted Reproduction Database (ANZARD) for 72 694 live born singletons following ART treatment between January 2002 and December 2010 in Australia and New Zealand. A total of 69 315 births (35 580 males and 33 735 females) following ART treatment were analysed for the birthweight percentile. Exact percentiles of birthweight in grams were calculated for each gestational week between Week 25 and 42 for fresh and thaw cycles by infant sex. Univariate analysis was used to determine the exact birthweight percentile values. Student t-test was used to examine the mean birthweight difference between male and female infants, between single embryo transfer (SET) and double embryo transfer (DET) and between fresh and thaw cycles. Preterm births (birth before 37 completed weeks of gestation) and low birthweight (fetal growth standards but only the weight of live born infants at birth. The comparison of birthweight percentile charts for ART births and general population births provides evidence that the proportion of SGA births following ART treatment was comparable to the general population for SET fresh cycles and significantly lower for thaw cycles. Both fresh and thaw cycles showed better outcomes for singleton births following SET compared with DET. Policies to promote single embryo transfer should be considered in order to minimize the adverse perinatal outcomes associated

  5. Two-dimensional ranking of Wikipedia articles

    Science.gov (United States)

    Zhirov, A. O.; Zhirov, O. V.; Shepelyansky, D. L.

    2010-10-01

    The Library of Babel, described by Jorge Luis Borges, stores an enormous amount of information. The Library exists ab aeterno. Wikipedia, a free online encyclopaedia, becomes a modern analogue of such a Library. Information retrieval and ranking of Wikipedia articles become the challenge of modern society. While PageRank highlights very well known nodes with many ingoing links, CheiRank highlights very communicative nodes with many outgoing links. In this way the ranking becomes two-dimensional. Using CheiRank and PageRank we analyze the properties of two-dimensional ranking of all Wikipedia English articles and show that it gives their reliable classification with rich and nontrivial features. Detailed studies are done for countries, universities, personalities, physicists, chess players, Dow-Jones companies and other categories.
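
    The contrast between the two rankings can be illustrated on a toy graph. This is a sketch under stated assumptions: CheiRank is treated here as PageRank computed on the network with all link directions inverted, the damping factor 0.85 is the conventional choice rather than a value from the abstract, and the five-node graph is hypothetical.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration on a dict {node: [nodes it links to]}."""
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Nodes without outgoing links spread their rank uniformly.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        for source, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank

def reverse_links(links):
    """Return the same graph with every link direction inverted."""
    reversed_links = {}
    for source, targets in links.items():
        reversed_links.setdefault(source, [])
        for target in targets:
            reversed_links.setdefault(target, []).append(source)
    return reversed_links

def cheirank(links, damping=0.85, iterations=100):
    """PageRank of the link-inverted graph (assumed CheiRank definition)."""
    return pagerank(reverse_links(links), damping, iterations)

# Hypothetical graph: "hub" links out a lot, "authority" is linked to a lot.
toy = {"hub": ["a", "b", "c"], "a": ["authority"], "b": ["authority"],
       "c": ["authority"], "authority": []}
top_pagerank = max(pagerank(toy), key=pagerank(toy).get)
top_cheirank = max(cheirank(toy), key=cheirank(toy).get)
```

    Pairing the two scores for each node gives the two-dimensional position the abstract refers to.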

  6. 24 CFR 599.401 - Ranking of applications.

    Science.gov (United States)

    2010-04-01

    ... 24 Housing and Urban Development 3 2010-04-01 2010-04-01 false Ranking of applications. 599.401... Communities § 599.401 Ranking of applications. (a) Ranking order. Rural and urban applications will be ranked... applications ranked first. (b) Separate ranking categories. After initial ranking, both rural and urban...

  7. Nonparametric, Coupled, Bayesian, Dictionary, and Classifier Learning for Hyperspectral Classification.

    Science.gov (United States)

    Akhtar, Naveed; Mian, Ajmal

    2017-10-03

    We present a principled approach to learn a discriminative dictionary along a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size--the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which, we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.

  8. A robust nonparametric method for quantifying undetected extinctions.

    Science.gov (United States)

    Chisholm, Ryan A; Giam, Xingli; Sadanandan, Keren R; Fung, Tak; Rheindt, Frank E

    2016-06-01

    How many species have gone extinct in modern times before being described by science? To answer this question, and thereby get a full assessment of humanity's impact on biodiversity, statistical methods that quantify undetected extinctions are required. Such methods have been developed recently, but they are limited by their reliance on parametric assumptions; specifically, they assume the pools of extant and undetected species decay exponentially, whereas real detection rates vary temporally with survey effort and real extinction rates vary with the waxing and waning of threatening processes. We devised a new, nonparametric method for estimating undetected extinctions. As inputs, the method requires only the first and last date at which each species in an ensemble was recorded. As outputs, the method provides estimates of the proportion of species that have gone extinct, detected, or undetected and, in the special case where the number of undetected extant species in the present day is assumed close to zero, of the absolute number of undetected extinct species. The main assumption of the method is that the per-species extinction rate is independent of whether a species has been detected or not. We applied the method to the resident native bird fauna of Singapore. Of 195 recorded species, 58 (29.7%) have gone extinct in the last 200 years. Our method projected that an additional 9.6 species (95% CI 3.4, 19.8) have gone extinct without first being recorded, implying a true extinction rate of 33.0% (95% CI 31.0%, 36.2%). We provide R code for implementing our method. Because our method does not depend on strong assumptions, we expect it to be broadly useful for quantifying undetected extinctions. © 2016 Society for Conservation Biology.
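
    The reported percentages appear to follow from simple arithmetic on the counts in the abstract. The sketch below assumes that the "true" rate adds the projected undetected extinctions to both the numerator and the denominator; that reading is an inference from the numbers, not a formula quoted from the paper.

```python
recorded_species = 195     # species recorded in Singapore (from the abstract)
recorded_extinct = 58      # recorded species that have gone extinct
undetected_extinct = 9.6   # projected point estimate (from the abstract)

# Share of recorded species that went extinct.
observed_rate = recorded_extinct / recorded_species

# Assumed reading: undetected extinctions enlarge both the extinct count
# and the total species pool.
true_rate = ((recorded_extinct + undetected_extinct)
             / (recorded_species + undetected_extinct))
```

    Under this reading the observed rate rounds to 29.7% and the adjusted rate to 33.0%, matching the point estimates quoted above.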

  9. Economic decision making and the application of nonparametric prediction models

    Science.gov (United States)

    Attanasi, E.D.; Coburn, T.C.; Freeman, P.A.

    2008-01-01

    Sustained increases in energy prices have focused attention on gas resources in low-permeability shale or in coals that were previously considered economically marginal. Daily well deliverability is often relatively small, although the estimates of the total volumes of recoverable resources in these settings are often large. Planning and development decisions for extraction of such resources must be areawide because profitable extraction requires optimization of scale economies to minimize costs and reduce risk. For an individual firm, the decision to enter such plays depends on reconnaissance-level estimates of regional recoverable resources and on cost estimates to develop untested areas. This paper shows how simple nonparametric local regression models, used to predict technically recoverable resources at untested sites, can be combined with economic models to compute regional-scale cost functions. The context of the worked example is the Devonian Antrim-shale gas play in the Michigan basin. One finding relates to selection of the resource prediction model to be used with economic models. Models chosen because they can best predict aggregate volume over larger areas (many hundreds of sites) smooth out granularity in the distribution of predicted volumes at individual sites. This loss of detail affects the representation of economic cost functions and may affect economic decisions. Second, because some analysts consider unconventional resources to be ubiquitous, the selection and order of specific drilling sites may, in practice, be determined arbitrarily by extraneous factors. The analysis shows a 15-20% gain in gas volume when these simple models are applied to order drilling prospects strategically rather than to choose drilling locations randomly. Copyright © 2008 Society of Petroleum Engineers.

  10. Agro-tourism and ranking

    Science.gov (United States)

    Cioca, L. I.; Giurea, R.; Precazzini, I.; Ragazzi, M.; Achim, M. I.; Schiavon, M.; Rada, E. C.

    2018-05-01

    Nowadays the global tourism growth has caused a significant interest in research focused on the impact of the tourism on environment and community. The purpose of this study is to introduce a new ranking for the classification of tourist accommodation establishments with the functions of agro-tourism boarding house type by examining the sector of agro-tourism based on a research aimed to improve the economic, socio-cultural and environmental performance of agrotourism structures. This paper links the criteria for the classification of agro-tourism boarding houses (ABHs) to the impact of agro-tourism activities on the environment, enhancing an eco-friendly approach on agro-tourism activities by increasing the quality reputation of the agro-tourism products and services. Taking into account the impact on the environment, agrotourism can play an important role by protecting and conserving it.

  11. Serum Thyroid-Stimulating Hormone Levels and Body Mass Index Percentiles in Children with Primary Hypothyroidism on Levothyroxine Replacement.

    Science.gov (United States)

    Shaoba, Asma; Basu, Sanjib; Mantis, Stelios; Minutti, Carla

    2017-12-15

    To determine the association, if any, between thyroid-stimulating hormone (TSH) levels and body mass index (BMI) percentiles in children with primary hypothyroidism who are chemically euthyroid and on treatment with levothyroxine. This retrospective cross-sectional study consisted of a review of medical records from RUSH Medical Center and Stroger Hospital, Chicago, USA of children with primary hypothyroidism who were seen in the clinic from 2008 to 2014 and who were chemically euthyroid and on treatment with levothyroxine for at least 6 months. The patients were divided into two groups based on their TSH levels (0.34-hypothyroidism who are chemically euthyroid on treatment with levothyroxine, there is a positive association between higher TSH levels and higher BMI percentiles. However, it is difficult to establish if the higher TSH levels are a direct cause or a consequence of the obesity. Further studies are needed to establish causation beyond significant association.

  12. Percentile Values for Running Sprint Field Tests in Children Ages 6-17 Years: Influence of Weight Status

    Science.gov (United States)

    Castro-Pinero, Jose; Gonzalez-Montesinos, Jose Luis; Keating, Xiaofen D.; Mora, Jesus; Sjostrom, Michael; Ruiz, Jonatan R.

    2010-01-01

    The aim of this study was to provide percentile values for six different sprint tests in 2,708 Spanish children (1,234 girls) ages 6-17.9 years. We also examined the influence of weight status on sprint performance across age groups, with a focus on underweight and obese groups. We used the 20-m, 30-m, and 50-m running sprint standing start and…

  13. Height, weight and BMI percentiles and nutritional status relative to the international growth references among Pakistani school-aged children

    OpenAIRE

    Mushtaq, Muhammad Umair; Gull, Sibgha; Mushtaq, Komal; Abdullah, Hussain Muhammad; Khurshid, Usman; Shahid, Ubeera; Shad, Mushtaq Ahmad; Akram, Javed

    2012-01-01

    Abstract Background Child growth is internationally recognized as an important indicator of nutritional status and health in populations. This study was aimed to compare age- and gender-specific height, weight and BMI percentiles and nutritional status relative to the international growth references among Pakistani school-aged children. Methods A population-based study was conducted with a multistage cluster sample of 1860 children aged five to twelve years in Lahore, Pakistan. Smoothed heigh...

  14. Nonparametric Monitoring for Geotechnical Structures Subject to Long-Term Environmental Change

    Directory of Open Access Journals (Sweden)

    Hae-Bum Yun

    2011-01-01

    A nonparametric, data-driven methodology of monitoring for geotechnical structures subject to long-term environmental change is discussed. Avoiding physical assumptions or excessive simplification of the monitored structures, the nonparametric monitoring methodology presented in this paper provides reliable performance-related information particularly when the collection of sensor data is limited. For the validation of the nonparametric methodology, a field case study was performed using a full-scale retaining wall, which had been monitored for three years using three tilt gauges. Using the very limited sensor data, it is demonstrated that important performance-related information, such as drainage performance and sensor damage, could be disentangled from significant daily, seasonal and multiyear environmental variations. Extensive literature review on recent developments of parametric and nonparametric data processing techniques for geotechnical applications is also presented.

  15. Kernel bandwidth estimation for non-parametric density estimation: a comparative study

    CSIR Research Space (South Africa)

    Van der Walt, CM

    2013-12-01

    We investigate the performance of conventional bandwidth estimators for non-parametric kernel density estimation on a number of representative pattern-recognition tasks, to gain a better understanding of the behaviour of these estimators in high...
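
    As an example of what a conventional bandwidth estimator looks like, the sketch below uses Silverman's rule of thumb with a one-dimensional Gaussian kernel. The truncated abstract does not say which estimators were compared, so this choice is an assumption, and the sample data are synthetic.

```python
import math
import random
from statistics import quantiles, stdev

def silverman_bandwidth(samples):
    """Rule of thumb: 0.9 * min(std, IQR / 1.34) * n^(-1/5)."""
    n = len(samples)
    q1, _, q3 = quantiles(samples, n=4)
    spread = min(stdev(samples), (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-0.2)

def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)

    return density

rng = random.Random(1)  # synthetic standard-normal data for illustration
samples = [rng.gauss(0.0, 1.0) for _ in range(200)]
h = silverman_bandwidth(samples)
density = gaussian_kde(samples, h)
```

    A rule of this kind is derived for a single variable, so it does not by itself address the high-dimensional settings the abstract goes on to mention.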

  16. Examples of the Application of Nonparametric Information Geometry to Statistical Physics

    Directory of Open Access Journals (Sweden)

    Giovanni Pistone

    2013-09-01

    We review a nonparametric version of Amari’s information geometry in which the set of positive probability densities on a given sample space is endowed with an atlas of charts to form a differentiable manifold modeled on Orlicz Banach spaces. This nonparametric setting is used to discuss the setting of typical problems in machine learning and statistical physics, such as black-box optimization, Kullback-Leibler divergence, Boltzmann-Gibbs entropy and the Boltzmann equation.

  17. Screen Wars, Star Wars, and Sequels: Nonparametric Reanalysis of Movie Profitability

    OpenAIRE

    W. D. Walls

    2012-01-01

    In this paper we use nonparametric statistical tools to quantify motion-picture profit. We quantify the unconditional distribution of profit, the distribution of profit conditional on stars and sequels, and we also model the conditional expectation of movie profits using a non-parametric data-driven regression model. The flexibility of the non-parametric approach accommodates the full range of possible relationships among the variables without prior specification of a functional form, thereb...

  18. Developing an immigration policy for Germany on the basis of a nonparametric labor market classification

    OpenAIRE

    Froelich, Markus; Puhani, Patrick

    2004-01-01

    Based on a nonparametrically estimated model of labor market classifications, this paper makes suggestions for immigration policy using data from western Germany in the 1990s. It is demonstrated that nonparametric regression is feasible in higher dimensions with only a few thousand observations. In sum, labor markets able to absorb immigrants are characterized by above average age and by professional occupations. On the other hand, labor markets for young workers in service occupations are id...

  19. Nonparametric Identification and Estimation of Finite Mixture Models of Dynamic Discrete Choices

    OpenAIRE

    Hiroyuki Kasahara; Katsumi Shimotsu

    2006-01-01

    In dynamic discrete choice analysis, controlling for unobserved heterogeneity is an important issue, and finite mixture models provide flexible ways to account for unobserved heterogeneity. This paper studies nonparametric identifiability of type probabilities and type-specific component distributions in finite mixture models of dynamic discrete choices. We derive sufficient conditions for nonparametric identification for various finite mixture models of dynamic discrete choices used in appli...

  20. Nonparametric Change Point Diagnosis Method of Concrete Dam Crack Behavior Abnormality

    Directory of Open Access Journals (Sweden)

    Zhanchao Li

    2013-01-01

    Diagnosing abnormal crack behavior in concrete dams has long been both a focus and a difficulty in the safety monitoring of hydraulic structures. Based on how abnormal crack behavior manifests in parametric and nonparametric statistical models, the relation between concrete dam crack behavior abnormality and statistical change point theory is analyzed in terms of the structural instability of the parametric model and the change in the sequence distribution law of the nonparametric model. On this basis, through reduction of the change point problem, establishment of a basic nonparametric change point model, and asymptotic analysis of the test method for the basic change point problem, a nonparametric change point diagnosis method for concrete dam crack behavior abnormality is developed that allows for the practical case in which crack behavior has more than one abnormality point. The method is applied to an actual project, demonstrating its effectiveness and scientific reasonableness. The method also has a complete theoretical basis and strong practicality, with broad application prospects in actual projects.

  1. Error analysis of stochastic gradient descent ranking.

    Science.gov (United States)

    Chen, Hong; Tang, Yi; Li, Luoqing; Yuan, Yuan; Li, Xuelong; Tang, Yuanyan

    2013-06-01

    Ranking is always an important task in machine learning and information retrieval, e.g., collaborative filtering, recommender systems, drug discovery, etc. A kernel-based stochastic gradient descent algorithm with the least squares loss is proposed for ranking in this paper. The implementation of this algorithm is simple, and an expression of the solution is derived via a sampling operator and an integral operator. An explicit convergence rate for learning a ranking function is given in terms of the suitable choices of the step size and the regularization parameter. The analysis technique used here is capacity independent and is novel in error analysis of ranking learning. Experimental results on real-world data have shown the effectiveness of the proposed algorithm in ranking tasks, which verifies the theoretical analysis in ranking error.
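
    One way to picture a kernel-based stochastic gradient descent ranker with a least squares loss is the sketch below. It is not the paper's algorithm: the pairwise loss on score differences, the Gaussian kernel, the step-size schedule and the regularization value are all assumptions chosen for illustration, and the data are synthetic.

```python
import math
import random

def gaussian_kernel(a, b, width=1.0):
    return math.exp(-((a - b) ** 2) / (2.0 * width ** 2))

def fit_ranker(xs, ys, steps=2000, step_size=0.5, reg=1e-3, seed=0):
    """SGD on 0.5 * ((f(xi) - f(xj)) - (yi - yj))^2 + 0.5 * reg * ||f||^2."""
    rng = random.Random(seed)
    coef = [0.0] * len(xs)  # f(x) = sum_k coef[k] * K(xs[k], x)

    def score(x):
        return sum(c * gaussian_kernel(xk, x) for c, xk in zip(coef, xs))

    for t in range(1, steps + 1):
        i, j = rng.sample(range(len(xs)), 2)   # one random pair per step
        eta = step_size / math.sqrt(t)         # assumed decaying step size
        residual = (score(xs[i]) - score(xs[j])) - (ys[i] - ys[j])
        coef = [(1.0 - eta * reg) * c for c in coef]  # regularization shrink
        coef[i] -= eta * residual              # gradient along K(xs[i], .)
        coef[j] += eta * residual              # gradient along -K(xs[j], .)
    return score

# Synthetic one-dimensional data where larger x should rank higher.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = list(xs)
score = fit_ranker(xs, ys)
```

    Only the ordering of the returned scores matters for ranking, so the learned function is determined up to an additive constant.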

  2. An adaptive distance measure for use with nonparametric models

    International Nuclear Information System (INIS)

    Garvey, D. R.; Hines, J. W.

    2006-01-01

    Distance measures perform a critical task in nonparametric, locally weighted regression. Locally weighted regression (LWR) models are a form of 'lazy learning' which construct a local model 'on the fly' by comparing a query vector to historical, exemplar vectors according to a three step process. First, the distance of the query vector to each of the exemplar vectors is calculated. Next, these distances are passed to a kernel function, which converts the distances to similarities or weights. Finally, the model output or response is calculated by performing locally weighted polynomial regression. To date, traditional distance measures, such as the Euclidean, weighted Euclidean, and L1-norm have been used as the first step in the prediction process. Since these measures do not take into consideration sensor failures and drift, they are inherently ill-suited for application to 'real world' systems. This paper describes one such LWR model, namely auto associative kernel regression (AAKR), and describes a new, Adaptive Euclidean distance measure that can be used to dynamically compensate for faulty sensor inputs. In this new distance measure, the query observations that lie outside of the training range (i.e. outside the minimum and maximum input exemplars) are dropped from the distance calculation. This allows for the distance calculation to be robust to sensor drifts and failures, in addition to providing a method for managing inputs that exceed the training range. In this paper, AAKR models using the standard and Adaptive Euclidean distance are developed and compared for the pressure system of an operating nuclear power plant. It is shown that using the standard Euclidean distance for data with failed inputs, significant errors in the AAKR predictions can result. By using the Adaptive Euclidean distance it is shown that high fidelity predictions are possible, in spite of the input failure. In fact, it is shown that with the Adaptive Euclidean distance prediction
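
    The three-step process and the adaptive distance described above can be sketched as follows. The Gaussian kernel, the bandwidth of 0.5, and the use of a weighted average of exemplars as the final step are assumptions, and the exemplar values are hypothetical rather than plant data.

```python
import math

def adaptive_distance(query, exemplar, lows, highs):
    """Euclidean distance that skips query inputs outside [low, high]."""
    total = 0.0
    for q, e, lo, hi in zip(query, exemplar, lows, highs):
        if lo <= q <= hi:  # inputs outside the training range are dropped
            total += (q - e) ** 2
    return math.sqrt(total)

def aakr_predict(query, exemplars, bandwidth=0.5, adaptive=True):
    columns = list(zip(*exemplars))
    if adaptive:
        lows = [min(c) for c in columns]
        highs = [max(c) for c in columns]
    else:  # standard Euclidean distance: no input is ever dropped
        lows = [-math.inf] * len(columns)
        highs = [math.inf] * len(columns)
    # Step 1: distance from the query to every exemplar.
    squared = [adaptive_distance(query, ex, lows, highs) ** 2
               for ex in exemplars]
    # Step 2: Gaussian kernel weights; subtracting the smallest squared
    # distance avoids underflow and leaves the normalized weights unchanged.
    nearest = min(squared)
    weights = [math.exp(-(d - nearest) / (2.0 * bandwidth ** 2))
               for d in squared]
    # Step 3: weighted average of the exemplars (assumed final step).
    total = sum(weights)
    return [sum(w * ex[k] for w, ex in zip(weights, exemplars)) / total
            for k in range(len(columns))]

# Hypothetical exemplars from three correlated sensors.
exemplars = [(1.0, 2.0, 3.0), (2.0, 4.0, 6.0),
             (3.0, 6.0, 9.0), (4.0, 8.0, 12.0)]
query = (2.0, 4.0, 50.0)  # third sensor has failed high
adaptive_estimate = aakr_predict(query, exemplars, adaptive=True)
standard_estimate = aakr_predict(query, exemplars, adaptive=False)
```

    In this toy case the adaptive estimate stays near the matching exemplar (2, 4, 6), while the standard distance is dominated by the failed input and pulls the estimate to the exemplar with the largest third value.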

  3. Using Static Percentiles of AE9/AP9 to Approximate Dynamic Monte Carlo Runs for Radiation Analysis of Spiral Transfer Orbits

    Science.gov (United States)

    Kwan, Betty P.; O'Brien, T. Paul

    2015-06-01

    The Aerospace Corporation performed a study to determine whether static percentiles of AE9/AP9 can be used to approximate dynamic Monte Carlo runs for radiation analysis of spiral transfer orbits. Solar panel degradation is a major concern for solar-electric propulsion because solar-electric propulsion depends on the power output of the solar panel. Different spiral trajectories have different radiation environments that could lead to solar panel degradation. Because the spiral transfer orbits only last weeks to months, an average environment does not adequately address the possible transient enhancements of the radiation environment that must be accounted for in optimizing the transfer orbit trajectory. Therefore, to optimize the trajectory, an ensemble of Monte Carlo simulations of AE9/AP9 would normally be run for every spiral trajectory to determine the 95th percentile radiation environment. To avoid performing lengthy Monte Carlo dynamic simulations for every candidate spiral trajectory in the optimization, we found a static percentile that would be an accurate representation of the full Monte Carlo simulation for a representative set of spiral trajectories. For 3 LEO to GEO and 1 LEO to MEO trajectories, a static 90th percentile AP9 is a good approximation of the 95th percentile fluence with dynamics for 4-10 MeV protons, and a static 80th percentile AE9 is a good approximation of the 95th percentile fluence with dynamics for 0.5-2 MeV electrons. While the specific percentiles chosen cannot necessarily be used in general for other orbit trade studies, the concept of determining a static percentile as a quick approximation to a full Monte Carlo ensemble of simulations can likely be applied to other orbit trade studies. We expect the static percentile to depend on the region of space traversed, the mission duration, and the radiation effect considered.

  4. Methodology for ranking restoration options

    International Nuclear Information System (INIS)

    Hedemann Jensen, Per

    1999-04-01

    The work described in this report has been performed as a part of the RESTRAT Project FI4P-CT95-0021a (PL 950128) co-funded by the Nuclear Fission Safety Programme of the European Commission. The RESTRAT project has the overall objective of developing generic methodologies for ranking restoration techniques as a function of contamination and site characteristics. The project includes analyses of existing remediation methodologies and contaminated sites, and is structured in the following steps: characterisation of relevant contaminated sites; identification and characterisation of relevant restoration techniques; assessment of the radiological impact; development and application of a selection methodology for restoration options; formulation of generic conclusions and development of a manual. The project is intended to apply to situations in which sites with nuclear installations have been contaminated with radioactive materials as a result of the operation of these installations. The areas considered for remedial measures include contaminated land areas, rivers and sediments in rivers, lakes, and sea areas. Five contaminated European sites have been studied. Various remedial measures have been envisaged with respect to the optimisation of the protection of the populations being exposed to the radionuclides at the sites. Cost-benefit analysis and multi-attribute utility analysis have been applied for optimisation. Health, economic and social attributes have been included and weighting factors for the different attributes have been determined by the use of scaling constants. (au)

  5. Citation graph based ranking in Invenio

    CERN Document Server

    Marian, Ludmila; Rajman, Martin; Vesely, Martin

    2010-01-01

    Invenio is the web-based integrated digital library system developed at CERN. Within this framework, we present four types of ranking models based on the citation graph that complement the simple approach based on citation counts: time-dependent citation counts, a relevancy ranking which extends the PageRank model, a time-dependent ranking which combines the freshness of citations with PageRank and a ranking that takes into consideration the external citations. We present our analysis and results obtained on two main data sets: Inspire and CERN Document Server. Our main contributions are: (i) a study of the currently available ranking methods based on the citation graph; (ii) the development of new ranking methods that correct some of the identified limitations of the current methods such as treating all citations of equal importance, not taking time into account or considering the citation graph complete; (iii) a detailed study of the key parameters for these ranking methods. (The original publication is ava...

  6. Communities in Large Networks: Identification and Ranking

    DEFF Research Database (Denmark)

    Olsen, Martin

    2008-01-01

    We study the problem of identifying and ranking the members of a community in a very large network with link analysis only, given a set of representatives of the community. We define the concept of a community justified by a formal analysis of a simple model of the evolution of a directed graph. ...... and its immediate surroundings. The members are ranked with a “local” variant of the PageRank algorithm. Results are reported from successful experiments on identifying and ranking Danish Computer Science sites and Danish Chess pages using only a few representatives....

  7. Ranking Entities in Networks via Lefschetz Duality

    DEFF Research Database (Denmark)

    Aabrandt, Andreas; Hansen, Vagn Lundsgaard; Poulsen, Bjarne

    2014-01-01

    then be ranked according to how essential their positions are in the network by considering the effect of their respective absences. Defining a ranking of a network which takes the individual position of each entity into account has the purpose of assigning different roles to the entities, e.g. agents......, in the network. In this paper it is shown that the topology of a given network induces a ranking of the entities in the network. Further, it is demonstrated how to calculate this ranking and thus how to identify weak sub-networks in any given network....

  8. Application of Radial Basis Function Methods in the Development of a 95th Percentile Male Seated FEA Model.

    Science.gov (United States)

    Vavalle, Nicholas A; Schoell, Samantha L; Weaver, Ashley A; Stitzel, Joel D; Gayzik, F Scott

    2014-11-01

    Human body finite element models (FEMs) are a valuable tool in the study of injury biomechanics. However, the traditional model development process can be time-consuming. Scaling and morphing an existing FEM is an attractive alternative for generating morphologically distinct models for further study. The objective of this work is to use a radial basis function to morph the Global Human Body Models Consortium (GHBMC) average male model (M50) to the body habitus of a 95th percentile male (M95) and to perform validation tests on the resulting model. The GHBMC M50 model (v. 4.3) was created using anthropometric and imaging data from a living subject representing a 50th percentile male. A similar dataset was collected from a 95th percentile male (22,067 total images) and was used in the morphing process. Homologous landmarks on the reference (M50) and target (M95) geometries, with the existing FE node locations (M50 model), were inputs to the morphing algorithm. The radial basis function was applied to morph the FE model. The model represented a mass of 103.3 kg and contained 2.2 million elements with 1.3 million nodes. Simulations of the M95 in seven loading scenarios were presented ranging from a chest pendulum impact to a lateral sled test. The morphed model matched anthropometric data to within a root mean square difference of 4.4% while maintaining element quality commensurate to the M50 model and matching other anatomical ranges and targets. The simulation validation data matched experimental data well in most cases.

  9. Percentiles of body fat measured by bioelectrical impedance in children and adolescents from Bogotá (Colombia): the FUPRECOL study.

    Science.gov (United States)

    Escobar-Cardozo, Germán D; Correa-Bautista, Jorge E; González-Jiménez, Emilio; Schmidt-RioValle, Jacqueline; Ramírez-Vélez, Robinson

    2016-04-01

    The analysis of body composition is a fundamental part of nutritional status assessment. The objective of this study was to establish body fat percentiles by bioelectrical impedance in children and adolescents from Bogotá (Colombia) who were part of the FUPRECOL study (Asociación de la Fuerza Prensil con Manifestaciones Tempranas de Riesgo Cardiovascular en Niños y Adolescentes Colombianos - Association between prehensile force and early signs of cardiovascular risk in Colombian children and adolescents). This was a cross-sectional study conducted among 5850 students aged 9-17.9 years old from Bogotá (Colombia). Body fat percentage was measured using foot-to-foot bioelectrical impedance (Tanita®, BF-689), by age and gender. Weight, height, waist circumference, and hip circumference were measured, and sexual maturity was self-staged. Percentiles (P3, P10, P25, P50, P75, P90 and P97) and centile curves were estimated using the LMS method (L [Box-Cox curve], M [median curve] and S [variation coefficient curve]), by age and gender. Subjects included were 2526 children and 3324 adolescents. Body fat percentages and centile curves by age and gender were established. For most age groups, values were higher among girls than boys. Participants with values above P90 were considered to have a high cardiovascular risk due to excess fat (boys > 23.4-28.3, girls > 31.0-34.1). Body fat percentage percentiles measured using bioelectrical impedance by age and gender are presented here and may be used as reference to assess nutritional status and to predict cardiovascular risk due to excess fat at an early age. Sociedad Argentina de Pediatría.
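
    For orientation, the LMS method mentioned in this abstract is usually written in the following standard form, which turns the three fitted curves into any desired centile. The notation is an assumption made here for illustration (t is age, z_alpha is the standard normal quantile for the centile of interest, y is an individual measurement); the abstract itself does not state the formulas.

```latex
% Centile curve at age t from the fitted L (Box-Cox power), M (median)
% and S (coefficient of variation) curves:
C_{\alpha}(t) =
\begin{cases}
  M(t)\,\bigl[\,1 + L(t)\,S(t)\,z_{\alpha}\bigr]^{1/L(t)}, & L(t) \neq 0,\\[4pt]
  M(t)\,\exp\bigl[S(t)\,z_{\alpha}\bigr], & L(t) = 0.
\end{cases}

% Inverse relation: z-score of an individual measurement y at age t
% (for L(t) \neq 0):
z = \frac{\bigl[y / M(t)\bigr]^{L(t)} - 1}{L(t)\,S(t)}
```

    For example, the P90 curve corresponds to z_alpha of approximately 1.28 and the P50 curve to z_alpha = 0, in which case the centile reduces to the median curve M(t).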

  10. Height, weight and BMI percentiles and nutritional status relative to the international growth references among Pakistani school-aged children

    Directory of Open Access Journals (Sweden)

    Mushtaq Muhammad Umair

    2012-03-01

    Background: Child growth is internationally recognized as an important indicator of nutritional status and health in populations. This study aimed to compare age- and gender-specific height, weight and BMI percentiles and nutritional status relative to the international growth references among Pakistani school-aged children. Methods: A population-based study was conducted with a multistage cluster sample of 1860 children aged five to twelve years in Lahore, Pakistan. Smoothed height, weight and BMI percentile curves were obtained and comparison was made with the World Health Organization 2007 (WHO) and United States' Centers for Disease Control and Prevention 2000 (USCDC) references. Over- and under-nutrition were defined according to the WHO and USCDC references, and the International Obesity Task Force (IOTF) cut-offs. Simple descriptive statistics were used and statistical significance was considered at P ... Results: Height, weight and BMI percentiles increased with age among both boys and girls, and both had approximately the same height and a lower weight and BMI as compared to the WHO and USCDC references. Mean differences from zero for height-, weight- and BMI-for-age z score values relative to the WHO and USCDC references were significant (P ...). Conclusion: Pakistani school-aged children significantly differed from the WHO and USCDC references. However, z score means relative to the WHO reference were closer to zero and the present study as compared to the USCDC reference. Overweight and obesity were significantly higher while underweight and thinness/wasting were significantly lower relative to the WHO reference as compared to the USCDC reference and the IOTF cut-offs. New growth charts for Pakistani children based on a nationally representative sample should be developed. Nevertheless, shifting to use of the 2007 WHO child growth reference might have important implications for child health programs and primary care pediatric clinics.

  11. Ranking scientific publications: the effect of nonlinearity

    Science.gov (United States)

    Yao, Liyang; Wei, Tian; Zeng, An; Fan, Ying; di, Zengru

    2014-10-01

    Ranking the significance of scientific publications is a long-standing challenge. The network-based analysis is a natural and common approach for evaluating the scientific credit of papers. Although the number of citations has been widely used as a metric to rank papers, recently some iterative processes such as the well-known PageRank algorithm have been applied to the citation networks to address this problem. In this paper, we introduce nonlinearity to the PageRank algorithm when aggregating resources from different nodes to further enhance the effect of important papers. The validation of our method is performed on the data of American Physical Society (APS) journals. The results indicate that the nonlinearity improves the performance of the PageRank algorithm in terms of ranking effectiveness, as well as robustness against malicious manipulations. Although the nonlinearity analysis is based on the PageRank algorithm, it can be easily extended to other iterative ranking algorithms and similar improvements are expected.

  12. Ranking scientific publications: the effect of nonlinearity.

    Science.gov (United States)

    Yao, Liyang; Wei, Tian; Zeng, An; Fan, Ying; Di, Zengru

    2014-10-17

    Ranking the significance of scientific publications is a long-standing challenge. The network-based analysis is a natural and common approach for evaluating the scientific credit of papers. Although the number of citations has been widely used as a metric to rank papers, recently some iterative processes such as the well-known PageRank algorithm have been applied to the citation networks to address this problem. In this paper, we introduce nonlinearity to the PageRank algorithm when aggregating resources from different nodes to further enhance the effect of important papers. The validation of our method is performed on the data of American Physical Society (APS) journals. The results indicate that the nonlinearity improves the performance of the PageRank algorithm in terms of ranking effectiveness, as well as robustness against malicious manipulations. Although the nonlinearity analysis is based on the PageRank algorithm, it can be easily extended to other iterative ranking algorithms and similar improvements are expected.

  13. Neural Ranking Models with Weak Supervision

    NARCIS (Netherlands)

    Dehghani, M.; Zamani, H.; Severyn, A.; Kamps, J.; Croft, W.B.

    2017-01-01

    Despite the impressive improvements achieved by unsupervised deep neural networks in computer vision and NLP tasks, such improvements have not yet been observed in ranking for information retrieval. The reason may be the complexity of the ranking problem, as it is not obvious how to learn from

  14. A Rational Method for Ranking Engineering Programs.

    Science.gov (United States)

    Glower, Donald D.

    1980-01-01

    Compares two methods for ranking academic programs: an opinion poll versus an examination of the career successes of the program's alumni. For the latter, "Who's Who in Engineering" and levels of research funding provided data. Tables display resulting data and compare rankings by the two methods for chemical engineering and civil engineering. (CS)

  15. Lerot: An Online Learning to Rank Framework

    NARCIS (Netherlands)

    Schuth, A.; Hofmann, K.; Whiteson, S.; de Rijke, M.

    2013-01-01

    Online learning to rank methods for IR allow retrieval systems to optimize their own performance directly from interactions with users via click feedback. In the software package Lerot, presented in this paper, we have bundled all ingredients needed for experimenting with online learning to rank for

  16. Adaptive distributional extensions to DFR ranking

    DEFF Research Database (Denmark)

    Petersen, Casper; Simonsen, Jakob Grue; Järvelin, Kalervo

    2016-01-01

    -fitting distribution. We call this model Adaptive Distributional Ranking (ADR) because it adapts the ranking to the statistics of the specific dataset being processed each time. Experiments on TREC data show ADR to outperform DFR models (and their extensions) and be comparable in performance to a query likelihood...

  17. Contests with rank-order spillovers

    NARCIS (Netherlands)

    M.R. Baye (Michael); D. Kovenock (Dan); C.G. de Vries (Casper)

    2012-01-01

    This paper presents a unified framework for characterizing symmetric equilibrium in simultaneous move, two-player, rank-order contests with complete information, in which each player's strategy generates direct or indirect affine "spillover" effects that depend on the rank-order of her

  18. Classification of rank 2 cluster varieties

    DEFF Research Database (Denmark)

    Mandel, Travis

    We classify rank 2 cluster varieties (those whose corresponding skew-form has rank 2) according to the deformation type of a generic fiber U of their X-spaces, as defined by Fock and Goncharov. Our approach is based on the work of Gross, Hacking, and Keel for cluster varieties and log Calabi...

  19. Using centrality to rank web snippets

    NARCIS (Netherlands)

    Jijkoun, V.; de Rijke, M.; Peters, C.; Jijkoun, V.; Mandl, T.; Müller, H.; Oard, D.W.; Peñas, A.; Petras, V.; Santos, D.

    2008-01-01

    We describe our participation in the WebCLEF 2007 task, targeted at snippet retrieval from web data. Our system ranks snippets based on a simple similarity-based centrality, inspired by the web page ranking algorithms. We experimented with retrieval units (sentences and paragraphs) and with the

  20. Mining Feedback in Ranking and Recommendation Systems

    Science.gov (United States)

    Zhuang, Ziming

    2009-01-01

    The amount of online information has grown exponentially over the past few decades, and users become more and more dependent on ranking and recommendation systems to address their information seeking needs. The advance in information technologies has enabled users to provide feedback on the utilities of the underlying ranking and recommendation…

  1. Entity Ranking using Wikipedia as a Pivot

    NARCIS (Netherlands)

    R. Kaptein; P. Serdyukov; A.P. de Vries (Arjen); J. Kamps

    2010-01-01

    In this paper we investigate the task of Entity Ranking on the Web. Searchers looking for entities are arguably better served by presenting a ranked list of entities directly, rather than a list of web pages with relevant but also potentially redundant information about

  2. Entity ranking using Wikipedia as a pivot

    NARCIS (Netherlands)

    Kaptein, R.; Serdyukov, P.; de Vries, A.; Kamps, J.; Huang, X.J.; Jones, G.; Koudas, N.; Wu, X.; Collins-Thompson, K.

    2010-01-01

    In this paper we investigate the task of Entity Ranking on the Web. Searchers looking for entities are arguably better served by presenting a ranked list of entities directly, rather than a list of web pages with relevant but also potentially redundant information about these entities. Since

  3. Rank 2 fusion rings are complete intersections

    DEFF Research Database (Denmark)

    Andersen, Troels Bak

    We give a non-constructive proof that fusion rings attached to a simple complex Lie algebra of rank 2 are complete intersections.

  4. A Ranking Method for Evaluating Constructed Responses

    Science.gov (United States)

    Attali, Yigal

    2014-01-01

    This article presents a comparative judgment approach for holistically scored constructed response tasks. In this approach, the grader rank orders (rather than rates) the quality of a small set of responses. A prior automated evaluation of responses guides both set formation and scaling of rankings. Sets are formed to have similar prior scores and…

  5. Ranking Music Data by Relevance and Importance

    DEFF Research Database (Denmark)

    Ruxanda, Maria Magdalena; Nanopoulos, Alexandros; Jensen, Christian Søndergaard

    2008-01-01

    Due to the rapidly increasing availability of audio files on the Web, it is relevant to augment search engines with advanced audio search functionality. In this context, the ranking of the retrieved music is an important issue. This paper proposes a music ranking method capable of flexibly fusing...

  6. Ranking of Unwarranted Variations in Healthcare Treatments

    NARCIS (Netherlands)

    Moes, Herry; Brekelmans, Ruud; Hamers, Herbert; Hasaart, F.

    2017-01-01

    In this paper, we introduce a framework designed to identify and rank possible unwarranted variation of treatments in healthcare. The innovative aspect of this framework is a ranking procedure that aims to identify healthcare institutions where unwarranted variation is most severe, and diagnosis

  7. The Rankings Game: Who's Playing Whom?

    Science.gov (United States)

    Burness, John F.

    2008-01-01

    This summer, Forbes magazine published its new rankings of "America's Best Colleges," implying that it had developed a methodology that would give the public the information that it needed to choose a college wisely. "U.S. News & World Report," which in 1983 published the first annual ranking, just announced its latest ratings last week--including…

  8. Dynamic collective entity representations for entity ranking

    NARCIS (Netherlands)

    Graus, D.; Tsagkias, M.; Weerkamp, W.; Meij, E.; de Rijke, M.

    2016-01-01

    Entity ranking, i.e., successfully positioning a relevant entity at the top of the ranking for a given query, is inherently difficult due to the potential mismatch between the entity's description in a knowledge base, and the way people refer to the entity when searching for it. To counter this

  9. Comparing classical and quantum PageRanks

    Science.gov (United States)

    Loke, T.; Tang, J. W.; Rodriguez, J.; Small, M.; Wang, J. B.

    2017-01-01

    Following recent developments in quantum PageRanking, we present a comparative analysis of discrete-time and continuous-time quantum-walk-based PageRank algorithms. Relative to classical PageRank and to different extents, the quantum measures better highlight secondary hubs and resolve ranking degeneracy among peripheral nodes for all networks we studied in this paper. For the discrete-time case, we investigated the periodic nature of the walker's probability distribution for a wide range of networks and found that the dominant period does not grow with the size of these networks. Based on this observation, we introduce a new quantum measure using the maximum probabilities of the associated walker during the first couple of periods. This is particularly important, since it leads to a quantum PageRanking scheme that is scalable with respect to network size.

  10. Universal emergence of PageRank

    Energy Technology Data Exchange (ETDEWEB)

    Frahm, K M; Georgeot, B; Shepelyansky, D L, E-mail: frahm@irsamc.ups-tlse.fr, E-mail: georgeot@irsamc.ups-tlse.fr, E-mail: dima@irsamc.ups-tlse.fr [Laboratoire de Physique Theorique du CNRS, IRSAMC, Universite de Toulouse, UPS, 31062 Toulouse (France)

    2011-11-18

    The PageRank algorithm enables us to rank the nodes of a network through a specific eigenvector of the Google matrix, using a damping parameter α ∈ ]0, 1[. Using extensive numerical simulations of large web networks, with a special accent on British University networks, we determine numerically and analytically the universal features of the PageRank vector at its emergence when α → 1. The whole network can be divided into a core part and a group of invariant subspaces. For α → 1, PageRank converges to a universal power-law distribution on the invariant subspaces whose size distribution also follows a universal power law. The convergence of PageRank at α → 1 is controlled by eigenvalues of the core part of the Google matrix, which are extremely close to unity, leading to large relaxation times as, for example, in spin glasses. (paper)

  11. Universal emergence of PageRank

    International Nuclear Information System (INIS)

    Frahm, K M; Georgeot, B; Shepelyansky, D L

    2011-01-01

    The PageRank algorithm enables us to rank the nodes of a network through a specific eigenvector of the Google matrix, using a damping parameter α ∈ ]0, 1[. Using extensive numerical simulations of large web networks, with a special accent on British University networks, we determine numerically and analytically the universal features of the PageRank vector at its emergence when α → 1. The whole network can be divided into a core part and a group of invariant subspaces. For α → 1, PageRank converges to a universal power-law distribution on the invariant subspaces whose size distribution also follows a universal power law. The convergence of PageRank at α → 1 is controlled by eigenvalues of the core part of the Google matrix, which are extremely close to unity, leading to large relaxation times as, for example, in spin glasses. (paper)

  12. PageRank and rank-reversal dependence on the damping factor

    Science.gov (United States)

    Son, S.-W.; Christensen, C.; Grassberger, P.; Paczuski, M.

    2012-12-01

    PageRank (PR) is an algorithm originally developed by Google to evaluate the importance of web pages. Considering how deeply rooted Google's PR algorithm is to gathering relevant information or to the success of modern businesses, the question of rank stability and choice of the damping factor (a parameter in the algorithm) is clearly important. We investigate PR as a function of the damping factor d on a network obtained from a domain of the World Wide Web, finding that rank reversal happens frequently over a broad range of PR (and of d). We use three different correlation measures, Pearson, Spearman, and Kendall, to study rank reversal as d changes, and we show that the correlation of PR vectors drops rapidly as d changes from its frequently cited value, d0=0.85. Rank reversal is also observed by measuring the Spearman and Kendall rank correlation, which evaluate relative ranks rather than absolute PR. Rank reversal happens not only in directed networks containing rank sinks but also in a single strongly connected component, which by definition does not contain any sinks. We relate rank reversals to rank pockets and bottlenecks in the directed network structure. For the network studied, the relative rank is more stable by our measures around d=0.65 than at d=d0.
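
    The kind of experiment described in this abstract, computing PageRank for different damping factors and comparing the resulting rankings with a rank correlation, can be sketched as follows. This is only an illustration under stated assumptions: the power-iteration solver, the uniform treatment of dangling nodes, the four-node graph, and the names `pagerank` and `spearman` are not taken from the paper, and the Spearman helper assumes there are no tied scores.

```python
import numpy as np

def pagerank(adj, d=0.85, tol=1e-12, max_iter=1000):
    """Power-iteration PageRank; adj[i, j] = 1 means a link from node i to node j."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1, keepdims=True)
    # Row-normalise the adjacency matrix; nodes without out-links are
    # assumed to jump uniformly to every node.
    trans = np.where(out_deg > 0, adj / np.maximum(out_deg, 1.0), 1.0 / n)
    pr = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new = (1.0 - d) / n + d * (pr @ trans)
        if np.abs(new - pr).sum() < tol:
            return new
        pr = new
    return pr

def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of ranks (no ties assumed)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

# Hypothetical four-node directed graph.
adj = [[0, 1, 1, 0],
       [0, 0, 1, 0],
       [1, 0, 0, 0],
       [0, 0, 1, 0]]

pr_ref = pagerank(adj, d=0.85)   # frequently cited damping factor
pr_alt = pagerank(adj, d=0.65)   # alternative value discussed in the abstract
print(pr_ref, pr_alt, spearman(pr_ref, pr_alt))
```

    A rank correlation below 1 between the two vectors would indicate that at least one pair of nodes swaps order as d changes, which is the rank-reversal effect the abstract refers to.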

  13. PageRank and rank-reversal dependence on the damping factor.

    Science.gov (United States)

    Son, S-W; Christensen, C; Grassberger, P; Paczuski, M

    2012-12-01

    PageRank (PR) is an algorithm originally developed by Google to evaluate the importance of web pages. Considering how deeply rooted Google's PR algorithm is to gathering relevant information or to the success of modern businesses, the question of rank stability and choice of the damping factor (a parameter in the algorithm) is clearly important. We investigate PR as a function of the damping factor d on a network obtained from a domain of the World Wide Web, finding that rank reversal happens frequently over a broad range of PR (and of d). We use three different correlation measures, Pearson, Spearman, and Kendall, to study rank reversal as d changes, and we show that the correlation of PR vectors drops rapidly as d changes from its frequently cited value, d_{0}=0.85. Rank reversal is also observed by measuring the Spearman and Kendall rank correlation, which evaluate relative ranks rather than absolute PR. Rank reversal happens not only in directed networks containing rank sinks but also in a single strongly connected component, which by definition does not contain any sinks. We relate rank reversals to rank pockets and bottlenecks in the directed network structure. For the network studied, the relative rank is more stable by our measures around d=0.65 than at d=d_{0}.

  14. A tilting approach to ranking influence

    KAUST Repository

    Genton, Marc G.

    2014-12-01

    We suggest a new approach, which is applicable for general statistics computed from random samples of univariate or vector-valued or functional data, to assessing the influence that individual data have on the value of a statistic, and to ranking the data in terms of that influence. Our method is based on, first, perturbing the value of the statistic by ‘tilting’, or reweighting, each data value, where the total amount of tilt is constrained to be the least possible, subject to achieving a given small perturbation of the statistic, and, then, taking the ranking of the influence of data values to be that which corresponds to ranking the changes in data weights. It is shown, both theoretically and numerically, that this ranking does not depend on the size of the perturbation, provided that the perturbation is sufficiently small. That simple result leads directly to an elegant geometric interpretation of the ranks; they are the ranks of the lengths of projections of the weights onto a ‘line’ determined by the first empirical principal component function in a generalized measure of covariance. To illustrate the generality of the method we introduce and explore it in the case of functional data, where (for example) it leads to generalized boxplots. The method has the advantage of providing an interpretable ranking that depends on the statistic under consideration. For example, the ranking of data, in terms of their influence on the value of a statistic, is different for a measure of location and for a measure of scale. This is as it should be; a ranking of data in terms of their influence should depend on the manner in which the data are used. Additionally, the ranking recognizes, rather than ignores, sign, and in particular can identify left- and right-hand ‘tails’ of the distribution of a random function or vector.

  15. A Ranking Approach to Genomic Selection.

    Science.gov (United States)

    Blondel, Mathieu; Onogi, Akio; Iwata, Hiroyoshi; Ueda, Naonori

    2015-01-01

    Genomic selection (GS) is a recent selective breeding method which uses predictive models based on whole-genome molecular markers. Until now, existing studies formulated GS as the problem of modeling an individual's breeding value for a particular trait of interest, i.e., as a regression problem. To assess predictive accuracy of the model, the Pearson correlation between observed and predicted trait values was used. In this paper, we propose to formulate GS as the problem of ranking individuals according to their breeding value. Our proposed framework allows us to employ machine learning methods for ranking which had previously not been considered in the GS literature. To assess ranking accuracy of a model, we introduce a new measure originating from the information retrieval literature called normalized discounted cumulative gain (NDCG). NDCG rewards more strongly models which assign a high rank to individuals with high breeding value. Therefore, NDCG reflects a prerequisite objective in selective breeding: accurate selection of individuals with high breeding value. We conducted a comparison of 10 existing regression methods and 3 new ranking methods on 6 datasets, consisting of 4 plant species and 25 traits. Our experimental results suggest that tree-based ensemble methods including McRank, Random Forests and Gradient Boosting Regression Trees achieve excellent ranking accuracy. RKHS regression and RankSVM also achieve good accuracy when used with an RBF kernel. Traditional regression methods such as Bayesian lasso, wBSR and BayesC were found less suitable for ranking. Pearson correlation was found to correlate poorly with NDCG. Our study suggests two important messages. First, ranking methods are a promising research direction in GS. Second, NDCG can be a useful evaluation measure for GS.
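
    The NDCG measure named in this abstract can be sketched as below. The linear gain (the true value itself), the log2 position discount, the cutoff parameter `k`, and the function names are assumptions made for illustration; the paper may use a different gain function, and this sketch also assumes the true values are non-negative.

```python
import numpy as np

def dcg(values_in_ranked_order, k=None):
    """Discounted cumulative gain of values listed from rank 1 downwards."""
    vals = np.asarray(values_in_ranked_order, dtype=float)[:k]
    discounts = np.log2(np.arange(2, vals.size + 2))  # log2(rank + 1)
    return float((vals / discounts).sum())

def ndcg(true_values, predicted_scores, k=None):
    """DCG of the predicted ordering divided by the DCG of the ideal ordering."""
    true_values = np.asarray(true_values, dtype=float)
    order = np.argsort(predicted_scores)[::-1]   # best predicted first
    ideal = np.sort(true_values)[::-1]           # best true value first
    ideal_dcg = dcg(ideal, k)
    return dcg(true_values[order], k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical breeding values and two sets of model predictions.
true_bv = [3.0, 2.0, 1.0, 0.5]
print(ndcg(true_bv, [0.9, 0.7, 0.4, 0.1]))  # ordering agrees with the truth
print(ndcg(true_bv, [0.1, 0.4, 0.7, 0.9]))  # ordering is reversed
```

    Because the discount shrinks with rank, mistakes near the top of the list cost more than mistakes near the bottom, which matches the stated goal of selecting the individuals with the highest breeding values.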

  16. Bioprocess iterative batch-to-batch optimization based on hybrid parametric/nonparametric models.

    Science.gov (United States)

    Teixeira, Ana P; Clemente, João J; Cunha, António E; Carrondo, Manuel J T; Oliveira, Rui

    2006-01-01

    This paper presents a novel method for iterative batch-to-batch dynamic optimization of bioprocesses. The relationship between process performance and control inputs is established by means of hybrid grey-box models combining parametric and nonparametric structures. The bioreactor dynamics are defined by material balance equations, whereas the cell population subsystem is represented by an adjustable mixture of nonparametric and parametric models. Thus optimizations are possible without detailed mechanistic knowledge concerning the biological system. A clustering technique is used to supervise the reliability of the nonparametric subsystem during the optimization. Whenever the nonparametric outputs are unreliable, the objective function is penalized. The technique was evaluated with three simulation case studies. The overall results suggest that the convergence to the optimal process performance may be achieved after a small number of batches. The model unreliability risk constraint along with sampling scheduling are crucial to minimize the experimental effort required to attain a given process performance. In general terms, it may be concluded that the proposed method broadens the application of the hybrid parametric/nonparametric modeling technique to "newer" processes with higher potential for optimization.

  17. First rank symptoms for schizophrenia.

    Science.gov (United States)

    Soares-Weiser, Karla; Maayan, Nicola; Bergman, Hanna; Davenport, Clare; Kirkham, Amanda J; Grabowski, Sarah; Adams, Clive E

    2015-01-25

    Early and accurate diagnosis and treatment of schizophrenia may have long-term advantages for the patient; the longer psychosis goes untreated the more severe the repercussions for relapse and recovery. If the correct diagnosis is not schizophrenia, but another psychotic disorder with some symptoms similar to schizophrenia, appropriate treatment might be delayed, with possible severe repercussions for the person involved and their family. There is widespread uncertainty about the diagnostic accuracy of First Rank Symptoms (FRS); we examined whether they are a useful diagnostic tool to differentiate schizophrenia from other psychotic disorders. To determine the diagnostic accuracy of one or multiple FRS for diagnosing schizophrenia, verified by clinical history and examination by a qualified professional (e.g. psychiatrists, nurses, social workers), with or without the use of operational criteria and checklists, in people thought to have non-organic psychotic symptoms. We conducted searches in MEDLINE, EMBASE, and PsycInfo using OvidSP in April, June, July 2011 and December 2012. We also searched MEDION in December 2013. We selected studies that consecutively enrolled or randomly selected adults and adolescents with symptoms of psychosis, and assessed the diagnostic accuracy of FRS for schizophrenia compared to history and clinical examination performed by a qualified professional, which may or may not involve the use of symptom checklists or be based on operational criteria such as ICD and DSM. Two review authors independently screened all references for inclusion. Risk of bias in included studies was assessed using the QUADAS-2 instrument. We recorded the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) for constructing a 2 x 2 table for each study or derived 2 x 2 data from reported summary statistics such as sensitivity, specificity, and/or likelihood ratios. We included 21 studies with a total of 6253 participants

  18. Generalized reduced rank latent factor regression for high dimensional tensor fields, and neuroimaging-genetic applications.

    Science.gov (United States)

    Tao, Chenyang; Nichols, Thomas E; Hua, Xue; Ching, Christopher R K; Rolls, Edmund T; Thompson, Paul M; Feng, Jianfeng

    2017-01-01

    We propose a generalized reduced rank latent factor regression model (GRRLF) for the analysis of tensor field responses and high dimensional covariates. The model is motivated by the need from imaging-genetic studies to identify genetic variants that are associated with brain imaging phenotypes, often in the form of high dimensional tensor fields. GRRLF identifies from the structure in the data the effective dimensionality of the data, and then jointly performs dimension reduction of the covariates, dynamic identification of latent factors, and nonparametric estimation of both covariate and latent response fields. After accounting for the latent and covariate effects, GRRLF performs a nonparametric test on the remaining factor of interest. GRRLF provides a better factorization of the signals compared with common solutions, and is less susceptible to overfitting because it exploits the effective dimensionality. The generality and the flexibility of GRRLF also allow various statistical models to be handled in a unified framework and solutions can be efficiently computed. Within the field of neuroimaging, it improves the sensitivity for weak signals and is a promising alternative to existing approaches. The operation of the framework is demonstrated with both synthetic datasets and a real-world neuroimaging example in which the effects of a set of genes on the structure of the brain at the voxel level were measured, and the results compared favorably with those from existing approaches. Copyright © 2016. Published by Elsevier Inc.

  19. Height-adjusted percentiles evaluated central obesity in children and adolescents more effectively than just waist circumference.

    Science.gov (United States)

    Hosseini, Mostafa; Kelishadi, Roya; Yousefifard, Mahmoud; Qorbani, Mostafa; Bazargani, Behnaz; Heshmat, Ramin; Motlagh, Mohammad Esmail; Mirminachi, Babak; Ataei, Neamatollah

    2017-01-01

    We compared the prevalence of obesity based on both waist circumference for height and body mass index (BMI) in Iranian children and adolescents. Data on 13 120 children with a mean age of 12.45 ± 3.36 years (50.8% male) from the fourth Childhood and Adolescence Surveillance and Prevention of Adult Non-communicable Disease study were included. Measured waist circumference values were modelled according to age, gender and height percentiles. The prevalence of obesity was estimated using the 90th percentiles for both unadjusted and height-adjusted waist circumferences and compared with the World Health Organization BMI cut-offs. They were analysed further for short, average and tall children. Waist circumference values increased steadily with age. For short and average height children, the prevalence of obesity was higher when height-adjusted waist circumference was used. For taller children, the prevalence of obesity using height-adjusted waist circumference and BMI was similar, but lower than the prevalence based on measurements unadjusted for height. Height-adjusted waist circumference and BMI identified different children as having obesity, with overlaps of 69.47% for boys and 68.42% for girls. Just using waist circumference underestimated obesity in some Iranian children and measurements should be adjusted for height. ©2016 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.

  20. Adiabatic quantum algorithm for search engine ranking.

    Science.gov (United States)

    Garnerone, Silvano; Zanardi, Paolo; Lidar, Daniel A

    2012-06-08

    We propose an adiabatic quantum algorithm for generating a quantum pure state encoding of the PageRank vector, the most widely used tool in ranking the relative importance of internet pages. We present extensive numerical simulations which provide evidence that this algorithm can prepare the quantum PageRank state in a time which, on average, scales polylogarithmically in the number of web pages. We argue that the main topological feature of the underlying web graph allowing for such a scaling is the out-degree distribution. The top-ranked log(n) entries of the quantum PageRank state can then be estimated with a polynomial quantum speed-up. Moreover, the quantum PageRank state can be used in "q-sampling" protocols for testing properties of distributions, which require exponentially fewer measurements than all classical schemes designed for the same task. This can be used to decide whether to run a classical update of the PageRank.

  1. Ranking Adverse Drug Reactions With Crowdsourcing

    KAUST Repository

    Gottlieb, Assaf

    2015-03-23

    Background: There is no publicly available resource that provides the relative severity of adverse drug reactions (ADRs). Such a resource would be useful for several applications, including assessment of the risks and benefits of drugs and improvement of patient-centered care. It could also be used to triage predictions of drug adverse events. Objective: The intent of the study was to rank ADRs according to severity. Methods: We used Internet-based crowdsourcing to rank ADRs according to severity. We assigned 126,512 pairwise comparisons of ADRs to 2589 Amazon Mechanical Turk workers and used these comparisons to rank order 2929 ADRs. Results: There is good correlation (rho=.53) between the mortality rates associated with ADRs and their rank. Our ranking highlights severe drug-ADR predictions, such as cardiovascular ADRs for raloxifene and celecoxib. It also triages genes associated with severe ADRs such as epidermal growth-factor receptor (EGFR), associated with glioblastoma multiforme, and SCN1A, associated with epilepsy. Conclusions: ADR ranking lays a first stepping stone in personalized drug risk assessment. Ranking of ADRs using crowdsourcing may have useful clinical and financial implications, and should be further investigated in the context of health care decision making.

  2. Ranking adverse drug reactions with crowdsourcing.

    Science.gov (United States)

    Gottlieb, Assaf; Hoehndorf, Robert; Dumontier, Michel; Altman, Russ B

    2015-03-23

    There is no publicly available resource that provides the relative severity of adverse drug reactions (ADRs). Such a resource would be useful for several applications, including assessment of the risks and benefits of drugs and improvement of patient-centered care. It could also be used to triage predictions of drug adverse events. The intent of the study was to rank ADRs according to severity. We used Internet-based crowdsourcing to rank ADRs according to severity. We assigned 126,512 pairwise comparisons of ADRs to 2589 Amazon Mechanical Turk workers and used these comparisons to rank order 2929 ADRs. There is good correlation (rho=.53) between the mortality rates associated with ADRs and their rank. Our ranking highlights severe drug-ADR predictions, such as cardiovascular ADRs for raloxifene and celecoxib. It also triages genes associated with severe ADRs such as epidermal growth-factor receptor (EGFR), associated with glioblastoma multiforme, and SCN1A, associated with epilepsy. ADR ranking lays a first stepping stone in personalized drug risk assessment. Ranking of ADRs using crowdsourcing may have useful clinical and financial implications, and should be further investigated in the context of health care decision making.

  3. Nonparametric Bayesian density estimation on manifolds with applications to planar shapes.

    Science.gov (United States)

    Bhattacharya, Abhishek; Dunson, David B

    2010-12-01

    Statistical analysis on landmark-based shape spaces has diverse applications in morphometrics, medical diagnostics, machine vision and other areas. These shape spaces are non-Euclidean quotient manifolds. To conduct nonparametric inferences, one may define notions of centre and spread on this manifold and work with their estimates. However, it is useful to consider full likelihood-based methods, which allow nonparametric estimation of the probability density. This article proposes a broad class of mixture models constructed using suitable kernels on a general compact metric space and then on the planar shape space in particular. Following a Bayesian approach with a nonparametric prior on the mixing distribution, conditions are obtained under which the Kullback-Leibler property holds, implying large support and weak posterior consistency. Gibbs sampling methods are developed for posterior computation, and the methods are applied to problems in density estimation and classification with shape-based predictors. Simulation studies show improved estimation performance relative to existing approaches.

  4. RankExplorer: Visualization of Ranking Changes in Large Time Series Data.

    Science.gov (United States)

    Shi, Conglei; Cui, Weiwei; Liu, Shixia; Xu, Panpan; Chen, Wei; Qu, Huamin

    2012-12-01

    For many applications involving time series data, people are often interested in the changes of item values over time as well as their ranking changes. For example, people search many words via search engines like Google and Bing every day. Analysts are interested in both the absolute searching number for each word as well as their relative rankings. Both sets of statistics may change over time. For very large time series data with thousands of items, how to visually present ranking changes is an interesting challenge. In this paper, we propose RankExplorer, a novel visualization method based on ThemeRiver to reveal the ranking changes. Our method consists of four major components: 1) a segmentation method which partitions a large set of time series curves into a manageable number of ranking categories; 2) an extended ThemeRiver view with embedded color bars and changing glyphs to show the evolution of aggregation values related to each ranking category over time as well as the content changes in each ranking category; 3) a trend curve to show the degree of ranking changes over time; 4) rich user interactions to support interactive exploration of ranking changes. We have applied our method to some real time series data and the case studies demonstrate that our method can reveal the underlying patterns related to ranking changes which might otherwise be obscured in traditional visualizations.

  5. Augmenting the Deliberative Method for Ranking Risks.

    Science.gov (United States)

    Susel, Irving; Lasley, Trace; Montezemolo, Mark; Piper, Joel

    2016-01-01

    The Department of Homeland Security (DHS) characterized and prioritized the physical cross-border threats and hazards to the nation stemming from terrorism, market-driven illicit flows of people and goods (illegal immigration, narcotics, funds, counterfeits, and weaponry), and other nonmarket concerns (movement of diseases, pests, and invasive species). These threats and hazards pose a wide diversity of consequences with very different combinations of magnitudes and likelihoods, making it very challenging to prioritize them. This article presents the approach that was used at DHS to arrive at a consensus regarding the threats and hazards that stand out from the rest based on the overall risk they pose. Due to time constraints for the decision analysis, it was not feasible to apply multiattribute methodologies like multiattribute utility theory or the analytic hierarchy process. Using a holistic approach was considered, such as the deliberative method for ranking risks first published in this journal. However, an ordinal ranking alone does not indicate relative or absolute magnitude differences among the risks. Therefore, the use of the deliberative method for ranking risks is not sufficient for deciding whether there is a material difference between the top-ranked and bottom-ranked risks, let alone deciding what the stand-out risks are. To address this limitation of ordinal rankings, the deliberative method for ranking risks was augmented by adding an additional step to transform the ordinal ranking into a ratio scale ranking. This additional step enabled the selection of stand-out risks to help prioritize further analysis. © 2015 Society for Risk Analysis.

  6. Communities in Large Networks: Identification and Ranking

    DEFF Research Database (Denmark)

    Olsen, Martin

    2008-01-01

    show that the problem of deciding whether a nontrivial community exists is NP-complete. Nevertheless, experiments show that a very simple greedy approach can identify members of a community in the Danish part of the web graph with time complexity only dependent on the size of the found community...... and its immediate surroundings. The members are ranked with a “local” variant of the PageRank algorithm. Results are reported from successful experiments on identifying and ranking Danish Computer Science sites and Danish Chess pages using only a few representatives....

  7. A Universal Rank-Size Law

    Science.gov (United States)

    2016-01-01

    A simple hyperbolic law, such as the Zipf’s law power function, is often inadequate to describe rank-size relationships. An alternative theoretical distribution is proposed based on theoretical physics arguments starting from the Yule-Simon distribution. A model is proposed that leads to a universal form. A theoretical suggestion for the “best (or optimal) distribution” is provided through an entropy argument. The ranking of areas by the number of cities in various countries, and some sports competition rankings, serve as illustrations. PMID: 27812192

  8. Body Mass Index Percentile Curves for 7 To 18 Year Old Children and Adolescents; are the Sample Populations from Tehran Nationally Representative?

    Directory of Open Access Journals (Sweden)

    Mostafa Hosseini

    2016-06-01

    Background: Children’s body composition status is an important indicator of health, evaluated through their body mass index (BMI). We aimed to provide standardized BMI percentile curves for a population of Iranian children and adolescents, and assessed whether sample populations from Tehran are nationally representative. Materials and Methods: A total sample of 14,865 children aged 7-18 years was gathered. The Lambda-Mu-Sigma method was used to derive sex-specific smoothed centiles for age via the Lambda-Mu-Sigma Chart Maker Program. Finally, the prevalence of overweight and obesity with 95% confidence interval (CI) was calculated. Results: BMI percentiles obtained from Tehran’s population, except for the 10th percentile, seem to be very slightly greater than those of urban boys from all over Iran. BMI percentiles have an increasing trend by age that is S-shaped with a slight slope. Only in the 90th and 97th percentiles of BMI for girls does this rising trend seem to stop. Boys generally have higher BMIs than girls. The exceptions are the younger ages in the 90th and 97th percentiles and the older ages in the 3rd and 10th percentiles. A total of 1,008 (13.20%; 95% CI: 12.46-13.98) boys and 603 (8.34%; 95% CI: 7.72-9.00) girls were categorized as overweight and obese. Obesity was observed in 402 (5.27%; 95% CI: 4.79-5.79) boys and 274 (3.76%; 95% CI: 3.35-4.22) girls. Conclusion: We constructed BMI percentile curves by age and gender for Iranian children and adolescents aged 7 to 18 years. It can be concluded that sample populations from Tehran are nationally representative.

  9. Nonparametric model validations for hidden Markov models with applications in financial econometrics.

    Science.gov (United States)

    Zhao, Zhibiao

    2011-06-01

    We address the nonparametric model validation problem for hidden Markov models with partially observable variables and hidden states. We achieve this goal by constructing a nonparametric simultaneous confidence envelope for transition density function of the observable variables and checking whether the parametric density estimate is contained within such an envelope. Our specification test procedure is motivated by a functional connection between the transition density of the observable variables and the Markov transition kernel of the hidden states. Our approach is applicable for continuous time diffusion models, stochastic volatility models, nonlinear time series models, and models with market microstructure noise.

  10. Multivariate nonparametric regression and visualization with R and applications to finance

    CERN Document Server

    Klemelä, Jussi

    2014-01-01

    A modern approach to statistical learning and its applications through visualization methods. With a unique and innovative presentation, Multivariate Nonparametric Regression and Visualization provides readers with the core statistical concepts to obtain complete and accurate predictions when given a set of data. Focusing on nonparametric methods to adapt to the multiple types of data generating mechanisms, the book begins with an overview of classification and regression. The book then introduces and examines various tested and proven visualization techniques for learning samples and functio

  11. NONPARAMETRIC FIXED EFFECT PANEL DATA MODELS: RELATIONSHIP BETWEEN AIR POLLUTION AND INCOME FOR TURKEY

    Directory of Open Access Journals (Sweden)

    Rabia Ece OMAY

    2013-06-01

    In this study, the relationship between gross domestic product (GDP) per capita and sulfur dioxide (SO2) and particulate matter (PM10) per capita is modeled for Turkey. Nonparametric fixed effect panel data analysis is used for the modeling. The panel data cover 12 territories, at the first level of the Nomenclature of Territorial Units for Statistics (NUTS), for the period 1990-2001. In modeling the relationship between GDP and SO2 and PM10 for Turkey, the nonparametric models gave good results.

  12. Nonparametric method for failures diagnosis in the actuating subsystem of aircraft control system

    Science.gov (United States)

    Terentev, M. N.; Karpenko, S. S.; Zybin, E. Yu; Kosyanchuk, V. V.

    2018-02-01

    In this paper we design a nonparametric method for failures diagnosis in the aircraft control system that uses the measurements of the control signals and the aircraft states only. It doesn’t require a priori information of the aircraft model parameters, training or statistical calculations, and is based on analytical nonparametric one-step-ahead state prediction approach. This makes it possible to predict the behavior of unidentified and failure dynamic systems, to weaken the requirements to control signals, and to reduce the diagnostic time and problem complexity.

  13. A Bayesian approach to the analysis of quantal bioassay studies using nonparametric mixture models.

    Science.gov (United States)

    Fronczyk, Kassandra; Kottas, Athanasios

    2014-03-01

    We develop a Bayesian nonparametric mixture modeling framework for quantal bioassay settings. The approach is built upon modeling dose-dependent response distributions. We adopt a structured nonparametric prior mixture model, which induces a monotonicity restriction for the dose-response curve. Particular emphasis is placed on the key risk assessment goal of calibration for the dose level that corresponds to a specified response. The proposed methodology yields flexible inference for the dose-response relationship as well as for other inferential objectives, as illustrated with two data sets from the literature. © 2013, The International Biometric Society.

  14. Modern nonparametric, robust and multivariate methods festschrift in honour of Hannu Oja

    CERN Document Server

    Taskinen, Sara

    2015-01-01

    Written by leading experts in the field, this edited volume brings together the latest findings in the area of nonparametric, robust and multivariate statistical methods. The individual contributions cover a wide variety of topics ranging from univariate nonparametric methods to robust methods for complex data structures. Some examples from statistical signal processing are also given. The volume is dedicated to Hannu Oja on the occasion of his 65th birthday and is intended for researchers as well as PhD students with a good knowledge of statistics.

  15. Non-parametric data-based approach for the quantification and communication of uncertainties in river flood forecasts

    Science.gov (United States)

    Van Steenbergen, N.; Willems, P.

    2012-04-01

    Reliable flood forecasts are the most important non-structural measures to reduce the impact of floods. However flood forecasting systems are subject to uncertainty originating from the input data, model structure and model parameters of the different hydraulic and hydrological submodels. To quantify this uncertainty a non-parametric data-based approach has been developed. This approach analyses the historical forecast residuals (differences between the predictions and the observations at river gauging stations) without using a predefined statistical error distribution. Because the residuals are correlated with the value of the forecasted water level and the lead time, the residuals are split up into discrete classes of simulated water levels and lead times. For each class, percentile values are calculated of the model residuals and stored in a 'three dimensional error' matrix. By 3D interpolation in this error matrix, the uncertainty in new forecasted water levels can be quantified. In addition to the quantification of the uncertainty, the communication of this uncertainty is equally important. The communication has to be done in a consistent way, reducing the chance of misinterpretation. Also, the communication needs to be adapted to the audience; the majority of the larger public is not interested in in-depth information on the uncertainty on the predicted water levels, but only is interested in information on the likelihood of exceedance of certain alarm levels. Water managers need more information, e.g. time dependent uncertainty information, because they rely on this information to undertake the appropriate flood mitigation action. There are various ways in presenting uncertainty information (numerical, linguistic, graphical, time (in)dependent, etc.) each with their advantages and disadvantages for a specific audience. A useful method to communicate uncertainty of flood forecasts is by probabilistic flood mapping. These maps give a representation of the
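    The "three dimensional error" matrix described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `build_error_matrix`, the class edges and the chosen percentiles are hypothetical, and the 3D interpolation step is left out (the matrix is only built and indexed by class).

```python
import numpy as np


def build_error_matrix(levels, lead_times, residuals,
                       level_edges, lead_edges, percentiles=(5, 50, 95)):
    """Percentiles of forecast residuals per (water-level class, lead-time class).

    Returns an array of shape (n_level_classes, n_lead_classes, n_percentiles);
    classes without any historical residuals are filled with NaN.
    """
    levels = np.asarray(levels, dtype=float)
    lead_times = np.asarray(lead_times, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    level_edges = np.asarray(level_edges, dtype=float)
    lead_edges = np.asarray(lead_edges, dtype=float)

    # Digitizing against the interior edges maps every value to a class index
    # in 0..n_classes-1; values outside the outer edges fall in the end classes.
    level_idx = np.digitize(levels, level_edges[1:-1])
    lead_idx = np.digitize(lead_times, lead_edges[1:-1])

    n_levels = len(level_edges) - 1
    n_leads = len(lead_edges) - 1
    matrix = np.full((n_levels, n_leads, len(percentiles)), np.nan)
    for i in range(n_levels):
        for j in range(n_leads):
            mask = (level_idx == i) & (lead_idx == j)
            if mask.any():
                # Empirical percentiles: no error distribution is assumed
                matrix[i, j] = np.percentile(residuals[mask], percentiles)
    return matrix
```

    For a new forecast, the residual percentiles of its class (or values interpolated between neighbouring classes) can be added to the forecasted water level to obtain an uncertainty band.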

  16. Scalable Faceted Ranking in Tagging Systems

    Science.gov (United States)

    Orlicki, José I.; Alvarez-Hamelin, J. Ignacio; Fierens, Pablo I.

    Nowadays, web collaborative tagging systems which allow users to upload, comment on and recommend contents, are growing. Such systems can be represented as graphs where nodes correspond to users and tagged-links to recommendations. In this paper we analyze the problem of computing a ranking of users with respect to a facet described as a set of tags. A straightforward solution is to compute a PageRank-like algorithm on a facet-related graph, but it is not feasible for online computation. We propose an alternative: (i) a ranking for each tag is computed offline on the basis of tag-related subgraphs; (ii) a faceted order is generated online by merging rankings corresponding to all the tags in the facet. Based on the graph analysis of YouTube and Flickr, we show that step (i) is scalable. We also present efficient algorithms for step (ii), which are evaluated by comparing their results with two gold standards.

  17. Evaluation of treatment effects by ranking

    DEFF Research Database (Denmark)

    Halekoh, U; Kristensen, K

    2008-01-01

    In crop experiments measurements are often made by a judge evaluating the crops' conditions after treatment. In the present paper an analysis is proposed for experiments where plots of crops treated differently are mutually ranked. In the experimental layout the crops are treated on consecutive...... plots usually placed side by side in one or more rows. In the proposed method a judge ranks several neighbouring plots, say three, by ranking them from best to worst. For the next observation the judge moves on by no more than two plots, such that up to two plots will be re-evaluated again...... in a comparison with the new plot(s). Data from studies using this set-up were analysed by a Thurstonian random utility model, which assumed that the judge's rankings were obtained by comparing latent continuous utilities or treatment effects. For the latent utilities a variance component model was considered...

  18. Superfund Hazard Ranking System Training Course

    Science.gov (United States)

    The Hazard Ranking System (HRS) training course is a four-and-a-half-day, intermediate-level course designed for personnel who are required to compile, draft, and review preliminary assessments (PAs), site inspections (SIs), and HRS documentation records/packag

  19. Who's bigger? where historical figures really rank

    CERN Document Server

    Skiena, Steven

    2014-01-01

    Is Hitler bigger than Napoleon? Washington bigger than Lincoln? Picasso bigger than Einstein? Quantitative analysts are rapidly finding homes in social and cultural domains, from finance to politics. What about history? In this fascinating book, Steve Skiena and Charles Ward bring quantitative analysis to bear on ranking and comparing historical reputations. They evaluate each person by aggregating the traces of millions of opinions, just as Google ranks webpages. The book includes a technical discussion for readers interested in the details of the methods, but no mathematical or computational background is necessary to understand the rankings or conclusions. Along the way, the authors present the rankings of more than one thousand of history's most significant people in science, politics, entertainment, and all areas of human endeavor. Anyone interested in history or biography can see where their favorite figures place in the grand scheme of things.

  20. Ranking Forestry Investments With Parametric Linear Programming

    Science.gov (United States)

    Paul A. Murphy

    1976-01-01

    Parametric linear programming is introduced as a technique for ranking forestry investments under multiple constraints; it combines the advantages of simple ranking and linear programming as capital budgeting tools.

  1. Block models and personalized PageRank.

    Science.gov (United States)

    Kloumann, Isabel M; Ugander, Johan; Kleinberg, Jon

    2017-01-03

    Methods for ranking the importance of nodes in a network have a rich history in machine learning and across domains that analyze structured data. Recent work has evaluated these methods through the "seed set expansion problem": given a subset S of nodes from a community of interest in an underlying graph, can we reliably identify the rest of the community? We start from the observation that the most widely used techniques for this problem, personalized PageRank and heat kernel methods, operate in the space of "landing probabilities" of a random walk rooted at the seed set, ranking nodes according to weighted sums of landing probabilities of different length walks. Both schemes, however, lack an a priori relationship to the seed set objective. In this work, we develop a principled framework for evaluating ranking methods by studying seed set expansion applied to the stochastic block model. We derive the optimal gradient for separating the landing probabilities of two classes in a stochastic block model and find, surprisingly, that under reasonable assumptions the gradient is asymptotically equivalent to personalized PageRank for a specific choice of the PageRank parameter α that depends on the block model parameters. This connection provides a formal motivation for the success of personalized PageRank in seed set expansion and node ranking generally. We use this connection to propose more advanced techniques incorporating higher moments of landing probabilities; our advanced methods exhibit greatly improved performance, despite being simple linear classification rules, and are even competitive with belief propagation.
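    The view of personalized PageRank as a weighted sum of landing probabilities can be sketched in a few lines. The sketch below rests on assumptions not taken from the abstract: the graph is undirected with no isolated nodes, the walk is truncated after `num_steps` steps, and `alpha` denotes the probability of continuing the walk (some texts use the same symbol for the restart probability). The function names are hypothetical.

```python
import numpy as np


def landing_probabilities(adj, seeds, num_steps):
    """Row k holds the distribution of a random walk after k steps,
    started uniformly at random on the seed set."""
    adj = np.asarray(adj, dtype=float)
    # Row-stochastic transition matrix; assumes every node has degree > 0
    transition = adj / adj.sum(axis=1, keepdims=True)
    current = np.zeros(adj.shape[0])
    current[list(seeds)] = 1.0 / len(seeds)
    rows = [current]
    for _ in range(num_steps):
        current = current @ transition
        rows.append(current)
    return np.array(rows)  # shape: (num_steps + 1, n_nodes)


def personalized_pagerank_scores(adj, seeds, alpha=0.85, num_steps=50):
    """Truncated personalized PageRank: geometric weights on landing probabilities."""
    probs = landing_probabilities(adj, seeds, num_steps)
    weights = (1.0 - alpha) * alpha ** np.arange(num_steps + 1)
    return weights @ probs
```

    Ranking nodes by these scores is one instance of the "weighted sums of landing probabilities" mentioned in the abstract; replacing the geometric weights with another weight vector gives a different member of the same family, such as a heat kernel ranking.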

  2. Block models and personalized PageRank

    OpenAIRE

    Kloumann, Isabel M.; Ugander, Johan; Kleinberg, Jon

    2016-01-01

    Methods for ranking the importance of nodes in a network have a rich history in machine learning and across domains that analyze structured data. Recent work has evaluated these methods through the seed set expansion problem: given a subset $S$ of nodes from a community of interest in an underlying graph, can we reliably identify the rest of the community? We start from the observation that the most widely used techniques for this problem, personalized PageRank and heat kernel methods, operate...

  3. Evaluation of the osteoclastogenic process associated with RANK / RANK-L / OPG in odontogenic myxomas

    Science.gov (United States)

    González-Galván, María del Carmen; Mosqueda-Taylor, Adalberto; Bologna-Molina, Ronell; Setien-Olarra, Amaia; Marichalar-Mendia, Xabier; Aguirre-Urizar, José-Manuel

    2018-01-01

    Background: Odontogenic myxoma (OM) is a benign intraosseous neoplasm that exhibits local aggressiveness and high recurrence rates. Osteoclastogenesis is an important phenomenon in the tumor growth of maxillary neoplasms. RANK (receptor activator of nuclear factor kappa B) is the signaling receptor of RANK-L (receptor activator of nuclear factor kappa-B ligand) that activates the osteoclasts. OPG (osteoprotegerin) is a decoy receptor for RANK-L that inhibits pro-osteoclastogenesis. The RANK / RANK-L / OPG system participates in the regulation of osteolytic activity under normal conditions, and its alteration has been associated with greater bone destruction, and also with tumor growth. Objectives: To analyze the immunohistochemical expression of OPG, RANK and RANK-L proteins in odontogenic myxomas (OMs) and their relationship with the tumor size. Material and Methods: Eighteen OMs, 4 small (<3 cm) and 14 large (>3 cm), and 18 dental follicles (DF) that were included as controls were studied by means of a standard immunohistochemical procedure with RANK, RANK-L and OPG antibodies. For the evaluation, 5 fields (40x) of representative areas of OM and DF were selected where the expression of each antibody was determined. Descriptive and comparative statistical analyses were performed with the obtained data. Results: There are significant differences in the expression of RANK in OM samples as compared to DF (p = 0.022) and among the OMSs and OMLs (p = 0.032). Also a strong association is recognized in the expression of RANK-L and OPG in OM samples. Conclusions: Activation of the RANK / RANK-L / OPG triad seems to be involved in the mechanisms of bone balance and destruction, as well as associated with tumor growth in odontogenic myxomas. Key words: Odontogenic myxoma, dental follicle, RANK, RANK-L, OPG, osteoclastogenesis. PMID:29680857

  4. How Many Alternatives Can Be Ranked? A Comparison of the Paired Comparison and Ranking Methods.

    Science.gov (United States)

    Ock, Minsu; Yi, Nari; Ahn, Jeonghoon; Jo, Min-Woo

    2016-01-01

    To determine the feasibility of converting ranking data into paired comparison (PC) data and suggest the number of alternatives that can be ranked by comparing a PC and a ranking method. Using a total of 222 health states, a household survey was conducted in a sample of 300 individuals from the general population. Each respondent performed a PC 15 times and a ranking method 6 times (two attempts of ranking three, four, and five health states, respectively). The health states of the PC and the ranking method were constructed to overlap each other. We converted the ranked data into PC data and examined the consistency of the response rate. Applying probit regression, we obtained the predicted probability of each method. Pearson correlation coefficients were determined between the predicted probabilities of those methods. The mean absolute error was also assessed between the observed and the predicted values. The overall consistency of the response rate was 82.8%. The Pearson correlation coefficients were 0.789, 0.852, and 0.893 for ranking three, four, and five health states, respectively. The lowest mean absolute error was 0.082 (95% confidence interval [CI] 0.074-0.090) in ranking five health states, followed by 0.123 (95% CI 0.111-0.135) in ranking four health states and 0.126 (95% CI 0.113-0.138) in ranking three health states. After empirically examining the consistency of the response rate between a PC and a ranking method, we suggest that using five alternatives in the ranking method may be superior to using three or four alternatives. Copyright © 2016 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
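    The conversion of ranking data into paired comparison (PC) data mentioned in this abstract can be sketched as follows. This is only an illustration under assumptions: the function names, the tuple encoding `(preferred, other)` and the agreement measure are hypothetical and are not taken from the study.

```python
from itertools import combinations


def ranking_to_pairs(ranking):
    """Expand a ranking (most preferred first) into every implied
    paired comparison, encoded as (preferred, other)."""
    # combinations() keeps input order, so the first element of each pair
    # always appears earlier in the ranking, i.e. it is the preferred one.
    return list(combinations(ranking, 2))


def consistency_rate(implied_pairs, observed_pairs):
    """Share of directly observed PC responses that agree with the pairs
    implied by a ranking; pairs not covered by the ranking are ignored."""
    implied = set(implied_pairs)
    comparable = [p for p in observed_pairs
                  if p in implied or (p[1], p[0]) in implied]
    if not comparable:
        return None
    return sum(p in implied for p in comparable) / len(comparable)


# Hypothetical health states ranked by one respondent.
pairs = ranking_to_pairs(["A", "B", "C"])
rate = consistency_rate(pairs, [("A", "B"), ("C", "B")])
```

    A ranking of k alternatives implies k(k-1)/2 paired comparisons, so rankings of three, four and five health states yield 3, 6 and 10 pairs respectively.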

  5. Rank distributions: A panoramic macroscopic outlook

    Science.gov (United States)

    Eliazar, Iddo I.; Cohen, Morrel H.

    2014-01-01

    This paper presents a panoramic macroscopic outlook of rank distributions. We establish a general framework for the analysis of rank distributions, which classifies them into five macroscopic "socioeconomic" states: monarchy, oligarchy-feudalism, criticality, socialism-capitalism, and communism. Oligarchy-feudalism is shown to be characterized by discrete macroscopic rank distributions, and socialism-capitalism is shown to be characterized by continuous macroscopic size distributions. Criticality is a transition state between oligarchy-feudalism and socialism-capitalism, which can manifest allometric scaling with multifractal spectra. Monarchy and communism are extreme forms of oligarchy-feudalism and socialism-capitalism, respectively, in which the intrinsic randomness vanishes. The general framework is applied to three different models of rank distributions—top-down, bottom-up, and global—and unveils each model's macroscopic universality and versatility. The global model yields a macroscopic classification of the generalized Zipf law, an omnipresent form of rank distributions observed across the sciences. An amalgamation of the three models establishes a universal rank-distribution explanation for the macroscopic emergence of a prevalent class of continuous size distributions, ones governed by unimodal densities with both Pareto and inverse-Pareto power-law tails.

  6. Fair ranking of researchers and research teams.

    Science.gov (United States)

    Vavryčuk, Václav

    2018-01-01

    The main drawback of ranking researchers by the number of papers, citations or by the Hirsch index is that it ignores the problem of distributing authorship among authors in multi-author publications. So far, single-author and multi-author publications contribute to the publication record of a researcher equally. This full counting scheme is apparently unfair and causes unjust disproportions, in particular, if ranked researchers have distinctly different collaboration profiles. These disproportions are removed by less common fractional or authorship-weighted counting schemes, which can distribute the authorship credit more properly and suppress a tendency to unjustified inflation of co-authors. The urgent need of widely adopting a fair ranking scheme in practise is exemplified by analysing citation profiles of several highly-cited astronomers and astrophysicists. While the full counting scheme often leads to completely incorrect and misleading ranking, the fractional or authorship-weighted schemes are more accurate and applicable to ranking of researchers as well as research teams. In addition, they suppress differences in ranking among scientific disciplines. These more appropriate schemes should urgently be adopted by scientific publication databases such as the Web of Science (Thomson Reuters) or Scopus (Elsevier).
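    The difference between full and fractional counting can be seen in a small sketch. It assumes the simplest fractional variant, in which each paper's citations are split equally among its authors; the abstract also mentions authorship-weighted schemes, which are not reproduced here. The data and function names are made up for illustration.

```python
def full_count(papers, author):
    """Full counting: every co-author receives all citations of a paper."""
    return sum(p["citations"] for p in papers if author in p["authors"])


def fractional_count(papers, author):
    """Equal-share fractional counting: citations are divided by the
    number of authors (one simple variant, assumed for this sketch)."""
    return sum(p["citations"] / len(p["authors"])
               for p in papers if author in p["authors"])


# Hypothetical publication records.
papers = [
    {"authors": ["A"], "citations": 30},
    {"authors": ["A", "B", "C"], "citations": 90},
    {"authors": ["B", "C"], "citations": 40},
]

full = {a: full_count(papers, a) for a in "ABC"}
fractional = {a: fractional_count(papers, a) for a in "ABC"}
```

    In this toy data set full counting places B (130) ahead of A (120), while equal-share fractional counting places A (60) ahead of B (50), which is the kind of reordering between researchers with different collaboration profiles that the abstract describes.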

  7. Frontal impact response of a virtual low percentile six years old human thorax developed by automatic down-scaling

    Directory of Open Access Journals (Sweden)

    Špička J.

    2015-06-01

    Full Text Available Traffic accidents cause one of the highest numbers of severe injuries in the whole population spectrum. The numbers of deaths and seriously injured citizens prove that traffic accidents and their consequences are still a serious problem to be solved. The paper contributes to the field of vehicle safety technology with a virtual approach. Exploitation of the previously developed scaling algorithm enables the creation of a specific anthropometric model based on a validated reference model. The aim of the paper is to prove the biofidelity of the small percentile six years old virtual human model developed by automatic down-scaling in a frontal impact. For the automatically developed six years old virtual specific anthropometric model, the Kroell impact test is simulated and the results are compared to the experimental data. The chosen approach shows good correspondence of the scaled model performance to the experimental corridors.

  8. Flexible parametric survival models built on age-specific antimüllerian hormone percentiles are better predictors of menopause.

    Science.gov (United States)

    Ramezani Tehrani, Fahimeh; Mansournia, Mohammad Ali; Solaymani-Dodaran, Masoud; Steyerberg, Ewout; Azizi, Fereidoun

    2016-06-01

    This study aimed to improve existing prediction models for age at menopause. We identified all reproductive aged women with regular menstrual cycles who met our eligibility criteria (n = 1,015) in the Tehran Lipid and Glucose Study, an ongoing population-based cohort study initiated in 1998. Participants were examined every 3 years and their reproductive histories were recorded. Blood levels of antimüllerian hormone (AMH) were measured at the time of recruitment. Age at menopause was estimated based on serum concentrations of AMH using flexible parametric survival models. The optimum model was selected according to Akaike Information Criteria and the realness of the range of predicted median menopause age. We followed study participants for a median of 9.8 years during which 277 women reached menopause and found that a spline-based proportional odds model including age-specific AMH percentiles as the covariate performed well in terms of statistical criteria and provided the most clinically relevant and realistic predictions. The range of predicted median age at menopause for this model was 47.1 to 55.9 years. For those who reached menopause, the median of the absolute mean difference between actual and predicted age at menopause was 1.9 years (interquartile range 2.9). The model including the age-specific AMH percentiles as the covariate and using proportional odds as its covariate metrics meets all the statistical criteria for the best model and provides the most clinically relevant and realistic predictions for age at menopause for reproductive-aged women.

  9. Supremum Norm Posterior Contraction and Credible Sets for Nonparametric Multivariate Regression

    NARCIS (Netherlands)

    Yoo, W.W.; Ghosal, S

    2016-01-01

    In the setting of nonparametric multivariate regression with unknown error variance, we study asymptotic properties of a Bayesian method for estimating a regression function f and its mixed partial derivatives. We use a random series of tensor product of B-splines with normal basis coefficients as a

  10. Does Private Tutoring Work? The Effectiveness of Private Tutoring: A Nonparametric Bounds Analysis

    Science.gov (United States)

    Hof, Stefanie

    2014-01-01

    Private tutoring has become popular throughout the world. However, evidence for the effect of private tutoring on students' academic outcome is inconclusive; therefore, this paper presents an alternative framework: a nonparametric bounds method. The present examination uses, for the first time, a large representative data-set in a European setting…

  11. Testing a parametric function against a nonparametric alternative in IV and GMM settings

    DEFF Research Database (Denmark)

    Gørgens, Tue; Wurtz, Allan

    This paper develops a specification test for functional form for models identified by moment restrictions, including IV and GMM settings. The general framework is one where the moment restrictions are specified as functions of data, a finite-dimensional parameter vector, and a nonparametric real ...

  12. A structural nonparametric reappraisal of the CO2 emissions-income relationship

    NARCIS (Netherlands)

    Azomahou, T.T.; Goedhuys - Degelin, Micheline; Nguyen-Van, P.

    Relying on a structural nonparametric estimation, we show that CO2 emissions clearly increase with income at low income levels. For higher income levels, we observe a decreasing relationship, though not significant. We also find that CO2 emissions monotonically increase with energy use at a

  13. Testing for constant nonparametric effects in general semiparametric regression models with interactions

    KAUST Repository

    Wei, Jiawei

    2011-07-01

    We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work was originally motivated by a unique testing problem in genetic epidemiology (Chatterjee, et al., 2006) that involved a typical generalized linear model but with an additional term reminiscent of the Tukey one-degree-of-freedom formulation, and their interest was in testing for main effects of the genetic variables, while gaining statistical power by allowing for a possible interaction between genes and the environment. Later work (Maity, et al., 2009) involved the possibility of modeling the environmental variable nonparametrically, but they focused on whether there was a parametric main effect for the genetic variables. In this paper, we consider the complementary problem, where the interest is in testing for the main effect of the nonparametrically modeled environmental variable. We derive a generalized likelihood ratio test for this hypothesis, show how to implement it, and provide evidence that our method can improve statistical power when compared to standard partially linear models with main effects only. We use the method for the primary purpose of analyzing data from a case-control study of colorectal adenoma.

  14. Assessing pupil and school performance by non-parametric and parametric techniques

    NARCIS (Netherlands)

    de Witte, K.; Thanassoulis, E.; Simpson, G.; Battisti, G.; Charlesworth-May, A.

    2010-01-01

    This paper discusses the use of the non-parametric free disposal hull (FDH) and the parametric multi-level model (MLM) as alternative methods for measuring pupil and school attainment where hierarchical structured data are available. Using robust FDH estimates, we show how to decompose the overall

  15. Nonparametric Estimation of Interval Reliability for Discrete-Time Semi-Markov Systems

    DEFF Research Database (Denmark)

    Georgiadis, Stylianos; Limnios, Nikolaos

    2016-01-01

    In this article, we consider a repairable discrete-time semi-Markov system with finite state space. The measure of the interval reliability is given as the probability of the system being operational over a given finite-length time interval. A nonparametric estimator is proposed for the interval...

  16. Low default credit scoring using two-class non-parametric kernel density estimation

    CSIR Research Space (South Africa)

    Rademeyer, E

    2016-12-01

    Full Text Available This paper investigates the performance of two-class classification credit scoring data sets with low default ratios. The standard two-class parametric Gaussian and non-parametric Parzen classifiers are extended, using Bayes’ rule, to include either...

  17. Non-Parametric Bayesian Updating within the Assessment of Reliability for Offshore Wind Turbine Support Structures

    DEFF Research Database (Denmark)

    Ramirez, José Rangel; Sørensen, John Dalsgaard

    2011-01-01

    This work illustrates the updating and incorporation of information in the assessment of fatigue reliability for offshore wind turbines. The new information, coming from external and condition monitoring can be used for direct updating of the stochastic variables through a non-parametric Bayesian u...

  18. Non-parametric production analysis of pesticides use in the Netherlands

    NARCIS (Netherlands)

    Oude Lansink, A.G.J.M.; Silva, E.

    2004-01-01

    Many previous empirical studies on the productivity of pesticides suggest that pesticides are under-utilized in agriculture despite the generally held belief that these inputs are substantially over-utilized. This paper uses data envelopment analysis (DEA) to calculate non-parametric measures of the

  19. Analyzing cost efficient production behavior under economies of scope : A nonparametric methodology

    NARCIS (Netherlands)

    Cherchye, L.J.H.; de Rock, B.; Vermeulen, F.M.P.

    2008-01-01

    In designing a production model for firms that generate multiple outputs, we take as a starting point that such multioutput production refers to economies of scope, which in turn originate from joint input use and input externalities. We provide a nonparametric characterization of cost-efficient

  20. Analyzing Cost Efficient Production Behavior Under Economies of Scope : A Nonparametric Methodology

    NARCIS (Netherlands)

    Cherchye, L.J.H.; de Rock, B.; Vermeulen, F.M.P.

    2006-01-01

    In designing a production model for firms that generate multiple outputs, we take as a starting point that such multi-output production refers to economies of scope, which in turn originate from joint input use and input externalities. We provide a nonparametric characterization of cost efficient

  1. The Support Reduction Algorithm for Computing Non-Parametric Function Estimates in Mixture Models

    OpenAIRE

    GROENEBOOM, PIET; JONGBLOED, GEURT; WELLNER, JON A.

    2008-01-01

    In this paper, we study an algorithm (which we call the support reduction algorithm) that can be used to compute non-parametric M-estimators in mixture models. The algorithm is compared with natural competitors in the context of convex regression and the ‘Aspect problem’ in quantum physics.

  2. A Bayesian Beta-Mixture Model for Nonparametric IRT (BBM-IRT)

    Science.gov (United States)

    Arenson, Ethan A.; Karabatsos, George

    2017-01-01

    Item response models typically assume that the item characteristic (step) curves follow a logistic or normal cumulative distribution function, which are strictly monotone functions of person test ability. Such assumptions can be overly-restrictive for real item response data. We propose a simple and more flexible Bayesian nonparametric IRT model…

  3. Non-parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods

    DEFF Research Database (Denmark)

    Høg, Esben

    In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example the Ornstein-Uhlenbeck process (the simplest mean-reverting process) has been used to model non-speculative price processes. We discuss non-parametric estimation of these processes...

  4. Non-Parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods

    DEFF Research Database (Denmark)

    Høg, Esben

    2003-01-01

    In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example the Ornstein-Uhlenbeck process (the simplest mean-reverting process) has been used to model non-speculative price processes. We discuss non-parametric estimation of these processes...

  5. A non-parametric Bayesian approach to decompounding from high frequency data

    NARCIS (Netherlands)

    Gugushvili, Shota; van der Meulen, F.H.; Spreij, Peter

    2016-01-01

    Given a sample from a discretely observed compound Poisson process, we consider non-parametric estimation of the density f0 of its jump sizes, as well as of its intensity λ0. We take a Bayesian approach to the problem and specify the prior on f0 as the Dirichlet location mixture of normal densities.

  6. Data analysis with small samples and non-normal data nonparametrics and other strategies

    CERN Document Server

    Siebert, Carl F

    2017-01-01

    Written in everyday language for non-statisticians, this book provides all the information needed to successfully conduct nonparametric analyses. This ideal reference book provides step-by-step instructions to lead the reader through each analysis, screenshots of the software and output, and case scenarios to illustrate all the analytic techniques.

  7. Nonparametric estimation of the stationary M/G/1 workload distribution function

    DEFF Research Database (Denmark)

    Hansen, Martin Bøgsted

    2005-01-01

    In this paper it is demonstrated how a nonparametric estimator of the stationary workload distribution function of the M/G/1-queue can be obtained by systematically sampling the workload process. Weak convergence results and bootstrap methods for empirical distribution functions for stationary associ...

  8. A non-parametric method for correction of global radiation observations

    DEFF Research Database (Denmark)

    Bacher, Peder; Madsen, Henrik; Perers, Bengt

    2013-01-01

    in the observations are corrected. These are errors such as: tilt in the leveling of the sensor, shadowing from surrounding objects, clipping and saturation in the signal processing, and errors from dirt and wear. The method is based on a statistical non-parametric clear-sky model which is applied to both...

  9. Nonparametric estimation in an "illness-death" model when all transition times are interval censored

    DEFF Research Database (Denmark)

    Frydman, Halina; Gerds, Thomas; Grøn, Randi

    2013-01-01

    We develop nonparametric maximum likelihood estimation for the parameters of an irreversible Markov chain on states {0,1,2} from the observations with interval censored times of 0 → 1, 0 → 2 and 1 → 2 transitions. The distinguishing aspect of the data is that, in addition to all transition times ...

  10. Non-parametric Tuning of PID Controllers A Modified Relay-Feedback-Test Approach

    CERN Document Server

    Boiko, Igor

    2013-01-01

    The relay feedback test (RFT) has become a popular and efficient tool used in process identification and automatic controller tuning. Non-parametric Tuning of PID Controllers couples new modifications of classical RFT with application-specific optimal tuning rules to form a non-parametric method of test-and-tuning. Test and tuning are coordinated through a set of common parameters so that a PID controller can obtain the desired gain or phase margins in a system exactly, even with unknown process dynamics. The concept of process-specific optimal tuning rules in the nonparametric setup, with corresponding tuning rules for flow, level, pressure, and temperature control loops is presented in the text. Common problems of tuning accuracy based on parametric and non-parametric approaches are addressed. In addition, the text treats the parametric approach to tuning based on the modified RFT approach and the exact model of oscillations in the system under test using the locus of a perturbed relay system (LPRS) meth...

  11. A comparative study of non-parametric models for identification of ...

    African Journals Online (AJOL)

    However, the frequency response method using random binary signals was good for unpredicted white noise characteristics and considered the best method for non-parametric system identification. The autoregressive external input (ARX) model was very useful for system identification, but on application, few input ...

  12. A non-parametric hierarchical model to discover behavior dynamics from tracks

    NARCIS (Netherlands)

    Kooij, J.F.P.; Englebienne, G.; Gavrila, D.M.

    2012-01-01

    We present a novel non-parametric Bayesian model to jointly discover the dynamics of low-level actions and high-level behaviors of tracked people in open environments. Our model represents behaviors as Markov chains of actions which capture high-level temporal dynamics. Actions may be shared by

  13. Experimental Sentinel-2 LAI estimation using parametric, non-parametric and physical retrieval methods - A comparison

    NARCIS (Netherlands)

    Verrelst, Jochem; Rivera, Juan Pablo; Veroustraete, Frank; Muñoz-Marí, Jordi; Clevers, J.G.P.W.; Camps-Valls, Gustau; Moreno, José

    2015-01-01

    Given the forthcoming availability of Sentinel-2 (S2) images, this paper provides a systematic comparison of retrieval accuracy and processing speed of a multitude of parametric, non-parametric and physically-based retrieval methods using simulated S2 data. An experimental field dataset (SPARC),

  14. PageRank as a method to rank biomedical literature by importance.

    Science.gov (United States)

    Yates, Elliot J; Dixon, Louise C

    2015-01-01

    Optimal ranking of literature importance is vital in overcoming article overload. Existing ranking methods are typically based on raw citation counts, giving a sum of 'inbound' links with no consideration of citation importance. PageRank, an algorithm originally developed for ranking webpages at the search engine Google, could potentially be adapted to bibliometrics to quantify the relative importance weightings of a citation network. This article seeks to validate such an approach on the freely available PubMed Central open access subset (PMC-OAS) of biomedical literature. On-demand cloud computing infrastructure was used to extract a citation network from over 600,000 full-text PMC-OAS articles. PageRanks and citation counts were calculated for each node in this network. PageRank is highly correlated with citation count (R = 0.905, P ...). PageRank can be trivially computed on commodity cluster hardware and is linearly correlated with citation count. Given its putative benefits in quantifying relative importance, we suggest it may enrich the citation network, thereby overcoming the existing inadequacy of citation counts alone. We thus suggest PageRank as a feasible supplement to, or replacement of, existing bibliometric ranking methods.

  15. RANK/RANK ligand/OPG: A new therapeutic approach in the treatment of osteoporosis

    Directory of Open Access Journals (Sweden)

    Preisinger E

    2007-01-01

    Full Text Available Research into the coupling mechanisms of osteoclastogenesis, bone resorption and remodelling has opened up new potential therapeutic approaches in the treatment of osteoporosis. The RANK ("receptor activator of nuclear factor (NF)-κB") ligand (RANKL) plays a key role in bone resorption. The binding of RANKL to the receptor RANK initiates bone resorption. OPG (osteoprotegerin) and the human monoclonal antibody (IgG2) denosumab, which was developed for clinical use, block the binding of RANK ligand to RANK and prevent bone resorption.

  16. Country-specific determinants of world university rankings

    OpenAIRE

    Pietrucha, Jacek

    2017-01-01

    This paper examines country-specific factors that affect the three most influential world university rankings (the Academic Ranking of World Universities, the QS World University Ranking, and the Times Higher Education World University Ranking). We run a cross sectional regression that covers 42–71 countries (depending on the ranking and data availability). We show that the position of universities from a country in the ranking is determined by the following country-specific variables: econom...

  17. Global network centrality of university rankings

    Science.gov (United States)

    Guo, Weisi; Del Vecchio, Marco; Pogrebna, Ganna

    2017-10-01

    Universities and higher education institutions form an integral part of the national infrastructure and prestige. As academic research benefits increasingly from international exchange and cooperation, many universities have increased investment in improving and enabling their global connectivity. Yet, the relationship between university performance and its global physical connectedness has not been explored in detail. We conduct, to our knowledge, the first large-scale data-driven analysis into whether there is a correlation between university relative ranking performance and its global connectivity via the air transport network. The results show that local access to global hubs (as measured by air transport network betweenness) strongly and positively correlates with the ranking growth (statistical significance in different models ranges between 5% and 1% level). We also found that the local airport's aggregate flight paths (degree) and capacity (weighted degree) have no effect on university ranking, further showing that global connectivity distance is more important than the capacity of flight connections. We also examined the effect of local city economic development as a confounding variable and no effect was observed, suggesting that access to global transportation hubs outweighs economic performance as a determinant of university ranking. The impact of this research is that we have determined the importance of the centrality of global connectivity and, hence, established initial evidence for further exploring potential connections between university ranking and regional investment policies on improving global connectivity.

  18. Diversity rankings among bacterial lineages in soil.

    Science.gov (United States)

    Youssef, Noha H; Elshahed, Mostafa S

    2009-03-01

    We used rarefaction curve analysis and diversity ordering-based approaches to rank the 11 most frequently encountered bacterial lineages in soil according to diversity in 5 previously reported 16S rRNA gene clone libraries derived from agricultural, undisturbed tall grass prairie and forest soils (n = 26,140, 28,328, 31,818, 13,001 and 53,533). The Planctomycetes, Firmicutes and the delta-Proteobacteria were consistently ranked among the most diverse lineages in all data sets, whereas the Verrucomicrobia, Gemmatimonadetes and beta-Proteobacteria were consistently ranked among the least diverse. On the other hand, the rankings of alpha-Proteobacteria, Acidobacteria, Actinobacteria, Bacteroidetes and Chloroflexi varied widely in different soil clone libraries. In general, lineages exhibiting largest differences in diversity rankings also exhibited the largest difference in relative abundance in the data sets examined. Within these lineages, a positive correlation between relative abundance and diversity was observed within the Acidobacteria, Actinobacteria and Chloroflexi, and a negative diversity-abundance correlation was observed within the Bacteroidetes. The ecological and evolutionary implications of these results are discussed.

  19. Social class rank, essentialism, and punitive judgment.

    Science.gov (United States)

    Kraus, Michael W; Keltner, Dacher

    2013-08-01

    Recent evidence suggests that perceptions of social class rank influence a variety of social cognitive tendencies, from patterns of causal attribution to moral judgment. In the present studies we tested the hypotheses that upper-class rank individuals would be more likely to endorse essentialist lay theories of social class categories (i.e., that social class is founded in genetically based, biological differences) than would lower-class rank individuals and that these beliefs would decrease support for restorative justice--which seeks to rehabilitate offenders, rather than punish unlawful action. Across studies, higher social class rank was associated with increased essentialism of social class categories (Studies 1, 2, and 4) and decreased support for restorative justice (Study 4). Moreover, manipulated essentialist beliefs decreased preferences for restorative justice (Study 3), and the association between social class rank and class-based essentialist theories was explained by the tendency to endorse beliefs in a just world (Study 2). Implications for how class-based essentialist beliefs potentially constrain social opportunity and mobility are discussed.

  20. RANK and RANKL: From Bone to Breast Cancer (RANK und RANKL - Vom Knochen zum Mammakarzinom)

    Directory of Open Access Journals (Sweden)

    Sigl V

    2012-01-01

    Full Text Available RANK ("Receptor Activator of NF-κB") and its ligand RANKL are key molecules in bone metabolism and play an essential role in the development of pathological bone changes. Deregulation of the RANK/RANKL system is, for example, a major cause of postmenopausal osteoporosis in women. Another important function of RANK and RANKL lies in the development of milk-secreting glands during pregnancy. Sex hormones such as progesterone regulate the expression of RANKL and thereby induce the proliferation of breast epithelial cells. It has long been known that RANK and RANKL are involved in the formation of breast cancer metastases in bone tissue. We have now also identified the RANK/RANKL system as an essential mechanism in the development of hormone-driven breast cancer. In this article we therefore pay particular attention to the latest findings and critically examine them with regard to breast cancer development.

  1. Marginal versus joint Box-Cox transformation with applications to percentile curve construction for IgG subclasses and blood pressures.

    Science.gov (United States)

    He, Xuming; Ng, K W; Shi, Jian

    2003-02-15

    When age-specific percentile curves are constructed for several correlated variables, the marginal method of handling one variable at a time has typically been used. We address the question, frequently asked by practitioners, of whether we can achieve efficiency gains by joint estimation. We focus on a simple but common method of Box-Cox transformation and assess the statistical impact of a joint transformation to multivariate normality on the percentile curve estimation for correlated variables. We find that there is little gain from the joint transformation for estimating percentiles around the median but a noticeable reduction in variances is possible for estimating extreme percentiles that are usually of main interest in medical and biological applications. Our study is motivated by problems in constructing percentile charts for IgG subclasses of children and for blood pressures in adult populations, both of which are discussed in the paper as examples, and yet our general findings are applicable to a wide range of other problems. Copyright 2003 John Wiley & Sons, Ltd.
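
    The marginal approach described above can be illustrated with a minimal sketch: transform one variable with a Box-Cox transformation, treat the transformed values as normal, and map a normal quantile back to the original scale. The function names are hypothetical, the transformation parameter `lam` is assumed to be given rather than estimated, and neither age dependence nor the joint multivariate transformation studied in the paper is covered.

```python
import math
import statistics


def box_cox(x, lam):
    """Box-Cox transform of a positive value x with parameter lam."""
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inv_box_cox(y, lam):
    """Inverse of box_cox for the same parameter lam."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


def percentile_via_box_cox(sample, lam, p):
    """Estimate the p-th quantile (0 < p < 1) assuming normality after transformation."""
    transformed = [box_cox(x, lam) for x in sample]
    mu = statistics.fmean(transformed)
    sd = statistics.stdev(transformed)
    z = statistics.NormalDist().inv_cdf(p)  # standard normal quantile
    return inv_box_cox(mu + z * sd, lam)
```

    With `lam = 0` the estimated median reduces to the geometric mean of the sample, which is a convenient sanity check for the sketch.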

  2. Low Rank Approximation Algorithms, Implementation, Applications

    CERN Document Server

    Markovsky, Ivan

    2012-01-01

    Matrix low-rank approximation is intimately related to data modelling; a problem that arises frequently in many different fields. Low Rank Approximation: Algorithms, Implementation, Applications is a comprehensive exposition of the theory, algorithms, and applications of structured low-rank approximation. Local optimization methods and effective suboptimal convex relaxations for Toeplitz, Hankel, and Sylvester structured problems are presented. A major part of the text is devoted to application of the theory. Applications described include: system and control theory: approximate realization, model reduction, output error, and errors-in-variables identification; signal processing: harmonic retrieval, sum-of-damped exponentials, finite impulse response modeling, and array processing; machine learning: multidimensional scaling and recommender system; computer vision: algebraic curve fitting and fundamental matrix estimation; bioinformatics for microarray data analysis; chemometrics for multivariate calibration; ...

  3. Resolution of ranking hierarchies in directed networks

    Science.gov (United States)

    Barucca, Paolo; Lillo, Fabrizio

    2018-01-01

    Identifying hierarchies and rankings of nodes in directed graphs is fundamental in many applications such as social network analysis, biology, economics, and finance. A recently proposed method identifies the hierarchy by finding the ordered partition of nodes which minimises a score function, termed agony. This function penalises the links violating the hierarchy in a way depending on the strength of the violation. To investigate the resolution of ranking hierarchies we introduce an ensemble of random graphs, the Ranked Stochastic Block Model. We find that agony may fail to identify hierarchies when the structure is not strong enough and the size of the classes is small with respect to the whole network. We analytically characterise the resolution threshold and we show that an iterated version of agony can partly overcome this resolution limit. PMID:29394278
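
    As a rough illustration of a score that penalises hierarchy-violating links according to the strength of the violation, the sketch below evaluates a given ranking. It assumes the commonly used linear form, in which an edge u -> v contributes max(r(u) - r(v) + 1, 0); the abstract does not state the formula, so this form and the function name `agony` are assumptions. Finding the minimising ordered partition, and the iterated variant, are not attempted here.

```python
def agony(edges, rank):
    """Total penalty of a ranking for a directed graph.

    edges: iterable of (u, v) pairs meaning a link u -> v.
    rank:  dict mapping each node to an integer rank.
    An edge pointing strictly up the hierarchy (rank[v] > rank[u]) costs 0;
    any other edge costs rank[u] - rank[v] + 1 (assumed linear penalty).
    """
    return sum(max(rank[u] - rank[v] + 1, 0) for u, v in edges)


# A directed 3-cycle cannot be made penalty-free by any ranking.
cycle = [("a", "b"), ("b", "c"), ("c", "a")]
flat_score = agony(cycle, {"a": 0, "b": 0, "c": 0})
chain_score = agony(cycle, {"a": 0, "b": 1, "c": 2})
```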

  4. Ranking beta sheet topologies of proteins

    DEFF Research Database (Denmark)

    Fonseca, Rasmus; Helles, Glennie; Winter, Pawel

    2010-01-01

    One of the challenges of protein structure prediction is to identify long-range interactions between amino acids. To reliably predict such interactions, we enumerate, score and rank all beta-topologies (partitions of beta-strands into sheets, orderings of strands within sheets and orientations...... of paired strands) of a given protein. We show that the beta-topology corresponding to the native structure is, with high probability, among the top-ranked. Since full enumeration is very time-consuming, we also suggest a method to deal with proteins with many beta-strands. The results reported...... in this paper are highly relevant for ab initio protein structure prediction methods based on decoy generation. The top-ranked beta-topologies can be used to find initial conformations from which conformational searches can be started. They can also be used to filter decoys by removing those with poorly...

  5. Data envelopment analysis of randomized ranks

    Directory of Open Access Journals (Sweden)

    Sant'Anna Annibal P.

    2002-01-01

    Full Text Available Probabilities and odds, derived from vectors of ranks, are here compared as measures of efficiency of decision-making units (DMUs). These measures are computed with the goal of providing preliminary information before starting a Data Envelopment Analysis (DEA) or applying any other methodology for evaluation or composition of preferences. Evaluations of preferences, quality and productivity are usually measured with error or subject to the influence of other random disturbances. Reducing evaluations to ranks and treating the ranks as estimates of location parameters of random variables, we are able to compute the probability of each DMU being classified as the best according to the consumption of each input and the production of each output. Employing the probabilities of being the best as efficiency measures, we stretch distances between the most efficient units. We combine these partial probabilities in a global efficiency score determined in terms of proximity to the efficiency frontier.

  6. Ranking spreaders by decomposing complex networks

    International Nuclear Information System (INIS)

    Zeng, An; Zhang, Cheng-Jun

    2013-01-01

    Ranking the nodes' ability of spreading in networks is crucial for designing efficient strategies to hinder spreading in the case of diseases or accelerate spreading in the case of information dissemination. In the well-known k-shell method, nodes are ranked only according to the links between the remaining nodes (residual links) while the links connecting to the removed nodes (exhausted links) are entirely ignored. In this Letter, we propose a mixed degree decomposition (MDD) procedure in which both the residual degree and the exhausted degree are considered. By simulating the epidemic spreading process on real networks, we show that the MDD method can outperform the k-shell and degree methods in ranking spreaders.
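
    For orientation, the baseline k-shell method mentioned above can be sketched as follows: nodes of lowest residual degree are peeled off repeatedly, and each node is labelled with the level at which it is removed. This is only the standard k-shell decomposition; the MDD procedure, which also weighs exhausted links, is not reproduced because the abstract does not give its formula. The function name `k_shell` is hypothetical.

```python
from collections import defaultdict


def k_shell(edges):
    """Return {node: shell index} for an undirected graph given as edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    degree = {n: len(nbrs) for n, nbrs in adj.items()}  # residual degrees
    remaining = set(adj)
    shell = {}
    while remaining:
        # Current level: the smallest residual degree still present.
        k = min(degree[n] for n in remaining)
        queue = [n for n in remaining if degree[n] <= k]
        while queue:
            n = queue.pop()
            if n not in remaining:
                continue  # already removed via another path
            remaining.discard(n)
            shell[n] = k
            for m in adj[n]:
                if m in remaining:
                    degree[m] -= 1
                    if degree[m] <= k:
                        queue.append(m)
    return shell


# Triangle a-b-c with a pendant node d attached to a.
shells = k_shell([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])
```

    In this toy graph the pendant node falls in shell 1 and the triangle in shell 2, even though node a has the highest degree.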

  7. Estimation of the limit of detection with a bootstrap-derived standard error by a partly non-parametric approach. Application to HPLC drug assays

    DEFF Research Database (Denmark)

    Linnet, Kristian

    2005-01-01

    Bootstrap, HPLC, limit of blank, limit of detection, non-parametric statistics, type I and II errors...

  8. Sign rank versus Vapnik-Chervonenkis dimension

    Science.gov (United States)

    Alon, N.; Moran, Sh; Yehudayoff, A.

    2017-12-01

    This work studies the maximum possible sign rank of sign (N × N)-matrices with a given Vapnik-Chervonenkis dimension d. For d=1, this maximum is three. For d=2, this maximum is $\widetilde{\Theta}(N^{1/2})$. For d>2, similar but slightly less accurate statements hold. The lower bounds improve on previous ones by Ben-David et al., and the upper bounds are novel. The lower bounds are obtained by probabilistic constructions, using a theorem of Warren in real algebraic topology. The upper bounds are obtained using a result of Welzl about spanning trees with low stabbing number, and using the moment curve. The upper bound technique is also used to: (i) provide estimates on the number of classes of a given Vapnik-Chervonenkis dimension, and the number of maximum classes of a given Vapnik-Chervonenkis dimension, answering a question of Frankl from 1989, and (ii) design an efficient algorithm that provides an O(N/log(N)) multiplicative approximation for the sign rank. We also observe a general connection between sign rank and spectral gaps which is based on Forster's argument. Consider the adjacency (N × N)-matrix of a Δ-regular graph with a second eigenvalue of absolute value λ and Δ ≤ N/2. We show that the sign rank of the signed version of this matrix is at least Δ/λ. We use this connection to prove the existence of a maximum class $C \subseteq \{\pm 1\}^N$ with Vapnik-Chervonenkis dimension 2 and sign rank $\widetilde{\Theta}(N^{1/2})$. This answers a question of Ben-David et al. regarding the sign rank of large Vapnik-Chervonenkis classes. We also describe limitations of this approach, in the spirit of the Alon-Boppana theorem. We further describe connections to communication complexity, geometry, learning theory, and combinatorics. Bibliography: 69 titles.

  9. Empirical Bayes ranking and selection methods via semiparametric hierarchical mixture models in microarray studies.

    Science.gov (United States)

    Noma, Hisashi; Matsui, Shigeyuki

    2013-05-20

    The main purpose of microarray studies is screening of differentially expressed genes as candidates for further investigation. Because of limited resources at this stage, prioritizing genes is a relevant statistical task in microarray studies. For effective gene selection, parametric empirical Bayes methods for ranking and selection of genes with the largest effect sizes have been proposed (Noma et al., 2010; Biostatistics 11: 281-289). The hierarchical mixture model incorporates the differential and non-differential components and allows information borrowing across differential genes with separation from nuisance, non-differential genes. In this article, we develop empirical Bayes ranking methods via a semiparametric hierarchical mixture model. A nonparametric prior distribution, rather than parametric prior distributions, for effect sizes is specified and estimated using the "smoothing by roughening" approach of Laird and Louis (1991; Computational statistics and data analysis 12: 27-37). We present applications to childhood and infant leukemia clinical studies with microarrays for exploring genes related to prognosis or disease progression. Copyright © 2012 John Wiley & Sons, Ltd.

  10. RankProdIt: A web-interactive Rank Products analysis tool

    Directory of Open Access Journals (Sweden)

    Laing Emma

    2010-08-01

    Full Text Available Background: The first objective of a DNA microarray experiment is typically to generate a list of genes or probes that are found to be differentially expressed or represented (in the case of comparative genomic hybridizations and/or copy number variation) between two conditions or strains. Rank Products analysis comprises a robust algorithm for deriving such lists from microarray experiments that comprise small numbers of replicates, for example, fewer than the number required for the commonly used t-test. Currently, users wishing to apply Rank Products analysis to their own microarray data sets have been restricted to command line-based software, which can limit its usage within the biological community. Findings: Here we have developed a web interface to existing Rank Products analysis tools, allowing users to quickly process their data in an intuitive and step-wise manner to obtain the respective Rank Product or Rank Sum, probability of false prediction and p-values in a downloadable file. Conclusions: The online interactive Rank Products analysis tool RankProdIt, for analysis of any data set containing measurements for multiple replicated conditions, is available at: http://strep-microarray.sbs.surrey.ac.uk/RankProducts
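
    The core quantity behind Rank Products analysis can be sketched in a few lines: each gene is ranked within every replicate, and the ranks are combined by a geometric mean. The sketch assumes rank 1 is given to the largest fold change and that there are no ties; the function name and the toy data are hypothetical, and the significance estimates (probability of false prediction, p-values) reported by the tool are not reproduced.

```python
import math


def rank_products(fold_changes):
    """Geometric mean of per-replicate ranks for each gene.

    fold_changes: dict mapping gene name -> list of fold changes,
    one value per replicate (all lists the same length, no ties assumed).
    """
    genes = list(fold_changes)
    n_reps = len(fold_changes[genes[0]])
    log_sum = {g: 0.0 for g in genes}
    for j in range(n_reps):
        # Rank 1 goes to the most strongly up-regulated gene in replicate j.
        ordered = sorted(genes, key=lambda g: fold_changes[g][j], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            log_sum[g] += math.log(rank)
    return {g: math.exp(log_sum[g] / n_reps) for g in genes}


# Toy data: three genes measured in two replicates.
rp = rank_products({"A": [3.0, 2.5], "B": [1.0, 2.8], "C": [0.5, 0.4]})
```

    Genes with a small rank product are consistently near the top of the per-replicate lists, which is why the statistic remains usable with very few replicates.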

  11. Rank-based Tests of the Cointegrating Rank in Semiparametric Error Correction Models

    NARCIS (Netherlands)

    Hallin, M.; van den Akker, R.; Werker, B.J.M.

    2012-01-01

    Abstract: This paper introduces rank-based tests for the cointegrating rank in an Error Correction Model with i.i.d. elliptical innovations. The tests are asymptotically distribution-free, and their validity does not depend on the actual distribution of the innovations. This result holds despite the

  12. When sparse coding meets ranking: a joint framework for learning sparse codes and ranking scores

    KAUST Repository

    Wang, Jim Jing-Yan

    2017-06-28

    Sparse coding, which represents a data point as a sparse reconstruction code with regard to a dictionary, has been a popular data representation method. Meanwhile, in database retrieval problems, learning the ranking scores from data points plays an important role. Up to now, these two problems have always been considered separately, assuming that data coding and ranking are two independent and irrelevant problems. However, is there any internal relationship between sparse coding and ranking score learning? If yes, how to explore and make use of this internal relationship? In this paper, we try to answer these questions by developing the first joint sparse coding and ranking score learning algorithm. To explore the local distribution in the sparse code space, and also to bridge coding and ranking problems, we assume that in the neighborhood of each data point, the ranking scores can be approximated from the corresponding sparse codes by a local linear function. By considering the local approximation error of ranking scores, the reconstruction error and sparsity of sparse coding, and the query information provided by the user, we construct a unified objective function for learning of sparse codes, the dictionary and ranking scores. We further develop an iterative algorithm to solve this optimization problem.

  13. Learning to rank for information retrieval

    CERN Document Server

    Liu, Tie-Yan

    2011-01-01

    Due to the fast growth of the Web and the difficulties in finding desired information, efficient and effective information retrieval systems have become more important than ever, and the search engine has become an essential tool for many people. The ranker, a central component in every search engine, is responsible for the matching between processed queries and indexed documents. Because of its central role, great attention has been paid to the research and development of ranking technologies. In addition, ranking is also pivotal for many other information retrieval applications, such as coll

  14. Cointegration rank testing under conditional heteroskedasticity

    DEFF Research Database (Denmark)

    Cavaliere, Giuseppe; Rahbek, Anders Christian; Taylor, Robert M.

    2010-01-01

    We analyze the properties of the conventional Gaussian-based cointegrating rank tests of Johansen (1996, Likelihood-Based Inference in Cointegrated Vector Autoregressive Models) in the case where the vector of series under test is driven by globally stationary, conditionally heteroskedastic......, relative to tests based on the asymptotic critical values or the i.i.d. bootstrap, the wild bootstrap rank tests perform very well in small samples under a variety of conditionally heteroskedastic innovation processes. An empirical application to the term structure of interest rates is given....

  15. Ranking health between countries in international comparisons

    DEFF Research Database (Denmark)

    Brønnum-Hansen, Henrik

    2014-01-01

    Cross-national comparisons and ranking of summary measures of population health sometimes give rise to inconsistent and diverging conclusions. In order to minimise confusion, international comparative studies ought to be based on well-harmonised data with common standards of definitions and docum...

  16. Preference Learning and Ranking by Pairwise Comparison

    Science.gov (United States)

    Fürnkranz, Johannes; Hüllermeier, Eyke

    This chapter provides an overview of recent work on preference learning and ranking via pairwise classification. The learning by pairwise comparison (LPC) paradigm is the natural machine learning counterpart to the relational approach to preference modeling and decision making. From a machine learning point of view, LPC is especially appealing as it decomposes a possibly complex prediction problem into a certain number of learning problems of the simplest type, namely binary classification. We explain how to approach different preference learning problems, such as label and instance ranking, within the framework of LPC. We primarily focus on methodological aspects, but also address theoretical questions as well as algorithmic and complexity issues.

  17. Compressed Sensing with Rank Deficient Dictionaries

    DEFF Research Database (Denmark)

    Hansen, Thomas Lundgaard; Johansen, Daniel Højrup; Jørgensen, Peter Bjørn

    2012-01-01

    In compressed sensing it is generally assumed that the dictionary matrix constitutes a (possibly overcomplete) basis of the signal space. In this paper we consider dictionaries that do not span the signal space, i.e. rank deficient dictionaries. We show that in this case the signal-to-noise ratio...... (SNR) in the compressed samples can be increased by selecting the rows of the measurement matrix from the column space of the dictionary. As an example application of compressed sensing with a rank deficient dictionary, we present a case study of compressed sensing applied to the Coarse Acquisition (C...

  18. Ranking mutual funds using Sortino method

    Directory of Open Access Journals (Sweden)

    Khosro Faghani Makrani

    2014-04-01

    Full Text Available One of the primary concerns in most business activities is to determine an efficient method for ranking mutual funds. This paper performs an empirical investigation to rank 42 mutual funds listed on the Tehran Stock Exchange using the Sortino method over the period 2011-2012. The results of the survey were compared with the market return, and they confirmed a positive and meaningful relationship between Sortino return and market return. In addition, there was a positive and meaningful relationship between the two Sortino methods.
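
    As a hedged illustration of ranking funds by a Sortino-type measure, the sketch below uses the usual textbook definition: mean return in excess of a target, divided by the downside deviation below that target. The abstract does not say which two Sortino variants were used, so this definition, the function name and the toy returns are assumptions.

```python
import math


def sortino_ratio(returns, target=0.0):
    """Mean excess return over `target` divided by downside deviation."""
    n = len(returns)
    excess = sum(returns) / n - target
    # Only returns below the target contribute to downside risk.
    downside_sq = [min(0.0, r - target) ** 2 for r in returns]
    downside_dev = math.sqrt(sum(downside_sq) / n)
    if downside_dev == 0.0:
        raise ValueError("no returns below target; ratio is undefined")
    return excess / downside_dev


# Hypothetical periodic returns for two funds.
funds = {
    "fund_x": [0.04, -0.02, 0.06, -0.04],
    "fund_y": [0.01, -0.03, 0.02, -0.04],
}
ranking = sorted(funds, key=lambda f: sortino_ratio(funds[f]), reverse=True)
```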

  19. Research of Subgraph Estimation Page Rank Algorithm for Web Page Rank

    Directory of Open Access Journals (Sweden)

    LI Lan-yin

    2017-04-01

    Full Text Available The traditional PageRank algorithm cannot efficiently rank Web pages at large data scale. This paper proposes an accelerated algorithm named topK-Rank, which is based on PageRank and runs on the MapReduce platform. It can find the top k nodes of a given graph efficiently without sacrificing accuracy. To identify the top k nodes, the topK-Rank algorithm prunes unnecessary nodes and edges in each iteration to dynamically construct subgraphs, and iteratively estimates lower and upper bounds of PageRank scores through the subgraphs. Theoretical analysis shows that this method guarantees exact results. Experiments show that the topK-Rank algorithm can find the top k nodes much faster than existing approaches.
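
    For reference, the baseline that such accelerated methods are compared against can be sketched as plain PageRank by power iteration, followed by selecting the k highest-scoring nodes. This is not the topK-Rank algorithm: there is no pruning, no bound estimation and no MapReduce. The function names, damping factor and tolerance are illustrative assumptions.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dict {node: [nodes it links to]}."""
    nodes = sorted(set(out_links) | {v for vs in out_links.values() for v in vs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = out_links.get(u, [])
            for v in targets:
                new[v] += damping * rank[u] / len(targets)
        change = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if change < tol:
            break
    return rank


def top_k(rank, k):
    """Return the k nodes with the highest scores, best first."""
    return sorted(rank, key=rank.get, reverse=True)[:k]


scores = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
best_two = top_k(scores, 2)
```

    The full iteration touches every node and edge in each pass, which is the cost that pruning-based approaches aim to avoid when only the top k nodes are needed.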

  20. Percentile-Based ETCCDI Temperature Extremes Indices for CMIP5 Model Output: New Results through Semiparametric Quantile Regression Approach

    Science.gov (United States)

    Li, L.; Yang, C.

    2017-12-01

    Climate extremes often manifest as rare events in terms of surface air temperature and precipitation with an annual reoccurrence period. In order to represent the manifold characteristics of climate extremes for monitoring and analysis, the Expert Team on Climate Change Detection and Indices (ETCCDI) worked out a set of 27 core indices based on daily temperature and precipitation data, describing extreme weather and climate events on an annual basis. The CLIMDEX project (http://www.climdex.org) produced public domain datasets of such indices for data from a variety of sources, including output from global climate models (GCM) participating in the Coupled Model Intercomparison Project Phase 5 (CMIP5). Among the 27 ETCCDI indices, there are six percentile-based temperature extremes indices that fall into two groups: exceedance rates (ER) (TN10p, TN90p, TX10p and TX90p) and durations (CSDI and WSDI). Percentiles must be estimated prior to the calculation of the indices, and the estimates can be biased to some degree by the adopted algorithm. Such biases are in turn propagated to the final index values. CLIMDEX used an empirical quantile estimator combined with a bootstrap resampling procedure to reduce the inhomogeneity in the annual series of the ER indices. However, some problems remain in the CLIMDEX datasets, namely overestimated climate variability due to unaccounted autocorrelation in the daily temperature data, seasonally varying biases, and inconsistency between the algorithms applied to the ER indices and to the duration indices. We now present new results for the six indices obtained through a semiparametric quantile regression approach for the CMIP5 model output. By using the base-period data as a whole and taking seasonality and autocorrelation into account, this approach addresses the aforementioned issues and yields consistent results. The new datasets cover the historical and three projected (RCP2.6, RCP4.5 and RCP
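
    To make the idea of a percentile-based exceedance rate concrete, the sketch below estimates a threshold from base-period data with a simple interpolated empirical quantile and reports the percentage of target-period values above it. This is a deliberately simplified illustration: it ignores calendar-day windows, the bootstrap procedure and the semiparametric quantile regression discussed above, and the function names and toy data are hypothetical.

```python
def empirical_quantile(values, q):
    """Linearly interpolated empirical quantile for 0 <= q <= 1."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])


def exceedance_rate(base_period, target_period, q=0.9):
    """Percentage of target-period values above the base-period q-quantile."""
    threshold = empirical_quantile(base_period, q)
    above = sum(1 for t in target_period if t > threshold)
    return 100.0 * above / len(target_period)


# Toy data: base-period values 1..101, so the 90th percentile is 91.
base = list(range(1, 102))
target = [80, 92, 95, 100, 50, 60, 70, 99, 10, 20]
rate = exceedance_rate(base, target)
```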

  1. Subject Gateway Sites and Search Engine Ranking.

    Science.gov (United States)

    Thelwall, Mike

    2002-01-01

    Discusses subject gateway sites and commercial search engines for the Web and presents an explanation of Google's PageRank algorithm. The principal question addressed is under which conditions a gateway site will increase the likelihood that a target page is found in search engines. (LRW)

  2. Rank reduction of correlation matrices by majorization

    NARCIS (Netherlands)

    R. Pietersz (Raoul); P.J.F. Groenen (Patrick)

    2004-01-01

    In this paper a novel method is developed for the problem of finding a low-rank correlation matrix nearest to a given correlation matrix. The method is based on majorization and therefore it is globally convergent. The method is computationally efficient, is straightforward to implement,

  3. Ranking related entities: components and analyses

    NARCIS (Netherlands)

    Bron, M.; Balog, K.; de Rijke, M.

    2010-01-01

    Related entity finding is the task of returning a ranked list of homepages of relevant entities of a specified type that need to engage in a given relationship with a given source entity. We propose a framework for addressing this task and perform a detailed analysis of four core components;

  4. Ranking Very Many Typed Entities on Wikipedia

    NARCIS (Netherlands)

    Zaragoza, Hugo; Rode, H.; Mika, Peter; Atserias, Jordi; Ciaramita, Massimiliano; Attardi, Guiseppe

    2007-01-01

    We discuss the problem of ranking very many entities of different types. In particular we deal with a heterogeneous set of types, some being very generic and some very specific. We discuss two approaches for this problem: i) exploiting the entity containment graph and ii) using a Web search engine

  5. On the Dirac groups of rank n

    International Nuclear Information System (INIS)

    Ferreira, P.L.; Alcaras, J.A.C.

    1980-01-01

    The group theoretical properties of the Dirac groups of rank n are discussed together with the properties and construction of their IR's. The cases n even and n odd show distinct features. Furthermore, for n odd, the cases n=4K+1 and n=4K+3 exhibit some different properties too. (Author) [pt

  6. On rank 2 Seiberg-Witten equations

    International Nuclear Information System (INIS)

    Massamba, F.; Thompson, G.

    2004-02-01

    We introduce and study a set of rank 2 Seiberg-Witten equations. We show that the moduli space of solutions is a compact, orientational and smooth manifold. For minimal surfaces of general type we are able to determine the basic classes. (author)

  7. A tilting approach to ranking influence

    KAUST Repository

    Genton, Marc G.; Hall, Peter

    2014-01-01

    We suggest a new approach, which is applicable for general statistics computed from random samples of univariate or vector-valued or functional data, to assessing the influence that individual data have on the value of a statistic, and to ranking

  8. Texture Repairing by Unified Low Rank Optimization

    Institute of Scientific and Technical Information of China (English)

    Xiao Liang; Xiang Ren; Zhengdong Zhang; Yi Ma

    2016-01-01

    In this paper, we show how to harness both low-rank and sparse structures in regular or near-regular textures for image completion. Our method is based on a unified formulation for both random and contiguous corruption. In addition to the low-rank property of texture, the algorithm also uses the sparsity assumption for natural images: because a natural image is piecewise smooth, it is sparse in a certain transformed domain (such as the Fourier or wavelet domain). We combine the low-rank and sparsity properties of the texture image in the proposed algorithm. Our algorithm, based on convex optimization, can automatically and correctly repair the global structure of a corrupted texture, even without precise information about the regions to be completed. This algorithm integrates texture rectification and repairing into one optimization problem. Through extensive simulations, we show our method can complete and repair textures corrupted by errors with both random and contiguous supports better than existing low-rank matrix recovery methods. Our method demonstrates a significant advantage over local patch-based texture synthesis techniques in dealing with large corruption, non-uniform texture, and large perspective deformation.

  9. Semantic association ranking schemes for information retrieval ...

    Indian Academy of Sciences (India)

    retrieval applications using term association graph representation ... Department of Computer Science and Engineering, Government College of ... Introduction ... leads to poor precision, e.g., model, python, and chip. ...... The approaches proposed in this paper focuses on the query-centric re-ranking of search results.

  10. Efficient Rank Reduction of Correlation Matrices

    NARCIS (Netherlands)

    I. Grubisic (Igor); R. Pietersz (Raoul)

    2005-01-01

    Geometric optimisation algorithms are developed that efficiently find the nearest low-rank correlation matrix. We show, in numerical tests, that our methods compare favourably to the existing methods in the literature. The connection with the Lagrange multiplier method is established,

  11. Zero forcing parameters and minimum rank problems

    NARCIS (Netherlands)

    Barioli, F.; Barrett, W.; Fallat, S.M.; Hall, H.T.; Hogben, L.; Shader, B.L.; Driessche, van den P.; Holst, van der H.

    2010-01-01

    The zero forcing number Z(G), which is the minimum number of vertices in a zero forcing set of a graph G, is used to study the maximum nullity/minimum rank of the family of symmetric matrices described by G. It is shown that for a connected graph of order at least two, no vertex is in every zero

  12. A note on ranking assignments using reoptimization

    DEFF Research Database (Denmark)

    Pedersen, Christian Roed; Nielsen, L.R.; Andersen, K.A.

    2005-01-01

    We consider the problem of ranking assignments according to cost in the classical linear assignment problem. An algorithm partitioning the set of possible assignments, as suggested by Murty, is presented where, for each partition, the optimal assignment is calculated using a new reoptimization...

  13. Language Games: University Responses to Ranking Metrics

    Science.gov (United States)

    Heffernan, Troy A.; Heffernan, Amanda

    2018-01-01

    League tables of universities that measure performance in various ways are now commonplace, with numerous bodies providing their own rankings of how institutions throughout the world are seen to be performing on a range of metrics. This paper uses Lyotard's notion of language games to theorise that universities are regaining some power over being…

  14. Ranking Thinning Potential of Lodgepole Pine Stands

    OpenAIRE

    United States Department of Agriculture, Forest Service

    1987-01-01

    This paper presents models for predicting edge-response of dominant and codominant trees to clearing. Procedures are given for converting predictions to a thinning response index, for ranking stands for thinning priority. Data requirements, sampling suggestions, examples of application, and suggestions for management use are included to facilitate use as a field guide.

  15. Primate Innovation: Sex, Age and Social Rank

    NARCIS (Netherlands)

    Reader, S.M.; Laland, K.N.

    2001-01-01

    Analysis of an exhaustive survey of primate behavior collated from the published literature revealed significant variation in rates of innovation among individuals of different sex, age and social rank. We searched approximately 1,000 articles in four primatology journals, together with other

  16. Biomechanics Scholar Citations across Academic Ranks

    Directory of Open Access Journals (Sweden)

    Knudson Duane

    2015-11-01

    Full Text Available Study aim: citations to the publications of a scholar have been used as a measure of the quality or influence of their research record. A world-wide descriptive study of the citations to the publications of biomechanics scholars of various academic ranks was conducted.

  17. An algorithm for ranking assignments using reoptimization

    DEFF Research Database (Denmark)

    Pedersen, Christian Roed; Nielsen, Lars Relund; Andersen, Kim Allan

    2008-01-01

    We consider the problem of ranking assignments according to cost in the classical linear assignment problem. An algorithm partitioning the set of possible assignments, as suggested by Murty, is presented where, for each partition, the optimal assignment is calculated using a new reoptimization...... technique. Computational results for the new algorithm are presented...

  18. Ranking Workplace Competencies: Student and Graduate Perceptions.

    Science.gov (United States)

    Rainsbury, Elizabeth; Hodges, Dave; Burchell, Noel; Lay, Mark

    2002-01-01

    New Zealand business students and graduates made similar rankings of the five most important workplace competencies: computer literacy, customer service orientation, teamwork and cooperation, self-confidence, and willingness to learn. Graduates placed greater importance on most of the 24 competencies, resulting in a statistically significant…

  19. Comparing survival curves using rank tests

    NARCIS (Netherlands)

    Albers, Willem/Wim

    1990-01-01

    Survival times of patients can be compared using rank tests in various experimental setups, including the two-sample case and the case of paired data. Attention is focussed on two frequently occurring complications in medical applications: censoring and tail alternatives. A review is given of the

  20. A generalization of Friedman's rank statistic

    NARCIS (Netherlands)

    Kroon, de J.; Laan, van der P.

    1983-01-01

    In this paper a very natural generalization of the two-way analysis of variance rank statistic of Friedman is given. The general distribution-free test procedure based on this statistic for the effect of J treatments in a random block design can be applied in general two-way layouts without
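
    As a point of reference for the record above, the following is a minimal sketch of the classical Friedman rank statistic for a complete block design, not the generalization proposed in the paper. The function names `average_ranks` and `friedman_statistic` are hypothetical, ties receive average ranks, and no tie correction is applied.

```python
def average_ranks(values):
    """Return 1-based ranks of `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def friedman_statistic(blocks):
    """Classical Friedman statistic for b blocks (rows) and k treatments (columns).

    Q = 12 / (b k (k + 1)) * sum_j R_j^2 - 3 b (k + 1),
    where R_j is the sum over blocks of the within-block rank of treatment j.
    """
    b = len(blocks)
    k = len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (b * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * b * (k + 1)


# Three blocks that rank the three treatments identically.
print(friedman_statistic([[1, 2, 3], [1, 2, 3], [1, 2, 3]]))
```

    When every block orders the treatments the same way, the statistic reaches its maximum b(k - 1); when the rank sums are all equal, it is zero.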

  1. Probabilistic relation between In-Degree and PageRank

    NARCIS (Netherlands)

    Litvak, Nelli; Scheinhardt, Willem R.W.; Volkovich, Y.

    2008-01-01

    This paper presents a novel stochastic model that explains the relation between power laws of In-Degree and PageRank. PageRank is a popularity measure designed by Google to rank Web pages. We model the relation between PageRank and In-Degree through a stochastic equation, which is inspired by the

  2. Generalized reduced rank tests using the singular value decomposition

    NARCIS (Netherlands)

    Kleibergen, F.R.; Paap, R.

    2002-01-01

    We propose a novel statistic to test the rank of a matrix. The rank statistic overcomes deficiencies of existing rank statistics, like: necessity of a Kronecker covariance matrix for the canonical correlation rank statistic of Anderson (1951), sensitivity to the ordering of the variables for the LDU

  3. Nominal versus Attained Weights in Universitas 21 Ranking

    Science.gov (United States)

    Soh, Kaycheng

    2014-01-01

    Universitas 21 Ranking of National Higher Education Systems (U21 Ranking) is one of the three new ranking systems that appeared in 2012. In contrast with the other systems, U21 Ranking uses countries as the unit of analysis. It has several features which lend it greater trustworthiness, but it also shares some methodological issues with the other…

  4. The effect of new links on Google PageRank

    NARCIS (Netherlands)

    Avrachenkov, Konstantin; Litvak, Nelli

    2004-01-01

    PageRank is one of the principal criteria according to which Google ranks Web pages. PageRank can be interpreted as a frequency of visiting a Web page by a random surfer and thus it reflects the popularity of a Web page. We study the effect of newly created links on Google PageRank. We discuss to
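
    The random-surfer interpretation in the record above can be illustrated with a small power-iteration sketch. This is a generic textbook formulation, not the analysis from the paper: the function name `pagerank`, the damping factor 0.85, the fixed iteration count and the three-page toy graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to. With probability
    `damping` the surfer follows a random outgoing link; otherwise the surfer
    jumps to a page chosen uniformly at random.
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Dangling page: spread its rank uniformly over all pages.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        rank = new_rank
    return rank


# Toy graph: a three-page cycle, then the same graph with one new link c -> b.
before = pagerank({"a": ["b"], "b": ["c"], "c": ["a"]})
after = pagerank({"a": ["b"], "b": ["c"], "c": ["a", "b"]})
print(before["b"], after["b"])
```

    In this toy graph the new link from c to b raises the score of b, at the expense of a, which now receives only half of c's outgoing weight.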

  5. Generalized Reduced Rank Tests using the Singular Value Decomposition

    NARCIS (Netherlands)

    F.R. Kleibergen (Frank); R. Paap (Richard)

    2003-01-01

    We propose a novel statistic to test the rank of a matrix. The rank statistic overcomes deficiencies of existing rank statistics, like: necessity of a Kronecker covariance matrix for the canonical correlation rank statistic of Anderson (1951), sensitivity to the ordering of the variables

  6. Hadron Energy Reconstruction for ATLAS Barrel Combined Calorimeter Using Non-Parametrical Method

    CERN Document Server

    Kulchitskii, Yu A

    2000-01-01

    Hadron energy reconstruction for the ATLAS barrel prototype combined calorimeter in the framework of the non-parametrical method is discussed. The non-parametrical method utilizes only the known e/h ratios and the electron calibration constants and does not require the determination of any parameters by a minimization technique. Thus, this technique lends itself to fast energy reconstruction in a first level trigger. The reconstructed mean values of the hadron energies are within $\pm 1\%$ of the true values and the fractional energy resolution is $[(58 \pm 3)\%\,\sqrt{\mathrm{GeV}}/\sqrt{E} + (2.5 \pm 0.3)\%] \oplus (1.7 \pm 0.2)\,\mathrm{GeV}/E$. The value of the e/h ratio obtained for the electromagnetic compartment of the combined calorimeter is $1.74 \pm 0.04$. Results of a study of the longitudinal hadronic shower development are also presented.

  7. On Wasserstein Two-Sample Testing and Related Families of Nonparametric Tests

    Directory of Open Access Journals (Sweden)

    Aaditya Ramdas

    2017-01-01

    Nonparametric two-sample or homogeneity testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. The literature is old and rich, with a wide variety of statistics having been designed and analyzed, both for the unidimensional and the multivariate setting. In this short survey, we focus on test statistics that involve the Wasserstein distance. Using an entropic smoothing of the Wasserstein distance, we connect these to very different tests including multivariate methods involving energy statistics and kernel based maximum mean discrepancy and univariate methods like the Kolmogorov–Smirnov test, probability or quantile (PP/QQ) plots and receiver operating characteristic or ordinal dominance (ROC/ODC) curves. Some observations are implicit in the literature, while others seem to have not been noticed thus far. Given nonparametric two-sample testing’s classical and continued importance, we aim to provide useful connections for theorists and practitioners familiar with one subset of methods but not others.
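
    Two of the univariate statistics named in the record above can be sketched in a few lines. The code below is an illustrative assumption rather than anything taken from the survey: `wasserstein_1d` computes the order-1 Wasserstein distance between two equal-size samples as the mean absolute difference of matched order statistics (the quantity a QQ plot displays), and `ks_statistic` computes the two-sample Kolmogorov–Smirnov statistic from the empirical distribution functions. Both function names are hypothetical.

```python
from bisect import bisect_right


def wasserstein_1d(xs, ys):
    """Order-1 Wasserstein distance between two equal-size univariate samples.

    For equal sample sizes this is the mean absolute difference between the
    sorted samples, i.e. between matched empirical quantiles.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: sup_t |F_n(t) - G_m(t)|."""
    xs, ys = sorted(xs), sorted(ys)
    # The supremum is attained at one of the observed points.
    points = sorted(set(xs) | set(ys))
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in points
    )


sample_a = [0.0, 1.0, 2.0]
sample_b = [1.0, 2.0, 3.0]
print(wasserstein_1d(sample_a, sample_b), ks_statistic(sample_a, sample_b))
```

    The Wasserstein distance grows with the size of a location shift, whereas the Kolmogorov–Smirnov statistic is bounded by 1 and saturates once the samples no longer overlap.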

  8. Nonparametric NAR-ARCH Modelling of Stock Prices by the Kernel Methodology

    Directory of Open Access Journals (Sweden)

    Mohamed Chikhi

    2018-02-01

    This paper analyses the cyclical behaviour of the Orange stock price listed on the French stock exchange from 01/03/2000 to 02/02/2017 by testing for nonlinearities through a class of conditional heteroscedastic nonparametric models. The linearity and Gaussianity assumptions are rejected for Orange stock returns, and informational shocks have transitory effects on returns and volatility. The forecasting results show that Orange stock prices are short-term predictable and that the nonparametric NAR-ARCH model outperforms the parametric MA-APARCH model for short horizons. In addition, the estimates of this model are better than the predictions of the random walk model. This finding provides evidence of weak-form inefficiency in the Paris stock market with limited rationality, which gives rise to arbitrage opportunities.

  9. Bayesian Bandwidth Selection for a Nonparametric Regression Model with Mixed Types of Regressors

    Directory of Open Access Journals (Sweden)

    Xibin Zhang

    2016-04-01

    This paper develops a sampling algorithm for bandwidth estimation in a nonparametric regression model with continuous and discrete regressors under an unknown error density. The error density is approximated by the kernel density estimator of the unobserved errors, while the regression function is estimated using the Nadaraya-Watson estimator admitting continuous and discrete regressors. We derive an approximate likelihood and posterior for bandwidth parameters, followed by a sampling algorithm. Simulation results show that the proposed approach typically leads to better accuracy of the resulting estimates than cross-validation, particularly for smaller sample sizes. This bandwidth estimation approach is applied to a nonparametric regression model of the Australian All Ordinaries returns and to the kernel density estimation of gross domestic product (GDP) growth rates among Organisation for Economic Co-operation and Development (OECD) and non-OECD countries.
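
    For readers unfamiliar with the Nadaraya-Watson estimator mentioned in the record above, here is a minimal sketch for a single continuous regressor with a Gaussian kernel. It does not implement the paper's mixed continuous and discrete kernels or its Bayesian bandwidth sampler; the function names and the fixed bandwidth values are assumptions for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of `ys`, with weights K((x0 - x_i) / bandwidth).

    m_hat(x0) = sum_i K((x0 - x_i) / h) * y_i / sum_i K((x0 - x_i) / h)
    """
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.2]
# A small bandwidth follows the data closely; a large one approaches the mean of ys.
print(nadaraya_watson(2.0, xs, ys, bandwidth=0.5))
print(nadaraya_watson(2.0, xs, ys, bandwidth=50.0))
```

    The bandwidth controls the trade-off between bias and variance, which is why its selection, by cross-validation or by the sampling approach described in the record, matters so much in practice.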

  10. Bootstrap Prediction Intervals in Non-Parametric Regression with Applications to Anomaly Detection

    Science.gov (United States)

    Kumar, Sricharan; Srivistava, Ashok N.

    2012-01-01

    Prediction intervals provide a measure of the probable interval in which the outputs of a regression model can be expected to occur. Subsequently, these prediction intervals can be used to determine if the observed output is anomalous or not, conditioned on the input. In this paper, a procedure for determining prediction intervals for outputs of nonparametric regression models using bootstrap methods is proposed. Bootstrap methods allow for a non-parametric approach to computing prediction intervals with no specific assumptions about the sampling distribution of the noise or the data. The asymptotic fidelity of the proposed prediction intervals is theoretically proved. Subsequently, the validity of the bootstrap based prediction intervals is illustrated via simulations. Finally, the bootstrap prediction intervals are applied to the problem of anomaly detection on aviation data.
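
    The idea in the record above can be sketched with a simplified residual bootstrap. This is not the procedure proposed in the paper, whose details are not given in the abstract: the Gaussian-kernel smoother, the centring of residuals, the percentile rule, and the names `smooth` and `bootstrap_prediction_interval` are all assumptions chosen to keep the example short.

```python
import math
import random


def smooth(x0, xs, ys, bandwidth):
    """Gaussian-kernel weighted average of `ys` around `x0` (a simple smoother)."""
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def bootstrap_prediction_interval(x0, xs, ys, bandwidth=1.0, level=0.9,
                                  replicates=500, seed=0):
    """Percentile prediction interval at `x0` from a residual bootstrap."""
    rng = random.Random(seed)
    fitted = [smooth(x, xs, ys, bandwidth) for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    mean_residual = sum(residuals) / len(residuals)
    residuals = [r - mean_residual for r in residuals]  # centre the residuals

    draws = []
    for _ in range(replicates):
        # Rebuild a synthetic data set, refit, and add one fresh noise draw.
        y_star = [f + rng.choice(residuals) for f in fitted]
        draws.append(smooth(x0, xs, y_star, bandwidth) + rng.choice(residuals))
    draws.sort()

    alpha = (1.0 - level) / 2.0
    lower = draws[int(math.floor(alpha * (replicates - 1)))]
    upper = draws[int(math.ceil((1.0 - alpha) * (replicates - 1)))]
    return lower, upper


# Synthetic data: a linear trend with alternating +1 / -1 noise.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + (1.0 if i % 2 else -1.0) for i, x in enumerate(xs)]
lower, upper = bootstrap_prediction_interval(10.0, xs, ys)
print(lower, upper)
```

    An observation at x0 that falls outside the returned interval would be flagged as anomalous under this scheme, which mirrors the anomaly-detection use described in the record.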

  11. A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems

    Science.gov (United States)

    Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J.

    2017-06-01

    We propose a Bayesian nonparametric mixture model, based on Markov Chain Monte Carlo methods, for the reconstruction and prediction of discretized stochastic dynamical systems from observed time series data. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.

  12. Bayesian Non-Parametric Mixtures of GARCH(1,1) Models

    Directory of Open Access Journals (Sweden)

    John W. Lau

    2012-01-01

    Traditional GARCH models describe volatility levels that evolve smoothly over time, generated by a single GARCH regime. However, nonstationary time series data may exhibit abrupt changes in volatility, suggesting changes in the underlying GARCH regimes. Further, the number and times of regime changes are not always obvious. This article outlines a nonparametric mixture of GARCH models that is able to estimate the number and time of volatility regime changes by mixing over the Poisson-Kingman process. The process, a generalisation of the Dirichlet process typically used in nonparametric models for time-dependent data, provides a richer clustering structure, and its application to time series data is novel. Inference is Bayesian, and a Markov chain Monte Carlo algorithm to explore the posterior distribution is described. The methodology is illustrated on the Standard and Poor's 500 financial index.

  13. Bayesian nonparametric estimation of continuous monotone functions with applications to dose-response analysis.

    Science.gov (United States)

    Bornkamp, Björn; Ickstadt, Katja

    2009-03-01

    In this article, we consider monotone nonparametric regression in a Bayesian framework. The monotone function is modeled as a mixture of shifted and scaled parametric probability distribution functions, and a general random probability measure is assumed as the prior for the mixing distribution. We investigate the choice of the underlying parametric distribution function and find that the two-sided power distribution function is well suited both from a computational and mathematical point of view. The model is motivated by traditional nonlinear models for dose-response analysis, and provides possibilities to elicit informative prior distributions on different aspects of the curve. The method is compared with other recent approaches to monotone nonparametric regression in a simulation study and is illustrated on a data set from dose-response analysis.

  14. Promotion time cure rate model with nonparametric form of covariate effects.

    Science.gov (United States)

    Chen, Tianlei; Du, Pang

    2018-05-10

    Survival data with a cured portion are commonly seen in clinical trials. Motivated by a biological interpretation of cancer metastasis, the promotion time cure model is a popular alternative to the mixture cure rate model for analyzing such data. The existing promotion cure models all assume a restrictive parametric form of covariate effects, which can be incorrectly specified, especially at the exploratory stage. In this paper, we propose a nonparametric approach to modeling the covariate effects under the framework of the promotion time cure model. The covariate effect function is estimated by smoothing splines via the optimization of a penalized profile likelihood. Point-wise interval estimates are also derived from the Bayesian interpretation of the penalized profile likelihood. Asymptotic convergence rates are established for the proposed estimates. Simulations show excellent performance of the proposed nonparametric method, which is then applied to a melanoma study. Copyright © 2018 John Wiley & Sons, Ltd.

  15. Scalable Bayesian nonparametric measures for exploring pairwise dependence via Dirichlet Process Mixtures.

    Science.gov (United States)

    Filippi, Sarah; Holmes, Chris C; Nieto-Barajas, Luis E

    2016-11-16

    In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criterion is that the procedures should scale to large data sets. In this regard we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not feasible computationally. To address this we present Bayesian diagnostic measures for characterising evidence against a "null model" of pairwise independence. In simulation studies, as well as for a real data analysis, we show that our approach provides a useful tool for the exploratory nonparametric Bayesian analysis of large multivariate data sets.

  16. Analysing the length of care episode after hip fracture: a nonparametric and a parametric Bayesian approach.

    Science.gov (United States)

    Riihimäki, Jaakko; Sund, Reijo; Vehtari, Aki

    2010-06-01

    Effective utilisation of limited resources is a challenge for health care providers. Accurate and relevant information extracted from the length of stay distributions is useful for management purposes. Patient care episodes can be reconstructed from the comprehensive health registers, and in this paper we develop a Bayesian approach to analyse the length of care episode after a fractured hip. We model the large scale data with a flexible nonparametric multilayer perceptron network and with a parametric Weibull mixture model. To assess the performances of the models, we estimate expected utilities using predictive density as a utility measure. Since the model parameters cannot be directly compared, we focus on observables, and estimate the relevances of patient explanatory variables in predicting the length of stay. To demonstrate how the use of the nonparametric flexible model is advantageous for this complex health care data, we also study joint effects of variables in predictions, and visualise nonlinearities and interactions found in the data.

  17. A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems.

    Science.gov (United States)

    Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J

    2017-06-01

    We propose a Bayesian nonparametric mixture model, based on Markov Chain Monte Carlo methods, for the reconstruction and prediction of discretized stochastic dynamical systems from observed time series data. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.

  18. A nonparametric empirical Bayes framework for large-scale multiple testing.

    Science.gov (United States)

    Martin, Ryan; Tokdar, Surya T

    2012-07-01

    We propose a flexible and identifiable version of the 2-groups model, motivated by hierarchical Bayes considerations, that features an empirical null and a semiparametric mixture model for the nonnull cases. We use a computationally efficient predictive recursion (PR) marginal likelihood procedure to estimate the model parameters, even the nonparametric mixing distribution. This leads to a nonparametric empirical Bayes testing procedure, which we call PRtest, based on thresholding the estimated local false discovery rates. Simulations and real data examples demonstrate that, compared to existing approaches, PRtest's careful handling of the nonnull density can give a much better fit in the tails of the mixture distribution which, in turn, can lead to more realistic conclusions.

  19. Hierarchical Bayesian nonparametric mixture models for clustering with variable relevance determination.

    Science.gov (United States)

    Yau, Christopher; Holmes, Chris

    2011-07-01

    We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.

  20. A multitemporal and non-parametric approach for assessing the impacts of drought on vegetation greenness

    DEFF Research Database (Denmark)

    Carrao, Hugo; Sepulcre, Guadalupe; Horion, Stéphanie Marie Anne F

    2013-01-01

    This study evaluates the relationship between the frequency and duration of meteorological droughts and the subsequent temporal changes on the quantity of actively photosynthesizing biomass (greenness) estimated from satellite imagery on rainfed croplands in Latin America. An innovative non-parametric...... and non-supervised approach, based on the Fisher-Jenks optimal classification algorithm, is used to identify multi-scale meteorological droughts on the basis of empirical cumulative distributions of 1, 3, 6, and 12-monthly precipitation totals. As input data for the classifier, we use the gridded GPCC...... for the period between 1998 and 2010. The time-series analysis of vegetation greenness is performed during the growing season with a non-parametric method, namely the seasonal Relative Greenness (RG) of spatially accumulated fAPAR. The Global Land Cover map of 2000 and the GlobCover maps of 2005/2006 and 2009...