Tutorial: Calculating Percentile Rank and Percentile Norms Using SPSS
Baumgartner, Ted A.
2009-01-01
Practitioners can benefit from using norms, but they often have to develop their own percentile rank and percentile norms. This article is a tutorial on how to quickly and easily calculate percentile rank and percentile norms using SPSS, and this information is presented for a data set. Some issues in calculating percentile rank and percentile…
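A minimal Python sketch of one common percentile rank definition may help make the idea concrete. The tutorial itself uses SPSS, and its exact formula is not given in this abstract, so the midpoint convention below (percent of scores below a value plus half of those tied with it) is an assumption. The function names and sample scores are hypothetical.

```python
from bisect import bisect_left, bisect_right

def percentile_rank(scores, x):
    """Percent of scores below x plus half of the scores equal to x."""
    ordered = sorted(scores)
    below = bisect_left(ordered, x)            # scores strictly below x
    equal = bisect_right(ordered, x) - below   # scores tied with x
    return 100.0 * (below + 0.5 * equal) / len(ordered)

def percentile_norms(scores):
    """Map each distinct score to its percentile rank (a simple norms table)."""
    return {s: percentile_rank(scores, s) for s in sorted(set(scores))}

# Hypothetical test scores, for illustration only
scores = [10, 12, 12, 15, 18, 18, 18, 20, 25, 30]
norms = percentile_norms(scores)
```

Other conventions (for example, counting only scores at or below the value) give slightly different ranks, so the definition should be stated whenever norms are reported.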
He, Qili; Su, Guoming; Liu, Keliang; Zhang, Fangcheng; Jiang, Yong; Gao, Jun; Liu, Lida; Jiang, Zhongren; Jin, Minwu; Xie, Huiping
2017-01-01
Hematologic and biochemical analytes of Sprague-Dawley rats are commonly used to determine effects induced by treatment and to evaluate organ dysfunction in toxicological safety assessments, but reference intervals have not been well established for these analytes. Reference intervals as presently defined for these analytes in Sprague-Dawley rats have neither used internationally recommended statistical methods nor been stratified by sex. Thus, we aimed to establish sex-specific reference intervals for hematologic and biochemical parameters in Sprague-Dawley rats according to the Clinical and Laboratory Standards Institute C28-A3 and American Society for Veterinary Clinical Pathology guidelines. Hematology and biochemistry blood samples were collected from 500 healthy Sprague-Dawley rats (250 males and 250 females) in the control groups. We measured 24 hematologic analytes with the Sysmex XT-2100i analyzer and 9 biochemical analytes with the Olympus AU400 analyzer. We then determined statistically relevant sex partitions and calculated reference intervals, including corresponding 90% confidence intervals, using the nonparametric rank percentile method. We observed that most hematologic and biochemical analytes of Sprague-Dawley rats were significantly influenced by sex. Males had higher hemoglobin, hematocrit, red blood cell count, red cell distribution width, mean corpuscular volume, mean corpuscular hemoglobin, white blood cell count, neutrophils, lymphocytes, monocytes, percentage of neutrophils, percentage of monocytes, alanine aminotransferase, aspartate aminotransferase, and triglycerides compared to females. Females had higher mean corpuscular hemoglobin concentration, plateletcrit, platelet count, eosinophils, percentage of lymphocytes, percentage of eosinophils, creatinine, glucose, total cholesterol and urea compared to males. Sex partition was required for most hematologic and biochemical analytes in Sprague-Dawley rats. We established sex-specific reference
Nonparametric estimation of age-specific reference percentile curves with radial smoothing.
Wan, Xiaohai; Qu, Yongming; Huang, Yao; Zhang, Xiao; Song, Hanping; Jiang, Honghua
2012-01-01
Reference percentile curves represent the covariate-dependent distribution of a quantitative measurement and are often used to summarize and monitor dynamic processes such as human growth. We propose a new nonparametric method based on a radial smoothing (RS) technique to estimate age-specific reference percentile curves assuming the underlying distribution is relatively close to normal. We compared the RS method with both the LMS and the generalized additive models for location, scale and shape (GAMLSS) methods using simulated data and found that our method has smaller estimation error than the two existing methods. We also applied the new method to analyze height growth data from children being followed in a clinical observational study of growth hormone treatment, and compared the growth curves between those with growth disorders and the general population. Copyright © 2011 Elsevier Inc. All rights reserved.
Point and interval estimates of percentile ranks for scores on the Texas Functional Living Scale.
Crawford, John R; Cullum, C Munro; Garthwaite, Paul H; Lycett, Emma; Allsopp, Kate J
2012-01-01
Point and interval estimates of percentile ranks are useful tools in assisting with the interpretation of neurocognitive test results. We provide percentile ranks for raw subscale scores on the Texas Functional Living Scale (TFLS; Cullum, Weiner, & Saine, 2009) using the TFLS standardization sample data (N = 800). Percentile ranks with interval estimates are also provided for the overall TFLS T score. Conversion tables are provided along with the option of obtaining the point and interval estimates using a computer program written to accompany this paper (TFLS_PRs.exe). The percentile ranks for the subscales offer an alternative to using the cumulative percentage tables in the test manual and provide a useful and quick way for neuropsychologists to assimilate information on the case's profile of scores on the TFLS subscales. The provision of interval estimates for the percentile ranks is in keeping with the contemporary emphasis on the use of confidence intervals in psychological statistics.
International Conference on Robust Rank-Based and Nonparametric Methods
McKean, Joseph
2016-01-01
The contributors to this volume include many of the distinguished researchers in this area. Many of these scholars have collaborated with Joseph McKean to develop underlying theory for these methods, obtain small sample corrections, and develop efficient algorithms for their computation. The papers cover the scope of the area, including robust nonparametric rank-based procedures through Bayesian and big data rank-based analyses. Areas of application include biostatistics and spatial areas. Over the last 30 years, robust rank-based and nonparametric methods have developed considerably. These procedures generalize traditional Wilcoxon-type methods for one- and two-sample location problems. Research into these procedures has culminated in complete analyses for many of the models used in practice including linear, generalized linear, mixed, and nonlinear models. Settings are both multivariate and univariate. With the development of R packages in these areas, computation of these procedures is easily shared with r...
International Nuclear Information System (INIS)
Cho, Y.H.; Ko, H.S.; Kim, S.H.; Kang, C.S.; Moon, J.H.; Kim, K.D.
2004-01-01
The cost-effective reduction of occupational radiation dose (ORD) at a nuclear power plant cannot be achieved without an extensive analysis of accumulated ORD data from existing plants. This analysis must identify the jobs that repeatedly incur high ORD. The commonly used point value method overestimates the role of mean and median values in identifying high-ORD jobs, which can lead to misjudgment. In this study, the Percentile Rank Sum Method (PRSM), which is based on non-parametric statistical theory, is proposed to identify repetitive high-ORD jobs. As a case study, the method is applied to ORD data of maintenance and repair jobs at Kori units 3 and 4, pressurized water reactors with 950 MWe capacity that have been operated in Korea since 1986 and 1987, respectively. The results were verified and validated, and PRSM was demonstrated to be an efficient method of analyzing the data. (authors)
Wehde, M. E.
1995-01-01
The common method of digital image comparison by subtraction imposes various constraints on the image contents. Precise registration of images is required to assure proper evaluation of surface locations. The attribute being measured and the calibration and scaling of the sensor are also important to the validity and interpretability of the subtraction result. Influences of sensor gains and offsets complicate the subtraction process. The presence of any uniform systematic transformation component in one of two images to be compared distorts the subtraction results and requires analyst intervention to interpret or remove it. A new technique has been developed to overcome these constraints. Images to be compared are first transformed using the cumulative relative frequency as a transfer function. The transformed images represent the contextual relationship of each surface location with respect to all others within the image. The process of differentiating between the transformed images results in a percentile rank ordered difference. This process produces consistent terrain-change information even when the above requirements necessary for subtraction are relaxed. This technique may be valuable to an appropriately designed hierarchical terrain-monitoring methodology because it does not require human participation in the process.
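The transform described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: images are represented as small nested lists, and the function names and pixel values are hypothetical rather than taken from the paper.

```python
from bisect import bisect_right

def crf_transform(image):
    """Replace each pixel with the cumulative relative frequency of its value."""
    flat = sorted(v for row in image for v in row)
    n = len(flat)
    # bisect_right counts the pixels with a value <= v
    return [[bisect_right(flat, v) / n for v in row] for row in image]

def percentile_difference(image_a, image_b):
    """Difference of the two transformed images (b minus a), pixel by pixel."""
    ta, tb = crf_transform(image_a), crf_transform(image_b)
    return [[b - a for a, b in zip(ra, rb)] for ra, rb in zip(ta, tb)]

before = [[10, 20, 30], [40, 50, 60]]

# Same scene seen through a sensor with gain 2 and offset 5: no real change
rescaled = [[2 * v + 5 for v in row] for row in before]
no_change = percentile_difference(before, rescaled)

# Same rescaling, but pixel (0, 0) moved from darkest to second brightest
changed = [row[:] for row in rescaled]
changed[0][0] = 2 * 55 + 5
diff = percentile_difference(before, changed)
```

Because the transform depends only on the ordering of pixel values, any increasing gain and offset leaves `no_change` at zero everywhere, whereas plain subtraction would report a difference at every pixel.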
Kaltman, Jonathan R; Evans, Frank J; Danthi, Narasimhan S; Wu, Colin O; DiMichele, Donna M; Lauer, Michael S
2014-09-12
We previously demonstrated an absence of association between peer-review-derived percentile ranking and raw citation impact in a large cohort of National Heart, Lung, and Blood Institute cardiovascular R01 grants, but we did not consider pregrant investigator publication productivity. We also did not normalize citation counts for scientific field, type of article, and year of publication. The objective was to determine whether measures of investigator prior productivity predict a grant's subsequent scientific impact as measured by normalized citation metrics. We identified 1492 investigator-initiated de novo National Heart, Lung, and Blood Institute R01 grant applications funded between 2001 and 2008 and linked the publications from these grants to their InCites (Thomson Reuters) citation record. InCites provides a normalized citation count for each publication, stratifying by year of publication, type of publication, and field of science. The coprimary end points for this analysis were the normalized citation impact per million dollars allocated and the number of publications per grant that have a normalized citation rate in the top decile per million dollars allocated (top 10% articles). Prior productivity measures included the number of National Heart, Lung, and Blood Institute-supported publications each principal investigator published in the 5 years before grant review and the corresponding prior normalized citation impact score. After accounting for potential confounders, there was no association between peer-review percentile ranking and bibliometric end points (all adjusted P>0.5). However, prior productivity was predictive (P<…). … citation counts, we confirmed a lack of association between peer-review grant percentile ranking and grant citation impact. However, prior investigator publication productivity was predictive of grant-specific citation impact. © 2014 American Heart Association, Inc.
Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks
Gray-Davies, Tristan; Holmes, Chris C.; Caron, François
2018-01-01
We present a novel Bayesian nonparametric regression model for covariates X and continuous response variable Y ∈ ℝ. The model is parametrized in terms of marginal distributions for Y and X and a regression function which tunes the stochastic ordering of the conditional distributions F(y|x). By adopting an approximate composite likelihood approach, we show that the resulting posterior inference can be decoupled for the separate components of the model. This procedure can scale to very large datasets and allows standard, existing software from Bayesian nonparametric density estimation and Plackett-Luce ranking estimation to be applied. As an illustration, we show an application of our approach to a US Census dataset, with over 1,300,000 data points and more than 100 covariates. PMID:29623150
Causal inference for Mann-Whitney-Wilcoxon rank sum and other nonparametric statistics.
Wu, P; Han, Y; Chen, T; Tu, X M
2014-04-15
The nonparametric Mann-Whitney-Wilcoxon (MWW) rank sum test is widely used to test treatment effect by comparing the outcome distributions between two groups, especially when there are outliers in the data. However, such statistics generally yield invalid conclusions when applied to nonrandomized studies, particularly those in epidemiologic research. Although one may control for selection bias by using available approaches of covariates adjustment such as matching, regression analysis, propensity score matching, and marginal structural models, such analyses yield results that are not only subjective based on how the outliers are handled but also often difficult to interpret. A popular alternative is a conditional permutation test based on randomization inference [Rosenbaum PR. Covariance adjustment in randomized experiments and observational studies. Statistical Science 2002; 17(3):286-327]. Because it requires strong and implausible assumptions that may not be met in most applications, this approach has limited applications in practice. In this paper, we address this gap in the literature by extending MWW and other nonparametric statistics to provide causal inference for nonrandomized study data by integrating the potential outcome paradigm with the functional response models (FRM). FRM is uniquely positioned to model dynamic relationships between subjects, rather than attributes of a single subject as in most regression models, such as the MWW test within our context. The proposed approach is illustrated with data from both real and simulated studies. Copyright © 2013 John Wiley & Sons, Ltd.
Comparison of Rank Analysis of Covariance and Nonparametric Randomized Blocks Analysis.
Porter, Andrew C.; McSweeney, Maryellen
The relative power of three possible experimental designs under the condition that data is to be analyzed by nonparametric techniques; the comparison of the power of each nonparametric technique to its parametric analogue; and the comparison of relative powers using nonparametric and parametric techniques are discussed. The three nonparametric…
Performances of non-parametric statistics in sensitivity analysis and parameter ranking
International Nuclear Information System (INIS)
Saltelli, A.
1987-01-01
Twelve parametric and non-parametric sensitivity analysis techniques are compared in the case of non-linear model responses. The test models used are taken from the long-term risk analysis for the disposal of high level radioactive waste in a geological formation. They describe the transport of radionuclides through a set of engineered and natural barriers from the repository to the biosphere and to man. The output data from these models are the dose rates affecting the maximum exposed individual of a critical group at a given point in time. All the techniques are applied to the output from the same Monte Carlo simulations, where a modified version of the Latin Hypercube method is used for the sample selection. Hypothesis testing is systematically applied to quantify the degree of confidence in the results given by the various sensitivity estimators. The estimators are ranked according to their robustness and stability, on the basis of two test cases. The conclusions are that no estimator can be considered the best from all points of view, and that more than one estimator should be used in sensitivity analysis.
Analytic Hierarchy Process (AHP) in Ranking Non-Parametric Stochastic Rainfall and Streamflow Models
Directory of Open Access Journals (Sweden)
Masengo Ilunga
2015-08-01
Analytic Hierarchy Process (AHP) is used in the selection of categories of non-parametric stochastic models for hydrological data generation, and its formulation is based on pairwise comparisons of models. These models or techniques are obtained from a recent study initiated by the Water Research Commission of South Africa (WRC) and were compared predominantly based on their capability to extrapolate data beyond the range of historic hydrological data. The different categories of models involved in the selection process were: wavelet (A), reordering (B), K-nearest neighbor (C), kernel density (D) and bootstrap (E). In the AHP formulation, criteria for the selection of techniques are: "ability for data to preserve historic characteristics", "ability to generate new hydrological data", "scope of applicability", "presence of negative data generated" and "user friendliness". The pairwise comparisons performed through AHP showed that the overall order of selection (ranking) of models was D, C, A, B and C. The weights of these techniques were found to be 27.21%, 24.3%, 22.15%, 13.89% and 11.80% respectively. Hence, the bootstrap category received the highest preference while nearest neighbor received the lowest preference when all selection criteria are taken into consideration.
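For readers unfamiliar with how AHP turns pairwise comparisons into weights, the following Python sketch shows one widely used approximation: normalised row geometric means, which stand in for the principal eigenvector. The comparison matrix is invented for illustration and does not reproduce the study's judgments. The function name is hypothetical, and the abstract does not say which weighting procedure the authors used.

```python
import math

def ahp_weights(matrix):
    """Approximate AHP priority weights by normalised row geometric means."""
    n = len(matrix)
    geo = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo)
    return [g / total for g in geo]

# Hypothetical reciprocal matrix: entry [i][j] says how strongly
# alternative i is preferred over alternative j (1 = equal preference).
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights = ahp_weights(pairwise)
# Indices of the alternatives, from most to least preferred
ranking = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
```

For a perfectly consistent matrix this approximation returns the exact weights. For inconsistent judgments it is usually close to the eigenvector solution, though a consistency check would normally accompany a real analysis.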
Bornmann, L.; Leydesdorff, L.; Wang, J.
2013-01-01
For comparisons of citation impacts across fields and over time, bibliometricians normalize the observed citation counts with reference to an expected citation value. Percentile-based approaches have been proposed as a non-parametric alternative to parametric central-tendency statistics. Percentiles
Directory of Open Access Journals (Sweden)
Gerald R Elsworth
2017-03-01
Objective: Participant self-report data play an essential role in the evaluation of health education activities, programmes and policies. When questionnaire items do not have a clear mapping to a performance-based continuum, percentile norms are useful for communicating individual test results to users. Similarly, when assessing programme impact, the comparison of effect sizes for group differences or baseline to follow-up change with effect sizes observed in relevant normative data provides more directly useful information than statistical tests of mean differences and the evaluation of effect sizes for substantive significance using universal rules of thumb such as those for Cohen’s ‘d’. This article aims to assist managers, programme staff and clinicians of healthcare organisations who use the Health Education Impact Questionnaire to interpret their results using percentile norms for individual baseline and follow-up scores together with group effect sizes for change across the duration of a typical chronic disease self-management and support programme. Methods: Percentile norms for individual Health Education Impact Questionnaire scale scores and effect sizes for group change were calculated using freely available software for each of the eight Health Education Impact Questionnaire scales. Data used were archived responses of 2157 participants of chronic disease self-management programmes conducted by a wide range of organisations in Australia between July 2007 and March 2013. Results: Tables of percentile norms and three possible effect size benchmarks for baseline to follow-up change are provided together with two worked examples to assist interpretation. Conclusion: While the norms and benchmarks presented will be particularly relevant for Australian organisations and others using the English-language version of the Health Education Impact Questionnaire, they will also be useful for translated versions as a guide to the
Directory of Open Access Journals (Sweden)
Donald W. Zimmerman
2004-01-01
It is well known that the two-sample Student t test fails to maintain its significance level when the variances of treatment groups are unequal, and, at the same time, sample sizes are unequal. However, introductory textbooks in psychology and education often maintain that the test is robust to variance heterogeneity when sample sizes are equal. The present study discloses that, for a wide variety of non-normal distributions, especially skewed distributions, the Type I error probabilities of both the t test and the Wilcoxon-Mann-Whitney test are substantially inflated by heterogeneous variances, even when sample sizes are equal. The Type I error rate of the t test performed on ranks replacing the scores (rank-transformed data) is inflated in the same way and always corresponds closely to that of the Wilcoxon-Mann-Whitney test. For many probability densities, the distortion of the significance level is far greater after transformation to ranks and, contrary to known asymptotic properties, the magnitude of the inflation is an increasing function of sample size. Although nonparametric tests of location also can be sensitive to differences in the shape of distributions apart from location, the Wilcoxon-Mann-Whitney test and rank-transformation tests apparently are influenced mainly by skewness that is accompanied by specious differences in the means of ranks.
Dickhaus, Thorsten
2018-01-01
This textbook provides a self-contained presentation of the main concepts and methods of nonparametric statistical testing, with a particular focus on the theoretical foundations of goodness-of-fit tests, rank tests, resampling tests, and projection tests. The substitution principle is employed as a unified approach to the nonparametric test problems discussed. In addition to mathematical theory, it also includes numerous examples and computer implementations. The book is intended for advanced undergraduate, graduate, and postdoc students as well as young researchers. Readers should be familiar with the basic concepts of mathematical statistics typically covered in introductory statistics courses.
50th Percentile Rent Estimates
Department of Housing and Urban Development — Rent estimates at the 50th percentile (or median) are calculated for all Fair Market Rent areas. Fair Market Rents (FMRs) are primarily used to determine payment...
Nonparametric Multivariate Rank Tests and their Unbiasedness
Czech Academy of Sciences Publication Activity Database
Jurečková, J.; Kalina, Jan
2012-01-01
Vol. 18, No. 1 (2012), pp. 229-251. ISSN 1350-7265. Grant - others: GA ČR(CZ) GA201/09/0133; GA AV ČR(CZ) IAA101120801; GA MŠk(CZ) LC06024. Institutional research plan: CEZ:AV0Z10300504. Keywords: affine invariance; contiguity; Kolmogorov–Smirnov test; Lehmann alternatives; Liu–Singh test; Psi test; Savage test; two-sample multivariate model; unbiasedness; Wilcoxon test. Subject RIV: BB - Applied Statistics, Operational Research. Impact factor: 0.935, year: 2012
Ultrasonic Fetal Cephalometry: Percentiles Curve
Flamme, P.
1972-01-01
Measurements by ultrasound of the biparietal diameter of the fetal head during pregnancy are a reliable guide to fetal growth. As a ready means of comparison with the normal, we constructed, from 4,170 measurements in 1,394 cases, a curve showing the percentile distribution of biparietal diameters for each week of gestation. PMID:5070162
Hypothesis Testing of Population Percentiles via the Wald Test with Bootstrap Variance Estimates
Johnson, William D.; Romer, Jacob E.
2016-01-01
Testing the equality of percentiles (quantiles) between populations is an effective method for robust, nonparametric comparison, especially when the distributions are asymmetric or irregularly shaped. Unlike global nonparametric tests for homogeneity such as the Kolmogorov-Smirnov test, testing the equality of a set of percentiles (i.e., a percentile profile) yields an estimate of the location and extent of the differences between the populations along the entire domain. The Wald test using bootstrap estimates of variance of the order statistics provides a unified method for hypothesis testing of functions of the population percentiles. Simulation studies are conducted to show performance of the method under various scenarios and to give suggestions on its use. Several examples are given to illustrate some useful applications to real data. PMID:27034909
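The general idea can be sketched for the simplest case, a single percentile compared between two independent samples. The Python code below is an illustrative assumption rather than the authors' procedure: it uses linear interpolation for the sample percentile, a plain resampling bootstrap for the variance, and a normal reference distribution for the Wald statistic. All function names and the simulated data are hypothetical.

```python
import math
import random
from statistics import NormalDist

def sample_percentile(data, p):
    """p-th percentile (0 <= p <= 100) by linear interpolation of order statistics."""
    ordered = sorted(data)
    pos = (len(ordered) - 1) * p / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

def bootstrap_variance(data, p, n_boot, rng):
    """Bootstrap estimate of the variance of the sample percentile."""
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]
        estimates.append(sample_percentile(resample, p))
    mean = sum(estimates) / n_boot
    return sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)

def wald_percentile_test(x, y, p, n_boot=500, seed=0):
    """Wald z statistic and two-sided p-value for equality of the p-th percentiles."""
    rng = random.Random(seed)
    diff = sample_percentile(x, p) - sample_percentile(y, p)
    # Independent samples: the variance of the difference is the sum of variances
    var = bootstrap_variance(x, p, n_boot, rng) + bootstrap_variance(y, p, n_boot, rng)
    z = diff / math.sqrt(var)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

# Simulated samples whose medians differ by about 3 units
gen = random.Random(42)
x = [gen.gauss(0.0, 1.0) for _ in range(80)]
y = [gen.gauss(3.0, 1.0) for _ in range(80)]
z, p_value = wald_percentile_test(x, y, 50)
```

Testing a whole percentile profile, as the abstract describes, would replace the scalar statistic with a quadratic form in the vector of percentile differences and a bootstrap covariance matrix, which this sketch does not attempt.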
Nonparametric statistical inference
Gibbons, Jean Dickinson
2010-01-01
Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference. (Eugenia Stoimenova, Journal of Applied Statistics, June 2012) … one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students. (Biometrics, 67, September 2011) This excellently presente
The application of non-parametric statistical method for an ALARA implementation
International Nuclear Information System (INIS)
Cho, Young Ho; Herr, Young Hoi
2003-01-01
The cost-effective reduction of Occupational Radiation Dose (ORD) at a nuclear power plant cannot be achieved without an extensive analysis of accumulated ORD data from existing plants. This analysis must identify the jobs that repeatedly incur high ORD. In this study, the Percentile Rank Sum Method (PRSM), which is based on non-parametric statistical theory, is proposed to identify repetitive high-ORD jobs. As a case study, the method is applied to ORD data of maintenance and repair jobs at Kori units 3 and 4, pressurized water reactors with 950 MWe capacity that have been operated in Korea since 1986 and 1987, respectively. The results were verified and validated, and PRSM was demonstrated to be an efficient method of analyzing the data.
On Cooper's Nonparametric Test.
Schmeidler, James
1978-01-01
The basic assumption of Cooper's nonparametric test for trend (EJ 125 069) is questioned. It is contended that the proper assumption alters the distribution of the statistic and reduces its usefulness. (JKS)
Nonparametric statistics with applications to science and engineering
Kvam, Paul H
2007-01-01
A thorough and definitive book that fully addresses traditional and modern-day topics of nonparametric statistics. This book presents a practical approach to nonparametric statistical analysis and provides comprehensive coverage of both established and newly developed methods. With the use of MATLAB, the authors present information on theorems and rank tests in an applied fashion, with an emphasis on modern methods in regression and curve fitting, bootstrap confidence intervals, splines, wavelets, empirical likelihood, and goodness-of-fit testing. Nonparametric Statistics with Applications to Science and Engineering begins with succinct coverage of basic results for order statistics, methods of categorical data analysis, nonparametric regression, and curve fitting methods. The authors then focus on nonparametric procedures that are becoming more relevant to engineering researchers and practitioners. The important fundamental materials needed to effectively learn and apply the discussed methods are also provide...
Nonparametric Transfer Function Models
Liu, Jun M.; Chen, Rong; Yao, Qiwei
2009-01-01
In this paper a class of nonparametric transfer function models is proposed to model nonlinear relationships between ‘input’ and ‘output’ time series. The transfer function is smooth with unknown functional forms, and the noise is assumed to be a stationary autoregressive-moving average (ARMA) process. The nonparametric transfer function is estimated jointly with the ARMA parameters. By modeling the correlation in the noise, the transfer function can be estimated more efficiently. The parsimonious ARMA structure improves the estimation efficiency in finite samples. The asymptotic properties of the estimators are investigated. The finite-sample properties are illustrated through simulations and one empirical example. PMID:20628584
Applied nonparametric statistical methods
Sprent, Peter
2007-01-01
While preserving the clear, accessible style of previous editions, Applied Nonparametric Statistical Methods, Fourth Edition reflects the latest developments in computer-intensive methods that deal with intractable analytical problems and unwieldy data sets. Reorganized and with additional material, this edition begins with a brief summary of some relevant general statistical concepts and an introduction to basic ideas of nonparametric or distribution-free methods. Designed experiments, including those with factorial treatment structures, are now the focus of an entire chapter. The text also e
Bayesian nonparametric data analysis
Müller, Peter; Jara, Alejandro; Hanson, Tim
2015-01-01
This book reviews nonparametric Bayesian methods and models that have proven useful in the context of data analysis. Rather than providing an encyclopedic review of probability models, the book’s structure follows a data analysis perspective. As such, the chapters are organized by traditional data analysis problems. In selecting specific nonparametric models, simpler and more traditional models are favored over specialized ones. The discussed methods are illustrated with a wealth of examples, including applications ranging from stylized examples to case studies from recent literature. The book also includes an extensive discussion of computational methods and details on their implementation. R code for many examples is included in on-line software pages.
Birth weight reference percentiles for Chinese.
Directory of Open Access Journals (Sweden)
Li Dai
To develop a reference of population-based gestational age-specific birth weight percentiles for contemporary Chinese. Birth weight data were collected by the China National Population-based Birth Defects Surveillance System. A total of 1,105,214 live singleton births aged ≥28 weeks of gestation without birth defects during 2006-2010 were included. The lambda-mu-sigma method was utilized to generate percentiles and curves. Gestational age-specific birth weight percentiles for male and female infants were constructed separately. Significant differences were observed between the current reference and other references developed for Chinese or non-Chinese infants. There have been moderate increases in birth weight percentiles for Chinese infants of both sexes and most gestational ages since the 1980s, suggesting the importance of utilizing an updated national reference for both clinical and research purposes.
Statistical methods for ranking data
Alvo, Mayer
2014-01-01
This book introduces advanced undergraduate, graduate students and practitioners to statistical methods for ranking data. An important aspect of nonparametric statistics is oriented towards the use of ranking data. Rank correlation is defined through the notion of distance functions and the notion of compatibility is introduced to deal with incomplete data. Ranking data are also modeled using a variety of modern tools such as CART, MCMC, EM algorithm and factor analysis. This book deals with statistical methods used for analyzing such data and provides a novel and unifying approach for hypotheses testing. The techniques described in the book are illustrated with examples and the statistical software is provided on the authors’ website.
On nonparametric hazard estimation.
Hobbs, Brian P
The Nelson-Aalen estimator provides the basis for the ubiquitous Kaplan-Meier estimator, and therefore is an essential tool for nonparametric survival analysis. This article reviews martingale theory and its role in demonstrating that the Nelson-Aalen estimator is uniformly consistent for estimating the cumulative hazard function for right-censored continuous time-to-failure data.
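As a concrete illustration of the estimator this article reviews, the Nelson-Aalen cumulative hazard at time t is the sum, over distinct event times up to t, of the number of failures divided by the number still at risk. The Python sketch below uses hypothetical right-censored data and a hypothetical function name. It is a naive implementation meant only to show the arithmetic.

```python
def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct event time.

    times[i] is the observed time for subject i; events[i] is 1 for an
    observed failure and 0 for a right-censored observation.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    hazard, curve = 0.0, []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        failures = sum(1 for u, e in zip(times, events) if u == t and e)
        hazard += failures / at_risk               # increment d_i / n_i
        curve.append((t, hazard))
    return curve

# Hypothetical data: six subjects, two of them censored (at times 3 and 7)
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 0, 1, 0, 1]
curve = nelson_aalen(times, events)
```

Censored subjects contribute to the at-risk counts up to their censoring time but never add an increment, which is how the estimator accommodates right censoring.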
Sikkema, A.; Ruitenbeek, van H.C.G.; Gerritsma, W.; Wouters, P.
2014-01-01
In an academic world that is becoming ever more competitive, we are keen to know which university is ‘the best’. Various rankings readily oblige, including Times Higher Education, Shanghai, QS and Leiden. The criticism of these lists is harsh, however, partly because universities like to
Bayesian nonparametric hierarchical modeling.
Dunson, David B
2009-04-01
In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.
Nonparametric modal regression
Chen, Yen-Chi; Genovese, Christopher R.; Tibshirani, Ryan J.; Wasserman, Larry
2016-01-01
Modal regression estimates the local modes of the distribution of $Y$ given $X=x$, instead of the mean, as in the usual regression sense, and can hence reveal important structure missed by usual regression methods. We study a simple nonparametric method for modal regression, based on a kernel density estimate (KDE) of the joint distribution of $Y$ and $X$. We derive asymptotic error bounds for this method, and propose techniques for constructing confidence sets and prediction sets. The latter...
Directory of Open Access Journals (Sweden)
Telichenko Valeriy Ivanovich
2014-06-01
The article analyzes university rankings, identifies the differences in the evaluation methods and indicators used by world ranking agencies, and presents new approaches to compiling global rankings. It defines the position of MGSU in the TOP-100 ranking of Russian universities. University rankings are not simply information but an instrument for evaluating the quality of education that prompts institutions to improve their ranking position. This matters for Russian universities aiming for higher positions in the world rankings. MGSU's position in the rankings led the University administration to consider thoroughly the University's positioning in the system of higher education, in the categories of education and science, and among possible employers of its graduates.
Non-parametric approach to the study of phenotypic stability.
Ferreira, D F; Fernandes, S B; Bruzi, A T; Ramalho, M A P
2016-02-19
The aim of this study was to undertake the theoretical derivations of non-parametric methods, which use linear regressions based on rank order, for stability analyses. These methods were extensions of different parametric methods used for stability analyses, and the results were compared with a standard non-parametric method. Intensive computational methods (e.g., bootstrap and permutation) were applied, and data from the plant-breeding program of the Biology Department of UFLA (Minas Gerais, Brazil) were used to illustrate and compare the tests. The non-parametric stability methods were effective for the evaluation of phenotypic stability. In the presence of variance heterogeneity, the non-parametric methods exhibited greater power of discrimination when determining the phenotypic stability of genotypes.
Nonparametric statistical inference
Gibbons, Jean Dickinson
2014-01-01
Thoroughly revised and reorganized, the fourth edition presents in-depth coverage of the theory and methods of the most widely used nonparametric procedures in statistical analysis and offers example applications appropriate for all areas of the social, behavioral, and life sciences. The book presents new material on the quantiles, the calculation of exact and simulated power, multiple comparisons, additional goodness-of-fit tests, methods of analysis of count data, and modern computer applications using MINITAB, SAS, and STATXACT. It includes tabular guides for simplified applications of tests and finding P values and confidence interval estimates.
Nonparametric tests for data in randomised blocks with ordered alternatives
Rayner, J. C. W.; Best, D. J.
1999-01-01
For randomized block designs, nonparametric treatment comparisons are usually made using the Friedman test for complete designs, and the Durbin test for incomplete designs; see, for example, Conover (1998). This permits assessment of only the mean rankings. Such comparisons are here extended to permit assessments of bivariate effects such as the linear by linear effect and the quadratic by linear, or umbrella effect.
Relation between body mass index percentile and muscle strength ...
African Journals Online (AJOL)
Relation between body mass index percentile and muscle strength and endurance. ... Egyptian Journal of Medical Human Genetics ... They were divided into three groups according to their body mass index percentile where group (a) is equal to or more than 5% percentile yet less than 85% percentile, group (b) is equal to ...
Nonparametric combinatorial sequence models.
Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa
2011-11-01
This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.
Nonparametric tests for censored data
Bagdonavicus, Vilijandas; Nikulin, Mikhail
2013-01-01
This book concerns testing hypotheses in non-parametric models. Generalizations of many non-parametric tests to the case of censored and truncated data are considered. Most of the test results are proved, and real applications are illustrated using examples. Theories and exercises are provided. The incorrect use of many tests when applying most statistical software is highlighted and discussed.
Nonparametric Estimation of Item and Respondent Locations from Unfolding-type Items.
Johnson, Matthew S
2006-06-01
Unlike their monotone counterparts, nonparametric unfolding response models, which assume the item response function is unimodal, have seen little attention in the psychometric literature. This paper studies the nonparametric behavior of unfolding models by building on the work of Post (1992). The paper provides rigorous justification for a class of nonparametric estimators of respondents' latent attitudes by proving that the estimators consistently rank order the respondents. The paper also suggests an algorithm for the rank ordering of items along the attitudes scale. Finally, the methods are evaluated using simulated data.
Gershenson, Carlos
Studies of rank distributions have been popular for decades, especially since the work of Zipf. For example, if we rank words of a given language by use frequency (most used word in English is 'the', rank 1; second most common word is 'of', rank 2), the distribution can be approximated roughly with a power law. The same applies for cities (most populated city in a country ranks first), earthquakes, metabolism, the Internet, and dozens of other phenomena. We recently proposed "rank diversity" to measure how ranks change in time, using the Google Books Ngram dataset. Studying six languages between 1800 and 2009, we found that the rank diversity curves of languages are universal, adjusted with a sigmoid on log-normal scale. We are studying several other datasets (sports, economies, social systems, urban systems, earthquakes, artificial life). Rank diversity seems to be universal, independently of the shape of the rank distribution. I will present our work in progress towards a general description of the features of rank change in time, along with simple models which reproduce it.
Blood Pressure Percentiles for School Children
Directory of Open Access Journals (Sweden)
İsmail Özanli
2016-06-01
Objective: The prevalence of hypertension in childhood and adolescence is gradually increasing. We aimed to investigate the blood pressure (BP) values of children aged 7-18 years. Methods: This study was conducted in a total of 3375 children (1777 females, 1598 males) from 27 schools. Blood pressure was measured using a sphygmomanometer appropriate to arm circumference. Results: In both male and female children, systolic blood pressure (SBP) and diastolic blood pressure (DBP) were positively related to body weight, height, age and body mass index (BMI). SBP was higher in males than in females after the age of 13. DBP was higher in males than in females after the age of 14. The mean annual increase of SBP was 2.06 mmHg in males and 1.54 mmHg in females. The mean annual increase of DBP was 1.52 mmHg in males and 1.38 mmHg in females. Conclusion: In this study, we identified the threshold values for blood pressure in children between the ages of 7 and 18 years in Erzurum province. It is necessary to combine and evaluate data obtained from various regions to identify BP percentiles according to the age, gender and height percentiles of Turkish children.
Bessière, Christian; Hébrard, Emmanuel; Katsirelos, George; Walsh, Toby; Kiziltan, Zeynep
2016-01-01
We need to reason about rankings of objects in a wide variety of domains including information retrieval, sports tournaments, bibliometrics, and statistics. We therefore propose a global constraint for modeling rankings. One important application for rankings is in reasoning about the correlation or uncorrelation between sequences. For example, we might wish to have consecutive delivery schedules correlated to make it easier for clients and employees, or uncorrelat...
Using subjective percentiles and test data for estimating fragility functions
International Nuclear Information System (INIS)
George, L.L.; Mensing, R.W.
1981-01-01
Fragility functions are cumulative distribution functions (cdfs) of strengths at failure. They are needed for reliability analyses of systems such as power generation and transmission systems. Subjective opinions supplement sparse test data for estimating fragility functions. Often the opinions are opinions on the percentiles of the fragility function. Subjective percentiles are likely to be less biased than opinions on parameters of cdfs. Solutions to several problems in the estimation of fragility functions are found for subjective percentiles and test data. How subjective percentiles should be used to estimate subjective fragility functions, how subjective percentiles should be combined with test data, how fragility functions for several failure modes should be combined into a composite fragility function, and how inherent randomness and uncertainty due to lack of knowledge should be represented are considered. Subjective percentiles are treated as independent estimates of percentiles. The following are derived: least-squares parameter estimators for normal and lognormal cdfs, based on subjective percentiles (the method is applicable to any invertible cdf); a composite fragility function for combining several failure modes; estimators of variation within and between groups of experts for nonidentically distributed subjective percentiles; weighted least-squares estimators when subjective percentiles have higher variation at higher percents; and weighted least-squares and Bayes parameter estimators based on combining subjective percentiles and test data. 4 figures, 2 tables
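To illustrate the idea of fitting an invertible cdf to elicited percentiles, the sketch below fits a normal cdf by ordinary least squares: if an expert states that the strength at cumulative probability p is x_p, then for a normal distribution x_p = mu + sigma * z_p, where z_p is the standard normal quantile, so mu and sigma are the intercept and slope of a straight-line fit. This is an unweighted simplification written for this listing, not the authors' exact estimator, and the function name `fit_normal_from_percentiles` is hypothetical.

```python
from statistics import NormalDist


def fit_normal_from_percentiles(probs, values):
    """Least-squares (mu, sigma) from subjective percentiles.

    probs  -- cumulative probabilities in (0, 1), e.g. [0.1, 0.5, 0.9]
    values -- the elicited strengths at those probabilities
    Assumes an unweighted fit of values = mu + sigma * z.
    """
    std_normal = NormalDist()
    z = [std_normal.inv_cdf(p) for p in probs]
    n = len(z)
    z_bar = sum(z) / n
    x_bar = sum(values) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, values))
    sigma = s_zx / s_zz          # slope of the fitted line
    mu = x_bar - sigma * z_bar   # intercept of the fitted line
    return mu, sigma
```

For a lognormal fragility function, the same fit can be applied to the logarithms of the elicited values.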
Decision support using nonparametric statistics
Beatty, Warren
2018-01-01
This concise volume covers nonparametric statistics topics that are most likely to be seen and used from a practical decision support perspective. While many degree programs require a course in parametric statistics, these methods are often inadequate for real-world decision making in business environments. Much of the data collected today by business executives (for example, customer satisfaction opinions) requires nonparametric statistics for valid analysis, and this book provides the reader with a set of tools that can be used to validly analyze all data, regardless of type. Through numerous examples and exercises, this book explains why nonparametric statistics will lead to better decisions and how they are used to reach a decision, with a wide array of business applications. Online resources include exercise data, spreadsheets, and solutions.
Dwivedi, Alok Kumar; Mallawaarachchi, Indika; Alvarado, Luis A
2017-06-30
Experimental studies in biomedical research frequently pose analytical problems related to small sample size. In such studies, there are conflicting findings regarding the choice of parametric and nonparametric analysis, especially with non-normal data. In such instances, some methodologists questioned the validity of parametric tests and suggested nonparametric tests. In contrast, other methodologists found nonparametric tests to be too conservative and less powerful and thus preferred using parametric tests. Some researchers have recommended using a bootstrap test; however, this method also has a small-sample-size limitation. We used a pooled method in the nonparametric bootstrap test that may overcome the problem related to small samples in hypothesis testing. The present study compared the nonparametric bootstrap test with pooled resampling method against the corresponding parametric, nonparametric, and permutation tests through extensive simulations under various conditions and using real data examples. The nonparametric pooled bootstrap t-test provided equal or greater power for comparing two means as compared with the unpaired t-test, Welch t-test, Wilcoxon rank sum test, and permutation test while maintaining type I error probability for any conditions except for Cauchy and extreme variable lognormal distributions. In such cases, we suggest using an exact Wilcoxon rank sum test. The nonparametric bootstrap paired t-test also provided better performance than other alternatives. The nonparametric bootstrap test provided benefit over the exact Kruskal-Wallis test. We suggest using the nonparametric bootstrap test with pooled resampling method for comparing paired or unpaired means and for validating the one-way analysis of variance test results for non-normal data in small sample size studies. Copyright © 2017 John Wiley & Sons, Ltd.
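The pooled resampling idea can be sketched as follows: both samples are combined into one pool, which imposes the null hypothesis of a common distribution, and bootstrap samples of the original sizes are drawn from that pool to build a reference distribution for the t statistic. The choice of the Welch statistic, the two-sided p-value with a +1 correction, and the names `welch_t` and `pooled_bootstrap_pvalue` are assumptions of this sketch rather than details taken from the abstract.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic; returns 0.0 when both samples are constant."""
    se2 = variance(a) / len(a) + variance(b) / len(b)
    return (mean(a) - mean(b)) / se2 ** 0.5 if se2 > 0 else 0.0


def pooled_bootstrap_pvalue(a, b, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value using pooled resampling."""
    rng = random.Random(seed)
    pool = list(a) + list(b)      # pooling enforces the null hypothesis
    t_obs = abs(welch_t(a, b))
    hits = 0
    for _ in range(n_boot):
        # Resample each group, with replacement, from the common pool.
        ra = rng.choices(pool, k=len(a))
        rb = rng.choices(pool, k=len(b))
        if abs(welch_t(ra, rb)) >= t_obs:
            hits += 1
    # +1 in numerator and denominator keeps the p-value strictly positive.
    return (hits + 1) / (n_boot + 1)
```

A call such as `pooled_bootstrap_pvalue(treated, control)` (with two lists of at least two measurements each) returns a value in (0, 1]; a fixed seed is used only to make the sketch reproducible.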
EJSCREEN National Percentiles Lookup Table--2015 Public Release
U.S. Environmental Protection Agency — The USA table provides percentile breaks of important EJSCREEN elements (demographic indicators and indexes, environmental indicators and indexes) at the national...
EJSCREEN States Percentiles Lookup Table--2015 Public Release
U.S. Environmental Protection Agency — The States table provides percentile breaks of important EJSCREEN elements (demographic indicators and indexes, environmental indicators and indexes) at the state...
EJSCREEN Regions Percentiles Lookup Table--2015 Public Release
U.S. Environmental Protection Agency — The Regions table provides percentile breaks of important EJSCREEN elements (demographic indicators and indexes, environmental indicators and indexes) at the EPA...
Nonparametric estimation of ultrasound pulses
DEFF Research Database (Denmark)
Jensen, Jørgen Arendt; Leeman, Sidney
1994-01-01
An algorithm for nonparametric estimation of 1D ultrasound pulses in echo sequences from human tissues is derived. The technique is a variation of the homomorphic filtering technique using the real cepstrum, and the underlying basis of the method is explained. The algorithm exploits a priori...
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method of H.-G. Müller and U. Stadtmüller (Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337, doi: 10.1214/aos/1018031100) ...
Directory of Open Access Journals (Sweden)
Edmar Soares de Vasconcelos
2008-03-01
The objective of this work was to assess soybean genotypes for seed health, using a method of analysis that yields health indices (elimination and classification) based on non-parametric analysis. These indices consist of eliminating the genotypes with pathogen incidence above a threshold set by the researcher and then ranking the remaining genotypes by pathogen incidence. To verify its effectiveness, the method was simulated and compared with others, and it was applied to germination and seed health data for soybean cultivars and lines from the final trials of the Soybean Breeding Program of the Departamento de Fitotecnia, Universidade Federal de Viçosa, conducted in the 2002/2003 growing season. The variable weights and cutoff limits used in the indices were set taking into account studies that relate seed health to germination. The proposed indices make it possible to classify soybean genotypes by the sanitary quality of their seeds and to exclude from the analyses the genotypes that did not reach the minimum required levels.
Trusteeship, 1996
1996-01-01
Five college and university admissions officers (Richard Whiteside, Robert Laird, Jon Rivenburg, Fred A. Hargadon, Stanley E. Henderson) tell how their institutions have responded to charges that some institutions selectively report data to boost their rankings in magazines and college guides. Comments address the way data are obtained and…
Nonparametric Inference for Periodic Sequences
Sun, Ying
2012-02-01
This article proposes a nonparametric method for estimating the period and values of a periodic sequence when the data are evenly spaced in time. The period is estimated by a "leave-out-one-cycle" version of cross-validation (CV) and complements the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator of integer periods. This estimator is investigated both theoretically and by simulation. We also propose a nonparametric test of the null hypothesis that the data have constant mean against the alternative that the sequence of means is periodic. Finally, our methodology is demonstrated on three well-known time series: the sunspots and lynx trapping data, and the El Niño series of sea surface temperatures. © 2012 American Statistical Association and the American Society for Quality.
Nonparametric Econometrics: The np Package
Directory of Open Access Journals (Sweden)
Tristen Hayfield
2008-07-01
We describe the R np package via a series of applications that may be of interest to applied econometricians. The np package implements a variety of nonparametric and semiparametric kernel-based estimators that are popular among econometricians. There are also procedures for nonparametric tests of significance and consistent model specification tests for parametric mean regression models and parametric quantile regression models, among others. The np package focuses on kernel methods appropriate for the mix of continuous, discrete, and categorical data often found in applied settings. Data-driven methods of bandwidth selection are emphasized throughout, though we caution the user that data-driven bandwidth selection methods can be computationally demanding.
Nonparametric predictive inference in reliability
International Nuclear Information System (INIS)
Coolen, F.P.A.; Coolen-Schrijner, P.; Yan, K.J.
2002-01-01
We introduce a recently developed statistical approach, called nonparametric predictive inference (NPI), to reliability. Bounds for the survival function for a future observation are presented. We illustrate how NPI can deal with right-censored data, and discuss aspects of competing risks. We present possible applications of NPI for Bernoulli data, and we briefly outline applications of NPI for replacement decisions. The emphasis is on introduction and illustration of NPI in reliability contexts; detailed mathematical justifications are presented elsewhere.
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. The proposed methodology is illustrated by an empirical analysis of the US house price index data.
Nonparametric identification of copula structures
Li, Bo
2013-06-01
We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric and based on the asymptotic distribution of the empirical copula process. We perform simulation experiments to evaluate our test and conclude that our method is reliable and powerful for assessing common assumptions on the structure of copulas, particularly when the sample size is moderately large. We illustrate our testing approach on two datasets. © 2013 American Statistical Association.
On Locally Most Powerful Sequential Rank Tests
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2017-01-01
Vol. 36, No. 1 (2017), pp. 111-125. ISSN 0747-4946. R&D Projects: GA ČR GA17-07384S. Grant - others: Nadační fond na podporu vědy (CZ) Neuron. Institutional support: RVO:67985807. Keywords: nonparametric tests; sequential ranks; stopping variable. Subject RIV: BA - General Mathematics. OBOR OECD: Pure mathematics. Impact factor: 0.339 (2016).
A contingency table approach to nonparametric testing
Rayner, JCW
2000-01-01
Most texts on nonparametric techniques concentrate on location and linear-linear (correlation) tests, with less emphasis on dispersion effects and linear-quadratic tests. Tests for higher moment effects are virtually ignored. Using a fresh approach, A Contingency Table Approach to Nonparametric Testing unifies and extends the popular, standard tests by linking them to tests based on models for data that can be presented in contingency tables. This approach unifies popular nonparametric statistical inference and makes the traditional, most commonly performed nonparametric analyses much more comp
Nonparametric statistics for social and behavioral sciences
Kraska-Miller, M
2013-01-01
Introduction to Research in Social and Behavioral Sciences; Basic Principles of Research; Planning for Research; Types of Research Designs; Sampling Procedures; Validity and Reliability of Measurement Instruments; Steps of the Research Process; Introduction to Nonparametric Statistics; Data Analysis; Overview of Nonparametric Statistics and Parametric Statistics; Overview of Parametric Statistics; Overview of Nonparametric Statistics; Importance of Nonparametric Methods; Measurement Instruments; Analysis of Data to Determine Association and Agreement; Pearson Chi-Square Test of Association and Independence; Contingency
Panel data specifications in nonparametric kernel regression
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
We discuss nonparametric regression models for panel data. A fully nonparametric panel data specification that uses the time variable and the individual identifier as additional (categorical) explanatory variables is considered to be the most suitable. We use this estimator and conventional parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we found the estimates of the fully nonparametric panel data model to be more reliable.
Percentile curves for skinfold thickness for Canadian children and youth.
Kuhle, Stefan; Ashley-Martin, Jillian; Maguire, Bryan; Hamilton, David C
2016-01-01
Background. Skinfold thickness (SFT) measurements are a reliable and feasible method for assessing body fat in children, but their use and interpretation are hindered by the scarcity of reference values in representative populations of children. The objective of the present study was to develop age- and sex-specific percentile curves for five SFT measures (biceps, triceps, subscapular, suprailiac, medial calf) in a representative population of Canadian children and youth. Methods. We analyzed data from 3,938 children and adolescents between 6 and 19 years of age who participated in the Canadian Health Measures Survey cycles 1 (2007/2009) and 2 (2009/2011). Standardized procedures were used to measure SFT. Age- and sex-specific centiles for SFT were calculated using the GAMLSS method. Results. Percentile curves were materially different in absolute value and shape for boys and girls. Percentile curves in girls steadily increased with age, whereas percentile curves in boys were characterized by a pubertal centered peak. Conclusions. The current study has presented for the first time percentile curves for five SFT measures in a representative sample of Canadian children and youth.
Percentile curves for skinfold thickness for Canadian children and youth
Directory of Open Access Journals (Sweden)
Stefan Kuhle
2016-07-01
Background. Skinfold thickness (SFT) measurements are a reliable and feasible method for assessing body fat in children, but their use and interpretation are hindered by the scarcity of reference values in representative populations of children. The objective of the present study was to develop age- and sex-specific percentile curves for five SFT measures (biceps, triceps, subscapular, suprailiac, medial calf) in a representative population of Canadian children and youth. Methods. We analyzed data from 3,938 children and adolescents between 6 and 19 years of age who participated in the Canadian Health Measures Survey cycles 1 (2007/2009) and 2 (2009/2011). Standardized procedures were used to measure SFT. Age- and sex-specific centiles for SFT were calculated using the GAMLSS method. Results. Percentile curves were materially different in absolute value and shape for boys and girls. Percentile curves in girls steadily increased with age, whereas percentile curves in boys were characterized by a pubertal centered peak. Conclusions. The current study has presented for the first time percentile curves for five SFT measures in a representative sample of Canadian children and youth.
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Percentile estimation using the normal and lognormal probability distribution
International Nuclear Information System (INIS)
Bement, T.R.
1980-01-01
Implicitly or explicitly, percentile estimation is an important aspect of the analysis of aerial radiometric survey data. Standard deviation maps are produced for quadrangles that are surveyed as part of the National Uranium Resource Evaluation. These maps show where variables differ from their mean values by more than one, two or three standard deviations. Data may or may not be log-transformed prior to analysis. These maps have specific percentile interpretations only when proper distributional assumptions are met. Monte Carlo results are presented in this paper which show the consequences of estimating percentiles by: (1) assuming normality when the data are really from a lognormal distribution; and (2) assuming lognormality when the data are really from a normal distribution.
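The two competing assumptions can be made concrete with a small sketch. Under normality, the p-th percentile is estimated as the sample mean plus the standard normal quantile times the sample standard deviation; under lognormality, the same calculation is applied to the logarithms and the result is exponentiated. The function names below are hypothetical, and the sketch is a plain plug-in estimate, not the Monte Carlo study described in the abstract.

```python
import math
from statistics import NormalDist, mean, stdev


def percentile_normal(data, p):
    """Plug-in p-th percentile assuming the data are normal."""
    return mean(data) + NormalDist().inv_cdf(p) * stdev(data)


def percentile_lognormal(data, p):
    """Plug-in p-th percentile assuming the data are lognormal.

    Requires strictly positive data, since logarithms are taken.
    """
    logs = [math.log(x) for x in data]
    return math.exp(mean(logs) + NormalDist().inv_cdf(p) * stdev(logs))
```

For strongly skewed data the two estimates can differ considerably, which is the kind of discrepancy the paper's simulations quantify.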
Percentile growth charts for biomedical studies using a porcine model.
Corson, A M; Laws, J; Laws, A; Litten, J C; Lean, I J; Clarke, L
2008-12-01
Increasing rates of obesity and heart disease are compromising quality of life for a growing number of people. There is much research linking adult disease with the growth and development both in utero and during the first year of life. The pig is an ideal model for studying the origins of developmental programming. The objective of this paper was to construct percentile growth curves for the pig for use in biomedical studies. The body weight (BW) of pigs was recorded from birth to 150 days of age and their crown-to-rump length was measured over the neonatal period to enable the ponderal index (PI; kg/m^3) to be calculated. Data were normalised and percentile curves were constructed using Cole's lambda-mu-sigma (LMS) method for BW and PI. The construction of these percentile charts for use in biomedical research will allow a more detailed and precise tracking of growth and development of individual pigs under experimental conditions.
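Two quantities mentioned above lend themselves to a short worked sketch: the ponderal index, taken here as body weight divided by the cube of crown-to-rump length (matching the stated unit kg/m^3), and the standard LMS z-score, which converts a measurement into a standard deviation score given the age-specific L (skewness), M (median) and S (coefficient of variation) values. The function names and the numbers in the usage note are illustrative assumptions, not values from the paper.

```python
import math


def ponderal_index(weight_kg, length_m):
    """Ponderal index in kg/m^3: weight divided by length cubed."""
    return weight_kg / length_m ** 3


def lms_zscore(y, L, M, S):
    """Standard LMS z-score for a measurement y.

    z = ((y / M) ** L - 1) / (L * S) when L != 0,
    z = ln(y / M) / S                when L == 0.
    """
    if L == 0:
        return math.log(y / M) / S
    return ((y / M) ** L - 1) / (L * S)
```

As a usage example, a piglet weighing 1.5 kg with a crown-to-rump length of 0.25 m has a ponderal index of 96 kg/m^3, and a measurement equal to the median M always yields a z-score of 0 whatever the values of L and S.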
Nonparametric e-Mixture Estimation.
Takano, Ken; Hino, Hideitsu; Akaho, Shotaro; Murata, Noboru
2016-12-01
This study considers the common situation in data analysis when there are few observations of the distribution of interest or the target distribution, while abundant observations are available from auxiliary distributions. In this situation, it is natural to compensate for the lack of data from the target distribution by using data sets from these auxiliary distributions; in other words, approximating the target distribution in a subspace spanned by a set of auxiliary distributions. Mixture modeling is one of the simplest ways to integrate information from the target and auxiliary distributions in order to express the target distribution as accurately as possible. There are two typical mixtures in the context of information geometry: the $m$- and $e$-mixtures. The $m$-mixture is applied in a variety of research fields because of the presence of the well-known expectation-maximization algorithm for parameter estimation, whereas the $e$-mixture is rarely used because of its difficulty of estimation, particularly for nonparametric models. The $e$-mixture, however, is a well-tempered distribution that satisfies the principle of maximum entropy. To model a target distribution with scarce observations accurately, this letter proposes a novel framework for a nonparametric modeling of the $e$-mixture and a geometrically inspired estimation algorithm. As numerical examples of the proposed framework, a transfer learning setup is considered. The experimental results show that this framework works well for three types of synthetic data sets, as well as an EEG real-world data set.
Network reconstruction using nonparametric additive ODE models.
Henderson, James; Michailidis, George
2014-01-01
Network representations of biological systems are widespread and reconstructing unknown networks from data is a focal problem for computational biologists. For example, the series of biochemical reactions in a metabolic pathway can be represented as a network, with nodes corresponding to metabolites and edges linking reactants to products. In a different context, regulatory relationships among genes are commonly represented as directed networks with edges pointing from influential genes to their targets. Reconstructing such networks from data is a challenging problem receiving much attention in the literature. There is a particular need for approaches tailored to time-series data and not reliant on direct intervention experiments, as the former are often more readily available. In this paper, we introduce an approach to reconstructing directed networks based on dynamic systems models. Our approach generalizes commonly used ODE models based on linear or nonlinear dynamics by extending the functional class for the functions involved from parametric to nonparametric models. Concomitantly we limit the complexity by imposing an additive structure on the estimated slope functions. Thus the submodel associated with each node is a sum of univariate functions. These univariate component functions form the basis for a novel coupling metric that we define in order to quantify the strength of proposed relationships and hence rank potential edges. We show the utility of the method by reconstructing networks using simulated data from computational models for the glycolytic pathway of Lactococcus lactis and a gene network regulating the pluripotency of mouse embryonic stem cells. For purposes of comparison, we also assess reconstruction performance using gene networks from the DREAM challenges. We compare our method to those that similarly rely on dynamic systems models and use the results to attempt to disentangle the distinct roles of linearity, sparsity, and derivative
Decision Boundary Feature Extraction for Nonparametric Classification
Lee, Chulhee; Landgrebe, David A.
1993-01-01
Feature extraction has long been an important topic in pattern recognition. Although many authors have studied feature extraction for parametric classifiers, relatively few feature extraction algorithms are available for nonparametric classifiers. A new feature extraction algorithm based on decision boundaries for nonparametric classifiers is proposed. It is noted that feature extraction for pattern recognition is equivalent to retaining 'discriminantly informative features' and a discriminantly informative feature is related to the decision boundary. Since nonparametric classifiers do not define decision boundaries in analytic form, the decision boundary and normal vectors must be estimated numerically. A procedure to extract discriminantly informative features based on a decision boundary for non-parametric classification is proposed. Experiments show that the proposed algorithm finds effective features for the nonparametric classifier with Parzen density estimation.
Directory of Open Access Journals (Sweden)
Feng, Jinchao; Lansford, Joshua; Mironenko, Alexander; Pourkargar, Davood Babaei; Vlachos, Dionisios G.; Katsoulakis, Markos A.
2018-03-01
We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.
Relation between body mass index percentile and muscle strength ...
African Journals Online (AJOL)
Noha Abdel Kader Abdel Kader Hasan
2016-02-01
Feb 1, 2016 ... a positive correlation between muscle strength and body mass index percentile while muscle endurance time had a negative correlation with it. Conclusion: The study shows that the BMI of children had a positive correlation with the muscle ... lar force in a specific movement pattern at definite velocity.
Bayesian Nonparametric Longitudinal Data Analysis.
Quintana, Fernando A; Johnson, Wesley O; Waetjen, Elaine; Gold, Ellen
2016-01-01
Practical Bayesian nonparametric methods have been developed across a wide variety of contexts. Here, we develop a novel statistical model that generalizes standard mixed models for longitudinal data that include flexible mean functions as well as combined compound symmetry (CS) and autoregressive (AR) covariance structures. AR structure is often specified through the use of a Gaussian process (GP) with covariance functions that allow longitudinal data to be more correlated if they are observed closer in time than if they are observed farther apart. We allow for AR structure by considering a broader class of models that incorporates a Dirichlet Process Mixture (DPM) over the covariance parameters of the GP. We are able to take advantage of modern Bayesian statistical methods in making full predictive inferences and about characteristics of longitudinal profiles and their differences across covariate combinations. We also take advantage of the generality of our model, which provides for estimation of a variety of covariance structures. We observe that models that fail to incorporate CS or AR structure can result in very poor estimation of a covariance or correlation matrix. In our illustration using hormone data observed on women through the menopausal transition, biology dictates the use of a generalized family of sigmoid functions as a model for time trends across subpopulation categories.
Nonparametric Bayesian Modeling of Complex Networks
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Mørup, Morten
2013-01-01
Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using an infinite mixture model as running example, we go through the steps of deriving the model as an infinite limit of a finite parametric model, inferring the model parameters by Markov chain Monte Carlo, and checking the model's fit and predictive performance. We explain how advanced nonparametric models...
Should we use customized fetal growth percentiles in urban Canada?
Melamed, Nir; Ray, Joel G; Shah, Prakesh S; Berger, Howard; Kingdom, John C
2014-02-01
An increasingly common challenge in antenatal care of the small for gestational age (SGA) fetus is the distinction between the constitutionally (physiologically) small fetus and the fetus affected by pathological intrauterine growth restriction (IUGR). We discuss here the rationale and the evidence for the use of customized growth percentiles for the purpose of distinguishing between the fetus with true IUGR and the fetus with constitutional SGA. We also provide estimates of the potential effects of adopting ethnicity-specific birth weight curves on the rates of SGA and large for gestational age status in multi-ethnic metropolitan cities in North America and Europe, such as the City of Toronto. Using customized growth percentiles would result in a considerable decline in the rate of a false-positive diagnosis of SGA among visible minorities, and improve the detection rate of true large for gestational age fetuses among these groups.
ASSESSING PHYSICAL DEVELOPMENT OF CHILDREN WITH PERCENTILE DIAGRAMS
Directory of Open Access Journals (Sweden)
Rita R. Kildiyarova
2017-01-01
Full Text Available The results of the analysis of methods assessing anthropometric measures in children are presented. A method for visual examination of physical development using author's percentile diagrams for height, body weight, and the harmony of development of children of different age groups is offered. The method can be quickly performed, it is recommended for mass screening examination of children under outpatient treatment. To monitor the health of a specific child, a monitoring assessment of physical development is possible. The analysis of Z-score is of great clinical importance when determining anthropometric measures below the 3rd percentile, for the assessment of premature infants with congenital malformations and other diseases, in the presence of obesity. Graphical curves of body weight, height to age, body weight according to the height of boys and girls can be used by pediatricians.
Nonparametric functional mapping of quantitative trait loci.
Yang, Jie; Wu, Rongling; Casella, George
2009-03-01
Functional mapping is a useful tool for mapping quantitative trait loci (QTL) that control dynamic traits. It incorporates mathematical aspects of biological processes into the mixture model-based likelihood setting for QTL mapping, thus increasing the power of QTL detection and the precision of parameter estimation. However, in many situations there is no obvious functional form and, in such cases, this strategy will not be optimal. Here we propose to use nonparametric function estimation, typically implemented with B-splines, to estimate the underlying functional form of phenotypic trajectories, and then construct a nonparametric test to find evidence of existing QTL. Using the representation of a nonparametric regression as a mixed model, the final test statistic is a likelihood ratio test. We consider two types of genetic maps: dense maps and general maps, and the power of nonparametric functional mapping is investigated through simulation studies and demonstrated by examples.
Using Mathematica to build Non-parametric Statistical Tables
Directory of Open Access Journals (Sweden)
Gloria Perez Sainz de Rozas
2003-01-01
In this paper, I present computational procedures to obtain statistical tables: the tables of the asymptotic distribution and the exact distribution of the Kolmogorov-Smirnov statistic Dn for one population, the table of the distribution of the runs R, the table of the distribution of the Wilcoxon signed-rank statistic W+ and the table of the distribution of the Mann-Whitney statistic Ux, all using Mathematica, Version 3.9 under Windows 98. I think that it is an interesting question because many statistical packages give the asymptotic significance level in the statistical tests, and with these procedures one can easily calculate the exact significance levels and the left-tail and right-tail probabilities with non-parametric distributions. I have used Mathematica to make these calculations because one can use symbolic language to solve recursion relations. It is very easy to generate the format of the tables, and it is possible to obtain any table of the mentioned non-parametric distributions with any precision, not only with the standard parameters most used in Statistics, and without transcription mistakes. Furthermore, using similar procedures, we can generate tables for the following distribution functions: Binomial, Poisson, Hypergeometric, Normal, χ² (Chi-Square), Student's t, Snedecor's F, Geometric, Gamma and Beta.
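The recursion-based construction of exact tables can be sketched for one of the statistics named above, the Wilcoxon signed-rank statistic W+. This is an illustrative Python sketch, not the paper's Mathematica code; it assumes no ties and no zero differences, and uses the standard recursion c_n(w) = c_{n-1}(w) + c_{n-1}(w - n), where c_n(w) counts the subsets of ranks {1, ..., n} that sum to w. The function names are hypothetical.

```python
from fractions import Fraction


def signed_rank_counts(n):
    """counts[w] = number of subsets of the ranks {1, ..., n} with sum w."""
    max_w = n * (n + 1) // 2
    counts = [0] * (max_w + 1)
    counts[0] = 1  # the empty subset
    for k in range(1, n + 1):
        # Go downward so that rank k is added at most once per subset.
        for w in range(max_w, k - 1, -1):
            counts[w] += counts[w - k]
    return counts


def signed_rank_cdf(n, w):
    """Exact left-tail probability P(W+ <= w) under the null hypothesis."""
    if w < 0:
        return Fraction(0)
    counts = signed_rank_counts(n)
    w = min(w, len(counts) - 1)
    return Fraction(sum(counts[: w + 1]), 2 ** n)


# One row of an exact table for a sample of size 10.
print([str(signed_rank_cdf(10, w)) for w in range(0, 6)])
```

Using `Fraction` keeps the probabilities exact, which mirrors the abstract's point about obtaining tables to any precision without transcription mistakes.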
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
On Locally Most Powerful Sequential Rank Tests
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2017-01-01
Vol. 36, No. 1 (2017), pp. 111-125. ISSN 0747-4946. R&D Projects: GA ČR GA17-07384S. Grant - others: Nadační fond na podporu vědy (CZ) Neuron. Institutional support: RVO:67985556. Keywords: nonparametric tests; sequential ranks; stopping variable. Subject RIV: BA - General Mathematics. OECD field: Pure mathematics. Impact factor: 0.339, year: 2016. http://library.utia.cas.cz/separaty/2017/SI/kalina-0474065.pdf
Why preferring parametric forecasting to nonparametric methods?
Jabot, Franck
2015-05-07
A recent series of papers by Charles T. Perretti and collaborators has shown that nonparametric forecasting methods can outperform parametric methods in noisy nonlinear systems. Such a situation can arise for two main reasons: the instability of parametric inference procedures in chaotic systems, which can lead to biased parameter estimates, and the discrepancy between the real system dynamics and the modeled one, a problem that Perretti and collaborators call "the true model myth". Should ecologists go on using the demanding parametric machinery when trying to forecast the dynamics of complex ecosystems? Or should they rely on the elegant nonparametric approach that appears so promising? It will be argued here that ecological forecasting based on parametric models presents two key comparative advantages over nonparametric approaches. First, the likelihood of parametric forecasting failure can be diagnosed thanks to simple Bayesian model checking procedures. Second, when parametric forecasting is diagnosed to be reliable, forecasting uncertainty can be estimated on virtual data generated with the parametric model fitted to the data. In contrast, nonparametric techniques provide forecasts with unknown reliability. This argumentation is illustrated with the simple theta-logistic model that was previously used by Perretti and collaborators to make their point. It should convince ecologists to stick to standard parametric approaches, until methods have been developed to assess the reliability of nonparametric forecasting.
Jump percentile: a proposal for evaluation of high level sportsmen.
Centeno-Prada, R A; López, C; Naranjo-Orellana, J
2015-05-01
The goal of this study was to determine reference values of explosive strength for Spanish professional athletes using a force platform. Reference values are displayed as a sports-independent percentile distribution. A total of 323 elite male athletes (age: 20.38 ± 4.65 years, body mass: 75.04 ± 14.30 kg, height: 178.62 ± 14.18 cm) from different disciplines performed the following test: squat jump (SJ), countermovement jump (CMJ), Abalakov test (AB), drop jump (DJ) and repeated jumps (RJ). We calculated: relative peak power, relative peak force, maximal height, symmetry index, explosive index of strength, relative effective impulse, duration of jump, elastic capacity, eccentric time, action of arm, jump number, average height, intensity and fatigue index of force. Significant differences were found among sports disciplines (Pdisciplines in DJ variables. In RJ, the main variable characterizing the disciplines analyzed was average height, which showed a significant negative association with athletics, soccer, volleyball and gymnastics. The results obtained suggest that a percentile table may be useful in assessing explosive strength in athletes, regardless of there being any reference values available for their sports discipline.
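To make the idea of a percentile table concrete, here is a minimal sketch of a percentile rank computed against a reference sample. The mid-rank convention used (scores below plus half of the tied scores) is one common definition and is an assumption here; the abstract does not say which convention the authors used. The function name and the jump heights are hypothetical.

```python
def percentile_rank(score, reference_scores):
    """Percentile rank of `score` within `reference_scores` (0 to 100).

    Mid-rank convention (assumed): count the reference scores strictly
    below `score` plus half of those equal to it.
    """
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    below = sum(1 for s in reference_scores if s < score)
    equal = sum(1 for s in reference_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(reference_scores)


# Hypothetical countermovement-jump heights (cm) for a reference group.
reference = [31.0, 34.5, 36.0, 36.0, 38.2, 40.1, 41.7, 44.3]
print(percentile_rank(36.0, reference))
```

A sports-independent table of the kind described would simply apply this lookup to a reference sample pooled across disciplines.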
Nonparametric correlation models for portfolio allocation
DEFF Research Database (Denmark)
Aslanidis, Nektarios; Casas, Isabel
2013-01-01
This article proposes time-varying nonparametric and semiparametric estimators of the conditional cross-correlation matrix in the context of portfolio allocation. Simulation results show that the nonparametric and semiparametric models are best in DGPs with substantial variability or structural...... breaks in correlations. Only when correlations are constant does the parametric DCC model deliver the best outcome. The methodologies are illustrated by evaluating two interesting portfolios. The first portfolio consists of the equity sector SPDRs and the S&P 500, while the second one contains major...
Recent Advances and Trends in Nonparametric Statistics
Akritas, MG
2003-01-01
The advent of high-speed, affordable computers in the last two decades has given a new boost to the nonparametric way of thinking. Classical nonparametric procedures, such as function smoothing, suddenly lost their abstract flavour as they became practically implementable. In addition, many previously unthinkable possibilities became mainstream; prime examples include the bootstrap and resampling methods, wavelets and nonlinear smoothers, graphical methods, data mining, bioinformatics, as well as the more recent algorithmic approaches such as bagging and boosting. This volume is a collection o
Ranking Operations Management conferences
Steenhuis, H.J.; de Bruijn, E.J.; Gupta, Sushil; Laptaned, U
2007-01-01
Several publications have appeared in the field of Operations Management which rank Operations Management related journals. Several ranking systems exist for journals based on, for example, perceived relevance and quality, citation, and author affiliation. Many academics also publish at conferences
Nonparametric estimation in models for unobservable heterogeneity
Hohmann, Daniel
2014-01-01
Nonparametric models which allow for data with unobservable heterogeneity are studied. The first publication introduces new estimators and their asymptotic properties for conditional mixture models. The second publication considers estimation of a function from noisy observations of its Radon transform in a Gaussian white noise model.
How Are Teachers Teaching? A Nonparametric Approach
De Witte, Kristof; Van Klaveren, Chris
2014-01-01
This paper examines which configuration of teaching activities maximizes student performance. For this purpose a nonparametric efficiency model is formulated that accounts for (1) self-selection of students and teachers in better schools and (2) complementary teaching activities. The analysis distinguishes both individual teaching (i.e., a…
Application of Nonparametric Methods in Studying Energy ...
African Journals Online (AJOL)
Consumer behaviour towards different forms of energy varies over time. The variance can be so large that the quality of the estimated functional relationship between the response variable and its associated explanatory variables is seriously affected. To attenuate this, kernel smoothing, a nonparametric regression ...
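As an illustration of kernel smoothing as a nonparametric regression technique, the sketch below implements a Nadaraya-Watson estimator with a Gaussian kernel. The choice of estimator, kernel, and bandwidth are assumptions made for illustration; the truncated snippet above does not say which smoother the study used, and the data are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of ys, with weights centred on x0."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical data: response observed at five values of one explanatory variable.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.1]
print([round(nadaraya_watson(x, xs, ys, bandwidth=1.0), 3) for x in (1.5, 3.0, 4.5)])
```

The bandwidth controls the trade-off: a larger value averages over more neighbours and damps high-variance observations, at the cost of flattening genuine structure.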
Nonparametric confidence intervals for monotone functions
Groeneboom, P.; Jongbloed, G.
2015-01-01
We study nonparametric isotonic confidence intervals for monotone functions. In [Ann. Statist. 29 (2001) 1699–1731], pointwise confidence intervals, based on likelihood ratio tests using the restricted and unrestricted MLE in the current status model, are introduced. We extend the method to the
Sparse structure regularized ranking
Wang, Jim Jing-Yan
2014-04-17
Learning ranking scores is critical for the multimedia database retrieval problem. In this paper, we propose a novel ranking score learning algorithm by exploring the sparse structure and using it to regularize ranking scores. To explore the sparse structure, we assume that each multimedia object could be represented as a sparse linear combination of all other objects, and combination coefficients are regarded as a similarity measure between objects and used to regularize their ranking scores. Moreover, we propose to learn the sparse combination coefficients and the ranking scores simultaneously. A unified objective function is constructed with regard to both the combination coefficients and the ranking scores, and is optimized by an iterative algorithm. Experiments on two multimedia database retrieval data sets demonstrate the significant improvements of the proposed algorithm over state-of-the-art ranking score learning algorithms.
Nonparametric Bayes Modeling of Multivariate Categorical Data.
Dunson, David B; Xing, Chuanhua
2012-01-01
Modeling of multivariate unordered categorical (nominal) data is a challenging problem, particularly in high dimensions and cases in which one wishes to avoid strong assumptions about the dependence structure. Commonly used approaches rely on the incorporation of latent Gaussian random variables or parametric latent class models. The goal of this article is to develop a nonparametric Bayes approach, which defines a prior with full support on the space of distributions for multiple unordered categorical variables. This support condition ensures that we are not restricting the dependence structure a priori. We show this can be accomplished through a Dirichlet process mixture of product multinomial distributions, which is also a convenient form for posterior computation. Methods for nonparametric testing of violations of independence are proposed, and the methods are applied to model positional dependence within transcription factor binding motifs.
Adaptive Confidence Bands for Nonparametric Regression Functions.
Cai, T Tony; Low, Mark; Ma, Zongming
2014-01-01
A new formulation for the construction of adaptive confidence bands in non-parametric function estimation problems is proposed. Confidence bands are constructed which have size that adapts to the smoothness of the function while guaranteeing that both the relative excess mass of the function lying outside the band and the measure of the set of points where the function lies outside the band are small. It is shown that the bands adapt over a maximum range of Lipschitz classes. The adaptive confidence band can be easily implemented in standard statistical software with wavelet support. Numerical performance of the procedure is investigated using both simulated and real datasets. The numerical results agree well with the theoretical analysis. The procedure can be easily modified and used for other nonparametric function estimation models.
Nonparametric and semiparametric dynamic additive regression models
DEFF Research Database (Denmark)
Scheike, Thomas Harder; Martinussen, Torben
Dynamic additive regression models provide a flexible class of models for analysis of longitudinal data. The approach suggested in this work is suited for measurements obtained at random time points and aims at estimating time-varying effects. Both fully nonparametric and semiparametric models can...... in special cases. We investigate the finite sample properties of the estimators and conclude that the asymptotic results are valid for even small samples....
Stochastic Non-Parametric Frontier Analysis
Mohammad Rahmani; Gholamreza Jahanshahloo
2014-01-01
In this paper we develop an approach that synthesizes the best features of the two main methods in the estimation of production efficiency. Specifically, our approach first allows for statistical noise, similar to stochastic frontier analysis, and second, it allows modeling multiple-inputs-multiple-outputs technologies without imposing parametric assumptions on production relationship, similar to what is done in non-parametric methods. The methodology is based on the theory of local maximum likelih...
Nonparametric Mixture Models for Supervised Image Parcellation.
Sabuncu, Mert R; Yeo, B T Thomas; Van Leemput, Koen; Fischl, Bruce; Golland, Polina
2009-09-01
We present a nonparametric, probabilistic mixture model for the supervised parcellation of images. The proposed model yields segmentation algorithms conceptually similar to the recently developed label fusion methods, which register a new image with each training image separately. Segmentation is achieved via the fusion of transferred manual labels. We show that in our framework various settings of a model parameter yield algorithms that use image intensity information differently in determining the weight of a training subject during fusion. One particular setting computes a single, global weight per training subject, whereas another setting uses locally varying weights when fusing the training data. The proposed nonparametric parcellation approach capitalizes on recently developed fast and robust pairwise image alignment tools. The use of multiple registrations allows the algorithm to be robust to occasional registration failures. We report experiments on 39 volumetric brain MRI scans with expert manual labels for the white matter, cerebral cortex, ventricles and subcortical structures. The results demonstrate that the proposed nonparametric segmentation framework yields significantly better segmentation than state-of-the-art algorithms.
Robustifying Bayesian nonparametric mixtures for count data.
Canale, Antonio; Prünster, Igor
2017-03-01
Our motivating application stems from surveys of natural populations and is characterized by large spatial heterogeneity in the counts, which makes parametric approaches to modeling local animal abundance too restrictive. We adopt a Bayesian nonparametric approach based on mixture models and innovate with respect to the popular Dirichlet process mixture of Poisson kernels by increasing the model flexibility at the level of both the kernel and the nonparametric mixing measure. This allows us to derive accurate and robust estimates of the distribution of local animal abundance and of the corresponding clusters. The application and a simulation study for different scenarios also yield some general methodological implications. Adding flexibility solely at the level of the mixing measure does not improve inferences, since its impact is severely limited by the rigidity of the Poisson kernel, with considerable consequences in terms of bias. However, once a kernel more flexible than the Poisson is chosen, inferences can be robustified by choosing a prior more general than the Dirichlet process. Therefore, to improve the performance of Bayesian nonparametric mixtures for count data one has to enrich the model simultaneously at both levels, the kernel and the mixing measure.
Network structure exploration via Bayesian nonparametric models
International Nuclear Information System (INIS)
Chen, Y; Wang, X L; Xiang, X; Tang, B Z; Bu, J Z
2015-01-01
Complex networks provide a powerful mathematical representation of complex systems in nature and society. To understand complex networks, it is crucial to explore their internal structures, also called structural regularities. The task of network structure exploration is to determine how many groups there are in a complex network and how to group the nodes of the network. Most existing structure exploration methods need to specify either a group number or a certain type of structure when they are applied to a network. In the real world, however, the group number and also the certain type of structure that a network has are usually unknown in advance. To explore structural regularities in complex networks automatically, without any prior knowledge of the group number or the certain type of structure, we extend a probabilistic mixture model that can handle networks with any type of structure but needs to specify a group number using Bayesian nonparametric theory. We also propose a novel Bayesian nonparametric model, called the Bayesian nonparametric mixture (BNPM) model. Experiments conducted on a large number of networks with different structures show that the BNPM model is able to explore structural regularities in networks automatically with a stable, state-of-the-art performance.
Portfolio optimization based on nonparametric estimation methods
Directory of Open Access Journals (Sweden)
Mahsa Ghandehari
2017-03-01
One of the major issues investors face in capital markets is deciding on an appropriate stock exchange for investing and selecting an optimal portfolio. This process is done through the assessment of risk and expected return. In the portfolio selection problem, if the assets' expected returns are normally distributed, variance and standard deviation are used as risk measures. However, the expected returns on assets are not necessarily normal and sometimes differ dramatically from the normal distribution. This paper introduces conditional value at risk (CVaR) as a measure of risk in a nonparametric framework, offers the optimal portfolio for a given expected return, and compares this method with the linear programming method. The data used in this study consist of monthly returns of 15 companies selected from the top 50 companies in the Tehran Stock Exchange during the winter of 1392, considered from April of 1388 to June of 1393. The results of this study show the superiority of the nonparametric method over the linear programming method; the nonparametric method is also much faster than the linear programming method.
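For readers unfamiliar with CVaR, the sketch below shows the simplest nonparametric (historical) estimate: the average of the worst losses in the tail of a return sample. It is an assumption-laden illustration only; the paper's own estimator and optimization procedure are not described in the abstract, and the function name, the `tail_fraction` parameter, and the return series are hypothetical.

```python
import math


def empirical_cvar(returns, tail_fraction=0.05):
    """Historical CVaR: mean loss over the worst `tail_fraction` of outcomes.

    Losses are defined as negated returns, so a positive result is a loss.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    if not 0.0 < tail_fraction <= 1.0:
        raise ValueError("tail_fraction must be in (0, 1]")
    losses = sorted((-r for r in returns), reverse=True)
    # Round before ceil to avoid floating-point overshoot such as 2.0000000001.
    k = max(1, math.ceil(round(tail_fraction * len(losses), 9)))
    return sum(losses[:k]) / k


# Hypothetical monthly returns for one asset.
monthly_returns = [0.01, -0.05, 0.02, -0.10, 0.03, 0.00, 0.04, -0.02, 0.01, 0.02]
print(empirical_cvar(monthly_returns, tail_fraction=0.2))
```

Unlike variance, this measure makes no normality assumption and looks only at the loss tail, which is the motivation the abstract gives for using it when returns are clearly non-normal.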
Bradshaw, Corey J. A.; Brook, Barry W.
2016-01-01
There are now many methods available to assess the relative citation performance of peer-reviewed journals. Regardless of their individual faults and advantages, citation-based metrics are used by researchers to maximize the citation potential of their articles, and by employers to rank academic track records. The absolute value of any particular index is arguably meaningless unless compared to other journals, and different metrics result in divergent rankings. To provide a simple yet more objective way to rank journals within and among disciplines, we developed a κ-resampled composite journal rank incorporating five popular citation indices: Impact Factor, Immediacy Index, Source-Normalized Impact Per Paper, SCImago Journal Rank and Google 5-year h-index; this approach provides an index of relative rank uncertainty. We applied the approach to six sample sets of scientific journals from Ecology (n = 100 journals), Medicine (n = 100), Multidisciplinary (n = 50); Ecology + Multidisciplinary (n = 25), Obstetrics & Gynaecology (n = 25) and Marine Biology & Fisheries (n = 25). We then cross-compared the κ-resampled ranking for the Ecology + Multidisciplinary journal set to the results of a survey of 188 publishing ecologists who were asked to rank the same journals, and found a 0.68–0.84 Spearman’s ρ correlation between the two rankings datasets. Our composite index approach therefore approximates relative journal reputation, at least for that discipline. Agglomerative and divisive clustering and multi-dimensional scaling techniques applied to the Ecology + Multidisciplinary journal set identified specific clusters of similarly ranked journals, with only Nature & Science separating out from the others. When comparing a selection of journals within or among disciplines, we recommend collecting multiple citation-based metrics for a sample of relevant and realistic journals to calculate the composite rankings and their relative uncertainty windows. PMID:26930052
Bradshaw, Corey J A; Brook, Barry W
2016-01-01
Multi Criteria Credit Rating Model for Small Enterprise Using a Nonparametric Method
Directory of Open Access Journals (Sweden)
Guotai Chi
2017-10-01
A small enterprise’s credit rating is employed to measure its probability of defaulting on a debt, but, for small enterprises, financial data are insufficient or even unreliable. Thus, building a multi criteria credit rating model based on the qualitative and quantitative criteria is of importance to finance small enterprises’ activities. Until now, there has not been a multicriteria credit risk model based on the rank sum test and entropy weighting method. In this paper, we try to fill this gap by offering three innovative contributions. First, the rank sum test shows significant differences in the average ranks associated with index data for the default and entire sample, ensuring that an index makes an effective differentiation between the default and non-default sample. Second, the rating equation’s capacity is tested to identify the potential defaults by verifying a clear difference between the average ranks of samples with default ratings (i.e., not index values) and the entire sample. Third, in our nonparametric test, the rank sum test is used with rank correlation analysis made to screen for indices, thereby avoiding the assumption of normality associated with more common credit rating methods.
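The rank sum test that the abstract relies on can be sketched as follows. This is a generic Wilcoxon rank-sum test with a normal approximation and no tie correction in the variance, not the authors' implementation; the function name and the index values are hypothetical.

```python
from math import erfc, sqrt


def rank_sum_test(sample_a, sample_b):
    """Return (rank sum of sample_a, z statistic, two-sided p-value)."""
    pooled = [(v, 0) for v in sample_a] + [(v, 1) for v in sample_b]
    pooled.sort(key=lambda t: t[0])
    n = len(pooled)
    rank_sum_a = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        rank_sum_a += midrank * sum(1 for k in range(i, j + 1) if pooled[k][1] == 0)
        i = j + 1
    n_a, n_b = len(sample_a), len(sample_b)
    mean = n_a * (n + 1) / 2               # expected rank sum under the null
    sd = sqrt(n_a * n_b * (n + 1) / 12)    # null standard deviation (no ties)
    z = (rank_sum_a - mean) / sd
    p_two_sided = erfc(abs(z) / sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return rank_sum_a, z, p_two_sided


# Hypothetical index values for defaulting and non-defaulting enterprises.
default_values = [1.0, 2.0, 3.0]
non_default_values = [4.0, 5.0, 6.0, 7.0]
w, z, p = rank_sum_test(default_values, non_default_values)
```

A small p-value suggests that the index separates the two groups in terms of average rank, which is the screening idea the abstract describes. For samples this small an exact test would be preferable to the normal approximation.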
Dobbs, David E.
2012-01-01
This note explains how Emil Artin's proof that row rank equals column rank for a matrix with entries in a field leads naturally to the formula for the nullity of a matrix and also to an algorithm for solving any system of linear equations in any number of variables. This material could be used in any course on matrix theory or linear algebra.
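The two facts the note connects can be stated compactly. The block below is a standard formulation, assuming an $m \times n$ matrix $A$ with entries in a field; the notation is not taken from the note itself.

```latex
\[
  \operatorname{rank}_{\text{row}}(A) = \operatorname{rank}_{\text{col}}(A) = \operatorname{rank}(A),
  \qquad
  \operatorname{rank}(A) + \operatorname{nullity}(A) = n .
\]
% Example: A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \end{pmatrix}
% has rank 1 (the second row is twice the first), so nullity(A) = 3 - 1 = 2.
```

In the example, the homogeneous system $Ax = 0$ therefore has a two-dimensional solution space.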
Hoede, C.
In this paper the concept of page rank for the world wide web is discussed. The possibility of describing the distribution of page rank by an exponential law is considered. It is shown that the concept is essentially equal to that of status score, a centrality measure discussed already in 1953 by
Introduction to nonparametric statistics for the biological sciences using R
MacFarland, Thomas W
2016-01-01
This book contains a rich set of tools for nonparametric analyses, and the purpose of this supplemental text is to provide guidance to students and professional researchers on how R is used for nonparametric data analysis in the biological sciences: to introduce when nonparametric approaches to data analysis are appropriate; to introduce the leading nonparametric tests commonly used in biostatistics and how R is used to generate appropriate statistics for each test; and to introduce common figures typically associated with nonparametric data analysis and how R is used to generate appropriate figures in support of each data set. The book focuses on how R is used to distinguish between data that could be classified as nonparametric as opposed to data that could be classified as parametric, with both approaches to data classification covered extensively. Following an introductory lesson on nonparametric statistics for the biological sciences, the book is organized into eight self-contained lessons on various analyses a...
Complete hazard ranking to analyze right-censored data: An ALS survival study.
Directory of Open Access Journals (Sweden)
Zhengnan Huang
2017-12-01
Survival analysis represents an important outcome measure in clinical research and clinical trials; further, survival ranking may offer additional advantages in clinical trials. In this study, we developed GuanRank, a non-parametric ranking-based technique to transform patients' survival data into a linear space of hazard ranks. The transformation enables the utilization of machine learning base-learners including Gaussian process regression, Lasso, and random forest on survival data. The method was submitted to the DREAM Amyotrophic Lateral Sclerosis (ALS) Stratification Challenge. Ranked first place, the model gave more accurate ranking predictions on the PRO-ACT ALS dataset in comparison to the Cox proportional hazard model. By utilizing right-censored data in its training process, the method demonstrated its state-of-the-art predictive power in ALS survival ranking. Its feature selection identified multiple important factors, some of which conflict with previous studies.
Recurrent fuzzy ranking methods
Hajjari, Tayebeh
2012-11-01
With the increasing development of fuzzy set theory in various scientific fields and the need to compare fuzzy numbers in different areas, the ranking of fuzzy numbers plays a very important role in linguistic decision-making, engineering, business and some other fuzzy application systems. Several strategies have been proposed for ranking fuzzy numbers. Each of these techniques has been shown to produce non-intuitive results in certain cases. In this paper, we review some recent ranking methods, which will be useful for researchers who are interested in this area.
McKearnan, Shannon B; Wolfson, Julian; Vock, David M; Vazquez-Benitez, Gabriela; O'Connor, Patrick J
2018-01-03
The Net Reclassification Improvement (NRI) is a widely used metric for assessing the relative ability of two risk models to distinguish between low- and high-risk individuals. However, the validity and usefulness of the NRI have been questioned. Criticism of the NRI focuses on its use comparing nested risk models, whereas in practice it is often used to compare non-nested risk models derived from distinct data sources. In this study, we evaluated the performance of the NRI in a non-nested context by using it to compare competing cardiovascular risk prediction models. We explored the NRI's sensitivity to variations in risk categories and to the calibration of the compared models. We found that the NRI was very sensitive to changes in the definition of risk categories, especially when at least one model was mis-calibrated. To address these shortcomings, we describe a novel alternative to the usual NRI that uses percentiles of risk instead of cut-offs based on absolute risk. This percentile-based NRI demonstrates the relative ability of two models to rank patient risk. It displays more stable behavior, and we recommend its use when there are no established risk categories or when models are mis-calibrated. © The Author(s) 2018. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health.
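One way to read the percentile-based idea is sketched below. The abstract does not give the exact definition, so the details here are assumptions: each model's risks are binned by percentile rank within that model's own predictions (tertiles by default), and the usual categorical NRI is then computed on those bins. The function names and the six-patient data are hypothetical.

```python
def percentile_category(risks, cutoffs=(1 / 3, 2 / 3)):
    """Assign each risk a category index from its percentile rank within `risks`."""
    n = len(risks)
    order = sorted(range(n), key=lambda i: risks[i])
    categories = [0] * n
    for position, i in enumerate(order):
        pct = (position + 0.5) / n  # mid-position percentile rank in (0, 1)
        categories[i] = sum(pct > c for c in cutoffs)
    return categories


def percentile_nri(risk_old, risk_new, events, cutoffs=(1 / 3, 2 / 3)):
    """Categorical NRI where categories are percentile bins, not absolute cut-offs."""
    old = percentile_category(risk_old, cutoffs)
    new = percentile_category(risk_new, cutoffs)

    def net_upward(flag):
        idx = [i for i, e in enumerate(events) if e == flag]
        up = sum(new[i] > old[i] for i in idx)
        down = sum(new[i] < old[i] for i in idx)
        return (up - down) / len(idx)

    # Events should move up, non-events should move down.
    return net_upward(1) - net_upward(0)


# Hypothetical predicted risks from two models and observed outcomes (1 = event).
risk_old = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
risk_new = [0.05, 0.30, 0.10, 0.40, 0.20, 0.60]
events = [0, 1, 0, 1, 0, 1]
nri = percentile_nri(risk_old, risk_new, events)
```

Because only the ordering of each model's predictions enters the calculation, multiplying one model's risks by a constant (a simple form of mis-calibration) leaves the result unchanged, which is consistent with the stability the abstract reports. Tied risks are ordered arbitrarily in this sketch.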
Non-Parametric Estimation of Correlation Functions
DEFF Research Database (Denmark)
Brincker, Rune; Rytter, Anders; Krenk, Steen
In this paper three methods of non-parametric correlation function estimation are reviewed and evaluated: the direct method, estimation by the Fast Fourier Transform and finally estimation by the Random Decrement technique. The basic ideas of the techniques are reviewed, sources of bias are pointed out, and methods to prevent bias are presented. The techniques are evaluated by comparing their speed and accuracy on the simple case of estimating auto-correlation functions for the response of a single degree-of-freedom system loaded with white noise.
A Nonparametric Test for Seasonal Unit Roots
Kunst, Robert M.
2009-01-01
Abstract: We consider a nonparametric test for the null of seasonal unit roots in quarterly time series that builds on the RUR (records unit root) test by Aparicio, Escribano, and Sipols. We find that the test concept is more promising than a formalization of visual aids such as plots by quarter. In order to cope with the sensitivity of the original RUR test to autocorrelation under its null of a unit root, we suggest an augmentation step by autoregression. We present some evidence on the siz...
Comparison of a Class of Rank-Score Tests in Two-Factor Designs ...
African Journals Online (AJOL)
ABSTRACT: Rank score functions are known to be versatile and powerful techniques in factorial designs. Researchers have established the theoretical properties of these methods based on nonparametric hypotheses, but only scanty empirical results are available in the literature on these procedures. In this paper, four ...
Bayesian Nonparametric Clustering for Positive Definite Matrices.
Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos
2016-05-01
Symmetric Positive Definite (SPD) matrices emerge as data descriptors in several applications of computer vision such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these data matrices forms an integral part of these applications, for which soft-clustering algorithms (K-Means, expectation maximization, etc.) are generally used. As is well-known, these algorithms need the number of clusters to be specified, which is difficult when the dataset scales. To address this issue, we resort to the classical nonparametric Bayesian framework by modeling the data as a mixture model using the Dirichlet process (DP) prior. Since these matrices do not conform to Euclidean geometry but rather belong to a curved Riemannian manifold, existing DP models cannot be directly applied. Thus, in this paper, we propose a novel DP mixture model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure to compare these matrices, and further using the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that our model is scalable to the dataset size and at the same time achieves superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms.
A Comparison of Parametric and Non-Parametric Methods Applied to a Likert Scale.
Mircioiu, Constantin; Atkinson, Jeffrey
2017-05-10
A trenchant and passionate dispute over the use of parametric versus non-parametric methods for the analysis of Likert scale ordinal data has raged for the past eight decades. The answer is not a simple "yes" or "no" but is related to hypotheses, objectives, risks, and paradigms. In this paper, we took a pragmatic approach. We applied both types of methods to the analysis of actual Likert data on responses from different professional subgroups of European pharmacists regarding competencies for practice. Results obtained show that with "large" (>15) numbers of responses and similar (but clearly not normal) distributions from different subgroups, parametric and non-parametric analyses give in almost all cases the same significant or non-significant results for inter-subgroup comparisons. Parametric methods were more discriminant in the cases of non-similar conclusions. Considering that the largest differences in opinions occurred in the upper part of the 4-point Likert scale (ranks 3 "very important" and 4 "essential"), a "score analysis" based on this part of the data was undertaken. This transformation of the ordinal Likert data into binary scores produced a graphical representation that was visually easier to understand as differences were accentuated. In conclusion, in this case of Likert ordinal data with high response rates, restraining the analysis to non-parametric methods leads to a loss of information. The addition of parametric methods, graphical analysis, analysis of subsets, and transformation of data leads to more in-depth analyses.
Directory of Open Access Journals (Sweden)
Arda Halu
Halu, Arda; Mondragón, Raúl J; Panzarasa, Pietro; Bianconi, Ginestra
2013-01-01
Many complex systems can be described as multiplex networks in which the same nodes can interact with one another in different layers, thus forming a set of interacting and co-evolving networks. Examples of such multiplex systems are social networks where people are involved in different types of relationships and interact through various forms of communication media. The ranking of nodes in multiplex networks is one of the most pressing and challenging tasks that research on complex networks is currently facing. When pairs of nodes can be connected through multiple links and in multiple layers, the ranking of nodes should necessarily reflect the importance of nodes in one layer as well as their importance in other interdependent layers. In this paper, we draw on the idea of biased random walks to define the Multiplex PageRank centrality measure in which the effects of the interplay between networks on the centrality of nodes are directly taken into account. In particular, depending on the intensity of the interaction between layers, we define the Additive, Multiplicative, Combined, and Neutral versions of Multiplex PageRank, and show how each version reflects the extent to which the importance of a node in one layer affects the importance the node can gain in another layer. We discuss these measures and apply them to an online multiplex social network. Findings indicate that taking the multiplex nature of the network into account helps uncover the emergence of rankings of nodes that differ from the rankings obtained from one single layer. Results provide support in favor of the salience of multiplex centrality measures, like Multiplex PageRank, for assessing the prominence of nodes embedded in multiple interacting networks, and for shedding a new light on structural properties that would otherwise remain undetected if each of the interacting networks were analyzed in isolation.
Decision boundary feature selection for non-parametric classifier
Lee, Chulhee; Landgrebe, David A.
1991-01-01
Feature selection has been one of the most important topics in pattern recognition. Although many authors have studied feature selection for parametric classifiers, few algorithms are available for feature selection for nonparametric classifiers. In this paper we propose a new feature selection algorithm based on decision boundaries for nonparametric classifiers. We first note that feature selection for pattern recognition is equivalent to retaining 'discriminantly informative features', and a discriminantly informative feature is related to the decision boundary. A procedure to extract discriminantly informative features based on a decision boundary for nonparametric classification is proposed. Experiments show that the proposed algorithm finds effective features for the nonparametric classifier with Parzen density estimation.
Ranking economic history journals
DEFF Research Database (Denmark)
Di Vaio, Gianfranco; Weisdorf, Jacob Louis
2010-01-01
This study ranks, for the first time, 12 international academic journals that have economic history as their main topic. The ranking is based on data collected for the year 2007. Journals are ranked using standard citation analysis where we adjust for age, size and self-citation of journals. We also compare the leading economic history journals with the leading journals in economics in order to measure the influence on economics of economic history, and vice versa. With a few exceptions, our results confirm the general idea about which economic history journals are the most influential for economic history, and that, although economic history is quite independent from economics as a whole, knowledge exchange between the two fields is indeed going on.
DEFF Research Database (Denmark)
Frandsen, Gudmund Skovbjerg; Frandsen, Peter Frands
2009-01-01
We consider maintaining information about the rank of a matrix under changes of the entries. For n×n matrices, we show an upper bound of O(n^1.575) arithmetic operations and a lower bound of Ω(n) arithmetic operations per element change. The upper bound is valid when changing up to O(n^0.575) entries in a single column of the matrix. We also give an algorithm that maintains the rank using O(n^2) arithmetic operations per rank one update. These bounds appear to be the first nontrivial bounds for the problem. The upper bounds are valid for arbitrary fields, whereas the lower bound is valid for algebraically closed fields. The upper bound for element updates uses fast rectangular matrix multiplication, and the lower bound involves further development of an earlier technique for proving lower bounds for dynamic computation of rational functions.
Ranking Economic History Journals
DEFF Research Database (Denmark)
Di Vaio, Gianfranco; Weisdorf, Jacob Louis
This study ranks, for the first time, 12 international academic journals that have economic history as their main topic. The ranking is based on data collected for the year 2007. Journals are ranked using standard citation analysis where we adjust for age, size and self-citation of journals. We also compare the leading economic history journals with the leading journals in economics in order to measure the influence on economics of economic history, and vice versa. With a few exceptions, our results confirm the general idea about which economic history journals are the most influential for economic history, and that, although economic history is quite independent from economics as a whole, knowledge exchange between the two fields is indeed going on.
A Bayesian Nonparametric Approach to Factor Analysis
DEFF Research Database (Denmark)
Piatek, Rémi; Papaspiliopoulos, Omiros
2018-01-01
This paper introduces a new approach for the inference of non-Gaussian factor models based on Bayesian nonparametric methods. It relaxes the usual normality assumption on the latent factors, widely used in practice, which is too restrictive in many settings. Our approach, on the contrary, does not impose any particular assumptions on the shape of the distribution of the factors, but still secures the basic requirements for the identification of the model. We design a new sampling scheme based on marginal data augmentation for the inference of mixtures of normals with location and scale restrictions. This approach is augmented by the use of a retrospective sampler, to allow for the inference of a constrained Dirichlet process mixture model for the distribution of the latent factors. We carry out a simulation study to illustrate the methodology and demonstrate its benefits. Our sampler is very...
On Parametric (and Non-Parametric) Variation
Directory of Open Access Journals (Sweden)
Neil Smith
2009-11-01
This article raises the issue of the correct characterization of ‘Parametric Variation’ in syntax and phonology. After specifying their theoretical commitments, the authors outline the relevant parts of the Principles–and–Parameters framework, and draw a three-way distinction among Universal Principles, Parameters, and Accidents. The core of the contribution then consists of an attempt to provide identity criteria for parametric, as opposed to non-parametric, variation. Parametric choices must be antecedently known, and it is suggested that they must also satisfy seven individually necessary and jointly sufficient criteria. These are that they be cognitively represented, systematic, dependent on the input, deterministic, discrete, mutually exclusive, and irreversible.
Nonparametric estimation of location and scale parameters
Potgieter, C.J.
2012-12-01
Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal assumptions regarding the form of the distribution functions of X and Y. We discuss an approach to the estimation problem that is based on asymptotic likelihood considerations. Our results enable us to provide a methodology that can be implemented easily and which yields estimators that are often near optimal when compared to fully parametric methods. We evaluate the performance of the estimators in a series of Monte Carlo simulations. © 2012 Elsevier B.V. All rights reserved.
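The location-scale relationship defined in the abstract (Y and μ+σX share a distribution) can be illustrated with a deliberately simple quantile-matching sketch. This is not the asymptotic-likelihood approach of the paper: it assumes σ > 0, estimates σ by the ratio of interquartile ranges and μ by aligning the medians. The function names and data are hypothetical.

```python
from statistics import median, quantiles


def iqr(data):
    """Interquartile range using the default quartile rule of statistics.quantiles."""
    q1, _, q3 = quantiles(data, n=4)
    return q3 - q1


def location_scale_estimate(x, y):
    """Estimate (mu, sigma) such that y is distributed roughly like mu + sigma * x."""
    sigma = iqr(y) / iqr(x)                 # spread of y relative to x
    mu = median(y) - sigma * median(x)      # shift needed to align the medians
    return mu, sigma


# Hypothetical samples: y is an exact affine transform of x with mu = 2, sigma = 3.
x = [0.5, 1.0, 1.5, 2.0, 4.0, 7.0, 9.0]
y = [2 + 3 * v for v in x]
mu_hat, sigma_hat = location_scale_estimate(x, y)
```

Because only quantiles are used, no parametric form for the distributions of X and Y is assumed, at the cost of lower efficiency than likelihood-based estimators such as those the paper studies.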
Stochastic Non-Parametric Frontier Analysis
Directory of Open Access Journals (Sweden)
Mohammad Rahmani
2014-05-01
In this paper we develop an approach that synthesizes the best features of the two main methods in the estimation of production efficiency. Specifically, our approach first allows for statistical noise, similar to stochastic frontier analysis, and second, it allows modeling multiple-input, multiple-output technologies without imposing parametric assumptions on the production relationship, similar to what is done in non-parametric methods. The methodology is based on the theory of local maximum likelihood estimation and extends recent works of Kumbhakar et al. We use a local-spherical coordinate system to transform multi-input multi-output data to a more flexible system which we can use in our approach. We also illustrate the performance of our approach with a simulated example.
Nonparametric dark energy reconstruction from supernova data.
Holsclaw, Tracy; Alam, Ujjaini; Sansó, Bruno; Lee, Herbert; Heitmann, Katrin; Habib, Salman; Higdon, David
2010-12-10
Understanding the origin of the accelerated expansion of the Universe poses one of the greatest challenges in physics today. Lacking a compelling fundamental theory to test, observational efforts are targeted at a better characterization of the underlying cause. If a new form of mass-energy, dark energy, is driving the acceleration, the redshift evolution of the equation of state parameter w(z) will hold essential clues as to its origin. To best exploit data from observations it is necessary to develop a robust and accurate reconstruction approach, with controlled errors, for w(z). We introduce a new, nonparametric method for solving the associated statistical inverse problem based on Gaussian process modeling and Markov chain Monte Carlo sampling. Applying this method to recent supernova measurements, we reconstruct the continuous history of w out to redshift z=1.5.
Decompounding random sums: A nonparametric approach
DEFF Research Database (Denmark)
Hansen, Martin Bøgsted; Pitts, Susan M.
Observations from sums of random variables with a random number of summands, known as random, compound or stopped sums arise within many areas of engineering and science. Quite often it is desirable to infer properties of the distribution of the terms in the random sum. In the present paper we review a number of applications and consider the nonlinear inverse problem of inferring the cumulative distribution function of the components in the random sum. We review the existing literature on non-parametric approaches to the problem. The models amenable to the analysis are generalized considerably and the properties of the proposed estimators are studied. Bootstrap methods are suggested to provide confidence bounds. Finally a number of algorithms are suggested to make the methods operational and tested on simulated data. In particular we show how Panjer recursion in general can be inverted for the Panjer...
Nonparametric predictive pairwise comparison with competing risks
International Nuclear Information System (INIS)
Coolen-Maturi, Tahani
2014-01-01
In reliability, failure data often correspond to competing risks, where several failure modes can cause a unit to fail. This paper presents nonparametric predictive inference (NPI) for pairwise comparison with competing risks data, assuming that the failure modes are independent. These failure modes could be the same or different between the two groups, and these can be both observed and unobserved failure modes. NPI is a statistical approach based on few assumptions, with inferences strongly based on data and with uncertainty quantified via lower and upper probabilities. The focus is on the lower and upper probabilities for the event that the lifetime of a future unit from one group, say Y, is greater than the lifetime of a future unit from the second group, say X. The paper also shows how the two groups can be compared based on particular failure mode(s), and the comparison of the two groups when some of the competing risks are combined is discussed.
Nonparametric inference of network structure and dynamics
Peixoto, Tiago P.
The network structure of complex systems determines their function and serves as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion, that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among
Bayesian Plackett-Luce Mixture Models for Partially Ranked Data.
Mollica, Cristina; Tardella, Luca
2017-06-01
The elicitation of an ordinal judgment on multiple alternatives is often required in many psychological and behavioral experiments to investigate preference/choice orientation of a specific population. The Plackett-Luce model is one of the most popular and frequently applied parametric distributions to analyze rankings of a finite set of items. The present work introduces a Bayesian finite mixture of Plackett-Luce models to account for unobserved sample heterogeneity of partially ranked data. We describe an efficient way to incorporate the latent group structure in the data augmentation approach and the derivation of existing maximum likelihood procedures as special instances of the proposed Bayesian method. Inference can be conducted with the combination of the Expectation-Maximization algorithm for maximum a posteriori estimation and the Gibbs sampling iterative procedure. We additionally investigate several Bayesian criteria for selecting the optimal mixture configuration and describe diagnostic tools for assessing the fitness of ranking distributions conditionally and unconditionally on the number of ranked items. The utility of the novel Bayesian parametric Plackett-Luce mixture for characterizing sample heterogeneity is illustrated with several applications to simulated and real preference ranked data. We compare our method with the frequentist approach and a Bayesian nonparametric mixture model both assuming the Plackett-Luce model as a mixture component. Our analysis on real datasets reveals the importance of an accurate diagnostic check for an appropriate in-depth understanding of the heterogeneous nature of the partial ranking data.
A Structural Labor Supply Model with Nonparametric Preferences
van Soest, A.H.O.; Das, J.W.M.; Gong, X.
2000-01-01
Nonparametric techniques are usually seen as a statistical device for data description and exploration, and not as a tool for estimating models with a richer economic structure, which are often required for policy analysis. This paper presents an example where nonparametric flexibility can be attained
A note on Nonparametric Confidence Interval for a Shift Parameter ...
African Journals Online (AJOL)
In this article an application of a kernel based nonparametric approach in constructing a large sample nonparametric confidence interval for a shift parameter is considered. The method is illustrated using the Cauchy distribution as a location model. The kernel-based method is found to have a shorter interval for the shift ...
Simple nonparametric checks for model data fit in CAT
Meijer, R.R.
2005-01-01
In this paper, the usefulness of several nonparametric checks is discussed in a computerized adaptive testing (CAT) context. Although there is no tradition of nonparametric scalability in CAT, it can be argued that scalability checks can be useful to investigate, for example, the quality of item
Testing the race model inequality : a nonparametric approach
Maris, G.K.J.; Maris, E.
2003-01-01
This paper introduces a nonparametric procedure for testing the race model explanation of the redundant signals effect. The null hypothesis is the race model inequality derived from the race model by Miller (Cognitive Psychol. 14 (1982) 247). The construction of a nonparametric test is made possible
Nonparametric analysis of blocked ordered categories data: some examples revisited
Directory of Open Access Journals (Sweden)
O. Thas
2006-08-01
Nonparametric analysis for general block designs can be given by using the Cochran-Mantel-Haenszel (CMH) statistics. We demonstrate this with four examples and note that several well-known nonparametric statistics are special cases of CMH statistics.
Diversifying customer review rankings.
Krestel, Ralf; Dokoohaki, Nima
2015-06-01
E-commerce Web sites owe much of their popularity to consumer reviews accompanying product descriptions. On-line customers spend hours and hours going through heaps of textual reviews to decide which products to buy. At the same time, each popular product has thousands of user-generated reviews, making it impossible for a buyer to read everything. Current approaches to display reviews to users or recommend an individual review for a product are based on the recency or helpfulness of each review. In this paper, we present a framework to rank product reviews by optimizing the coverage of the ranking with respect to sentiment or aspects, or by summarizing all reviews with the top-K reviews in the ranking. To accomplish this, we make use of the assigned star rating for a product as an indicator for a review's sentiment polarity and compare bag-of-words (language model) with topic models (latent Dirichlet allocation) as a means to represent aspects. Our evaluation on manually annotated review data from a commercial review Web site demonstrates the effectiveness of our approach, outperforming plain recency ranking by 30% and obtaining best results by combining language and topic model representations. Copyright © 2015 Elsevier Ltd. All rights reserved.
DEFF Research Database (Denmark)
Müller, Emmanuel; Assent, Ira; Steinhausen, Uwe
2008-01-01
Outlier detection is an important data mining task for consistency checks, fraud detection, etc. Binary decision making on whether or not an object is an outlier is not appropriate in many applications and moreover hard to parametrize. Thus, recently, methods for outlier ranking have been proposed...
Bias and imprecision in posture percentile variables estimated from short exposure samples
Directory of Open Access Journals (Sweden)
Mathiassen Svend Erik
2012-03-01
Background: Upper arm postures are believed to be an important risk determinant for musculoskeletal disorder development in the neck and shoulders. The 10th and 90th percentiles of the angular elevation distribution have been reported in many studies as measures of neutral and extreme postural exposures, and variation has been quantified by the 10th-90th percentile range. Further, the 50th percentile is commonly reported as a measure of "average" exposure. These four variables have been estimated using samples of observed or directly measured postures, typically using sampling durations between 5 and 120 min. Methods: The present study examined the statistical properties of estimated full-shift values of the 10th, 50th and 90th percentile and the 10th-90th percentile range of right upper arm elevation obtained from samples of seven different durations, ranging from 5 to 240 min. The sampling strategies were realized by simulation, using a parent data set of 73 full-shift, continuous inclinometer recordings among hairdressers. For each shift, sampling duration and exposure variable, the mean, standard deviation and sample dispersion limits (2.5% and 97.5%) of all possible sample estimates obtained at one minute intervals were calculated and compared to the true full-shift exposure value. Results: Estimates of the 10th percentile proved to be upward biased with limited sampling, and those of the 90th percentile and the percentile range, downward biased. The 50th percentile was also slightly upward biased. For all variables, bias was more severe with shorter sampling durations, and it correlated significantly with the true full-shift value for the 10th and 90th percentiles and the percentile range. As expected, shorter samples led to decreased precision of the estimate; sample standard deviations correlated strongly with true full-shift exposure values. Conclusions: The documented risk of pronounced bias and low precision of percentile
2nd Conference of the International Society for Nonparametric Statistics
Manteiga, Wenceslao; Romo, Juan
2016-01-01
This volume collects selected, peer-reviewed contributions from the 2nd Conference of the International Society for Nonparametric Statistics (ISNPS), held in Cádiz (Spain) between June 11–16 2014, and sponsored by the American Statistical Association, the Institute of Mathematical Statistics, the Bernoulli Society for Mathematical Statistics and Probability, the Journal of Nonparametric Statistics and Universidad Carlos III de Madrid. The 15 articles are a representative sample of the 336 contributed papers presented at the conference. They cover topics such as high-dimensional data modelling, inference for stochastic processes and for dependent data, nonparametric and goodness-of-fit testing, nonparametric curve estimation, object-oriented data analysis, and semiparametric inference. The aim of the ISNPS 2014 conference was to bring together recent advances and trends in several areas of nonparametric statistics in order to facilitate the exchange of research ideas, promote collaboration among researchers...
Ranking Cases with Classification Rules
Zhang, Jianping; Bala, Jerzy W.; Hadjarian, Ali; Han, Brent
Many real-world machine learning applications require a ranking of cases, in addition to their classification. While classification rules are not a good representation for ranking, the human comprehensibility aspect of rules makes them an attractive option for many ranking problems where such model transparency is desired. There have been numerous studies on ranking with decision trees, but not many on ranking with decision rules. Although rules are similar to decision trees in many respects, there are important differences between them when used for ranking. In this chapter, we propose a framework for ranking with rules. The framework extends and substantially improves on the reported methods for ranking with decision trees. It introduces three types of rule-based ranking methods: post analysis of rules, hybrid methods, and multiple rule set analysis. We also study the impact of rule learning bias on the ranking performance. While traditional measures used for ranking performance evaluation tend to focus on the entire rank ordered list, the aim of many ranking applications is to optimize the performance on only a small portion of the top ranked cases. Accordingly, we propose a simple method for measuring the performance of a classification or ranking algorithm that focuses on these top ranked cases. Empirical studies have been conducted to evaluate some of the proposed methods.
Ross, Sarah G; Begeny, John C
2014-08-01
Growing from demands for accountability and research-based practice in the field of education, there is recent focus on developing standards for the implementation and analysis of single-case designs. Effect size methods for single-case designs provide a useful way to discuss treatment magnitude in the context of individual intervention. Although a standard effect size methodology does not yet exist within single-case research, panel experts recently recommended pairing regression and non-parametric approaches when analyzing effect size data. This study compared two single-case effect size methods: the regression-based, Allison-MT method and the newer, non-parametric, Tau-U method. Using previously published research that measured the Words read Correct per Minute (WCPM) variable, these two methods were examined by comparing differences in overall effect size scores and rankings of intervention effect. Results indicated that the regression method produced significantly larger effect sizes than the non-parametric method, but the rankings of the effect size scores had a strong, positive relation. Implications of these findings for research and practice are discussed. Copyright © 2014 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Two-stage meta-analysis of survival data from individual participants using percentile ratios
Barrett, Jessica K; Farewell, Vern T; Siannis, Fotios; Tierney, Jayne; Higgins, Julian P T
2012-01-01
Methods for individual participant data meta-analysis of survival outcomes commonly focus on the hazard ratio as a measure of treatment effect. Recently, Siannis et al. (2010, Statistics in Medicine 29:3030–3045) proposed the use of percentile ratios as an alternative to hazard ratios. We describe a novel two-stage method for the meta-analysis of percentile ratios that avoids distributional assumptions at the study level. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22825835
Directory of Open Access Journals (Sweden)
Azam Zaka
2014-10-01
This paper is concerned with the modifications of maximum likelihood, moments and percentile estimators of the two parameter Power function distribution. Sampling behavior of the estimators is indicated by Monte Carlo simulation. For some combinations of parameter values, some of the modified estimators appear better than the traditional maximum likelihood, moments and percentile estimators with respect to bias, mean square error and total deviation.
Vale, Beatriz; Brito, Sara; Paulos, Lígia; Moleiro, Pascoal
2014-01-01
Objective: To analyse the progression of body mass index in eating disorders and to determine the percentile for establishment and resolution of the disease. Methods: A retrospective descriptive cross-sectional study. Review of clinical files of adolescents with eating disorders. Results: Of the 62 female adolescents studied with eating disorders, 51 presented with eating disorder not otherwise specified, 10 anorexia nervosa, and 1 bulimia nervosa. Twenty-one of these adolescents had menstrual disorders; of these, 14 had secondary amenorrhea and 7 menstrual irregularities (6 eating disorder not otherwise specified, and 1 bulimia nervosa). On average, in anorectic adolescents, the initial body mass index was in the 75th percentile; secondary amenorrhea was established 1 month after onset of the disease; minimum weight was 76.6% of ideal body mass index (at the 4th percentile) at 10.2 months of disease; and resolution of amenorrhea occurred at 24 months, with average weight recovery of 93.4% of the ideal. In eating disorder not otherwise specified with menstrual disorder (n=10), the mean initial body mass index was at the 85th percentile; minimal weight was on average 97.7% of the ideal value (minimum body mass index was in the 52nd percentile) at 14.9 months of disease; body mass index stabilization occurred at 1.6 years of disease; and mean body mass index was in the 73rd percentile. Considering eating disorder not otherwise specified with secondary amenorrhea (n=4), secondary amenorrhea occurred at 4 months, with resolution at 12 months of disease (mean 65th percentile body mass index). Conclusion: One-third of the eating disorder group had a menstrual disorder; two-thirds of these presented with amenorrhea. This study indicated that, for the resolution of their menstrual disturbance, the body mass index percentiles to be achieved by female adolescents with eating disorders were 25–50 in anorexia nervosa, and 50–75 in eating disorder not otherwise specified. PMID:25003922
Early efficacy of the ketogenic diet is not affected by initial body mass index percentile.
Shull, Shastin; Diaz-Medina, Gloria; Wong-Kisiel, Lily; Nickels, Katherine; Eckert, Susan; Wirrell, Elaine
2014-05-01
Predictors of the ketogenic diet's success in treating pediatric intractable epilepsy are not well understood. The aim of this study was to determine whether initial body mass index and weight percentile impact early efficacy of the traditional ketogenic diet in children initiating therapy for intractable epilepsy. This retrospective study included all children initiating the ketogenic diet at Mayo Clinic, Rochester from January 2001 to December 2010 who had body mass index (children ≥2 years of age) or weight percentile (those diet initiation and seizure frequency recorded at diet initiation and one month. Responders were defined as achieving a >50% seizure reduction from baseline. Our cohort consisted of 48 patients (20 male) with a median age of 3.1 years. There was no significant correlation between initial body mass index or weight percentile and seizure frequency reduction at one month (P = 0.72, r = 0.26 and P = 0.91, r = 0.03). There was no significant association between body mass index or weight percentile quartile and responder rates (P = 0.21 and P = 0.57). Children considered overweight or obese at diet initiation (body mass index or weight percentile ≥85) did not have lower responder rates than those with body mass index or weight percentiles ketogenic diet. Copyright © 2014 Elsevier Inc. All rights reserved.
Analysis and Extension of the Percentile Method, Estimating a Noise Curve from a Single Image
Directory of Open Access Journals (Sweden)
Miguel Colom
2013-12-01
Given a white Gaussian noise signal on a sampling grid, its variance can be estimated from a small block sample. However, in natural images we observe the combination of the geometry of the scene being photographed and the added noise. In this case, estimating directly the standard deviation of the noise from block samples is not reliable since the measured standard deviation is not explained just by the noise but also by the geometry of the image. The Percentile method tries to estimate the standard deviation of the noise from blocks of a high-passed version of the image and a small p-percentile of these standard deviations. The idea behind it is that edges and textures in a block of the image increase the observed standard deviation but they never make it decrease. Therefore, a small percentile (0.5%, for example) in the list of standard deviations of the blocks is less likely to be affected by the edges and textures than a higher percentile (50%, for example). The 0.5%-percentile is empirically proven to be adequate for most natural, medical and microscopy images. The Percentile method is adapted to signal-dependent noise, which is realistic with the Poisson noise model obtained by a CCD device in a digital camera.
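The core idea described in this abstract (high-pass the image, take block standard deviations, keep a small percentile) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the published algorithm: the high-pass filter here is a plain horizontal pixel difference, the block size of 8 is arbitrary, and the sketch omits both the bias-correction step and the per-intensity binning that a signal-dependent version would need. The function name `percentile_noise_estimate` is hypothetical.

```python
import numpy as np

def percentile_noise_estimate(image, block=8, p=0.5):
    """Rough noise standard deviation estimate from a low percentile of block stds.

    image: 2-D array of grey levels.
    block: side length of the square blocks (assumed value).
    p:     percentile in [0, 100]; 0.5 follows the abstract's example.
    """
    img = np.asarray(image, dtype=float)
    # Assumed high-pass filter: difference of horizontally adjacent pixels.
    # Dividing by sqrt(2) keeps the standard deviation of i.i.d. noise unchanged.
    hp = (img[:, 1:] - img[:, :-1]) / np.sqrt(2.0)
    h, w = hp.shape
    h -= h % block  # crop so the image tiles exactly into blocks
    w -= w % block
    blocks = hp[:h, :w].reshape(h // block, block, w // block, block)
    stds = blocks.std(axis=(1, 3), ddof=1)  # one standard deviation per block
    # Edges and textures can only inflate a block's std, so a low percentile
    # is dominated by blocks that contain mostly noise.
    return float(np.percentile(stds, p))

# Synthetic check on pure Gaussian noise with a known sigma of 5.
rng = np.random.default_rng(0)
noisy = rng.normal(0.0, 5.0, size=(256, 256))
print(percentile_noise_estimate(noisy, p=50))   # median of block stds
print(percentile_noise_estimate(noisy, p=0.5))  # low percentile, biased downward
```

On pure noise the low percentile underestimates the true value, which is why a correction factor is needed in practice; that factor is not reproduced here.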
Fernández, J R; Bohan Brown, M; López-Alarcón, M; Dawson, J A; Guo, F; Redden, D T; Allison, D B
2017-10-01
Obesity is a global health concern but the United States has reported a leveling in obesity rates in the pediatric population. To provide updated waist circumference (WC) percentile values, identify differences across time and discuss differences within the context of reported weight stabilization in a nationally representative sample of American children. Percentiles for WC in self-identified African Americans (AA), European Americans (EA) and Mexican Americans (MA) were obtained from 2009-2014 National Health and Nutrition Examination Survey data (NHANES2014). Descriptive trends across time in 10th, 25th, 50th, 75th and 90th percentile WC distributions were identified by comparing NHANES2012 with previously reported NHANESIII (1988-1994). WC increased in a monotonic fashion in AA, EA and MA boys and girls. When compared with NHANESIII data, a clear left shift of percentile categories was observed such that values that used to be in the 90th percentile are now in the 85th percentile. Differences in WC were observed in EA and MA boys during a reported period of weight stabilization. WC has changed in the US pediatric population across time, even during times of reported weight stabilization, particularly among children of diverse racial/ethnic backgrounds. © 2016 World Obesity Federation.
2015-04-28
Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’08, Philadelphia, PA, USA, 2008, Society for Industrial and Applied Mathematics, pp. 268–276...whose structural properties can play a crucial role in the accuracy of the ranking process if exploited accordingly, as in the case of recent work on time...RUM) [59]. The BTL model has found numerous applications in recent years, including pricing in the airline industry [58], or analysis of professional
Can College Rankings Be Believed?
Directory of Open Access Journals (Sweden)
Meredith Davis
Full Text Available The article summarizes literature on college and university rankings worldwide and the strategies used by various ranking organizations, including those of government and popular media. It traces the history of national and global rankings, indicators used by ranking systems, and the effect of rankings on academic programs and their institutions. Although ranking systems employ diverse criteria and most weight certain indicators over others, there is considerable skepticism that most actually measure educational quality. At the same time, students and their families increasingly consult these evaluations when making college decisions, and sponsors of faculty research consider reputation when forming academic partnerships. While there are serious concerns regarding the validity of ranking institutions when so little data can support differences between one institution and another, college rankings appear to be here to stay.
Ranking Baltic States Researchers
Directory of Open Access Journals (Sweden)
Gyula Mester
2017-10-01
In this article, using the h-index and the total number of citations, the best 10 Lithuanian, Latvian and Estonian researchers from several disciplines are ranked. The list may be formed based on the h-index and the total number of citations, given in Web of Science, Scopus, Publish or Perish Program and Google Scholar database. Data for the first 10 researchers are presented. Google Scholar is the most complete. Therefore, to define a single indicator, h-index calculated by Google Scholar may be a good and simple one. The author chooses the Google Scholar database as it is the broadest one.
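For readers unfamiliar with the indicator used in this ranking, the h-index is the largest h such that a researcher has h papers with at least h citations each. The following is a minimal sketch of that standard definition; the citation counts are made-up illustrative values, not data from the article.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the top `rank` papers all have >= rank citations
        else:
            break      # later papers have even fewer citations
    return h

# Hypothetical citation counts for one researcher.
print(h_index([10, 8, 5, 4, 3]))  # 4
```

Because citation counts differ between databases, the same researcher can have a different h-index in Web of Science, Scopus and Google Scholar.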
A nonparametric mixture model for cure rate estimation.
Peng, Y; Dear, K B
2000-03-01
Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.
Multi-sample nonparametric treatments comparison in medical ...
African Journals Online (AJOL)
Multi-sample nonparametric treatments comparison in medical follow-up study with unequal observation processes through simulation and bladder tumour case study. P. L. Tan, N.A. Ibrahim, M.B. Adam, J. Arasan ...
Weak Disposability in Nonparametric Production Analysis with Undesirable Outputs
Kuosmanen, T.K.
2005-01-01
Environmental Economics and Natural Resources Group at Wageningen University in The Netherlands Weak disposability of outputs means that firms can abate harmful emissions by decreasing the activity level. Modeling weak disposability in nonparametric production analysis has caused some confusion.
Nonparametric Bayesian drift estimation for multidimensional stochastic differential equations
Gugushvili, S.; Spreij, P.
2014-01-01
We consider nonparametric Bayesian estimation of the drift coefficient of a multidimensional stochastic differential equation from discrete-time observations on the solution of this equation. Under suitable regularity conditions, we establish posterior consistency in this context.
Speaker Linking and Applications using Non-Parametric Hashing Methods
2016-09-08
Speaker Linking and Applications using Non-Parametric Hashing Methods† Douglas Sturim and William M. Campbell MIT Lincoln Laboratory, Lexington, MA...to prove this by extending the proof of [8] to non-parametric hashing. 6. References [1] W. M. Campbell, D. E. Sturim, and D. A. Reynolds, “Support vec...EPFL-CONF-192414, 2012. [13] S. H. Shum, W. M. Campbell, and D. A. Reynolds, “Large-scale community detection on speaker content graphs,” in
University Rankings and Social Science
Marginson, Simon
2014-01-01
University rankings widely affect the behaviours of prospective students and their families, university executive leaders, academic faculty, governments and investors in higher education. Yet the social science foundations of global rankings receive little scrutiny. Rankings that simply recycle reputation without any necessary connection to real…
African Journals Online (AJOL)
maths/stats
Page Rank is a virtual value meaning nothing until you put it into the context of search engine results. Higher Page Rank pages will have tendency to rank better in the search engine results provided they are still optimized for the keywords you are searching for. Visitors coming from search engines are most praised kind of.
Percentiles relative to maxillary permanent canine inclination by age: a radiologic study.
Alessandri Bonetti, Giulio; Zanarini, Matteo; Danesi, Margherita; Parenti, Serena Incerti; Gatto, Maria Rosaria
2009-10-01
Few studies have investigated developmental norms for maxillary permanent canine eruption. In this observational cross-sectional study, we aimed to provide an age-related description of the percentiles relative to canine inclination in a large sample of nonorthodontic patients. Associations between inclination and sector were also analyzed. Canine inclination and sector location were measured on 1020 panoramic radiographs obtained from subjects of white ancestry aged between 8 and 11 years not seeking orthodontic treatment. The total sample comprised 2037 canines. Canine inclination increases between 8 and 9 years and decreases between 9 and 11 years. The greatest value for each percentile is at 9 years. A linear model should be hypothesized for differences in canine inclination between 2 successive ages in correspondence to each percentile. The proportion of sector 2 canines decreases and that of sector 1 increases with age. In the same age group, the inclination generally decreases as the sector decreases. Percentiles by age show the average canine inclination in a certain population. Further studies are required to verify whether percentiles can be a diagnostic aid for determining normal canine inclination at a given age and for quantifying the risk of canine impaction or adjacent root resorption.
Two non-parametric methods for derivation of constraints from radiotherapy dose–histogram data
International Nuclear Information System (INIS)
Ebert, M A; Kennedy, A; Joseph, D J; Gulliford, S L; Buettner, F; Foo, K; Haworth, A; Denham, J W
2014-01-01
Dose constraints based on histograms provide a convenient and widely-used method for informing and guiding radiotherapy treatment planning. Methods of derivation of such constraints are often poorly described. Two non-parametric methods for derivation of constraints are described and investigated in the context of determination of dose-specific cut-points: values of the free parameter (e.g., percentage volume of the irradiated organ) which best reflect resulting changes in complication incidence. A method based on receiver operating characteristic (ROC) analysis and one based on a maximally-selected standardized rank sum are described and compared using rectal toxicity data from a prostate radiotherapy trial. Multiple test corrections are applied using a free step-down resampling algorithm, which accounts for the large number of tests undertaken to search for optimal cut-points and the inherent correlation between dose–histogram points. Both methods provide consistent significant cut-point values, with the rank sum method displaying some sensitivity to the underlying data. The ROC method is simple to implement and can utilize a complication atlas, though an advantage of the rank sum method is the ability to incorporate all complication grades without the need for grade dichotomization.
Two non-parametric methods for derivation of constraints from radiotherapy dose-histogram data
Ebert, M. A.; Gulliford, S. L.; Buettner, F.; Foo, K.; Haworth, A.; Kennedy, A.; Joseph, D. J.; Denham, J. W.
2014-07-01
Dose constraints based on histograms provide a convenient and widely-used method for informing and guiding radiotherapy treatment planning. Methods of derivation of such constraints are often poorly described. Two non-parametric methods for derivation of constraints are described and investigated in the context of determination of dose-specific cut-points—values of the free parameter (e.g., percentage volume of the irradiated organ) which best reflect resulting changes in complication incidence. A method based on receiver operating characteristic (ROC) analysis and one based on a maximally-selected standardized rank sum are described and compared using rectal toxicity data from a prostate radiotherapy trial. Multiple test corrections are applied using a free step-down resampling algorithm, which accounts for the large number of tests undertaken to search for optimal cut-points and the inherent correlation between dose-histogram points. Both methods provide consistent significant cut-point values, with the rank sum method displaying some sensitivity to the underlying data. The ROC method is simple to implement and can utilize a complication atlas, though an advantage of the rank sum method is the ability to incorporate all complication grades without the need for grade dichotomization.
Intergenerational Educational Rank Mobility in 20th Century United States
DEFF Research Database (Denmark)
Karlson, Kristian Bernt
2015-01-01
that for cohorts born after the War, Blacks have a much smaller probability of upward income mobility than do Whites. This would suggest that the equalization in upward educational mobility reported in my study has not brought about similar equalization in upward income mobility. Second, my study offers......BACKGROUND: Studies of educational mobility in the United States report widespread persistence in the association between parental and offspring schooling over most of the 20th century. Despite this apparent persistency, many other studies report substantial improvements in the educational...... performance of historically disadvantaged groups. To reconcile these diverging trends, I propose examining educational mobility in terms of percentile ranks in the respective schooling distributions of parents and offspring. Using a novel estimator of educational rank, I compare patterns of mobility...
Estimation of a monotone percentile residual life function under random censorship.
Franco-Pereira, Alba M; de Uña-Álvarez, Jacobo
2013-01-01
In this paper, we introduce a new estimator of a percentile residual life function with censored data under a monotonicity constraint. Specifically, it is assumed that the percentile residual life is a decreasing function. This assumption is useful when estimating the percentile residual life of units, which degenerate with age. We establish a law of the iterated logarithm for the proposed estimator, and its n-equivalence to the unrestricted estimator. The asymptotic normal distribution of the estimator and its strong approximation to a Gaussian process are also established. We investigate the finite sample performance of the monotone estimator in an extensive simulation study. Finally, data from a clinical trial in primary biliary cirrhosis of the liver are analyzed with the proposed methods. One of the conclusions of our work is that the restricted estimator may be much more efficient than the unrestricted one. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
von Rosen, P; Heijne, A
2018-04-01
The relationship between injury and performance in young athletes is scarcely studied. The aim of this study was therefore to explore the association between injury prevalence and ranking position among adolescent elite athletes. One hundred and sixty-two male and female adolescent elite athletes (age range 15-19), competing in athletics (n = 59), cross-country skiing (n = 66), and orienteering (n = 37), were monitored weekly over 22-47 weeks using a web-based injury questionnaire. Ranking lists were collected. A significant (P = .003) difference was found in the seasonal substantial injury prevalence across the ranked athletes over the season, where the top-ranked (median 3.6%, 25-75th percentiles 0%-14.3%) and middle-ranked athletes (median 2.3%, 25-75th percentiles 0%-10.0%) had a lower substantial injury prevalence compared to the low-ranked athletes (median 11.3%, 25-75th percentiles 2.5%-27.1%), during both preseason (P = .002) and competitive season (P = .031). Athletes who improved their ranking position (51%, n = 51) reported a lower substantial injury prevalence (median 0%, 25-75th percentiles 0%-10.0%) compared to those who decreased (49%, n = 49) their ranking position (md 6.7%, 25-75th percentiles 0%-22.5%). In the top-ranked group, no athlete reported substantial injury more than 40% of all data collection time points compared to 9.6% (n = 5) in the middle-ranked, and 17.3% (n = 9) in the low-ranked group. Our results provide supporting evidence that substantial injuries, such as acute and overuse injuries leading to moderate or severe reductions in training or sports performance, influence ranking position in adolescent elite athletes. The findings are crucial to stakeholders involved in adolescent elite sports and support the value of designing effective preventive interventions for substantial injuries. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Percentile of Serum Lipid Profile in Children and Adolescents of Birjand, Eastern Iran.
Directory of Open Access Journals (Sweden)
Fatemeh Taheri
2014-11-01
Introduction: Racial and environmental differences between communities are a leading cause of differences in serum lipids. This study aimed to assess percentile curves of the serum lipid profile in students of Birjand aged 6-18 years. Method: This cross-sectional study was conducted on 4168 students of Birjand aged 6-18 years. They were classified into three age groups: 6-10, 11-14 and 15-18 years. The 5th, 10th, 25th, 50th, 75th, 90th, and 95th percentiles of lipids (cholesterol, LDL, HDL and triglycerides) were determined by sex for the different age groups. Result: The 5th, 10th, 25th, 50th, 75th, 90th and 95th percentiles for cholesterol, LDL, HDL, and TG were 114, 123, 138, 157, 176, 197, 210; 54, 59, 71, 86, 102, 119, 131; 33, 36, 41, 48, 56, 64, 68; and 43, 49, 61, 78, 103, 138, 164, respectively. Conclusion: Lipid percentiles in children of Birjand differ from the reference percentiles of the U.S. and also of Tehran. Triglycerides and HDL in children and adolescents of Birjand were higher and lower, respectively, than in Americans. This could be due to racial differences and environmental factors such as nutrition and a sedentary lifestyle. This should be considered when interpreting normal and abnormal values and determining dyslipidemia in children and adolescents. Regional percentiles of serum lipids for Iranian children and adolescents, derived by examining a sufficient number of samples, are recommended.
MacLean, Alair
2010-01-01
This article examines the effects of peacetime cold war military service on the life course according to four potentially overlapping theories that state that military service (1) was a disruption, (2) was a positive turning point, (3) allowed veterans to accumulate advantage, and (4) was an agent of social reproduction. The article argues that the extent to which the effect of military service on veterans' lives corresponds with one or another of the preceding theories depends on historical shifts in three dimensions: conscription, conflict, and benefits. Military service during the peacetime draft era of the late 1950s had a neutral effect on the socioeconomic attainment of enlisted veterans. However, it had a positive effect on veterans who served as officers, which partly stemmed from status reproduction and selection. Yet net of pre-service and educational differences by rank, officers in this peacetime draft era were still able to accumulate advantage. PMID:20842210
Countermovement jump percentiles in schoolchildren from Bogotá, Colombia: the FUPRECOL study
Ferro Vargas, Martha
2016-01-01
Objective: To determine the percentile distribution of the countermovement jump (CMJ) in a school population of Bogotá, Colombia, belonging to the Fuprecol study. Methods: Cross-sectional study of 2846 children and 2754 adolescents, aged 9 to 17 years, from 18 public schools in Bogotá, Colombia. The CMJ was assessed as established by the Fuprecol physical fitness battery. Percentiles were calculated (P3, P...
Rankings, creativity and urbanism
Directory of Open Access Journals (Sweden)
JOAQUÍN SABATÉ
2008-08-01
Competition among cities is one of the main drivers of urban renewal, and rankings have become instruments for measuring the quality of cities. We focus on the case of a former industrial quarter now being transformed into a "creative district" by means of a large-scale urban project. Its analysis reveals three critical points. First, it obliges us to reconsider the definition of urban innovation and how the past, identity and memory are integrated into the construction of the future. Second, it shows that innovation and knowledge do not arise by chance, but are the result of a large and complex network of knowledge, spaces, actors and institutions that differ in nature, scale and magnitude. Finally, it forces us to reflect on the value attributed to the "local" in urban renewal processes.
Sequential rank agreement methods for comparison of ranked lists
DEFF Research Database (Denmark)
Ekstrøm, Claus Thorn; Gerds, Thomas Alexander; Jensen, Andreas Kryger
2015-01-01
The comparison of alternative rankings of a set of items is a general and prominent task in applied statistics. Predictor variables are ranked according to magnitude of association with an outcome, prediction models rank subjects according to the personalized risk of an event, and genetic studies rank genes according to their difference in gene expression levels. This article constructs measures of the agreement of two or more ordered lists. We use the standard deviation of the ranks to define a measure of agreement that both provides an intuitive interpretation and can be applied to any number... are illustrated using gene rankings, and using data from two Danish ovarian cancer studies where we assess the within and between agreement of different statistical classification methods.
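The idea of measuring agreement through the standard deviation of ranks can be sketched as follows. This is a simplified illustration, not the article's exact definition: the function name `sequential_rank_agreement`, the dictionary input format, and the choice to pool variances over every item that reaches the top `depth` of at least one list are all assumptions made for this example.

```python
from math import sqrt
from statistics import variance


def sequential_rank_agreement(rank_lists, depth):
    """Pooled SD of ranks for items in the top `depth` of at least one list.

    rank_lists: list of dicts mapping item -> rank (1 = best); every dict
    must rank the same items, and at least two lists are required.
    Returns 0.0 when all lists agree perfectly on the selected items.
    """
    items = list(rank_lists[0])
    # Items that any list places at or above the requested depth.
    selected = [i for i in items if any(r[i] <= depth for r in rank_lists)]
    # Average the per-item sample variances of the ranks, then take the root.
    pooled = sum(variance([r[i] for r in rank_lists]) for i in selected)
    return sqrt(pooled / len(selected))


# Hypothetical example: two rankings that disagree only on items b and c.
list_a = {"a": 1, "b": 2, "c": 3}
list_b = {"a": 1, "b": 3, "c": 2}
print(sequential_rank_agreement([list_a, list_b], depth=1))
print(sequential_rank_agreement([list_a, list_b], depth=2))
```

Smaller values indicate closer agreement among the lists down to the chosen depth; plotting the value against depth shows where the rankings start to diverge.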
Ranking nodes in growing networks: When PageRank fails.
Mariani, Manuel Sebastian; Medo, Matúš; Zhang, Yi-Cheng
2015-11-10
PageRank is arguably the most popular ranking algorithm which is being applied in real systems ranging from information to biological and infrastructure networks. Despite its outstanding popularity and broad use in different areas of science, the relation between the algorithm's efficacy and properties of the network on which it acts has not yet been fully understood. We study here PageRank's performance on a network model supported by real data, and show that realistic temporal effects make PageRank fail in individuating the most valuable nodes for a broad range of model parameters. Results on real data are in qualitative agreement with our model-based findings. This failure of PageRank reveals that the static approach to information filtering is inappropriate for a broad class of growing systems, and suggest that time-dependent algorithms that are based on the temporal linking patterns of these systems are needed to better rank the nodes.
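For readers unfamiliar with the static algorithm being criticised, a minimal power-iteration sketch of standard PageRank is given below. It is not the authors' model or code; the damping factor of 0.85, the uniform redistribution of rank from nodes without outgoing links, and the function name `pagerank` are conventional choices assumed for this example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Standard PageRank by power iteration.

    links: dict mapping node -> list of nodes it links to.
    Nodes without outgoing links spread their score uniformly.
    Returns a dict node -> score; the scores sum to 1.
    """
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Score held by dangling nodes is shared equally among all nodes.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            for t in targets:
                new[t] += damping * rank[v] / len(targets)
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical three-node network: a and b both point to c, c points to a.
scores = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
print(sorted(scores, key=scores.get, reverse=True))
```

The score depends only on the current link structure, with no reference to when nodes or links appeared, which is the static property the abstract argues is inappropriate for growing networks.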
Directory of Open Access Journals (Sweden)
Manuel Ticona Rendón
2006-09-01
... gestational age and sex, we conducted a descriptive, cross-sectional, prospective study covering the years 1992 to 2004. We studied 282 live-born twins without risk factors for growth retardation, from Tacna, Peru. Means, standard deviations and the 10th, 50th and 90th percentiles of weight were calculated by sex and gestational age between 32 and 41 weeks. Percentiles and means were compared between the sexes and with studies carried out in Norway, Australia and Japan, with p < 0.05 considered significant. Mean birth weight was 2677 g ± 507 for boys and 2615 g ± 461 for girls, with no significant difference. The modal gestational age was 38 weeks, and the difference in median birth weight by sex was 110 g. Birth weight in twins peaked at 39 weeks, after which the means declined. Mean birth weight of male twins was higher than that of females, but no significant differences were observed at any gestational age. No differences were found between the mean weights of Peruvian and Norwegian twins of either sex; however, highly significant differences were recorded in comparison with those of Australia and Japan, relative to which the Peruvian means were higher. The curves produced by the study provide birth weight percentiles for twins by gestational age and sex that can be used by Peruvian clinicians and researchers.
Directory of Open Access Journals (Sweden)
Javier Jesús Suárez Rivera
2004-04-01
A prospective, descriptive study was carried out on all schoolchildren from preschool to 6th grade at the "Jesús Menéndez" primary school in Alamar, from September 2000 to February 2001, to estimate the behaviour of blood pressure percentiles (pc) by age and sex, as well as the associated risk factors. The sample comprised 743 pupils, who underwent a physical examination that included weight, height and blood pressure measurement, and an open survey. With the data obtained, the population was divided into 4 study groups by blood pressure percentile: group I ( 95 pc, according to the foreign literature consulted, and these were related to risk factors. Most of the schoolchildren studied had blood pressure values below the 50th percentile (88.83%), and the most frequent risk factor was a family history of arterial hypertension. Only 6 schoolchildren had blood pressure values above the 95th percentile.
Oyhenart, Evelia E; Lomaglio, Delia B; Dahinten, Silvia L V; Bejarano, Ignacio F; Herráez, Ángel; Cesani, María F; Torres, María F; Luis, María A; Quintero, Fabián A; Alfaro, Emma L; Orden, Alicia B; Bergel Sanchis, María L; de Espinosa, Marisa González-Montero; Garraza, Mariela; Luna, María E; Forte, Luis M; Mesa, María S; Moreno Romero, Susana; López-Ejeda, Noemí; Dipierri, José E; Marrodán, María D
2015-01-01
The Argentinean population is characterized by ethnic, cultural and socio-economic diversity. This study aimed to calculate the percentiles of weight-for-age (W/A) and height-for-age (H/A) of schoolchildren from Argentina employing the LMS method, and to compare the obtained percentiles with those of the international and national references. Anthropometric data of 18 698 students (8672 girls and 10 026 boys) aged 3-13 years were collected (2003-2008) from Buenos Aires, Catamarca, Chubut, Jujuy, La Pampa and Mendoza. Percentiles of W/A and H/A were obtained with the LMS method. Statistical and graphical comparisons were made with the WHO reference (international) and with that published by the Argentinean Paediatric Society (national). Differences in W/A and H/A relative to the references were negative and greater at the highest percentiles and in most of the age groups. On average, the differences were greater for boys than girls and for the national than the international reference. The distribution of weight and height of schoolchildren, coming from most regions of the country, differs from those of the national and international references. It would be advisable to establish a new national reference based on internationally recognized methodological criteria that adequately reflects the biological and cultural diversity of the Argentinean populations.
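The LMS method summarises each age group by three parameters: L (Box-Cox power), M (median) and S (coefficient of variation). A measurement y is converted to a z-score by z = ((y/M)^L - 1)/(L·S), or ln(y/M)/S when L = 0, and a centile is obtained by inverting that relation. The sketch below illustrates these two conversions; the parameter values in the usage lines are hypothetical and are not taken from the study.

```python
from math import exp, log
from statistics import NormalDist


def lms_zscore(y, L, M, S):
    """z-score of measurement y given LMS parameters for the age group."""
    if L == 0:
        return log(y / M) / S
    return ((y / M) ** L - 1) / (L * S)


def lms_centile(p, L, M, S):
    """Measurement value at percentile p (0 < p < 100) for the age group."""
    z = NormalDist().inv_cdf(p / 100)
    if L == 0:
        return M * exp(S * z)
    return M * (1 + L * S * z) ** (1 / L)


# Hypothetical LMS values for one age/sex group (illustration only).
L_, M_, S_ = -1.6, 20.0, 0.10
print(lms_centile(50, L_, M_, S_))   # the median M
print(lms_centile(97, L_, M_, S_))
print(lms_zscore(25.0, L_, M_, S_))
```

In practice L, M and S are estimated as smooth functions of age, so a reference table holds one (L, M, S) triple per age and sex.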
Blood pressure percentiles in a group of Nigerian school age children
African Journals Online (AJOL)
PROF. EZECHUKWU
2014-04-06
Apr 6, 2014 ... life style including watching television and the large consumption of the fattening fast food are found to be ..... style that includes television watching, video gaming and less physical exercise. This is further ... cal activities like playing football and other outdoor games. The 90th and 95th percentiles for blood ...
Using Percentile Schedules to Increase Eye Contact in Children with Fragile X Syndrome
Hall, Scott S.; Maynes, Natalee P.; Reiss, Allan L.
2009-01-01
Aversion to eye contact is a common behavior of individuals diagnosed with Fragile X syndrome (FXS); however, no studies to date have attempted to increase eye-contact duration in these individuals. In this study, we employed a percentile reinforcement schedule with and without overcorrection to shape eye-contact duration of 6 boys with FXS.…
Covariate Measurement Error Correction for Student Growth Percentiles Using the SIMEX Method
Shang, Yi; VanIwaarden, Adam; Betebenner, Damian W.
2015-01-01
In this study, we examined the impact of covariate measurement error (ME) on the estimation of quantile regression and student growth percentiles (SGPs), and find that SGPs tend to be overestimated among students with higher prior achievement and underestimated among those with lower prior achievement, a problem we describe as ME endogeneity in…
Tooth size changes with age in a Spanish population: percentile tables.
Paulino, Vera; Paredes, Vanessa; Cibrian, Rosa; Gandia, José-Luis
2011-09-01
The aims of this work were: firstly, to draw up tables of percentile tooth sizes similar to those of Sanin and Savara for three age groups of a Spanish population; secondly, to describe changes in tooth size between those groups over time, as well as to observe any sexual dimorphism; and, finally, to compare the Spanish sample with Sanin and Savara's American population sample. The sample included 359 patients of both genders and was divided into three age groups: adolescents, young adults and adults. After dental cast digitalization, mesiodistal tooth size was measured on each dental cast using a digital method. Tooth size tables organized by percentiles were drawn up for each age group and gender. Percentiles under 30 were considered small, between 30 and 70 average, and above 70 large. As symmetry was found between contralateral teeth, the mean of the teeth of the two semi-arches was used. The mesiodistal tooth sizes of adolescents did not present statistically significant differences between genders, in contrast to the two other age groups. Mesiodistal tooth diameters tended to diminish with age, especially in women, in the Spanish population. The values obtained for our dental tables, organized by percentiles, were slightly higher than those found by Sanin and Savara in an American population, especially for women.
ONODA, Tomoaki; YAMAMOTO, Ryuta; SAWAMURA, Kyohei; MURASE, Harutaka; NAMBO, Yasuo; INOUE, Yoshinobu; MATSUI, Akira; MIYAKE, Takeshi; HIRAI, Nobuhiro
2013-01-01
Percentile growth curves are often used as a clinical indicator to evaluate variations of children’s growth status. In this study, we propose empirical percentile growth curves using Z-scores adapted for Japanese Thoroughbred horses, with considerations of the seasonal compensatory growth that is a typical characteristic of seasonal breeding animals. We previously developed new growth curve equations for Japanese Thoroughbreds adjusting for compensatory growth. Individual horses and residual effects were included as random effects in the growth curve equation model and their variance components were estimated. Based on the Z-scores of the estimated variance components, empirical percentile growth curves were constructed. A total of 5,594 and 5,680 body weight and age measurements of male and female Thoroughbreds, respectively, and 3,770 withers height and age measurements were used in the analyses. The developed empirical percentile growth curves using Z-scores are computationally feasible and useful for monitoring individual growth parameters of body weight and withers height of young Thoroughbred horses, especially during compensatory growth periods. PMID:24834004
Energy Technology Data Exchange (ETDEWEB)
Weber, G. F.; Laudal, D. L.
1989-01-01
This work is a compilation of reports on ongoing research at the University of North Dakota. Topics include: Control Technology and Coal Preparation Research (SOx/NOx control, waste management), Advanced Research and Technology Development (turbine combustion phenomena, combustion inorganic transformation, coal/char reactivity, liquefaction reactivity of low-rank coals, gasification ash and slag characterization, fine particulate emissions), Combustion Research (fluidized bed combustion, beneficiation of low-rank coals, combustion characterization of low-rank coal fuels, diesel utilization of low-rank coals), Liquefaction Research (low-rank coal direct liquefaction), and Gasification Research (hydrogen production from low-rank coals, advanced wastewater treatment, mild gasification, color and residual COD removal from Synfuel wastewaters, Great Plains Gasification Plant, gasifier optimization).
Wikipedia ranking of world universities
Lages, José; Patt, Antoine; Shepelyansky, Dima L.
2016-03-01
We use the directed networks between articles of 24 Wikipedia language editions to produce the Wikipedia ranking of world universities (WRWU) using the PageRank, 2DRank and CheiRank algorithms. This approach makes it possible to incorporate various cultural views on world universities using a mathematical statistical analysis independent of cultural preferences. The Wikipedia ranking of the top 100 universities has about 60% overlap with the Shanghai university ranking, demonstrating the reliability of this approach. At the same time, WRWU incorporates all knowledge accumulated in the 24 Wikipedia editions, giving greater weight to historically important universities and leading to a different estimate of the efficiency of world countries in university education. The historical development of university ranking is analyzed over ten centuries of their history.
Gunsolus, Ian L; Jaffe, Allan S; Sexter, Anne; Schulz, Karen; Ler, Ranka; Lindgren, Brittany; Saenger, Amy K; Love, Sara A; Apple, Fred S
2017-12-01
Our purpose was to determine a) overall and sex-specific 99th percentile upper reference limits (URL) and b) influences of statistical methods and comorbidities on the URLs. Heparin plasma from 838 normal subjects (423 men, 415 women) was obtained from the AACC (Universal Sample Bank). The cobas e602 measured cTnT (Roche Gen 5 assay); limit of detection (LoD), 3 ng/L. Hemoglobin A1c (URL 6.5%), NT-proBNP (URL 125 ng/L) and eGFR (60 mL/min/1.73 m²) were measured, along with identification of statin use, to better define normality. 99th percentile URLs were determined by the non-parametric (NP), Harrell-Davis Estimator (HDE) and Robust (R) methods. 355 men and 339 women remained after exclusions. Overall, the statistical method used influenced URLs as follows (pre/post exclusion): overall, NP 16/16 ng/L, HDE 17/17 ng/L, R not available; men, NP 18/16 ng/L, HDE 21/19 ng/L, R 16/11 ng/L; women, NP 13/10 ng/L, HDE 14/14 ng/L, R not available. We demonstrated that a) the Gen 5 cTnT assay does not meet the IFCC guideline for high-sensitivity assays, b) surrogate biomarkers significantly lower the URLs and c) the statistical methods used impact URLs. Our data suggest lower sex-specific cTnT 99th percentiles than reported in the FDA approved package insert. We emphasize the importance of detailing the criteria used to include and exclude subjects for defining a healthy population and the statistical method used to calculate 99th percentiles and identify outliers. Copyright © 2017 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
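To make the dependence on the statistical method concrete, the sketch below computes a rank-based nonparametric percentile. It assumes one common convention, fractional rank r = p(n + 1)/100 with linear interpolation between adjacent order statistics; the abstract does not state which rank rule the study applied, and the data in the usage lines are invented.

```python
def nonparametric_percentile(values, p):
    """Rank-based percentile using fractional rank r = p/100 * (n + 1).

    Interpolates linearly between the two order statistics around r and
    clamps to the smallest or largest observation when r falls outside 1..n.
    """
    xs = sorted(values)
    n = len(xs)
    r = p / 100 * (n + 1)          # 1-based fractional rank
    if r <= 1:
        return xs[0]
    if r >= n:
        return xs[-1]
    lower = int(r)                 # index of the order statistic below r
    frac = r - lower
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


# Invented data: 199 evenly spaced results, so the 99th percentile is 198.
results = list(range(1, 200))
print(nonparametric_percentile(results, 99))
```

With a few hundred subjects the 99th percentile rests on the two or three largest observations, which is why the handling of outliers and the choice of estimator can shift the limit by several ng/L.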
Rank equivalent and rank degenerate skew cyclic codes
DEFF Research Database (Denmark)
Martinez Peñas, Umberto
2017-01-01
Two skew cyclic codes can be equivalent for the Hamming metric only if they have the same length, and only the zero code is degenerate. The situation is completely different for the rank metric. We study rank equivalences between skew cyclic codes of different lengths and, with the aim of finding...
Graph embedded nonparametric mutual information for supervised dimensionality reduction.
Bouzas, Dimitrios; Arvanitopoulos, Nikolaos; Tefas, Anastasios
2015-05-01
In this paper, we propose a novel algorithm for dimensionality reduction that uses as a criterion the mutual information (MI) between the transformed data and their corresponding class labels. The MI is a powerful criterion that can be used as a proxy to the Bayes error rate. Furthermore, recent quadratic nonparametric implementations of MI are computationally efficient and do not require any prior assumptions about the class densities. We show that the quadratic nonparametric MI can be formulated as a kernel objective in the graph embedding framework. Moreover, we propose its linear equivalent as a novel linear dimensionality reduction algorithm. The derived methods are compared against the state-of-the-art dimensionality reduction algorithms with various classifiers and on various benchmark and real-life datasets. The experimental results show that nonparametric MI as an optimization objective for dimensionality reduction gives comparable and in most of the cases better results compared with other dimensionality reduction methods.
Application of nonparametric statistic method for DNBR limit calculation
International Nuclear Information System (INIS)
Dong Bo; Kuang Bo; Zhu Xuenong
2013-01-01
Background: The nonparametric statistical method is a statistical inference method that does not depend on a particular distribution; it calculates tolerance limits at a given probability level and confidence through sampling. The DNBR margin is an important parameter of NPP (nuclear power plant) design, representing the safety level of the NPP. Purpose and Methods: This paper uses a nonparametric statistical method based on the Wilks formula and the VIPER-01 subchannel analysis code to calculate the DNBR design limit (DL) of a 300 MW NPP during a complete loss of flow accident, and compares it with the DNBR DL obtained by the ITDP methodology to determine the DNBR margin. Results: The results indicate that this method gains 2.96% more DNBR margin than the ITDP methodology. Conclusions: Because conservatism in the analysis process is reduced, the nonparametric statistical method can provide a greater DNBR margin, and the increased margin is beneficial for upgrading the core refuelling scheme. (authors)
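The Wilks formula fixes how many code runs are needed for a one-sided tolerance limit without assuming a distribution. The sketch below computes that minimum sample size for a given coverage and confidence; the function names and the 95%/95% defaults are assumptions for illustration, since the abstract does not state the levels used.

```python
from math import comb


def wilks_confidence(n, coverage, order=1):
    """Confidence that the order-th most extreme of n samples bounds
    at least `coverage` of the population (one-sided)."""
    tail = 1 - coverage
    return 1 - sum(
        comb(n, k) * tail ** k * coverage ** (n - k) for k in range(order)
    )


def wilks_min_samples(coverage=0.95, confidence=0.95, order=1):
    """Smallest n for which the order-th extreme value is a valid limit."""
    n = order
    while wilks_confidence(n, coverage, order) < confidence:
        n += 1
    return n


print(wilks_min_samples())          # first-order 95/95 limit
print(wilks_min_samples(order=2))   # second-order 95/95 limit
```

For a lower limit such as DNBR the extreme value is the smallest sample rather than the largest; the sample-size calculation is the same by symmetry.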
Ranking nodes in growing networks: When PageRank fails
Mariani, Manuel Sebastian; Medo, Matúš; Zhang, Yi-Cheng
2015-11-01
PageRank is arguably the most popular ranking algorithm which is being applied in real systems ranging from information to biological and infrastructure networks. Despite its outstanding popularity and broad use in different areas of science, the relation between the algorithm’s efficacy and properties of the network on which it acts has not yet been fully understood. We study here PageRank’s performance on a network model supported by real data, and show that realistic temporal effects make PageRank fail in individuating the most valuable nodes for a broad range of model parameters. Results on real data are in qualitative agreement with our model-based findings. This failure of PageRank reveals that the static approach to information filtering is inappropriate for a broad class of growing systems, and suggest that time-dependent algorithms that are based on the temporal linking patterns of these systems are needed to better rank the nodes.
University Ranking as Social Exclusion
Amsler, Sarah S.; Bolsmann, Chris
2012-01-01
In this article we explore the dual role of global university rankings in the creation of a new, knowledge-identified, transnational capitalist class and in facilitating new forms of social exclusion. We examine how and why the practice of ranking universities has become widely defined by national and international organisations as an important…
PageRank tracker: from ranking to tracking.
Gong, Chen; Fu, Keren; Loza, Artur; Wu, Qiang; Liu, Jia; Yang, Jie
2014-06-01
Video object tracking is widely used in many real-world applications, and it has been extensively studied for over two decades. However, tracking robustness is still an issue in most existing methods, due to the difficulties with adaptation to environmental or target changes. In order to improve adaptability, this paper formulates the tracking process as a ranking problem, and the PageRank algorithm, which is a well-known webpage ranking algorithm used by Google, is applied. Labeled and unlabeled samples in tracking application are analogous to query webpages and the webpages to be ranked, respectively. Therefore, determining the target is equivalent to finding the unlabeled sample that is the most associated with existing labeled set. We modify the conventional PageRank algorithm in three aspects for tracking application, including graph construction, PageRank vector acquisition and target filtering. Our simulations with the use of various challenging public-domain video sequences reveal that the proposed PageRank tracker outperforms mean-shift tracker, co-tracker, semiboosting and beyond semiboosting trackers in terms of accuracy, robustness and stability.
DEFF Research Database (Denmark)
Sørensen, Kaspar; Juul, Anders
2015-01-01
OBJECTIVE: Early pubertal timing is consistently associated with increased BMI percentile-for-age in pubertal girls, while data in boys are more ambiguous. However, higher BMI percentile-for-age may be a result of the earlier puberty per se rather than vice versa. The aim was to evaluate markers of adiposity in relation to pubertal timing and reproductive hormone levels in healthy pubertal boys and girls. STUDY DESIGN: Population-based cross-sectional study (The Copenhagen Puberty Study). Eight hundred and two healthy Caucasian children and adolescents (486 girls) aged 8.5-16.5 years participated. BMI and bioelectric impedance analyses (BIA) were used to estimate adiposity. Clinical pubertal markers (Tanner stages and testicular volume) were evaluated. LH, FSH, estradiol, testosterone, SHBG and IGF1 levels were determined by immunoassays. RESULTS: In all age groups, higher BMI (all 1 year age-groups, P ≤ 0...
Parametric vs. Nonparametric Regression Modelling within Clinical Decision Support
Czech Academy of Sciences Publication Activity Database
Kalina, Jan; Zvárová, Jana
2017-01-01
Vol. 5, No. 1 (2017), pp. 21-27. ISSN 1805-8698. R&D Projects: GA ČR GA17-01251S. Institutional support: RVO:67985807. Keywords: decision support systems; decision rules; statistical analysis; nonparametric regression. Subject RIV: IN - Informatics, Computer Science. OECD field: Statistics and probability
Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.
Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F
2013-04-01
In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.
Comparing parametric and nonparametric regression methods for panel data
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb-Douglas and Translog, are unsuitable for analysing the optimal firm size. We show that the Translog functional form implies an implausible linear relationship between the (logarithmic) firm size and the elasticity of scale, where the slope is artificially related to the substitutability between the inputs. The practical applicability of the parametric and non-parametric regression methods is scrutinised and compared by an empirical example: we analyse the production technology and investigate the optimal size of Polish crop farms based on a firm-level balanced panel data set. A nonparametric specification test...
Nonparametric Item Response Function Estimates with the EM Algorithm.
Rossi, Natasha; Wang, Xiaohui; Ramsay, James O.
2002-01-01
Combined several developments in statistics and item response theory to develop a procedure for analysis of dichotomously scored test data. This version of nonparametric item response analysis, as illustrated through simulation and with data from other studies, marginalizes the role of the ability parameter theta. (SLD)
Nonparametric model assisted model calibrated estimation in two ...
African Journals Online (AJOL)
RO Otieno, PN Mwita, PN Kihara. East African Journal of Statistics Vol. 1 (3) 2007: pp. 261-281.
Nonparametric estimation of the maximum of conditional hazard ...
African Journals Online (AJOL)
The maximum of the conditional hazard function is a parameter of great importance in statistics, in particular in seismicity studies, because it constitutes the maximum risk of occurrence of an earthquake in a given interval of time. Using the kernel nonparametric estimates based on convolution kernel techniques of the first ...
Nonparametric modeling of dynamic functional connectivity in fmri data
DEFF Research Database (Denmark)
Nielsen, Søren Føns Vind; Madsen, Kristoffer H; Røge, Rasmus
2015-01-01
dynamic changes. The existing approaches modeling dynamic connectivity have primarily been based on time-windowing the data and k-means clustering. We propose a nonparametric generative model for dynamic FC in fMRI that does not rely on specifying window lengths and number of dynamic states. Rooted...
Use of parametric and non-parametric survival analysis techniques ...
African Journals Online (AJOL)
This paper presents parametric and non-parametric survival analysis procedures that can be used to compare acaricides. The effectiveness of Delta Tick Pour On and Delta Tick Spray in knocking down tsetse flies was determined. The two formulations were supplied by Chemplex. The comparison was based on data ...
Estimation of Stochastic Volatility Models by Nonparametric Filtering
DEFF Research Database (Denmark)
Kanaya, Shin; Kristensen, Dennis
2016-01-01
/estimated volatility process replacing the latent process. Our estimation strategy is applicable to both parametric and nonparametric stochastic volatility models, and can handle both jumps and market microstructure noise. The resulting estimators of the stochastic volatility model will carry additional biases...
Non-Parametric Analysis of Rating Transition and Default Data
DEFF Research Database (Denmark)
Fledelius, Peter; Lando, David; Perch Nielsen, Jens
2004-01-01
We demonstrate the use of non-parametric intensity estimation - including construction of pointwise confidence sets - for analyzing rating transition data. We find that transition intensities away from the class studied here for illustration strongly depend on the direction of the previous move b...
Panel data nonparametric estimation of production risk and risk preferences
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
We apply nonparametric panel data kernel regression to investigate production risk, output price uncertainty, and risk attitudes of Polish dairy farms based on a firm-level unbalanced panel data set that covers the period 2004–2010. We compare different model specifications and different...
Nonparametric Item Response Curve Estimation with Correction for Measurement Error
Guo, Hongwen; Sinharay, Sandip
2011-01-01
Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…
A general approach to posterior contraction in nonparametric inverse problems
Knapik, Bartek; Salomond, Jean Bernard
In this paper, we propose a general method to derive an upper bound for the contraction rate of the posterior distribution for nonparametric inverse problems. We present a general theorem that allows us to derive contraction rates for the parameter of interest from contraction rates of the related
Kułaga, Zbigniew; Litwin, Mieczysław; Grajda, Aneta; Kułaga, Katarzyna; Gurzkowska, Beata; Góźdź, Magdalena; Pan, Huiqi
2012-10-01
The objective of this study was to construct blood pressure (BP) references with the use of a validated oscillometric device for normal-weight, school-aged children and adolescents and to study BP predictors. BP was measured in 14 266 randomly selected, normal-weight Polish children and adolescents aged 7-18 years, who were free of chronic disease, using a validated oscillometric device (Datascope Accutor Plus). Height, weight and waist circumference were measured. BP percentiles were constructed for age and height simultaneously with the use of a polynomial regression model. The normative values of BP were compared with the US normal-weight reference, German oscillometric reference, and Polish auscultatory reference. Reference BP percentiles by sex, age and height are presented. At median height, the age-specific differences in the 90th BP percentiles compared with the German oscillometric reference ranged in boys from -3 to 2 mmHg for SBP and from -5 to -1 mmHg for DBP, and in girls from 0 to 3 mmHg for SBP and from -5 to -1 mmHg for DBP. Compared with weight, waist circumference was a stronger SBP predictor in low birth weight boys. The study provides BP references for an oscillometric device, based on a current, nationally representative sample of normal-weight Polish children and adolescents. The normative values of BP were compared taking into consideration the height and BMI differences, the pubertal spurt, the methods of BP measurement and percentile construction.
Relationships between walking and percentiles of adiposity in older and younger men
Energy Technology Data Exchange (ETDEWEB)
Williams, Paul T.
2005-06-01
To assess the relationship of weekly walking distance to percentiles of adiposity in elders (age ≥ 75 years), seniors (55 ≤ age < 75 years), middle-age men (35 ≤ age < 55 years), and younger men (18 ≤ age < 35 years old). Cross-sectional analyses of baseline questionnaires from 7,082 male participants of the National Walkers Health Study. The walkers' BMIs were inversely and significantly associated with walking distance (kg/m² per km/wk) in elders (slope ± SE: -0.032 ± 0.008), seniors (-0.045 ± 0.005), and middle-aged men (-0.037 ± 0.007), as were their waist circumferences (-0.091 ± 0.025, -0.045 ± 0.005, and -0.091 ± 0.015 cm per km/wk, respectively), and these slopes remained significant when adjusted statistically for reported weekly servings of meat, fish, fruit, and alcohol. The declines in BMI associated with walking distance were greater at the higher than lower percentiles of the BMI distribution. Specifically, compared to the decline at the 10th BMI percentile, the decline in BMI at the 90th percentile was 5.1-fold greater in elders, 5.9-fold greater in seniors, and 6.7-fold greater in middle-age men. The declines in waist circumference associated with walking distance were also greater among men with broader waistlines. Exercise-induced weight loss (or self-selection) causes an inverse relationship between adiposity and walking distance in men 35 and older that is substantially greater among fatter men.
Cossío Bolaños, Marco; Méndez Cornejo, Jorge; Luarte Rocha, Cristian; Vargas Vitoria, Rodrigo; Canqui Flores, Bernabé; Gomez Campos, Rossana
2016-08-16
Regular physical activity (PA) during childhood and adolescence is important for the prevention of non-communicable diseases and their risk factors. The objectives were to validate a questionnaire for measuring patterns of PA, verify its reliability, compare the levels of PA aligned by chronological and biological age, and develop percentile curves to assess PA levels depending on biological maturation. A descriptive cross-sectional study was performed on a non-probabilistic quota sample of 3,176 Chilean adolescents (1685 males and 1491 females), with an age range from 10.0 to 18.9 years. Weight, standing height and sitting height were measured. Biological age (through the years of peak growth rate) and chronological age (in years) were determined. Body Mass Index was calculated and a survey of PA was applied. The LMS method was used to develop percentiles. The values for the confirmatory analysis showed saturations between 0.517 and 0.653. The Kaiser-Meyer-Olkin (KMO) measure of adequacy was 0.879, with 70.8% of the variance explained. The Cronbach alpha values ranged from 0.81 to 0.86. There were differences between the genders when aligned by chronological age. There were no differences when aligned by biological age. Percentiles are proposed to classify the PA of adolescents of both genders according to biological age and sex. The questionnaire used was valid and reliable, and PA should be evaluated by biological age. These findings led to the development of percentiles to assess PA according to biological age and gender. Copyright © 2016 Sociedad Chilena de Pediatría. Published by Elsevier España, S.L.U. All rights reserved.
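The LMS method named above summarises the distribution of a measurement at each age by three quantities: L (a Box-Cox power), M (the median) and S (a coefficient of variation). The Python sketch below shows how a centile and a z-score follow from those three values. The function names and the numeric L, M, S values are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist

def lms_zscore(y, L, M, S):
    """z-score of measurement y given LMS parameters at one age."""
    if abs(L) < 1e-12:
        # Limiting case L -> 0 (log-normal)
        return math.log(y / M) / S
    return ((y / M) ** L - 1.0) / (L * S)

def lms_centile(p, L, M, S):
    """Measurement value at the p-th percentile (0 < p < 100)."""
    z = NormalDist().inv_cdf(p / 100.0)
    if abs(L) < 1e-12:
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)

# Hypothetical LMS values for one age/sex group (illustration only)
L_, M_, S_ = -1.2, 17.5, 0.11
p50 = lms_centile(50, L_, M_, S_)   # equals the median M
p90 = lms_centile(90, L_, M_, S_)
z_back = lms_zscore(p90, L_, M_, S_)  # recovers the z-score of the 90th percentile
```

The two functions are inverses of each other, so converting a centile value back to a z-score is a quick consistency check on fitted L, M and S tables.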
International Nuclear Information System (INIS)
Andersen, Erlend K.F.; Hole, Knut Håkon; Lund, Kjersti V.; Sundfør, Kolbein; Kristensen, Gunnar B.; Lyng, Heidi; Malinen, Eirik
2012-01-01
Purpose: To systematically screen the tumor contrast enhancement of locally advanced cervical cancers to assess the prognostic value of two descriptive parameters derived from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). Methods and Materials: This study included a prospectively collected cohort of 81 patients who underwent DCE-MRI with gadopentetate dimeglumine before chemoradiotherapy. The following descriptive DCE-MRI parameters were extracted voxel by voxel and presented as histograms for each time point in the dynamic series: normalized relative signal increase (nRSI) and normalized area under the curve (nAUC). The first to 100th percentiles of the histograms were included in a log-rank survival test, resulting in p value and relative risk maps of all percentile–time intervals for each DCE-MRI parameter. The maps were used to evaluate the robustness of the individual percentile–time pairs and to construct prognostic parameters. Clinical endpoints were locoregional control and progression-free survival. The study was approved by the institutional ethics committee. Results: The p value maps of nRSI and nAUC showed a large continuous region of percentile–time pairs that were significantly associated with locoregional control (p < 0.05). These parameters had prognostic impact independent of tumor stage, volume, and lymph node status on multivariate analysis. Only a small percentile–time interval of nRSI was associated with progression-free survival. Conclusions: The percentile–time screening identified DCE-MRI parameters that predict long-term locoregional control after chemoradiotherapy of cervical cancer.
[Percentile curves on growth among breastfed 1-4 year olds in 8 urban areas].
Feng, W W; Huang, X N; Wang, H S; Gong, L M; Xu, Y Q; Pan, X P; Jin, X
2017-04-10
Objective: To construct the growth percentile curves of weight, length/height, head circumference and BMI for 1 to 4 year-olds who had been breastfed in urban areas. Methods: Data were from a longitudinal study of 1 025 breastfed children aged 1 to 4 years in 8 urban areas during 2008-2012. MLwiN2.25 was used to construct the multi-level models of weight-for-age, length-for-age, head circumference-for-age and BMI-for-age. The models included many growth-relevant factors, including gender, age, family and social demographic characteristics, perinatal factors, parent biological characteristics, dietary patterns and diseases of children. Based on these models, predicted values (P3, P15, P50, P85, P97) were estimated to fit the percentile reference curves. Results: The percentile reference curves of weight-, length/height-, head circumference- and BMI-for-age for the 1-4 year-olds who had been breastfed in the urban areas were developed. Differences of all the indicators between boys and girls were statistically significant ( P growth, constructed by the longitudinal observational data and scientific method, were important in reflecting the development of breastfed children in urban areas.
Universal scaling in sports ranking
International Nuclear Information System (INIS)
Deng Weibing; Li Wei; Cai Xu; Bulou, Alain; Wang Qiuping A
2012-01-01
Ranking is a ubiquitous phenomenon in human society. On the web pages of Forbes, one may find all kinds of rankings, such as the world's most powerful people, the world's richest people, the highest-earning tennis players, and so on and so forth. Herewith, we study a specific kind—sports ranking systems in which players' scores and/or prize money are accrued based on their performances in different matches. By investigating 40 data samples which span 12 different sports, we find that the distributions of scores and/or prize money follow universal power laws, with exponents nearly identical for most sports. In order to understand the origin of this universal scaling we focus on the tennis ranking systems. By checking the data we find that, for any pair of players, the probability that the higher-ranked player tops the lower-ranked opponent is proportional to the rank difference between the pair. Such a dependence can be well fitted to a sigmoidal function. By using this feature, we propose a simple toy model which can simulate the competition of players in different matches. The simulations yield results consistent with the empirical findings. Extensive simulation studies indicate that the model is quite robust with respect to the modifications of some parameters. (paper)
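The win-probability rule described in this abstract can be illustrated with a minimal Python sketch. The logistic form, the scale parameter `a` and the function names are assumptions made for illustration; the abstract states only that the dependence on rank difference is well fitted by a sigmoidal function.

```python
import math
import random

def win_probability(rank_diff, a=0.05):
    """Probability that the higher-ranked player wins, modelled as a
    logistic function of the rank difference (assumed form and scale)."""
    return 1.0 / (1.0 + math.exp(-a * rank_diff))

def simulate_match(rank_i, rank_j, rng, a=0.05):
    """Return the rank of the winner; a smaller rank number is better."""
    better, worse = min(rank_i, rank_j), max(rank_i, rank_j)
    p = win_probability(worse - better, a)
    return better if rng.random() < p else worse

# Play the rank-1 player against the rank-50 player many times
rng = random.Random(0)
n_matches = 10_000
wins = sum(simulate_match(1, 50, rng) == 1 for _ in range(n_matches))
upset_rate = 1 - wins / n_matches  # share of matches won by the lower-ranked player
```

Equal ranks give a probability of exactly 0.5, and the probability rises toward 1 as the rank gap widens, which is the qualitative behaviour the abstract reports.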
Nagy, P; Kovacs, E; Moreno, L A; Veidebaum, T; Tornaritis, M; Kourides, Y; Siani, A; Lauria, F; Sioen, I; Claessens, M; Mårild, S; Lissner, L; Bammann, K; Intemann, T; Buck, C; Pigeot, I; Ahrens, W; Molnár, D
2014-09-01
To characterise the nutritional status in children with obesity or wasting conditions, European anthropometric reference values for body composition measures beyond the body mass index (BMI) are needed. Differentiated assessment of body composition in children has long been hampered by the lack of appropriate references. The aim of our study is to provide percentiles for body composition indices in normal weight European children, based on the IDEFICS cohort (Identification and prevention of Dietary- and lifestyle-induced health Effects in Children and infantS). Overall 18,745 2.0-10.9-year-old children from eight countries participated in the study. Children classified as overweight/obese or underweight according to IOTF (N=5915) were excluded from the analysis. Anthropometric measurements (BMI (N=12 830); triceps, subscapular, fat mass and fat mass index (N=11,845-11,901); biceps, suprailiac skinfolds, sum of skinfolds calculated from skinfold thicknesses (N=8129-8205), neck circumference (N=12,241); waist circumference and waist-to-height ratio (N=12,381)) were analysed stratified by sex and smoothed 1st, 3rd, 10th, 25th, 50th, 75th, 90th, 97th and 99th percentile curves were calculated using GAMLSS. Percentile values of the most important anthropometric measures related to the degree of adiposity are depicted for European girls and boys. Age- and sex-specific differences were investigated for all measures. As an example, the 50th and 99th percentile values of waist circumference ranged from 50.7-59.2 cm and from 51.3-58.7 cm in 4.5- to <5.0-year-old girls and boys, respectively, to 60.6-74.5 cm in girls and to 59.9-76.7 cm in boys at the age of 10.5-10.9 years. The presented percentile curves may aid a differentiated assessment of total and abdominal adiposity in European children.
International Nuclear Information System (INIS)
Frahm, K M; Shepelyansky, D L; Chepelianskii, A D
2012-01-01
We set up a directed network tracing links from a given integer to its divisors and analyze the properties of the Google matrix of this network. The PageRank vector of this matrix is computed numerically and it is shown that its probability is approximately inversely proportional to the PageRank index thus being similar to the Zipf law and the dependence established for the World Wide Web. The spectrum of the Google matrix of integers is characterized by a large gap and a relatively small number of nonzero eigenvalues. A simple semi-analytical expression for the PageRank of integers is derived that allows us to find this vector for matrices of billion size. This network provides a new PageRank order of integers. (paper)
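A small-scale version of this construction can be sketched in Python with NumPy. Several choices here are assumptions, not details from the abstract: links run from each integer to its divisors other than 1 and itself, integers without such divisors are treated as dangling nodes with uniform columns, the damping factor is 0.85, and the PageRank vector is found by plain power iteration rather than the semi-analytical expression the authors derive.

```python
import numpy as np

def divisor_google_matrix(n_max, alpha=0.85):
    """Column-stochastic Google matrix for the integers 1..n_max."""
    n = n_max
    A = np.zeros((n, n))
    for m in range(1, n + 1):
        for d in range(2, m // 2 + 1):
            if m % d == 0:
                A[d - 1, m - 1] = 1.0  # link from m to its divisor d
    col_sums = A.sum(axis=0)
    has_links = col_sums > 0
    # Normalise columns with links; dangling columns become uniform
    S = np.where(has_links, A / np.where(has_links, col_sums, 1.0), 1.0 / n)
    return alpha * S + (1.0 - alpha) / n

def pagerank(G, tol=1e-12, max_iter=1000):
    """Power iteration for the leading eigenvector of G."""
    n = G.shape[0]
    p = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        p_next = G @ p
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

G = divisor_google_matrix(100)
p = pagerank(G)
order = np.argsort(-p) + 1  # integers sorted by decreasing PageRank
```

A dense matrix is only practical for small `n_max`; the billion-size matrices mentioned in the abstract would need a sparse or analytical approach.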
Ranking in evolving complex networks
Liao, Hao; Mariani, Manuel Sebastian; Medo, Matúš; Zhang, Yi-Cheng; Zhou, Ming-Yang
2017-05-01
Complex networks have emerged as a simple yet powerful framework to represent and analyze a wide range of complex systems. The problem of ranking the nodes and the edges in complex networks is critical for a broad range of real-world problems because it affects how we access online information and products, how success and talent are evaluated in human activities, and how scarce resources are allocated by companies and policymakers, among others. This calls for a deep understanding of how existing ranking algorithms perform, and which are their possible biases that may impair their effectiveness. Many popular ranking algorithms (such as Google's PageRank) are static in nature and, as a consequence, they exhibit important shortcomings when applied to real networks that rapidly evolve in time. At the same time, recent advances in the understanding and modeling of evolving networks have enabled the development of a wide and diverse range of ranking algorithms that take the temporal dimension into account. The aim of this review is to survey the existing ranking algorithms, both static and time-aware, and their applications to evolving networks. We emphasize both the impact of network evolution on well-established static algorithms and the benefits from including the temporal dimension for tasks such as prediction of network traffic, prediction of future links, and identification of significant nodes.
RANK and RANK ligand expression in primary human osteosarcoma
Directory of Open Access Journals (Sweden)
Daniel Branstetter
2015-09-01
Our results demonstrate RANKL expression was observed in the tumor element in 68% of human OS using IHC. However, the staining intensity was relatively low and only 37% (29/79) of samples exhibited ≥10% RANKL positive tumor cells. RANK expression was not observed in OS tumor cells. In contrast, RANK expression was clearly observed in other cells within OS samples, including the myeloid osteoclast precursor compartment, osteoclasts and in giant osteoclast cells. The intensity and frequency of RANKL and RANK staining in OS samples were substantially less than that observed in GCTB samples. The observation that RANKL is expressed in OS cells themselves suggests that these tumors may mediate an osteoclastic response, and anti-RANKL therapy may potentially be protective against bone pathologies in OS. However, the absence of RANK expression in primary human OS cells suggests that any autocrine RANKL/RANK signaling in human OS tumor cells is not operative, and anti-RANKL therapy would not directly affect the tumor.
Two Phase Analysis of Ski Schools Customer Satisfaction: Multivariate Ranking and Cub Models
Directory of Open Access Journals (Sweden)
Rosa Arboretti
2014-06-01
Monitoring tourists' opinions is an important issue also for companies providing sport services. The aim of this paper was to apply CUB models and nonparametric permutation methods to a large customer satisfaction survey performed in 2011 in the ski schools of Alto Adige (Italy). The two-phase data processing was mainly aimed to: establish a global ranking of a sample of five ski schools, on the basis of satisfaction scores for several specific service aspects; estimate specific components of the respondents’ evaluation process (feeling and uncertainty); and detect whether customers’ characteristics affected these two components. With the application of NPC-Global ranking we obtained a ranking of the evaluated ski schools simultaneously considering satisfaction scores for several service aspects. CUB models showed which aspects and subgroups were less satisfied, giving tips on how to improve services and customer satisfaction.
Physical Fitness Percentiles of German Children Aged 9-12 Years: Findings from a Longitudinal Study.
Directory of Open Access Journals (Sweden)
Kathleen Golle
Generating percentile values is helpful for the identification of children with specific fitness characteristics (i.e., low or high fitness level) to set appropriate fitness goals (i.e., fitness/health promotion and/or long-term youth athlete development). Thus, the aim of this longitudinal study was to assess physical fitness development in healthy children aged 9-12 years and to compute sex- and age-specific percentile values. Two-hundred and forty children (88 girls, 152 boys) participated in this study and were tested for their physical fitness. Physical fitness was assessed using the 50-m sprint test (i.e., speed), the 1-kg ball push test, the triple hop test (i.e., upper- and lower-extremity muscular power), the stand-and-reach test (i.e., flexibility), the star run test (i.e., agility), and the 9-min run test (i.e., endurance). Age- and sex-specific percentile values (i.e., P10 to P90) were generated using the Lambda, Mu, and Sigma method. Adjusted (for change in body weight, height, and baseline performance) age- and sex-differences as well as the interactions thereof were expressed by calculating effect sizes (Cohen's d). Significant main effects of Age were detected for all physical fitness tests (d = 0.40-1.34), whereas significant main effects of Sex were found for upper-extremity muscular power (d = 0.55), flexibility (d = 0.81), agility (d = 0.44), and endurance (d = 0.32) only. Further, significant Sex by Age interactions were observed for upper-extremity muscular power (d = 0.36), flexibility (d = 0.61), and agility (d = 0.27) in favor of girls. Both linear and curvilinear shaped curves were found for percentile values across the fitness tests. Accelerated (curvilinear) improvements were observed for upper-extremity muscular power (boys: 10-11 yrs; girls: 9-11 yrs), agility (boys: 9-10 yrs; girls: 9-11 yrs), and endurance (boys: 9-10 yrs; girls: 9-10 yrs). Tabulated percentiles for the 9-min run test indicated that running distances between 1
85th Percentile Speed Prediction Model For Bode Saadu–Jebba Road In Kwara State, Nigeria
Directory of Open Access Journals (Sweden)
O. O. Joseph
2014-01-01
Among the roadway elements, horizontal alignment has long been recognized as having a significant effect on vehicle speeds. Unexpectedly tight horizontal curves can lead to accidents as drivers try to negotiate them at too high a speed. Design features, such as curvature and superelevation, are directly related to, and vary appreciably with, design speed, while the 85th percentile speed of light vehicles (i.e., the speed exceeded by only 15% of the vehicles) is commonly used as a basis for the design. A reconnaissance survey of the area was carried out; geometric data were extracted from the working drawing, while instantaneous speeds were measured manually using a stopwatch at selected locations. Regression analysis of the extracted geometric data and the evaluated 85th percentile speeds was performed using the Statistical Package for Social Sciences (SPSS) to formulate a simple mathematical model for operating speed. The correlation coefficient (R) and coefficient of determination (R2) obtained were 85.2% and 72.6%, respectively. In the reported model for 85th percentile vehicular speed, the regression parameters were statistically significant for the variables, and a relationship exists among all the parameters, since both outputs show an R-squared value of more than 50%. It can be concluded that the operating speed of the vehicle depends on the radius (R), length of curve (Lc), tangent length (TL), gradient (G), and superelevation (e). It is recommended that the model equation obtained be used to obtain design speed in the study location and similar road networks, while a large national research effort aimed at developing speed models for the Nigerian situation should be conducted; the results will aid in the development of consistent design and traffic control
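The two computations this abstract relies on, an 85th percentile of spot speeds and a least-squares model of that percentile against road geometry, can be sketched in Python with NumPy. All numbers below are hypothetical and are not data from the study, which used SPSS. The reported values of 85.2% and 72.6% are consistent with a correlation coefficient and its square, since 0.852² ≈ 0.726; the sketch checks the same identity on its own fit.

```python
import numpy as np

# Hypothetical spot speeds (km/h) observed at one site
speeds = np.array([52, 58, 61, 63, 64, 66, 67, 69, 70, 72, 74, 75, 78, 81, 85],
                  dtype=float)
v85 = np.percentile(speeds, 85)  # speed exceeded by only 15% of vehicles

# Hypothetical site-level data: curve radius (m), gradient (%), observed V85 (km/h)
radius = np.array([150, 200, 250, 300, 400, 500, 650, 800], dtype=float)
gradient = np.array([4.0, 3.5, 3.0, 2.0, 2.5, 1.5, 1.0, 0.5])
v85_sites = np.array([55, 60, 63, 68, 71, 76, 80, 84], dtype=float)

# Ordinary least squares with an intercept: V85 = b0 + b1*radius + b2*gradient
X = np.column_stack([np.ones_like(radius), radius, gradient])
coef, *_ = np.linalg.lstsq(X, v85_sites, rcond=None)
pred = X @ coef

ss_res = np.sum((v85_sites - pred) ** 2)
ss_tot = np.sum((v85_sites - v85_sites.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot                # coefficient of determination
r = np.corrcoef(v85_sites, pred)[0, 1]    # multiple correlation coefficient
```

For a least-squares fit that includes an intercept, `r ** 2` equals `r2`, which is why the two statistics are normally reported as a pair.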
Combined parametric-nonparametric identification of block-oriented systems
Mzyk, Grzegorz
2014-01-01
This book considers a problem of block-oriented nonlinear dynamic system identification in the presence of random disturbances. This class of systems includes various interconnections of linear dynamic blocks and static nonlinear elements, e.g., Hammerstein system, Wiener system, Wiener-Hammerstein ("sandwich") system and additive NARMAX systems with feedback. Interconnecting signals are not accessible for measurement. The combined parametric-nonparametric algorithms proposed in the book can be selected depending on the prior knowledge of the system and signals. Most of them are based on the decomposition of the complex system identification task into simpler local sub-problems by using non-parametric (kernel or orthogonal) regression estimation. In the parametric stage, the generalized least squares or the instrumental variables technique is commonly applied to cope with correlated excitations. Limit properties of the algorithms have been shown analytically and illustrated in simple experiments.
Comparing nonparametric Bayesian tree priors for clonal reconstruction of tumors.
Deshwar, Amit G; Vembu, Shankar; Morris, Quaid
2015-01-01
Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular to infer clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, to infer the phylogeny and genotype of major subclonal lineages represented in the population of cancer cells. We also propose new split-merge updates tailored to the subclonal reconstruction problem that improve the mixing time of Markov chains. In comparisons with the tree-structured stick breaking prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor…
Single versus mixture Weibull distributions for nonparametric satellite reliability
International Nuclear Information System (INIS)
Castet, Jean-Francois; Saleh, Joseph H.
2010-01-01
Long recognized as a critical design attribute for space systems, satellite reliability has not yet received the proper attention as limited on-orbit failure data and statistical analyses can be found in the technical literature. To fill this gap, we recently conducted a nonparametric analysis of satellite reliability for 1584 Earth-orbiting satellites launched between January 1990 and October 2008. In this paper, we provide an advanced parametric fit, based on mixture of Weibull distributions, and compare it with the single Weibull distribution model obtained with the Maximum Likelihood Estimation (MLE) method. We demonstrate that both parametric fits are good approximations of the nonparametric satellite reliability, but that the mixture Weibull distribution provides significant accuracy in capturing all the failure trends in the failure data, as evidenced by the analysis of the residuals and their quasi-normal dispersion.
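The two parametric forms compared in this abstract can be written down in a few lines of Python. The sketch below evaluates a single Weibull reliability function and a two-component Weibull mixture. The shape, scale and weight values are hypothetical placeholders, not the fitted satellite parameters, and no fitting (MLE or otherwise) is performed here.

```python
import numpy as np

def weibull_reliability(t, beta, theta):
    """Single Weibull reliability R(t) = exp(-(t/theta)^beta)."""
    t = np.asarray(t, dtype=float)
    return np.exp(-(t / theta) ** beta)

def mixture_reliability(t, weights, betas, thetas):
    """Mixture reliability R(t) = sum_j w_j * exp(-(t/theta_j)^beta_j)."""
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("mixture weights must sum to 1")
    return sum(w * weibull_reliability(t, b, th)
               for w, b, th in zip(weights, betas, thetas))

# Time on orbit in years; all parameter values are hypothetical
t = np.linspace(0.0, 15.0, 16)
single = weibull_reliability(t, beta=0.5, theta=3000.0)
mixed = mixture_reliability(t, weights=[0.95, 0.05],
                            betas=[0.4, 3.0], thetas=[10000.0, 10.0])
```

A mixture lets one component with shape below 1 capture early failures while another with shape above 1 captures wear-out, which a single Weibull cannot represent at the same time.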
Using non-parametric methods in econometric production analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
2012-01-01
Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify a functional form of the production function of which the Cobb... parameter estimates, but also in biased measures which are derived from the parameters, such as elasticities. Therefore, we propose to use non-parametric econometric methods. First, these can be applied to verify the functional form used in parametric production analysis. Second, they can be directly used... by investigating the relationship between the elasticity of scale and the farm size. We use a balanced panel data set of 371 specialised crop farms for the years 2004-2007. A non-parametric specification test shows that neither the Cobb-Douglas function nor the Translog function is consistent with the "true...
Nonparametric Regression Estimation for Multivariate Null Recurrent Processes
Directory of Open Access Journals (Sweden)
Biqing Cai
2015-04-01
This paper discusses nonparametric kernel regression with the regressor being a \(d\)-dimensional \(\beta\)-null recurrent process in presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate \(\sqrt{n(T)h^{d}}\), where \(n(T)\) is the number of regenerations for a \(\beta\)-null recurrent process and the limiting distribution (with proper normalization) is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross validation method is used for bandwidth selection. We apply the proposed method to study the relationship of Federal funds rate with 3-month and 5-year T-bill rates and discover the existence of nonlinearity of the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than the linear model.
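Kernel regression with leave-one-out cross-validated bandwidth selection, as mentioned in this abstract, can be sketched in Python with NumPy. This is a deliberately simplified setting: a univariate regressor, independent simulated data, a Gaussian kernel and a Nadaraya-Watson estimator are all assumptions for illustration. The paper's multivariate null recurrent setting and its two-step volatility estimator are not reproduced.

```python
import numpy as np

def nw_estimate(x_train, y_train, x_eval, h):
    """Nadaraya-Watson estimate of E[y | x] at x_eval with bandwidth h."""
    u = (x_eval[:, None] - x_train[None, :]) / h
    w = np.exp(-0.5 * u ** 2)  # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)

def loo_cv_score(x, y, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    u = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u ** 2)
    np.fill_diagonal(w, 0.0)  # drop each point from its own prediction
    pred = (w @ y) / w.sum(axis=1)
    return np.mean((y - pred) ** 2)

# Simulated data (hypothetical): nonlinear mean plus noise
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, 200)
y = np.sin(2.0 * x) + 0.2 * rng.standard_normal(200)

grid = np.linspace(0.05, 1.0, 20)  # candidate bandwidths
scores = np.array([loo_cv_score(x, y, h) for h in grid])
h_best = grid[np.argmin(scores)]
fit_at_zero = nw_estimate(x, y, np.array([0.0]), h_best)
```

Zeroing the diagonal of the weight matrix gives all leave-one-out predictions in one pass, so the search over bandwidths stays cheap for moderate sample sizes.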
Nonparametric instrumental regression with non-convex constraints
Grasmair, M.; Scherzer, O.; Vanhems, A.
2013-03-01
This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition.
Nonparametric instrumental regression with non-convex constraints
International Nuclear Information System (INIS)
Grasmair, M; Scherzer, O; Vanhems, A
2013-01-01
This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition. (paper)
Seismic Signal Compression Using Nonparametric Bayesian Dictionary Learning via Clustering
Directory of Open Access Journals (Sweden)
Xin Tian
2017-06-01
We introduce a seismic signal compression method based on a nonparametric Bayesian dictionary learning method via clustering. The seismic data is compressed patch by patch, and the dictionary is learned online. Clustering is introduced for dictionary learning. A set of dictionaries could be generated, and each dictionary is used for one cluster’s sparse coding. In this way, the signals in one cluster could be well represented by their corresponding dictionaries. A nonparametric Bayesian dictionary learning method is used to learn the dictionaries, which naturally infers an appropriate dictionary size for each cluster. A uniform quantizer and an adaptive arithmetic coding algorithm are adopted to code the sparse coefficients. With comparisons to other state-of-the-art approaches, the effectiveness of the proposed method could be validated in the experiments.
Nonparametric Identification of Dynamic Games with Discrete and Continuous Choices
Jason R. Blevins
2010-01-01
This paper shows that the payoff functions in a class of dynamic games of incomplete information are nonparametrically identified under standard assumptions currently used in applied work. Models of this kind are prevalent in empirical industrial organization where, for example, firms in oligopolistic industries make discrete entry and exit decisions followed by continuous investment or pricing decisions. We also provide results for single-agent models, a leading special case which is commonl...
Nonparametric Bayesian models through probit stick-breaking processes.
Rodríguez, Abel; Dunson, David B
2011-03-01
We describe a novel class of Bayesian nonparametric priors based on stick-breaking constructions where the weights of the process are constructed as probit transformations of normal random variables. We show that these priors are extremely flexible, allowing us to generate a great variety of models while preserving computational simplicity. Particular emphasis is placed on the construction of rich temporal and spatial processes, which are applied to two problems in finance and ecology.
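The construction described above can be illustrated with a minimal sketch. It assumes the usual stick-breaking form, in which the k-th weight is the probit transform of a normal draw multiplied by the stick left over from earlier atoms. The function names, the truncation to a finite number of atoms, and the default mean and standard deviation are assumptions made for this illustration and are not taken from the paper.

```python
import math
import random


def normal_cdf(x):
    """Standard normal CDF, i.e. the inverse probit link, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def probit_stick_breaking_weights(n_atoms, mu=0.0, sigma=1.0, rng=None):
    """Draw truncated weights w_k = Phi(a_k) * prod_{l<k} (1 - Phi(a_l)).

    Each a_k is drawn from Normal(mu, sigma^2). The last atom receives the
    leftover stick so that the truncated weights sum to one; this truncation
    is a choice made for the sketch.
    """
    rng = rng or random.Random()
    weights = []
    remaining = 1.0  # length of stick not yet assigned
    for _ in range(n_atoms - 1):
        v = normal_cdf(rng.gauss(mu, sigma))  # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


weights = probit_stick_breaking_weights(10, rng=random.Random(0))
```

Replacing the constant `mu` with a function of time or location is what would make the weights vary over a temporal or spatial index; that extension is not shown here.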
AN EFFECTIVE TECHNIQUE OF MULTIPLE IMPUTATION IN NONPARAMETRIC QUANTILE REGRESSION
Yanan Hu; Qianqian Zhu; Maozai Tian
2014-01-01
In this study, we consider the nonparametric quantile regression model with covariates missing at random (MAR). Multiple imputation is becoming an increasingly popular approach for analyzing missing data, but its combination with quantile regression is not well developed. We propose an effective and accurate two-stage multiple imputation method for the model based on the quantile regression, which consists of initial imputation in the first stage and multiple imputation in the second stage. Th...
Non-parametric versus parametric methods in environmental sciences
Directory of Open Access Journals (Sweden)
Muhammad Riaz
2017-06-01
This report highlights the importance of considering the background assumptions required for the analysis of real datasets in different disciplines. We provide a comparative discussion of parametric methods (which depend on distributional assumptions, such as normality) relative to non-parametric methods (which are free from many distributional assumptions). We have chosen a real dataset from environmental sciences (one of the application areas). The findings may be extended to other disciplines in the same spirit.
Nonparametric Estimation of Information-Based Measures of Statistical Dispersion
Czech Academy of Sciences Publication Activity Database
Košťál, Lubomír; Pokora, Ondřej
2012-01-01
Vol. 14, No. 7 (2012), pp. 1221-1233. ISSN 1099-4300. R&D Projects: GA ČR(CZ) GAP103/11/0282; GA ČR(CZ) GBP304/12/G069; GA ČR(CZ) GPP103/12/P558. Institutional support: RVO:67985823. Keywords: statistical dispersion; entropy; Fisher information; nonparametric density estimation; neuronal activity. Subject RIV: FH - Neurology. Impact factor: 1.347 (2012).
A Novel Nonparametric Distance Estimator for Densities with Error Bounds
Directory of Open Access Journals (Sweden)
Alexandre R.F. Carvalho
2013-05-01
The use of a metric to assess distance between probability densities is an important practical problem. In this work, a particular metric induced by an α-divergence is studied. The Hellinger metric can be interpreted as a particular case within the framework of generalized Tsallis divergences and entropies. The nonparametric Parzen density estimator emerges as a natural candidate to estimate the underlying probability density function, since it may account for data from different groups, or experiments with distinct instrumental precisions, i.e., non-independent and identically distributed (non-i.i.d.) data. However, the information-theoretic metric derived from the nonparametric Parzen density estimator displays infinite variance, limiting the direct use of resampling estimators. Based on measure theory, we present a change of measure to build a finite-variance density allowing the use of resampling estimators. In order to counteract the poor scaling with dimension, we propose a new nonparametric two-stage robust resampling estimator of Hellinger's metric error bounds for heteroscedastic data. The approach presents very promising results, allowing the use of different covariances for different clusters with impact on the distance evaluation.
Gilstad-Hayden, Kathryn; Carroll-Scott, Amy; Rosenthal, Lisa; Peters, Susan M.; McCaslin, Catherine; Ickovics, Jeannette R.
2015-01-01
BACKGROUND: Schools are an important environmental context in children’s lives and are part of the complex web of factors that contribute to childhood obesity. Increasingly, attention has been placed on the importance of school climate (connectedness, academic standards, engagement, and student autonomy) as 1 domain of school environment beyond health policies and education that may have implications for student health outcomes. The purpose of this study is to examine the association of school climate with body mass index (BMI) among urban preadolescents. METHODS: Health surveys and physical measures were collected among fifth- and sixth-grade students from 12 randomly selected public schools in a small New England city. School climate surveys were completed district-wide by students and teachers. Hierarchical linear modeling was used to test the association between students’ BMI and schools’ climate scores. RESULTS: After controlling for potentially confounding individual-level characteristics, a 1-unit increase in school climate score (indicating more positive climate) was associated with a 7-point decrease in students’ BMI percentile. CONCLUSIONS: Positive school climate is associated with lower student BMI percentile. More research is needed to understand the mechanisms behind this relationship and to explore whether interventions promoting positive school climate can effectively prevent and/or reduce obesity. PMID: 25040118
Tan, K. L.; Chong, Z. L.; Khoo, M. B. C.; Teoh, W. L.; Teh, S. Y.
2017-09-01
Quality control is crucial in a wide variety of fields, as it can help to satisfy customers’ needs and requirements by improving products and services to a superior quality level. The EWMA median chart was proposed as a useful alternative to the EWMA $\bar{X}$ chart because the median-type chart is robust against contamination, outliers or small deviations from the normality assumption compared to the traditional $\bar{X}$-type chart. To provide a complete understanding of the run-length distribution, the percentiles of the run-length distribution should be investigated rather than depending solely on the average run length (ARL) performance measure. This is because interpretation based on the ARL alone can be misleading: the skewness and shape of the run-length distribution change with the process mean shift, varying from almost symmetric when the magnitude of the mean shift is large to highly right-skewed when the process is in-control (IC) or slightly out-of-control (OOC). Before computing the percentiles of the run-length distribution, optimal parameters of the EWMA median chart will be obtained by minimizing the OOC ARL, while retaining the IC ARL at a desired value.
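The contrast between ARL and run-length percentiles can be illustrated with a small Monte Carlo sketch of an EWMA chart applied to sample medians. The smoothing constant `lam`, the control-limit half-width `half_width`, the subgroup size `n`, and the use of simulation in place of an exact run-length computation are all assumptions made for illustration; they are not the optimal parameters or the method of the paper.

```python
import math
import random
import statistics


def run_length(shift, lam=0.2, half_width=0.5, n=5, rng=None, max_len=100_000):
    """Number of subgroups until the EWMA of sample medians leaves +/- half_width.

    Observations are Normal(shift, 1); the in-control target is 0.
    """
    rng = rng or random.Random()
    z = 0.0  # EWMA statistic, started at the in-control target
    for t in range(1, max_len + 1):
        median = statistics.median([rng.gauss(shift, 1.0) for _ in range(n)])
        z = lam * median + (1.0 - lam) * z
        if abs(z) > half_width:
            return t
    return max_len  # censored at max_len if no signal occurred


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list."""
    k = max(1, math.ceil(p / 100.0 * len(sorted_values)))
    return sorted_values[k - 1]


def run_length_percentiles(shift, n_runs=300, seed=0):
    """Simulate n_runs run lengths and summarise them by selected percentiles."""
    rng = random.Random(seed)
    lengths = sorted(run_length(shift, rng=rng) for _ in range(n_runs))
    return {p: percentile(lengths, p) for p in (5, 25, 50, 75, 95)}


in_control = run_length_percentiles(shift=0.0)
out_of_control = run_length_percentiles(shift=1.0)
```

Comparing the spread between the 5th and 95th percentiles in `in_control` with the spread in `out_of_control` shows the kind of information about the shape of the run-length distribution that a single ARL value does not convey.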
Ranking species in mutualistic networks
Domínguez-García, Virginia; Muñoz, Miguel A.
2015-02-01
Understanding the architectural subtleties of ecological networks, believed to confer on them enhanced stability and robustness, is a subject of utmost relevance. Mutualistic interactions have been profusely studied and their corresponding bipartite networks, such as plant-pollinator networks, have been reported to exhibit a characteristic "nested" structure. Assessing the importance of any given species in mutualistic networks is a key task when evaluating extinction risks and possible cascade effects. Inspired by a recently introduced algorithm (similar in spirit to Google's PageRank but with a built-in non-linearity), here we propose a method which, by exploiting their nested architecture, allows us to derive a sound ranking of species importance in mutualistic networks. This method clearly outperforms other existing ranking schemes and can become very useful for ecosystem management and biodiversity preservation, where decisions on what aspects of ecosystems to explicitly protect need to be made.
Subtracting a best rank-1 approximation may increase tensor rank
Stegeman, Alwin; Comon, Pierre
2010-01-01
It has been shown that a best rank-R approximation of an order-k tensor may not exist when R >= 2 and k >= 3. This poses a serious problem to data analysts using tensor decompositions. It has been observed numerically that, generally, this issue cannot be solved by consecutively computing and
Evaluation of world's largest social welfare scheme: An assessment using non-parametric approach.
Singh, Sanjeet
2016-08-01
The Mahatma Gandhi National Rural Employment Guarantee Act (MGNREGA) is the world's largest social welfare scheme, run in India for poverty alleviation through rural employment generation. This paper aims to evaluate and rank the performance of the states in India under the MGNREGA scheme. A non-parametric approach, Data Envelopment Analysis (DEA), is used to calculate the overall technical, pure technical, and scale efficiencies of states in India. The sample data are drawn from the annual official reports published by the Ministry of Rural Development, Government of India. Based on three selected input parameters (expenditure indicators) and five output parameters (employment generation indicators), I apply both input- and output-oriented DEA models to estimate how well the states utilize their resources and generate outputs during the financial year 2013-14. The relative performance evaluation has been made under the assumption of constant returns and also under variable returns to scale to assess the impact of scale on performance. The results indicate that the main sources of inefficiency are both the technical and the managerial practices adopted. Eleven states are overall technically efficient and operate at the optimum scale, whereas 18 states are pure technically or managerially efficient. It has been found that for some states it is necessary to alter the scheme size to perform on par with the best performing states. For inefficient states, optimal input and output targets along with the resource savings and output gains are calculated. Analysis shows that if all inefficient states operated at optimal input and output levels, on average 17.89% of total expenditure and a total amount of $780 million could have been saved in a single year. Most of the inefficient states perform poorly when it comes to the participation of women and disadvantaged sections (SC&ST) in the scheme. In order to catch up with the performance of best performing states, inefficient states on average need to enhance
University rankings in computer science
DEFF Research Database (Denmark)
Ehret, Philip; Zuccala, Alesia Ann; Gipp, Bela
2017-01-01
This is a research-in-progress paper concerning two types of institutional rankings, the Leiden and QS World ranking, and their relationship to a list of universities’ ‘geo-based’ impact scores, and Computing Research and Education Conference (CORE) participation scores in the field of computer...... science. A ‘geo-based’ impact measure examines the geographical distribution of incoming citations to a particular university’s journal articles for a specific period of time. It takes into account both the number of citations and the geographical variability in these citations. The CORE participation...
Let Us Rank Journalism Programs
Weber, Joseph
2014-01-01
Unlike law, business, and medical schools, as well as universities in general, journalism schools and journalism programs have rarely been ranked. Publishers such as "U.S. News & World Report," "Forbes," "Bloomberg Businessweek," and "Washington Monthly" do not pay them much mind. What is the best…
Van rankings naar multidimensionale classificatieaanpak [From rankings to a multidimensional classification approach]
van Vught, Franciscus A.; Vossensteyn, Johan J.
2009-01-01
The new Times Higher Education – QS World Universities Ranking was recently published. Dutch universities are doing well, and that is big news. Whether we like it or not, whether we applaud or abhor the methodology, we all actually know that the international
Measuring and Ranking Value Drivers
M.M. Akalu
2002-01-01
Analysis of the strength of value drivers is crucial to understanding their influence in the process of free cash flow generation. The paper addresses the issue of value driver measurement and ranking. The research reveals that value drivers have a similar pattern across industries.
On Rank Driven Dynamical Systems
Veerman, J. J. P.; Prieto, F. J.
2014-08-01
We investigate a class of models related to the Bak-Sneppen (BS) model, initially proposed to study evolution. The BS model is extremely simple and yet captures some forms of "complex behavior" such as self-organized criticality that is often observed in physical and biological systems. In this model, random fitnesses are associated with agents located at the vertices of a graph. Their fitnesses are ranked from worst (0) to best (1). At every time-step the agent with the worst fitness and some others with a priori given rank probabilities are replaced by new agents with random fitnesses. We consider two cases: the exogenous case, where the new fitnesses are taken from an a priori fixed distribution, and the endogenous case, where the new fitnesses are taken from the current distribution as it evolves. We approximate the dynamics by making a simplifying independence assumption. We use Order Statistics and Dynamical Systems to define a rank-driven dynamical system that approximates the evolution of the distribution of the fitnesses in these rank-driven models, as well as in the BS model. For this simplified model we can find the limiting marginal distribution as a function of the initial conditions. Agreement with experimental results of the BS model is excellent.
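The replacement rule can be made concrete with a minimal simulation sketch of the classic Bak-Sneppen model on a ring, which corresponds to the exogenous case with uniform replacement fitnesses. The ring topology, the choice to replace the two nearest neighbours of the worst agent, and all names and default values are assumptions made for this illustration; the rank-driven models of the paper replace agents according to given rank probabilities instead.

```python
import random


def bak_sneppen_step(fitness, rng):
    """Replace the least-fit agent and its two ring neighbours with fresh fitnesses.

    New fitnesses are drawn from a fixed Uniform(0, 1) distribution
    (the exogenous case). Returns the index of the agent that was worst.
    """
    n = len(fitness)
    worst = min(range(n), key=fitness.__getitem__)
    # worst - 1 may be -1, which Python resolves to the last agent on the ring.
    for idx in (worst - 1, worst, (worst + 1) % n):
        fitness[idx] = rng.random()
    return worst


def simulate(n_agents=50, n_steps=5000, seed=0):
    """Run the model from uniform initial fitnesses and return the final state."""
    rng = random.Random(seed)
    fitness = [rng.random() for _ in range(n_agents)]
    for _ in range(n_steps):
        bak_sneppen_step(fitness, rng)
    return fitness


final_fitness = simulate()
```

A histogram of `final_fitness` over many runs would give an empirical marginal distribution of the kind that the rank-driven dynamical system is meant to approximate.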
Multiple graph regularized protein domain ranking
Wang, Jim Jing-Yan
2012-11-19
Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. © 2012 Wang et al; licensee BioMed Central Ltd.
The Globalization of College and University Rankings
Altbach, Philip G.
2012-01-01
In the era of globalization, accountability, and benchmarking, university rankings have achieved a kind of iconic status. The major ones--the Academic Ranking of World Universities (ARWU, or the "Shanghai rankings"), the QS (Quacquarelli Symonds Limited) World University Rankings, and the "Times Higher Education" World…
Multiple graph regularized protein domain ranking.
Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin
2012-11-19
Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.
Multiple graph regularized protein domain ranking
Directory of Open Access Journals (Sweden)
Wang Jim
2012-11-01
Background: Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results: To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion: The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.
A Survey on PageRank Computing
Berkhin, Pavel
2005-01-01
This survey reviews the research related to PageRank computing. Components of a PageRank vector serve as authority weights for web pages independent of their textual content, solely based on the hyperlink structure of the web. PageRank is typically used as a web search ranking component. This defines the importance of the model and the data structures that underlie PageRank processing. Computing even a single PageRank is a difficult computational task. Computing many PageRanks is a much mor...
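As a minimal illustration of how a PageRank vector is derived from hyperlink structure alone, the sketch below applies plain power iteration to a toy graph. The damping factor of 0.85, the uniform redistribution of rank from pages without outgoing links, the stopping tolerance, and the toy graph itself are assumptions made for this example; the survey covers far more scalable schemes.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration on a small graph given as {page: [pages it links to]}.

    Rank held by pages without outgoing links (dangling pages) is spread
    uniformly over all pages, so the scores always sum to one.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            for t in targets:
                # Each page passes an equal share of its rank along every link.
                new_rank[t] += damping * rank[v] / len(targets)
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
```

In this toy graph, page `c` collects links from three pages and ends up with the largest score, while `d`, which nobody links to, keeps only the baseline share.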
Time evolution of Wikipedia network ranking
Eom, Young-Ho; Frahm, Klaus M.; Benczúr, András; Shepelyansky, Dima L.
2013-12-01
We study the time evolution of ranking and spectral properties of the Google matrix of the English Wikipedia hyperlink network during the years 2003-2011. The statistical properties of the ranking of Wikipedia articles via PageRank and CheiRank probabilities, as well as the matrix spectrum, are shown to be stabilized for 2007-2011. Special emphasis is placed on the ranking of Wikipedia personalities and universities. We show that PageRank selection is dominated by politicians, while 2DRank, which combines PageRank and CheiRank, gives more weight to personalities from the arts. The Wikipedia PageRank of universities recovers 80% of the top universities of the Shanghai ranking during the considered time period.
A Nonparametric Bayesian Approach For Emission Tomography Reconstruction
International Nuclear Information System (INIS)
Barat, Eric; Dautremer, Thomas
2007-01-01
We introduce a PET reconstruction algorithm following a nonparametric Bayesian (NPB) approach. In contrast with Expectation Maximization (EM), the proposed technique does not rely on any space discretization. Namely, the activity distribution (the normalized emission intensity of the spatial Poisson process) is considered as a spatial probability density, and observations are the projections of random emissions whose distribution has to be estimated. This approach is nonparametric in the sense that the quantity of interest belongs to the set of probability measures on $\mathbb{R}^k$ (for reconstruction in k dimensions), and it is Bayesian in the sense that we define a prior directly on this spatial measure. In this context, we propose to model the nonparametric probability density as an infinite mixture of multivariate normal distributions. As a prior for this mixture we consider a Dirichlet Process Mixture (DPM) with a Normal-Inverse Wishart (NIW) model as base distribution of the Dirichlet Process. As in EM-family reconstruction, we use a data augmentation scheme where the set of hidden variables are the emission locations for each observed line of response in the continuous object space. Thanks to the data augmentation, we propose a Markov Chain Monte Carlo (MCMC) algorithm (Gibbs sampler) which is able to generate draws from the posterior distribution of the spatial intensity. A difference with EM is that one step of the Gibbs sampler corresponds to the generation of emission locations, while only the expected number of emissions per pixel/voxel is used in EM. Another key difference is that the estimated spatial intensity is a continuous function, such that there is no need to compute a projection matrix. Finally, draws from the intensity posterior distribution allow the estimation of posterior functionals like the variance or confidence intervals. Results are presented for simulated data based on a 2D brain phantom and compared to Bayesian MAP-EM
A Bayesian nonparametric approach to causal inference on quantiles.
Xu, Dandan; Daniels, Michael J; Winterstein, Almut G
2018-02-25
We propose a Bayesian nonparametric approach (BNP) for causal inference on quantiles in the presence of many confounders. In particular, we define relevant causal quantities and specify BNP models to avoid bias from restrictive parametric assumptions. We first use Bayesian additive regression trees (BART) to model the propensity score and then construct the distribution of potential outcomes given the propensity score using a Dirichlet process mixture (DPM) of normals model. We thoroughly evaluate the operating characteristics of our approach and compare it to Bayesian and frequentist competitors. We use our approach to answer an important clinical question involving acute kidney injury using electronic health records. © 2018, The International Biometric Society.
Nonparametric Cointegration Analysis of Fractional Systems With Unknown Integration Orders
DEFF Research Database (Denmark)
Nielsen, Morten Ørregaard
2009-01-01
In this paper a nonparametric variance ratio testing approach is proposed for determining the number of cointegrating relations in fractionally integrated systems. The test statistic is easily calculated without prior knowledge of the integration order of the data, the strength of the cointegrating....... The asymptotic distribution theory for the proposed test is non-standard but easily tabulated. Monte Carlo simulations demonstrate excellent finite sample properties, even rivaling those of well-specified parametric tests. The proposed methodology is applied to the term structure of interest rates, where...
Nonparametric statistics a step-by-step approach
Corder, Gregory W
2014-01-01
"…a very useful resource for courses in nonparametric statistics in which the emphasis is on applications rather than on theory. It also deserves a place in libraries of all institutions where introductory statistics courses are taught." -CHOICE. This Second Edition presents a practical and understandable approach that enhances and expands the statistical toolset for readers. This book includes: new coverage of the sign test and the Kolmogorov-Smirnov two-sample test in an effort to offer a logical and natural progression to statistical power; SPSS® (Version 21) software and updated screen ca
Categorical and nonparametric data analysis choosing the best statistical technique
Nussbaum, E Michael
2014-01-01
Featuring in-depth coverage of categorical and nonparametric statistics, this book provides a conceptual framework for choosing the most appropriate type of test in various research scenarios. Class tested at the University of Nevada, the book's clear explanations of the underlying assumptions, computer simulations, and Exploring the Concept boxes help reduce reader anxiety. Problems inspired by actual studies provide meaningful illustrations of the techniques. The underlying assumptions of each test and the factors that impact validity and statistical power are reviewed so readers can explain
Nonparametric likelihood based estimation of linear filters for point processes
DEFF Research Database (Denmark)
Hansen, Niels Richard
2015-01-01
We consider models for multivariate point processes where the intensity is given nonparametrically in terms of functions in a reproducing kernel Hilbert space. The likelihood function involves a time integral and is consequently not given in terms of a finite number of kernel evaluations. The main...... the implementation relies crucially on the use of sparse matrices. As an illustration we consider neuron network modeling, and we use this example to investigate how the computational costs of the approximations depend on the resolution of the time discretization. The implementation is available in the R package...
Using non-parametric methods in econometric production analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify the functional form of the production function. Most often, the Cobb...... results—including measures that are of interest of applied economists, such as elasticities. Therefore, we propose to use nonparametric econometric methods. First, they can be applied to verify the functional form used in parametric estimations of production functions. Second, they can be directly used...
Digital spectral analysis parametric, non-parametric and advanced methods
Castanié, Francis
2013-01-01
Digital Spectral Analysis provides a single source that offers complete coverage of the spectral analysis domain. This self-contained work includes details on advanced topics that are usually presented in scattered sources throughout the literature. The theoretical principles necessary for the understanding of spectral analysis are discussed in the first four chapters: fundamentals, digital signal processing, estimation in spectral analysis, and time-series models. An entire chapter is devoted to the non-parametric methods most widely used in industry. High resolution methods a
Evaluation of Nonparametric Probabilistic Forecasts of Wind Power
DEFF Research Database (Denmark)
Pinson, Pierre; Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg
Predictions of wind power production for horizons up to 48-72 hours ahead are a highly valuable input to the methods for the daily management or trading of wind generation. Today, users of wind power predictions are not only provided with point predictions, which are estimates of the most...... likely outcome for each look-ahead time, but also with uncertainty estimates given by probabilistic forecasts. In order to avoid assumptions on the shape of predictive distributions, these probabilistic predictions are produced from nonparametric methods, and then take the form of a single or a set...
Validating rankings in soccer championships
Directory of Open Access Journals (Sweden)
Annibal Parracho Sant'Anna
2012-08-01
The final ranking of a championship is determined by quality attributes combined with other factors which should be filtered out of any decision on relegation or draft for upper level tournaments. Factors like referees' mistakes and the difficulty of certain matches due to their accidental importance to the opponents should have their influence reduced. This work tests approaches to combine classification rules considering the imprecision of the number of points as a measure of quality and of the variables that provide reliable explanation for it. Two home-advantage variables are tested and shown to be apt to enter as explanatory variables. Independence between the criteria is checked against the hypothesis of maximal correlation. The importance of factors and of composition rules is evaluated on the basis of correlation between rank vectors, number of classes and number of clubs in tail classes. Data from five years of the Brazilian Soccer Championship are analyzed.
Sexual orientation and occupational rank
Ali M Ahmed; Lina Andersson; Mats Hammarstedt
2011-01-01
This paper presents a study of differences in occupational rank between gay and heterosexual males as well as between lesbian and heterosexual females. We estimate different specifications of an ordered probit model on register data from Sweden. Our data consist of married heterosexual men and women and homosexual men and women living in civil unions. We find that homosexual men have a lower probability of working in a profession demanding a longer university education or a management profess...
Iacovacci, Jacopo; Rahmede, Christoph; Arenas, Alex; Bianconi, Ginestra
2016-10-01
Recently it has been recognized that many complex social, technological and biological networks have a multilayer nature and can be described by multiplex networks. Multiplex networks are formed by a set of nodes connected by links having different connotations forming the different layers of the multiplex. Characterizing the centrality of the nodes in a multiplex network is a challenging task since the centrality of the node naturally depends on the importance associated to links of a certain type. Here we propose to assign to each node of a multiplex network a centrality called Functional Multiplex PageRank that is a function of the weights given to every different pattern of connections (multilinks) existent in the multiplex network between any two nodes. Since multilinks distinguish all the possible ways in which the links in different layers can overlap, the Functional Multiplex PageRank can describe important non-linear effects when large relevance or small relevance is assigned to multilinks with overlap. Here we apply the Functional Page Rank to the multiplex airport networks, to the neuronal network of the nematode C. elegans, and to social collaboration and citation networks between scientists. This analysis reveals important differences existing between the most central nodes of these networks, and the correlations between their so-called pattern to success.
A Bayesian nonparametric estimation of distributions and quantiles
International Nuclear Information System (INIS)
Poern, K.
1988-11-01
The report describes a Bayesian, nonparametric method for the estimation of a distribution function and its quantiles. The method, presupposing random sampling, is nonparametric, so the user has to specify a prior distribution on a space of distributions (and not on a parameter space). In the current application, where the method is used to estimate the uncertainty of a parametric calculational model, the Dirichlet prior distribution is to a large extent determined by the first batch of Monte Carlo realizations. In this case the results of the estimation technique are very similar to the conventional empirical distribution function. The resulting posterior distribution is also Dirichlet, and thus facilitates the determination of probability (confidence) intervals at any given point in the space of interest. Another advantage is that the posterior distribution of a specified quantile can also be derived and utilized to determine a probability interval for that quantile. The method was devised for use in the PROPER code package for uncertainty and sensitivity analysis. (orig.)
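As a rough illustration of the interval construction described above, the sketch below assumes a Dirichlet process prior with concentration `alpha` and base distribution function `F0`. Under that assumption the posterior of F(x) at a fixed point x is a Beta distribution. The names `posterior_cdf_interval`, `alpha`, `F0` and the sample values are illustrative only; they are not taken from the report or from the PROPER package.

```python
import random

def posterior_cdf_interval(sample, x, alpha, F0, level=0.90, draws=20000, seed=0):
    """Posterior mean and equal-tailed interval for F(x), assuming F ~ DP(alpha, F0).

    Under this assumption F(x) | data ~ Beta(a, b) with
    a = alpha*F0(x) + k and b = alpha*(1 - F0(x)) + (n - k),
    where k counts the observations <= x. Requires a > 0 and b > 0.
    """
    n = len(sample)
    k = sum(1 for v in sample if v <= x)
    a = alpha * F0(x) + k
    b = alpha * (1.0 - F0(x)) + (n - k)
    mean = a / (a + b)
    # Interval endpoints are estimated by Monte Carlo draws from the Beta posterior.
    rng = random.Random(seed)
    sims = sorted(rng.betavariate(a, b) for _ in range(draws))
    lo = sims[int((1 - level) / 2 * draws)]
    hi = sims[int((1 + level) / 2 * draws) - 1]
    return mean, (lo, hi)

def uniform_cdf(x):
    """Hypothetical prior guess F0: the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)

data = [0.12, 0.35, 0.41, 0.58, 0.77, 0.93]  # made-up sample
mean, (lo, hi) = posterior_cdf_interval(data, x=0.5, alpha=2.0, F0=uniform_cdf)
```

With a small `alpha` the posterior mean stays close to the empirical distribution function, which is consistent with the similarity noted in the abstract.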
1st Conference of the International Society for Nonparametric Statistics
Lahiri, S; Politis, Dimitris
2014-01-01
This volume is composed of peer-reviewed papers that have developed from the First Conference of the International Society for NonParametric Statistics (ISNPS). This inaugural conference took place in Chalkidiki, Greece, June 15-19, 2012. It was organized with the co-sponsorship of the IMS, the ISI, and other organizations. M.G. Akritas, S.N. Lahiri, and D.N. Politis are the first executive committee members of ISNPS, and the editors of this volume. ISNPS has a distinguished Advisory Committee that includes Professors R.Beran, P.Bickel, R. Carroll, D. Cook, P. Hall, R. Johnson, B. Lindsay, E. Parzen, P. Robinson, M. Rosenblatt, G. Roussas, T. SubbaRao, and G. Wahba. The Charting Committee of ISNPS consists of more than 50 prominent researchers from all over the world. The chapters in this volume bring forth recent advances and trends in several areas of nonparametric statistics. In this way, the volume facilitates the exchange of research ideas, promotes collaboration among researchers from all over the wo...
Bayesian nonparametric dictionary learning for compressed sensing MRI.
Huang, Yue; Paisley, John; Lin, Qin; Ding, Xinghao; Fu, Xueyang; Zhang, Xiao-Ping
2014-12-01
We develop a Bayesian nonparametric model for reconstructing magnetic resonance images (MRIs) from highly undersampled k-space data. We perform dictionary learning as part of the image reconstruction process. To this end, we use the beta process as a nonparametric dictionary learning prior for representing an image patch as a sparse combination of dictionary elements. The size of the dictionary and patch-specific sparsity pattern are inferred from the data, in addition to other dictionary learning variables. Dictionary learning is performed directly on the compressed image, and so is tailored to the MRI being considered. In addition, we investigate a total variation penalty term in combination with the dictionary learning model, and show how the denoising property of dictionary learning removes dependence on regularization parameters in the noisy setting. We derive a stochastic optimization algorithm based on Markov chain Monte Carlo for the Bayesian model, and use the alternating direction method of multipliers for efficiently performing total variation minimization. We present empirical results on several MRI, which show that the proposed regularization framework can improve reconstruction accuracy over other methods.
Nonparametric Analyses of Log-Periodic Precursors to Financial Crashes
Zhou, Wei-Xing; Sornette, Didier
We apply two nonparametric methods to further test the hypothesis that log-periodicity characterizes the detrended price trajectory of large financial indices prior to financial crashes or strong corrections. The term "parametric" refers here to the use of the log-periodic power law formula to fit the data; in contrast, "nonparametric" refers to the use of general tools such as Fourier transform, and in the present case the Hilbert transform and the so-called (H, q)-analysis. The analysis using the (H, q)-derivative is applied to seven time series ending with the October 1987 crash, the October 1997 correction and the April 2000 crash of the Dow Jones Industrial Average (DJIA), the Standard & Poor 500 and Nasdaq indices. The Hilbert transform is applied to two detrended price time series in terms of the ln(tc-t) variable, where tc is the time of the crash. Taking all results together, we find strong evidence for a universal fundamental log-frequency f=1.02±0.05 corresponding to the scaling ratio λ=2.67±0.12. These values are in very good agreement with those obtained in earlier works with different parametric techniques. This note is extracted from a long unpublished report with 58 figures available at , which extensively describes the evidence we have accumulated on these seven time series, in particular by presenting all relevant details so that the reader can judge for himself or herself the validity and robustness of the results.
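The correspondence between the log-frequency and the scaling ratio quoted above can be checked with a short calculation. The sketch assumes the usual log-periodic convention λ = exp(1/f), which the abstract does not state explicitly, and uses first-order error propagation for the uncertainty.

```python
import math

f = 1.02       # log-frequency reported in the abstract
f_err = 0.05   # its quoted uncertainty

# Assumed relation between log-frequency and scaling ratio: lambda = exp(1/f)
lam = math.exp(1.0 / f)

# First-order error propagation: |d(lambda)/df| = lambda / f**2
lam_err = lam * f_err / f**2

print(f"lambda = {lam:.2f} +/- {lam_err:.2f}")
```

Under this assumption the central value agrees with the quoted λ = 2.67, and the propagated uncertainty is of the same size as the quoted ±0.12.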
Minkowski metrics in creating universal ranking algorithms
Directory of Open Access Journals (Sweden)
Andrzej Ameljańczyk
2014-06-01
The paper presents a general procedure for creating rankings of a set of objects, in which the preference relation can be based on any ranking function. The analysis of possible ranking functions begins by showing the fundamental drawbacks of the commonly used functions in the form of a weighted sum. As a special case of the ranking procedure in the relation space, a procedure based on the notion of an ideal element and the generalized Minkowski distance from that element is proposed. This procedure, presented as a universal ranking algorithm, eliminates most of the disadvantages of ranking functions in the form of a weighted sum. Keywords: ranking functions, preference relation, ranking clusters, categories, ideal point, universal ranking algorithm
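A minimal sketch of ranking by distance from an ideal element follows. It assumes that every criterion is "larger is better" and already scaled comparably, and that the ideal element is the componentwise maximum over the objects. These choices, the function names, and the scores are illustrative assumptions, not the paper's definitions.

```python
def minkowski_distance(x, ideal, p=2.0, weights=None):
    """Weighted Minkowski distance of order p between x and the ideal element."""
    if weights is None:
        weights = [1.0] * len(x)
    return sum(w * abs(a - b) ** p for w, a, b in zip(weights, x, ideal)) ** (1.0 / p)

def rank_by_ideal_point(objects, p=2.0, weights=None):
    """objects: dict name -> list of criterion scores (larger is better).

    Returns the names ordered from closest to farthest from the ideal element,
    together with the distances.
    """
    names = list(objects)
    n_criteria = len(objects[names[0]])
    # Assumed ideal element: the best observed value on each criterion.
    ideal = [max(objects[n][j] for n in names) for j in range(n_criteria)]
    dist = {n: minkowski_distance(objects[n], ideal, p, weights) for n in names}
    return sorted(names, key=lambda n: dist[n]), dist

scores = {"A": [0.9, 0.2], "B": [0.6, 0.6], "C": [0.2, 0.9]}  # made-up data
order, dist = rank_by_ideal_point(scores, p=2.0)
```

Varying `p` changes how strongly a large shortfall on a single criterion is penalized, which is one way such a procedure can behave differently from a plain weighted sum.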
Relevance ranking for vertical search engines
Chang, Yi
2014-01-01
In plain, uncomplicated language, and using detailed examples to explain the key concepts, models, and algorithms in vertical search ranking, Relevance Ranking for Vertical Search Engines teaches readers how to manipulate ranking algorithms to achieve better results in real-world applications. This reference book for professionals covers concepts and theories from the fundamental to the advanced, such as relevance, query intention, location-based relevance ranking, and cross-property ranking. It covers the most recent developments in vertical search ranking applications, such as freshness-based relevance theory for new search applications, location-based relevance theory for local search applications, and cross-property ranking theory for applications involving multiple verticals. It discusses the state of the art: development of ...
Ranking Support Vector Machine with Kernel Approximation
Directory of Open Access Journals (Sweden)
Kai Chen
2017-01-01
Learning-to-rank algorithms have become important in recent years due to their successful application in information retrieval, recommender systems, computational biology, and so forth. Ranking support vector machine (RankSVM) is one of the state-of-the-art ranking models and has been favorably used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problems. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of the kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. The primal truncated Newton method is used to optimize the pairwise L2-loss (squared hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method trains much faster than kernel RankSVM and achieves comparable or better performance than state-of-the-art ranking algorithms.
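The random Fourier feature idea mentioned above can be sketched in a few lines. The example assumes a Gaussian (RBF) kernel k(x, y) = exp(-gamma * ||x - y||^2); the function names and parameter values are illustrative, and the paper's RankSVM training step (truncated Newton on the pairwise loss) is not shown.

```python
import math
import random

def make_rff(dim, n_features, gamma, seed=0):
    """Build a random Fourier feature map for k(x, y) = exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * gamma)  # frequencies are drawn from N(0, 2*gamma*I)
    W = [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n_features)]
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def transform(x):
        # z_i(x) = sqrt(2/D) * cos(w_i . x + b_i)
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + bi)
                for w, bi in zip(W, b)]
    return transform

def rbf_kernel(x, y, gamma):
    """Exact Gaussian kernel, used here only to check the approximation."""
    return math.exp(-gamma * sum((a - c) ** 2 for a, c in zip(x, y)))

x, y = [0.1, 0.4, -0.3], [0.3, 0.1, 0.2]
phi = make_rff(dim=3, n_features=5000, gamma=0.5)
approx = sum(a * c for a, c in zip(phi(x), phi(y)))  # inner product of feature vectors
exact = rbf_kernel(x, y, gamma=0.5)
```

Because the inner product of the mapped vectors approximates the kernel value, a linear ranking model trained on `phi(x)` can stand in for a kernel model without ever forming the kernel matrix.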
Directory of Open Access Journals (Sweden)
Navid Haghighat
2017-12-01
This paper focuses on evaluating airline service quality from the passengers' perspective. Although much research on airline service quality evaluation has been performed worldwide, little has been conducted in Iran. In this study, a framework for measuring airline service quality in Iran is proposed. After reviewing airline service quality criteria, the SSQAI model was selected because of its comprehensiveness in covering airline service quality dimensions. The SSQAI questionnaire items were redesigned to suit the requirements of Iranian airlines and the circumstances of Iran's economic and cultural context. The study applies fuzzy decision-making theory to account for the possibly fuzzy subjective judgments of the evaluators during airline service quality evaluation. Fuzzy TOPSIS was applied to rank the airlines' service quality performance. Three major Iranian airlines with the largest passenger volumes on domestic and foreign flights were chosen for evaluation. The results showed that Mahan airline achieved the best service quality rank in terms of passenger satisfaction among the three major Iranian airlines; IranAir and Aseman airlines ranked second and third, respectively, according to the passengers' evaluation. Statistical analysis was used to analyze passenger responses. Because the data were not normally distributed, non-parametric tests were applied. To show the airline ranks for each criterion separately, the Friedman test was performed. Analysis of variance and the Tukey test were applied to study the influence of passengers' age and educational level on their satisfaction with the airlines' service quality. The results showed that age has no significant relation to passenger satisfaction with the airlines; however, increasing educational level demonstrated a negative impact on
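For readers unfamiliar with the Friedman test named in this abstract, the following sketch computes its rank-sum statistic for a table of ratings. The data are made up for illustration and are not from the study; ties receive average ranks, and the tie correction to the statistic is omitted for brevity.

```python
def average_ranks(row):
    """Rank the values in a row from 1 (smallest) upward, averaging ranks over ties."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def friedman_statistic(table):
    """table[i][j]: score given by respondent i to item j. Returns (chi2, rank_sums)."""
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    # Friedman statistic without tie correction
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return chi2, rank_sums

ratings = [[3, 5, 7], [2, 4, 6], [1, 6, 9]]  # 3 hypothetical respondents, 3 airlines
chi2, rank_sums = friedman_statistic(ratings)
```

The statistic is compared with a chi-square distribution with k - 1 degrees of freedom; the per-item rank sums give the ordering of the items.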
Ranking Tehran’s Stock Exchange Top Fifty Stocks Using Fundamental Indexes and Fuzzy TOPSIS
Directory of Open Access Journals (Sweden)
E. S. Saleh
2017-08-01
Investment through the purchase of securities constitutes an important part of a country's economic exchange. Decisions about investing in a particular stock have therefore become one of the most debated areas of economic and financial research, and various institutions have begun to rank company stocks and set stock purchase priorities for investors. By determining the important indexes required for ranking companies based on their share value on the Tehran Stock Exchange, the current research can greatly help in accurately ranking the top fifty listed companies. Initial ranking indicators are extracted, and then a decision-making group (exchange experts) determines the final indexes using the Delphi method and non-parametric statistical methods. Then, using fuzzy ANP, the criteria weights are obtained, taking into account their interactions with each other. Finally, the top fifty listed companies of the Tehran Stock Exchange in 2014 are ranked using fuzzy TOPSIS and information extracted with the software "Rahavard Novin". A sensitivity analysis of the criteria weights and a presentation of the relevant analysis were conducted at the end of the study.
A rank-based statistical test for measuring synergistic effects between two gene sets.
Shiraishi, Yuichi; Okada-Hatakeyama, Mariko; Miyano, Satoru
2011-09-01
Due to recent advances in high-throughput technologies, data on various types of genomic annotation have accumulated. These data will be crucially helpful for elucidating the combinatorial logic of transcription. Although several approaches have been proposed for inferring cooperativity among multiple factors, most approaches are haunted by the issues of normalization and threshold values. In this article, we propose a rank-based non-parametric statistical test for measuring the effects between two gene sets. This method is free from the issues of normalization and threshold value determination for gene expression values. Furthermore, we have proposed an efficient Markov chain Monte Carlo method for calculating an approximate significance value of synergy. We have applied this approach for detecting synergistic combinations of transcription factor binding motifs and histone modifications. C implementation of the method is available from http://www.hgc.jp/~yshira/software/rankSynergy.zip. Contact: yshira@hgc.jp. Supplementary data are available at Bioinformatics online.
Yeung, Hui; Leff, Michelle; Rhee, Kyung E
Breastfeeding is associated with decreased risk of childhood obesity. However, there is a strong correlation between maternal weight status and childhood obesity, and it is unclear whether or not breastfeeding among overweight mothers could mitigate this risk. Our goal was to examine whether or not exclusive breastfeeding (compared to formula feeding) among overweight and obese mothers is associated with lower weight-for-length (W/L) percentile at 1 year. Data from the Infant Feeding Practices II study were used. Infants who were preterm or underweight at 1 year, and mothers who were underweight before pregnancy, were excluded from analysis. There was a significant interaction between exclusive breastfeeding for 4 months and maternal prepregnancy weight status (normal weight, overweight, obese) on infant W/L percentile at 1 year. Stratified linear mixed-effects growth modeling controlling for covariates was created to test the relationship between exclusive breastfeeding and infant W/L percentile within each maternal weight category. A total of 915 subjects met inclusion criteria. Normal weight and obese mothers who exclusively breastfed for 4 months had infants with a smaller rate of increase in W/L percentile during the first year compared with those who used formula. Infants of overweight and obese mothers who exclusively breastfed for 4 months had lower W/L percentile at 1 year than those who used formula. Exclusive breastfeeding for 4 months among normal weight and obese mothers resulted in less increase in W/L percentiles in the first year. Obese mothers often have a difficult time initiating and maintaining breastfeeding. Concerted efforts are needed to support this population with breastfeeding.
Development and Evaluation of a Proposed Neck Shield for the 5 Percentile Hybrid III Female Dummy.
Banglmaier, Richard F; Pecoraro, Katie M; Feustel, Jim R; Scherer, Risa D; Rouhana, Stephen W
2005-11-01
Frontal airbag interaction with the head and neck of the Hybrid III family of dummies may involve a non-biofidelic interaction. Researchers have found that the deploying airbag may become entrapped in the hollow cavity behind the dummy chin. This study evaluated a prototype neck shield design, the Flap Neck Shield, for biofidelic response and the ability to prevent airbag entrapment in the chin/jaw cavity. Neck pendulum calibration tests were conducted for biofidelity evaluation. Static and dynamic airbag deployments were conducted to evaluate neck shield performance. Tests showed that the Flap Neck Shield behaved in a biofidelic manner with neck loads and head motion within established biofidelic limits. The Flap Neck Shield did not alter the neck loads during static or dynamic airbag interactions, but it did consistently prevent the airbag from penetrating the chin/jaw cavity. Use of the Flap Neck Shield with the 5th percentile Hybrid III female dummy is recommended for frontal airbag deployments given its acceptable biofidelic response and repeatable performance.
Nonparametric modeling of dynamic functional connectivity in fmri data
DEFF Research Database (Denmark)
Nielsen, Søren Føns Vind; Madsen, Kristoffer H.; Røge, Rasmus
2015-01-01
Dynamic functional connectivity (FC) has in recent years become a topic of interest in the neuroimaging community. Several models and methods exist for both functional magnetic resonance imaging (fMRI) and electroencephalography (EEG), and the results point towards the conclusion that FC exhibits dynamic changes. The existing approaches modeling dynamic connectivity have primarily been based on time-windowing the data and k-means clustering. We propose a nonparametric generative model for dynamic FC in fMRI that does not rely on specifying window lengths and number of dynamic states. Rooted in Bayesian statistical modeling we use the predictive likelihood to investigate if the model can discriminate between a motor task and rest both within and across subjects. We further investigate what drives dynamic states using the model on the entire data collated across subjects and task/rest. We find...
Prior processes and their applications nonparametric Bayesian estimation
Phadia, Eswar G
2016-01-01
This book presents a systematic and comprehensive treatment of various prior processes that have been developed over the past four decades for dealing with Bayesian approach to solving selected nonparametric inference problems. This revised edition has been substantially expanded to reflect the current interest in this area. After an overview of different prior processes, it examines the now pre-eminent Dirichlet process and its variants including hierarchical processes, then addresses new processes such as dependent Dirichlet, local Dirichlet, time-varying and spatial processes, all of which exploit the countable mixture representation of the Dirichlet process. It subsequently discusses various neutral to right type processes, including gamma and extended gamma, beta and beta-Stacy processes, and then describes the Chinese Restaurant, Indian Buffet and infinite gamma-Poisson processes, which prove to be very useful in areas such as machine learning, information retrieval and featural modeling. Tailfree and P...
Nonparametric Facial Feature Localization Using Segment-Based Eigenfeatures
Directory of Open Access Journals (Sweden)
Hyun-Chul Choi
2016-01-01
We present a nonparametric facial feature localization method using relative directional information between regularly sampled image segments and facial feature points. Instead of using any iterative parameter optimization technique or search algorithm, our method finds the location of facial feature points by using a weighted concentration of the directional vectors originating from the image segments and pointing to the expected facial feature positions. Each directional vector is calculated by a linear combination of eigendirectional vectors, which are obtained by a principal component analysis of training facial segments in the feature space of the histogram of oriented gradients (HOG). Our method finds facial feature points very fast and accurately, since it utilizes statistical reasoning from all the training data without needing to extract local patterns at the estimated positions of facial features, to run any iterative parameter optimization algorithm, or to use any search algorithm. In addition, we can reduce the storage size of the trained model by controlling the energy-preserving level of the HOG pattern space.
Generative Temporal Modelling of Neuroimaging - Decomposition and Nonparametric Testing
DEFF Research Database (Denmark)
Hald, Ditte Høvenhoff
The goal of this thesis is to explore two improvements for functional magnetic resonance imaging (fMRI) analysis; namely our proposed decomposition method and an extension to the non-parametric testing framework. Analysis of fMRI allows researchers to investigate the functional processes of the brain, and provides insight into neuronal coupling during mental processes or tasks. The decomposition method is a Gaussian process-based independent components analysis (GPICA), which incorporates a temporal dependency in the sources. A hierarchical model specification is used, featuring both instantaneous and convolutive mixing, and the inferred temporal patterns. Spatial maps are seen to capture smooth and localized stimuli-related components, and often identifiable noise components. The implementation is freely available as a GUI/SPM plugin, and we recommend using GPICA as an additional tool when...
Nonparametric Statistical Structuring of Knowledge Systems Using Binary Feature Matches
DEFF Research Database (Denmark)
Mørup, Morten; Kano Glückstad, Fumiko; Herlau, Tue
2014-01-01
Structuring knowledge systems with binary features is often based on imposing a similarity measure and clustering objects according to this similarity. Unfortunately, such analyses can be heavily influenced by the choice of similarity measure. Furthermore, it is unclear at which level clusters have statistical support and how this approach generalizes to the structuring and alignment of knowledge systems. We propose a non-parametric Bayesian generative model for structuring binary feature data that does not depend on a specific choice of similarity measure. We jointly model all combinations of binary matches and structure the data into groups at the level in which they have statistical support. The model naturally extends to structuring and aligning an arbitrary number of systems. We analyze three datasets on educational concepts and their features and demonstrate how the proposed model can both...
Indoor Positioning Using Nonparametric Belief Propagation Based on Spanning Trees
Directory of Open Access Journals (Sweden)
Savic Vladimir
2010-01-01
Nonparametric belief propagation (NBP) is one of the best-known methods for cooperative localization in sensor networks. It is capable of providing location estimates with appropriate uncertainty and of accommodating non-Gaussian distance measurement errors. However, the accuracy of NBP is questionable in loopy networks. Therefore, in this paper, we propose a novel approach, NBP based on spanning trees (NBP-ST) created by the breadth-first search (BFS) method. In addition, we propose a reliable indoor model based on measurements obtained in our lab. According to our simulation results, NBP-ST performs better than NBP in terms of accuracy and communication cost in networks with high connectivity (i.e., highly loopy networks). Furthermore, the computational and communication costs are nearly constant with respect to the transmission radius. However, the drawbacks of the proposed method are a slightly higher computational cost and poor performance in low-connected networks.
Nonparametric estimation of stochastic differential equations with sparse Gaussian processes.
García, Constantino A; Otero, Abraham; Félix, Paulo; Presedo, Jesús; Márquez, David G
2017-08-01
The application of stochastic differential equations (SDEs) to the analysis of temporal data has attracted increasing attention, due to their ability to describe complex dynamics with physically interpretable equations. In this paper, we introduce a nonparametric method for estimating the drift and diffusion terms of SDEs from a densely observed discrete time series. The use of Gaussian processes as priors permits working directly in a function-space view and thus the inference takes place directly in this space. To cope with the computational complexity that requires the use of Gaussian processes, a sparse Gaussian process approximation is provided. This approximation permits the efficient computation of predictions for the drift and diffusion terms by using a distribution over a small subset of pseudosamples. The proposed method has been validated using both simulated data and real data from economy and paleoclimatology. The application of the method to real data demonstrates its ability to capture the behavior of complex systems.
Nonparametric Estimation of Distributions in Random Effects Models
Hart, Jeffrey D.
2011-01-01
We propose using minimum distance to obtain nonparametric estimates of the distributions of components in random effects models. A main setting considered is equivalent to having a large number of small datasets whose locations, and perhaps scales, vary randomly, but which otherwise have a common distribution. Interest focuses on estimating the distribution that is common to all datasets, knowledge of which is crucial in multiple testing problems where a location/scale invariant test is applied to every small dataset. A detailed algorithm for computing minimum distance estimates is proposed, and the usefulness of our methodology is illustrated by a simulation study and an analysis of microarray data. Supplemental materials for the article, including R-code and a dataset, are available online. © 2011 American Statistical Association.
Debt and growth: A non-parametric approach
Brida, Juan Gabriel; Gómez, David Matesanz; Seijas, Maria Nela
2017-11-01
In this study, we explore the dynamic relationship between public debt and economic growth by using a non-parametric approach based on data symbolization and clustering methods. The study uses annual data of general government consolidated gross debt-to-GDP ratio and gross domestic product for sixteen countries between 1977 and 2015. Using symbolic sequences, we introduce a notion of distance between the dynamical paths of different countries. Then, a Minimal Spanning Tree and a Hierarchical Tree are constructed from time series to help detect the existence of groups of countries sharing similar economic performance. The main finding of the study appears for the period 2008-2016 when several countries surpassed the 90% debt-to-GDP threshold. During this period, three groups (clubs) of countries are obtained: high, mid and low indebted countries, suggesting that the employed debt-to-GDP threshold drives economic dynamics for the selected countries.
MatchIt: Nonparametric Preprocessing for Parametric Causal Inference
Directory of Open Access Journals (Sweden)
Daniel Ho
2011-08-01
MatchIt implements the suggestions of Ho, Imai, King, and Stuart (2007) for improving parametric statistical models by preprocessing data with nonparametric matching methods. MatchIt implements a wide range of sophisticated matching methods, making it possible to greatly reduce the dependence of causal inferences on hard-to-justify, but commonly made, statistical modeling assumptions. The software also easily fits into existing research practices since, after preprocessing data with MatchIt, researchers can use whatever parametric model they would have used without MatchIt, but produce inferences with substantially more robustness and less sensitivity to modeling assumptions. MatchIt is an R program, and also works seamlessly with Zelig.
Multi-Directional Non-Parametric Analysis of Agricultural Efficiency
DEFF Research Database (Denmark)
Balezentis, Tomas
This thesis seeks to develop methodologies for assessment of agricultural efficiency and employ them to Lithuanian family farms. In particular, we focus on three particular objectives throughout the research: (i) to perform a fully non-parametric analysis of efficiency effects, (ii) to extend ... to the Multi-Directional Efficiency Analysis approach when the proposed models were employed to analyse empirical data of Lithuanian family farm performance, we saw substantial differences in efficiencies associated with different inputs. In particular, assets appeared to be the least efficiently used input relative to labour, intermediate consumption and land (in some cases land was not treated as a discretionary input). These findings call for further research on relationships among financial structure, investment decisions, and efficiency in Lithuanian family farms. Application of different techniques...
A Comprehensive Analysis of Marketing Journal Rankings
Steward, Michelle D.; Lewis, Bruce R.
2010-01-01
The purpose of this study is to offer a comprehensive assessment of journal standings in Marketing from two perspectives. The discipline perspective of rankings is obtained from a collection of published journal ranking studies during the past 15 years. The studies in the published ranking stream are assessed for reliability by examining internal…
Methodology, Meaning and Usefulness of Rankings
Williams, Ross
2008-01-01
University rankings are having a profound effect on both higher education systems and individual universities. In this paper we outline these effects, discuss the desirable characteristics of a good ranking methodology and document existing practice, with an emphasis on the two main international rankings (Shanghai Jiao Tong and THES-QS). We take…
The Privilege of Ranking: Google Plays Ball.
Wiggins, Richard
2003-01-01
Discussion of ranking systems used in various settings, including college football and academic admissions, focuses on the Google search engine. Explains the PageRank mathematical formula that scores Web pages by connecting the number of links; limitations, including authenticity and accuracy of ranked Web pages; relevancy; adjusting algorithms;…
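To make the link-based scoring concrete, here is a minimal sketch of the textbook PageRank power iteration. It is not Google's production ranking; the damping factor of 0.85, the handling of pages without outgoing links, and the tiny example graph are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """links: dict page -> list of pages it links to. Returns dict page -> score."""
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Score held by pages with no outgoing links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            for q in targets:
                # Each page passes an equal share of its score to every page it links to.
                new[q] += damping * rank[p] / len(targets)
        if sum(abs(new[p] - rank[p]) for p in pages) < tol:
            return new
        rank = new
    return rank

web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}  # hypothetical link graph
scores = pagerank(web)
```

In this toy graph page "c" collects links from three pages and ends up with the highest score, while "d", which nothing links to, keeps only the baseline share.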
Decision-Oriented Project Ranking for Asset Management System: Rail Net Denmark
DEFF Research Database (Denmark)
Salling, Kim Bang; Moshøj, Claus Rehfeld; Timm, Henrik
2007-01-01
The Danish rail net operator, Rail Net Denmark, has through the past years built up an Asset Management system, containing a certain percentile of all the company’s assets. This paper contains an elaborate overview on how to strengthen the system seen from a decision-support perspective. The focus is to apply a modified project ranking methodology: Asset Management System Priority Module (AMS-PM), which is a practical tool for assessing and ranking various project proposals in a straightforward manner. The methodology is set-out by a multi-criteria approach where weights are applied ultimately resulting in priority indices for the state-of-repair data. This paper is disposed as follows; firstly, a description of the Asset Management system is set-up including an overview of the state-of-repair data and the case study. Secondly, is the AMS-PM software model implemented through an exploratory case...
A nonparametric dynamic additive regression model for longitudinal data
DEFF Research Database (Denmark)
Martinussen, T.; Scheike, T. H.
2000-01-01
dynamic linear models, estimating equations, least squares, longitudinal data, nonparametric methods, partly conditional mean models, time-varying-coefficient models
Efficiency and equity in private and public education: A nonparametric comparison
Cherchye, L.; de Witte, K.; Ooghe, E.; Nicaise, I.
2010-01-01
We present a nonparametric approach for (1) efficiency and (2) equity evaluation in education. Firstly, we use a nonparametric (Data Envelopment Analysis) model that is specially tailored to assess educational efficiency at the pupil level. The model accounts for the fact that typically minimal
Equity and efficiency in private and public education: a nonparametric comparison
Cherchye, L.; de Witte, K.; Ooghe, E.; Nicaise, I.
2007-01-01
We present a nonparametric approach for the equity and efficiency evaluation of (private and public) primary schools in Flanders. First, we use a nonparametric (Data Envelopment Analysis) model that is specially tailored to assess educational efficiency at the pupil level. The model accounts for the
Overview of NonParametric Combination-based permutation tests for Multivariate multi-sample problems
Directory of Open Access Journals (Sweden)
Rosa Arboretti Giancristofaro
2014-09-01
Full Text Available In this work we present a review of nonparametric combination-based permutation tests, along with SAS macros for two-sample and one-way MANOVA design problems within the NonParametric Combination methodology framework. Applications to real case studies are also presented.
Directory of Open Access Journals (Sweden)
Panjikkaran Seeja
2009-01-01
Full Text Available Research Questions: 1. Are all the existing methods for estimating the obesity and overweight in school going children in India equally efficient? 2. How to derive more efficient obesity percentiles to determine obesity and overweight status in school-going children aged 7-12 years old? Objectives: 1. To investigate and analyze the prevalence rate of obesity and overweight children in India, using the established standards. 2. To compare the efficiency among the tools with the expected levels in the Indian population. 3. To establish and demonstrate the higher efficiency of the proposed percentile chart. Study Design: A cross-sectional study using a completely randomized design. Settings: Government, private-aided, unaided, and central schools in the Thrissur district of Kerala. Participants: A total of 1500 boys and 1500 girls aged 7-12 years old. Results: BMI percentiles, waist circumference percentiles, and waist to height ratio are the ruling methodologies in establishing the obese and overweight relations in school-going children. Each one suffers from the disadvantage of not considering either one or more of the obesity contributing factors in human growth dynamics, the major being waist circumference and weight. A new methodology for mitigating this defect through considering BMI and waist circumference simultaneously for establishing still efficient percentiles to arrive at obesity and overweight status is detailed here. Age-wise centiles for obesity and overweight status separately for boys and girls aged 7-12 years old were established. Comparative efficiency of this methodology over BMI had shown that this could mitigate the inability of BMI to consider waist circumference. Also, this had the advantage of considering body weight in obesity analysis, which is the major handicap in waist to height ratio. An analysis using a population of 1500 boys and 1500 girls has yielded 3.6% obese and 6.2% overweight samples, which is well within
Two-dimensional ranking of Wikipedia articles
Zhirov, A. O.; Zhirov, O. V.; Shepelyansky, D. L.
2010-10-01
The Library of Babel, described by Jorge Luis Borges, stores an enormous amount of information. The Library exists ab aeterno. Wikipedia, a free online encyclopaedia, becomes a modern analogue of such a Library. Information retrieval and ranking of Wikipedia articles become the challenge of modern society. While PageRank highlights very well known nodes with many ingoing links, CheiRank highlights very communicative nodes with many outgoing links. In this way the ranking becomes two-dimensional. Using CheiRank and PageRank we analyze the properties of two-dimensional ranking of all Wikipedia English articles and show that it gives their reliable classification with rich and nontrivial features. Detailed studies are done for countries, universities, personalities, physicists, chess players, Dow-Jones companies and other categories.
Lawrence, C.; Somers, J. T.; Baldwin, M. A.; Wells, J. A.; Newby, N.; Currie, N. J.
2014-01-01
NASA spacecraft design requirements for occupant protection are a combination of the Brinkley criteria and injury metrics extracted from anthropomorphic test devices (ATD's). For the ATD injury metrics, the requirements specify the use of the 5th percentile female Hybrid III and the 95th percentile male Hybrid III. Furthermore, each of these ATD's is required to be fitted with an articulating pelvis and a straight spine. The articulating pelvis is necessary for the ATD to fit into spacecraft seats, while the straight spine is required as injury metrics for vertical accelerations are better defined for this configuration. The requirements require that physical testing be performed with both ATD's to demonstrate compliance. Before compliance testing can be conducted, extensive modeling and simulation are required to determine appropriate test conditions, simulate conditions not feasible for testing, and assess design features to better ensure compliance testing is successful. While finite element (FE) models are currently available for many of the physical ATD's, currently there are no complete models for either the 5th percentile female or the 95th percentile male Hybrid III with a straight spine and articulating pelvis. The purpose of this work is to assess the accuracy of the existing Livermore Software Technology Corporation's FE models of the 5th and 95th percentile ATD's. To perform this assessment, a series of tests will be performed at Wright Patterson Air Force Research Lab using their horizontal impact accelerator sled test facility. The ATD's will be placed in the Orion seat with a modified-advanced-crew-escape-system (MACES) pressure suit and helmet, and driven with loadings similar to what is expected for the actual Orion vehicle during landing, launch abort, and chute deployment. Test data will be compared to analytical predictions and modelling uncertainty factors will be determined for each injury metric. Additionally, the test data will be used to
Wesselink, Christiaan; Heeg, Govert P.; Jansonius, Nomdo M.
Objective: To compare prospectively 2 perimetric progression detection algorithms for glaucoma, the Early Manifest Glaucoma Trial algorithm (glaucoma progression analysis [GPA]) and a nonparametric algorithm applied to the mean deviation (MD) (nonparametric progression analysis [NPA]). Methods:
Li, Zhuoyang; Wang, Yueping A; Ledger, William; Sullivan, Elizabeth A
2014-08-01
What is the standard of birthweight for gestational age for babies following assisted reproductive technology (ART) treatment? Birthweight for gestational age percentile charts were developed for singleton births following ART treatment using population-based data. Small for gestational age (SGA) and large for gestational age (LGA) births are at increased risks of perinatal morbidity and mortality. A birthweight percentile chart allows the detection of neonates at high risk, and can help inform the need for special care if required. This population study used data from the Australian and New Zealand Assisted Reproduction Database (ANZARD) for 72 694 live born singletons following ART treatment between January 2002 and December 2010 in Australia and New Zealand. A total of 69 315 births (35 580 males and 33 735 females) following ART treatment were analysed for the birthweight percentile. Exact percentiles of birthweight in grams were calculated for each gestational week between Week 25 and 42 for fresh and thaw cycles by infant sex. Univariate analysis was used to determine the exact birthweight percentile values. Student t-test was used to examine the mean birthweight difference between male and female infants, between single embryo transfer (SET) and double embryo transfer (DET) and between fresh and thaw cycles. Preterm births (birth before 37 completed weeks of gestation) and low birthweight (fetal growth standards but only the weight of live born infants at birth. The comparison of birthweight percentile charts for ART births and general population births provide evidence that the proportion of SGA births following ART treatment was comparable to the general population for SET fresh cycles and significantly lower for thaw cycles. Both fresh and thaw cycles showed better outcomes for singleton births following SET compared with DET. Policies to promote single embryo transfer should be considered in order to minimize the adverse perinatal outcomes associated
The 97.5th and 99.5th percentile of vertical cup disc ratio in the United States.
Swanson, Mark W
2011-01-01
The International Society of Geographical and Epidemiological Ophthalmology (ISGEO) suggests a case definition for the prevalence studies of glaucoma based on the 97.5th and 99.5th percentile of vertical optic cup distribution among the evaluated population. Although multiple studies evaluating the prevalence of glaucoma have been undertaken in the last 20 years, case definitions have varied, and data on the underlying population distribution are sparse. This study evaluates the population distribution of 97.5th and 99.5th percentile of vertical cup disc ratio (VCDR) and VCDR asymmetry (VCDRA) in the US population and its association with demographic characteristics, self-reported glaucoma, and visual field. The National Health and Nutrition Examination Survey (NHANES) is a nationally representative sample of the US population, which during the years 2005 to 2008 collected frequency doubling technology screening fields and digital fundus photography. Accounting for the complex design of the NHANES population, estimates of the 97.5th and 99.5th percentile of VCDR and VCDRA were calculated, and national estimates of glaucoma prevalence were generated. Associations between disc characteristics, demographic variables, and self-reported glaucoma were explored. Approximately 2.11% (95% confidence interval, 1.55 to 2.67) of the US population older than 40 years has glaucoma based on ISGEO criteria. A much larger 5.13% (95% confidence interval, 4.43 to 5.85) of the US population older than 40 years self-reports having glaucoma. Based on the estimates from NHANES, 6.89% of the population has a VCDR or VCDRA >97.5th percentile in either eye or OU. For the at-risk population with VCDR/VCDRA above the 97.5th percentile, <20% reported having glaucoma, whereas for those at the 99.5th percentile, <50% reported having glaucoma. The prevalence of glaucoma from NHANES based on ISGEO criteria produces population estimates similar to those of other population-based studies. Self
Nonparametric adaptive age replacement with a one-cycle criterion
International Nuclear Information System (INIS)
Coolen-Schrijner, P.; Coolen, F.P.A.
2007-01-01
Age replacement of technical units has received much attention in the reliability literature over the last four decades. Mostly, the failure time distribution for the units is assumed to be known, and minimal costs per unit of time is used as optimality criterion, where renewal reward theory simplifies the mathematics involved but requires the assumption that the same process and replacement strategy continues over a very large ('infinite') period of time. Recently, there has been increasing attention to adaptive strategies for age replacement, taking into account the information from the process. Although renewal reward theory can still be used to provide an intuitively and mathematically attractive optimality criterion, it is more logical to use minimal costs per unit of time over a single cycle as optimality criterion for adaptive age replacement. In this paper, we first show that in the classical age replacement setting, with known failure time distribution with increasing hazard rate, the one-cycle criterion leads to earlier replacement than the renewal reward criterion. Thereafter, we present adaptive age replacement with a one-cycle criterion within the nonparametric predictive inferential framework. We study the performance of this approach via simulations, which are also used for comparisons with the use of the renewal reward criterion within the same statistical framework
Nonparametric Integrated Agrometeorological Drought Monitoring: Model Development and Application
Zhang, Qiang; Li, Qin; Singh, Vijay P.; Shi, Peijun; Huang, Qingzhong; Sun, Peng
2018-01-01
Drought is a major natural hazard that has massive impacts on the society. How to monitor drought is critical for its mitigation and early warning. This study proposed a modified version of the multivariate standardized drought index (MSDI) based on precipitation, evapotranspiration, and soil moisture, i.e., modified multivariate standardized drought index (MMSDI). This study also used nonparametric joint probability distribution analysis. Comparisons were done between standardized precipitation evapotranspiration index (SPEI), standardized soil moisture index (SSMI), MSDI, and MMSDI, and real-world observed drought regimes. Results indicated that MMSDI detected droughts that SPEI and/or SSMI failed to do. Also, MMSDI detected almost all droughts that were identified by SPEI and SSMI. Further, droughts detected by MMSDI were similar to real-world observed droughts in terms of drought intensity and drought-affected area. When compared to MMSDI, MSDI has the potential to overestimate drought intensity and drought-affected area across China, which should be attributed to exclusion of the evapotranspiration components from estimation of drought intensity. Therefore, MMSDI is proposed for drought monitoring that can detect agrometeorological droughts. Results of this study provide a framework for integrated drought monitoring in other regions of the world and can help to develop drought mitigation.
Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.
Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A
2018-01-30
Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd.
Bayesian Nonparametric Model for Estimating Multistate Travel Time Distribution
Directory of Open Access Journals (Sweden)
Emmanuel Kidando
2017-01-01
Full Text Available Multistate models, that is, models with more than two distributions, are preferred over single-state probability models in modeling the distribution of travel time. Literature review indicated that the finite multistate modeling of travel time using lognormal distribution is superior to other probability functions. In this study, we extend the finite multistate lognormal model of estimating the travel time distribution to unbounded lognormal distribution. In particular, a nonparametric Dirichlet Process Mixture Model (DPMM) with stick-breaking process representation was used. The strength of the DPMM is that it can choose the number of components dynamically as part of the algorithm during parameter estimation. To reduce computational complexity, the modeling process was limited to a maximum of six components. Then, the Markov Chain Monte Carlo (MCMC) sampling technique was employed to estimate the parameters’ posterior distribution. Speed data from nine links of a freeway corridor, aggregated on a 5-minute basis, were used to calculate the corridor travel time. The results demonstrated that this model offers significant flexibility in modeling to account for complex mixture distributions of the travel time without specifying the number of components. The DPMM modeling further revealed that freeway travel time is characterized by multistate or single-state models depending on the inclusion of onset and offset of congestion periods.
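The stick-breaking representation with a cap of six components, as described in this abstract, can be sketched as follows. This only draws mixture weights from the prior; it is not the authors' MCMC procedure, and the concentration value `alpha`, the seed, and the function name are assumptions chosen for illustration.

```python
import random


def stick_breaking_weights(alpha, max_components=6, rng=None):
    """Draw truncated stick-breaking weights for a Dirichlet process prior.

    alpha is the concentration parameter (hypothetical value in the demo);
    max_components=6 mirrors the cap mentioned in the abstract.
    """
    rng = rng or random.Random()
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for _ in range(max_components - 1):
        fraction = rng.betavariate(1.0, alpha)  # Beta(1, alpha) break point
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    # The last component takes whatever is left, so the weights sum to one.
    weights.append(remaining)
    return weights


demo_weights = stick_breaking_weights(alpha=2.0, rng=random.Random(0))
```

Smaller values of `alpha` tend to put most of the mass on the first few components, which is how such a model can behave like a single-state model for some data and a multistate model for other data.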
Nonparametric Bayes Classification and Hypothesis Testing on Manifolds
Bhattacharya, Abhishek; Dunson, David
2012-01-01
Our first focus is prediction of a categorical response variable using features that lie on a general manifold. For example, the manifold may correspond to the surface of a hypersphere. We propose a general kernel mixture model for the joint distribution of the response and predictors, with the kernel expressed in product form and dependence induced through the unknown mixing measure. We provide simple sufficient conditions for large support and weak and strong posterior consistency in estimating both the joint distribution of the response and predictors and the conditional distribution of the response. Focusing on a Dirichlet process prior for the mixing measure, these conditions hold using von Mises-Fisher kernels when the manifold is the unit hypersphere. In this case, Bayesian methods are developed for efficient posterior computation using slice sampling. Next we develop Bayesian nonparametric methods for testing whether there is a difference in distributions between groups of observations on the manifold having unknown densities. We prove consistency of the Bayes factor and develop efficient computational methods for its calculation. The proposed classification and testing methods are evaluated using simulation examples and applied to spherical data applications. PMID:22754028
Bayesian nonparametric meta-analysis using Polya tree mixture models.
Branscum, Adam J; Hanson, Timothy E
2008-09-01
Summary. A common goal in meta-analysis is estimation of a single effect measure using data from several studies that are each designed to address the same scientific inquiry. Because studies are typically conducted in geographically disperse locations, recent developments in the statistical analysis of meta-analytic data involve the use of random effects models that account for study-to-study variability attributable to differences in environments, demographics, genetics, and other sources that lead to heterogeneity in populations. Stemming from asymptotic theory, study-specific summary statistics are modeled according to normal distributions with means representing latent true effect measures. A parametric approach subsequently models these latent measures using a normal distribution, which is strictly a convenient modeling assumption absent of theoretical justification. To eliminate the influence of overly restrictive parametric models on inferences, we consider a broader class of random effects distributions. We develop a novel hierarchical Bayesian nonparametric Polya tree mixture (PTM) model. We present methodology for testing the PTM versus a normal random effects model. These methods provide researchers a straightforward approach for conducting a sensitivity analysis of the normality assumption for random effects. An application involving meta-analysis of epidemiologic studies designed to characterize the association between alcohol consumption and breast cancer is presented, which together with results from simulated data highlight the performance of PTMs in the presence of nonnormality of effect measures in the source population.
On Consistent Nonparametric Statistical Tests of Symmetry Hypotheses
Directory of Open Access Journals (Sweden)
Jean-François Quessy
2016-05-01
Full Text Available Being able to formally test for symmetry hypotheses is an important topic in many fields, including environmental and physical sciences. In this paper, one concentrates on a large family of nonparametric tests of symmetry based on Cramér–von Mises statistics computed from empirical distribution and characteristic functions. These tests possess the highly desirable property of being universally consistent in the sense that they detect any kind of departure from symmetry as the sample size becomes large. The asymptotic behaviour of these test statistics under symmetry is deduced from the theory of first-order degenerate V-statistics. The issue of computing valid p-values is tackled using the multiplier bootstrap method suitably adapted to V-statistics, yielding elegant, easy-to-compute and quick procedures for testing symmetry. A special focus is put on tests of univariate symmetry, bivariate exchangeability and reflected symmetry; a simulation study indicates the good sampling properties of these tests. Finally, a framework for testing general symmetry hypotheses is introduced.
The Utility of Nonparametric Transformations for Imputation of Survey Data
Directory of Open Access Journals (Sweden)
Robbins Michael W.
2014-12-01
Full Text Available Missing values present a prevalent problem in the analysis of establishment survey data. Multivariate imputation algorithms (which are used to fill in missing observations) tend to have the common limitation that imputations for continuous variables are sampled from Gaussian distributions. This limitation is addressed here through the use of robust marginal transformations. Specifically, kernel-density and empirical distribution-type transformations are discussed and are shown to have favorable properties when used for imputation of complex survey data. Although such techniques have wide applicability (i.e., they may be easily applied in conjunction with a wide array of imputation techniques), the proposed methodology is applied here with an algorithm for imputation in the USDA’s Agricultural Resource Management Survey. Data analysis and simulation results are used to illustrate the specific advantages of the robust methods when compared to the fully parametric techniques and to other relevant techniques such as predictive mean matching. To summarize, transformations based upon parametric densities are shown to distort several data characteristics in circumstances where the parametric model is ill fit; however, no circumstances are found in which the transformations based upon parametric models outperform the nonparametric transformations. As a result, the transformation based upon the empirical distribution (which is the most computationally efficient) is recommended over the other transformation procedures in practice.
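An empirical-distribution-type transformation of the kind this abstract refers to can be sketched with ranks and normal quantiles. This is a generic illustration and not the algorithm used in the paper: the `rank / (n + 1)` plotting position, the step-function back-transform, and the function names are assumptions.

```python
from statistics import NormalDist


def to_normal_scores(values):
    """Map observed values to Gaussian scores through their empirical ranks.

    Ties are ranked in input order here, which is a simplification.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) keeps the probability strictly inside (0, 1).
        scores[i] = NormalDist().inv_cdf(rank / (n + 1))
    return scores


def from_normal_score(score, observed):
    """Map a Gaussian draw back to the observed scale via the empirical quantile."""
    ordered = sorted(observed)
    p = NormalDist().cdf(score)
    index = min(int(p * len(ordered)), len(ordered) - 1)
    return ordered[index]


skewed = [1, 2, 3, 50, 400]          # hypothetical, strongly skewed variable
z = to_normal_scores(skewed)          # roughly Gaussian working scale
back = from_normal_score(0.0, skewed)  # a Gaussian imputation mapped back
```

Because the back-transform only returns observed values, imputations produced this way cannot fall outside the range of the data, unlike draws taken directly from a fitted Gaussian.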
Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines
Malley, J. D.; Kruppa, J.; Dasgupta, A.; Malley, K. G.; Ziegler, A.
2011-01-01
Background: Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. Objectives: The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Methods: Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Results: Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Conclusions: Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications. PMID:21915433
Non-parametric and least squares Langley plot methods
Kiedron, P. W.; Michalsky, J. J.
2016-01-01
Langley plots are used to calibrate sun radiometers primarily for the measurement of the aerosol component of the atmosphere that attenuates (scatters and absorbs) incoming direct solar radiation. In principle, the calibration of a sun radiometer is a straightforward application of the Bouguer-Lambert-Beer law $V = V_0 e^{-\tau m}$, where a plot of ln(V) voltage vs. m air mass yields a straight line with intercept ln(V0). This ln(V0) subsequently can be used to solve for τ for any measurement of V and calculation of m. This calibration works well on some high mountain sites, but the application of the Langley plot calibration technique is more complicated at other, more interesting, locales. This paper is concerned with ferreting out calibrations at difficult sites and examining and comparing a number of conventional and non-conventional methods for obtaining successful Langley plots. The 11 techniques discussed indicate that both least squares and various non-parametric techniques produce satisfactory calibrations with no significant differences among them when the time series of ln(V0)'s are smoothed and interpolated with median and mean moving window filters.
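The least squares version of the Langley calibration described here amounts to a straight-line fit of ln(V) against air mass m. The sketch below shows only that ordinary least squares step on synthetic, noise-free data; the function name and the chosen values of V0 and τ are assumptions, and none of the paper's 11 techniques or its filtering steps are reproduced.

```python
import math


def langley_fit(air_mass, voltage):
    """Ordinary least squares fit of ln(V) = ln(V0) - tau * m.

    Returns (ln_v0, tau): the intercept and the negated slope of the line.
    """
    log_v = [math.log(v) for v in voltage]
    n = len(air_mass)
    mean_m = sum(air_mass) / n
    mean_y = sum(log_v) / n
    sxx = sum((m - mean_m) ** 2 for m in air_mass)
    sxy = sum((m - mean_m) * (y - mean_y) for m, y in zip(air_mass, log_v))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_m
    return intercept, -slope


# Synthetic, noise-free measurements with hypothetical V0 = 2.0 and tau = 0.1.
true_ln_v0, true_tau = math.log(2.0), 0.1
air_mass = [1.5, 2.0, 3.0, 4.0, 5.0]
voltage = [math.exp(true_ln_v0 - true_tau * m) for m in air_mass]
ln_v0, tau = langley_fit(air_mass, voltage)
```

With real measurements the optical depth can drift during the observation period, which is one reason a plain least squares line may not be enough at difficult sites.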
Probability machines: consistent probability estimation using nonparametric learning machines.
Malley, J D; Kruppa, J; Dasgupta, A; Malley, K G; Ziegler, A
2012-01-01
Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications.
A local non-parametric model for trade sign inference
Blazejewski, Adam; Coggins, Richard
2005-03-01
We investigate a regularity in market order submission strategies for 12 stocks with large market capitalization on the Australian Stock Exchange. The regularity is evidenced by a predictable relationship between the trade sign (trade initiator), size of the trade, and the contents of the limit order book before the trade. We demonstrate this predictability by developing an empirical inference model to classify trades into buyer-initiated and seller-initiated. The model employs a local non-parametric method, k-nearest neighbor, which in the past was used successfully for chaotic time series prediction. The k-nearest neighbor with three predictor variables achieves an average out-of-sample classification accuracy of 71.40%, compared to 63.32% for the linear logistic regression with seven predictor variables. The result suggests that a non-linear approach may produce a more parsimonious trade sign inference model with a higher out-of-sample classification accuracy. Furthermore, for most of our stocks the observed regularity in market order submissions seems to have a memory of at least 30 trading days.
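The k-nearest neighbor classification step named in this abstract can be sketched in a few lines. The abstract does not say which three predictor variables were used, so the feature vectors, labels, and the choice k = 3 below are purely hypothetical.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical three-feature observations; the actual predictors are not
# given in the abstract.
train_points = [
    (0.10, 0.20, 0.90), (0.20, 0.10, 0.80), (0.15, 0.25, 0.85),
    (0.90, 0.80, 0.10), (0.80, 0.90, 0.20), (0.85, 0.75, 0.15),
]
train_labels = ["buy", "buy", "buy", "sell", "sell", "sell"]
predicted = knn_classify(train_points, train_labels, (0.20, 0.20, 0.80))
```

The method is local in the sense that only the nearest past observations influence each prediction, so no global functional form has to be assumed.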
Non-parametric Bayesian networks: Improving theory and reviewing applications
International Nuclear Information System (INIS)
Hanea, Anca; Morales Napoles, Oswaldo; Ababei, Dan
2015-01-01
Applications in various domains often lead to high dimensional dependence modelling. A Bayesian network (BN) is a probabilistic graphical model that provides an elegant way of expressing the joint distribution of a large number of interrelated variables. BNs have been successfully used to represent uncertain knowledge in a variety of fields. The majority of applications use discrete BNs, i.e. BNs whose nodes represent discrete variables. Integrating continuous variables in BNs is an area fraught with difficulty. Several methods that handle discrete-continuous BNs have been proposed in the literature. This paper concentrates only on one method called non-parametric BNs (NPBNs). NPBNs were introduced in 2004 and they have been or are currently being used in at least twelve professional applications. This paper provides a short introduction to NPBNs, a couple of theoretical advances, and an overview of applications. The aim of the paper is twofold: one is to present the latest improvements of the theory underlying NPBNs, and the other is to complement the existing overviews of BNs applications with the NPNBs applications. The latter opens the opportunity to discuss some difficulties that applications pose to the theoretical framework and in this way offers some NPBN modelling guidance to practitioners. - Highlights: • The paper gives an overview of the current NPBNs methodology. • We extend the NPBN methodology by relaxing the conditions of one of its fundamental theorems. • We propose improvements of the data mining algorithm for the NPBNs. • We review the professional applications of the NPBNs.
Nonparametric predictive inference for combining diagnostic tests with parametric copula
Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.
2017-09-01
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests. The area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we are interested in developing strategies for combining test results in order to increase the diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results while taking the dependence structure into account using a parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations. NPI uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. A copula, in turn, is a well-known statistical concept for modelling dependence of random variables: it is a joint distribution function whose marginals are all uniformly distributed, and it can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method, namely the maximum likelihood estimator (MLE). We investigate the performance of this proposed method via data sets from the literature and discuss results to show how our method performs for different families of copulas. Finally, we briefly outline related challenges and opportunities for future research.
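The AUC mentioned in this abstract has a simple empirical form: the proportion of (diseased, healthy) pairs in which the diseased subject has the higher test result, with ties counted as one half. The sketch below computes only that quantity; it does not implement NPI lower and upper probabilities or any copula, and the test results are made-up numbers.

```python
def empirical_auc(healthy, diseased):
    """Empirical AUC: share of pairs where the diseased result exceeds the healthy one."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5  # ties count as half a win
    return wins / (len(healthy) * len(diseased))


# Hypothetical test results for two small groups.
healthy_results = [1, 2, 3]
diseased_results = [2, 4, 5]
auc = empirical_auc(healthy_results, diseased_results)
```

A value of 0.5 corresponds to a test that does no better than chance, and 1.0 to a test that separates the two groups perfectly.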
Robust Discriminant Analysis Based on Nonparametric Maximum Entropy
He, Ran; Hu, Bao-Gang; Yuan, Xiao-Tong
In this paper, we propose a Robust Discriminant Analysis based on the maximum entropy (MaxEnt) criterion (MaxEnt-RDA), which is derived from a nonparametric estimate of Renyi’s quadratic entropy. MaxEnt-RDA uses entropy as both objective and constraints; thus the structural information of classes is preserved while information loss is minimized. It is a natural extension of LDA from the Gaussian assumption to any distribution assumption. Like LDA, the optimal solution of MaxEnt-RDA can also be obtained by an eigen-decomposition method, where feature extraction is achieved by designing two Parzen probability matrices that characterize the within-class variation and the between-class variation respectively. Furthermore, MaxEnt-RDA makes use of high order statistics (entropy) to estimate the probability matrix so that it is robust to outliers. Experiments on toy problems, UCI datasets and face datasets demonstrate the effectiveness of the proposed method in comparison with other state-of-the-art methods.
DPpackage: Bayesian Semi- and Nonparametric Modeling in R
Directory of Open Access Journals (Sweden)
Alejandro Jara
2011-04-01
Data analysis sometimes requires the relaxation of parametric assumptions in order to gain modeling flexibility and robustness against mis-specification of the probability model. In the Bayesian context, this is accomplished by placing a prior distribution on a function space, such as the space of all probability distributions or the space of all regression functions. Unfortunately, posterior distributions ranging over function spaces are highly complex and hence sampling methods play a key role. This paper provides an introduction to a simple, yet comprehensive, set of programs for the implementation of some Bayesian nonparametric and semiparametric models in R, DPpackage. Currently, DPpackage includes models for marginal and conditional density estimation, receiver operating characteristic curve analysis, interval-censored data, binary regression data, item response data, longitudinal and clustered data using generalized linear mixed models, and regression data using generalized additive models. The package also contains functions to compute pseudo-Bayes factors for model comparison and for eliciting the precision parameter of the Dirichlet process prior, and a general purpose Metropolis sampling algorithm. To maximize computational efficiency, the actual sampling for each model is carried out using compiled C, C++ or Fortran code.
Error analysis of stochastic gradient descent ranking.
Chen, Hong; Tang, Yi; Li, Luoqing; Yuan, Yuan; Li, Xuelong; Tang, Yuanyan
2013-06-01
Ranking is an important task in machine learning and information retrieval, e.g., collaborative filtering, recommender systems, drug discovery, etc. A kernel-based stochastic gradient descent algorithm with the least squares loss is proposed for ranking in this paper. The implementation of this algorithm is simple, and an expression of the solution is derived via a sampling operator and an integral operator. An explicit convergence rate for learning a ranking function is given in terms of suitable choices of the step size and the regularization parameter. The analysis technique used here is capacity independent and is novel in error analysis of ranking learning. Experimental results on real-world data have shown the effectiveness of the proposed algorithm in ranking tasks, which verifies the theoretical analysis in ranking error.
Methodology for ranking restoration options
International Nuclear Information System (INIS)
Hedemann Jensen, Per
1999-04-01
The work described in this report has been performed as a part of the RESTRAT Project FI4P-CT95-0021a (PL 950128) co-funded by the Nuclear Fission Safety Programme of the European Commission. The RESTRAT project has the overall objective of developing generic methodologies for ranking restoration techniques as a function of contamination and site characteristics. The project includes analyses of existing remediation methodologies and contaminated sites, and is structured in the following steps: characterisation of relevant contaminated sites; identification and characterisation of relevant restoration techniques; assessment of the radiological impact; development and application of a selection methodology for restoration options; formulation of generic conclusions and development of a manual. The project is intended to apply to situations in which sites with nuclear installations have been contaminated with radioactive materials as a result of the operation of these installations. The areas considered for remedial measures include contaminated land areas, rivers and sediments in rivers, lakes, and sea areas. Five contaminated European sites have been studied. Various remedial measures have been envisaged with respect to the optimisation of the protection of the populations being exposed to the radionuclides at the sites. Cost-benefit analysis and multi-attribute utility analysis have been applied for optimisation. Health, economic and social attributes have been included and weighting factors for the different attributes have been determined by the use of scaling constants. (au)
The mathematics behind PageRank algorithm
Spačal, Gregor
2016-01-01
PageRank is Google's algorithm for ranking web pages by relevance. Pages can then be hierarchically sorted in order to provide better search results. The MSc thesis considers functioning, relevance, general properties of web search and its weaknesses before the appearance of Google. One of the most important questions is, if we can formally explain the mathematics behind PageRank algorithm and what mathematical knowledge is necessary. Finally, we present an example of its implementation i...
Is CBA ranking of transport investments robust?
Börjesson, Maria; Eliasson, Jonas; Lundberg, Mattias
2012-01-01
Cost-benefit analysis (CBA) is often used when many transport investments need to be ranked against each other, for example in national investment planning. However, results are often questioned on claims that the ranking depends crucially on uncertain assumptions about the future, and on methodologically or ethically contestable trade-offs of different types of benefits relative to each other. This paper explores the robustness of CBA rankings of transport investments with respect to two typ...
Citation graph based ranking in Invenio
Marian, Ludmila; Rajman, Martin; Vesely, Martin
2010-01-01
Invenio is the web-based integrated digital library system developed at CERN. Within this framework, we present four types of ranking models based on the citation graph that complement the simple approach based on citation counts: time-dependent citation counts, a relevancy ranking which extends the PageRank model, a time-dependent ranking which combines the freshness of citations with PageRank and a ranking that takes into consideration the external citations. We present our analysis and results obtained on two main data sets: Inspire and CERN Document Server. Our main contributions are: (i) a study of the currently available ranking methods based on the citation graph; (ii) the development of new ranking methods that correct some of the identified limitations of the current methods such as treating all citations of equal importance, not taking time into account or considering the citation graph complete; (iii) a detailed study of the key parameters for these ranking methods. (The original publication is ava...
Communities in Large Networks: Identification and Ranking
DEFF Research Database (Denmark)
Olsen, Martin
2008-01-01
We study the problem of identifying and ranking the members of a community in a very large network with link analysis only, given a set of representatives of the community. We define the concept of a community justified by a formal analysis of a simple model of the evolution of a directed graph. ...... and its immediate surroundings. The members are ranked with a “local” variant of the PageRank algorithm. Results are reported from successful experiments on identifying and ranking Danish Computer Science sites and Danish Chess pages using only a few representatives....
Shaoba, Asma; Basu, Sanjib; Mantis, Stelios; Minutti, Carla
2017-12-15
To determine the association, if any, between thyroid-stimulating hormone (TSH) levels and body mass index (BMI) percentiles in children with primary hypothyroidism who are chemically euthyroid and on treatment with levothyroxine. This retrospective cross-sectional study consisted of a review of medical records from RUSH Medical Center and Stroger Hospital, Chicago, USA of children with primary hypothyroidism who were seen in the clinic from 2008 to 2014 and who were chemically euthyroid and on treatment with levothyroxine for at least 6 months. The patients were divided into two groups based on their TSH levels (0.34-hypothyroidism who are chemically euthyroid on treatment with levothyroxine, there is a positive association between higher TSH levels and higher BMI percentiles. However, it is difficult to establish if the higher TSH levels are a direct cause or a consequence of the obesity. Further studies are needed to establish causation beyond significant association.
Bussler, Sarah; Vogel, Mandy; Pietzner, Diana; Harms, Kristian; Buzek, Theresa; Penke, Melanie; Händel, Norman; Körner, Antje; Baumann, Ulrich; Kiess, Wieland; Flemming, Gunter
2017-09-19
The present study aims to clarify the effects of sex, age, BMI and puberty on transaminase serum levels in children and adolescents and to provide new age- and sex-related percentiles for alanine aminotransferase (ALT), aspartate aminotransferase (AST) and γ-glutamyltransferase (GGT). Venous blood and anthropometric data were collected from 4,126 cases. Excluded were cases of participants with potential hepatotoxic medication, with evidence of potential illness at the time of blood sampling and non-normal BMI (BMI 90 th ). The resulting data (N = 3,131 cases) were used for the calculations of ALT, AST, and GGT percentiles. Age- and sex-related reference intervals were established by using an LMSP-type method. Serum levels of transaminases follow age-specific patterns and relate to the onset of puberty. This observation is more pronounced in girls than in boys. The ALT percentiles showed similar shaped patterns in both sexes. Multivariate regression confirmed significant effects of puberty and BMI-SDS (β = 2.21) on ALT. Surprisingly, AST serum levels were negatively influenced by age (β = -1.42) and BMI-SDS (β = -0.15). The GGT percentiles revealed significant sex-specific differences, correlated positively with age (β = 0.37) and showed significant association with BMI-SDS (β = 1.16). Current reference values of ALT, AST and GGT serum levels were calculated for children between 11 months and 16.0 years, using modern analytical and statistical methods. This study extends the current knowledge about transaminases by revealing influences of age, sex, BMI, and puberty on the serum concentrations of all three parameters and has for these parameters one of the largest sample sizes published so far. This article is protected by copyright. All rights reserved. © 2017 by the American Association for the Study of Liver Diseases.
Martínez, Airín D; Ruelas, Lillian; Granger, Douglas A
2017-11-01
Fear of deportation (FOD) is a prevalent concern among mixed-status families. Yet, our understanding of how FOD shapes human health and development is in its infancy. To begin to address this knowledge gap, we examined the relationship between household FOD, body mass index (BMI) percentiles and salivary uric acid (sUA), a biomarker related to oxidative stress/hypertension/metabolic syndrome, among 111 individuals living in Mexican-origin families. Participants were 65 children (2 months-17 years, 49% female) and 46 adults (20-58 years, 71% female) living in 30 Mexican-origin families with at least one immigrant parent in Phoenix, AZ. We recruited families using cluster probability sampling of 30 randomly selected census tracts with a high proportion of Hispanic/Latino immigrants. The head of household completed a survey containing demographic, FOD, and psychosocial measures. All family members provided saliva (later assayed for sUA) and anthropometric measures. Relationships between household FOD, BMI percentile, and sUA levels were estimated using multilevel models. Higher levels of household FOD were associated with lower BMI percentiles and lower sUA levels between families, after controlling for social support and socioeconomic proxies. Key features of the social ecology in which mixed-status families are embedded are associated with individual differences in biological processes linked to increased risk for chronic disease. © 2017 Wiley Periodicals, Inc.
Lurie, Samuel; Weiner, Eran; Golan, Abraham; Sadan, Oscar
2014-01-01
To establish leukocyte count and differential percentiles in healthy singleton term laboring women during spontaneous normal vaginal labor following an uncomplicated pregnancy. An analysis of the records of all women (n = 762) who delivered at our delivery ward during a 2-month period was performed. After exclusion for cesarean delivery, induction of labor, pregnancy complications, preterm labor, multiple pregnancy, fever on admission, and lack of full blood count on admission, 365 parturient women during the 1st stage of labor were included in the final analysis. The total and differential leukocyte counts were determined by standard procedure by an automated cell counter. The leukocyte count range on admission to the delivery ward during the 1st stage of labor in healthy parturient women was between 4.4 × 10(3) and 21.7 × 10(3)/µl and the 99th percentile limit was 20.06 × 10(3)/µl. The total leukocyte count was not influenced by cervical dilatation, ruptured membranes, or the presence and regularity of uterine contractions. An observed leukocyte count within the 99th percentile limit (20.06 × 10(3)/µl) in an otherwise normal parturient woman is reassuring in the absence of other clinical evidence. © 2014 S. Karger AG, Basel.
Akhtar, Naveed; Mian, Ajmal
2017-10-03
We present a principled approach to learn a discriminative dictionary along a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size--the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which, we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.
Economic decision making and the application of nonparametric prediction models
Attanasi, E.D.; Coburn, T.C.; Freeman, P.A.
2008-01-01
Sustained increases in energy prices have focused attention on gas resources in low-permeability shale or in coals that were previously considered economically marginal. Daily well deliverability is often relatively small, although the estimates of the total volumes of recoverable resources in these settings are often large. Planning and development decisions for extraction of such resources must be areawide because profitable extraction requires optimization of scale economies to minimize costs and reduce risk. For an individual firm, the decision to enter such plays depends on reconnaissance-level estimates of regional recoverable resources and on cost estimates to develop untested areas. This paper shows how simple nonparametric local regression models, used to predict technically recoverable resources at untested sites, can be combined with economic models to compute regional-scale cost functions. The context of the worked example is the Devonian Antrim-shale gas play in the Michigan basin. One finding relates to selection of the resource prediction model to be used with economic models. Models chosen because they can best predict aggregate volume over larger areas (many hundreds of sites) smooth out granularity in the distribution of predicted volumes at individual sites. This loss of detail affects the representation of economic cost functions and may affect economic decisions. Second, because some analysts consider unconventional resources to be ubiquitous, the selection and order of specific drilling sites may, in practice, be determined arbitrarily by extraneous factors. The analysis shows a 15-20% gain in gas volume when these simple models are applied to order drilling prospects strategically rather than to choose drilling locations randomly. Copyright © 2008 Society of Petroleum Engineers.
Non-parametric estimation of low-concentration benzene metabolism.
Cox, Louis A; Schnatter, A Robert; Boogaard, Peter J; Banton, Marcy; Ketelslegers, Hans B
2017-12-25
Two apparently contradictory findings in the literature on low-dose human metabolism of benzene are as follows. First, metabolism is approximately linear at low concentrations, e.g., below 10 ppm. This is consistent with decades of quantitative modeling of benzene pharmacokinetics and dose-dependent metabolism. Second, measured benzene exposure and metabolite concentrations for occupationally exposed benzene workers in Tianjin, China show that dose-specific metabolism (DSM) ratios of metabolite concentrations per ppm of benzene in air decrease steadily with benzene concentration, with the steepest decreases below 3 ppm. This has been interpreted as indicating that metabolism at low concentrations of benzene is highly nonlinear. We reexamine the data using non-parametric methods. Our main conclusion is that both findings are correct; they are not contradictory. Low-concentration metabolism can be linear, with metabolite concentrations proportional to benzene concentrations in air, and yet DSM ratios can still decrease with benzene concentrations. This is because a ratio of random variables can be negatively correlated with its own denominator even if the mean of the numerator is proportional to the denominator. Interpreting DSM ratios that decrease with air benzene concentrations as evidence of nonlinear metabolism is therefore unwarranted when plots of metabolite concentrations against benzene ppm in air show approximately straight-line relationships between them, as in the Tianjin data. Thus, an apparent contradiction that has fueled heated discussions in the recent literature can be resolved by recognizing that highly nonlinear, decreasing DSM ratios are consistent with linear metabolism. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
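The statistical point in this abstract, that a ratio can fall as its denominator rises even when the underlying relationship is linear, can be reproduced in a small simulation. The sketch below is not the authors' analysis: it assumes one specific mechanism (multiplicative measurement error in the recorded air concentration), and every distribution, parameter value and function name in it is a hypothetical choice.

```python
import math
import random


def simulate(n=5000, slope=2.0, seed=1):
    """Simulate linear metabolism observed through a noisy exposure measurement."""
    rng = random.Random(seed)
    log_true, log_obs, log_y, log_ratio = [], [], [], []
    for _ in range(n):
        x_true = math.exp(rng.gauss(0.0, 1.0))          # true exposure (assumed lognormal)
        x_obs = x_true * math.exp(rng.gauss(0.0, 0.5))  # recorded exposure, with error
        # Metabolite level: its mean is proportional to the true exposure.
        y = slope * x_true * math.exp(rng.gauss(0.0, 0.3))
        log_true.append(math.log(x_true))
        log_obs.append(math.log(x_obs))
        log_y.append(math.log(y))
        log_ratio.append(math.log(y / x_obs))           # DSM-style ratio on a log scale
    return log_true, log_obs, log_y, log_ratio


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


log_true, log_obs, log_y, log_ratio = simulate()
ratio_vs_denominator = pearson(log_obs, log_ratio)  # negative under these assumptions
log_log_slope = ols_slope(log_true, log_y)          # close to 1, i.e. proportional
```

Under these assumptions the ratio is negatively correlated with the recorded concentration while the log-log slope against the true concentration stays near 1, which is the pattern the abstract describes as compatible with linear metabolism.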
SOUTH AFRICAN ARMY RANKS AND INSIGNIA
African Journals Online (AJOL)
ants at that time, and they were introduced only at the end of World War ... field cornet and assistant field cornet, commandant apparently being equivalent to major. When the commandos were organised into mounted brigades for the 1914-15. German ... rank of brigadier-general in favour of a field rank, initially designated.
Using centrality to rank web snippets
Jijkoun, V.; de Rijke, M.; Peters, C.; Jijkoun, V.; Mandl, T.; Müller, H.; Oard, D.W.; Peñas, A.; Petras, V.; Santos, D.
2008-01-01
We describe our participation in the WebCLEF 2007 task, targeted at snippet retrieval from web data. Our system ranks snippets based on a simple similarity-based centrality, inspired by the web page ranking algorithms. We experimented with retrieval units (sentences and paragraphs) and with the
Entity ranking using Wikipedia as a pivot
Kaptein, R.; Serdyukov, P.; de Vries, A.; Kamps, J.; Huang, X.J.; Jones, G.; Koudas, N.; Wu, X.; Collins-Thompson, K.
2010-01-01
In this paper we investigate the task of Entity Ranking on the Web. Searchers looking for entities are arguably better served by presenting a ranked list of entities directly, rather than a list of web pages with relevant but also potentially redundant information about these entities. Since
Entity Ranking using Wikipedia as a Pivot
R. Kaptein; P. Serdyukov; A.P. de Vries (Arjen); J. Kamps
2010-01-01
In this paper we investigate the task of Entity Ranking on the Web. Searchers looking for entities are arguably better served by presenting a ranked list of entities directly, rather than a list of web pages with relevant but also potentially redundant information about
Ranking Entities in Networks via Lefschetz Duality
DEFF Research Database (Denmark)
Aabrandt, Andreas; Hansen, Vagn Lundsgaard; Poulsen, Bjarne
2014-01-01
then be ranked according to how essential their positions are in the network by considering the effect of their respective absences. Defining a ranking of a network which takes the individual position of each entity into account has the purpose of assigning different roles to the entities, e.g. agents...
Semantic association ranking schemes for information retrieval ...
Indian Academy of Sciences (India)
problem into machine learning problem. Typically, the documents are ... of-words retrieval function that ranks a set of documents based on the query terms appearing in .... Graph-based document ranking algorithms have been widely used in calculating term weights to represent the contribution of a term in search context.
Contests with rank-order spillovers
M.R. Baye (Michael); D. Kovenock (Dan); C.G. de Vries (Casper)
2012-01-01
This paper presents a unified framework for characterizing symmetric equilibrium in simultaneous move, two-player, rank-order contests with complete information, in which each player's strategy generates direct or indirect affine "spillover" effects that depend on the rank-order of her
Classification of rank 2 cluster varieties
DEFF Research Database (Denmark)
Mandel, Travis
We classify rank 2 cluster varieties (those whose corresponding skew-form has rank 2) according to the deformation type of a generic fiber U of their X-spaces, as defined by Fock and Goncharov. Our approach is based on the work of Gross, Hacking, and Keel for cluster varieties and log Calabi...
Embedded feature ranking for ensemble MLP classifiers.
Windeatt, Terry; Duangsoithong, Rakkrit; Smith, Raymond
2011-06-01
A feature ranking scheme for multilayer perceptron (MLP) ensembles is proposed, along with a stopping criterion based upon the out-of-bootstrap estimate. To solve multi-class problems feature ranking is combined with modified error-correcting output coding. Experimental results on benchmark data demonstrate the versatility of the MLP base classifier in removing irrelevant features.
Neural Ranking Models with Weak Supervision
Dehghani, M.; Zamani, H.; Severyn, A.; Kamps, J.; Croft, W.B.
2017-01-01
Despite the impressive improvements achieved by unsupervised deep neural networks in computer vision and NLP tasks, such improvements have not yet been observed in ranking for information retrieval. The reason may be the complexity of the ranking problem, as it is not obvious how to learn from
Ranking scientific publications: the effect of nonlinearity
Yao, Liyang; Wei, Tian; Zeng, An; Fan, Ying; Di, Zengru
2014-10-01
Ranking the significance of scientific publications is a long-standing challenge. The network-based analysis is a natural and common approach for evaluating the scientific credit of papers. Although the number of citations has been widely used as a metric to rank papers, recently some iterative processes such as the well-known PageRank algorithm have been applied to the citation networks to address this problem. In this paper, we introduce nonlinearity to the PageRank algorithm when aggregating resources from different nodes to further enhance the effect of important papers. The validation of our method is performed on the data of American Physical Society (APS) journals. The results indicate that the nonlinearity improves the performance of the PageRank algorithm in terms of ranking effectiveness, as well as robustness against malicious manipulations. Although the nonlinearity analysis is based on the PageRank algorithm, it can be easily extended to other iterative ranking algorithms and similar improvements are expected.
Ranking scientific publications: the effect of nonlinearity.
Yao, Liyang; Wei, Tian; Zeng, An; Fan, Ying; Di, Zengru
2014-10-17
Ranking the significance of scientific publications is a long-standing challenge. The network-based analysis is a natural and common approach for evaluating the scientific credit of papers. Although the number of citations has been widely used as a metric to rank papers, recently some iterative processes such as the well-known PageRank algorithm have been applied to the citation networks to address this problem. In this paper, we introduce nonlinearity to the PageRank algorithm when aggregating resources from different nodes to further enhance the effect of important papers. The validation of our method is performed on the data of American Physical Society (APS) journals. The results indicate that the nonlinearity improves the performance of the PageRank algorithm in terms of ranking effectiveness, as well as robustness against malicious manipulations. Although the nonlinearity analysis is based on the PageRank algorithm, it can be easily extended to other iterative ranking algorithms and similar improvements are expected.
Dynamic collective entity representations for entity ranking
Graus, D.; Tsagkias, M.; Weerkamp, W.; Meij, E.; de Rijke, M.
2016-01-01
Entity ranking, i.e., successfully positioning a relevant entity at the top of the ranking for a given query, is inherently difficult due to the potential mismatch between the entity's description in a knowledge base, and the way people refer to the entity when searching for it. To counter this
Kernel bandwidth estimation for non-parametric density estimation: a comparative study
CSIR Research Space (South Africa)
Van der Walt, CM
2013-12-01
We investigate the performance of conventional bandwidth estimators for non-parametric kernel density estimation on a number of representative pattern-recognition tasks, to gain a better understanding of the behaviour of these estimators in high...
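As a minimal illustration of what a bandwidth estimator does in kernel density estimation, the sketch below uses Silverman's rule of thumb with a Gaussian kernel in one dimension. The abstract does not name the estimators it compares, so this rule is an assumed example of a conventional choice, and the helper names and the synthetic sample are hypothetical.

```python
import math
import random
import statistics


def silverman_bandwidth(data):
    """Silverman's rule of thumb: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles of the sample
    spread = min(sd, (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-0.2)


def gaussian_kde(data, h):
    """Return a function evaluating the Gaussian kernel density estimate."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


# Synthetic one-dimensional sample, for illustration only.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
h = silverman_bandwidth(sample)
density = gaussian_kde(sample, h)
```

A larger `h` gives a smoother estimate and a smaller `h` a spikier one, which is why the choice of bandwidth estimator matters for the tasks the abstract mentions.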
Alberto Baccini; Antonio Banfi; Giuseppe De Nicolao; Paola Galimberti
2015-01-01
University rankings represent a controversial issue in the debate about higher education policy. One of the best known university ranking is the Quacquarelli Symonds World University Rankings (QS), published annually since 2004 by Quacquarelli Symonds ltd, a company founded in 1990 and headquartered in London. QS provides a ranking based on a score calculated by weighting six different indicators. The 2015 edition, published in October 2015, introduced major methodological innovations and, as...
Ocampo-Duque, William; Osorio, Carolina; Piamba, Christian; Schuhmacher, Marta; Domingo, José L
2013-02-01
The integration of water quality monitoring variables is essential in environmental decision making. Nowadays, advanced techniques to manage subjectivity, imprecision, uncertainty, vagueness, and variability are required in such a complex evaluation process. We here propose a probabilistic fuzzy hybrid model to assess river water quality. Fuzzy logic reasoning has been used to compute a water quality integrative index. By applying a Monte Carlo technique, based on non-parametric probability distributions, the randomness of model inputs was estimated. Annual histograms of nine water quality variables were built with monitoring data systematically collected in the Colombian Cauca River, and probability density estimations using the kernel smoothing method were applied to fit data. Several years were assessed, and river sectors upstream and downstream of the city of Santiago de Cali, a big city with basic wastewater treatment and high industrial activity, were analyzed. The probabilistic fuzzy water quality index was able to explain the reduction in water quality, as the river receives a larger number of agriculture, domestic, and industrial effluents. The results of the hybrid model were compared to traditional water quality indexes. The main advantage of the proposed method is that it considers flexible boundaries between the linguistic qualifiers used to define the water status: the degree to which water quality belongs to the diverse output fuzzy sets or classes is provided with percentiles and histograms, which allows a better classification of the real water condition. The results of this study show that fuzzy inference systems integrated with stochastic non-parametric techniques may be used as complementary tools in water quality indexing methodologies. Copyright © 2012 Elsevier Ltd. All rights reserved.
Reliability of journal impact factor rankings
Directory of Open Access Journals (Sweden)
Greenwood Darren C
2007-11-01
Background: Journal impact factors and their ranks are used widely by journals, researchers, and research assessment exercises. Methods: Based on citations to journals in research and experimental medicine in 2005, Bayesian Markov chain Monte Carlo methods were used to estimate the uncertainty associated with these journal performance indicators. Results: Intervals representing plausible ranges of values for journal impact factor ranks indicated that most journals cannot be ranked with great precision. Only the top and bottom few journals could place any confidence in their rank position. Intervals were wider and overlapping for most journals. Conclusion: Decisions based on journal impact factors are potentially misleading where the uncertainty associated with the measure is ignored. This article proposes that caution should be exercised in the interpretation of journal impact factors and their ranks, and specifically that a measure of uncertainty should be routinely presented alongside the point estimate.
Comparing classical and quantum PageRanks
Loke, T.; Tang, J. W.; Rodriguez, J.; Small, M.; Wang, J. B.
2017-01-01
Following recent developments in quantum PageRanking, we present a comparative analysis of discrete-time and continuous-time quantum-walk-based PageRank algorithms. Relative to classical PageRank and to different extents, the quantum measures better highlight secondary hubs and resolve ranking degeneracy among peripheral nodes for all networks we studied in this paper. For the discrete-time case, we investigated the periodic nature of the walker's probability distribution for a wide range of networks and found that the dominant period does not grow with the size of these networks. Based on this observation, we introduce a new quantum measure using the maximum probabilities of the associated walker during the first couple of periods. This is particularly important, since it leads to a quantum PageRanking scheme that is scalable with respect to network size.
Universal emergence of PageRank
Energy Technology Data Exchange (ETDEWEB)
Frahm, K M; Georgeot, B; Shepelyansky, D L, E-mail: frahm@irsamc.ups-tlse.fr, E-mail: georgeot@irsamc.ups-tlse.fr, E-mail: dima@irsamc.ups-tlse.fr [Laboratoire de Physique Theorique du CNRS, IRSAMC, Universite de Toulouse, UPS, 31062 Toulouse (France)
2011-11-18
The PageRank algorithm enables us to rank the nodes of a network through a specific eigenvector of the Google matrix, using a damping parameter α ∈ ]0, 1[. Using extensive numerical simulations of large web networks, with a special accent on British University networks, we determine numerically and analytically the universal features of the PageRank vector at its emergence when α → 1. The whole network can be divided into a core part and a group of invariant subspaces. For α → 1, PageRank converges to a universal power-law distribution on the invariant subspaces whose size distribution also follows a universal power law. The convergence of PageRank at α → 1 is controlled by eigenvalues of the core part of the Google matrix, which are extremely close to unity, leading to large relaxation times as, for example, in spin glasses.
Nonparametric Identification and Estimation of Finite Mixture Models of Dynamic Discrete Choices
Hiroyuki Kasahara; Katsumi Shimotsu
2006-01-01
In dynamic discrete choice analysis, controlling for unobserved heterogeneity is an important issue, and finite mixture models provide flexible ways to account for unobserved heterogeneity. This paper studies nonparametric identifiability of type probabilities and type-specific component distributions in finite mixture models of dynamic discrete choices. We derive sufficient conditions for nonparametric identification for various finite mixture models of dynamic discrete choices used in appli...
A new method of joint nonparametric estimation of probability density and its support
Moriyama, Taku
2017-01-01
In this paper we propose a new method of joint nonparametric estimation of a probability density and its support. As is well known, the nonparametric kernel density estimator has a "boundary bias problem" when the support of the population density is not the whole real line. To avoid the unknown boundary effects, our estimator detects the boundary and simultaneously eliminates the boundary bias of the estimator. Moreover, we refer to an extension to a simple multivariate case, and propose an improved e...
Examples of the Application of Nonparametric Information Geometry to Statistical Physics
Directory of Open Access Journals (Sweden)
Giovanni Pistone
2013-09-01
We review a nonparametric version of Amari’s information geometry in which the set of positive probability densities on a given sample space is endowed with an atlas of charts to form a differentiable manifold modeled on Orlicz Banach spaces. This nonparametric setting is used to discuss the setting of typical problems in machine learning and statistical physics, such as black-box optimization, Kullback-Leibler divergence, Boltzmann-Gibbs entropy and the Boltzmann equation.
Kwan, Betty P.; O'Brien, T. Paul
2015-06-01
The Aerospace Corporation performed a study to determine whether static percentiles of AE9/AP9 can be used to approximate dynamic Monte Carlo runs for radiation analysis of spiral transfer orbits. Solar panel degradation is a major concern for solar-electric propulsion because solar-electric propulsion depends on the power output of the solar panel. Different spiral trajectories have different radiation environments that could lead to solar panel degradation. Because the spiral transfer orbits only last weeks to months, an average environment does not adequately address the possible transient enhancements of the radiation environment that must be accounted for in optimizing the transfer orbit trajectory. Therefore, to optimize the trajectory, an ensemble of Monte Carlo simulations of AE9/AP9 would normally be run for every spiral trajectory to determine the 95th percentile radiation environment. To avoid performing lengthy Monte Carlo dynamic simulations for every candidate spiral trajectory in the optimization, we found a static percentile that would be an accurate representation of the full Monte Carlo simulation for a representative set of spiral trajectories. For 3 LEO to GEO and 1 LEO to MEO trajectories, a static 90th percentile AP9 is a good approximation of the 95th percentile fluence with dynamics for 4-10 MeV protons, and a static 80th percentile AE9 is a good approximation of the 95th percentile fluence with dynamics for 0.5-2 MeV electrons. While the specific percentiles chosen cannot necessarily be used in general for other orbit trade studies, the concept of determining a static percentile as a quick approximation to a full Monte Carlo ensemble of simulations can likely be applied to other orbit trade studies. We expect the static percentile to depend on the region of space traversed, the mission duration, and the radiation effect considered.
A tilting approach to ranking influence
Genton, Marc G.
2014-12-01
We suggest a new approach, which is applicable for general statistics computed from random samples of univariate or vector-valued or functional data, to assessing the influence that individual data have on the value of a statistic, and to ranking the data in terms of that influence. Our method is based on, first, perturbing the value of the statistic by ‘tilting’, or reweighting, each data value, where the total amount of tilt is constrained to be the least possible, subject to achieving a given small perturbation of the statistic, and, then, taking the ranking of the influence of data values to be that which corresponds to ranking the changes in data weights. It is shown, both theoretically and numerically, that this ranking does not depend on the size of the perturbation, provided that the perturbation is sufficiently small. That simple result leads directly to an elegant geometric interpretation of the ranks; they are the ranks of the lengths of projections of the weights onto a ‘line’ determined by the first empirical principal component function in a generalized measure of covariance. To illustrate the generality of the method we introduce and explore it in the case of functional data, where (for example) it leads to generalized boxplots. The method has the advantage of providing an interpretable ranking that depends on the statistic under consideration. For example, the ranking of data, in terms of their influence on the value of a statistic, is different for a measure of location and for a measure of scale. This is as it should be; a ranking of data in terms of their influence should depend on the manner in which the data are used. Additionally, the ranking recognizes, rather than ignores, sign, and in particular can identify left- and right-hand ‘tails’ of the distribution of a random function or vector.
A Ranking Approach to Genomic Selection.
Blondel, Mathieu; Onogi, Akio; Iwata, Hiroyoshi; Ueda, Naonori
2015-01-01
Genomic selection (GS) is a recent selective breeding method which uses predictive models based on whole-genome molecular markers. Until now, existing studies formulated GS as the problem of modeling an individual's breeding value for a particular trait of interest, i.e., as a regression problem. To assess predictive accuracy of the model, the Pearson correlation between observed and predicted trait values was used. In this paper, we propose to formulate GS as the problem of ranking individuals according to their breeding value. Our proposed framework allows us to employ machine learning methods for ranking which had previously not been considered in the GS literature. To assess ranking accuracy of a model, we introduce a new measure originating from the information retrieval literature called normalized discounted cumulative gain (NDCG). NDCG rewards more strongly models which assign a high rank to individuals with high breeding value. Therefore, NDCG reflects a prerequisite objective in selective breeding: accurate selection of individuals with high breeding value. We conducted a comparison of 10 existing regression methods and 3 new ranking methods on 6 datasets, consisting of 4 plant species and 25 traits. Our experimental results suggest that tree-based ensemble methods including McRank, Random Forests and Gradient Boosting Regression Trees achieve excellent ranking accuracy. RKHS regression and RankSVM also achieve good accuracy when used with an RBF kernel. Traditional regression methods such as Bayesian lasso, wBSR and BayesC were found less suitable for ranking. Pearson correlation was found to correlate poorly with NDCG. Our study suggests two important messages. First, ranking methods are a promising research direction in GS. Second, NDCG can be a useful evaluation measure for GS.
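As an illustration of the evaluation measure named in the abstract above, the following is a minimal sketch of NDCG at a cutoff k. It assumes that the observed breeding values are non-negative and are used directly as gains, and that the standard log2 position discount applies; the gain function actually used in the paper may differ. The function names `dcg` and `ndcg_at_k` are hypothetical.

```python
import math


def dcg(gains, k):
    """Discounted cumulative gain of the first k gains, in the order given."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(true_values, predicted_scores, k):
    """NDCG@k: DCG of the predicted ordering divided by the ideal DCG.

    Assumption: true_values are non-negative and serve directly as gains.
    """
    # Order individuals by predicted score, best first.
    order = sorted(range(len(true_values)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked_gains = [true_values[i] for i in order]
    # The ideal ordering sorts individuals by their true value.
    ideal = dcg(sorted(true_values, reverse=True), k)
    return dcg(ranked_gains, k) / ideal if ideal > 0 else 0.0


# Toy example with made-up values: a model that orders individuals
# perfectly, and one that orders them in reverse.
observed = [3.0, 2.0, 1.0, 0.0]
perfect = ndcg_at_k(observed, [0.9, 0.8, 0.1, 0.0], k=4)
reversed_order = ndcg_at_k(observed, [0.0, 0.1, 0.8, 0.9], k=4)
```

Because the discount shrinks with position, errors at the top of the list cost more than errors further down, which is the property the abstract highlights for selecting individuals with high breeding value.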
PageRank and rank-reversal dependence on the damping factor
Son, S.-W.; Christensen, C.; Grassberger, P.; Paczuski, M.
2012-12-01
PageRank (PR) is an algorithm originally developed by Google to evaluate the importance of web pages. Considering how deeply rooted Google's PR algorithm is to gathering relevant information or to the success of modern businesses, the question of rank stability and choice of the damping factor (a parameter in the algorithm) is clearly important. We investigate PR as a function of the damping factor d on a network obtained from a domain of the World Wide Web, finding that rank reversal happens frequently over a broad range of PR (and of d). We use three different correlation measures, Pearson, Spearman, and Kendall, to study rank reversal as d changes, and we show that the correlation of PR vectors drops rapidly as d changes from its frequently cited value, d0=0.85. Rank reversal is also observed by measuring the Spearman and Kendall rank correlation, which evaluate relative ranks rather than absolute PR. Rank reversal happens not only in directed networks containing rank sinks but also in a single strongly connected component, which by definition does not contain any sinks. We relate rank reversals to rank pockets and bottlenecks in the directed network structure. For the network studied, the relative rank is more stable by our measures around d=0.65 than at d=d0.
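The dependence of rankings on the damping factor can be explored on a small scale with a sketch like the one below. It is not the authors' code: the toy graph, the uniform redistribution of rank from dangling nodes, and the use of Kendall's tau-a are all assumptions made for illustration, and `pagerank` and `kendall_tau` are hypothetical helper names.

```python
def pagerank(links, d=0.85, tol=1e-12, max_iter=1000):
    """Power iteration; links maps each node to the list of nodes it links to."""
    nodes = sorted(links)
    n = len(nodes)
    pr = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Assumption: rank held by dangling nodes is spread uniformly.
        dangling = sum(pr[v] for v in nodes if not links[v])
        new = {v: (1.0 - d) / n + d * dangling / n for v in nodes}
        for v in nodes:
            for w in links[v]:
                new[w] += d * pr[v] / len(links[v])
        delta = sum(abs(new[v] - pr[v]) for v in nodes)
        pr = new
        if delta < tol:
            break
    return pr


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # tied pairs contribute 0
    return s / (n * (n - 1) / 2)


# Toy directed graph (made up); every link target is also a key.
toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"], "e": []}
nodes = sorted(toy)
pr_high = pagerank(toy, d=0.85)
pr_low = pagerank(toy, d=0.65)
tau = kendall_tau([pr_high[v] for v in nodes], [pr_low[v] for v in nodes])
```

A tau below 1 between the two PageRank vectors would indicate that at least one pair of nodes swaps order as d changes, which is the rank reversal the abstract describes.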
Nonparametric Change Point Diagnosis Method of Concrete Dam Crack Behavior Abnormality
Directory of Open Access Journals (Sweden)
Zhanchao Li
2013-01-01
Diagnosing abnormal crack behavior in concrete dams has long been a focus and a difficulty in the safety monitoring of hydraulic structures. Based on how abnormal crack behavior in concrete dams manifests in parametric and nonparametric statistical models, the internal relation between such abnormality and statistical change point theory is analyzed in depth, in terms of the structural instability of the parametric statistical model and the change in the sequence distribution law of the nonparametric statistical model. On this basis, through reduction of the change point problem, establishment of a basic nonparametric change point model, and asymptotic analysis of the test method for the basic change point problem, a nonparametric change point diagnosis method for abnormal crack behavior in concrete dams is developed, taking into account that in practice crack behavior may have multiple abnormality points. The method is applied to an actual project, demonstrating its effectiveness and scientific soundness. The method also has a complete theoretical basis and strong practicality, with broad application prospects in actual projects.
First rank symptoms for schizophrenia.
Soares-Weiser, Karla; Maayan, Nicola; Bergman, Hanna; Davenport, Clare; Kirkham, Amanda J; Grabowski, Sarah; Adams, Clive E
2015-01-25
Early and accurate diagnosis and treatment of schizophrenia may have long-term advantages for the patient; the longer psychosis goes untreated the more severe the repercussions for relapse and recovery. If the correct diagnosis is not schizophrenia, but another psychotic disorder with some symptoms similar to schizophrenia, appropriate treatment might be delayed, with possible severe repercussions for the person involved and their family. There is widespread uncertainty about the diagnostic accuracy of First Rank Symptoms (FRS); we examined whether they are a useful diagnostic tool to differentiate schizophrenia from other psychotic disorders. To determine the diagnostic accuracy of one or multiple FRS for diagnosing schizophrenia, verified by clinical history and examination by a qualified professional (e.g. psychiatrists, nurses, social workers), with or without the use of operational criteria and checklists, in people thought to have non-organic psychotic symptoms. We conducted searches in MEDLINE, EMBASE, and PsycInfo using OvidSP in April, June, July 2011 and December 2012. We also searched MEDION in December 2013. We selected studies that consecutively enrolled or randomly selected adults and adolescents with symptoms of psychosis, and assessed the diagnostic accuracy of FRS for schizophrenia compared to history and clinical examination performed by a qualified professional, which may or may not involve the use of symptom checklists or be based on operational criteria such as ICD and DSM. Two review authors independently screened all references for inclusion. Risk of bias in included studies was assessed using the QUADAS-2 instrument. We recorded the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) for constructing a 2 x 2 table for each study or derived 2 x 2 data from reported summary statistics such as sensitivity, specificity, and/or likelihood ratios. We included 21 studies with a total of 6253 participants
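The relation between a 2 x 2 table and the summary statistics mentioned in the abstract above can be sketched as follows. The counts in the example are hypothetical and are not taken from the review; `diagnostic_accuracy` is an illustrative helper name.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity, specificity and likelihood ratios from a 2 x 2 table.

    Assumes at least one diseased and one non-diseased participant.
    Likelihood ratios are returned as None when their denominator is zero.
    """
    sensitivity = tp / (tp + fn)   # proportion of true cases test-positive
    specificity = tn / (tn + fp)   # proportion of non-cases test-negative
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else None
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else None
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_pos,
        "lr_negative": lr_neg,
    }


# Hypothetical counts for a single study (not data from the review).
result = diagnostic_accuracy(tp=60, fp=20, fn=40, tn=80)
```

With these made-up counts, sensitivity is 60/100 and specificity is 80/100; the same identities can be inverted to recover a 2 x 2 table from reported sensitivity and specificity when the group sizes are known.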
An adaptive distance measure for use with nonparametric models
International Nuclear Information System (INIS)
Garvey, D. R.; Hines, J. W.
2006-01-01
Distance measures perform a critical task in nonparametric, locally weighted regression. Locally weighted regression (LWR) models are a form of 'lazy learning' which construct a local model 'on the fly' by comparing a query vector to historical, exemplar vectors according to a three step process. First, the distance of the query vector to each of the exemplar vectors is calculated. Next, these distances are passed to a kernel function, which converts the distances to similarities or weights. Finally, the model output or response is calculated by performing locally weighted polynomial regression. To date, traditional distance measures, such as the Euclidean, weighted Euclidean, and L1-norm have been used as the first step in the prediction process. Since these measures do not take into consideration sensor failures and drift, they are inherently ill-suited for application to 'real world' systems. This paper describes one such LWR model, namely auto associative kernel regression (AAKR), and describes a new, Adaptive Euclidean distance measure that can be used to dynamically compensate for faulty sensor inputs. In this new distance measure, the query observations that lie outside of the training range (i.e. outside the minimum and maximum input exemplars) are dropped from the distance calculation. This allows for the distance calculation to be robust to sensor drifts and failures, in addition to providing a method for managing inputs that exceed the training range. In this paper, AAKR models using the standard and Adaptive Euclidean distance are developed and compared for the pressure system of an operating nuclear power plant. It is shown that using the standard Euclidean distance for data with failed inputs, significant errors in the AAKR predictions can result. By using the Adaptive Euclidean distance it is shown that high fidelity predictions are possible, in spite of the input failure. In fact, it is shown that with the Adaptive Euclidean distance prediction
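The three-step process described in the abstract above (distance, kernel weighting, local prediction) can be sketched as below. This is an illustrative reading, not the authors' implementation: it assumes a Gaussian kernel with a hand-picked bandwidth and a zero-order (weighted average) local model, and the names `adaptive_distance` and `aakr_predict` are hypothetical.

```python
import math


def adaptive_distance(query, exemplar, lo, hi):
    """Euclidean distance that skips query components outside [lo, hi]."""
    sq = 0.0
    for q, x, a, b in zip(query, exemplar, lo, hi):
        if a <= q <= b:  # out-of-range inputs are dropped from the sum
            sq += (q - x) ** 2
    return math.sqrt(sq)


def aakr_predict(query, exemplars, bandwidth=1.0):
    """Kernel-weighted average of exemplars (assumed Gaussian kernel)."""
    dims = range(len(query))
    # Training range per input: minimum and maximum over the exemplars.
    lo = [min(e[j] for e in exemplars) for j in dims]
    hi = [max(e[j] for e in exemplars) for j in dims]
    dists = [adaptive_distance(query, e, lo, hi) for e in exemplars]
    weights = [math.exp(-(d * d) / (2.0 * bandwidth ** 2)) for d in dists]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase bandwidth")
    return [sum(w * e[j] for w, e in zip(weights, exemplars)) / total
            for j in dims]


# Made-up exemplars for two correlated sensors; the second query input
# (999.0) lies far outside the training range, mimicking a failed sensor.
history = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
corrected = aakr_predict([2.0, 999.0], history)
```

In this toy case the failed second input is ignored when computing distances, so the prediction is driven by the healthy first input and the reconstructed value for the second sensor stays within the training range.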
Directory of Open Access Journals (Sweden)
Alberto Baccini
2015-10-01
University rankings represent a controversial issue in the debate about higher education policy. One of the best-known university rankings is the Quacquarelli Symonds World University Rankings (QS), published annually since 2004 by Quacquarelli Symonds Ltd, a company founded in 1990 and headquartered in London. QS provides a ranking based on a score calculated by weighting six different indicators. The 2015 edition, published in October 2015, introduced major methodological innovations and, as a consequence, many universities worldwide underwent major changes in their scores and ranks. Ben Sowter, head of division of the intelligence unit of Quacquarelli Symonds, responds to 15 questions about the new QS methodology.
Directory of Open Access Journals (Sweden)
Pedro Bernardino
2010-03-01
Academic rankings are a controversial subject in higher education. However, despite all the criticism, academic rankings are here to stay, and more and more stakeholders use rankings to obtain information about institutions' performance. The two best-known rankings, The Times and the Shanghai Jiao Tong University rankings, have different methodologies. The Times ranking is based on peer review, whereas the Shanghai ranking has only quantitative indicators and is mainly based on research outputs. In Germany, the CHE ranking uses a different methodology from the traditional rankings, allowing users to choose criteria and weights. Portuguese higher education institutions are performing below their European peers, and the Government believes that an academic ranking could improve both performance and competitiveness between institutions. The purpose of this paper is to analyse the advantages and problems of academic rankings and provide guidance for a new Portuguese ranking.
Ranking Adverse Drug Reactions With Crowdsourcing
Gottlieb, Assaf
2015-03-23
Background: There is no publicly available resource that provides the relative severity of adverse drug reactions (ADRs). Such a resource would be useful for several applications, including assessment of the risks and benefits of drugs and improvement of patient-centered care. It could also be used to triage predictions of drug adverse events. Objective: The intent of the study was to rank ADRs according to severity. Methods: We used Internet-based crowdsourcing to rank ADRs according to severity. We assigned 126,512 pairwise comparisons of ADRs to 2589 Amazon Mechanical Turk workers and used these comparisons to rank order 2929 ADRs. Results: There is good correlation (rho=.53) between the mortality rates associated with ADRs and their rank. Our ranking highlights severe drug-ADR predictions, such as cardiovascular ADRs for raloxifene and celecoxib. It also triages genes associated with severe ADRs such as epidermal growth-factor receptor (EGFR), associated with glioblastoma multiforme, and SCN1A, associated with epilepsy. Conclusions: ADR ranking lays a first stepping stone in personalized drug risk assessment. Ranking of ADRs using crowdsourcing may have useful clinical and financial implications, and should be further investigated in the context of health care decision making.
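The abstract above does not state how the pairwise comparisons were aggregated into a rank order. Purely as an illustration of the general idea, the sketch below ranks items by the fraction of comparisons in which they were judged more severe; this win-fraction rule is an assumption, not the study's method, and the item labels are placeholders.

```python
from collections import defaultdict


def rank_by_win_fraction(comparisons):
    """Rank items from pairwise judgments.

    comparisons: iterable of (more_severe, less_severe) pairs.
    Returns items ordered from most to least severe by win fraction,
    with ties broken alphabetically.
    """
    wins = defaultdict(int)
    total = defaultdict(int)
    for winner, loser in comparisons:
        wins[winner] += 1
        total[winner] += 1
        total[loser] += 1
    return sorted(total, key=lambda item: (-wins[item] / total[item], item))


# Placeholder labels standing in for adverse drug reactions.
judgments = [("adr_a", "adr_b"), ("adr_a", "adr_c"), ("adr_b", "adr_c")]
order = rank_by_win_fraction(judgments)
```

A win fraction ignores the strength of the opponents each item faced; model-based alternatives such as Bradley-Terry address that, at the cost of an iterative fit.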
Adiabatic quantum algorithm for search engine ranking.
Garnerone, Silvano; Zanardi, Paolo; Lidar, Daniel A
2012-06-08
We propose an adiabatic quantum algorithm for generating a quantum pure state encoding of the PageRank vector, the most widely used tool in ranking the relative importance of internet pages. We present extensive numerical simulations which provide evidence that this algorithm can prepare the quantum PageRank state in a time which, on average, scales polylogarithmically in the number of web pages. We argue that the main topological feature of the underlying web graph allowing for such a scaling is the out-degree distribution. The top-ranked log(n) entries of the quantum PageRank state can then be estimated with a polynomial quantum speed-up. Moreover, the quantum PageRank state can be used in "q-sampling" protocols for testing properties of distributions, which require exponentially fewer measurements than all classical schemes designed for the same task. This can be used to decide whether to run a classical update of the PageRank.
Ranking adverse drug reactions with crowdsourcing.
Gottlieb, Assaf; Hoehndorf, Robert; Dumontier, Michel; Altman, Russ B
2015-03-23
There is no publicly available resource that provides the relative severity of adverse drug reactions (ADRs). Such a resource would be useful for several applications, including assessment of the risks and benefits of drugs and improvement of patient-centered care. It could also be used to triage predictions of drug adverse events. The intent of the study was to rank ADRs according to severity. We used Internet-based crowdsourcing to rank ADRs according to severity. We assigned 126,512 pairwise comparisons of ADRs to 2589 Amazon Mechanical Turk workers and used these comparisons to rank order 2929 ADRs. There is good correlation (rho=.53) between the mortality rates associated with ADRs and their rank. Our ranking highlights severe drug-ADR predictions, such as cardiovascular ADRs for raloxifene and celecoxib. It also triages genes associated with severe ADRs such as epidermal growth-factor receptor (EGFR), associated with glioblastoma multiforme, and SCN1A, associated with epilepsy. ADR ranking lays a first stepping stone in personalized drug risk assessment. Ranking of ADRs using crowdsourcing may have useful clinical and financial implications, and should be further investigated in the context of health care decision making.
Tao, Chenyang; Nichols, Thomas E; Hua, Xue; Ching, Christopher R K; Rolls, Edmund T; Thompson, Paul M; Feng, Jianfeng
2017-01-01
We propose a generalized reduced rank latent factor regression model (GRRLF) for the analysis of tensor field responses and high dimensional covariates. The model is motivated by the need from imaging-genetic studies to identify genetic variants that are associated with brain imaging phenotypes, often in the form of high dimensional tensor fields. GRRLF identifies from the structure in the data the effective dimensionality of the data, and then jointly performs dimension reduction of the covariates, dynamic identification of latent factors, and nonparametric estimation of both covariate and latent response fields. After accounting for the latent and covariate effects, GRRLF performs a nonparametric test on the remaining factor of interest. GRRLF provides a better factorization of the signals compared with common solutions, and is less susceptible to overfitting because it exploits the effective dimensionality. The generality and the flexibility of GRRLF also allow various statistical models to be handled in a unified framework and solutions can be efficiently computed. Within the field of neuroimaging, it improves the sensitivity for weak signals and is a promising alternative to existing approaches. The operation of the framework is demonstrated with both synthetic datasets and a real-world neuroimaging example in which the effects of a set of genes on the structure of the brain at the voxel level were measured, and the results compared favorably with those from existing approaches.
Tan's Epsilon-Determinant and Ranks of Matrices over Semirings
Mohindru, Preeti; Pereira, Rajesh
2015-01-01
We use the ϵ-determinant introduced by Ya-Jia Tan to define a family of ranks of matrices over certain semirings. We show that these ranks generalize some known rank functions over semirings such as the determinantal rank. We also show that this family of ranks satisfies the rank-sum and Sylvester inequalities. We classify all bijective linear maps which preserve these ranks. PMID:27347506
Vavalle, Nicholas A; Schoell, Samantha L; Weaver, Ashley A; Stitzel, Joel D; Gayzik, F Scott
2014-11-01
Human body finite element models (FEMs) are a valuable tool in the study of injury biomechanics. However, the traditional model development process can be time-consuming. Scaling and morphing an existing FEM is an attractive alternative for generating morphologically distinct models for further study. The objective of this work is to use a radial basis function to morph the Global Human Body Models Consortium (GHBMC) average male model (M50) to the body habitus of a 95th percentile male (M95) and to perform validation tests on the resulting model. The GHBMC M50 model (v. 4.3) was created using anthropometric and imaging data from a living subject representing a 50th percentile male. A similar dataset was collected from a 95th percentile male (22,067 total images) and was used in the morphing process. Homologous landmarks on the reference (M50) and target (M95) geometries, with the existing FE node locations (M50 model), were inputs to the morphing algorithm. The radial basis function was applied to morph the FE model. The model represented a mass of 103.3 kg and contained 2.2 million elements with 1.3 million nodes. Simulations of the M95 in seven loading scenarios were presented, ranging from a chest pendulum impact to a lateral sled test. The morphed model matched anthropometric data to within a root mean square difference of 4.4% while maintaining element quality commensurate with the M50 model and matching other anatomical ranges and targets. The simulation validation data matched experimental data well in most cases.
Saccani, Raquel; Valentini, Nadia C
2012-01-01
To compare Alberta Infant Motor Scale scores for Brazilian infants with the Canadian norm and to construct sex-specific reference curves and percentiles for motor development for a Brazilian population. This study recruited 795 children aged 0 to 18 months from a number of different towns in Brazil. Infants were assessed by an experienced researcher in a silent room using the Alberta Infant Motor Scale. Sex-specific percentiles (P5, P10, P25, P50, P75 and P90) were calculated and analyzed for each age in months from 0 to 18 months. No significant differences (p > 0.05) between boys and girls were observed for the majority of ages. The exception was 14 months, where the girls scored higher for overall motor performance (p = 0.015) and had a higher development percentile (0.021). It was observed that the development curves demonstrated a tendency to nonlinear development in both sexes and for both typical and atypical children. Variation in motor acquisition was minimal at the extremes of the age range: during the first two months of life and from 15 months onwards. Although the Alberta Infant Motor Scale is widely used in both research and clinical practice, it has certain limitations in terms of behavioral differentiation before 2 months and after 15 months. This reduced sensitivity at the extremes of the age range may be related to the number of motor items assessed at these ages and their difficulty. It is suggested that other screening instruments be employed for children over the age of 15 months.
Escobar-Cardozo, Germán D; Correa-Bautista, Jorge E; González-Jiménez, Emilio; Schmidt-RioValle, Jacqueline; Ramírez-Vélez, Robinson
2016-04-01
The analysis of body composition is a fundamental part of nutritional status assessment. The objective of this study was to establish body fat percentiles by bioelectrical impedance in children and adolescents from Bogotá (Colombia) who were part of the FUPRECOL study (Asociación de la Fuerza Prensil con Manifestaciones Tempranas de Riesgo Cardiovascular en Niños y Adolescentes Colombianos - Association between prehensile force and early signs of cardiovascular risk in Colombian children and adolescents). This was a cross-sectional study conducted among 5850 students aged 9-17.9 years old from Bogotá (Colombia). Body fat percentage was measured using foot-to-foot bioelectrical impedance (Tanita®, BF-689), by age and gender. Weight, height, waist circumference, and hip circumference were measured, and sexual maturity was self-staged. Percentiles (P3, P10, P25, P50, P75, P90 and P97) and centile curves were estimated using the LMS method (L [Box-Cox curve], M [median curve] and S [variation coefficient curve]), by age and gender. Subjects included were 2526 children and 3324 adolescents. Body fat percentages and centile curves by age and gender were established. For most age groups, values were higher among girls than boys. Participants with values above P90 were considered to have a high cardiovascular risk due to excess fat (boys > 23.4-28.3, girls > 31.0-34.1). Body fat percentage percentiles measured using bioelectrical impedance by age and gender are presented here and may be used as reference to assess nutritional status and to predict cardiovascular risk due to excess fat at an early age.
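For readers unfamiliar with the LMS method named in the abstract above, the sketch below shows how a centile value is recovered from the three fitted parameters at a given age, using the standard LMS relation C = M(1 + L·S·z)^(1/L), with the limiting form M·exp(S·z) when L = 0. The L, M and S values in the example are made up for illustration and are not taken from the study; `lms_centile` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def lms_centile(L, M, S, percentile):
    """Measurement value at the given percentile from LMS parameters.

    L: Box-Cox power, M: median, S: coefficient of variation.
    """
    # Standard normal deviate corresponding to the requested percentile.
    z = NormalDist().inv_cdf(percentile / 100.0)
    if abs(L) < 1e-12:
        return M * math.exp(S * z)  # limiting case as L approaches 0
    base = 1.0 + L * S * z
    if base <= 0:
        raise ValueError("centile undefined for these L, S and percentile")
    return M * base ** (1.0 / L)


# Hypothetical parameters for one age and sex group (not study values).
p50 = lms_centile(L=-0.5, M=20.0, S=0.2, percentile=50)
p90 = lms_centile(L=-0.5, M=20.0, S=0.2, percentile=90)
```

Evaluating the function across ages, with L, M and S taken from the fitted smooth curves, is what produces the centile curves described in the abstract.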
Kudsk, Kenneth A.; Munoz-del-Rio, Alejandro; Busch, Rebecca A.; Kight, Cassandra E.; Schoeller, Dale A.
2015-01-01
Background: Loss of protein mass and lower fat-free mass index (FFMI) are associated with longer length of stay, post-surgical complications and other poor outcomes in hospitalized patients. Normative data for FFMI of U.S. populations do not exist. This work aims to create a stratified FFMI percentile table for the U.S. population using the large bioelectric impedance analysis data set obtained from the National Health and Nutrition Examination Surveys (NHANES). Methods: Fat-free mass (FFM) was calculated from the NHANES III bioelectric impedance analysis and anthropometric data for males and females ages 12 to over 90 years for three race-ethnicities (non-Hispanic white, non-Hispanic black, and Mexican-American). FFM was normalized by subject height to create a FFMI distribution table for the U.S. population. Selected percentiles were obtained by age, sex, and race-ethnicity. Data were collapsed by race-ethnicity before and after removing obese and underweight subjects to create a FFMI decile table for males and females aged 12 and over for the healthy weight U.S. population. Results: FFMI increased during adolescent growth but stabilized in the early 20s. The FFMI deciles were similar by race-ethnicity and age group, remaining relatively stable between ages of 22 and 80 years. The FFMI deciles for males and females were significantly different. Conclusions: After eliminating the obese and extremely thin, FFMI percentiles remain stable during adult years, allowing creation of age- and race/ethnicity-independent decile tables for males and females. These tables allow stratification of individuals for nutrition intervention trials to depict changing nutrition status during medical, surgical and nutritional interventions. PMID:26092851
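A minimal sketch of the normalization and decile steps, assuming the conventional definition FFMI = FFM / height² (kg/m²); the abstract itself only says FFM was "normalized by subject height". The function names are hypothetical, and the sketch ignores NHANES survey weights, which a real analysis would need to apply.

```python
from statistics import quantiles


def ffmi(ffm_kg, height_m):
    """Fat-free mass index in kg/m^2 (assumed definition: FFM divided by height squared)."""
    return ffm_kg / height_m ** 2


def decile_cutpoints(values):
    """Nine unweighted cut points that split the values into ten equal-probability groups."""
    return quantiles(values, n=10)
```

A decile table per sex could then be built by calling `decile_cutpoints` on the FFMI values of each group separately.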
Carlsson, Anna; Chang, Fred; Lemmen, Paul; Kullgren, Anders; Schmitt, Kai-Uwe; Linder, Astrid; Svensson, Mats Y
2014-01-01
Whiplash-associated disorders (WADs), or whiplash injuries, due to low-severity vehicle crashes are of great concern in motorized countries and it is well established that the risk of such injuries is higher for females than for males, even in similar crash conditions. Recent protective systems have been shown to be more beneficial for males than for females. Hence, there is a need for improved tools to address female WAD prevention when developing and evaluating the performance of whiplash protection systems. The objective of this study is to develop and evaluate a finite element model of a 50th percentile female rear impact crash test dummy. The anthropometry of the 50th percentile female was specified based on literature data. The model, called EvaRID (female rear impact dummy), was based on the same design concept as the existing 50th percentile male rear impact dummy, the BioRID II. A scaling approach was developed and the first version, EvaRID V1.0, was implemented. Its dynamic response was compared to female volunteer data from rear impact sled tests. The EvaRID V1.0 model and the volunteer tests compared well up to ∼250 ms for the head and T1 forward accelerations and rearward linear displacements and for the head rearward angular displacement. Markedly less T1 rearward angular displacement was found for the EvaRID model compared to the female volunteers. Similar results were obtained for the BioRID II model when comparing simulated responses with experimental data under volunteer loading conditions. The results indicate that the biofidelity of the EvaRID V1.0 and BioRID II FE models has limitations, predominantly in the T1 rearward angular displacement, at low velocity changes (7 km/h). The BioRID II model was validated against dummy test results in a loading range close to consumer test conditions (EuroNCAP) and lower severity levels of volunteer testing were not considered. The EvaRID dummy model demonstrated the potential of becoming a valuable tool
Directory of Open Access Journals (Sweden)
Venkaiah K.
2015-06-01
Introduction: The level of development in health and nutrition at district level is useful for planning interventions in less developed districts. Aims & Objectives: To develop a composite index based on 12 variables to compare development across districts in the state of Madhya Pradesh. Material & Methods: Data collected by the National Institute of Nutrition, Hyderabad, during 2010-11 on the nutritional status of rural children at district level in Madhya Pradesh were used. A total of 22,895 children (Boys: 12379, Girls: 10516) were covered. Results: Indore district ranked 1st on the composite index and Singrauli ranked last. Districts were grouped into three categories based on percentiles of the composite index, i.e. less developed, developing and developed districts. There was a significant (p<0.01) trend in the prevalence of undernutrition among the three sets of districts. Similarly, a significant (p<0.01) trend was observed in the proportion of children participating regularly in the ICDS supplementary feeding programme, use of sanitary latrines and use of iodized cooking salt among the three sets of districts. Conclusions: Widespread disparity in health and nutrition was observed among the districts. It is important to examine the extent of improvement needed in different developmental indicators to raise the level of development of the less developed districts. This will help planners and administrators to readjust resources to bring about uniform development in the state.
Who's bigger? where historical figures really rank
Skiena, Steven
2014-01-01
Is Hitler bigger than Napoleon? Washington bigger than Lincoln? Picasso bigger than Einstein? Quantitative analysts are rapidly finding homes in social and cultural domains, from finance to politics. What about history? In this fascinating book, Steve Skiena and Charles Ward bring quantitative analysis to bear on ranking and comparing historical reputations. They evaluate each person by aggregating the traces of millions of opinions, just as Google ranks webpages. The book includes a technical discussion for readers interested in the details of the methods, but no mathematical or computational background is necessary to understand the rankings or conclusions. Along the way, the authors present the rankings of more than one thousand of history's most significant people in science, politics, entertainment, and all areas of human endeavor. Anyone interested in history or biography can see where their favorite figures place in the grand scheme of things.
Low-Rank Representation for Incomplete Data
Directory of Open Access Journals (Sweden)
Jiarong Shi
2014-01-01
Low-rank matrix recovery (LRMR) has become an increasingly popular technique for analyzing data with missing entries, gross corruptions, and outliers. As a significant component of LRMR, the model of low-rank representation (LRR) seeks the lowest-rank representation among all samples and is robust for recovering subspace structures. This paper attempts to solve the problem of LRR with partially observed entries. Firstly, we construct a nonconvex minimization problem by taking low-rankness, robustness, and incompleteness into consideration. Then we employ the technique of augmented Lagrange multipliers to solve the proposed program. Finally, experimental results on synthetic and real-world datasets validate the feasibility and effectiveness of the proposed method.
Free Malcev algebra of rank three
Kornev, Alexandr
2011-01-01
We find a basis of the free Malcev algebra on three free generators over a field of characteristic zero. The specialty and semiprimity of this algebra are proved. In addition, we prove the decomposability of this algebra into subdirect sum of the free Lie algebra rank three and the free algebra of rank three of variety of Malcev algebras generated by a simple seven-dimensional Malcev algebra.
Block models and personalized PageRank.
Kloumann, Isabel M; Ugander, Johan; Kleinberg, Jon
2017-01-03
Methods for ranking the importance of nodes in a network have a rich history in machine learning and across domains that analyze structured data. Recent work has evaluated these methods through the "seed set expansion problem": given a subset [Formula: see text] of nodes from a community of interest in an underlying graph, can we reliably identify the rest of the community? We start from the observation that the most widely used techniques for this problem, personalized PageRank and heat kernel methods, operate in the space of "landing probabilities" of a random walk rooted at the seed set, ranking nodes according to weighted sums of landing probabilities of different length walks. Both schemes, however, lack an a priori relationship to the seed set objective. In this work, we develop a principled framework for evaluating ranking methods by studying seed set expansion applied to the stochastic block model. We derive the optimal gradient for separating the landing probabilities of two classes in a stochastic block model and find, surprisingly, that under reasonable assumptions the gradient is asymptotically equivalent to personalized PageRank for a specific choice of the PageRank parameter [Formula: see text] that depends on the block model parameters. This connection provides a formal motivation for the success of personalized PageRank in seed set expansion and node ranking generally. We use this connection to propose more advanced techniques incorporating higher moments of landing probabilities; our advanced methods exhibit greatly improved performance, despite being simple linear classification rules, and are even competitive with belief propagation.
Rank distributions: A panoramic macroscopic outlook
Eliazar, Iddo I.; Cohen, Morrel H.
2014-01-01
This paper presents a panoramic macroscopic outlook of rank distributions. We establish a general framework for the analysis of rank distributions, which classifies them into five macroscopic "socioeconomic" states: monarchy, oligarchy-feudalism, criticality, socialism-capitalism, and communism. Oligarchy-feudalism is shown to be characterized by discrete macroscopic rank distributions, and socialism-capitalism is shown to be characterized by continuous macroscopic size distributions. Criticality is a transition state between oligarchy-feudalism and socialism-capitalism, which can manifest allometric scaling with multifractal spectra. Monarchy and communism are extreme forms of oligarchy-feudalism and socialism-capitalism, respectively, in which the intrinsic randomness vanishes. The general framework is applied to three different models of rank distributions—top-down, bottom-up, and global—and unveils each model's macroscopic universality and versatility. The global model yields a macroscopic classification of the generalized Zipf law, an omnipresent form of rank distributions observed across the sciences. An amalgamation of the three models establishes a universal rank-distribution explanation for the macroscopic emergence of a prevalent class of continuous size distributions, ones governed by unimodal densities with both Pareto and inverse-Pareto power-law tails.
Hosseini, Mostafa; Kelishadi, Roya; Yousefifard, Mahmoud; Qorbani, Mostafa; Bazargani, Behnaz; Heshmat, Ramin; Motlagh, Mohammad Esmail; Mirminachi, Babak; Ataei, Neamatollah
2017-01-01
We compared the prevalence of obesity based on both waist circumference for height and body mass index (BMI) in Iranian children and adolescents. Data on 13 120 children with a mean age of 12.45 ± 3.36 years (50.8% male) from the fourth Childhood and Adolescence Surveillance and Prevention of Adult Non-communicable Disease study were included. Measured waist circumference values were modelled according to age, gender and height percentiles. The prevalence of obesity was estimated using the 90th percentiles for both unadjusted and height-adjusted waist circumferences and compared with the World Health Organization BMI cut-offs. They were analysed further for short, average and tall children. Waist circumference values increased steadily with age. For short and average height children, the prevalence of obesity was higher when height-adjusted waist circumference was used. For taller children, the prevalence of obesity using height-adjusted waist circumference and BMI was similar, but lower than the prevalence based on measurements unadjusted for height. Height-adjusted waist circumference and BMI identified different children as having obesity, with overlaps of 69.47% for boys and 68.42% for girls. Just using waist circumference underestimated obesity in some Iranian children and measurements should be adjusted for height. ©2016 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.
Humphris, Gerry; Crawford, John R; Hill, Kirsty; Gilbert, Angela; Freeman, Ruth
2013-06-24
A recent UK population survey of oral health included questions to assess dental anxiety to provide mean and prevalence estimates of this important psychological construct. A two-stage cluster sample was used for the survey across England, Wales, and Northern Ireland. The survey took place between October-December 2009, and January-April 2010. All interviewers were trained on survey procedures. Within the 7,233 households sampled there were 13,509 adults who were asked to participate in the survey and 11,382 participated (84%). The scale was reliable and showed some evidence of unidimensionality. Estimated proportion of participants with high dental anxiety (cut-off score = 19) was 11.6%. Percentiles and confidence intervals were presented and can be estimated for individual patients across various age ranges and gender using an on-line tool. The largest reported data set on the MDAS from a representative UK sample was presented. The scale's psychometrics is supportive for the routine assessment of patient dental anxiety to compare against a number of major demographic groups categorised by age and sex. Practitioners within the UK have a resource to estimate the rarity of a particular patient's level of dental anxiety, with confidence intervals, when using the on-line percentile calculator.
Non-parametric algorithm to isolate chunks in response sequences
Directory of Open Access Journals (Sweden)
Andrea Alamia
2016-09-01
Chunking consists in grouping items of a sequence into small clusters, named chunks, with the assumed goal of lessening working memory load. Despite extensive research, the current methods used to detect chunks, and to identify different chunking strategies, remain discordant and difficult to implement. Here, we propose a simple and reliable method to identify chunks in a sequence and to determine their stability across blocks. This algorithm is based on a ranking method and its major novelty is that it provides concomitantly both the features of individual chunks in a given sequence, and an overall index that quantifies the chunking pattern consistency across sequences. The analysis of simulated data confirmed the validity of our method in different conditions of noise, chunk lengths and chunk numbers; moreover, we found that this algorithm was particularly efficient in the noise range observed in real data, provided that at least 4 sequence repetitions were included in each experimental block. Furthermore, we applied this algorithm to actual reaction time series gathered from 3 published experiments and were able to confirm the findings obtained in the original reports. In conclusion, this novel algorithm is easy to implement, is robust to outliers and provides concurrent and reliable estimation of chunk position and chunking dynamics, making it useful to study both sequence-specific and general chunking effects. The algorithm is available at: https://github.com/artipago/Non-parametric-algorithm-to-isolate-chunks-in-response-sequences
Bioprocess iterative batch-to-batch optimization based on hybrid parametric/nonparametric models.
Teixeira, Ana P; Clemente, João J; Cunha, António E; Carrondo, Manuel J T; Oliveira, Rui
2006-01-01
This paper presents a novel method for iterative batch-to-batch dynamic optimization of bioprocesses. The relationship between process performance and control inputs is established by means of hybrid grey-box models combining parametric and nonparametric structures. The bioreactor dynamics are defined by material balance equations, whereas the cell population subsystem is represented by an adjustable mixture of nonparametric and parametric models. Thus optimizations are possible without detailed mechanistic knowledge concerning the biological system. A clustering technique is used to supervise the reliability of the nonparametric subsystem during the optimization. Whenever the nonparametric outputs are unreliable, the objective function is penalized. The technique was evaluated with three simulation case studies. The overall results suggest that the convergence to the optimal process performance may be achieved after a small number of batches. The model unreliability risk constraint along with sampling scheduling are crucial to minimize the experimental effort required to attain a given process performance. In general terms, it may be concluded that the proposed method broadens the application of the hybrid parametric/nonparametric modeling technique to "newer" processes with higher potential for optimization.
A non-parametric hidden Markov model for climate state identification
Directory of Open Access Journals (Sweden)
M. F. Lambert
2003-01-01
Hidden Markov models (HMMs) can allow for the varying wet and dry cycles in the climate without the need to simulate supplementary climate variables. The fitting of a parametric HMM relies upon assumptions for the state conditional distributions. It is shown that inappropriate assumptions about state conditional distributions can lead to biased estimates of state transition probabilities. An alternative non-parametric model with a hidden state structure that overcomes this problem is described. It is shown that a two-state non-parametric model produces accurate estimates of both transition probabilities and the state conditional distributions. The non-parametric model can be used directly or as a technique for identifying appropriate state conditional distributions to apply when fitting a parametric HMM. The non-parametric model is fitted to data from ten rainfall stations and four streamflow gauging stations at varying distances inland from the Pacific coast of Australia. Evidence for hydrological persistence, though not mathematical persistence, was identified in both rainfall and streamflow records, with the latter showing hidden states with longer sojourn times. Persistence appears to increase with distance from the coast. Keywords: Hidden Markov models, non-parametric, two-state model, climate states, persistence, probability distributions
Directory of Open Access Journals (Sweden)
Mostafa Hosseini
2016-06-01
Background: Children's body composition status is an important indicator of health condition, evaluated through their body mass index (BMI). We aimed to provide standardized percentile curves of BMI in a population of Iranian children and adolescents. We assessed the national representativeness of sample populations from Tehran. Materials and Methods: A total sample of 14,865 children aged 7-18 years was gathered. The Lambda-Mu-Sigma method was used to derive sex-specific smoothed centiles for age via the Lambda-Mu-Sigma Chart Maker Program. Finally, the prevalence of overweight and obesity with 95% confidence interval (CI) was calculated. Results: BMI percentiles obtained from Tehran's population, except for the 10th percentile, seem to be very slightly greater than those of urban boys from all over Iran. BMI percentiles have an increasing trend by age that is S-shaped with a slight slope. Only in the 90th and 97th percentiles of BMI for girls does this rising trend seem to stop. Boys generally have higher BMIs than girls. The exceptions are younger ages of the 90th and 97th percentiles and older ages of the 3rd and 10th percentiles. A total of 1,008 (13.20%; 95% CI: 12.46-13.98) boys and 603 (8.34%; 95% CI: 7.72-9.00) girls were categorized as overweight and obese. Obesity was observed in 402 (5.27%; 95% CI: 4.79-5.79) boys and 274 (3.76%; 95% CI: 3.35-4.22) girls. Conclusion: We constructed BMI percentile curves by age and gender for Iranian children and adolescents aged 7 to 18 years. It can be concluded that sample populations from Tehran are nationally representative.
PageRank as a method to rank biomedical literature by importance.
Yates, Elliot J; Dixon, Louise C
2015-01-01
Optimal ranking of literature importance is vital in overcoming article overload. Existing ranking methods are typically based on raw citation counts, giving a sum of 'inbound' links with no consideration of citation importance. PageRank, an algorithm originally developed for ranking webpages at the search engine, Google, could potentially be adapted to bibliometrics to quantify the relative importance weightings of a citation network. This article seeks to validate such an approach on the freely available, PubMed Central open access subset (PMC-OAS) of biomedical literature. On-demand cloud computing infrastructure was used to extract a citation network from over 600,000 full-text PMC-OAS articles. PageRanks and citation counts were calculated for each node in this network. PageRank is highly correlated with citation count (R = 0.905, P PageRank can be trivially computed on commodity cluster hardware and is linearly correlated with citation count. Given its putative benefits in quantifying relative importance, we suggest it may enrich the citation network, thereby overcoming the existing inadequacy of citation counts alone. We thus suggest PageRank as a feasible supplement to, or replacement of, existing bibliometric ranking methods.
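To make the contrast between raw citation counts and PageRank concrete, here is a minimal power-iteration sketch on a toy citation graph. It is not the authors' pipeline: the function name, the damping factor of 0.85, and the uniform redistribution of rank from papers that cite nothing are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """links maps each paper to the list of papers it cites; returns a rank per paper."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by papers with no outgoing citations is spread over all papers
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
        if sum(abs(new[v] - rank[v]) for v in nodes) < tol:
            return new
        rank = new
    return rank
```

On a toy graph such as `{"A": ["D"], "B": ["D"], "C": ["D", "A"], "D": []}`, paper D collects the most rank, and A outranks B because it is cited once while B is never cited.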
RANK/RANK ligand/OPG: A new therapeutic approach in the treatment of osteoporosis
Directory of Open Access Journals (Sweden)
Preisinger E
2007-01-01
Research into the coupling mechanisms of osteoclastogenesis, bone resorption and remodeling has opened up new potential therapeutic approaches in the treatment of osteoporosis. RANK ligand (RANKL), where RANK stands for "receptor activator of nuclear factor (NF)-κB", plays a key role in bone loss. Binding of RANKL to the receptor RANK initiates bone resorption. OPG (osteoprotegerin) and denosumab, a human monoclonal antibody (IgG2) developed for clinical use, block the binding of RANK ligand to RANK and prevent bone loss.
Nonparametric Bayesian density estimation on manifolds with applications to planar shapes.
Bhattacharya, Abhishek; Dunson, David B
2010-12-01
Statistical analysis on landmark-based shape spaces has diverse applications in morphometrics, medical diagnostics, machine vision and other areas. These shape spaces are non-Euclidean quotient manifolds. To conduct nonparametric inferences, one may define notions of centre and spread on this manifold and work with their estimates. However, it is useful to consider full likelihood-based methods, which allow nonparametric estimation of the probability density. This article proposes a broad class of mixture models constructed using suitable kernels on a general compact metric space and then on the planar shape space in particular. Following a Bayesian approach with a nonparametric prior on the mixing distribution, conditions are obtained under which the Kullback-Leibler property holds, implying large support and weak posterior consistency. Gibbs sampling methods are developed for posterior computation, and the methods are applied to problems in density estimation and classification with shape-based predictors. Simulation studies show improved estimation performance relative to existing approaches.
DEFF Research Database (Denmark)
Ramirez, José Rangel; Sørensen, John Dalsgaard
2011-01-01
This work illustrates the updating and incorporation of information in the assessment of fatigue reliability for offshore wind turbine. The new information, coming from external and condition monitoring can be used to direct updating of the stochastic variables through a non-parametric Bayesian...... updating approach and be integrated in the reliability analysis by a third-order polynomial chaos expansion approximation. Although Classical Bayesian updating approaches are often used because of its parametric formulation, non-parametric approaches are better alternatives for multi-parametric updating...... with a non-conjugating formulation. The results in this paper show the influence on the time dependent updated reliability when non-parametric and classical Bayesian approaches are used. Further, the influence on the reliability of the number of updated parameters is illustrated....
Country-specific determinants of world university rankings
Pietrucha, Jacek
2017-01-01
This paper examines country-specific factors that affect the three most influential world university rankings (the Academic Ranking of World Universities, the QS World University Ranking, and the Times Higher Education World University Ranking). We run a cross sectional regression that covers 42–71 countries (depending on the ranking and data availability). We show that the position of universities from a country in the ranking is determined by the following country-specific variables: econom...
Global network centrality of university rankings
Guo, Weisi; Del Vecchio, Marco; Pogrebna, Ganna
2017-10-01
Universities and higher education institutions form an integral part of the national infrastructure and prestige. As academic research benefits increasingly from international exchange and cooperation, many universities have increased investment in improving and enabling their global connectivity. Yet, the relationship of university performance and its global physical connectedness has not been explored in detail. We conduct, to our knowledge, the first large-scale data-driven analysis into whether there is a correlation between university relative ranking performance and its global connectivity via the air transport network. The results show that local access to global hubs (as measured by air transport network betweenness) strongly and positively correlates with the ranking growth (statistical significance in different models ranges between 5% and 1% level). We also found that the local airport's aggregate flight paths (degree) and capacity (weighted degree) has no effect on university ranking, further showing that global connectivity distance is more important than the capacity of flight connections. We also examined the effect of local city economic development as a confounding variable and no effect was observed suggesting that access to global transportation hubs outweighs economic performance as a determinant of university ranking. The impact of this research is that we have determined the importance of the centrality of global connectivity and, hence, established initial evidence for further exploring potential connections between university ranking and regional investment policies on improving global connectivity.
Social class rank, essentialism, and punitive judgment.
Kraus, Michael W; Keltner, Dacher
2013-08-01
Recent evidence suggests that perceptions of social class rank influence a variety of social cognitive tendencies, from patterns of causal attribution to moral judgment. In the present studies we tested the hypotheses that upper-class rank individuals would be more likely to endorse essentialist lay theories of social class categories (i.e., that social class is founded in genetically based, biological differences) than would lower-class rank individuals and that these beliefs would decrease support for restorative justice--which seeks to rehabilitate offenders, rather than punish unlawful action. Across studies, higher social class rank was associated with increased essentialism of social class categories (Studies 1, 2, and 4) and decreased support for restorative justice (Study 4). Moreover, manipulated essentialist beliefs decreased preferences for restorative justice (Study 3), and the association between social class rank and class-based essentialist theories was explained by the tendency to endorse beliefs in a just world (Study 2). Implications for how class-based essentialist beliefs potentially constrain social opportunity and mobility are discussed.
A cognitive model for aggregating people's rankings.
Directory of Open Access Journals (Sweden)
Michael D Lee
We develop a cognitive modeling approach, motivated by classic theories of knowledge representation and judgment from psychology, for combining people's rankings of items. The model makes simple assumptions about how individual differences in knowledge lead to observed ranking data in behavioral tasks. We implement the cognitive model as a Bayesian graphical model, and use computational sampling to infer an aggregate ranking and measures of individual expertise. Applications of the model to 23 data sets, dealing with general knowledge and prediction tasks, show that the model performs well in producing an aggregate ranking that is often close to the ground truth and, as in the "wisdom of the crowd" effect, usually performs better than most individuals. We also present some evidence that the model outperforms the traditional statistical Borda count method, and that the model is able to infer people's relative expertise surprisingly well without knowing the ground truth. We discuss the advantages of the cognitive modeling approach to combining ranking data, and in wisdom of the crowd research generally, as well as highlighting a number of potential directions for future model development.
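For reference, the Borda count baseline mentioned in the abstract can be sketched in a few lines. This is the textbook scoring rule, not the authors' cognitive model; the function name and the alphabetical tie-break are assumptions made so the output is deterministic.

```python
def borda_aggregate(rankings):
    """Each ranking lists items from best to worst; returns the aggregate order."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            # Best item earns n - 1 points, the next n - 2, and the last earns 0
            scores[item] = scores.get(item, 0) + (n - 1 - position)
    # Highest total first; ties broken alphabetically (an arbitrary choice)
    return sorted(scores, key=lambda item: (-scores[item], item))
```

Unlike the Bayesian model described above, this rule weights every person's ranking equally and produces no estimate of individual expertise.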
RANK and RANKL: From bone to breast cancer
Directory of Open Access Journals (Sweden)
Sigl V
2012-01-01
RANK (receptor activator of NF-κB) and its ligand RANKL are key molecules in bone metabolism and play an essential role in the development of pathological bone changes. Deregulation of the RANK/RANKL system is, for example, a main cause of postmenopausal osteoporosis in women. Another important function of RANK and RANKL lies in the development of milk-secreting glands during pregnancy. Here, sex hormones such as progesterone regulate the expression of RANKL and thereby induce the proliferation of breast epithelial cells. It has been known for some time that RANK and RANKL are involved in the formation of breast cancer metastases in bone tissue. We have now also identified the RANK/RANKL system as an essential mechanism in the development of hormone-driven breast cancer. In this article we therefore pay particular attention to the latest findings and examine them critically with regard to breast cancer development.
A Bayesian approach to the analysis of quantal bioassay studies using nonparametric mixture models.
Fronczyk, Kassandra; Kottas, Athanasios
2014-03-01
We develop a Bayesian nonparametric mixture modeling framework for quantal bioassay settings. The approach is built upon modeling dose-dependent response distributions. We adopt a structured nonparametric prior mixture model, which induces a monotonicity restriction for the dose-response curve. Particular emphasis is placed on the key risk assessment goal of calibration for the dose level that corresponds to a specified response. The proposed methodology yields flexible inference for the dose-response relationship as well as for other inferential objectives, as illustrated with two data sets from the literature. © 2013, The International Biometric Society.
Multivariate nonparametric regression and visualization with R and applications to finance
Klemelä, Jussi
2014-01-01
A modern approach to statistical learning and its applications through visualization methods. With a unique and innovative presentation, Multivariate Nonparametric Regression and Visualization provides readers with the core statistical concepts to obtain complete and accurate predictions when given a set of data. Focusing on nonparametric methods to adapt to the multiple types of data generating mechanisms, the book begins with an overview of classification and regression. The book then introduces and examines various tested and proven visualization techniques for learning samples and functio
Zhao, Zhibiao
2011-06-01
We address the nonparametric model validation problem for hidden Markov models with partially observable variables and hidden states. We achieve this goal by constructing a nonparametric simultaneous confidence envelope for transition density function of the observable variables and checking whether the parametric density estimate is contained within such an envelope. Our specification test procedure is motivated by a functional connection between the transition density of the observable variables and the Markov transition kernel of the hidden states. Our approach is applicable for continuous time diffusion models, stochastic volatility models, nonlinear time series models, and models with market microstructure noise.
Directory of Open Access Journals (Sweden)
Rabia Ece OMAY
2013-06-01
Full Text Available In this study, the relationship between gross domestic product (GDP) per capita and sulfur dioxide (SO2) and particulate matter (PM10) per capita is modeled for Turkey. Nonparametric fixed-effect panel data analysis is used for the modeling. The panel data cover 12 territories, at the first level of the Nomenclature of Territorial Units for Statistics (NUTS), for the period 1990-2001. In modeling the relationship between GDP and SO2 and PM10 for Turkey, the nonparametric models gave good results.
A Least Squares Method for Variance Estimation in Heteroscedastic Nonparametric Regression
Directory of Open Access Journals (Sweden)
Yuejin Zhou
2014-01-01
Full Text Available Interest in variance estimation in nonparametric regression has grown greatly in the past several decades. Among the existing methods, the least squares estimator in Tong and Wang (2005) is shown to have nice statistical properties and is also easy to implement. Nevertheless, their method only applies to regression models with homoscedastic errors. In this paper, we propose two least squares estimators for the error variance in heteroscedastic nonparametric regression: the intercept estimator and the slope estimator. Both estimators are shown to be consistent and their asymptotic properties are investigated. Finally, we demonstrate through simulation studies that the proposed estimators perform better than the existing competitor in various settings.
Directory of Open Access Journals (Sweden)
Špička J.
2015-06-01
Full Text Available Traffic accidents cause one of the highest numbers of severe injuries across the whole population spectrum. The numbers of deaths and seriously injured citizens show that traffic accidents and their consequences are still a serious problem to be solved. The paper contributes to the field of vehicle safety technology with a virtual approach. Exploitation of the previously developed scaling algorithm enables the creation of a specific anthropometric model based on a validated reference model. The aim of the paper is to prove the biofidelity of the small-percentile six-year-old virtual human model developed by automatic down-scaling in a frontal impact. For the automatically developed six-year-old virtual specific anthropometric model, the Kroell impact test is simulated and the results are compared to the experimental data. The chosen approach shows good correspondence of the scaled model performance to the experimental corridors.
Resolution of ranking hierarchies in directed networks
Barucca, Paolo; Lillo, Fabrizio
2018-01-01
Identifying hierarchies and rankings of nodes in directed graphs is fundamental in many applications such as social network analysis, biology, economics, and finance. A recently proposed method identifies the hierarchy by finding the ordered partition of nodes which minimises a score function, termed agony. This function penalises the links violating the hierarchy in a way depending on the strength of the violation. To investigate the resolution of ranking hierarchies we introduce an ensemble of random graphs, the Ranked Stochastic Block Model. We find that agony may fail to identify hierarchies when the structure is not strong enough and the size of the classes is small with respect to the whole network. We analytically characterise the resolution threshold and we show that an iterated version of agony can partly overcome this resolution limit. PMID:29394278
Low Rank Approximation: Algorithms, Implementation, Applications
Markovsky, Ivan
2012-01-01
Matrix low-rank approximation is intimately related to data modelling, a problem that arises frequently in many different fields. Low Rank Approximation: Algorithms, Implementation, Applications is a comprehensive exposition of the theory, algorithms, and applications of structured low-rank approximation. Local optimization methods and effective suboptimal convex relaxations for Toeplitz, Hankel, and Sylvester structured problems are presented. A major part of the text is devoted to application of the theory. Applications described include: system and control theory: approximate realization, model reduction, output error, and errors-in-variables identification; signal processing: harmonic retrieval, sum-of-damped exponentials, finite impulse response modeling, and array processing; machine learning: multidimensional scaling and recommender system; computer vision: algebraic curve fitting and fundamental matrix estimation; bioinformatics for microarray data analysis; chemometrics for multivariate calibration; ...
Ranking beta sheet topologies of proteins
DEFF Research Database (Denmark)
Fonseca, Rasmus; Helles, Glennie; Winter, Pawel
2010-01-01
One of the challenges of protein structure prediction is to identify long-range interactions between amino acids. To reliably predict such interactions, we enumerate, score and rank all beta-topologies (partitions of beta-strands into sheets, orderings of strands within sheets and orientations...... of paired strands) of a given protein. We show that the beta-topology corresponding to the native structure is, with high probability, among the top-ranked. Since full enumeration is very time-consuming, we also suggest a method to deal with proteins with many beta-strands. The results reported...... in this paper are highly relevant for ab initio protein structure prediction methods based on decoy generation. The top-ranked beta-topologies can be used to find initial conformations from which conformational searches can be started. They can also be used to filter decoys by removing those with poorly...
Ramezani Tehrani, Fahimeh; Mansournia, Mohammad Ali; Solaymani-Dodaran, Masoud; Steyerberg, Ewout; Azizi, Fereidoun
2016-06-01
This study aimed to improve existing prediction models for age at menopause. We identified all reproductive-aged women with regular menstrual cycles who met our eligibility criteria (n = 1,015) in the Tehran Lipid and Glucose Study, an ongoing population-based cohort study initiated in 1998. Participants were examined every 3 years and their reproductive histories were recorded. Blood levels of antimüllerian hormone (AMH) were measured at the time of recruitment. Age at menopause was estimated based on serum concentrations of AMH using flexible parametric survival models. The optimum model was selected according to Akaike Information Criteria and the realism of the range of predicted median menopause age. We followed study participants for a median of 9.8 years during which 277 women reached menopause and found that a spline-based proportional odds model including age-specific AMH percentiles as the covariate performed well in terms of statistical criteria and provided the most clinically relevant and realistic predictions. The range of predicted median age at menopause for this model was 47.1 to 55.9 years. For those who reached menopause, the median of the absolute mean difference between actual and predicted age at menopause was 1.9 years (interquartile range 2.9). The model including the age-specific AMH percentiles as the covariate and using proportional odds as its covariate metrics meets all the statistical criteria for the best model and provides the most clinically relevant and realistic predictions for age at menopause for reproductive-aged women.
Sign rank versus Vapnik-Chervonenkis dimension
Alon, N.; Moran, Sh; Yehudayoff, A.
2017-12-01
This work studies the maximum possible sign rank of sign (N × N)-matrices with a given Vapnik-Chervonenkis dimension d. For d=1, this maximum is three. For d=2, this maximum is $\widetilde{\Theta}(N^{1/2})$. For d>2, similar but slightly less accurate statements hold. The lower bounds improve on previous ones by Ben-David et al., and the upper bounds are novel. The lower bounds are obtained by probabilistic constructions, using a theorem of Warren in real algebraic topology. The upper bounds are obtained using a result of Welzl about spanning trees with low stabbing number, and using the moment curve. The upper bound technique is also used to: (i) provide estimates on the number of classes of a given Vapnik-Chervonenkis dimension, and the number of maximum classes of a given Vapnik-Chervonenkis dimension--answering a question of Frankl from 1989, and (ii) design an efficient algorithm that provides an O(N/log(N)) multiplicative approximation for the sign rank. We also observe a general connection between sign rank and spectral gaps which is based on Forster's argument. Consider the adjacency (N × N)-matrix of a Δ-regular graph with a second eigenvalue of absolute value λ and Δ ≤ N/2. We show that the sign rank of the signed version of this matrix is at least Δ/λ. We use this connection to prove the existence of a maximum class $C \subseteq \{\pm 1\}^N$ with Vapnik-Chervonenkis dimension 2 and sign rank $\widetilde{\Theta}(N^{1/2})$. This answers a question of Ben-David et al. regarding the sign rank of large Vapnik-Chervonenkis classes. We also describe limitations of this approach, in the spirit of the Alon-Boppana theorem. We further describe connections to communication complexity, geometry, learning theory, and combinatorics. Bibliography: 69 titles.
Pulling Rank: A Plan to Help Students with College Choice in an Age of Rankings
Thacker, Lloyd
2008-01-01
Colleges and universities are "ranksteering"--driving under the influence of popular college rankings systems like "U.S. News and World Report's" Best Colleges. This article examines the criticisms of college rankings and describes how a group of education leaders is honing a plan to end the tyranny of the ratings game and better help students and…
When sparse coding meets ranking: a joint framework for learning sparse codes and ranking scores
Wang, Jim Jing-Yan
2017-06-28
Sparse coding, which represents a data point as a sparse reconstruction code with regard to a dictionary, has been a popular data representation method. Meanwhile, in database retrieval problems, learning the ranking scores from data points plays an important role. Up to now, these two problems have always been considered separately, assuming that data coding and ranking are two independent and irrelevant problems. However, is there any internal relationship between sparse coding and ranking score learning? If yes, how to explore and make use of this internal relationship? In this paper, we try to answer these questions by developing the first joint sparse coding and ranking score learning algorithm. To explore the local distribution in the sparse code space, and also to bridge coding and ranking problems, we assume that in the neighborhood of each data point, the ranking scores can be approximated from the corresponding sparse codes by a local linear function. By considering the local approximation error of ranking scores, the reconstruction error and sparsity of sparse coding, and the query information provided by the user, we construct a unified objective function for learning of sparse codes, the dictionary and ranking scores. We further develop an iterative algorithm to solve this optimization problem.
Rank-based Tests of the Cointegrating Rank in Semiparametric Error Correction Models
Hallin, M.; van den Akker, R.; Werker, B.J.M.
2012-01-01
Abstract: This paper introduces rank-based tests for the cointegrating rank in an Error Correction Model with i.i.d. elliptical innovations. The tests are asymptotically distribution-free, and their validity does not depend on the actual distribution of the innovations. This result holds despite the
RankProdIt: A web-interactive Rank Products analysis tool
Directory of Open Access Journals (Sweden)
Laing Emma
2010-08-01
Full Text Available Background: The first objective of a DNA microarray experiment is typically to generate a list of genes or probes that are found to be differentially expressed or represented (in the case of comparative genomic hybridizations and/or copy number variation) between two conditions or strains. Rank Products analysis comprises a robust algorithm for deriving such lists from microarray experiments that comprise small numbers of replicates, for example, fewer than the number required for the commonly used t-test. Currently, users wishing to apply Rank Products analysis to their own microarray data sets have been restricted to the use of command line-based software, which can limit its usage within the biological community. Findings: Here we have developed a web interface to existing Rank Products analysis tools allowing users to quickly process their data in an intuitive and step-wise manner to obtain the respective Rank Product or Rank Sum, probability of false prediction and p-values in a downloadable file. Conclusions: The online interactive Rank Products analysis tool RankProdIt, for analysis of any data set containing measurements for multiple replicated conditions, is available at: http://strep-microarray.sbs.surrey.ac.uk/RankProducts
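The Rank Product statistic named in this abstract is commonly described as the geometric mean of a gene's ranks across replicates. The following is only a minimal sketch of that idea, not the RankProdIt implementation: the function name `rank_products`, the input layout, and the toy values are assumptions made for illustration, and the permutation step that produces p-values and the probability of false prediction is omitted.

```python
from math import prod


def rank_products(fold_changes):
    """Geometric mean of per-replicate ranks for each gene.

    fold_changes: dict mapping gene name -> list of values, one per replicate
                  (hypothetical layout; every gene needs the same length).
    Rank 1 is the largest value within a replicate. Ties are broken by
    insertion order here, which a real tool would handle more carefully.
    """
    genes = list(fold_changes)
    n_reps = len(fold_changes[genes[0]])
    ranks = {g: [] for g in genes}
    for k in range(n_reps):
        # Order genes within replicate k, most up-regulated first.
        ordered = sorted(genes, key=lambda g: fold_changes[g][k], reverse=True)
        for position, g in enumerate(ordered, start=1):
            ranks[g].append(position)
    return {g: prod(r) ** (1.0 / n_reps) for g, r in ranks.items()}


# Toy data (made up): three genes, three replicates.
toy = {
    "geneA": [3.0, 2.5, 4.0],
    "geneB": [1.2, 2.6, 1.1],
    "geneC": [0.5, 0.4, 0.9],
}
rp = rank_products(toy)
for gene in sorted(rp, key=rp.get):
    print(gene, round(rp[gene], 3))
```

A small rank product means the gene sits near the top of the list in most replicates, which is why the statistic remains usable with very few replicates.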
Preference Learning and Ranking by Pairwise Comparison
Fürnkranz, Johannes; Hüllermeier, Eyke
This chapter provides an overview of recent work on preference learning and ranking via pairwise classification. The learning by pairwise comparison (LPC) paradigm is the natural machine learning counterpart to the relational approach to preference modeling and decision making. From a machine learning point of view, LPC is especially appealing as it decomposes a possibly complex prediction problem into a certain number of learning problems of the simplest type, namely binary classification. We explain how to approach different preference learning problems, such as label and instance ranking, within the framework of LPC. We primarily focus on methodological aspects, but also address theoretical questions as well as algorithmic and complexity issues.
Resonances under rank-one perturbations
Bourget, Olivier; Cortés, Víctor H.; Del Río, Rafael; Fernández, Claudio
2017-09-01
We study resonances generated by rank-one perturbations of self-adjoint operators with eigenvalues embedded in the continuous spectrum. Instability of these eigenvalues is analyzed and almost exponential decay for the associated resonant states is exhibited. We show how these results can be applied to Sturm-Liouville operators. Main tools are the Aronszajn-Donoghue theory for rank-one perturbations, a reduction process of the resolvent based on the Feshbach-Livsic formula, the Fermi golden rule, and a careful analysis of the Fourier transform of quasi-Lorentzian functions. We relate these results to sojourn time estimates and spectral concentration phenomena.
Compressed Sensing with Rank Deficient Dictionaries
DEFF Research Database (Denmark)
Hansen, Thomas Lundgaard; Johansen, Daniel Højrup; Jørgensen, Peter Bjørn
2012-01-01
In compressed sensing it is generally assumed that the dictionary matrix constitutes a (possibly overcomplete) basis of the signal space. In this paper we consider dictionaries that do not span the signal space, i.e. rank deficient dictionaries. We show that in this case the signal-to-noise ratio...... (SNR) in the compressed samples can be increased by selecting the rows of the measurement matrix from the column space of the dictionary. As an example application of compressed sensing with a rank deficient dictionary, we present a case study of compressed sensing applied to the Coarse Acquisition (C...
Research of Subgraph Estimation Page Rank Algorithm for Web Page Rank
Directory of Open Access Journals (Sweden)
LI Lan-yin
2017-04-01
Full Text Available The traditional PageRank algorithm cannot efficiently handle the web page ranking problem for large data. This paper proposes an accelerated algorithm named topK-Rank, which is based on PageRank on the MapReduce platform. It can find the top k nodes efficiently for a given graph without sacrificing accuracy. In order to identify the top k nodes, the topK-Rank algorithm prunes unnecessary nodes and edges in each iteration to dynamically construct subgraphs, and iteratively estimates lower/upper bounds of PageRank scores through the subgraphs. Theoretical analysis shows that this method guarantees result exactness. Experiments show that the topK-Rank algorithm can find the top k nodes much faster than the existing approaches.
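For orientation, the baseline that such methods accelerate can be sketched as plain power-iteration PageRank followed by a top-k selection. This is not the topK-Rank algorithm: there is no pruning, no bound estimation, and no MapReduce. The function names, the damping factor of 0.85, and the toy graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a small in-memory graph.

    links: dict mapping node -> list of nodes it links to (hypothetical layout).
    Returns a dict node -> score; the scores sum to 1.
    """
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    score = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Score held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(score[u] for u in nodes if not links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            for v in targets:
                new[v] += damping * score[u] / len(targets)
        change = sum(abs(new[u] - score[u]) for u in nodes)
        score = new
        if change < tol:
            break
    return score


def top_k(score, k):
    """Return the k highest-scoring nodes, best first."""
    return sorted(score, key=score.get, reverse=True)[:k]


# Toy graph (made up): four pages.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
print(top_k(scores, 2))
```

Every iteration here touches every node and edge, which is the cost that pruning to subgraphs is meant to avoid when only the top k nodes are needed.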
Noma, Hisashi; Matsui, Shigeyuki
2013-05-20
The main purpose of microarray studies is screening of differentially expressed genes as candidates for further investigation. Because of limited resources at this stage, prioritizing genes is a relevant statistical task in microarray studies. For effective gene selection, parametric empirical Bayes methods for ranking and selection of genes with the largest effect sizes have been proposed (Noma et al., 2010; Biostatistics 11: 281-289). The hierarchical mixture model incorporates the differential and non-differential components and allows information borrowing across differential genes with separation from nuisance, non-differential genes. In this article, we develop empirical Bayes ranking methods via a semiparametric hierarchical mixture model. A nonparametric prior distribution, rather than parametric prior distributions, for effect sizes is specified and estimated using the "smoothing by roughening" approach of Laird and Louis (1991; Computational statistics and data analysis 12: 27-37). We present applications to childhood and infant leukemia clinical studies with microarrays for exploring genes related to prognosis or disease progression. Copyright © 2012 John Wiley & Sons, Ltd.
van de Laar, Arnold W J M; Acherman, Yair I Z
2014-05-01
The frequently used 35 kg/m2 body mass index (BMI) and 50% excess weight loss (%EWL) criteria are no longer adequate for defining the success of a bariatric or metabolic surgery. It is not clear whether they are still useful to simply determine the sufficiency of a patient’s postoperative weight loss. An alternative way of defining sufficient weight loss is presented, using weight loss percentile charts of large representative series as a benchmark. Gastric bypass weight loss results from the Bariatric Outcomes Longitudinal Database (BOLD) with ≥2 years of follow-up are presented with percentiles as a function of postoperative time and their nadir results as a function of initial BMI using different outcome metrics. These percentiles are compared with the BMI35 and 50%EWL criteria. Of 49,098 patients eligible for ≥2 years of follow-up, 8,945 had reported weight loss at ≥2 years (20.0% male, mean initial BMI 47.7 kg/m2). They reached nadir BMI at a mean of 603 days. Their 50th percentiles surpassed both 50%EWL and BMI35 after 135 days. More than 95% achieved 50%EWL; more than 75% achieved BMI35. BMI and %EWL results are influenced more by initial BMI than are total weight loss (%TWL) results. BOLD gastric bypass weight loss data are presented with percentile curves. BMI and %EWL are clearly not suited for this purpose. Provided that follow-up data are solid, %TWL-based percentile charts can constitute neutral benchmarks for defining sufficient postoperative weight loss over time. Criteria for overall success, however, should consider clear goals of health improvement, including metabolic aspects. The frequently used criteria 50%EWL and BMI35 are inadequate for both. Their static weight loss components do not match the percentiles found and their health improvement components do not match known metabolic criteria.
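The dependence of %EWL and BMI on initial BMI can be illustrated with a short calculation. The sketch below uses the usual textbook definitions of the three metrics and assumes a reference ("ideal") weight at BMI 25 kg/m2, a common convention that the abstract does not state; the function names and the two example patients are made up.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m2."""
    return weight_kg / height_m ** 2


def percent_twl(initial_kg, current_kg):
    """Percent total weight loss: loss relative to initial weight."""
    return 100.0 * (initial_kg - current_kg) / initial_kg


def percent_ewl(initial_kg, current_kg, height_m, reference_bmi=25.0):
    """Percent excess weight loss: loss relative to the weight above a
    reference weight. reference_bmi=25 is an assumed convention."""
    ideal_kg = reference_bmi * height_m ** 2
    return 100.0 * (initial_kg - current_kg) / (initial_kg - ideal_kg)


# Two hypothetical patients, both 1.70 m tall, both losing 30% of body weight.
height = 1.70
for label, initial, current in [("initial BMI 40", 115.6, 80.92),
                                ("initial BMI 60", 173.4, 121.38)]:
    print(label,
          "| %TWL", round(percent_twl(initial, current), 1),
          "| %EWL", round(percent_ewl(initial, current, height), 1),
          "| BMI now", round(bmi(current, height), 1))
```

With identical %TWL, the heavier patient shows a much lower %EWL and remains above BMI 35, while the lighter patient passes both criteria. That is the sense in which fixed %EWL and BMI thresholds track initial BMI more than the weight actually lost.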
Non-parametric Tuning of PID Controllers A Modified Relay-Feedback-Test Approach
Boiko, Igor
2013-01-01
The relay feedback test (RFT) has become a popular and efficient tool used in process identification and automatic controller tuning. Non-parametric Tuning of PID Controllers couples new modifications of classical RFT with application-specific optimal tuning rules to form a non-parametric method of test-and-tuning. Test and tuning are coordinated through a set of common parameters so that a PID controller can obtain the desired gain or phase margins in a system exactly, even with unknown process dynamics. The concept of process-specific optimal tuning rules in the nonparametric setup, with corresponding tuning rules for flow, level, pressure, and temperature control loops, is presented in the text. Common problems of tuning accuracy based on parametric and non-parametric approaches are addressed. In addition, the text treats the parametric approach to tuning based on the modified RFT approach and the exact model of oscillations in the system under test using the locus of a perturbed relay system (LPRS) meth...
A Comparison of Shewhart Control Charts based on Normality, Nonparametrics, and Extreme-Value Theory
Ion, R.A.; Does, R.J.M.M.; Klaassen, C.A.J.
2000-01-01
Several control charts for individual observations are compared. The traditional ones are the well-known Shewhart control charts with estimators for the spread based on the sample standard deviation and the average of the moving ranges. The alternatives are nonparametric control charts, based on
Schaumburg, J.
2012-01-01
A framework is introduced allowing us to apply nonparametric quantile regression to Value at Risk (VaR) prediction at any probability level of interest. A monotonized double kernel local linear estimator is used to estimate moderate (1%) conditional quantiles of index return distributions. For
Bayard, David S.; Neely, Michael
2016-01-01
An experimental design approach is presented for individualized therapy in the special case where the prior information is specified by a nonparametric (NP) population model. Here, a nonparametric model refers to a discrete probability model characterized by a finite set of support points and their associated weights. An important question arises as to how to best design experiments for this type of model. Many experimental design methods are based on Fisher Information or other approaches originally developed for parametric models. While such approaches have been used with some success across various applications, it is interesting to note that they largely fail to address the fundamentally discrete nature of the nonparametric model. Specifically, the problem of identifying an individual from a nonparametric prior is more naturally treated as a problem of classification, i.e., to find a support point that best matches the patient’s behavior. This paper studies the discrete nature of the NP experiment design problem from a classification point of view. Several new insights are provided including the use of Bayes Risk as an information measure, and new alternative methods for experiment design. One particular method, denoted as MMopt (Multiple-Model Optimal), will be examined in detail and shown to require minimal computation while having distinct advantages compared to existing approaches. Several simulated examples, including a case study involving oral voriconazole in children, are given to demonstrate the usefulness of MMopt in pharmacokinetics applications. PMID:27909942
The Support Reduction Algorithm for Computing Non-Parametric Function Estimates in Mixture Models
GROENEBOOM, PIET; JONGBLOED, GEURT; WELLNER, JON A.
2008-01-01
In this paper, we study an algorithm (which we call the support reduction algorithm) that can be used to compute non-parametric M-estimators in mixture models. The algorithm is compared with natural competitors in the context of convex regression and the ‘Aspect problem’ in quantum physics.
A Bayesian Beta-Mixture Model for Nonparametric IRT (BBM-IRT)
Arenson, Ethan A.; Karabatsos, George
2017-01-01
Item response models typically assume that the item characteristic (step) curves follow a logistic or normal cumulative distribution function, which are strictly monotone functions of person test ability. Such assumptions can be overly-restrictive for real item response data. We propose a simple and more flexible Bayesian nonparametric IRT model…
Nonparametric bayesian reward segmentation for skill discovery using inverse reinforcement learning
CSIR Research Space (South Africa)
Ranchod, P
2015-10-01
Full Text Available to be optimizing. The skill boundaries and the number of skills making up each demonstration are unknown. We use a Bayesian nonparametric approach to propose skill segmentations and maximum entropy inverse reinforcement learning to infer reward functions from...
Non-parametric production analysis of pesticides use in the Netherlands
Oude Lansink, A.G.J.M.; Silva, E.
2004-01-01
Many previous empirical studies on the productivity of pesticides suggest that pesticides are under-utilized in agriculture despite the generally held belief that these inputs are substantially over-utilized. This paper uses data envelopment analysis (DEA) to calculate non-parametric measures of the
Measuring the Influence of Networks on Transaction Costs Using a Nonparametric Regression Technique
DEFF Research Database (Denmark)
Henningsen, Geraldine; Henningsen, Arne; Henning, Christian H.C.A.
. We empirically analyse the effect of networks on productivity using a cross-validated local linear non-parametric regression technique and a data set of 384 farms in Poland. Our empirical study generally supports our hypothesis that networks affect productivity. Large and dense trading networks...
Wei, Jiawei
2011-07-01
We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work was originally motivated by a unique testing problem in genetic epidemiology (Chatterjee, et al., 2006) that involved a typical generalized linear model but with an additional term reminiscent of the Tukey one-degree-of-freedom formulation, and their interest was in testing for main effects of the genetic variables, while gaining statistical power by allowing for a possible interaction between genes and the environment. Later work (Maity, et al., 2009) involved the possibility of modeling the environmental variable nonparametrically, but they focused on whether there was a parametric main effect for the genetic variables. In this paper, we consider the complementary problem, where the interest is in testing for the main effect of the nonparametrically modeled environmental variable. We derive a generalized likelihood ratio test for this hypothesis, show how to implement it, and provide evidence that our method can improve statistical power when compared to standard partially linear models with main effects only. We use the method for the primary purpose of analyzing data from a case-control study of colorectal adenoma.
Measuring the influence of networks on transaction costs using a non-parametric regression technique
DEFF Research Database (Denmark)
Henningsen, Géraldine; Henningsen, Arne; Henning, Christian H.C.A.
. We empirically analyse the effect of networks on productivity using a cross-validated local linear non-parametric regression technique and a data set of 384 farms in Poland. Our empirical study generally supports our hypothesis that networks affect productivity. Large and dense trading networks...
Assessing pupil and school performance by non-parametric and parametric techniques
de Witte, K.; Thanassoulis, E.; Simpson, G.; Battisti, G.; Charlesworth-May, A.
2010-01-01
This paper discusses the use of the non-parametric free disposal hull (FDH) and the parametric multi-level model (MLM) as alternative methods for measuring pupil and school attainment where hierarchical structured data are available. Using robust FDH estimates, we show how to decompose the overall
Non-parametric Bayesian graph models reveal community structure in resting state fMRI
DEFF Research Database (Denmark)
Andersen, Kasper Winther; Madsen, Kristoffer H.; Siebner, Hartwig Roman
2014-01-01
Modeling of resting state functional magnetic resonance imaging (rs-fMRI) data using network models is of increasing interest. It is often desirable to group nodes into clusters to interpret the communication patterns between nodes. In this study we consider three different nonparametric Bayesian...
A non-parametric Bayesian approach to decompounding from high frequency data
Gugushvili, Shota; van der Meulen, F.H.; Spreij, Peter
2016-01-01
Given a sample from a discretely observed compound Poisson process, we consider non-parametric estimation of the density f0 of its jump sizes, as well as of its intensity λ0. We take a Bayesian approach to the problem and specify the prior on f0 as the Dirichlet location mixture of normal densities.
Non-parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods
DEFF Research Database (Denmark)
Høg, Esben
In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example, the Ornstein-Uhlenbeck process (the simplest mean-reverting process) has been used to model non-speculative price processes. We discuss non-parametric estimation of these processes...
Non-Parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods
DEFF Research Database (Denmark)
Høg, Esben
2003-01-01
In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example, the Ornstein-Uhlenbeck process (the simplest mean-reverting process) has been used to model non-speculative price processes. We discuss non-parametric estimation of these processes...
A Unified Nonparametric IRT Model for "d"-Dimensional Psychological Test Data ("d"-Isop)
Scheiblechner, Hartmann
2007-01-01
The (univariate) isotonic psychometric (ISOP) model (Scheiblechner, 1995) is a nonparametric IRT model for dichotomous and polytomous (rating scale) psychological test data. A weak subject independence axiom W1 postulates that the subjects are ordered in the same way except for ties (i.e., similarly or isotonically) by all items of a psychological…
Wey, Andrew; Connett, John; Rudser, Kyle
2015-07-01
For estimating conditional survival functions, non-parametric estimators can be preferred to parametric and semi-parametric estimators due to relaxed assumptions that enable robust estimation. Yet, even when misspecified, parametric and semi-parametric estimators can possess better operating characteristics in small sample sizes due to smaller variance than non-parametric estimators. Fundamentally, this is a bias-variance trade-off situation in that the sample size is not large enough to take advantage of the low bias of non-parametric estimation. Stacked survival models estimate an optimally weighted combination of models that can span parametric, semi-parametric, and non-parametric models by minimizing prediction error. An extensive simulation study demonstrates that stacked survival models consistently perform well across a wide range of scenarios by adaptively balancing the strengths and weaknesses of individual candidate survival models. In addition, stacked survival models perform as well as or better than the model selected through cross-validation. Finally, stacked survival models are applied to a well-known German breast cancer study. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
DEFF Research Database (Denmark)
Ramirez, José Rangel; Sørensen, John Dalsgaard
2011-01-01
This work illustrates the updating and incorporation of information in the assessment of fatigue reliability for offshore wind turbines. The new information, coming from external and condition monitoring, can be used for direct updating of the stochastic variables through a non-parametric Bayesian u...
Alterman, Arthur I.; Cacciola, John S.; Habing, Brian; Lynch, Kevin G.
2007-01-01
Baseline Addiction Severity Index (5th ed.; ASI-5) data of 2,142 substance abuse patients were analyzed with two nonparametric item response theory (NIRT) methods: Mokken scaling and conditional covariance techniques. Nine reliable and dimensionally homogeneous Recent Problem indexes emerged in the ASI-5's seven areas, including two each in the…
Production with Storable and Durable Inputs: Nonparametric Analysis of Intertemporal Efficiency
L. Cherchye (Laurens); B. de Rock (Bram); P.J. Kerstens (Pieter Jan)
2016-01-01
We propose a nonparametric methodology for intertemporal production analysis that accounts for durable as well as storable inputs. Durable inputs contribute to the production outputs in multiple consecutive periods. Storable inputs are non-durable and can be stored in inventories for use
Nonparametric Estimation of Interval Reliability for Discrete-Time Semi-Markov Systems
DEFF Research Database (Denmark)
Georgiadis, Stylianos; Limnios, Nikolaos
2016-01-01
In this article, we consider a repairable discrete-time semi-Markov system with finite state space. The measure of the interval reliability is given as the probability of the system being operational over a given finite-length time interval. A nonparametric estimator is proposed for the interval...
A comparative study of non-parametric models for identification of ...
African Journals Online (AJOL)
However, the frequency response method using random binary signals was good for unpredicted white noise characteristics and considered the best method for non-parametric system identification. The autoregressive external input (ARX) model was very useful for system identification, but on application, few input ...
Data analysis with small samples and non-normal data nonparametrics and other strategies
Siebert, Carl F
2017-01-01
Written in everyday language for non-statisticians, this book provides all the information needed to successfully conduct nonparametric analyses. This ideal reference book provides step-by-step instructions to lead the reader through each analysis, screenshots of the software and output, and case scenarios to illustrate all the analytic techniques.
Nonparametric Tests of Collectively Rational Consumption Behavior : An Integer Programming Procedure
Cherchye, L.J.H.; de Rock, B.; Sabbe, J.; Vermeulen, F.M.P.
2008-01-01
We present an IP-based nonparametric (revealed preference) testing procedure for rational consumption behavior in terms of general collective models, which include consumption externalities and public consumption. An empirical application to data drawn from the Russia Longitudinal Monitoring
Measurement Error in Nonparametric Item Response Curve Estimation. Research Report. ETS RR-11-28
Guo, Hongwen; Sinharay, Sandip
2011-01-01
Nonparametric, or kernel, estimation of item response curve (IRC) is a concern theoretically and operationally. Accuracy of this estimation, often used in item analysis in testing programs, is biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. In this study, we investigate…
Park, Jungkyu; Yu, Hsiu-Ting
2016-01-01
The multilevel latent class model (MLCM) is a multilevel extension of a latent class model (LCM) that is used to analyze data with a nested structure. The nonparametric version of an MLCM assumes a discrete latent variable at a higher-level nesting structure to account for the dependency among observations nested within a higher-level unit. In…
Nonparametric estimation of the stationary M/G/1 workload distribution function
DEFF Research Database (Denmark)
Hansen, Martin Bøgsted
2005-01-01
In this paper it is demonstrated how a nonparametric estimator of the stationary workload distribution function of the M/G/1-queue can be obtained by systematically sampling the workload process. Weak convergence results and bootstrap methods for empirical distribution functions for stationary associ...
DEFF Research Database (Denmark)
Effraimidis, Georgios; Dahl, Christian Møller
In this paper, we develop a fully nonparametric approach for the estimation of the cumulative incidence function with Missing At Random right-censored competing risks data. We obtain results on the pointwise asymptotic normality as well as the uniform convergence rate of the proposed nonparametri...
A non-parametric hierarchical model to discover behavior dynamics from tracks
Kooij, J.F.P.; Englebienne, G.; Gavrila, D.M.
2012-01-01
We present a novel non-parametric Bayesian model to jointly discover the dynamics of low-level actions and high-level behaviors of tracked people in open environments. Our model represents behaviors as Markov chains of actions which capture high-level temporal dynamics. Actions may be shared by
Verrelst, Jochem; Rivera, Juan Pablo; Veroustraete, Frank; Muñoz-Marí, Jordi; Clevers, J.G.P.W.; Camps-Valls, Gustau; Moreno, José
2015-01-01
Given the forthcoming availability of Sentinel-2 (S2) images, this paper provides a systematic comparison of retrieval accuracy and processing speed of a multitude of parametric, non-parametric and physically-based retrieval methods using simulated S2 data. An experimental field dataset (SPARC),
Low default credit scoring using two-class non-parametric kernel density estimation
CSIR Research Space (South Africa)
Rademeyer, E
2016-12-01
This paper investigates the performance of two-class classification credit scoring data sets with low default ratios. The standard two-class parametric Gaussian and non-parametric Parzen classifiers are extended, using Bayes’ rule, to include either...
Primate Innovation: Sex, Age and Social Rank
Reader, S.M.; Laland, K.N.
2001-01-01
Analysis of an exhaustive survey of primate behavior collated from the published literature revealed significant variation in rates of innovation among individuals of different sex, age and social rank. We searched approximately 1,000 articles in four primatology journals, together with other
Ranking Workplace Competencies: Student and Graduate Perceptions.
Rainsbury, Elizabeth; Hodges, Dave; Burchell, Noel; Lay, Mark
2002-01-01
New Zealand business students and graduates made similar rankings of the five most important workplace competencies: computer literacy, customer service orientation, teamwork and cooperation, self-confidence, and willingness to learn. Graduates placed greater importance on most of the 24 competencies, resulting in a statistically significant…
KMS weights on higher rank buildings
Marcinek, J.; Marcolli, M.
2016-01-01
We extend some of the results of Carey-Marcolli-Rennie on modular index invariants of Mumford curves to the case of higher rank buildings: we discuss notions of KMS weights on buildings, that generalize the construction of graph weights over graph C*-algebras.
Rank reduction of correlation matrices by majorization
R. Pietersz (Raoul); P.J.F. Groenen (Patrick)
2004-01-01
In this paper a novel method is developed for the problem of finding a low-rank correlation matrix nearest to a given correlation matrix. The method is based on majorization and therefore it is globally convergent. The method is computationally efficient, is straightforward to implement,
Biomechanics Scholar Citations across Academic Ranks
Directory of Open Access Journals (Sweden)
Knudson Duane
2015-11-01
Study aim: Citations to the publications of a scholar have been used as a measure of the quality or influence of their research record. A world-wide descriptive study of the citations to the publications of biomechanics scholars of various academic ranks was conducted.
Likelihoods for fixed rank nomination networks.
Hoff, Peter; Fosdick, Bailey; Volfovsky, Alex; Stovel, Katherine
2013-12-01
Many studies that gather social network data use survey methods that lead to censored, missing, or otherwise incomplete information. For example, the popular fixed rank nomination (FRN) scheme, often used in studies of schools and businesses, asks study participants to nominate and rank at most a small number of contacts or friends, leaving the existence of other relations uncertain. However, most statistical models are formulated in terms of completely observed binary networks. Statistical analyses of FRN data with such models ignore the censored and ranked nature of the data and could potentially result in misleading statistical inference. To investigate this possibility, we compare Bayesian parameter estimates obtained from a likelihood for complete binary networks with those obtained from likelihoods that are derived from the FRN scheme, and therefore accommodate the ranked and censored nature of the data. We show analytically and via simulation that the binary likelihood can provide misleading inference, particularly for certain model parameters that relate network ties to characteristics of individuals and pairs of individuals. We also compare these different likelihoods in a data analysis of several adolescent social networks. For some of these networks, the parameter estimates from the binary and FRN likelihoods lead to different conclusions, indicating the importance of analyzing FRN data with a method that accounts for the FRN survey design.
Cointegration rank testing under conditional heteroskedasticity
DEFF Research Database (Denmark)
Cavaliere, Giuseppe; Rahbek, Anders Christian; Taylor, Robert M.
2010-01-01
(martingale difference) innovations. We first demonstrate that the limiting null distributions of the rank statistics coincide with those derived by previous authors who assume either independent and identically distributed (i.i.d.) or (strict and covariance) stationary martingale difference innovations. We...
Subject Gateway Sites and Search Engine Ranking.
Thelwall, Mike
2002-01-01
Discusses subject gateway sites and commercial search engines for the Web and presents an explanation of Google's PageRank algorithm. The principal question addressed is the conditions under which a gateway site will increase the likelihood that a target page is found in search engines. (LRW)
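For readers unfamiliar with the PageRank algorithm mentioned in this entry, the following is a minimal sketch of the standard power-iteration formulation under the usual random-surfer interpretation. It is not taken from the article itself. The damping factor 0.85, the uniform treatment of pages without out-links, and the four-page toy web are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration for PageRank on a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform distribution
    for _ in range(max_iter):
        # Rank held by dangling pages (no out-links) is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        # Every page receives the random-jump share plus the dangling share.
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            for q in targets:
                # A surfer on p follows each of its out-links with equal probability.
                new_rank[q] += damping * rank[p] / len(targets)
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical toy web: page D acts like a small gateway that links to C.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy_web)
```

In this toy graph, C collects links from A, B and D and ends up with the largest score, while D, which nothing links to, keeps only the random-jump share of (1 - 0.85) / 4.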
Ranking beta sheet topologies of proteins
DEFF Research Database (Denmark)
Fonseca, Rasmus; Helles, Glennie; Winter, Pawel
2010-01-01
One of the challenges of protein structure prediction is to identify long-range interactions between amino acids. To reliably predict such interactions, we enumerate, score and rank all beta-topologies (partitions of beta-strands into sheets, orderings of strands within sheets and orientations of...
Ranking Very Many Typed Entities on Wikipedia
Zaragoza, Hugo; Rode, H.; Mika, Peter; Atserias, Jordi; Ciaramita, Massimiliano; Attardi, Guiseppe
2007-01-01
We discuss the problem of ranking very many entities of different types. In particular we deal with a heterogeneous set of types, some being very generic and some very specific. We discuss two approaches for this problem: i) exploiting the entity containment graph and ii) using a Web search engine
Generalized reduced rank tests using the singular value decomposition
Kleibergen, F.R.; Paap, R.
2002-01-01
We propose a novel statistic to test the rank of a matrix. The rank statistic overcomes deficiencies of existing rank statistics, like: necessity of a Kronecker covariance matrix for the canonical correlation rank statistic of Anderson (1951), sensitivity to the ordering of the variables for the LDU
Generalized Reduced Rank Tests using the Singular Value Decomposition
F.R. Kleibergen (Frank); R. Paap (Richard)
2003-01-01
We propose a novel statistic to test the rank of a matrix. The rank statistic overcomes deficiencies of existing rank statistics, like: necessity of a Kronecker covariance matrix for the canonical correlation rank statistic of Anderson (1951), sensitivity to the ordering of the variables
The effect of new links on Google PageRank
Avrachenkov, Konstatin; Litvak, Nelli
2004-01-01
PageRank is one of the principal criteria according to which Google ranks Web pages. PageRank can be interpreted as a frequency of visiting a Web page by a random surfer and thus it reflects the popularity of a Web page. We study the effect of newly created links on Google PageRank. We discuss to
Probabilistic relation between In-Degree and PageRank
Litvak, Nelli; Scheinhardt, Willem R.W.; Volkovich, Y.
2008-01-01
This paper presents a novel stochastic model that explains the relation between power laws of In-Degree and PageRank. PageRank is a popularity measure designed by Google to rank Web pages. We model the relation between PageRank and In-Degree through a stochastic equation, which is inspired by the
Generalized Reduced Rank Tests using the Singular Value Decomposition
Kleibergen, F.R.; Paap, R.
2006-01-01
We propose a novel statistic to test the rank of a matrix. The rank statistic overcomes deficiencies of existing rank statistics, like: a Kronecker covariance matrix for the canonical correlation rank statistic of Anderson [Annals of Mathematical Statistics (1951), 22, 327-351], sensitivity to the
Biology of RANK, RANKL, and osteoprotegerin
Boyce, Brendan F; Xing, Lianping
2007-01-01
The discovery of the receptor activator of nuclear factor-κB ligand (RANKL)/RANK/osteoprotegerin (OPG) system and its role in the regulation of bone resorption exemplifies how both serendipity and a logic-based approach can identify factors that regulate cell function. Before this discovery in the mid to late 1990s, it had long been recognized that osteoclast formation was regulated by factors expressed by osteoblast/stromal cells, but it had not been anticipated that members of the tumor necrosis factor superfamily of ligands and receptors would be involved or that the factors involved would have extensive functions beyond bone remodeling. RANKL/RANK signaling regulates the formation of multinucleated osteoclasts from their precursors as well as their activation and survival in normal bone remodeling and in a variety of pathologic conditions. OPG protects the skeleton from excessive bone resorption by binding to RANKL and preventing it from binding to its receptor, RANK. Thus, RANKL/OPG ratio is an important determinant of bone mass and skeletal integrity. Genetic studies in mice indicate that RANKL/RANK signaling is also required for lymph node formation and mammary gland lactational hyperplasia, and that OPG also protects arteries from medial calcification. Thus, these tumor necrosis factor superfamily members have important functions outside bone. Although our understanding of the mechanisms whereby they regulate osteoclast formation has advanced rapidly during the past 10 years, many questions remain about their roles in health and disease. Here we review our current understanding of the role of the RANKL/RANK/OPG system in bone and other tissues. PMID:17634140
VaRank: a simple and powerful tool for ranking genetic variants
Directory of Open Access Journals (Sweden)
Véronique Geoffroy
2015-03-01
Background. Most genetic disorders are caused by single nucleotide variations (SNVs) or small insertion/deletions (indels). High throughput sequencing has broadened the catalogue of human variation, including common polymorphisms, rare variations or disease causing mutations. However, identifying one variation among hundreds or thousands of others is still a complex task for biologists, geneticists and clinicians. Results. We have developed VaRank, a command-line tool for the ranking of genetic variants detected by high-throughput sequencing. VaRank scores and prioritizes variants annotated either by Alamut Batch or SnpEff. A barcode allows users to quickly view the presence/absence of variants (with homozygote/heterozygote status) in analyzed samples. VaRank supports the commonly used VCF input format for variant analysis, thus allowing it to be easily integrated into NGS bioinformatics analysis pipelines. VaRank has been successfully applied to disease-gene identification as well as to molecular diagnostics setup for several hundred patients. Conclusions. VaRank is implemented in Tcl/Tk, a scripting language which is platform-independent but has been tested only in Unix environments. The source code is available under the GNU GPL, and together with sample data and detailed documentation can be downloaded from http://www.lbgi.fr/VaRank/.
Differential invariants for higher-rank tensors. A progress report
International Nuclear Information System (INIS)
Tapial, V.
2004-07-01
We outline the construction of differential invariants for higher-rank tensors. In section 2 we outline the general method for the construction of differential invariants. A first result is that the simplest tensor differential invariant contains derivatives of the same order as the rank of the tensor. In section 3 we review the construction for the first-rank tensors (vectors) and second-rank tensors (metrics). In section 4 we outline the same construction for higher-rank tensors. (author)
Mushtaq, Muhammad Umair; Gull, Sibgha; Mushtaq, Komal; Abdullah, Hussain Muhammad; Khurshid, Usman; Shahid, Ubeera; Shad, Mushtaq Ahmad; Akram, Javed
2012-03-19
Child growth is internationally recognized as an important indicator of nutritional status and health in populations. This study was aimed to compare age- and gender-specific height, weight and BMI percentiles and nutritional status relative to the international growth references among Pakistani school-aged children. A population-based study was conducted with a multistage cluster sample of 1860 children aged five to twelve years in Lahore, Pakistan. Smoothed height, weight and BMI percentile curves were obtained and comparison was made with the World Health Organization 2007 (WHO) and United States' Centers for Disease Control and Prevention 2000 (USCDC) references. Over- and under-nutrition were defined according to the WHO and USCDC references, and the International Obesity Task Force (IOTF) cut-offs. Simple descriptive statistics were used and statistical significance was considered at P references. Mean differences from zero for height-, weight- and BMI-for-age z score values relative to the WHO and USCDC references were significant (P reference were closer to zero and the present study as compared to the USCDC reference. Mean differences between weight-for-age (0.19, 95% CI 0.10-0.30) and BMI-for-age (0.21, 95% CI 0.11-0.30) z scores relative to the WHO and USCDC references were significant. Over-nutrition estimates were higher (P reference as compared to the USCDC reference (17% vs. 15% overweight and 7.5% vs. 4% obesity) while underweight and thinness/wasting were lower (P reference as compared to the USCDC reference (7% vs. 12% underweight and 10% vs. 13% thinness). Significantly lower overweight (8%) and obesity (5%) prevalence and higher thinness grade one prevalence (19%) was seen with use of the IOTF cut-offs as compared to the WHO and USCDC references. Mean difference between height-for-age z scores and difference in stunting prevalence relative to the WHO and USCDC references was not significant. Pakistani school-aged children significantly differed
Li, L.; Yang, C.
2017-12-01
Climate extremes often manifest as rare events in terms of surface air temperature and precipitation with an annual reoccurrence period. In order to represent the manifold characteristics of climate extremes for monitoring and analysis, the Expert Team on Climate Change Detection and Indices (ETCCDI) worked out a set of 27 core indices based on daily temperature and precipitation data, describing extreme weather and climate events on an annual basis. The CLIMDEX project (http://www.climdex.org) produced public domain datasets of such indices for data from a variety of sources, including output from global climate models (GCM) participating in the Coupled Model Intercomparison Project Phase 5 (CMIP5). Among the 27 ETCCDI indices, there are six percentile-based temperature extremes indices that fall into two groups: exceedance rates (ER) (TN10p, TN90p, TX10p and TX90p) and durations (CSDI and WSDI). Percentiles must be estimated prior to the calculation of the indices, and could be more or less biased by the adopted algorithm. Such biases will in turn be propagated to the final results of the indices. CLIMDEX used an empirical quantile estimator combined with a bootstrap resampling procedure to reduce the inhomogeneity in the annual series of the ER indices. However, some problems remain in the CLIMDEX datasets, namely the overestimated climate variability due to unaccounted autocorrelation in the daily temperature data, seasonally varying biases and inconsistency between the algorithms applied to the ER indices and to the duration indices. We now present new results of the six indices through a semiparametric quantile regression approach for the CMIP5 model output. By using the base-period data as a whole and taking seasonality and autocorrelation into account, this approach successfully addressed the aforementioned issues and produced consistent results. The new datasets cover the historical and three projected (RCP2.6, RCP4.5 and RCP
Dual channel rank-based intensity weighting for quantitative co-localization of microscopy images
LENUS (Irish Health Repository)
Singan, Vasanth R
2011-10-21
Background: Accurate quantitative co-localization is a key parameter in the context of understanding the spatial co-ordination of molecules and therefore their function in cells. Existing co-localization algorithms consider either the presence of co-occurring pixels or correlations of intensity in regions of interest. Depending on the image source, and the algorithm selected, the co-localization coefficients determined can be highly variable, and often inaccurate. Furthermore, this choice of whether co-occurrence or correlation is the best approach for quantifying co-localization remains controversial. Results: We have developed a novel algorithm to quantify co-localization that improves on and addresses the major shortcomings of existing co-localization measures. This algorithm uses a non-parametric ranking of pixel intensities in each channel, and the difference in ranks of co-localizing pixel positions in the two channels is used to weight the coefficient. This weighting is applied to co-occurring pixels thereby efficiently combining both co-occurrence and correlation. Tests with synthetic data sets show that the algorithm is sensitive to both co-occurrence and correlation at varying levels of intensity. Analysis of biological data sets demonstrates that this new algorithm offers high sensitivity, and that it is capable of detecting subtle changes in co-localization, exemplified by studies on a well characterized cargo protein that moves through the secretory pathway of cells. Conclusions: This algorithm provides a novel way to efficiently combine co-occurrence and correlation components in biological images, thereby generating an accurate measure of co-localization. This approach of rank weighting of intensities also eliminates the need for manual thresholding of the image, which is often a cause of error in co-localization quantification. We envisage that this tool will facilitate the quantitative analysis of a wide range of biological data sets
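The rank-weighting idea described in this abstract can be sketched as follows. This is only an illustration, not the authors' published formula: the weight `1 - |rank difference| / max rank`, the use of dense ranks, the rule that a pixel co-occurs when it is non-zero in both channels, and all names below are assumptions.

```python
def dense_ranks(values):
    """Rank intensities from 1 (lowest); equal intensities share a rank."""
    order = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [order[v] for v in values]


def rank_weighted_coloc(chan_a, chan_b):
    """Fraction of channel-A intensity that co-occurs with channel B,
    down-weighted where the two channels disagree in intensity rank.

    chan_a, chan_b: flat lists of non-negative pixel intensities (same length).
    """
    total = sum(chan_a)
    if total == 0:
        return 0.0  # no signal in channel A at all
    ranks_a, ranks_b = dense_ranks(chan_a), dense_ranks(chan_b)
    max_rank = max(max(ranks_a), max(ranks_b))
    weighted = 0.0
    for a, b, ra, rb in zip(chan_a, chan_b, ranks_a, ranks_b):
        if a > 0 and b > 0:  # co-occurring pixel (assumed definition)
            weight = 1.0 - abs(ra - rb) / max_rank  # 1 when ranks agree
            weighted += a * weight
    return weighted / total


# Hypothetical 3-pixel images.
same = rank_weighted_coloc([10, 20, 30], [10, 20, 30])      # identical channels
reversed_ = rank_weighted_coloc([10, 20, 30], [30, 20, 10])  # overlap, opposite ranks
disjoint = rank_weighted_coloc([10, 0], [0, 10])             # no overlap
```

Identical channels give a coefficient of 1, disjoint channels give 0, and channels that overlap everywhere but with opposite intensity ordering fall in between, which is the sense in which such a measure combines co-occurrence with correlation.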
Tumor classification ranking from microarray data
Directory of Open Access Journals (Sweden)
Kijsanayothin Phongphun
2008-09-01
Background: Gene expression profiles based on microarray data are recognized as potential diagnostic indices of cancer. Molecular tumor classifications resulting from these data and learning algorithms have advanced our understanding of genetic changes associated with cancer etiology and development. However, classifications are not always perfect and in such cases the classification rankings (likelihoods of correct class predictions) can be useful for directing further research (e.g., by deriving inferences about predictive indicators or prioritizing future experiments). Classification ranking is a challenging problem, particularly for microarray data, where there is a huge number of possible regulated genes with no known rating function. This study investigates the possibility of making tumor classification more informative by using a method for classification ranking that requires no additional ranking analysis and maintains relatively good classification accuracy. Results: Microarray data of 11 different types and subtypes of cancer were analyzed using MDR (Multi-Dimensional Ranker), a recently developed boosting-based ranking algorithm. The number of predictor genes in all of the resulting classification models was at most nine, a huge reduction from the more than 12 thousand genes in the majority of the expression samples. Compared to several other learning algorithms, MDR gives the greatest AUC (area under the ROC curve) for the classifications of prostate cancer, acute lymphoblastic leukemia (ALL) and four ALL subtypes: BCR-ABL, E2A-PBX1, MALL and TALL. SVM (Support Vector Machine) gives the highest AUC for the classifications of lung, lymphoma, and breast cancers, and two ALL subtypes: Hyperdiploid > 50 and TEL-AML1. MDR gives highly competitive results, producing the highest average AUC, 91.01%, and an average overall accuracy of 90.01% for cancer expression analysis. Conclusion: Using the classification rankings from MDR is a simple
DEFF Research Database (Denmark)
Linnet, Kristian
2005-01-01
Bootstrap, HPLC, limit of blank, limit of detection, non-parametric statistics, type I and II errors...
Hosking, Michael Robert
This dissertation improves an analyst's use of simulation by offering improvements in the utilization of kriging metamodels. There are three main contributions. First an analysis is performed of what comprises good experimental designs for practical (non-toy) problems when using a kriging metamodel. Second is an explanation and demonstration of how reduced rank decompositions can improve the performance of kriging, now referred to as reduced rank kriging. Third is the development of an extension of reduced rank kriging which solves an open question regarding the usage of reduced rank kriging in practice. This extension is called omni-rank kriging. Finally these results are demonstrated on two case studies. The first contribution focuses on experimental design. Sequential designs are generally known to be more efficient than "one shot" designs. However, sequential designs require some sort of pilot design from which the sequential stage can be based. We seek to find good initial designs for these pilot studies, as well as designs which will be effective if there is no following sequential stage. We test a wide variety of designs over a small set of test-bed problems. Our findings indicate that analysts should take advantage of any prior information they have about their problem's shape and/or their goals in metamodeling. In the event of a total lack of information we find that Latin hypercube designs are robust default choices. Our work is most distinguished by its attention to the higher levels of dimensionality. The second contribution introduces and explains an alternative method for kriging when there is noise in the data, which we call reduced rank kriging. Reduced rank kriging is based on using a reduced rank decomposition which artificially smoothes the kriging weights similar to a nugget effect. Our primary focus will be showing how the reduced rank decomposition propagates through kriging empirically. In addition, we show further evidence for our
Directory of Open Access Journals (Sweden)
Yaser Sarikhani
2017-01-01
Background. High blood pressure in adults is directly correlated with increased risk of cardiovascular diseases. Hypertension in childhood and adolescence could be considered among the major causes of this problem in adults. This study aimed to investigate the factors associated with hypertension among the adolescents of Jahrom city in Iran and to estimate standard percentiles of blood pressure for this group. Methods. In this community-based cross-sectional study, 983 high school students from different areas of the city were included using a multistage random cluster sampling method in 2014. Blood pressure, weight, and height of each student were measured using standard methods. Data were analyzed with the statistical software SPSS 16. Results. In total, 498 male and 454 female students were included in this study. Average systolic blood pressure of students was 110.27 mmHg with a variation range of 80.6–151.3. Average diastolic blood pressure was 71.76 mmHg with a variation range of 49.3–105. Results of this study indicated that there was a significant relationship between gender, body mass index, and parental education level and the systolic and diastolic blood pressure of the students (P<0.05). Conclusions. Body mass index was one of the most important changeable factors associated with blood pressure in adolescents. Paying attention to this factor in adolescence could be effective in prevention of cardiovascular diseases in adulthood.
Low-rank quadratic semidefinite programming
Yuan, Ganzhao
2013-04-01
Low rank matrix approximation is an attractive model in large scale machine learning problems, because it can not only reduce the memory and runtime complexity, but also provide a natural way to regularize parameters while preserving learning accuracy. In this paper, we address a special class of nonconvex quadratic matrix optimization problems, which require a low rank positive semidefinite solution. Despite their non-convexity, we exploit the structure of these problems to derive an efficient solver that converges to their local optima. Furthermore, we show that the proposed solution is capable of dramatically enhancing the efficiency and scalability of a variety of concrete problems, which are of significant interest to the machine learning community. These problems include the Top-k Eigenvalue problem, Distance learning and Kernel learning. Extensive experiments on UCI benchmarks have shown the effectiveness and efficiency of our proposed method. © 2012.
Social Bookmarking Induced Active Page Ranking
Takahashi, Tsubasa; Kitagawa, Hiroyuki; Watanabe, Keita
Social bookmarking services have recently made it possible for us to register and share our own bookmarks on the web and are attracting attention. The services let us get structured data: (URL, Username, Timestamp, Tag Set). And these data represent user interest in web pages. The number of bookmarks is a barometer of web page value. Some web pages have many bookmarks, but most of those bookmarks may have been posted far in the past. Therefore, even if a web page has many bookmarks, their value is not guaranteed. If most of the bookmarks are very old, the page may be obsolete. In this paper, by focusing on the timestamp sequence of social bookmarkings on web pages, we model their activation levels representing current values. Further, we improve our previously proposed ranking method for web search by introducing the activation level concept. Finally, through experiments, we show effectiveness of the proposed ranking method.
Fourth-rank gravity. A progress report
International Nuclear Information System (INIS)
Tapia, V.
1992-04-01
We consider the consequences of describing the metric properties of space-time through a quartic line element. The associated ''metric'' is a fourth-rank tensor. After developing some fundamentals for such geometry, we construct a field theory for the gravitational field. This theory coincides with General Relativity in the vacuum case. Departures from General Relativity are obtained only in the presence of matter. We develop a simple cosmological model which is not in contradiction with the observed value Ω approx. 0.2-0.3 for the energy density parameter. A further application concerns conformal field theory. We are able to prove that a conformal field theory possesses an infinite-dimensional symmetry group only if the dimension of space-time is equal to the rank of the metric. In this case we are able to construct an integrable conformal field theory in four dimensions. The model is renormalisable by power counting. (author). 9 refs
Ranking agility factors affecting hospitals in Iran
Directory of Open Access Journals (Sweden)
M. Abdi Talarposht
2017-04-01
Background: Agility is an effective response to a changing and unpredictable environment that uses these changes as opportunities for organizational improvement. Objective: The aim of the present study was to rank the factors affecting the agile supply chain of hospitals in Iran. Methods: This applied, cross-sectional descriptive study was conducted over one year in 2015. The research population included managers, administrators, faculty members and experts of the selected hospitals. A total of 260 people were selected as the sample from the health centers. The construct validity of the questionnaire was confirmed by confirmatory factor analysis and its reliability by Cronbach's alpha (α=0.97). All data were analyzed by Kolmogorov-Smirnov, Chi-square and Friedman tests. Findings: The development of staff skills, the use of information technology, the integration of processes, appropriate planning, and customer satisfaction and product quality had a significant impact on the agility of public hospitals of Iran (P<0.001). New product introduction earned the highest ranking and the development of staff skills earned the lowest ranking. Conclusion: New product introduction, market responsiveness and sensitivity, cost reduction, and the integration of organizational processes ranked highest among the factors for achieving agility in hospitals in Iran. Therefore, planners and officials of hospitals should, by promoting the quality and variety of customer-oriented services and providing a basis for investment in hospitals, work toward an agile supply chain in the public hospitals of Iran.
[2013 research ranking of Spanish public universities].
Buela-Casal, Gualberto; Quevedo-Blasco, Raúl; Guillén-Riquelme, Alejandro
2015-01-01
The evaluation of research production and productivity is becoming increasingly necessary for universities. Having reliable and clear data is extremely useful in order to uncover strengths and weaknesses. The objective of this article is to update the research ranking of Spanish public universities with the 2013 data. Assessment was carried out based on articles in journals indexed in the JCR, research periods, R+D projects, doctoral theses, FPU grants, doctoral studies awarded with a citation of excellence, and patents, providing a rating, both for each individual indicator and globally, in production and productivity. The same methodology as previous editions was followed. In the global ranking, the universities with a higher production are Barcelona, Complutense of Madrid, and Granada. In productivity, the first positions are held by the universities Pompeu Fabra, Pablo de Olavide, and the Autonomous University of Barcelona. Differences can be found between the universities in production and productivity, while there are also certain similarities with regard to the position of Spanish universities in international rankings.
Kernelized rank learning for personalized drug recommendation.
He, Xiao; Folkman, Lukas; Borgwardt, Karsten
2018-03-08
Large-scale screenings of cancer cell lines with detailed molecular profiles against libraries of pharmacological compounds are currently being performed in order to gain a better understanding of the genetic component of drug response and to enhance our ability to recommend therapies given a patient's molecular profile. These comprehensive screens differ from the clinical setting in which (1) medical records only contain the response of a patient to very few drugs, (2) drugs are recommended by doctors based on their expert judgment, and (3) selecting the most promising therapy is often more important than accurately predicting the sensitivity to all potential drugs. Current regression models for drug sensitivity prediction fail to account for these three properties. We present a machine learning approach, named Kernelized Rank Learning (KRL), that ranks drugs based on their predicted effect per cell line (patient), circumventing the difficult problem of precisely predicting the sensitivity to the given drug. Our approach outperforms several state-of-the-art predictors in drug recommendation, particularly if the training dataset is sparse, and generalizes to patient data. Our work phrases personalized drug recommendation as a new type of machine learning problem with translational potential to the clinic. The Python implementation of KRL and scripts for running our experiments are available at https://github.com/BorgwardtLab/Kernelized-Rank-Learning. xiao.he@bsse.ethz.ch, lukas.folkman@bsse.ethz.ch. Supplementary data are available at Bioinformatics online.
Association between Metabolic Syndrome and Job Rank.
Mehrdad, Ramin; Pouryaghoub, Gholamreza; Moradi, Mahboubeh
2018-01-01
A person's occupation can influence the development of metabolic syndrome. The objective of this study was to determine the association of metabolic syndrome and its determinants with job rank in workers of a large car factory in Iran. 3989 male workers at a large car manufacturing company were invited to participate in this cross-sectional study. Demographic and anthropometric data of the participants, including age, height, weight, and abdominal circumference, were recorded. Blood samples were taken to measure lipid profile and blood glucose level. Metabolic syndrome was diagnosed in each participant based on ATPIII 2001 criteria. The workers were categorized based on their job rank into 3 groups of (1) office workers, (2) workers with physical exertion, and (3) workers with chemical exposure. The study characteristics, particularly the frequency of metabolic syndrome and its determinants, were compared among the study groups. The prevalence of metabolic syndrome in our study was 7.7% (95% CI 6.9 to 8.5). HDL levels were significantly lower in those who had chemical exposure (p=0.045). Diastolic blood pressure was significantly higher in those who had mechanical exertion (p=0.026). The frequency of metabolic syndrome in the office workers, workers with physical exertion, and workers with chemical exposure was 7.3%, 7.9%, and 7.8%, respectively (p=0.836). Seemingly, there is no association between metabolic syndrome and job rank.
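The reported confidence interval can be checked with a short calculation. The sketch below assumes a normal-approximation (Wald) interval and assumes that all 3989 invited workers were included in the analysis; the abstract states neither, and the function name `wald_interval` is purely illustrative.

```python
import math


def wald_interval(p_hat, n, z=1.959964):
    """Normal-approximation (Wald) interval for a proportion.

    Assumption: the abstract does not say which interval method was used.
    """
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - z * standard_error, p_hat + z * standard_error


# Assumption: n equals the number of invited workers reported in the abstract.
low, high = wald_interval(0.077, 3989)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

Under these assumptions the interval rounds to the same bounds as the one quoted in the abstract.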
Iris Template Protection Based on Local Ranking
Directory of Open Access Journals (Sweden)
Dongdong Zhao
2018-01-01
Biometrics have been widely studied in recent years, and they are increasingly employed in real-world applications. Meanwhile, a number of potential threats to the privacy of biometric data arise. Iris template protection demands that the privacy of iris data should be protected when performing iris recognition. According to the international standard ISO/IEC 24745, iris template protection should satisfy irreversibility, revocability, and unlinkability. However, existing works on iris template protection demonstrate that it is difficult to satisfy the three privacy requirements simultaneously while supporting effective iris recognition. In this paper, we propose an iris template protection method based on local ranking. Specifically, the iris data are first XORed (Exclusive OR operation) with an application-specific string; next, we divide the results into blocks and then partition the blocks into groups. The blocks in each group are ranked according to their decimal values, and original blocks are transformed to their rank values for storage. We also extend the basic method to support the shifting strategy and masking strategy, which are two important strategies for iris recognition. We demonstrate that the proposed method satisfies irreversibility, revocability, and unlinkability. Experimental results on typical iris datasets (i.e., CASIA-IrisV3-Interval, CASIA-IrisV4-Lamp, UBIRIS-V1-S1, and MMU-V1) show that the proposed method could maintain the recognition performance while protecting the privacy of iris data.
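The basic transformation described in this abstract (XOR, split into blocks, group the blocks, rank within each group) can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the block size, the group size, the tie-breaking rule, and the name `local_rank_template` are all assumptions, and the shifting and masking extensions are not covered.

```python
from typing import List


def local_rank_template(iris_bits: str, app_string: str,
                        block_size: int = 2, group_size: int = 4) -> List[int]:
    """Hypothetical sketch of a local-ranking transform on a binary iris code."""
    if len(iris_bits) != len(app_string):
        raise ValueError("iris code and application string must have equal length")
    if len(iris_bits) % (block_size * group_size) != 0:
        raise ValueError("length must be a multiple of block_size * group_size")

    # Step 1: XOR the iris code with the application-specific string.
    xored = [int(a) ^ int(b) for a, b in zip(iris_bits, app_string)]

    # Step 2: cut the result into blocks and read each block as a decimal value.
    blocks = [
        int("".join(str(bit) for bit in xored[i:i + block_size]), 2)
        for i in range(0, len(xored), block_size)
    ]

    # Step 3: partition the blocks into groups and replace each block by its
    # rank inside its group. Ties are broken by position (an assumption).
    template: List[int] = []
    for start in range(0, len(blocks), group_size):
        group = blocks[start:start + group_size]
        order = sorted(range(len(group)), key=lambda i: (group[i], i))
        ranks = [0] * len(group)
        for rank, index in enumerate(order):
            ranks[index] = rank
        template.extend(ranks)
    return template


# The same iris code yields different stored templates under different strings.
template_a = local_rank_template("11110000", "00000000")
template_b = local_rank_template("11110000", "01100110")
```

Only the rank values are stored, so a stored template does not reveal the block values directly, and issuing a new application-specific string produces a different template from the same iris code.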
Rank-dependant factorization of entanglement evolution
International Nuclear Information System (INIS)
Siomau, Michael
2016-01-01
Highlights: • In some cases the complex entanglement evolution can be factorized into simple terms. • We suggest factorization equations for multiqubit entanglement evolution. • The factorization is solely defined by the rank of the final state density matrices. • The factorization is independent of the local noisy channels and initial pure states. - Abstract: The description of the entanglement evolution of a complex quantum system can be significantly simplified due to the symmetries of the initial state and the quantum channels, which simultaneously affect parts of the system. Using concurrence as the entanglement measure, we study the entanglement evolution of few-qubit systems, when each of the qubits is affected by a local unital channel independently of the others. We found that for low-rank density matrices of the final quantum state, such complex entanglement dynamics can be completely described by a combination of independent factors representing the evolution of entanglement of the initial state, when just one of the qubits is affected by a local channel. We suggest necessary conditions for the rank of the density matrices to represent the entanglement evolution through the factors. Our finding is supported with analytical examples and numerical simulations.
A Review of Outcomes of Seven World University Ranking Systems
Directory of Open Access Journals (Sweden)
Mahmood Khosrowjerdi
2012-12-01
There are many national and international ranking systems that rank the universities and higher education institutions of the world, nationally or internationally, based on the same or different criteria. The question is whether we need all these ranking systems. Are the outcomes of these ranking systems as different as they claim? This study collected data from the results of seven major ranking systems including Shanghai, QS, 4International, Webometrics, HEEACT, and Leiden University ranking and analyzed them. Results showed a significant correlation among the outcomes of these international ranking systems in ranking and rating the world's top 50 universities. The highest correlations were between Shanghai and THE (Spearman's Rho = 0.85), Shanghai and Webometrics (Spearman's Rho = 0.81), and Shanghai and Leiden (Spearman's Rho = 0.80). Finally, some suggestions for improving current ranking systems are presented.
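For readers unfamiliar with the statistic, Spearman's rho between two rankings of the same institutions can be computed as in the minimal sketch below. The rank lists are invented for illustration and are not data from the study; the formula used assumes there are no tied ranks.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation for two rankings without ties."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must cover the same items")
    n = len(ranks_a)
    if n < 2:
        raise ValueError("at least two items are needed")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * squared_diffs / (n * (n * n - 1))


# Hypothetical positions of five universities in two ranking systems.
system_a = [1, 2, 3, 4, 5]
system_b = [2, 1, 3, 5, 4]
rho = spearman_rho(system_a, system_b)
```

A value near 1 means the two systems order the institutions almost identically, which is the sense in which the study reports high agreement among systems.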
The effect of new links on Google PageRank
Avrachenkov, Konstatin; Litvak, Nelli
2004-01-01
PageRank is one of the principal criteria according to which Google ranks Web pages. PageRank can be interpreted as a frequency of visiting a Web page by a random surfer and thus it reflects the popularity of a Web page. We study the effect of newly created links on Google PageRank. We discuss to what extent a page can control its PageRank. Using the asymptotic analysis we provide simple conditions that show if new links bring benefits to a Web page and its neighbors in terms of PageRank or t...
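The random-surfer interpretation can be made concrete with a small power-iteration sketch. This is a textbook-style illustration under stated assumptions (damping factor 0.85, dangling pages spread uniformly, a made-up three-page graph); it is not the asymptotic analysis developed in the paper.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration for PageRank on a small graph.

    `links` maps every page to the list of pages it links to; every linked
    page must also appear as a key.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Teleportation share received by every page.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links[page] if links[page] else pages  # dangling page
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: a simple cycle, then the same graph with a new link c -> b.
before = pagerank({"a": ["b"], "b": ["c"], "c": ["a"]})
after = pagerank({"a": ["b"], "b": ["c"], "c": ["a", "b"]})
```

In this toy graph the new inbound link raises the rank of page `b` at the expense of page `a`; whether a new link helps in general is the question the paper addresses.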
Country-specific determinants of world university rankings.
Pietrucha, Jacek
2018-01-01
This paper examines country-specific factors that affect the three most influential world university rankings (the Academic Ranking of World Universities, the QS World University Ranking, and the Times Higher Education World University Ranking). We run a cross sectional regression that covers 42-71 countries (depending on the ranking and data availability). We show that the position of universities from a country in the ranking is determined by the following country-specific variables: economic potential of the country, research and development expenditure, long-term political stability (freedom from war, occupation, coups and major changes in the political system), and institutional variables, including government effectiveness.
Bayesian Non-Parametric Mixtures of GARCH(1,1) Models
Directory of Open Access Journals (Sweden)
John W. Lau
2012-01-01
Traditional GARCH models describe volatility levels that evolve smoothly over time, generated by a single GARCH regime. However, nonstationary time series data may exhibit abrupt changes in volatility, suggesting changes in the underlying GARCH regimes. Further, the number and times of regime changes are not always obvious. This article outlines a nonparametric mixture of GARCH models that is able to estimate the number and time of volatility regime changes by mixing over the Poisson-Kingman process. The process is a generalisation of the Dirichlet process typically used in nonparametric models for time-dependent data and provides a richer clustering structure; its application to time series data is novel. Inference is Bayesian, and a Markov chain Monte Carlo algorithm to explore the posterior distribution is described. The methodology is illustrated on the Standard and Poor's 500 financial index.
Bornkamp, Björn; Ickstadt, Katja
2009-03-01
In this article, we consider monotone nonparametric regression in a Bayesian framework. The monotone function is modeled as a mixture of shifted and scaled parametric probability distribution functions, and a general random probability measure is assumed as the prior for the mixing distribution. We investigate the choice of the underlying parametric distribution function and find that the two-sided power distribution function is well suited both from a computational and mathematical point of view. The model is motivated by traditional nonlinear models for dose-response analysis, and provides possibilities to elicit informative prior distributions on different aspects of the curve. The method is compared with other recent approaches to monotone nonparametric regression in a simulation study and is illustrated on a data set from dose-response analysis.
Promotion time cure rate model with nonparametric form of covariate effects.
Chen, Tianlei; Du, Pang
2018-05-10
Survival data with a cured portion are commonly seen in clinical trials. Motivated by a biological interpretation of cancer metastasis, the promotion time cure model is a popular alternative to the mixture cure rate model for analyzing such data. The existing promotion time cure models all assume a restrictive parametric form of covariate effects, which can be incorrectly specified, especially at the exploratory stage. In this paper, we propose a nonparametric approach to modeling the covariate effects under the framework of the promotion time cure model. The covariate effect function is estimated by smoothing splines via the optimization of a penalized profile likelihood. Point-wise interval estimates are also derived from the Bayesian interpretation of the penalized profile likelihood. Asymptotic convergence rates are established for the proposed estimates. Simulations show excellent performance of the proposed nonparametric method, which is then applied to a melanoma study. Copyright © 2018 John Wiley & Sons, Ltd.
Filippi, Sarah; Holmes, Chris C; Nieto-Barajas, Luis E
2016-11-16
In this article we propose novel Bayesian nonparametric methods using Dirichlet Process Mixture (DPM) models for detecting pairwise dependence between random variables while accounting for uncertainty in the form of the underlying distributions. A key criterion is that the procedures should scale to large data sets. In this regard we find that the formal calculation of the Bayes factor for a dependent-vs.-independent DPM joint probability measure is not feasible computationally. To address this we present Bayesian diagnostic measures for characterising evidence against a "null model" of pairwise independence. In simulation studies, as well as for a real data analysis, we show that our approach provides a useful tool for the exploratory nonparametric Bayesian analysis of large multivariate data sets.
Riihimäki, Jaakko; Sund, Reijo; Vehtari, Aki
2010-06-01
Effective utilisation of limited resources is a challenge for health care providers. Accurate and relevant information extracted from the length of stay distributions is useful for management purposes. Patient care episodes can be reconstructed from the comprehensive health registers, and in this paper we develop a Bayesian approach to analyse the length of care episode after a fractured hip. We model the large scale data with a flexible nonparametric multilayer perceptron network and with a parametric Weibull mixture model. To assess the performances of the models, we estimate expected utilities using predictive density as a utility measure. Since the model parameters cannot be directly compared, we focus on observables, and estimate the relevances of patient explanatory variables in predicting the length of stay. To demonstrate how the use of the nonparametric flexible model is advantageous for this complex health care data, we also study joint effects of variables in predictions, and visualise nonlinearities and interactions found in the data.
A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems.
Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J
2017-06-01
We propose a Bayesian nonparametric mixture model for the reconstruction and prediction from observed time series data, of discretized stochastic dynamical systems, based on Markov Chain Monte Carlo methods. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities, is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.
A nonparametric empirical Bayes framework for large-scale multiple testing.
Martin, Ryan; Tokdar, Surya T
2012-07-01
We propose a flexible and identifiable version of the 2-groups model, motivated by hierarchical Bayes considerations, that features an empirical null and a semiparametric mixture model for the nonnull cases. We use a computationally efficient predictive recursion (PR) marginal likelihood procedure to estimate the model parameters, even the nonparametric mixing distribution. This leads to a nonparametric empirical Bayes testing procedure, which we call PRtest, based on thresholding the estimated local false discovery rates. Simulations and real data examples demonstrate that, compared to existing approaches, PRtest's careful handling of the nonnull density can give a much better fit in the tails of the mixture distribution which, in turn, can lead to more realistic conclusions.
Yau, Christopher; Holmes, Chris
2011-07-01
We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as an issue in variable selection for unsupervised learning. We demonstrate that by defining a hierarchical population based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood we arrive at a 'sparsity prior' representation which admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. This also allows for individual cluster specific variable selection. We demonstrate improved inference on a number of canonical problems.
DEFF Research Database (Denmark)
Carrao, Hugo; Sepulcre, Guadalupe; Horion, Stéphanie Marie Anne F
2013-01-01
This study evaluates the relationship between the frequency and duration of meteorological droughts and the subsequent temporal changes on the quantity of actively photosynthesizing biomass (greenness) estimated from satellite imagery on rainfed croplands in Latin America. An innovative non-parametric...... and non-supervised approach, based on the Fisher-Jenks optimal classification algorithm, is used to identify multi-scale meteorological droughts on the basis of empirical cumulative distributions of 1, 3, 6, and 12-monthly precipitation totals. As input data for the classifier, we use the gridded GPCC...... for the period between 1998 and 2010. The time-series analysis of vegetation greenness is performed during the growing season with a non-parametric method, namely the seasonal Relative Greenness (RG) of spatially accumulated fAPAR. The Global Land Cover map of 2000 and the GlobCover maps of 2005/2006 and 2009...
Lloyd, Blair P; Finley, Crystal I; Weaver, Emily S
2015-11-17
Stereotypy is common in individuals with developmental disabilities and may become disruptive in the context of instruction. The purpose of this study was to embed brief experimental analyses in the context of reading instruction to evaluate effects of antecedent and consequent variables on latencies to and durations of stereotypy. We trained a reading instructor to implement a trial-based functional analysis and a subsequent antecedent analysis of stimulus features for an adolescent with autism in a reading clinic. We used alternating treatments designs with applications of nonparametric statistical analyses to control Type I error rates. Results of the experimental analyses suggested stereotypy was maintained by nonsocial reinforcement and informed the extent to which features of academic materials influenced levels of stereotypy. Results of nonparametric statistical analyses were consistent with conclusions based on visual analysis. Brief experimental analyses may be embedded in academic instruction to inform the stimulus conditions that influence stereotypy.
Hadron Energy Reconstruction for ATLAS Barrel Combined Calorimeter Using Non-Parametrical Method
Kulchitskii, Yu A
2000-01-01
Hadron energy reconstruction for the ATLAS barrel prototype combined calorimeter in the framework of the non-parametrical method is discussed. The non-parametrical method utilizes only the known e/h ratios and the electron calibration constants and does not require the determination of any parameters by a minimization technique. Thus, this technique lends itself to fast energy reconstruction in a first level trigger. The reconstructed mean values of the hadron energies are within $\pm 1\%$ of the true values and the fractional energy resolution is $[(58 \pm 3)\%\,\sqrt{\mathrm{GeV}}/\sqrt{E} + (2.5 \pm 0.3)\%] \oplus (1.7 \pm 0.2)\,\mathrm{GeV}/E$. The value of the e/h ratio obtained for the electromagnetic compartment of the combined calorimeter is $1.74 \pm 0.04$. Results of a study of the longitudinal hadronic shower development are also presented.
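To get a feel for the quoted resolution formula, it can be evaluated at a given energy. The sketch below uses the central values from the abstract, ignores their uncertainties, and reads the circled-plus symbol as a sum in quadrature, which is the usual calorimetry convention but is an assumption here; the function name and the sample energies are illustrative.

```python
import math


def fractional_resolution(energy_gev, stochastic=0.58, constant=0.025,
                          noise_gev=1.7):
    """Evaluate [a/sqrt(E) + b] (+) c/E with (+) taken as a quadrature sum."""
    sampling_plus_constant = stochastic / math.sqrt(energy_gev) + constant
    noise_term = noise_gev / energy_gev
    return math.hypot(sampling_plus_constant, noise_term)


# Central values only; the quoted uncertainties are not propagated.
for energy in (20.0, 100.0, 300.0):
    print(energy, round(100.0 * fractional_resolution(energy), 2), "%")
```

The resolution improves with energy because both the stochastic term and the noise term shrink as E grows, leaving the constant term as the floor.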
Bootstrap Prediction Intervals in Non-Parametric Regression with Applications to Anomaly Detection
Kumar, Sricharan; Srivistava, Ashok N.
2012-01-01
Prediction intervals provide a measure of the probable interval in which the outputs of a regression model can be expected to occur. Subsequently, these prediction intervals can be used to determine if the observed output is anomalous or not, conditioned on the input. In this paper, a procedure for determining prediction intervals for outputs of nonparametric regression models using bootstrap methods is proposed. Bootstrap methods allow for a non-parametric approach to computing prediction intervals with no specific assumptions about the sampling distribution of the noise or the data. The asymptotic fidelity of the proposed prediction intervals is theoretically proved. Subsequently, the validity of the bootstrap based prediction intervals is illustrated via simulations. Finally, the bootstrap prediction intervals are applied to the problem of anomaly detection on aviation data.
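The general idea can be illustrated with a simplified residual bootstrap around a kernel smoother. This is not the procedure proposed in the paper: the Gaussian-kernel smoother, the bandwidth, the number of resamples, and the synthetic sine data are all assumptions chosen to keep the sketch short.

```python
import math
import random


def kernel_smooth(xs, ys, x0, bandwidth):
    """Gaussian-kernel weighted average of ys around x0."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def bootstrap_prediction_interval(xs, ys, x0, bandwidth=0.5, n_boot=200,
                                  alpha=0.1, seed=0):
    """Simplified residual-bootstrap prediction interval at x0."""
    rng = random.Random(seed)
    fitted = [kernel_smooth(xs, ys, x, bandwidth) for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    mean_residual = sum(residuals) / len(residuals)
    residuals = [r - mean_residual for r in residuals]  # centre the residuals

    draws = []
    for _ in range(n_boot):
        # Rebuild pseudo-data from resampled residuals, refit, then add one
        # more resampled residual to mimic the noise of a new observation.
        y_star = [f + rng.choice(residuals) for f in fitted]
        draws.append(kernel_smooth(xs, y_star, x0, bandwidth)
                     + rng.choice(residuals))
    draws.sort()
    lower = draws[int((alpha / 2.0) * n_boot)]
    upper = draws[min(n_boot - 1, int((1.0 - alpha / 2.0) * n_boot))]
    return lower, upper


# Synthetic data: a noisy sine curve (illustrative only).
noise = random.Random(1)
xs = [i / 10.0 for i in range(50)]
ys = [math.sin(x) + noise.gauss(0.0, 0.1) for x in xs]
lower, upper = bootstrap_prediction_interval(xs, ys, x0=2.5)


def is_anomalous(y_observed):
    """Flag an observation that falls outside the prediction interval."""
    return not (lower <= y_observed <= upper)
```

An observed output is then flagged as anomalous, conditioned on its input, when it falls outside the interval computed at that input.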
Using nonparametrics to specify a model to measure the value of travel time
DEFF Research Database (Denmark)
Fosgerau, Mogens
2007-01-01
Using a range of nonparametric methods, the paper examines the specification of a model to evaluate the willingness-to-pay (WTP) for travel time changes from binomial choice data from a simple time-cost trading experiment. The analysis favours a model with random WTP as the only source of randomn...... unobserved heterogeneity. This formulation is useful for parametric modelling. The index indicates that the WTP varies systematically with income and other individual characteristics. The WTP also varies with the time difference presented in the experiment, which contradicts standard utility theory....
An Intelligent Nonparametric GS Detection Algorithm Based on Adaptive Threshold Selection
Directory of Open Access Journals (Sweden)
Zhang Lin
2012-12-01
In modern radar systems, the clutter’s statistical characteristics are unknown. With such clutter, the CFAR capability of parametric detection algorithms declines, so nonparametric detection algorithms become very important. An intelligent nonparametric Generalized Sign (GS) detection algorithm, Variability Index-Generalized Sign (VI-GS), based on adaptive threshold selection is proposed. The VI-GS detection algorithm employs a composite approach based on the GS detection algorithm, the Trimmed GS detection algorithm (TGS) and the Greatest Of GS detection algorithm (GO-GS). The performance of this detection algorithm in the nonhomogeneous clutter background is analyzed based on simulated Gaussian distributed clutter and on real radar data. These results show that it performs robustly in the homogeneous background as well as the nonhomogeneous background.
The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard
This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression...... and hence, also in biased measures, which are derived from the estimated parameters. This, in turn, can result in incorrect economic conclusions and recommendations for managers, politicians and decision makers in general. This PhD thesis focuses on a nonparametric econometric approach that can be used...... kernel methods are well-suited to econometric production analysis and can outperform traditional parametric methods. Although the empirical focus of this thesis is on the application of nonparametric kernel regression in applied production analysis, the findings are also applicable to econometric...
On Wasserstein Two-Sample Testing and Related Families of Nonparametric Tests
Directory of Open Access Journals (Sweden)
Aaditya Ramdas
2017-01-01
Nonparametric two-sample or homogeneity testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. The literature is old and rich, with a wide variety of statistics having been designed and analyzed, both for the unidimensional and the multivariate setting. In this short survey, we focus on test statistics that involve the Wasserstein distance. Using an entropic smoothing of the Wasserstein distance, we connect these to very different tests including multivariate methods involving energy statistics and kernel based maximum mean discrepancy and univariate methods like the Kolmogorov–Smirnov test, probability or quantile (PP/QQ) plots and receiver operating characteristic or ordinal dominance (ROC/ODC) curves. Some observations are implicit in the literature, while others seem to have not been noticed thus far. Given nonparametric two-sample testing’s classical and continued importance, we aim to provide useful connections for theorists and practitioners familiar with one subset of methods but not others.
Bayesian Bandwidth Selection for a Nonparametric Regression Model with Mixed Types of Regressors
Directory of Open Access Journals (Sweden)
Xibin Zhang
2016-04-01
This paper develops a sampling algorithm for bandwidth estimation in a nonparametric regression model with continuous and discrete regressors under an unknown error density. The error density is approximated by the kernel density estimator of the unobserved errors, while the regression function is estimated using the Nadaraya-Watson estimator admitting continuous and discrete regressors. We derive an approximate likelihood and posterior for bandwidth parameters, followed by a sampling algorithm. Simulation results show that the proposed approach typically leads to better accuracy of the resulting estimates than cross-validation, particularly for smaller sample sizes. This bandwidth estimation approach is applied to a nonparametric regression model of the Australian All Ordinaries returns and to the kernel density estimation of gross domestic product (GDP) growth rates among Organisation for Economic Co-operation and Development (OECD) and non-OECD countries.
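A Nadaraya-Watson estimator with one continuous and one discrete regressor can be sketched as below. The Gaussian kernel for the continuous regressor and the simple "1 if equal, otherwise lambda" kernel for the discrete one are assumptions made for illustration; the abstract does not specify the kernels, and the Bayesian sampling of the bandwidths is not shown.

```python
import math


def nw_mixed(x_cont, z_disc, y, x0, z0, bandwidth, lam):
    """Nadaraya-Watson estimate at (x0, z0) using a product kernel.

    bandwidth > 0 smooths the continuous regressor; lam in [0, 1] smooths the
    discrete one (0 uses only matching categories, 1 ignores the category).
    """
    numerator = 0.0
    denominator = 0.0
    for x, z, target in zip(x_cont, z_disc, y):
        k_cont = math.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
        k_disc = 1.0 if z == z0 else lam
        weight = k_cont * k_disc
        numerator += weight * target
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no observation receives positive weight")
    return numerator / denominator


# Toy data: two categories with different levels at the same x value.
x_cont = [0.0, 0.0, 0.0, 0.0]
z_disc = ["a", "a", "b", "b"]
y = [1.0, 1.0, 5.0, 5.0]
split_estimate = nw_mixed(x_cont, z_disc, y, 0.0, "a", bandwidth=1.0, lam=0.0)
pooled_estimate = nw_mixed(x_cont, z_disc, y, 0.0, "a", bandwidth=1.0, lam=1.0)
```

The two extreme values of `lam` show why the bandwidths matter: one setting treats the categories as separate samples, the other pools them, and the choice between them is what a bandwidth selection method has to make.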
SOURCES OF GROWTH IN THE TURKISH ECONOMY: A NON-PARAMETRIC APPROACH
Directory of Open Access Journals (Sweden)
ŞENAY AÇIKGÖZ
2013-06-01
Estimating the sources of economic growth within the framework of the conventional growth accounting approach is based on two assumptions, namely that the factor markets are competitive and that the underlying aggregate production function has a specific form. In this study, following Iwata et al. (2003), sources of economic growth and total factor productivity (TFP) growth in the Turkish economy for the period 1968-2006 were estimated with a nonparametric regression approach, which does not require imposing these restrictive assumptions. Nonparametric estimates of the income shares of capital and labor indicated that there are diminishing returns to scale in the Turkish economy. According to the results, capital formation is the main source of growth in the pre-1980 period, while TFP seems to be the source of growth in the post-1980 period, with the exception of the 1991-95 period. It is observed that labor’s contribution to economic growth reached its highest level in 1991-95.
DEFF Research Database (Denmark)
Henningsen, Geraldine; Henningsen, Arne; Henning, Christian
2011-01-01
All business transactions as well as achieving innovations take up resources, subsumed under the concept of transaction costs (TAC). One of the major factors in TAC theory is information. Information networks can catalyse the interpersonal information exchange and hence, increase the access to no...... are unveiled by reduced productivity. A cross-validated local linear non-parametric regression shows that good information networks increase the productivity of farms. A bootstrapping procedure confirms that this result is statistically significant....
Nonparametric estimation of the stationary M/G/1 workload distribution function
DEFF Research Database (Denmark)
Hansen, Martin Bøgsted
In this paper it is demonstrated how a nonparametric estimator of the stationary workload distribution function of the M/G/1-queue can be obtained by systematic sampling the workload process. Weak convergence results and bootstrap methods for empirical distribution functions for stationary...... associated sequences are used to derive asymptotic results and bootstrap methods for inference about the workload distribution function. The potential of the method is illustrated by a simulation study of the M/D/1 model....
A simple non-parametric goodness-of-fit test for elliptical copulas
Directory of Open Access Journals (Sweden)
Jaser Miriam
2017-12-01
In this paper, we propose a simple non-parametric goodness-of-fit test for elliptical copulas of any dimension. It is based on the equality of Kendall’s tau and Blomqvist’s beta for all bivariate margins. Nominal level and power of the proposed test are investigated in a Monte Carlo study. An empirical application illustrates our goodness-of-fit test at work.
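The two dependence measures that the test compares can be estimated from a bivariate sample as in the sketch below. Only the sample versions of Kendall's tau and Blomqvist's beta are shown; the test statistic and its null distribution from the paper are not reproduced, the function names are illustrative, and the toy data are invented.

```python
from itertools import combinations
from statistics import median


def kendall_tau(xs, ys):
    """Sample Kendall's tau: concordant minus discordant pairs over all pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)


def blomqvist_beta(xs, ys):
    """Sample Blomqvist's beta: quadrant counts relative to the two medians."""
    mx, my = median(xs), median(ys)
    same = opposite = 0
    for x, y in zip(xs, ys):
        sign = (x - mx) * (y - my)
        if sign > 0:
            same += 1
        elif sign < 0:
            opposite += 1
    if same + opposite == 0:
        raise ValueError("all points lie on a median line")
    return (same - opposite) / (same + opposite)


# Toy sample (illustrative only).
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]
tau, beta = kendall_tau(xs, ys), blomqvist_beta(xs, ys)
```

For a real data set one would compute both quantities for every pair of margins; a large gap between them is the kind of evidence against an elliptical copula that the proposed test formalises.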
An application of nonparametric Cox regression model in reliability analysis: A case study
Czech Academy of Sciences Publication Activity Database
Volf, Petr
2004-01-01
Vol. 40, No. 5 (2004), pp. 639-648 ISSN 0023-5954 R&D Projects: GA ČR GA201/02/0049; GA ČR GA402/01/0539 Institutional research plan: CEZ:AV0Z1075907 Keywords: hazard rate * nonparametric regression * Cox model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004