WorldWideScience

Sample records for nonparametric kernel estimation

  1. Kernel bandwidth estimation for non-parametric density estimation: a comparative study

    CSIR Research Space (South Africa)

    Van der Walt, CM

    2013-12-01

    Full Text Available We investigate the performance of conventional bandwidth estimators for non-parametric kernel density estimation on a number of representative pattern-recognition tasks, to gain a better understanding of the behaviour of these estimators in high...
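
    As a point of reference for what a conventional bandwidth estimator looks like, the sketch below computes a one-dimensional Gaussian kernel density estimate with Silverman's rule-of-thumb bandwidth. It is illustrative only: the record does not say which estimators were compared, and the function names and the synthetic sample are assumptions.

```python
import numpy as np

def silverman_bandwidth(x):
    """Silverman's rule-of-thumb bandwidth for a 1-D Gaussian kernel."""
    x = np.asarray(x, dtype=float)
    n = x.size
    sd = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    iqr = q75 - q25
    # Use the smaller of the two spread measures; fall back to sd if IQR is 0.
    spread = min(sd, iqr / 1.34) if iqr > 0 else sd
    return 0.9 * spread * n ** (-0.2)

def kde_gaussian(x_eval, data, h):
    """Evaluate the Gaussian kernel density estimate at the points x_eval."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    data = np.asarray(data, dtype=float)
    u = (x_eval[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (data.size * h * np.sqrt(2 * np.pi))

# Hypothetical data: 500 draws from a standard normal distribution.
rng = np.random.default_rng(0)
sample = rng.normal(size=500)
h = silverman_bandwidth(sample)
grid = np.linspace(-6, 6, 1201)
density = kde_gaussian(grid, sample, h)
```

    A rule of thumb like this assumes roughly Gaussian data, so it tends to oversmooth multimodal densities.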

  2. Discrete non-parametric kernel estimation for global sensitivity analysis

    International Nuclear Information System (INIS)

    Senga Kiessé, Tristan; Ventura, Anne

    2016-01-01

    This work investigates the discrete kernel approach for evaluating the contribution of the variance of discrete input variables to the variance of model output, via analysis of variance (ANOVA) decomposition. Until recently only the continuous kernel approach had been applied as a metamodeling approach within the sensitivity analysis framework, for both discrete and continuous input variables. Discrete kernel estimation is now known to be suitable for smoothing discrete functions. We present a discrete non-parametric kernel estimator of the ANOVA decomposition of a given model. An estimator of sensitivity indices is also presented with its asymptotic convergence rate. Simulations on a test function and a real case study from agriculture have shown that the discrete kernel approach outperforms the continuous kernel one for evaluating the contribution of moderate or most influential discrete parameters to the model output. - Highlights: • We study a discrete kernel estimation for sensitivity analysis of a model. • A discrete kernel estimator of ANOVA decomposition of the model is presented. • Sensitivity indices are calculated for discrete input parameters. • An estimator of sensitivity indices is also presented with its convergence rate. • An application is realized for improving the reliability of environmental models.

  3. Panel data specifications in nonparametric kernel regression

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...

  4. The Kernel Mixture Network: A Nonparametric Method for Conditional Density Estimation of Continuous Random Variables

    OpenAIRE

    Ambrogioni, Luca; Güçlü, Umut; van Gerven, Marcel A. J.; Maris, Eric

    2017-01-01

    This paper introduces the kernel mixture network, a new method for nonparametric estimation of conditional probability densities using neural networks. We model arbitrarily complex conditional densities as linear combinations of a family of kernel functions centered at a subset of training points. The weights are determined by the outer layer of a deep neural network, trained by minimizing the negative log likelihood. This generalizes the popular quantized softmax approach, which can be seen ...
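
    The density form described above, a weighted combination of kernels centered at selected training points, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the neural network is replaced by a fixed, hypothetical vector of logits, the weights are assumed to come from a softmax, and Gaussian kernels with a shared width `sigma` are an assumption.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax, used here to obtain mixture weights."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kernel_mixture_density(y, centers, logits, sigma):
    """Density at y as a convex combination of Gaussian kernels at `centers`."""
    centers = np.asarray(centers, dtype=float)
    weights = softmax(np.asarray(logits, dtype=float))
    y = np.atleast_1d(np.asarray(y, dtype=float))
    u = (y[:, None] - centers[None, :]) / sigma
    kernels = np.exp(-0.5 * u**2) / (sigma * np.sqrt(2 * np.pi))
    return kernels @ weights

# Hypothetical kernel centers and network outputs for a single input x.
centers = np.array([-1.0, 0.0, 2.0])
logits = np.array([0.2, 1.5, -0.3])
grid = np.linspace(-8, 10, 1801)
density = kernel_mixture_density(grid, centers, logits, sigma=0.5)
```

    Because the weights are non-negative and sum to one, the result is a valid density for any logits, which is what makes the negative log likelihood a usable training loss.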

  5. The Use of Nonparametric Kernel Regression Methods in Econometric Production Analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard

    This PhD thesis addresses one of the fundamental problems in applied econometric analysis, namely the econometric estimation of regression functions. The conventional approach to regression analysis is the parametric approach, which requires the researcher to specify the form of the regression... and nonparametric estimations of production functions in order to evaluate the optimal firm size. The second paper discusses the use of parametric and nonparametric regression methods to estimate panel data regression models. The third paper analyses production risk, price uncertainty, and farmers' risk preferences within a nonparametric panel data regression framework. The fourth paper analyses the technical efficiency of dairy farms with environmental output using nonparametric kernel regression in a semiparametric stochastic frontier analysis. The results provided in this PhD thesis show that nonparametric...

  6. Consistent Estimation of Pricing Kernels from Noisy Price Data

    OpenAIRE

    Vladislav Kargin

    2003-01-01

    If pricing kernels are assumed non-negative then the inverse problem of finding the pricing kernel is well-posed. The constrained least squares method provides a consistent estimate of the pricing kernel. When the data are limited, a new method is suggested: relaxed maximization of the relative entropy. This estimator is also consistent. Keywords: $\epsilon$-entropy, non-parametric estimation, pricing kernel, inverse problems.

  7. The Kernel Estimation in Biosystems Engineering

    Directory of Open Access Journals (Sweden)

    Esperanza Ayuga Téllez

    2008-04-01

    Full Text Available In many fields of biosystems engineering, it is common to find works in which the statistical information analysed violates the basic hypotheses required by conventional forecasting methods. For those situations, it is necessary to find alternative methods that allow statistical analysis despite those violations. Non-parametric function estimation includes methods that fit a target function locally, using data from a small neighbourhood of the point. Weak assumptions, such as continuity and differentiability of the target function, are used rather than an "a priori" assumption about the global shape of the target function (e.g., linear or quadratic). In this paper a few basic decision rules are set out for the application of the non-parametric estimation method. These statistical rules are the first step towards building a user-method interface for the consistent application of kernel estimation by non-expert users. To reach this aim, univariate and multivariate estimation methods and density functions were analysed, as well as regression estimators. In some cases the models to be applied in different situations, based on simulations, were defined. Different biosystems engineering applications of kernel estimation are also analysed in this review.

  8. On Improving Convergence Rates for Nonnegative Kernel Density Estimators

    OpenAIRE

    Terrell, George R.; Scott, David W.

    1980-01-01

    To improve the rate of decrease of integrated mean square error for nonparametric kernel density estimators beyond $O(n^{-\frac{4}{5}})$, we must relax the constraint that the density estimate be a bona fide density function, that is, be nonnegative and integrate to one. All current methods for kernel (and orthogonal series) estimators relax the nonnegativity constraint. In this paper we show how to achieve similar improvement by relaxing the integral constraint only. This is important in appl...

  9. Nonparametric Regression Estimation for Multivariate Null Recurrent Processes

    Directory of Open Access Journals (Sweden)

    Biqing Cai

    2015-04-01

    Full Text Available This paper discusses nonparametric kernel regression with the regressor being a \(d\)-dimensional \(\beta\)-null recurrent process in the presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate \(\sqrt{n(T)h^{d}}\), where \(n(T)\) is the number of regenerations for a \(\beta\)-null recurrent process, and the limiting distribution (with proper normalization) is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross validation method is used for bandwidth selection. We apply the proposed method to study the relationship of the Federal funds rate with 3-month and 5-year T-bill rates and discover the existence of nonlinearity in the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than that of the linear model.
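
    The leave-one-out cross-validation step mentioned above can be sketched for a plain Nadaraya-Watson mean estimator. The sketch assumes a scalar regressor, a Gaussian kernel and independent synthetic data, none of which matches the null recurrent setting of the paper; all names and the bandwidth grid are hypothetical.

```python
import numpy as np

def kernel_weights(x_eval, x, h):
    """Gaussian kernel weights between evaluation points and observations."""
    u = (x_eval[:, None] - x[None, :]) / h
    return np.exp(-0.5 * u**2)

def nadaraya_watson(x_eval, x, y, h):
    """Nadaraya-Watson estimate of E[y | x] at the points x_eval."""
    w = kernel_weights(np.atleast_1d(x_eval), x, h)
    return (w @ y) / w.sum(axis=1)

def loo_cv_score(x, y, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    w = kernel_weights(x, x, h)
    np.fill_diagonal(w, 0.0)  # each point is predicted without itself
    pred = (w @ y) / w.sum(axis=1)
    return np.mean((y - pred) ** 2)

# Hypothetical data: a noisy sine curve.
rng = np.random.default_rng(1)
x = rng.uniform(0, 2 * np.pi, 200)
y = np.sin(x) + 0.3 * rng.normal(size=200)

candidates = np.linspace(0.05, 1.5, 30)
scores = [loo_cv_score(x, y, h) for h in candidates]
h_best = candidates[int(np.argmin(scores))]
fit = nadaraya_watson(np.array([np.pi / 2]), x, y, h_best)
```

    Zeroing the diagonal of the weight matrix is what turns the in-sample fit into a leave-one-out prediction; without it the criterion would always favour the smallest bandwidth.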

  10. Nonparametric Bayesian density estimation on manifolds with applications to planar shapes.

    Science.gov (United States)

    Bhattacharya, Abhishek; Dunson, David B

    2010-12-01

    Statistical analysis on landmark-based shape spaces has diverse applications in morphometrics, medical diagnostics, machine vision and other areas. These shape spaces are non-Euclidean quotient manifolds. To conduct nonparametric inferences, one may define notions of centre and spread on this manifold and work with their estimates. However, it is useful to consider full likelihood-based methods, which allow nonparametric estimation of the probability density. This article proposes a broad class of mixture models constructed using suitable kernels on a general compact metric space and then on the planar shape space in particular. Following a Bayesian approach with a nonparametric prior on the mixing distribution, conditions are obtained under which the Kullback-Leibler property holds, implying large support and weak posterior consistency. Gibbs sampling methods are developed for posterior computation, and the methods are applied to problems in density estimation and classification with shape-based predictors. Simulation studies show improved estimation performance relative to existing approaches.

  11. A survey of kernel-type estimators for copula and their applications

    Science.gov (United States)

    Sumarjaya, I. W.

    2017-10-01

    Copulas have been widely used to model nonlinear dependence structure. The main applications of copulas include areas such as finance, insurance, hydrology and rainfall, to name but a few. The flexibility of copulas allows researchers to model dependence structure beyond the Gaussian distribution. Basically, a copula is a function that couples multivariate distribution functions to their one-dimensional marginal distribution functions. In general, there are three methods to estimate a copula: parametric, nonparametric, and semiparametric. In this article we survey kernel-type estimators for copulas, such as the mirror reflection kernel, the beta kernel, the transformation method and the local likelihood transformation method. Then, we apply these kernel methods to three stock indexes in Asia. The results of our analysis suggest that, despite variation in information criterion values, the local likelihood transformation method performs better than the other kernel methods.

  12. Adaptive nonparametric estimation for Lévy processes observed at low frequency

    OpenAIRE

    Kappus, Johanna

    2013-01-01

    This article deals with adaptive nonparametric estimation for Lévy processes observed at low frequency. For general linear functionals of the Lévy measure, we construct kernel estimators, provide upper risk bounds and derive rates of convergence under regularity assumptions. Our focus lies on the adaptive choice of the bandwidth, using model selection techniques. We face here a non-standard problem of model selection with unknown variance. A new approach towards this problem is proposed, ...

  13. Genomic breeding value estimation using nonparametric additive regression models

    Directory of Open Access Journals (Sweden)

    Solberg Trygve

    2009-01-01

    Full Text Available Genomic selection refers to the use of genomewide dense markers for breeding value estimation and subsequently for selection. The main challenge of genomic breeding value estimation is the estimation of many effects from a limited number of observations. Bayesian methods have been proposed to successfully cope with these challenges. As an alternative class of models, non- and semiparametric models were recently introduced. The present study investigated the ability of nonparametric additive regression models to predict genomic breeding values. The genotypes were modelled for each marker or pair of flanking markers (i.e. the predictors) separately. The nonparametric functions for the predictors were estimated simultaneously using additive model theory, applying a binomial kernel. The optimal degree of smoothing was determined by bootstrapping. A mutation-drift-balance simulation was carried out. The breeding values of the last generation (genotyped) were predicted using data from the next-to-last generation (genotyped and phenotyped). The results show moderate to high accuracies of the predicted breeding values. Determining a predictor-specific degree of smoothing increased the accuracy.

  14. Optimal Bandwidth Selection for Kernel Density Functionals Estimation

    Directory of Open Access Journals (Sweden)

    Su Chen

    2015-01-01

    Full Text Available The choice of bandwidth is crucial to kernel density estimation (KDE) and kernel-based regression. Various bandwidth selection methods for KDE and local least squares regression have been developed in the past decade. It is known that scale and location parameters are proportional to density functionals ∫γ(x)f²(x)dx with an appropriate choice of γ(x), and furthermore that equality of scale and location tests can be transformed to comparisons of the density functionals among populations. ∫γ(x)f²(x)dx can be estimated nonparametrically via kernel density functionals estimation (KDFE). However, the optimal bandwidth selection for KDFE of ∫γ(x)f²(x)dx has not been examined. We propose a method to select the optimal bandwidth for the KDFE. The idea underlying this method is to search for the optimal bandwidth by minimizing the mean square error (MSE) of the KDFE. Two main practical bandwidth selection techniques for the KDFE of ∫γ(x)f²(x)dx are provided: normal scale bandwidth selection (namely, “Rule of Thumb”) and direct plug-in bandwidth selection. Simulation studies show that our proposed bandwidth selection methods are superior to existing density estimation bandwidth selection methods in estimating density functionals.
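
    To make the functional ∫γ(x)f²(x)dx concrete: it equals E[γ(X)f(X)], so one simple estimator averages γ(X_i) times a leave-one-out kernel density estimate at X_i. The sketch below uses that form with a Gaussian kernel and a fixed bandwidth; it is not the bandwidth selector proposed in the paper, and the function name, the default γ(x) = 1 and the value h = 0.3 are assumptions.

```python
import numpy as np

def density_functional(data, h, gamma=lambda x: np.ones_like(x)):
    """Estimate the integral of gamma(x) * f(x)^2 from a 1-D sample."""
    data = np.asarray(data, dtype=float)
    n = data.size
    u = (data[:, None] - data[None, :]) / h
    k = np.exp(-0.5 * u**2) / (h * np.sqrt(2 * np.pi))
    np.fill_diagonal(k, 0.0)           # leave each point out of its own estimate
    f_loo = k.sum(axis=1) / (n - 1)    # leave-one-out density at each X_i
    return np.mean(gamma(data) * f_loo)

# Hypothetical check: for a standard normal, the integral of f^2 is 1/(2*sqrt(pi)).
rng = np.random.default_rng(2)
sample = rng.normal(size=1000)
estimate = density_functional(sample, h=0.3)
exact = 1.0 / (2.0 * np.sqrt(np.pi))
```

    The estimate depends on h through a smoothing bias, which is why a bandwidth chosen to minimize the MSE of the functional need not coincide with one chosen for the density itself.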

  15. Robustifying Bayesian nonparametric mixtures for count data.

    Science.gov (United States)

    Canale, Antonio; Prünster, Igor

    2017-03-01

    Our motivating application stems from surveys of natural populations and is characterized by large spatial heterogeneity in the counts, which makes parametric approaches to modeling local animal abundance too restrictive. We adopt a Bayesian nonparametric approach based on mixture models and innovate with respect to the popular Dirichlet process mixture of Poisson kernels by increasing the model flexibility at the level of both the kernel and the nonparametric mixing measure. This allows us to derive accurate and robust estimates of the distribution of local animal abundance and of the corresponding clusters. The application and a simulation study for different scenarios also yield some general methodological implications. Adding flexibility solely at the level of the mixing measure does not improve inferences, since its impact is severely limited by the rigidity of the Poisson kernel with considerable consequences in terms of bias. However, once a kernel more flexible than the Poisson is chosen, inferences can be robustified by choosing a prior more general than the Dirichlet process. Therefore, to improve the performance of Bayesian nonparametric mixtures for count data one has to enrich the model simultaneously at both levels, the kernel and the mixing measure. © 2016, The International Biometric Society.

  16. Adaptive Nonparametric Variance Estimation for a Ratio Estimator ...

    African Journals Online (AJOL)

    Kernel estimators for smooth curves require modifications when estimating near end points of the support, both for practical and asymptotic reasons. The construction of such boundary kernels as solutions of a variational problem is a difficult exercise. For estimating the error variance of a ratio estimator, we suggest an ...

  17. Multivariate and semiparametric kernel regression

    OpenAIRE

    Härdle, Wolfgang; Müller, Marlene

    1997-01-01

    The paper gives an introduction to theory and application of multivariate and semiparametric kernel smoothing. Multivariate nonparametric density estimation is an often used pilot tool for examining the structure of data. Regression smoothing helps in investigating the association between covariates and responses. We concentrate on kernel smoothing using local polynomial fitting which includes the Nadaraya-Watson estimator. Some theory on the asymptotic behavior and bandwidth selection is pro...

  18. A NONPARAMETRIC HYPOTHESIS TEST VIA THE BOOTSTRAP RESAMPLING

    OpenAIRE

    Temel, Tugrul T.

    2001-01-01

    This paper adapts an already existing nonparametric hypothesis test to the bootstrap framework. The test utilizes the nonparametric kernel regression method to estimate a measure of distance between the models stated under the null hypothesis. The bootstrapped version of the test allows one to approximate the errors involved in the asymptotic hypothesis test. The paper also develops Mathematica code for the test algorithm.

  19. Nonparametric conditional predictive regions for time series

    NARCIS (Netherlands)

    de Gooijer, J.G.; Zerom Godefay, D.

    2000-01-01

    Several nonparametric predictors based on the Nadaraya-Watson kernel regression estimator have been proposed in the literature. They include the conditional mean, the conditional median, and the conditional mode. In this paper, we consider three types of predictive regions for these predictors — the

  20. Kernel PLS Estimation of Single-trial Event-related Potentials

    Science.gov (United States)

    Rosipal, Roman; Trejo, Leonard J.

    2004-01-01

    Nonlinear kernel partial least squares (KPLS) regression is a novel smoothing approach to nonparametric regression curve fitting. We have developed a KPLS approach to the estimation of single-trial event related potentials (ERPs). For improved accuracy of estimation, we also developed a local KPLS method for situations in which there exists prior knowledge about the approximate latency of individual ERP components. To assess the utility of the KPLS approach, we compared non-local KPLS and local KPLS smoothing with other nonparametric signal processing and smoothing methods. In particular, we examined wavelet denoising, smoothing splines, and localized smoothing splines. We applied these methods to the estimation of simulated mixtures of human ERPs and ongoing electroencephalogram (EEG) activity using a dipole simulator (BESA). In this scenario we considered ongoing EEG to represent spatially and temporally correlated noise added to the ERPs. This simulation provided a reasonable but simplified model of real-world ERP measurements. For estimation of the simulated single-trial ERPs, local KPLS provided a level of accuracy that was comparable with or better than the other methods. We also applied the local KPLS method to the estimation of human ERPs recorded in an experiment on cognitive fatigue. For these data, the local KPLS method provided a clear improvement in visualization of single-trial ERPs as well as their averages. The local KPLS method may serve as a new alternative for the estimation of single-trial ERPs and improvement of ERP averages.

  1. Bayesian Bandwidth Selection for a Nonparametric Regression Model with Mixed Types of Regressors

    Directory of Open Access Journals (Sweden)

    Xibin Zhang

    2016-04-01

    Full Text Available This paper develops a sampling algorithm for bandwidth estimation in a nonparametric regression model with continuous and discrete regressors under an unknown error density. The error density is approximated by the kernel density estimator of the unobserved errors, while the regression function is estimated using the Nadaraya-Watson estimator admitting continuous and discrete regressors. We derive an approximate likelihood and posterior for bandwidth parameters, followed by a sampling algorithm. Simulation results show that the proposed approach typically leads to better accuracy of the resulting estimates than cross-validation, particularly for smaller sample sizes. This bandwidth estimation approach is applied to a nonparametric regression model of the Australian All Ordinaries returns and to kernel density estimation of gross domestic product (GDP) growth rates among the Organisation for Economic Co-operation and Development (OECD) and non-OECD countries.

  2. Global Polynomial Kernel Hazard Estimation

    DEFF Research Database (Denmark)

    Hiabu, Munir; Miranda, Maria Dolores Martínez; Nielsen, Jens Perch

    2015-01-01

    This paper introduces a new bias reducing method for kernel hazard estimation. The method is called global polynomial adjustment (GPA). It is a global correction which is applicable to any kernel hazard estimator. The estimator works well from a theoretical point of view as it asymptotically redu...

  3. Strong consistency of nonparametric Bayes density estimation on compact metric spaces with applications to specific manifolds.

    Science.gov (United States)

    Bhattacharya, Abhishek; Dunson, David B

    2012-08-01

    This article considers a broad class of kernel mixture density models on compact metric spaces and manifolds. Following a Bayesian approach with a nonparametric prior on the location mixing distribution, sufficient conditions are obtained on the kernel, prior and the underlying space for strong posterior consistency at any continuous density. The prior is also allowed to depend on the sample size n and sufficient conditions are obtained for weak and strong consistency. These conditions are verified on compact Euclidean spaces using multivariate Gaussian kernels, on the hypersphere using a von Mises-Fisher kernel and on the planar shape space using complex Watson kernels.

  4. Adaptive metric kernel regression

    DEFF Research Database (Denmark)

    Goutte, Cyril; Larsen, Jan

    2000-01-01

    Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...

  5. Nonparametric Inference of Doubly Stochastic Poisson Process Data via the Kernel Method.

    Science.gov (United States)

    Zhang, Tingting; Kou, S C

    2010-01-01

    Doubly stochastic Poisson processes, also known as the Cox processes, frequently occur in various scientific fields. In this article, motivated primarily by analyzing Cox process data in biophysics, we propose a nonparametric kernel-based inference method. We conduct a detailed study, including an asymptotic analysis, of the proposed method, and provide guidelines for its practical use, introducing a fast and stable regression method for bandwidth selection. We apply our method to real photon arrival data from recent single-molecule biophysical experiments, investigating proteins' conformational dynamics. Our result shows that conformational fluctuation is widely present in protein systems, and that the fluctuation covers a broad range of time scales, highlighting the dynamic and complex nature of proteins' structure.

  6. Adaptive Metric Kernel Regression

    DEFF Research Database (Denmark)

    Goutte, Cyril; Larsen, Jan

    1998-01-01

    Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...

  7. A nonparametric mixture model for cure rate estimation.

    Science.gov (United States)

    Peng, Y; Dear, K B

    2000-03-01

    Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.

  8. Non-Parametric Estimation of Correlation Functions

    DEFF Research Database (Denmark)

    Brincker, Rune; Rytter, Anders; Krenk, Steen

    In this paper three methods of non-parametric correlation function estimation are reviewed and evaluated: the direct method, estimation by the Fast Fourier Transform and finally estimation by the Random Decrement technique. The basic ideas of the techniques are reviewed, sources of bias are point...

  9. Low default credit scoring using two-class non-parametric kernel density estimation

    CSIR Research Space (South Africa)

    Rademeyer, E

    2016-12-01

    Full Text Available This paper investigates the performance of two-class classification credit scoring data sets with low default ratios. The standard two-class parametric Gaussian and non-parametric Parzen classifiers are extended, using Bayes’ rule, to include either...

  10. Nonparametric evaluation of dynamic disease risk: a spatio-temporal kernel approach.

    Directory of Open Access Journals (Sweden)

    Zhijie Zhang

    Full Text Available Quantifying the distributions of disease risk in space and time jointly is a key element for understanding spatio-temporal phenomena while also having the potential to enhance our understanding of epidemiologic trajectories. However, most studies to date have neglected the time dimension and focus instead on the "average" spatial pattern of disease risk, thereby masking time trajectories of disease risk. In this study we propose a new idea titled "spatio-temporal kernel density estimation (stKDE)" that employs hybrid kernel (i.e., weight) functions to evaluate the spatio-temporal disease risks. This approach not only can make full use of sample data but also "borrows" information in a particular manner from neighboring points both in space and time via appropriate choice of kernel functions. Monte Carlo simulations show that the proposed method performs substantially better than the traditional (i.e., frequency-based) kernel density estimation (trKDE) which has been used in applied settings, while two illustrative examples demonstrate that the proposed approach can yield superior results compared to the popular trKDE approach. In addition, there exist various possibilities for improving and extending this method.
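
    The idea of borrowing information from neighbours in both space and time can be illustrated with a product kernel: each event contributes a spatial weight multiplied by a temporal weight. The sketch below assumes Gaussian kernels in both dimensions and separate bandwidths `h_space` and `h_time`; the record does not specify the hybrid kernel actually used, so every name and value here is hypothetical.

```python
import numpy as np

def st_kde(query_xy, query_t, events_xy, events_t, h_space, h_time):
    """Spatio-temporal density at query points from a product Gaussian kernel."""
    # Squared spatial distances between every query point and every event.
    d2 = ((query_xy[:, None, :] - events_xy[None, :, :]) ** 2).sum(axis=2)
    k_space = np.exp(-0.5 * d2 / h_space**2) / (2 * np.pi * h_space**2)
    # Scaled time differences between every query time and every event time.
    dt = (query_t[:, None] - events_t[None, :]) / h_time
    k_time = np.exp(-0.5 * dt**2) / (h_time * np.sqrt(2 * np.pi))
    return (k_space * k_time).mean(axis=1)

# Hypothetical events clustered around the origin at times near zero.
rng = np.random.default_rng(3)
events_xy = rng.normal(size=(300, 2))
events_t = rng.normal(size=300)

query_xy = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 0.0]])
query_t = np.array([0.0, 0.0, 10.0])
density = st_kde(query_xy, query_t, events_xy, events_t, h_space=0.5, h_time=0.5)
```

    The first query point sits inside the cluster in both space and time, so it receives a much higher density than the second (far away in space) or the third (far away in time).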

  11. Nonparametric Mixture of Regression Models.

    Science.gov (United States)

    Huang, Mian; Li, Runze; Wang, Shaoli

    2013-07-01

    Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.

  12. Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.

    Science.gov (United States)

    Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen

    2011-04-01

    Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.

  13. Nonparametric Identification of Glucose-Insulin Process in IDDM Patient with Multi-meal Disturbance

    Science.gov (United States)

    Bhattacharjee, A.; Sutradhar, A.

    2012-12-01

    Modern closed-loop control of the blood glucose level in a diabetic patient necessarily uses an explicit model of the process. A fixed-parameter full-order or reduced-order model does not characterize the inter-patient and intra-patient parameter variability. This paper deals with a frequency domain nonparametric identification of the nonlinear glucose-insulin process in an insulin dependent diabetes mellitus patient that captures the process dynamics in the presence of uncertainties and parameter variations. An online frequency domain kernel estimation method has been proposed that uses the input-output data from the 19th order first principle model of the patient in the intravenous route. Volterra equations up to second order kernels with extended input vector for a Hammerstein model are solved online by an adaptive recursive least squares (ARLS) algorithm. The frequency domain kernels are estimated using the harmonic excitation input data sequence from the virtual patient model. A short filter memory length of M = 2 was found sufficient to yield acceptable accuracy with less computation time. The nonparametric models are useful for closed-loop control, where the frequency domain kernels can be directly used as the transfer function. The validation results show good fit in both frequency and time domain responses with the nominal patient as well as with parameter variations.

  14. Panel data nonparametric estimation of production risk and risk preferences

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    We apply nonparametric panel data kernel regression to investigate production risk, output price uncertainty, and risk attitudes of Polish dairy farms based on a firm-level unbalanced panel data set that covers the period 2004–2010. We compare different model specifications and different approaches for obtaining firm-specific measures of risk attitudes. We found that Polish dairy farmers are risk averse regarding production risk and price uncertainty. According to our results, Polish dairy farmers perceive the production risk as being more significant than the risk related to output price...

  15. Nonparametric methods for volatility density estimation

    NARCIS (Netherlands)

    Es, van Bert; Spreij, P.J.C.; Zanten, van J.H.

    2009-01-01

    Stochastic volatility modelling of financial processes has become increasingly popular. The proposed models usually contain a stationary volatility process. We will motivate and review several nonparametric methods for estimation of the density of the volatility process. Both models based on

  16. Nonparametric e-Mixture Estimation.

    Science.gov (United States)

    Takano, Ken; Hino, Hideitsu; Akaho, Shotaro; Murata, Noboru

    2016-12-01

    This study considers the common situation in data analysis when there are few observations of the distribution of interest or the target distribution, while abundant observations are available from auxiliary distributions. In this situation, it is natural to compensate for the lack of data from the target distribution by using data sets from these auxiliary distributions, in other words, approximating the target distribution in a subspace spanned by a set of auxiliary distributions. Mixture modeling is one of the simplest ways to integrate information from the target and auxiliary distributions in order to express the target distribution as accurately as possible. There are two typical mixtures in the context of information geometry: the m- and e-mixtures. The m-mixture is applied in a variety of research fields because of the presence of the well-known expectation-maximization algorithm for parameter estimation, whereas the e-mixture is rarely used because of its difficulty of estimation, particularly for nonparametric models. The e-mixture, however, is a well-tempered distribution that satisfies the principle of maximum entropy. To model a target distribution with scarce observations accurately, this letter proposes a novel framework for a nonparametric modeling of the e-mixture and a geometrically inspired estimation algorithm. As numerical examples of the proposed framework, a transfer learning setup is considered. The experimental results show that this framework works well for three types of synthetic data sets, as well as an EEG real-world data set.

  17. Nonparametric estimation in models for unobservable heterogeneity

    OpenAIRE

    Hohmann, Daniel

    2014-01-01

    Nonparametric models which allow for data with unobservable heterogeneity are studied. The first publication introduces new estimators and their asymptotic properties for conditional mixture models. The second publication considers estimation of a function from noisy observations of its Radon transform in a Gaussian white noise model.

  18. Nonparametric NAR-ARCH Modelling of Stock Prices by the Kernel Methodology

    Directory of Open Access Journals (Sweden)

    Mohamed Chikhi

    2018-02-01

    Full Text Available This paper analyses the cyclical behaviour of the Orange stock price listed on the French stock exchange over 01/03/2000 to 02/02/2017 by testing for nonlinearities through a class of conditional heteroscedastic nonparametric models. The linearity and Gaussianity assumptions are rejected for Orange stock returns, and informational shocks have transitory effects on returns and volatility. The forecasting results show that Orange stock prices are short-term predictable and that the nonparametric NAR-ARCH model outperforms the parametric MA-APARCH model for short horizons. In addition, the estimates of this model are better than the predictions of the random walk model. This finding provides evidence of weak-form inefficiency in the Paris stock market with limited rationality, which gives rise to arbitrage opportunities.

  19. Kernel density estimation-based real-time prediction for respiratory motion

    International Nuclear Information System (INIS)

    Ruan, Dan

    2010-01-01

    Effective delivery of adaptive radiotherapy requires locating the target with high precision in real time. System latency caused by data acquisition, streaming, processing and delivery control necessitates prediction. Prediction is particularly challenging for highly mobile targets such as thoracic and abdominal tumors undergoing respiration-induced motion. The complexity of the respiratory motion makes it difficult to build and justify explicit models. In this study, we honor the intrinsic uncertainties in respiratory motion and propose a statistical treatment of the prediction problem. Instead of asking for a deterministic covariate-response map and a unique estimate value for future target position, we aim to obtain a distribution of the future target position (response variable) conditioned on the observed historical sample values (covariate variable). The key idea is to estimate the joint probability distribution (pdf) of the covariate and response variables using an efficient kernel density estimation method. Then, the problem of identifying the distribution of the future target position reduces to identifying the section in the joint pdf based on the observed covariate. Subsequently, estimators are derived based on this estimated conditional distribution. This probabilistic perspective has some distinctive advantages over existing deterministic schemes: (1) it is compatible with potentially inconsistent training samples, i.e., when close covariate variables correspond to dramatically different response values; (2) it is not restricted by any prior structural assumption on the map between the covariate and the response; (3) the two-stage setup allows much freedom in choosing statistical estimates and provides a full nonparametric description of the uncertainty for the resulting estimate. We evaluated the prediction performance on ten patient RPM traces, using the root mean squared difference between the prediction and the observed value normalized by the
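
    The two-stage idea in this abstract (estimate a joint density of covariate and response, then take the section at the observed covariate) can be sketched as follows. This is a minimal illustration, not the author's implementation: the product-Gaussian kernel, the fixed bandwidths `hx` and `hy`, and all function names are assumptions made for the example.

```python
import numpy as np


def conditional_density(x_obs, x_train, y_train, y_grid, hx, hy):
    """Estimate p(y | x = x_obs) on y_grid from a product-Gaussian joint KDE.

    x_train: (n, p) historical covariates, y_train: (n,) responses.
    hx, hy: assumed fixed bandwidths for covariate and response.
    """
    # Covariate proximity gives each training pair a weight; normalising the
    # weights is the same as dividing the joint KDE section by the marginal.
    # (If x_obs is far from all training covariates the weights can underflow.)
    d2 = np.sum((x_train - x_obs) ** 2, axis=1)
    w = np.exp(-0.5 * d2 / hx**2)
    w = w / w.sum()
    # The section of the joint KDE is then a weighted Gaussian mixture in y.
    u = (y_grid[:, None] - y_train[None, :]) / hy
    kernel = np.exp(-0.5 * u**2) / (np.sqrt(2.0 * np.pi) * hy)
    return kernel @ w


def conditional_mean(dens, y_grid):
    """One possible point estimate derived from the conditional density."""
    dy = y_grid[1] - y_grid[0]
    return float(np.sum(dens * y_grid) * dy)


if __name__ == "__main__":
    # Hypothetical breathing-like trace: predict 10 samples ahead from 2 lags.
    s = np.sin(0.1 * np.arange(400))
    idx = np.arange(290)
    x_train = np.column_stack([s[idx], s[idx + 5]])
    y_train = s[idx + 10]
    y_grid = np.linspace(-2.0, 2.0, 801)
    dens = conditional_density(np.array([s[320], s[325]]),
                               x_train, y_train, y_grid, hx=0.05, hy=0.05)
    print(conditional_mean(dens, y_grid), s[330])
```

    Other estimators (for example the conditional mode or median) could be read off the same `dens` array, which is the freedom the two-stage setup is meant to provide.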

  20. Adaptive Estimation of Heteroscedastic Money Demand Model of Pakistan

    Directory of Open Access Journals (Sweden)

    Muhammad Aslam

    2007-07-01

    Full Text Available For the problem of estimating the money demand model of Pakistan, money supply (M1) shows heteroscedasticity of unknown form. To estimate such a model, we compare two adaptive estimators with the ordinary least squares estimator and show the attractive performance of the adaptive estimators, namely, the nonparametric kernel estimator and the nearest neighbour regression estimator. These comparisons are made on the basis of standard errors of the estimated coefficients, standard error of regression, Akaike Information Criterion (AIC) value, and the Durbin-Watson statistic for autocorrelation. We further show that the nearest neighbour regression estimator performs better than the nonparametric kernel estimator.

  1. Rigorous home range estimation with movement data: a new autocorrelated kernel density estimator.

    Science.gov (United States)

    Fleming, C H; Fagan, W F; Mueller, T; Olson, K A; Leimgruber, P; Calabrese, J M

    2015-05-01

    Quantifying animals' home ranges is a key problem in ecology and has important conservation and wildlife management applications. Kernel density estimation (KDE) is a workhorse technique for range delineation problems that is both statistically efficient and nonparametric. KDE assumes that the data are independent and identically distributed (IID). However, animal tracking data, which are routinely used as inputs to KDEs, are inherently autocorrelated and violate this key assumption. As we demonstrate, using realistically autocorrelated data in conventional KDEs results in grossly underestimated home ranges. We further show that the performance of conventional KDEs actually degrades as data quality improves, because autocorrelation strength increases as movement paths become more finely resolved. To remedy these flaws with the traditional KDE method, we derive an autocorrelated KDE (AKDE) from first principles to use autocorrelated data, making it perfectly suited for movement data sets. We illustrate the vastly improved performance of AKDE using analytical arguments, relocation data from Mongolian gazelles, and simulations based upon the gazelle's observed movement process. By yielding better minimum area estimates for threatened wildlife populations, we believe that future widespread use of AKDE will have significant impact on ecology and conservation biology.

  2. Variable Kernel Density Estimation

    OpenAIRE

    Terrell, George R.; Scott, David W.

    1992-01-01

    We investigate some of the possibilities for improvement of univariate and multivariate kernel density estimates by varying the window over the domain of estimation, pointwise and globally. Two general approaches are to vary the window width by the point of estimation and by point of the sample observation. The first possibility is shown to be of little efficacy in one variable. In particular, nearest-neighbor estimators in all versions perform poorly in one and two dimensions, but begin to b...

  3. Investigation of MLE in nonparametric estimation methods of reliability function

    International Nuclear Information System (INIS)

    Ahn, Kwang Won; Kim, Yoon Ik; Chung, Chang Hyun; Kim, Kil Yoo

    2001-01-01

    There have been many attempts to estimate a reliability function. In the ESReDA 20th seminar, a new nonparametric method was proposed. The major point of that paper is how to use censored data efficiently. Generally there are three kinds of approach to estimating a reliability function nonparametrically, i.e., the Reduced Sample Method, the Actuarial Method and the Product-Limit (PL) Method. These three methods have some limitations, so we suggest an advanced method that reflects censored information more efficiently. In many instances there will be a unique maximum likelihood estimator (MLE) of an unknown parameter, and often it may be obtained by the process of differentiation. It is well known that the three methods generally used to estimate a reliability function nonparametrically have maximum likelihood estimators that exist uniquely. So, the MLE of the new method is derived in this study. The procedure to calculate the MLE is similar to that of the PL estimator. The difference between the two is that in the new method the mass (or weight) of each has an influence on the others, whereas the mass in the PL estimator does not.
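
    For orientation, the standard Product-Limit (Kaplan-Meier) estimator named in this abstract can be sketched as below. This shows only the conventional PL method, not the new method the authors propose; the function name and the input layout (a time plus a failure/censoring flag per unit) are assumptions made for the example.

```python
import numpy as np


def product_limit(times, observed):
    """Product-Limit estimate of the reliability function R(t).

    times:    failure or censoring time of each unit
    observed: True/1 if the unit failed, False/0 if it was censored
    Returns the distinct failure times and R(t) just after each of them.
    """
    times = np.asarray(times, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    fail_times = np.unique(times[observed])
    reliability = []
    r = 1.0
    for t in fail_times:
        at_risk = np.sum(times >= t)                # units still under test
        failures = np.sum((times == t) & observed)  # failures exactly at t
        r *= 1.0 - failures / at_risk               # conditional survival
        reliability.append(r)
    return fail_times, np.array(reliability)


if __name__ == "__main__":
    # Five units; the second and fifth are censored.
    t, r = product_limit([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
    for ti, ri in zip(t, r):
        print(ti, ri)
```

    Censored units never trigger a step themselves; they only shrink the at-risk count for later failures, which is how the PL method uses censored information.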

  4. Variable kernel density estimation in high-dimensional feature spaces

    CSIR Research Space (South Africa)

    Van der Walt, Christiaan M

    2017-02-01

    Full Text Available Estimating the joint probability density function of a dataset is a central task in many machine learning applications. In this work we address the fundamental problem of kernel bandwidth estimation for variable kernel density estimation in high...

  5. Considering a non-polynomial basis for local kernel regression problem

    Science.gov (United States)

    Silalahi, Divo Dharma; Midi, Habshah

    2017-01-01

    A commonly used solution to the local kernel nonparametric regression problem is polynomial regression. In this study, we demonstrate the estimator and its properties using the maximum likelihood estimator for a non-polynomial basis, such as a B-spline, in place of the polynomial basis. This estimator allows for flexibility in the selection of a bandwidth and a knot. The best estimator was selected by finding an optimal bandwidth and knot through minimizing the well-known generalized validation function.

  6. Nonparametric model validations for hidden Markov models with applications in financial econometrics.

    Science.gov (United States)

    Zhao, Zhibiao

    2011-06-01

    We address the nonparametric model validation problem for hidden Markov models with partially observable variables and hidden states. We achieve this goal by constructing a nonparametric simultaneous confidence envelope for transition density function of the observable variables and checking whether the parametric density estimate is contained within such an envelope. Our specification test procedure is motivated by a functional connection between the transition density of the observable variables and the Markov transition kernel of the hidden states. Our approach is applicable for continuous time diffusion models, stochastic volatility models, nonlinear time series models, and models with market microstructure noise.

  7. Bivariate discrete beta Kernel graduation of mortality data.

    Science.gov (United States)

    Mazza, Angelo; Punzo, Antonio

    2015-07-01

    Various parametric/nonparametric techniques have been proposed in the literature to graduate mortality data as a function of age. Nonparametric approaches, as for example kernel smoothing regression, are often preferred because they do not assume any particular mortality law. Among the existing kernel smoothing approaches, the recently proposed (univariate) discrete beta kernel smoother has been shown to provide some benefits. Bivariate graduation, over age and calendar years or durations, is common practice in demography and actuarial sciences. In this paper, we generalize the discrete beta kernel smoother to the bivariate case, and we introduce an adaptive bandwidth variant that may provide additional benefits when data on exposures to the risk of death are available; furthermore, we outline a cross-validation procedure for bandwidth selection. Using simulation studies, we compare the bivariate approach proposed here with its corresponding univariate formulation and with two popular nonparametric bivariate graduation techniques, based on Epanechnikov kernels and on P-splines. To make simulations realistic, a bivariate dataset, based on probabilities of dying recorded for the US males, is used. Simulations have confirmed the gain in performance of the new bivariate approach with respect to both the univariate and the bivariate competitors.

  8. Nonparametric estimation of location and scale parameters

    KAUST Repository

    Potgieter, C.J.; Lombard, F.

    2012-01-01

    Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal

  9. Clustering via Kernel Decomposition

    DEFF Research Database (Denmark)

    Have, Anna Szynkowiak; Girolami, Mark A.; Larsen, Jan

    2006-01-01

    Methods for spectral clustering have been proposed recently which rely on the eigenvalue decomposition of an affinity matrix. In this work it is proposed that the affinity matrix is created based on the elements of a non-parametric density estimator. This matrix is then decomposed to obtain posterior probabilities of class membership using an appropriate form of nonnegative matrix factorization. The troublesome selection of hyperparameters such as kernel width and number of clusters can be obtained using standard cross-validation methods as is demonstrated on a number of diverse data sets....

  10. A Bayesian nonparametric estimation of distributions and quantiles

    International Nuclear Information System (INIS)

    Poern, K.

    1988-11-01

    The report describes a Bayesian, nonparametric method for the estimation of a distribution function and its quantiles. The method, presupposing random sampling, is nonparametric, so the user has to specify a prior distribution on a space of distributions (and not on a parameter space). In the current application, where the method is used to estimate the uncertainty of a parametric calculational model, the Dirichlet prior distribution is to a large extent determined by the first batch of Monte Carlo realizations. In this case the results of the estimation technique are very similar to the conventional empirical distribution function. The resulting posterior distribution is also Dirichlet, and thus facilitates the determination of probability (confidence) intervals at any given point in the space of interest. Another advantage is that the posterior distribution of a specified quantile can also be derived and utilized to determine a probability interval for that quantile. The method was devised for use in the PROPER code package for uncertainty and sensitivity analysis. (orig.)
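
    The Dirichlet-to-Dirichlet update mentioned in this abstract can be illustrated with a small sketch. It assumes a Dirichlet process prior with a base distribution function `prior_cdf` and a concentration parameter `alpha`; these names, and the Monte Carlo interval, are illustrative choices and are not taken from the report or the PROPER package. Under that assumption the value of the distribution function at a fixed point t has a Beta posterior.

```python
import numpy as np


def dp_posterior_at(t, data, prior_cdf, alpha):
    """Beta parameters of F(t) | data under an assumed Dirichlet process prior.

    prior_cdf: prior guess of the distribution function (callable)
    alpha:     prior weight, expressed in equivalent observations
    Both parameters must come out positive for the Beta law to be proper.
    """
    data = np.asarray(data, dtype=float)
    n = data.size
    below = np.sum(data <= t)                    # n times the empirical CDF
    a = alpha * prior_cdf(t) + below
    b = alpha * (1.0 - prior_cdf(t)) + (n - below)
    return a, b


def probability_interval(a, b, level=0.9, draws=20000, seed=0):
    """Equal-tailed probability interval for F(t), from Beta(a, b) draws."""
    rng = np.random.default_rng(seed)
    sample = rng.beta(a, b, size=draws)
    tail = 100.0 * (1.0 - level) / 2.0
    lo, hi = np.percentile(sample, [tail, 100.0 - tail])
    return float(lo), float(hi)


if __name__ == "__main__":
    # Hypothetical prior guess: uniform distribution on [0, 2].
    def uniform_cdf(t):
        return float(np.clip(t / 2.0, 0.0, 1.0))

    a, b = dp_posterior_at(0.5, [0.2, 0.4, 0.6, 0.8, 1.0], uniform_cdf, 5.0)
    print(a / (a + b), probability_interval(a, b))
```

    The posterior mean `a / (a + b)` is a weighted average of the prior guess and the empirical distribution function, so it approaches the empirical distribution function as the sample grows relative to `alpha`.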

  11. Estimation of Stochastic Volatility Models by Nonparametric Filtering

    DEFF Research Database (Denmark)

    Kanaya, Shin; Kristensen, Dennis

    2016-01-01

    /estimated volatility process replacing the latent process. Our estimation strategy is applicable to both parametric and nonparametric stochastic volatility models, and can handle both jumps and market microstructure noise. The resulting estimators of the stochastic volatility model will carry additional biases and variances due to the first-step estimation, but under regularity conditions we show that these vanish asymptotically and our estimators inherit the asymptotic properties of the infeasible estimators based on observations of the volatility process. A simulation study examines the finite-sample properties...

  12. Short-term forecasting of meteorological time series using Nonparametric Functional Data Analysis (NPFDA)

    Science.gov (United States)

    Curceac, S.; Ternynck, C.; Ouarda, T.

    2015-12-01

    Over the past decades, a substantial amount of research has been conducted to model and forecast climatic variables. In this study, Nonparametric Functional Data Analysis (NPFDA) methods are applied to forecast air temperature and wind speed time series in Abu Dhabi, UAE. The dataset consists of hourly measurements recorded for a period of 29 years, 1982-2010. The novelty of the Functional Data Analysis approach is in expressing the data as curves. In the present work, the focus is on daily forecasting and the functional observations (curves) express the daily measurements of the above-mentioned variables. We apply a non-linear regression model with a functional non-parametric kernel estimator. The computation of the estimator is performed using an asymmetrical quadratic kernel function for local weighting based on the bandwidth obtained by a cross validation procedure. The proximities between functional objects are calculated by families of semi-metrics based on derivatives and Functional Principal Component Analysis (FPCA). Additionally, functional conditional mode and functional conditional median estimators are applied and the advantages of combining their results are analysed. A different approach employs a SARIMA model selected according to the minimum Akaike (AIC) and Bayesian (BIC) Information Criteria and based on the residuals of the model. The performance of the models is assessed by calculating error indices such as the root mean square error (RMSE), relative RMSE, BIAS and relative BIAS. The results indicate that the NPFDA models provide more accurate forecasts than the SARIMA models. Key words: Nonparametric functional data analysis, SARIMA, time series forecast, air temperature, wind speed

  13. Nonparametric volatility density estimation for discrete time models

    NARCIS (Netherlands)

    Es, van Bert; Spreij, P.J.C.; Zanten, van J.H.

    2005-01-01

    We consider discrete time models for asset prices with a stationary volatility process. We aim at estimating the multivariate density of this process at a set of consecutive time instants. A Fourier-type deconvolution kernel density estimator based on the logarithm of the squared process is proposed

  14. Improved Variable Window Kernel Estimates of Probability Densities

    OpenAIRE

    Hall, Peter; Hu, Tien Chung; Marron, J. S.

    1995-01-01

    Variable window width kernel density estimators, with the width varying proportionally to the square root of the density, have been thought to have superior asymptotic properties. The rate of convergence has been claimed to be as good as those typical for higher-order kernels, which makes the variable width estimators more attractive because no adjustment is needed to handle the negativity usually entailed by the latter. However, in a recent paper, Terrell and Scott show that these results ca...

  15. Auto-associative Kernel Regression Model with Weighted Distance Metric for Instrument Drift Monitoring

    International Nuclear Information System (INIS)

    Shin, Ho Cheol; Park, Moon Ghu; You, Skin

    2006-01-01

    Recently, many on-line approaches to instrument channel surveillance (drift monitoring and fault detection) have been reported worldwide. On-line monitoring (OLM) method evaluates instrument channel performance by assessing its consistency with other plant indications through parametric or non-parametric models. The heart of an OLM system is the model giving an estimate of the true process parameter value against individual measurements. This model gives process parameter estimate calculated as a function of other plant measurements which can be used to identify small sensor drifts that would require the sensor to be manually calibrated or replaced. This paper describes an improvement of auto associative kernel regression (AAKR) by introducing a correlation coefficient weighting on kernel distances. The prediction performance of the developed method is compared with conventional auto-associative kernel regression
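
    A rough sketch of auto-associative kernel regression with per-channel distance weights is given below. The abstract does not state the exact weighting formula, so `correlation_weights` (mean absolute correlation of each channel with the others) is a hypothetical choice made only for illustration, as are the Gaussian kernel, the bandwidth and all names.

```python
import numpy as np


def correlation_weights(memory):
    """Hypothetical channel weights: mean |correlation| with other channels.

    Assumes no channel in the memory matrix is constant.
    """
    corr = np.abs(np.corrcoef(memory, rowvar=False))
    p = corr.shape[0]
    w = (corr.sum(axis=1) - 1.0) / (p - 1)   # drop the diagonal of ones
    return w / w.sum()


def aakr_estimate(query, memory, bandwidth, weights=None):
    """Estimate the fault-free value of every channel of `query`.

    memory: (n, p) matrix of historical fault-free measurement vectors
    query:  (p,) current measurement vector
    """
    memory = np.asarray(memory, dtype=float)
    query = np.asarray(query, dtype=float)
    if weights is None:
        weights = np.full(memory.shape[1], 1.0 / memory.shape[1])
    # Weighted squared distance from the query to every memory vector.
    d2 = np.sum(weights * (memory - query) ** 2, axis=1)
    k = np.exp(-0.5 * d2 / bandwidth**2)     # Gaussian kernel weights
    # Kernel-weighted average of memory vectors, one value per channel.
    return k @ memory / k.sum()


if __name__ == "__main__":
    base = np.linspace(0.0, 1.0, 201)
    memory = np.column_stack([base, 2.0 * base, 3.0 * base])
    w = correlation_weights(memory)
    drifted = np.array([0.5, 1.0, 1.8])      # third channel reads high
    print(aakr_estimate(drifted, memory, 0.1, w))
```

    The residual between `query` and the returned estimate is what a monitoring system would track for drift; note that the estimate is still pulled partly toward the drifted reading.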

  16. Curve fitting of the corporate recovery rates: the comparison of Beta distribution estimation and kernel density estimation.

    Science.gov (United States)

    Chen, Rongda; Wang, Ze

    2013-01-01

    Recovery rate is essential to the estimation of the portfolio's loss and economic capital. Neglecting the randomness of the distribution of recovery rate may underestimate the risk. The study introduces two kinds of models of distribution, Beta distribution estimation and kernel density distribution estimation, to simulate the distribution of recovery rates of corporate loans and bonds. As is known, models based on Beta distribution are common in daily usage, such as CreditMetrics by J.P. Morgan, Portfolio Manager by KMV and Losscalc by Moody's. However, it has a fatal defect that it can't fit the bimodal or multimodal distributions such as recovery rates of corporate loans and bonds as Moody's new data show. In order to overcome this flaw, the kernel density estimation is introduced and we compare the simulation results by histogram, Beta distribution estimation and kernel density estimation to reach the conclusion that the Gaussian kernel density distribution really better imitates the distribution of the bimodal or multimodal data samples of corporate loans and bonds. Finally, a Chi-square test of the Gaussian kernel density estimation proves that it can fit the curve of recovery rates of loans and bonds. So using the kernel density distribution to precisely delineate the bimodal recovery rates of bonds is optimal in credit risk management.

  17. Efficient estimation of an additive quantile regression model

    NARCIS (Netherlands)

    Cheng, Y.; de Gooijer, J.G.; Zerom, D.

    2011-01-01

    In this paper, two non-parametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a more viable alternative to existing kernel-based approaches. The second estimator

  18. Nonparametric estimation of location and scale parameters

    KAUST Repository

    Potgieter, C.J.

    2012-12-01

    Two random variables X and Y belong to the same location-scale family if there are constants μ and σ such that Y and μ+σX have the same distribution. In this paper we consider non-parametric estimation of the parameters μ and σ under minimal assumptions regarding the form of the distribution functions of X and Y. We discuss an approach to the estimation problem that is based on asymptotic likelihood considerations. Our results enable us to provide a methodology that can be implemented easily and which yields estimators that are often near optimal when compared to fully parametric methods. We evaluate the performance of the estimators in a series of Monte Carlo simulations. © 2012 Elsevier B.V. All rights reserved.

  19. Generalized Jackknife Estimators of Weighted Average Derivatives

    DEFF Research Database (Denmark)

    Cattaneo, Matias D.; Crump, Richard K.; Jansson, Michael

    With the aim of improving the quality of asymptotic distributional approximations for nonlinear functionals of nonparametric estimators, this paper revisits the large-sample properties of an important member of that class, namely a kernel-based weighted average derivative estimator. Asymptotic...

  20. Curve fitting of the corporate recovery rates: the comparison of Beta distribution estimation and kernel density estimation.

    Directory of Open Access Journals (Sweden)

    Rongda Chen

    Full Text Available Recovery rate is essential to the estimation of the portfolio's loss and economic capital. Neglecting the randomness of the distribution of recovery rate may underestimate the risk. The study introduces two kinds of models of distribution, Beta distribution estimation and kernel density distribution estimation, to simulate the distribution of recovery rates of corporate loans and bonds. As is known, models based on Beta distribution are common in daily usage, such as CreditMetrics by J.P. Morgan, Portfolio Manager by KMV and Losscalc by Moody's. However, it has a fatal defect that it can't fit the bimodal or multimodal distributions such as recovery rates of corporate loans and bonds as Moody's new data show. In order to overcome this flaw, the kernel density estimation is introduced and we compare the simulation results by histogram, Beta distribution estimation and kernel density estimation to reach the conclusion that the Gaussian kernel density distribution really better imitates the distribution of the bimodal or multimodal data samples of corporate loans and bonds. Finally, a Chi-square test of the Gaussian kernel density estimation proves that it can fit the curve of recovery rates of loans and bonds. So using the kernel density distribution to precisely delineate the bimodal recovery rates of bonds is optimal in credit risk management.

  1. Curve Fitting of the Corporate Recovery Rates: The Comparison of Beta Distribution Estimation and Kernel Density Estimation

    Science.gov (United States)

    Chen, Rongda; Wang, Ze

    2013-01-01

    Recovery rate is essential to the estimation of the portfolio’s loss and economic capital. Neglecting the randomness of the distribution of recovery rate may underestimate the risk. The study introduces two kinds of models of distribution, Beta distribution estimation and kernel density distribution estimation, to simulate the distribution of recovery rates of corporate loans and bonds. As is known, models based on Beta distribution are common in daily usage, such as CreditMetrics by J.P. Morgan, Portfolio Manager by KMV and Losscalc by Moody’s. However, it has a fatal defect that it can’t fit the bimodal or multimodal distributions such as recovery rates of corporate loans and bonds as Moody’s new data show. In order to overcome this flaw, the kernel density estimation is introduced and we compare the simulation results by histogram, Beta distribution estimation and kernel density estimation to reach the conclusion that the Gaussian kernel density distribution really better imitates the distribution of the bimodal or multimodal data samples of corporate loans and bonds. Finally, a Chi-square test of the Gaussian kernel density estimation proves that it can fit the curve of recovery rates of loans and bonds. So using the kernel density distribution to precisely delineate the bimodal recovery rates of bonds is optimal in credit risk management. PMID:23874558

  2. Nonparametric Bayes Classification and Hypothesis Testing on Manifolds

    Science.gov (United States)

    Bhattacharya, Abhishek; Dunson, David

    2012-01-01

    Our first focus is prediction of a categorical response variable using features that lie on a general manifold. For example, the manifold may correspond to the surface of a hypersphere. We propose a general kernel mixture model for the joint distribution of the response and predictors, with the kernel expressed in product form and dependence induced through the unknown mixing measure. We provide simple sufficient conditions for large support and weak and strong posterior consistency in estimating both the joint distribution of the response and predictors and the conditional distribution of the response. Focusing on a Dirichlet process prior for the mixing measure, these conditions hold using von Mises-Fisher kernels when the manifold is the unit hypersphere. In this case, Bayesian methods are developed for efficient posterior computation using slice sampling. Next we develop Bayesian nonparametric methods for testing whether there is a difference in distributions between groups of observations on the manifold having unknown densities. We prove consistency of the Bayes factor and develop efficient computational methods for its calculation. The proposed classification and testing methods are evaluated using simulation examples and applied to spherical data applications. PMID:22754028

  3. Smooth semi-nonparametric (SNP) estimation of the cumulative incidence function.

    Science.gov (United States)

    Duc, Anh Nguyen; Wolbers, Marcel

    2017-08-15

    This paper presents a novel approach to estimation of the cumulative incidence function in the presence of competing risks. The underlying statistical model is specified via a mixture factorization of the joint distribution of the event type and the time to the event. The time to event distributions conditional on the event type are modeled using smooth semi-nonparametric densities. One strength of this approach is that it can handle arbitrary censoring and truncation while relying on mild parametric assumptions. A stepwise forward algorithm for model estimation and adaptive selection of smooth semi-nonparametric polynomial degrees is presented, implemented in the statistical software R, evaluated in a sequence of simulation studies, and applied to data from a clinical trial in cryptococcal meningitis. The simulations demonstrate that the proposed method frequently outperforms both parametric and nonparametric alternatives. They also support the use of 'ad hoc' asymptotic inference to derive confidence intervals. An extension to regression modeling is also presented, and its potential and challenges are discussed. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  4. Effective dysphonia detection using feature dimension reduction and kernel density estimation for patients with Parkinson's disease.

    Directory of Open Access Journals (Sweden)

    Shanshan Yang

    Full Text Available Detection of dysphonia is useful for monitoring the progression of phonatory impairment for patients with Parkinson's disease (PD), and also helps assess the disease severity. This paper describes statistical pattern analysis methods to study different vocal measurements of sustained phonations. The feature dimension reduction procedure was implemented by using the sequential forward selection (SFS) and kernel principal component analysis (KPCA) methods. Four selected vocal measures were projected by the KPCA onto the bivariate feature space, in which the class-conditional feature densities can be approximated with the nonparametric kernel density estimation technique. In the vocal pattern classification experiments, Fisher's linear discriminant analysis (FLDA) was applied to perform the linear classification of voice records for healthy control subjects and PD patients, and the maximum a posteriori (MAP) decision rule and support vector machine (SVM) with radial basis function kernels were employed for the nonlinear classification tasks. Based on the KPCA-mapped feature densities, the MAP classifier successfully distinguished 91.8% of voice records, with a sensitivity rate of 0.986, a specificity rate of 0.708, and an area of 0.94 under the receiver operating characteristic (ROC) curve. The diagnostic performance provided by the MAP classifier was superior to those of the FLDA and SVM classifiers. In addition, the classification results indicated that dysphonia detection is insensitive to gender, and that the sustained phonations of PD patients with minimal functional disability are more difficult to identify correctly.
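
    The MAP decision rule built on kernel density estimates of the class-conditional feature densities can be sketched as follows. This is a generic illustration under assumptions: an isotropic Gaussian kernel with one shared bandwidth `h`, synthetic bivariate features in place of the KPCA-mapped vocal measures, and hypothetical function names.

```python
import numpy as np


def kde_log_density(points, data, h):
    """Log of a Gaussian kernel density estimate, evaluated at `points`.

    points: (m, d) query features, data: (n, d) training features of a class.
    """
    d = data.shape[1]
    sq = np.sum((points[:, None, :] - data[None, :, :]) ** 2, axis=2) / h**2
    log_k = -0.5 * sq - d * np.log(h) - 0.5 * d * np.log(2.0 * np.pi)
    # Log-mean-exp over training points keeps tiny densities from underflowing.
    top = log_k.max(axis=1, keepdims=True)
    return top[:, 0] + np.log(np.mean(np.exp(log_k - top), axis=1))


def map_classify(points, class_data, priors, h):
    """Assign each point to the class maximising prior times KDE likelihood."""
    scores = np.stack(
        [np.log(p) + kde_log_density(points, data, h)
         for data, p in zip(class_data, priors)],
        axis=1,
    )
    return np.argmax(scores, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for the two bivariate class-conditional samples.
    controls = rng.normal(loc=0.0, scale=0.5, size=(100, 2))
    patients = rng.normal(loc=3.0, scale=0.5, size=(100, 2))
    queries = np.array([[0.0, 0.0], [3.0, 3.0]])
    print(map_classify(queries, [controls, patients], [0.5, 0.5], h=0.5))
```

    Because the decision boundary follows the estimated densities, the rule is nonlinear in general, in contrast to the linear FLDA classifier the abstract compares against.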

  5. Efficient estimation of an additive quantile regression model

    NARCIS (Netherlands)

    Cheng, Y.; de Gooijer, J.G.; Zerom, D.

    2009-01-01

    In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By

  6. Efficient estimation of an additive quantile regression model

    NARCIS (Netherlands)

    Cheng, Y.; de Gooijer, J.G.; Zerom, D.

    2010-01-01

    In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By

  7. Kernel-based tests for joint independence

    DEFF Research Database (Denmark)

    Pfister, Niklas; Bühlmann, Peter; Schölkopf, Bernhard

    2018-01-01

    We investigate the problem of testing whether $d$ random variables, which may or may not be continuous, are jointly (or mutually) independent. Our method builds on ideas of the two variable Hilbert-Schmidt independence criterion (HSIC) but allows for an arbitrary number of variables. We embed the $d$-dimensional joint distribution and the product of the marginals into a reproducing kernel Hilbert space and define the $d$-variable Hilbert-Schmidt independence criterion (dHSIC) as the squared distance between the embeddings. In the population case, the value of dHSIC is zero if and only if the $d$ variables are jointly independent, as long as the kernel is characteristic. Based on an empirical estimate of dHSIC, we define three different non-parametric hypothesis tests: a permutation test, a bootstrap test and a test based on a Gamma approximation. We prove that the permutation test...
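The permutation test described in this record can be illustrated with a minimal sketch. It assumes scalar samples, a Gaussian kernel with a fixed bandwidth, and the V-statistic form of the empirical dHSIC; the function names (`gaussian_gram`, `dhsic`, `permutation_pvalue`) are hypothetical and this is not the authors' implementation.

```python
import math
import random

def gaussian_gram(xs, bandwidth=1.0):
    """Gram matrix K[i][j] = exp(-(x_i - x_j)^2 / (2 h^2)) for scalar samples."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in xs]
            for a in xs]

def dhsic(grams):
    """Empirical dHSIC (V-statistic form) from d Gram matrices of size n x n."""
    n = len(grams[0])
    # Term 1: average over all pairs (i, j) of the product of the d kernels.
    term1 = 0.0
    for i in range(n):
        for j in range(n):
            p = 1.0
            for K in grams:
                p *= K[i][j]
            term1 += p
    term1 /= n * n
    # Term 2: product over variables of each Gram matrix's overall mean.
    # Term 3: average over i of the product of the row means, times two.
    term2 = 1.0
    row_prod = [1.0] * n
    for K in grams:
        term2 *= sum(sum(row) for row in K) / (n * n)
        for i in range(n):
            row_prod[i] *= sum(K[i]) / n
    term3 = 2.0 * sum(row_prod) / n
    return term1 + term2 - term3

def permutation_pvalue(samples, n_perm=200, seed=0):
    """Permutation test: shuffle every variable except the first independently."""
    rng = random.Random(seed)
    grams = [gaussian_gram(x) for x in samples]
    observed = dhsic(grams)
    n = len(samples[0])
    exceed = 0
    for _ in range(n_perm):
        permuted = [grams[0]]
        for K in grams[1:]:
            idx = list(range(n))
            rng.shuffle(idx)
            # Permuting a sample is the same as permuting rows and columns of K.
            permuted.append([[K[a][b] for b in idx] for a in idx])
        if dhsic(permuted) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

Calling `permutation_pvalue([x, y, z])` on three equal-length lists returns the statistic and a p-value; a small p-value suggests the variables are not jointly independent. The fixed bandwidth of 1.0 is an illustrative choice only.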

  8. On the robust nonparametric regression estimation for a functional regressor

    OpenAIRE

    Azzedine , Nadjia; Laksaci , Ali; Ould-Saïd , Elias

    2009-01-01

    On the robust nonparametric regression estimation for a functional regressor. Correspondence: corresponding author (Ould-Said, Elias). (Azzedine, Nadjia) (Laksaci, Ali) (Ould-Said, Elias) Departement de Mathematiques, Univ. Djillali Liabes, BP 89, 22000 Sidi Bel Abbes, Algeria. (Azzedine, Nadjia) Departement de Mathema...

  9. Effective Dysphonia Detection Using Feature Dimension Reduction and Kernel Density Estimation for Patients with Parkinson’s Disease

    Science.gov (United States)

    Yang, Shanshan; Zheng, Fang; Luo, Xin; Cai, Suxian; Wu, Yunfeng; Liu, Kaizhi; Wu, Meihong; Chen, Jian; Krishnan, Sridhar

    2014-01-01

    Detection of dysphonia is useful for monitoring the progression of phonatory impairment for patients with Parkinson’s disease (PD), and also helps assess the disease severity. This paper describes the statistical pattern analysis methods to study different vocal measurements of sustained phonations. The feature dimension reduction procedure was implemented by using the sequential forward selection (SFS) and kernel principal component analysis (KPCA) methods. Four selected vocal measures were projected by the KPCA onto the bivariate feature space, in which the class-conditional feature densities can be approximated with the nonparametric kernel density estimation technique. In the vocal pattern classification experiments, Fisher’s linear discriminant analysis (FLDA) was applied to perform the linear classification of voice records for healthy control subjects and PD patients, and the maximum a posteriori (MAP) decision rule and support vector machine (SVM) with radial basis function kernels were employed for the nonlinear classification tasks. Based on the KPCA-mapped feature densities, the MAP classifier successfully distinguished 91.8% voice records, with a sensitivity rate of 0.986, a specificity rate of 0.708, and an area value of 0.94 under the receiver operating characteristic (ROC) curve. The diagnostic performance provided by the MAP classifier was superior to those of the FLDA and SVM classifiers. In addition, the classification results indicated that gender is insensitive to dysphonia detection, and the sustained phonations of PD patients with minimal functional disability are more difficult to be correctly identified. PMID:24586406
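As a rough illustration of the MAP decision rule on kernel density estimates mentioned in this abstract, the sketch below estimates each class-conditional density with a product-Gaussian kernel and picks the class with the largest prior-weighted density. The bandwidth, the class labels, and the function names are assumptions for illustration; the KPCA projection step and the authors' actual settings are not reproduced.

```python
import math

def kde_density(point, sample, bandwidth):
    """Product-Gaussian kernel density estimate at `point`.

    `sample` is a list of tuples with the same dimension as `point`.
    """
    d = len(point)
    norm = (math.sqrt(2.0 * math.pi) * bandwidth) ** d
    total = 0.0
    for s in sample:
        sq_dist = sum((p - q) ** 2 for p, q in zip(point, s))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(sample) * norm)

def map_classify(point, class_samples, bandwidth=0.5):
    """Return the label maximizing prior * class-conditional KDE density.

    Priors are taken as the class proportions in the training data
    (an assumption of this sketch).
    """
    n_total = sum(len(s) for s in class_samples.values())
    best_label, best_score = None, -1.0
    for label, sample in class_samples.items():
        prior = len(sample) / n_total
        score = prior * kde_density(point, sample, bandwidth)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

With `class_samples = {"control": [...], "pd": [...]}` holding bivariate feature tuples, `map_classify((x1, x2), class_samples)` returns the label with the higher posterior under these estimates.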

  10. Non-parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods

    DEFF Research Database (Denmark)

    Høg, Esben

    In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example the Ornstein-Uhlenbeck process (the simplest mean-reverting process) has been used to model non-speculative price processes. We discuss non-parametric estimation of these processes...

  11. Non-Parametric Estimation of Diffusion-Paths Using Wavelet Scaling Methods

    DEFF Research Database (Denmark)

    Høg, Esben

    2003-01-01

    In continuous time, diffusion processes have been used for modelling financial dynamics for a long time. For example the Ornstein-Uhlenbeck process (the simplest mean-reverting process) has been used to model non-speculative price processes. We discuss non-parametric estimation of these processes...

  12. Non-parametric estimation of the individual's utility map

    OpenAIRE

    Noguchi, Takao; Sanborn, Adam N.; Stewart, Neil

    2013-01-01

    Models of risky choice have attracted much attention in behavioural economics. Previous research has repeatedly demonstrated that individuals' choices are not well explained by expected utility theory, and a number of alternative models have been examined using carefully selected sets of choice alternatives. The model performance, however, can depend on which choice alternatives are being tested. Here we develop a non-parametric method for estimating the utility map over the wide range of choi...

  13. Nonparametric Estimation of Distributions in Random Effects Models

    KAUST Repository

    Hart, Jeffrey D.

    2011-01-01

    We propose using minimum distance to obtain nonparametric estimates of the distributions of components in random effects models. A main setting considered is equivalent to having a large number of small datasets whose locations, and perhaps scales, vary randomly, but which otherwise have a common distribution. Interest focuses on estimating the distribution that is common to all datasets, knowledge of which is crucial in multiple testing problems where a location/scale invariant test is applied to every small dataset. A detailed algorithm for computing minimum distance estimates is proposed, and the usefulness of our methodology is illustrated by a simulation study and an analysis of microarray data. Supplemental materials for the article, including R-code and a dataset, are available online. © 2011 American Statistical Association.

  14. Nonparametric Estimation of Cumulative Incidence Functions for Competing Risks Data with Missing Cause of Failure

    DEFF Research Database (Denmark)

    Effraimidis, Georgios; Dahl, Christian Møller

    In this paper, we develop a fully nonparametric approach for the estimation of the cumulative incidence function with Missing At Random right-censored competing risks data. We obtain results on the pointwise asymptotic normality as well as the uniform convergence rate of the proposed nonparametric...

  15. Interpolation of Missing Precipitation Data Using Kernel Estimations for Hydrologic Modeling

    Directory of Open Access Journals (Sweden)

    Hyojin Lee

    2015-01-01

    Precipitation is the main factor that drives hydrologic modeling; therefore, missing precipitation data can cause malfunctions in hydrologic modeling. Although interpolation of missing precipitation data is recognized as an important research topic, only a few methods follow a regression approach. In this study, daily precipitation data were interpolated using five different kernel functions, namely, Epanechnikov, Quartic, Triweight, Tricube, and Cosine, to estimate missing precipitation data. This study also presents an assessment that compares estimation of missing precipitation data through Kth nearest neighborhood (KNN) regression to the five different kernel estimations and their performance in simulating streamflow using the Soil Water Assessment Tool (SWAT) hydrologic model. The results show that the kernel approaches provide higher quality interpolation of precipitation data compared with the KNN regression approach, in terms of both statistical data assessment and hydrologic modeling performance.
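The five kernel functions named in this abstract have standard textbook forms on [-1, 1], and a kernel-weighted (Nadaraya-Watson) average is one common way to use them for gap filling. The sketch below assumes the regressor is simply the time index of a single series; the choice of regressor, the bandwidth, and the helper names are hypothetical and may differ from the study's setup.

```python
import math

def _on_support(u):
    return abs(u) <= 1.0

# Standard compactly supported kernels, each integrating to 1 on [-1, 1].
KERNELS = {
    "epanechnikov": lambda u: 0.75 * (1 - u * u) if _on_support(u) else 0.0,
    "quartic": lambda u: (15 / 16) * (1 - u * u) ** 2 if _on_support(u) else 0.0,
    "triweight": lambda u: (35 / 32) * (1 - u * u) ** 3 if _on_support(u) else 0.0,
    "tricube": lambda u: (70 / 81) * (1 - abs(u) ** 3) ** 3 if _on_support(u) else 0.0,
    "cosine": lambda u: (math.pi / 4) * math.cos(math.pi * u / 2) if _on_support(u) else 0.0,
}

def nw_estimate(x0, xs, ys, kernel, bandwidth):
    """Nadaraya-Watson estimate at x0: kernel-weighted average of ys."""
    weights = [kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        return None  # no observation falls inside the kernel window
    return sum(w * y for w, y in zip(weights, ys)) / total

def fill_missing(series, kernel_name="epanechnikov", bandwidth=3.0):
    """Replace None entries using observed values weighted by time distance."""
    kernel = KERNELS[kernel_name]
    xs = [t for t, v in enumerate(series) if v is not None]
    ys = [series[t] for t in xs]
    return [v if v is not None else nw_estimate(t, xs, ys, kernel, bandwidth)
            for t, v in enumerate(series)]
```

For example, `fill_missing([2.0, 4.0, None, 8.0, 10.0])` fills the gap with a weighted average of its neighbours; a gap with no observation within the bandwidth stays `None`.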

  16. On the Choice of Difference Sequence in a Unified Framework for Variance Estimation in Nonparametric Regression

    KAUST Repository

    Dai, Wenlin; Tong, Tiejun; Zhu, Lixing

    2017-01-01

    Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework greatly enriches the existing literature on variance estimation and includes most existing estimators as special cases. More importantly, the unified framework also provides a smart way to solve the challenging difference sequence selection problem, which has remained a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend using the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.

  17. On the Choice of Difference Sequence in a Unified Framework for Variance Estimation in Nonparametric Regression

    KAUST Repository

    Dai, Wenlin

    2017-09-01

    Difference-based methods do not require estimating the mean function in nonparametric regression and are therefore popular in practice. In this paper, we propose a unified framework for variance estimation that combines the linear regression method with the higher-order difference estimators systematically. The unified framework greatly enriches the existing literature on variance estimation and includes most existing estimators as special cases. More importantly, the unified framework also provides a smart way to solve the challenging difference sequence selection problem, which has remained a long-standing controversial issue in nonparametric regression for several decades. Using both theory and simulations, we recommend using the ordinary difference sequence in the unified framework, no matter if the sample size is small or if the signal-to-noise ratio is large. Finally, to cater for the demands of the application, we have developed a unified R package, named VarED, that integrates the existing difference-based estimators and the unified estimators in nonparametric regression and have made it freely available in the R statistical program http://cran.r-project.org/web/packages/.

  18. Estimation of the lifetime distribution of mechatronic systems in the presence of a covariate: A comparison among parametric, semiparametric and nonparametric models

    International Nuclear Information System (INIS)

    Bobrowski, Sebastian; Chen, Hong; Döring, Maik; Jensen, Uwe; Schinköthe, Wolfgang

    2015-01-01

    In practice manufacturers may have lots of failure data of similar products using the same technology basis under different operating conditions. Thus, one can try to derive predictions for the distribution of the lifetime of newly developed components or new application environments through the existing data using regression models based on covariates. Three categories of such regression models are considered: a parametric, a semiparametric and a nonparametric approach. First, we assume that the lifetime is Weibull distributed, where its parameters are modelled as linear functions of the covariate. Second, the Cox proportional hazards model, well-known in Survival Analysis, is applied. Finally, a kernel estimator is used to interpolate between empirical distribution functions. In particular the last case is new in the context of reliability analysis. We propose a goodness of fit measure (GoF), which can be applied to all three types of regression models. Using this GoF measure we discuss a new model selection procedure. To illustrate this method of reliability prediction, the three classes of regression models are applied to real test data of motor experiments. Further the performance of the approaches is investigated by Monte Carlo simulations. - Highlights: • We estimate the lifetime distribution in the presence of a covariate. • Three types of regression models are considered and compared. • A new nonparametric estimator based on our particular data structure is introduced. • We propose a goodness of fit measure and show a new model selection procedure. • A case study with real data and Monte Carlo simulations are performed

  19. Transformation-invariant and nonparametric monotone smooth estimation of ROC curves.

    Science.gov (United States)

    Du, Pang; Tang, Liansheng

    2009-01-30

    When a new diagnostic test is developed, it is of interest to evaluate its accuracy in distinguishing diseased subjects from non-diseased subjects. The accuracy of the test is often evaluated by receiver operating characteristic (ROC) curves. Smooth ROC estimates are often preferable for continuous test results when the underlying ROC curves are in fact continuous. Nonparametric and parametric methods have been proposed by various authors to obtain smooth ROC curve estimates. However, there are certain drawbacks with the existing methods. Parametric methods need specific model assumptions. Nonparametric methods do not always satisfy the inherent properties of the ROC curves, such as monotonicity and transformation invariance. In this paper we propose a monotone spline approach to obtain smooth monotone ROC curves. Our method ensures important inherent properties of the underlying ROC curves, which include monotonicity, transformation invariance, and boundary constraints. We compare the finite sample performance of the newly proposed ROC method with other ROC smoothing methods in large-scale simulation studies. We illustrate our method through a real life example. Copyright (c) 2008 John Wiley & Sons, Ltd.

  20. A non-parametric framework for estimating threshold limit values

    Directory of Open Access Journals (Sweden)

    Ulm Kurt

    2005-11-01

    Background: To estimate a threshold limit value for a compound known to have harmful health effects, an 'elbow' threshold model is usually applied. We are interested in non-parametric flexible alternatives. Methods: We describe how a step function model fitted by isotonic regression can be used to estimate threshold limit values. This method returns a set of candidate locations, and we discuss two algorithms to select the threshold among them: the reduced isotonic regression and an algorithm considering the closed family of hypotheses. We assess the performance of these two alternative approaches under different scenarios in a simulation study. We illustrate the framework by analysing the data from a study conducted by the German Research Foundation aiming to set a threshold limit value in the exposure to total dust at the workplace, as a causal agent for developing chronic bronchitis. Results: In the paper we demonstrate the use and the properties of the proposed methodology along with the results from an application. The method appears to detect the threshold with satisfactory success. However, its performance can be compromised by the low power to reject the constant risk assumption when the true dose-response relationship is weak. Conclusion: The estimation of thresholds based on the isotonic framework is conceptually simple and sufficiently powerful. Given that in the threshold value estimation context there is no gold standard method, the proposed model provides a useful non-parametric alternative to the standard approaches and can corroborate or challenge their findings.
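The step-function fit behind this approach can be sketched with the pool-adjacent-violators algorithm, the standard way to compute an isotonic regression. The sketch assumes responses sorted by increasing exposure and treats every jump of the fitted step function as a candidate threshold location; the two selection algorithms described in the abstract are not implemented, and the function names are hypothetical.

```python
def isotonic_fit(ys, weights=None):
    """Non-decreasing least-squares fit via pool-adjacent-violators.

    `ys` must be ordered by increasing exposure; returns fitted values.
    """
    if weights is None:
        weights = [1.0] * len(ys)
    blocks = []  # each block is [mean, total weight, number of points]
    for y, w in zip(ys, weights):
        blocks.append([float(y), float(w), 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted

def candidate_thresholds(xs, fitted):
    """Exposure values at which the fitted step function jumps upward."""
    return [xs[i] for i in range(1, len(fitted)) if fitted[i] > fitted[i - 1]]
```

For instance, `isotonic_fit([1, 3, 2, 4])` pools the violating pair (3, 2) into their mean, and `candidate_thresholds` then lists the exposures where the risk estimate steps up.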

  1. Nonparametric estimation of the stationary M/G/1 workload distribution function

    DEFF Research Database (Denmark)

    Hansen, Martin Bøgsted

    2005-01-01

    In this paper it is demonstrated how a nonparametric estimator of the stationary workload distribution function of the M/G/1-queue can be obtained by systematic sampling the workload process. Weak convergence results and bootstrap methods for empirical distribution functions for stationary associ...

  2. Nonparametric Estimation of Interval Reliability for Discrete-Time Semi-Markov Systems

    DEFF Research Database (Denmark)

    Georgiadis, Stylianos; Limnios, Nikolaos

    2016-01-01

    In this article, we consider a repairable discrete-time semi-Markov system with finite state space. The measure of the interval reliability is given as the probability of the system being operational over a given finite-length time interval. A nonparametric estimator is proposed for the interval...

  3. Reproducing kernel Hilbert spaces of Gaussian priors

    NARCIS (Netherlands)

    Vaart, van der A.W.; Zanten, van J.H.; Clarke, B.; Ghosal, S.

    2008-01-01

    We review definitions and properties of reproducing kernel Hilbert spaces attached to Gaussian variables and processes, with a view to applications in nonparametric Bayesian statistics using Gaussian priors. The rate of contraction of posterior distributions based on Gaussian priors can be described

  4. Kernel and wavelet density estimators on manifolds and more general metric spaces

    DEFF Research Database (Denmark)

    Cleanthous, G.; Georgiadis, Athanasios; Kerkyacharian, G.

    We consider the problem of estimating the density of observations taking values in classical or nonclassical spaces such as manifolds and more general metric spaces. Our setting is quite general but also sufficiently rich in allowing the development of smooth functional calculus with well localized spectral kernels, Besov regularity spaces, and wavelet type systems. Kernel and both linear and nonlinear wavelet density estimators are introduced and studied. Convergence rates for these estimators are established, which are analogous to the existing results in the classical setting of real...

  5. Corruption clubs: empirical evidence from kernel density estimates

    NARCIS (Netherlands)

    Herzfeld, T.; Weiss, Ch.

    2007-01-01

    A common finding of many analytical models is the existence of multiple equilibria of corruption. Countries characterized by the same economic, social and cultural background do not necessarily experience the same levels of corruption. In this article, we use Kernel Density Estimation techniques to

  6. The Visualization and Analysis of POI Features under Network Space Supported by Kernel Density Estimation

    Directory of Open Access Journals (Sweden)

    YU Wenhao

    2015-01-01

    The distribution pattern and the distribution density of urban facility POIs are of great significance in the fields of infrastructure planning and urban spatial analysis. Kernel density estimation, which is commonly used to express these spatial characteristics, is superior to other density estimation methods (such as quadrat analysis and Voronoi-based methods) because it considers regional impact based on the first law of geography. However, traditional kernel density estimation is mainly based on Euclidean space, ignoring the fact that the service functions and interrelations of urban facilities operate over network path distance rather than conventional Euclidean distance. Hence, this research proposes a computational model of network kernel density estimation, and an extension of the model for the case of added constraints. This work also discusses the impacts of the distance attenuation threshold and the height extreme on the representation of kernel density. A large-scale experiment on real data, analyzing different POI distribution patterns (random type, sparse type, regional-intensive type, linear-intensive type), discusses the spatial distribution characteristics, influencing factors, and service functions of POI infrastructure in the city.

  7. Moderate deviations principles for the kernel estimator of ...

    African Journals Online (AJOL)

    Abstract. The aim of this paper is to provide pointwise and uniform moderate deviations principles for the kernel estimator of a nonrandom regression function. Moreover, we give an application of these moderate deviations principles to the construction of confidence regions for the regression function. Résumé. The aim of ...

  8. Considerations on absorbed dose estimates based on different β-dose point kernels in internal dosimetry

    International Nuclear Information System (INIS)

    Uchida, Isao; Yamada, Yasuhiko; Yamashita, Takashi; Okigaki, Shigeyasu; Oyamada, Hiyoshimaru; Ito, Akira.

    1995-01-01

    In radiotherapy with radiopharmaceuticals, more accurate estimates of the three-dimensional (3-D) distribution of absorbed dose are important in specifying the activity to be administered to patients to deliver a prescribed absorbed dose to target volumes without exceeding the toxicity limit of normal tissues in the body. A calculation algorithm for this purpose has already been developed by the authors. An accurate 3-D distribution of absorbed dose based on the algorithm is given by convolution of the 3-D dose matrix for a unit cubic voxel containing unit cumulated activity, which is obtained by transforming a dose point kernel into a 3-D cubic dose matrix, with the 3-D cumulated activity distribution given on the same voxel size. However, the beta-dose point kernels that affect accurate estimates of the 3-D absorbed dose distribution have differed among investigators. The purpose of this study is to elucidate how different beta-dose point kernels in water influence the estimates of the absorbed dose distribution obtained with the authors' dose point kernel convolution method. Computer simulations were performed using the MIRD thyroid and lung phantoms under the assumption of a uniform activity distribution of 32P. Using beta-dose point kernels derived from Monte Carlo simulations (EGS-4 or ACCEPT computer code), the differences among the point kernels produced little difference in the mean and maximum absorbed dose estimates for the MIRD phantoms used. In the estimates of mean and maximum absorbed doses calculated using different cubic voxel sizes (4x4x4 mm and 8x8x8 mm) for the MIRD thyroid phantom, the maximum absorbed doses for the 4x4x4 mm voxel were approximately 7% greater than those for the 8x8x8 mm voxel. This was found for every beta-dose point kernel used in this study. On the other hand, the percentage difference in the mean absorbed doses between the two voxel sizes for each beta-dose point kernel was less than approximately 0.6%. (author)

  9. Nonparametric estimation of benchmark doses in environmental risk assessment

    Science.gov (United States)

    Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen

    2013-01-01

    Summary An important statistical objective in environmental risk analysis is estimation of minimum exposure levels, called benchmark doses (BMDs), that induce a pre-specified benchmark response in a dose-response experiment. In such settings, representations of the risk are traditionally based on a parametric dose-response model. It is a well-known concern, however, that if the chosen parametric form is misspecified, inaccurate and possibly unsafe low-dose inferences can result. We apply a nonparametric approach for calculating benchmark doses, based on an isotonic regression method for dose-response estimation with quantal-response data (Bhattacharya and Kong, 2007). We determine the large-sample properties of the estimator, develop bootstrap-based confidence limits on the BMDs, and explore the confidence limits’ small-sample properties via a short simulation study. An example from cancer risk assessment illustrates the calculations. PMID:23914133

  10. Nonparametric Fine Tuning of Mixtures: Application to Non-Life Insurance Claims Distribution Estimation

    Science.gov (United States)

    Sardet, Laure; Patilea, Valentin

    When pricing a specific insurance premium, an actuary needs to evaluate the claims cost distribution for the warranty. Traditional actuarial methods use parametric specifications to model the claims distribution, such as lognormal, Weibull and Pareto laws. Mixtures of such distributions improve the flexibility of the parametric approach and seem to be quite well-adapted to capture the skewness, the long tails as well as the unobserved heterogeneity among the claims. In this paper, instead of looking for a finely tuned mixture with many components, we choose a parsimonious mixture model, typically a two or three-component mixture. Next, we use the mixture cumulative distribution function (CDF) to transform data into the unit interval where we apply a beta-kernel smoothing procedure. A bandwidth rule adapted to our methodology is proposed. Finally, the beta-kernel density estimate is back-transformed to recover an estimate of the original claims density. The beta-kernel smoothing provides an automatic fine-tuning of the parsimonious mixture and thus avoids inference in more complex mixture models with many parameters. We investigate the empirical performance of the new method in the estimation of the quantiles with simulated nonnegative data and the quantiles of the individual claims distribution in a non-life insurance application.
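The transform, smooth, and back-transform steps in this abstract can be sketched as follows. The sketch assumes a two-component exponential mixture with fixed, already fitted parameters as the parametric start, and a standard beta-kernel estimator on (0, 1) with a user-supplied bandwidth; the paper's mixture family, fitting step, and bandwidth rule are not reproduced, and all function names are hypothetical.

```python
import math

def beta_pdf(u, a, b):
    """Beta(a, b) density at u; returns 0 outside (0, 1). Assumes a, b > 1."""
    if u <= 0.0 or u >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(u) + (b - 1.0) * math.log(1.0 - u))

def beta_kernel_density(x, us, bandwidth):
    """Beta-kernel density estimate at x in (0, 1) from data us in [0, 1].

    The kernel at x is the Beta(x/b + 1, (1 - x)/b + 1) density evaluated
    at each observation, so its shape adapts near the boundaries.
    """
    a = x / bandwidth + 1.0
    b = (1.0 - x) / bandwidth + 1.0
    return sum(beta_pdf(u, a, b) for u in us) / len(us)

def mixture_cdf(x, weights, rates):
    """CDF of an exponential mixture (the assumed parametric start)."""
    return sum(w * (1.0 - math.exp(-r * x)) for w, r in zip(weights, rates))

def mixture_pdf(x, weights, rates):
    """Density of the same exponential mixture."""
    return sum(w * r * math.exp(-r * x) for w, r in zip(weights, rates))

def claims_density(x, claims, weights, rates, bandwidth):
    """Back-transformed estimate: g_hat(F(x)) * f(x) by change of variables."""
    us = [mixture_cdf(c, weights, rates) for c in claims]
    u0 = mixture_cdf(x, weights, rates)
    return beta_kernel_density(u0, us, bandwidth) * mixture_pdf(x, weights, rates)
```

If the parametric mixture already fits well, the transformed data are close to uniform and the beta-kernel factor stays near 1, so the smoothing acts as a correction to the mixture density.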

  11. Nonparametric Forecasting for Biochar Utilization in Poyang Lake Eco-Economic Zone in China

    Directory of Open Access Journals (Sweden)

    Meng-Shiuh Chang

    2014-01-01

    Agriculture is the least profitable industry in China. However, even with large financial subsidies from the government, farmers’ living standards have not improved significantly so far due to historical, geographical, and climatic factors. The study examines and quantifies the net economic and environmental benefits of utilizing biochar as a soil amendment in eleven counties in the Poyang Lake Eco-Economic Zone. A nonparametric kernel regression model is employed to estimate the relation between the scaled environmental and economic factors, which are determined as regression variables. In addition, the partial linear and single index regression models are used for comparison. In terms of mean squared error, the kernel estimator outperforms the other estimators and is therefore employed to forecast the benefits of using biochar under various scenarios. The results indicate that biochar utilization can potentially increase farmers’ income if rice is planted, and net economic benefits of up to ¥114,900 can be achieved. The net economic benefits are higher when the pyrolysis plant is built in the south of the Poyang Lake Eco-Economic Zone than when it is built in the north, as the southern land is relatively barren and biochar can save more costs on irrigation and fertilizer use.

  12. Efficient nonparametric n-body force fields from machine learning

    Science.gov (United States)

    Glielmo, Aldo; Zeni, Claudio; De Vita, Alessandro

    2018-05-01

    We provide a definition and explicit expressions for n-body Gaussian process (GP) kernels, which can learn any interatomic interaction occurring in a physical system, up to n-body contributions, for any value of n. The series is complete, as it can be shown that the "universal approximator" squared exponential kernel can be written as a sum of n-body kernels. These recipes enable the choice of optimally efficient force models for each target system, as confirmed by extensive testing on various materials. We furthermore describe how the n-body kernels can be "mapped" on equivalent representations that provide database-size-independent predictions and are thus crucially more efficient. We explicitly carry out this mapping procedure for the first nontrivial (three-body) kernel of the series, and we show that this reproduces the GP-predicted forces with meV/Å accuracy while being orders of magnitude faster. These results pave the way to using novel force models (here named "M-FFs") that are computationally as fast as their corresponding standard parametrized n-body force fields, while retaining the nonparametric character, the ease of training and validation, and the accuracy of the best recently proposed machine-learning potentials.

  13. Bayesian Nonparametric Mixture Estimation for Time-Indexed Functional Data in R

    Directory of Open Access Journals (Sweden)

    Terrance D. Savitsky

    2016-08-01

    We present growfunctions for R that offers Bayesian nonparametric estimation models for analysis of dependent, noisy time series data indexed by a collection of domains. This data structure arises from combining periodically published government survey statistics, such as are reported in the Current Population Study (CPS). The CPS publishes monthly, by-state estimates of employment levels, where each state expresses a noisy time series. Published state-level estimates from the CPS are composed from household survey responses in a model-free manner and express high levels of volatility due to insufficient sample sizes. Existing software solutions borrow information over a modeled time-based dependence to extract a de-noised time series for each domain. These solutions, however, ignore the dependence among the domains that may be additionally leveraged to improve estimation efficiency. The growfunctions package offers two fully nonparametric mixture models that simultaneously estimate both a time and domain-indexed dependence structure for a collection of time series: (1) A Gaussian process (GP) construction, which is parameterized through the covariance matrix, estimates a latent function for each domain. The covariance parameters of the latent functions are indexed by domain under a Dirichlet process prior that permits estimation of the dependence among functions across the domains; (2) An intrinsic Gaussian Markov random field prior construction provides an alternative to the GP that expresses different computation and estimation properties. In addition to performing denoised estimation of latent functions from published domain estimates, growfunctions allows estimation of collections of functions for observation units (e.g., households), rather than aggregated domains, by accounting for an informative sampling design under which the probabilities for inclusion of observation units are related to the response variable. growfunctions includes plot

  14. Nonparametric Transfer Function Models

    Science.gov (United States)

    Liu, Jun M.; Chen, Rong; Yao, Qiwei

    2009-01-01

    In this paper a class of nonparametric transfer function models is proposed to model nonlinear relationships between ‘input’ and ‘output’ time series. The transfer function is smooth with unknown functional forms, and the noise is assumed to be a stationary autoregressive-moving average (ARMA) process. The nonparametric transfer function is estimated jointly with the ARMA parameters. By modeling the correlation in the noise, the transfer function can be estimated more efficiently. The parsimonious ARMA structure improves the estimation efficiency in finite samples. The asymptotic properties of the estimators are investigated. The finite-sample properties are illustrated through simulations and one empirical example. PMID:20628584

  15. Kernel-based whole-genome prediction of complex traits: a review.

    Science.gov (United States)

    Morota, Gota; Gianola, Daniel

    2014-01-01

    Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.

  16. Kernel-based whole-genome prediction of complex traits: a review

    Directory of Open Access Journals (Sweden)

    Gota Morota

    2014-10-01

    Full Text Available Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.

  17. Regularized Pre-image Estimation for Kernel PCA De-noising

    DEFF Research Database (Denmark)

    Abrahamsen, Trine Julie; Hansen, Lars Kai

    2011-01-01

    The main challenge in de-noising by kernel Principal Component Analysis (PCA) is the mapping of de-noised feature space points back into input space, also referred to as “the pre-image problem”. Since the feature space mapping is typically not bijective, pre-image estimation is inherently ill-posed...

  18. Nonparametric estimation for censored mixture data with application to the Cooperative Huntington's Observational Research Trial.

    Science.gov (United States)

    Wang, Yuanjia; Garcia, Tanya P; Ma, Yanyuan

    2012-01-01

    This work presents methods for estimating genotype-specific distributions from genetic epidemiology studies where the event times are subject to right censoring, the genotypes are not directly observed, and the data arise from a mixture of scientifically meaningful subpopulations. Examples of such studies include kin-cohort studies and quantitative trait locus (QTL) studies. Current methods for analyzing censored mixture data include two types of nonparametric maximum likelihood estimators (NPMLEs) which do not make parametric assumptions on the genotype-specific density functions. Although both NPMLEs are commonly used, we show that one is inefficient and the other inconsistent. To overcome these deficiencies, we propose three classes of consistent nonparametric estimators which do not assume parametric density models and are easy to implement. They are based on the inverse probability weighting (IPW), augmented IPW (AIPW), and nonparametric imputation (IMP). The AIPW achieves the efficiency bound without additional modeling assumptions. Extensive simulation experiments demonstrate satisfactory performance of these estimators even when the data are heavily censored. We apply these estimators to the Cooperative Huntington's Observational Research Trial (COHORT), and provide age-specific estimates of the effect of mutation in the Huntington gene on mortality using a sample of family members. The close approximation of the estimated non-carrier survival rates to that of the U.S. population indicates small ascertainment bias in the COHORT family sample. Our analyses underscore an elevated risk of death in Huntington gene mutation carriers compared to non-carriers for a wide age range, and suggest that the mutation equally affects survival rates in both genders. 
    The estimated survival rates are useful in genetic counseling for providing guidelines on interpreting the risk of death associated with a positive genetic testing, and in facilitating future subjects at risk

  19. Portfolio optimization based on nonparametric estimation methods

    Directory of Open Access Journals (Sweden)

    Mahsa Ghandehari

    2017-03-01

    Full Text Available One of the major issues investors face in capital markets is deciding which stock exchange to invest in and selecting an optimal portfolio. This is done by assessing risk and expected return. In the portfolio selection problem, if the assets' expected returns are normally distributed, variance and standard deviation are used as risk measures. However, the expected returns on assets are not necessarily normal and sometimes differ dramatically from the normal distribution. This paper introduces conditional value at risk (CVaR) as a measure of risk in a nonparametric framework, obtains the optimal portfolio for a given expected return, and compares this method with the linear programming method. The data consist of monthly returns of 15 companies selected from the top 50 companies in Tehran Stock Exchange during the winter of 1392, observed from April of 1388 to June of 1393. The results show the superiority of the nonparametric method over the linear programming method, and the nonparametric method is much faster than the linear programming method.
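
    To make the risk measure in the abstract above concrete, the following is a minimal sketch of an empirical (sample-based) CVaR, assuming losses are defined as negative returns. It is not the paper's estimator or its optimization procedure; the function names `empirical_cvar` and `portfolio_cvar` and the confidence level `alpha` are illustrative assumptions.

```python
import numpy as np

def empirical_cvar(returns, alpha=0.95):
    """Return (VaR, CVaR) of the losses at level alpha, with losses = -returns.

    Illustrative sample-based version: VaR is the alpha-quantile of the
    losses and CVaR is the mean of the losses at or beyond VaR.
    """
    losses = -np.asarray(returns, dtype=float)
    var = np.quantile(losses, alpha)      # value at risk
    tail = losses[losses >= var]          # the worst (1 - alpha) share of outcomes
    return var, tail.mean()

def portfolio_cvar(weights, asset_returns, alpha=0.95):
    """CVaR of a portfolio; asset_returns has one row per period, one column per asset."""
    portfolio_returns = np.asarray(asset_returns, dtype=float) @ np.asarray(weights, dtype=float)
    return empirical_cvar(portfolio_returns, alpha)[1]
```

    Because no distribution is fitted, this estimate does not rely on normality of returns, which is the motivation stated in the abstract. An optimizer would then search over `weights` subject to a target expected return.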

  20. The Support Reduction Algorithm for Computing Non-Parametric Function Estimates in Mixture Models

    OpenAIRE

    GROENEBOOM, PIET; JONGBLOED, GEURT; WELLNER, JON A.

    2008-01-01

    In this paper, we study an algorithm (which we call the support reduction algorithm) that can be used to compute non-parametric M-estimators in mixture models. The algorithm is compared with natural competitors in the context of convex regression and the ‘Aspect problem’ in quantum physics.

  1. Partial Deconvolution with Inaccurate Blur Kernel.

    Science.gov (United States)

    Ren, Dongwei; Zuo, Wangmeng; Zhang, David; Xu, Jun; Zhang, Lei

    2017-10-17

    Most non-blind deconvolution methods are developed under the error-free kernel assumption, and are not robust to inaccurate blur kernel. Unfortunately, despite the great progress in blind deconvolution, estimation error remains inevitable during blur kernel estimation. Consequently, severe artifacts such as ringing effects and distortions are likely to be introduced in the non-blind deconvolution stage. In this paper, we tackle this issue by suggesting: (i) a partial map in the Fourier domain for modeling kernel estimation error, and (ii) a partial deconvolution model for robust deblurring with inaccurate blur kernel. The partial map is constructed by detecting the reliable Fourier entries of estimated blur kernel. And partial deconvolution is applied to wavelet-based and learning-based models to suppress the adverse effect of kernel estimation error. Furthermore, an E-M algorithm is developed for estimating the partial map and recovering the latent sharp image alternatively. Experimental results show that our partial deconvolution model is effective in relieving artifacts caused by inaccurate blur kernel, and can achieve favorable deblurring quality on synthetic and real blurry images.

  2. Semi-nonparametric estimates of interfuel substitution in US energy demand

    Energy Technology Data Exchange (ETDEWEB)

    Serletis, A.; Shahmoradi, A. [University of Calgary, Calgary, AB (Canada). Dept. of Economics]

    2008-09-15

    This paper focuses on the demand for crude oil, natural gas, and coal in the United States in the context of two globally flexible functional forms - the Fourier and the Asymptotically Ideal Model (AIM) - estimated subject to full regularity, using methods suggested over 20 years ago by Gallant and Golub (Gallant, A. Ronald and Golub, Gene H. Imposing Curvature Restrictions on Flexible Functional Forms. Journal of Econometrics 26 (1984), 295-321) and recently used by Serletis and Shahmoradi (Serletis, A., Shahmoradi, A., 2005. Semi-nonparametric estimates of the demand for money in the United States. Macroeconomic Dynamics 9, 542-559) in the monetary demand systems literature. We provide a comparison in terms of a full set of elasticities and also a policy perspective, using (for the first time) parameter estimates that are consistent with global regularity.

  3. Non-parametric estimation of the availability in a general repairable system

    International Nuclear Information System (INIS)

    Gamiz, M.L.; Roman, Y.

    2008-01-01

    This work deals with repairable systems with unknown failure and repair time distributions. We focus on the estimation of the instantaneous availability, that is, the probability that the system is functioning at a given time, which we consider as the most significant measure for evaluating the effectiveness of a repairable system. The estimation of the availability function is not, in general, an easy task, i.e., analytical techniques are difficult to apply. We propose a smooth estimation of the availability based on kernel estimator of the cumulative distribution functions (CDF) of the failure and repair times, for which the bandwidth parameters are obtained by bootstrap procedures. The consistency properties of the availability estimator are established by using techniques based on the Laplace transform

  4. Non-parametric estimation of the availability in a general repairable system

    Energy Technology Data Exchange (ETDEWEB)

    Gamiz, M.L. [Departamento de Estadistica e I.O., Facultad de Ciencias, Universidad de Granada, Granada 18071 (Spain)], E-mail: mgamiz@ugr.es; Roman, Y. [Departamento de Estadistica e I.O., Facultad de Ciencias, Universidad de Granada, Granada 18071 (Spain)

    2008-08-15

    This work deals with repairable systems with unknown failure and repair time distributions. We focus on the estimation of the instantaneous availability, that is, the probability that the system is functioning at a given time, which we consider as the most significant measure for evaluating the effectiveness of a repairable system. The estimation of the availability function is not, in general, an easy task, i.e., analytical techniques are difficult to apply. We propose a smooth estimation of the availability based on kernel estimator of the cumulative distribution functions (CDF) of the failure and repair times, for which the bandwidth parameters are obtained by bootstrap procedures. The consistency properties of the availability estimator are established by using techniques based on the Laplace transform.
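
    The kernel estimator of a CDF mentioned in the abstract above can be sketched as follows. This is a minimal illustration assuming a Gaussian kernel and a fixed, user-supplied bandwidth `h`; the bootstrap bandwidth selection and the Laplace-transform step that yields the availability function are not reproduced, and the name `kernel_cdf` is hypothetical.

```python
import math
import numpy as np

def kernel_cdf(sample, t, h):
    """Smooth CDF estimate F(t) = mean_i Phi((t - X_i) / h) with a Gaussian kernel.

    sample : observed failure (or repair) times
    t      : scalar or array of evaluation points
    h      : bandwidth (assumed given here; the paper selects it by bootstrap)
    """
    sample = np.asarray(sample, dtype=float)
    t = np.atleast_1d(np.asarray(t, dtype=float))
    z = (t[:, None] - sample[None, :]) / h
    # Standard normal CDF written with the error function.
    phi = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    return phi.mean(axis=1)
```

    Unlike the empirical CDF, this estimate is smooth in `t`, which is what allows a smooth availability estimate to be built from the failure-time and repair-time CDFs.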

  5. Smoothed Conditional Scale Function Estimation in AR(1)-ARCH(1) Processes

    Directory of Open Access Journals (Sweden)

    Lema Logamou Seknewna

    2018-01-01

    Full Text Available The estimation of the Smoothed Conditional Scale Function for time series was carried out under conditional heteroscedastic innovations by imitating the kernel smoothing in the nonparametric QAR-QARCH scheme. The estimation was based on the quantile regression methodology proposed by Koenker and Bassett. The asymptotic properties of the Conditional Scale Function estimator for this type of process were proved and its consistency was shown.

  6. Nonparametric correlation models for portfolio allocation

    DEFF Research Database (Denmark)

    Aslanidis, Nektarios; Casas, Isabel

    2013-01-01

    This article proposes time-varying nonparametric and semiparametric estimators of the conditional cross-correlation matrix in the context of portfolio allocation. Simulation results show that the nonparametric and semiparametric models are best in DGPs with substantial variability or structural ...... currencies. Results show the nonparametric model generally dominates the others when evaluating in-sample. However, the semiparametric model is best for out-of-sample analysis....

  7. Subsampling Realised Kernels

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Hansen, Peter Reinhard; Lunde, Asger

    2011-01-01

    In a recent paper we have introduced the class of realised kernel estimators of the increments of quadratic variation in the presence of noise. We showed that this estimator is consistent and derived its limit distribution under various assumptions on the kernel weights. In this paper we extend our...... that subsampling is impotent, in the sense that subsampling has no effect on the asymptotic distribution. Perhaps surprisingly, for the efficient smooth kernels, such as the Parzen kernel, we show that subsampling is harmful as it increases the asymptotic variance. We also study the performance of subsampled...

  8. Nonparametric estimation for censored mixture data with application to the Cooperative Huntington’s Observational Research Trial

    Science.gov (United States)

    Wang, Yuanjia; Garcia, Tanya P.; Ma, Yanyuan

    2012-01-01

    This work presents methods for estimating genotype-specific distributions from genetic epidemiology studies where the event times are subject to right censoring, the genotypes are not directly observed, and the data arise from a mixture of scientifically meaningful subpopulations. Examples of such studies include kin-cohort studies and quantitative trait locus (QTL) studies. Current methods for analyzing censored mixture data include two types of nonparametric maximum likelihood estimators (NPMLEs) which do not make parametric assumptions on the genotype-specific density functions. Although both NPMLEs are commonly used, we show that one is inefficient and the other inconsistent. To overcome these deficiencies, we propose three classes of consistent nonparametric estimators which do not assume parametric density models and are easy to implement. They are based on the inverse probability weighting (IPW), augmented IPW (AIPW), and nonparametric imputation (IMP). The AIPW achieves the efficiency bound without additional modeling assumptions. Extensive simulation experiments demonstrate satisfactory performance of these estimators even when the data are heavily censored. We apply these estimators to the Cooperative Huntington’s Observational Research Trial (COHORT), and provide age-specific estimates of the effect of mutation in the Huntington gene on mortality using a sample of family members. The close approximation of the estimated non-carrier survival rates to that of the U.S. population indicates small ascertainment bias in the COHORT family sample. Our analyses underscore an elevated risk of death in Huntington gene mutation carriers compared to non-carriers for a wide age range, and suggest that the mutation equally affects survival rates in both genders. 
    The estimated survival rates are useful in genetic counseling for providing guidelines on interpreting the risk of death associated with a positive genetic testing, and in facilitating future subjects at risk

  9. Nonparametric autocovariance estimation from censored time series by Gaussian imputation.

    Science.gov (United States)

    Park, Jung Wook; Genton, Marc G; Ghosh, Sujit K

    2009-02-01

    One of the most frequently used methods to model the autocovariance function of a second-order stationary time series is to use the parametric framework of autoregressive and moving average models developed by Box and Jenkins. However, such parametric models, though very flexible, may not always be adequate to model autocovariance functions with sharp changes. Furthermore, if the data do not follow the parametric model and are censored at a certain value, the estimation results may not be reliable. We develop a Gaussian imputation method to estimate an autocovariance structure via nonparametric estimation of the autocovariance function in order to address both censoring and incorrect model specification. We demonstrate the effectiveness of the technique in terms of bias and efficiency with simulations under various rates of censoring and underlying models. We describe its application to a time series of silicon concentrations in the Arctic.

  10. Nonlinear Denoising and Analysis of Neuroimages With Kernel Principal Component Analysis and Pre-Image Estimation

    DEFF Research Database (Denmark)

    Rasmussen, Peter Mondrup; Abrahamsen, Trine Julie; Madsen, Kristoffer Hougaard

    2012-01-01

    We investigate the use of kernel principal component analysis (PCA) and the inverse problem known as pre-image estimation in neuroimaging: i) We explore kernel PCA and pre-image estimation as a means for image denoising as part of the image preprocessing pipeline. Evaluation of the denoising...... procedure is performed within a data-driven split-half evaluation framework. ii) We introduce manifold navigation for exploration of a nonlinear data manifold, and illustrate how pre-image estimation can be used to generate brain maps in the continuum between experimentally defined brain states/classes. We...

  11. Using Cochran's Z Statistic to Test the Kernel-Smoothed Item Response Function Differences between Focal and Reference Groups

    Science.gov (United States)

    Zheng, Yinggan; Gierl, Mark J.; Cui, Ying

    2010-01-01

    This study combined the kernel smoothing procedure and a nonparametric differential item functioning statistic--Cochran's Z--to statistically test the difference between the kernel-smoothed item response functions for reference and focal groups. Simulation studies were conducted to investigate the Type I error and power of the proposed…

  12. Nonparametric functional mapping of quantitative trait loci.

    Science.gov (United States)

    Yang, Jie; Wu, Rongling; Casella, George

    2009-03-01

    Functional mapping is a useful tool for mapping quantitative trait loci (QTL) that control dynamic traits. It incorporates mathematical aspects of biological processes into the mixture model-based likelihood setting for QTL mapping, thus increasing the power of QTL detection and the precision of parameter estimation. However, in many situations there is no obvious functional form and, in such cases, this strategy will not be optimal. Here we propose to use nonparametric function estimation, typically implemented with B-splines, to estimate the underlying functional form of phenotypic trajectories, and then construct a nonparametric test to find evidence of existing QTL. Using the representation of a nonparametric regression as a mixed model, the final test statistic is a likelihood ratio test. We consider two types of genetic maps: dense maps and general maps, and the power of nonparametric functional mapping is investigated through simulation studies and demonstrated by examples.

  13. Employment of kernel methods on wind turbine power performance assessment

    DEFF Research Database (Denmark)

    Skrimpas, Georgios Alexandros; Sweeney, Christian Walsted; Marhadi, Kun S.

    2015-01-01

    A power performance assessment technique is developed for the detection of power production discrepancies in wind turbines. The method employs a widely used nonparametric pattern recognition technique, the kernel methods. The evaluation is based on the trending of an extracted feature from...... the kernel matrix, called similarity index, which is introduced by the authors for the first time. The operation of the turbine and consequently the computation of the similarity indexes is classified into five power bins offering better resolution and thus more consistent root cause analysis. The accurate...

  14. Hamilton's gradient estimate for the heat kernel on complete manifolds

    OpenAIRE

    Kotschwar, Brett

    2007-01-01

    In this paper we extend a gradient estimate of R. Hamilton for positive solutions to the heat equation on closed manifolds to bounded positive solutions on complete, non-compact manifolds with $Rc \geq -Kg$. We accomplish this extension via a maximum principle of L. Karp and P. Li and a Bernstein-type estimate on the gradient of the solution. An application of our result, together with the bounds of P. Li and S.T. Yau, yields an estimate on the gradient of the heat kernel for complete manifol...

  15. A Cure for Variance Inflation in High Dimensional Kernel Principal Component Analysis

    DEFF Research Database (Denmark)

    Abrahamsen, Trine Julie; Hansen, Lars Kai

    2011-01-01

    Small sample high-dimensional principal component analysis (PCA) suffers from variance inflation and lack of generalizability. It has earlier been pointed out that a simple leave-one-out variance renormalization scheme can cure the problem. In this paper we generalize the cure in two directions......: First, we propose a computationally less intensive approximate leave-one-out estimator, secondly, we show that variance inflation is also present in kernel principal component analysis (kPCA) and we provide a non-parametric renormalization scheme which can quite efficiently restore generalizability in kPCA....... As for PCA our analysis also suggests a simplified approximate expression. © 2011 Trine J. Abrahamsen and Lars K. Hansen....

  16. Probit vs. semi-nonparametric estimation: examining the role of disability on institutional entry for older adults.

    Science.gov (United States)

    Sharma, Andy

    2017-06-01

    The purpose of this study was to showcase an advanced methodological approach to model disability and institutional entry. Both of these are important areas to investigate given the on-going aging of the United States population. By 2020, approximately 15% of the population will be 65 years and older. Many of these older adults will experience disability and require formal care. A probit analysis was employed to determine which disabilities were associated with admission into an institution (i.e. long-term care). Since this framework imposes strong distributional assumptions, misspecification leads to inconsistent estimators. To overcome such a short-coming, this analysis extended the probit framework by employing an advanced semi-nonparametric maximum likelihood estimation utilizing Hermite polynomial expansions. Specification tests show semi-nonparametric estimation is preferred over probit. In terms of the estimates, semi-nonparametric ratios equal 42 for cognitive difficulty, 64 for independent living, and 111 for self-care disability while probit yields much smaller estimates of 19, 30, and 44, respectively. Public health professionals can use these results to better understand why certain interventions have not shown promise. Equally important, healthcare workers can use this research to evaluate which type of treatment plans may delay institutionalization and improve the quality of life for older adults. Implications for rehabilitation: With on-going global aging, understanding the association between disability and institutional entry is important in devising successful rehabilitation interventions. Semi-nonparametric is preferred to probit and shows ambulatory and cognitive impairments present high risk for institutional entry (long-term care). Informal caregiving and home-based care require further examination as forms of rehabilitation/therapy for certain types of disabilities.

  17. Nonparametric adaptive estimation of linear functionals for low frequency observed Lévy processes

    OpenAIRE

    Kappus, Johanna

    2012-01-01

    For a Lévy process X having finite variation on compact sets and finite first moments, µ(dx) = xν(dx) is a finite signed measure which completely describes the jump dynamics. We construct kernel estimators for linear functionals of µ and provide rates of convergence under regularity assumptions. Moreover, we consider adaptive estimation via model selection and propose a new strategy for the data driven choice of the smoothing parameter.

  18. Semi-nonparametric estimates of interfuel substitution in U.S. energy demand

    Energy Technology Data Exchange (ETDEWEB)

    Serletis, Apostolos [Department of Economics, University of Calgary, Calgary, Alberta (Canada); Shahmoradi, Asghar [Faculty of Economics, University of Tehran, Tehran (Iran)

    2008-09-15

    This paper focuses on the demand for crude oil, natural gas, and coal in the United States in the context of two globally flexible functional forms - the Fourier and the Asymptotically Ideal Model (AIM) - estimated subject to full regularity, using methods suggested over 20 years ago by Gallant and Golub [Gallant, A. Ronald and Golub, Gene H. Imposing Curvature Restrictions on Flexible Functional Forms. Journal of Econometrics 26 (1984), 295-321] and recently used by Serletis and Shahmoradi [Serletis, A., Shahmoradi, A., 2005. Semi-nonparametric estimates of the demand for money in the United States. Macroeconomic Dynamics 9, 542-559] in the monetary demand systems literature. We provide a comparison in terms of a full set of elasticities and also a policy perspective, using (for the first time) parameter estimates that are consistent with global regularity. (author)

  19. Nonparametric estimation in an "illness-death" model when all transition times are interval censored

    DEFF Research Database (Denmark)

    Frydman, Halina; Gerds, Thomas; Grøn, Randi

    2013-01-01

    We develop nonparametric maximum likelihood estimation for the parameters of an irreversible Markov chain on states {0,1,2} from the observations with interval censored times of 0 → 1, 0 → 2 and 1 → 2 transitions. The distinguishing aspect of the data is that, in addition to all transition times ...

  20. Using kernel density estimates to investigate lymphatic filariasis in northeast Brazil

    Science.gov (United States)

    Medeiros, Zulma; Bonfim, Cristine; Brandão, Eduardo; Netto, Maria José Evangelista; Vasconcellos, Lucia; Ribeiro, Liany; Portugal, José Luiz

    2012-01-01

    After more than 10 years of the Global Program to Eliminate Lymphatic Filariasis (GPELF) in Brazil, advances have been seen, but the endemic disease persists as a public health problem. The aim of this study was to describe the spatial distribution of lymphatic filariasis in the municipality of Jaboatão dos Guararapes, Pernambuco, Brazil. An epidemiological survey was conducted in the municipality, and positive filariasis cases identified in this survey were georeferenced in point form, using the GPS. A kernel intensity estimator was applied to identify clusters with greater intensity of cases. We examined 23 673 individuals and 323 individuals with microfilaremia were identified, representing a mean prevalence rate of 1.4%. Around 88% of the districts surveyed presented cases of filarial infection, with prevalences of 0–5.6%. The male population was more affected by the infection, with 63.8% of the cases (P<0.005). Positive cases were found in all age groups examined. The kernel intensity estimator identified the areas of greatest intensity and least intensity of filarial infection cases. The case distribution was heterogeneous across the municipality. The kernel estimator identified spatial clusters of cases, thus indicating locations with greater intensity of transmission. The main advantage of this type of analysis lies in its ability to rapidly and easily show areas with the highest concentration of cases, thereby contributing towards planning, monitoring, and surveillance of filariasis elimination actions. Incorporation of geoprocessing and spatial analysis techniques constitutes an important tool for use within the GPELF. PMID:22943547
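
    A kernel intensity estimator of the kind described in the abstract above can be sketched as follows. This is a minimal illustration assuming planar (projected) coordinates, an isotropic Gaussian kernel, and a fixed bandwidth `h`; the study's actual kernel, bandwidth, and GIS software are not stated in the abstract, and the name `kernel_intensity` is hypothetical.

```python
import numpy as np

def kernel_intensity(points, grid, h):
    """Estimate the intensity (cases per unit area) at each grid location.

    points : (n, 2) array of georeferenced case coordinates
    grid   : (m, 2) array of locations where the intensity is evaluated
    h      : bandwidth, in the same units as the coordinates (assumed given)
    """
    points = np.asarray(points, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Squared distance from every grid location to every case.
    d2 = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    # Each case contributes a 2-D Gaussian bump that integrates to one,
    # so the surface integrates to the number of cases over the plane.
    return np.exp(-d2 / (2.0 * h ** 2)).sum(axis=1) / (2.0 * np.pi * h ** 2)
```

    Locations where the returned values are largest correspond to the clusters of greater case intensity that the abstract refers to.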

  1. Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines

    Science.gov (United States)

    Malley, J. D.; Kruppa, J.; Dasgupta, A.; Malley, K. G.; Ziegler, A.

    2011-01-01

    Summary. Background: Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. Objectives: The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Methods: Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Results: Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Conclusions: Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications. PMID:21915433
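
    The idea in the abstract above, that a consistent nonparametric regression method applied to a 0/1 response yields probability estimates, can be sketched with a plain k-nearest-neighbor average. This is a minimal illustration and not the paper's R code; the function name `knn_probability` and the use of Euclidean distance are assumptions.

```python
import numpy as np

def knn_probability(X_train, y_train, X_query, k):
    """Estimate P(y = 1 | x) as the mean 0/1 response of the k nearest neighbors.

    X_train : (n, d) array of features
    y_train : (n,) array of binary responses coded 0/1
    X_query : (m, d) array of points at which to estimate the probability
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_query = np.atleast_2d(np.asarray(X_query, dtype=float))
    # Squared Euclidean distance from each query point to each training point.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    # Regressing on the 0/1 response: the neighborhood mean is a probability.
    return y_train[nearest].mean(axis=1)
```

    The output is a value in [0, 1] per query point rather than a class label, which is the distinction between a probability machine and a classifier drawn in the abstract.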

  2. Optimized Kernel Entropy Components.

    Science.gov (United States)

    Izquierdo-Verdiguier, Emma; Laparra, Valero; Jenssen, Robert; Gomez-Chova, Luis; Camps-Valls, Gustau

    2017-06-01

    This brief addresses two main issues of the standard kernel entropy component analysis (KECA) algorithm: the optimization of the kernel decomposition and the optimization of the Gaussian kernel parameter. KECA roughly reduces to a sorting of the importance of kernel eigenvectors by entropy instead of variance, as in the kernel principal components analysis. In this brief, we propose an extension of the KECA method, named optimized KECA (OKECA), that directly extracts the optimal features retaining most of the data entropy by means of compacting the information in very few features (often in just one or two). The proposed method produces features which have higher expressive power. In particular, it is based on the independent component analysis framework, and introduces an extra rotation to the eigen decomposition, which is optimized via gradient-ascent search. This maximum entropy preservation suggests that OKECA features are more efficient than KECA features for density estimation. In addition, a critical issue in both the methods is the selection of the kernel parameter, since it critically affects the resulting performance. Here, we analyze the most common kernel length-scale selection criteria. The results of both the methods are illustrated in different synthetic and real problems. Results show that OKECA returns projections with more expressive power than KECA, the most successful rule for estimating the kernel parameter is based on maximum likelihood, and OKECA is more robust to the selection of the length-scale parameter in kernel density estimation.

  3. Automated voxelization of 3D atom probe data through kernel density estimation

    International Nuclear Information System (INIS)

    Srinivasan, Srikant; Kaluskar, Kaustubh; Dumpala, Santoshrupa; Broderick, Scott; Rajan, Krishna

    2015-01-01

    Identifying nanoscale chemical features from atom probe tomography (APT) data routinely involves adjustment of voxel size as an input parameter, through visual supervision, making the final outcome user dependent, reliant on heuristic knowledge and potentially prone to error. This work utilizes kernel density estimators to select an optimal voxel size in an unsupervised manner to perform feature selection, in particular targeting resolution of interfacial features and chemistries. The capability of this approach is demonstrated through analysis of the γ / γ’ interface in a Ni–Al–Cr superalloy. - Highlights: • Develop approach for standardizing aspects of atom probe reconstruction. • Use kernel density estimators to select optimal voxel sizes in an unsupervised manner. • Perform interfacial analysis of Ni–Al–Cr superalloy, using new automated approach. • Optimize voxel size to preserve the feature of interest and minimize loss / noise.

  4. Estimation from PET data of transient changes in dopamine concentration induced by alcohol: support for a non-parametric signal estimation method

    Energy Technology Data Exchange (ETDEWEB)

    Constantinescu, C C; Yoder, K K; Normandin, M D; Morris, E D [Department of Radiology, Indiana University School of Medicine, Indianapolis, IN (United States); Kareken, D A [Department of Neurology, Indiana University School of Medicine, Indianapolis, IN (United States); Bouman, C A [Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN (United States); O' Connor, S J [Department of Psychiatry, Indiana University School of Medicine, Indianapolis, IN (United States)], E-mail: emorris@iupui.edu

    2008-03-07

    We previously developed a model-independent technique (non-parametric ntPET) for extracting the transient changes in neurotransmitter concentration from paired (rest and activation) PET studies with a receptor ligand. To provide support for our method, we introduced three hypotheses of validation based on work by Endres and Carson (1998 J. Cereb. Blood Flow Metab. 18 1196-210) and Yoder et al (2004 J. Nucl. Med. 45 903-11), and tested them on experimental data. All three hypotheses describe relationships between the estimated free (synaptic) dopamine curves (F^DA(t)) and the change in binding potential (ΔBP). The veracity of the F^DA(t) curves recovered by nonparametric ntPET is supported when the data adhere to the following hypothesized behaviors: (1) ΔBP should decline with increasing DA peak time, (2) ΔBP should increase as the strength of the temporal correlation between F^DA(t) and the free raclopride (F^RAC(t)) curve increases, (3) ΔBP should decline linearly with the effective weighted availability of the receptor sites. We analyzed regional brain data from 8 healthy subjects who received two [11C]raclopride scans: one at rest, and one during which unanticipated IV alcohol was administered to stimulate dopamine release. For several striatal regions, nonparametric ntPET was applied to recover F^DA(t), and binding potential values were determined. Kendall rank-correlation analysis confirmed that the F^DA(t) data followed the expected trends for all three validation hypotheses. Our findings lend credence to our model-independent estimates of F^DA(t). Application of nonparametric ntPET may yield important insights into how alterations in timing of dopaminergic neurotransmission are involved in the pathologies of addiction and other psychiatric disorders.

  5. Heat kernel estimates for pseudodifferential operators, fractional Laplacians and Dirichlet-to-Neumann operators

    DEFF Research Database (Denmark)

    Gimperlein, Heiko; Grubb, Gerd

    2014-01-01

    The purpose of this article is to establish upper and lower estimates for the integral kernel of the semigroup exp(−t P) associated to a classical, strongly elliptic pseudodifferential operator P of positive order on a closed manifold. The Poissonian bounds generalize those obtained for perturbations of fractional powers of the Laplacian. In the selfadjoint case, extensions to t∈C+ are studied. In particular, our results apply to the Dirichlet-to-Neumann semigroup.

  6. Non-parametric PSF estimation from celestial transit solar images using blind deconvolution

    Directory of Open Access Journals (Sweden)

    González Adriana

    2016-01-01

    Full Text Available Context: Characterization of instrumental effects in astronomical imaging is important in order to extract accurate physical information from the observations. The measured image in a real optical instrument is usually represented by the convolution of an ideal image with a Point Spread Function (PSF). Additionally, the image acquisition process is also contaminated by other sources of noise (read-out, photon-counting). The problem of estimating both the PSF and a denoised image is called blind deconvolution and is ill-posed. Aims: We propose a blind deconvolution scheme that relies on image regularization. Contrary to most methods presented in the literature, our method does not assume a parametric model of the PSF and can thus be applied to any telescope. Methods: Our scheme uses a wavelet analysis prior model on the image and weak assumptions on the PSF. We use observations from a celestial transit, where the occulting body can be assumed to be a black disk. These constraints allow us to retain meaningful solutions for the filter and the image, eliminating trivial, translated, and interchanged solutions. Under an additive Gaussian noise assumption, they also enforce noise canceling and avoid reconstruction artifacts by promoting the whiteness of the residual between the blurred observations and the cleaned data. Results: Our method is applied to synthetic and experimental data. The PSF is estimated for the SECCHI/EUVI instrument using the 2007 Lunar transit, and for SDO/AIA using the 2012 Venus transit. Results show that the proposed non-parametric blind deconvolution method is able to estimate the core of the PSF with a similar quality to parametric methods proposed in the literature. We also show that, if these parametric estimations are incorporated in the acquisition model, the resulting PSF outperforms both the parametric and non-parametric methods.

  7. Bayesian Kernel Mixtures for Counts.

    Science.gov (United States)

    Canale, Antonio; Dunson, David B

    2011-12-01

    Although Bayesian nonparametric mixture models for continuous data are well developed, there is a limited literature on related approaches for count data. A common strategy is to use a mixture of Poissons, which unfortunately is quite restrictive in not accounting for distributions having variance less than the mean. Other approaches include mixing multinomials, which requires finite support, and using a Dirichlet process prior with a Poisson base measure, which does not allow smooth deviations from the Poisson. As a broad class of alternative models, we propose to use nonparametric mixtures of rounded continuous kernels. An efficient Gibbs sampler is developed for posterior computation, and a simulation study is performed to assess performance. Focusing on the rounded Gaussian case, we generalize the modeling framework to account for multivariate count data, joint modeling with continuous and categorical variables, and other complications. The methods are illustrated through applications to a developmental toxicity study and marketing data. This article has supplementary material online.

  8. Nonparametric Inference for Periodic Sequences

    KAUST Repository

    Sun, Ying

    2012-02-01

    This article proposes a nonparametric method for estimating the period and values of a periodic sequence when the data are evenly spaced in time. The period is estimated by a "leave-out-one-cycle" version of cross-validation (CV) and complements the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator of integer periods. This estimator is investigated both theoretically and by simulation. We also propose a nonparametric test of the null hypothesis that the data have constant mean against the alternative that the sequence of means is periodic. Finally, our methodology is demonstrated on three well-known time series: the sunspots and lynx trapping data, and the El Niño series of sea surface temperatures. © 2012 American Statistical Association and the American Society for Quality.
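
    A simplified reading of the cross-validation idea can be sketched as follows. Each observation is predicted by the mean of the observations at the same phase in the other cycles, and the candidate period with the smallest squared prediction error is chosen. The exact criterion in the article may differ; the function names, the toy sequence and the candidate range 1 to 5 are assumptions made for this example.

```python
import random

def cv_score(y, q):
    """Mean squared error when each y[i] is predicted from the same phase in other cycles."""
    n = len(y)
    total, count = 0.0, 0
    for i in range(n):
        # Same phase (index mod q), excluding the cycle that contains observation i.
        others = [y[j] for j in range(i % q, n, q) if j // q != i // q]
        if others:
            pred = sum(others) / len(others)
            total += (y[i] - pred) ** 2
            count += 1
    return total / count if count else float("inf")

# Hypothetical sequence: true period 3 with small bounded noise.
rng = random.Random(0)
base = [0.0, 5.0, 10.0]
y = [base[i % 3] + rng.uniform(-0.1, 0.1) for i in range(30)]
best_period = min(range(1, 6), key=lambda q: cv_score(y, q))
```

    With noisy data, a multiple of the true period averages over fewer cycles per phase, so its prediction error tends to be larger; this is one way to see the implicit penalty on multiples that the abstract mentions.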

  9. Analog forecasting with dynamics-adapted kernels

    Science.gov (United States)

    Zhao, Zhizhen; Giannakis, Dimitrios

    2016-09-01

    Analog forecasting is a nonparametric technique introduced by Lorenz in 1969 which predicts the evolution of states of a dynamical system (or observables defined on the states) by following the evolution of the sample in a historical record of observations which most closely resembles the current initial data. Here, we introduce a suite of forecasting methods which improve traditional analog forecasting by combining ideas from kernel methods developed in harmonic analysis and machine learning and state-space reconstruction for dynamical systems. A key ingredient of our approach is to replace single-analog forecasting with weighted ensembles of analogs constructed using local similarity kernels. The kernels used here employ a number of dynamics-dependent features designed to improve forecast skill, including Takens’ delay-coordinate maps (to recover information in the initial data lost through partial observations) and a directional dependence on the dynamical vector field generating the data. Mathematically, our approach is closely related to kernel methods for out-of-sample extension of functions, and we discuss alternative strategies based on the Nyström method and the multiscale Laplacian pyramids technique. We illustrate these techniques in applications to forecasting in a low-order deterministic model for atmospheric dynamics with chaotic metastability, and interannual-scale forecasting in the North Pacific sector of a comprehensive climate model. We find that forecasts based on kernel-weighted ensembles have significantly higher skill than the conventional approach following a single analog.
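
    The step from single-analog forecasting to a kernel-weighted ensemble of analogs can be sketched as below. This minimal version uses a plain Gaussian kernel on scalar states; the delay-coordinate maps and directional kernels described in the abstract are not implemented, and the function name, bandwidth and toy dynamics are assumptions for illustration.

```python
import math

def kernel_analog_forecast(history, current, lead, bandwidth):
    """Forecast `lead` steps ahead as a kernel-weighted mean of the successors of past states."""
    num = den = 0.0
    for t in range(len(history) - lead):
        # Similarity between the past state and the current initial data.
        w = math.exp(-((history[t] - current) ** 2) / (2.0 * bandwidth ** 2))
        num += w * history[t + lead]
        den += w
    # den is positive whenever at least one past state lies within a few bandwidths.
    return num / den

# Hypothetical historical record from the map x -> 0.5 * x + 1.
history = [0.0]
for _ in range(29):
    history.append(0.5 * history[-1] + 1.0)

forecast = kernel_analog_forecast(history, current=history[5], lead=1, bandwidth=0.01)
```

    Letting the bandwidth shrink toward zero recovers the conventional approach of following the single closest analog.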

  10. On Improving Density Estimators which are not Bona Fide Functions

    OpenAIRE

    Gajek, Leslaw

    1986-01-01

    In order to improve the rate of decrease of the IMSE for nonparametric kernel density estimators with nonrandom bandwidth beyond $O(n^{-4/5})$ all current methods must relax the constraint that the density estimate be a bona fide function, that is, be nonnegative and integrate to one. In this paper we show how to achieve similar improvement without relaxing any of these constraints. The method can also be applied for orthogonal series, adaptive orthogonal series, spline, jackknife, and other ...

  11. A nonparametric dynamic additive regression model for longitudinal data

    DEFF Research Database (Denmark)

    Martinussen, T.; Scheike, T. H.

    2000-01-01

    dynamic linear models, estimating equations, least squares, longitudinal data, nonparametric methods, partly conditional mean models, time-varying-coefficient models

  12. Estimation of the applicability domain of kernel-based machine learning models for virtual screening

    Directory of Open Access Journals (Sweden)

    Fechner Nikolas

    2010-03-01

    Full Text Available Abstract Background The virtual screening of large compound databases is an important application of structural-activity relationship models. Due to the high structural diversity of these data sets, it is impossible for machine learning based QSAR models, which rely on a specific training set, to give reliable results for all compounds. Thus, it is important to consider the subset of the chemical space in which the model is applicable. The approaches to this problem that have been published so far mostly use vectorial descriptor representations to define this domain of applicability of the model. Unfortunately, these cannot be extended easily to structured kernel-based machine learning models. For this reason, we propose three approaches to estimate the domain of applicability of a kernel-based QSAR model. Results We evaluated three kernel-based applicability domain estimations using three different structured kernels on three virtual screening tasks. Each experiment consisted of the training of a kernel-based QSAR model using support vector regression and the ranking of a disjoint screening data set according to the predicted activity. For each prediction, the applicability of the model for the respective compound is quantitatively described using a score obtained by an applicability domain formulation. The suitability of the applicability domain estimation is evaluated by comparing the model performance on the subsets of the screening data sets obtained by different thresholds for the applicability scores. This comparison indicates that it is possible to separate the part of the chemspace, in which the model gives reliable predictions, from the part consisting of structures too dissimilar to the training set to apply the model successfully. 
A closer inspection reveals that the virtual screening performance of the model is considerably improved if half of the molecules, those with the lowest applicability scores, are omitted from the screening

  13. Estimation of the applicability domain of kernel-based machine learning models for virtual screening.

    Science.gov (United States)

    Fechner, Nikolas; Jahn, Andreas; Hinselmann, Georg; Zell, Andreas

    2010-03-11

    The virtual screening of large compound databases is an important application of structural-activity relationship models. Due to the high structural diversity of these data sets, it is impossible for machine learning based QSAR models, which rely on a specific training set, to give reliable results for all compounds. Thus, it is important to consider the subset of the chemical space in which the model is applicable. The approaches to this problem that have been published so far mostly use vectorial descriptor representations to define this domain of applicability of the model. Unfortunately, these cannot be extended easily to structured kernel-based machine learning models. For this reason, we propose three approaches to estimate the domain of applicability of a kernel-based QSAR model. We evaluated three kernel-based applicability domain estimations using three different structured kernels on three virtual screening tasks. Each experiment consisted of the training of a kernel-based QSAR model using support vector regression and the ranking of a disjoint screening data set according to the predicted activity. For each prediction, the applicability of the model for the respective compound is quantitatively described using a score obtained by an applicability domain formulation. The suitability of the applicability domain estimation is evaluated by comparing the model performance on the subsets of the screening data sets obtained by different thresholds for the applicability scores. This comparison indicates that it is possible to separate the part of the chemspace, in which the model gives reliable predictions, from the part consisting of structures too dissimilar to the training set to apply the model successfully. A closer inspection reveals that the virtual screening performance of the model is considerably improved if half of the molecules, those with the lowest applicability scores, are omitted from the screening. 
The proposed applicability domain formulations

  14. Nonparametric estimation of age-specific reference percentile curves with radial smoothing.

    Science.gov (United States)

    Wan, Xiaohai; Qu, Yongming; Huang, Yao; Zhang, Xiao; Song, Hanping; Jiang, Honghua

    2012-01-01

    Reference percentile curves represent the covariate-dependent distribution of a quantitative measurement and are often used to summarize and monitor dynamic processes such as human growth. We propose a new nonparametric method based on a radial smoothing (RS) technique to estimate age-specific reference percentile curves assuming the underlying distribution is relatively close to normal. We compared the RS method with both the LMS and the generalized additive models for location, scale and shape (GAMLSS) methods using simulated data and found that our method has smaller estimation error than the two existing methods. We also applied the new method to analyze height growth data from children being followed in a clinical observational study of growth hormone treatment, and compared the growth curves between those with growth disorders and the general population. Copyright © 2011 Elsevier Inc. All rights reserved.

  15. Bayesian Nonparametric Model for Estimating Multistate Travel Time Distribution

    Directory of Open Access Journals (Sweden)

    Emmanuel Kidando

    2017-01-01

    Full Text Available Multistate models, that is, models with more than two distributions, are preferred over single-state probability models in modeling the distribution of travel time. Literature review indicated that the finite multistate modeling of travel time using lognormal distribution is superior to other probability functions. In this study, we extend the finite multistate lognormal model of estimating the travel time distribution to unbounded lognormal distribution. In particular, a nonparametric Dirichlet Process Mixture Model (DPMM) with stick-breaking process representation was used. The strength of the DPMM is that it can choose the number of components dynamically as part of the algorithm during parameter estimation. To reduce computational complexity, the modeling process was limited to a maximum of six components. Then, the Markov Chain Monte Carlo (MCMC) sampling technique was employed to estimate the parameters’ posterior distribution. Speed data from nine links of a freeway corridor, aggregated on a 5-minute basis, were used to calculate the corridor travel time. The results demonstrated that this model offers significant flexibility in modeling to account for complex mixture distributions of the travel time without specifying the number of components. The DPMM modeling further revealed that freeway travel time is characterized by multistate or single-state models depending on the inclusion of onset and offset of congestion periods.
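
    The stick-breaking representation with a cap of six components can be illustrated with a short sketch. It draws only the mixture weights from the prior; the lognormal component parameters and the MCMC posterior sampling used in the study are not shown. The concentration value alpha = 1.0 and the function name are assumptions for this example.

```python
import random

def stick_breaking_weights(alpha, max_components, rng):
    """Truncated stick-breaking: v_k ~ Beta(1, alpha), w_k = v_k * prod_{j<k} (1 - v_j)."""
    weights, remaining = [], 1.0
    for _ in range(max_components - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick to break off
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)  # last component takes the leftover so the weights sum to one
    return weights

weights = stick_breaking_weights(alpha=1.0, max_components=6, rng=random.Random(0))
```

    Components that receive negligible weight are effectively switched off, which is how a truncated model of this kind can settle on fewer than six states for a given data set.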

  16. A Structural Labor Supply Model with Nonparametric Preferences

    NARCIS (Netherlands)

    van Soest, A.H.O.; Das, J.W.M.; Gong, X.

    2000-01-01

    Nonparametric techniques are usually seen as a statistical device for data description and exploration, and not as a tool for estimating models with a richer economic structure, which are often required for policy analysis. This paper presents an example where nonparametric flexibility can be attained

  17. Nonparametric estimates of drift and diffusion profiles via Fokker-Planck algebra.

    Science.gov (United States)

    Lund, Steven P; Hubbard, Joseph B; Halter, Michael

    2014-11-06

    Diffusion processes superimposed upon deterministic motion play a key role in understanding and controlling the transport of matter, energy, momentum, and even information in physics, chemistry, material science, biology, and communications technology. Given functions defining these random and deterministic components, the Fokker-Planck (FP) equation is often used to model these diffusive systems. Many methods exist for estimating the drift and diffusion profiles from one or more identifiable diffusive trajectories; however, when many identical entities diffuse simultaneously, it may not be possible to identify individual trajectories. Here we present a method capable of simultaneously providing nonparametric estimates for both drift and diffusion profiles from evolving density profiles, requiring only the validity of Langevin/FP dynamics. This algebraic FP manipulation provides a flexible and robust framework for estimating stationary drift and diffusion coefficient profiles, is not based on fluctuation theory or solved diffusion equations, and may facilitate predictions for many experimental systems. We illustrate this approach on experimental data obtained from a model lipid bilayer system exhibiting free diffusion and electric field induced drift. The wide range over which this approach provides accurate estimates for drift and diffusion profiles is demonstrated through simulation.

  18. Nonparametric Bayesian inference for multidimensional compound Poisson processes

    NARCIS (Netherlands)

    Gugushvili, S.; van der Meulen, F.; Spreij, P.

    2015-01-01

    Given a sample from a discretely observed multidimensional compound Poisson process, we study the problem of nonparametric estimation of its jump size density r0 and intensity λ0. We take a nonparametric Bayesian approach to the problem and determine posterior contraction rates in this context,

  19. Influence Function and Robust Variant of Kernel Canonical Correlation Analysis

    OpenAIRE

    Alam, Md. Ashad; Fukumizu, Kenji; Wang, Yu-Ping

    2017-01-01

    Many unsupervised kernel methods rely on the estimation of the kernel covariance operator (kernel CO) or kernel cross-covariance operator (kernel CCO). Both kernel CO and kernel CCO are sensitive to contaminated data, even when bounded positive definite kernels are used. To the best of our knowledge, there are few well-founded robust kernel methods for statistical unsupervised learning. In addition, while the influence function (IF) of an estimator can characterize its robustness, asymptotic ...

  20. An Adaptive Genetic Association Test Using Double Kernel Machines.

    Science.gov (United States)

    Zhan, Xiang; Epstein, Michael P; Ghosh, Debashis

    2015-10-01

    Recently, gene set-based approaches have become very popular in gene expression profiling studies for assessing how genetic variants are related to disease outcomes. Since most genes are not differentially expressed, existing pathway tests considering all genes within a pathway suffer from considerable noise and power loss. Moreover, for a differentially expressed pathway, it is of interest to select important genes that drive the effect of the pathway. In this article, we propose an adaptive association test using double kernel machines (DKM), which can both select important genes within the pathway as well as test for the overall genetic pathway effect. This DKM procedure first uses the garrote kernel machines (GKM) test for the purposes of subset selection and then the least squares kernel machine (LSKM) test for testing the effect of the subset of genes. An appealing feature of the kernel machine framework is that it can provide a flexible and unified method for multi-dimensional modeling of the genetic pathway effect allowing for both parametric and nonparametric components. This DKM approach is illustrated with application to simulated data as well as to data from a neuroimaging genetics study.

  1. Nonparametric estimation of the heterogeneity of a random medium using compound Poisson process modeling of wave multiple scattering.

    Science.gov (United States)

    Le Bihan, Nicolas; Margerin, Ludovic

    2009-07-01

    In this paper, we present a nonparametric method to estimate the heterogeneity of a random medium from the angular distribution of intensity of waves transmitted through a slab of random material. Our approach is based on the modeling of forward multiple scattering using compound Poisson processes on compact Lie groups. The estimation technique is validated through numerical simulations based on radiative transfer theory.

  2. Partially linear varying coefficient models stratified by a functional covariate

    KAUST Repository

    Maity, Arnab; Huang, Jianhua Z.

    2012-01-01

    We consider the problem of estimation in semiparametric varying coefficient models where the covariate modifying the varying coefficients is functional and is modeled nonparametrically. We develop a kernel-based estimator of the nonparametric

  3. ANALISIS MODEL REGRESI NONPARAMETRIK SIRKULAR-LINEAR BERGANDA (Analysis of Multiple Circular-Linear Nonparametric Regression Models)

    Directory of Open Access Journals (Sweden)

    KOMANG CANDRA IVAN

    2016-05-01

    Full Text Available Circular data are data whose values take the form of vectors on a circle. Such data are analyzed with circular statistics. When the predictor variables, the response variable, or both are circular, the regression analysis is called circular regression analysis. Observations measured in units of direction or time usually do not satisfy all of the parametric assumptions, which makes nonparametric regression a good alternative. The nonparametric regression function is estimated with an Epanechnikov kernel estimator for the linear variables and a von Mises kernel estimator for the circular variable. This study showed that analysis using circular descriptive statistics gives better results than ordinary statistics. Multiple circular-linear nonparametric regression with Epanechnikov and von Mises kernel estimators does not produce an explicit estimated model as parametric regression does; instead it produces estimates at the observation knots.
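
    One way to combine the two kernels named in this record is a product-kernel weighted average. The abstract does not state the form of the estimator, so the Nadaraya-Watson form below is an assumption, as are the function names, the smoothing parameters h and kappa, and the toy data.

```python
import math

def epanechnikov(u):
    """Epanechnikov kernel for the linear predictor; zero outside [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def von_mises_kernel(theta, center, kappa):
    """Unnormalized von Mises kernel for angles; the constant cancels in the ratio below."""
    return math.exp(kappa * math.cos(theta - center))

def nw_circular_linear(x, theta, y, x0, theta0, h, kappa):
    """Kernel-weighted mean of y at linear value x0 and angle theta0 (radians)."""
    num = den = 0.0
    for xi, ti, yi in zip(x, theta, y):
        w = epanechnikov((xi - x0) / h) * von_mises_kernel(ti, theta0, kappa)
        num += w * yi
        den += w
    if den == 0.0:
        raise ValueError("no observations within the bandwidth of x0")
    return num / den

# Hypothetical observations: linear predictor, circular predictor, response.
x = [0.0, 0.5, 1.0]
theta = [0.0, math.pi, 0.0]
y = [1.0, 100.0, 3.0]
estimate = nw_circular_linear(x, theta, y, x0=0.5, theta0=0.0, h=1.0, kappa=50.0)
```

    The estimate is computed point by point from the observations, which matches the abstract's remark that no explicit model equation is produced.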

  4. Two-component mixture cure rate model with spline estimated nonparametric components.

    Science.gov (United States)

    Wang, Lu; Du, Pang; Liang, Hua

    2012-09-01

    In some survival analysis of medical studies, there are often long-term survivors who can be considered as permanently cured. The goals in these studies are to estimate the noncured probability of the whole population and the hazard rate of the susceptible subpopulation. When covariates are present as often happens in practice, to understand covariate effects on the noncured probability and hazard rate is of equal importance. The existing methods are limited to parametric and semiparametric models. We propose a two-component mixture cure rate model with nonparametric forms for both the cure probability and the hazard rate function. Identifiability of the model is guaranteed by an additive assumption that allows no time-covariate interactions in the logarithm of hazard rate. Estimation is carried out by an expectation-maximization algorithm on maximizing a penalized likelihood. For inferential purpose, we apply the Louis formula to obtain point-wise confidence intervals for noncured probability and hazard rate. Asymptotic convergence rates of our function estimates are established. We then evaluate the proposed method by extensive simulations. We analyze the survival data from a melanoma study and find interesting patterns for this study. © 2011, The International Biometric Society.

  5. Support vector machines for nuclear reactor state estimation

    Energy Technology Data Exchange (ETDEWEB)

    Zavaljevski, N.; Gross, K. C.

    2000-02-14

    Validation of nuclear power reactor signals is often performed by comparing signal prototypes with the actual reactor signals. The signal prototypes are often computed based on empirical data. The implementation of an estimation algorithm which can make predictions on limited data is an important issue. A new machine learning algorithm called support vector machines (SVMs) recently developed by Vladimir Vapnik and his coworkers enables a high level of generalization with finite high-dimensional data. The improved generalization in comparison with standard methods like neural networks is due mainly to the following characteristics of the method. The input data space is transformed into a high-dimensional feature space using a kernel function, and the learning problem is formulated as a convex quadratic programming problem with a unique solution. In this paper the authors have applied the SVM method for data-based state estimation in nuclear power reactors. In particular, they implemented and tested kernels developed at Argonne National Laboratory for the Multivariate State Estimation Technique (MSET), a nonlinear, nonparametric estimation technique with a wide range of applications in nuclear reactors. The methodology has been applied to three data sets from experimental and commercial nuclear power reactor applications. The results are promising. The combination of MSET kernels with the SVM method has better noise reduction and generalization properties than the standard MSET algorithm.

  6. Support vector machines for nuclear reactor state estimation

    International Nuclear Information System (INIS)

    Zavaljevski, N.; Gross, K. C.

    2000-01-01

    Validation of nuclear power reactor signals is often performed by comparing signal prototypes with the actual reactor signals. The signal prototypes are often computed based on empirical data. The implementation of an estimation algorithm which can make predictions on limited data is an important issue. A new machine learning algorithm called support vector machines (SVMs) recently developed by Vladimir Vapnik and his coworkers enables a high level of generalization with finite high-dimensional data. The improved generalization in comparison with standard methods like neural networks is due mainly to the following characteristics of the method. The input data space is transformed into a high-dimensional feature space using a kernel function, and the learning problem is formulated as a convex quadratic programming problem with a unique solution. In this paper the authors have applied the SVM method for data-based state estimation in nuclear power reactors. In particular, they implemented and tested kernels developed at Argonne National Laboratory for the Multivariate State Estimation Technique (MSET), a nonlinear, nonparametric estimation technique with a wide range of applications in nuclear reactors. The methodology has been applied to three data sets from experimental and commercial nuclear power reactor applications. The results are promising. The combination of MSET kernels with the SVM method has better noise reduction and generalization properties than the standard MSET algorithm.

  7. A kernel-based approach to MIMO LPV state-space identification and application to a nonlinear process system

    NARCIS (Netherlands)

    Rizvi, S.Z.; Mohammadpour, J.; Toth, R.; Meskin, N.

    2015-01-01

    This paper first describes the development of a nonparametric identification method for linear parameter-varying (LPV) state-space models and then applies it to a nonlinear process system. The proposed method uses kernel-based least-squares support vector machines (LS-SVM). While parametric

  8. Parametric and Non-Parametric System Modelling

    DEFF Research Database (Denmark)

    Nielsen, Henrik Aalborg

    1999-01-01

    the focus is on combinations of parametric and non-parametric methods of regression. This combination can be in terms of additive models where e.g. one or more non-parametric term is added to a linear regression model. It can also be in terms of conditional parametric models where the coefficients...... considered. It is shown that adaptive estimation in conditional parametric models can be performed by combining the well known methods of local polynomial regression and recursive least squares with exponential forgetting. The approach used for estimation in conditional parametric models also highlights how...... networks is included. In this paper, neural networks are used for predicting the electricity production of a wind farm. The results are compared with results obtained using an adaptively estimated ARX-model. Finally, two papers on stochastic differential equations are included. In the first paper, among...

  9. On Wasserstein Two-Sample Testing and Related Families of Nonparametric Tests

    Directory of Open Access Journals (Sweden)

    Aaditya Ramdas

    2017-01-01

    Full Text Available Nonparametric two-sample or homogeneity testing is a decision theoretic problem that involves identifying differences between two random variables without making parametric assumptions about their underlying distributions. The literature is old and rich, with a wide variety of statistics having been designed and analyzed, both for the unidimensional and the multivariate setting. In this short survey, we focus on test statistics that involve the Wasserstein distance. Using an entropic smoothing of the Wasserstein distance, we connect these to very different tests including multivariate methods involving energy statistics and kernel based maximum mean discrepancy and univariate methods like the Kolmogorov–Smirnov test, probability or quantile (PP/QQ) plots and receiver operating characteristic or ordinal dominance (ROC/ODC) curves. Some observations are implicit in the literature, while others seem to have not been noticed thus far. Given nonparametric two-sample testing’s classical and continued importance, we aim to provide useful connections for theorists and practitioners familiar with one subset of methods but not others.
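
For readers who want a concrete handle on two of the univariate statistics named in this abstract, the following is a minimal sketch, not taken from the paper. It computes the 1-Wasserstein distance and the Kolmogorov-Smirnov statistic for two small samples. The function names and the sample values are illustrative assumptions, and the Wasserstein routine assumes equal sample sizes so that the optimal coupling simply matches order statistics.

```python
from bisect import bisect_right


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size univariate samples.

    With equal sample sizes the optimal coupling pairs order statistics,
    so the distance is the mean absolute difference of the sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic: sup over t of |F_x(t) - F_y(t)|."""
    sx, sy = sorted(xs), sorted(ys)
    gap = 0.0
    # The empirical CDFs are step functions, so the supremum is reached
    # at one of the pooled sample points.
    for t in sx + sy:
        fx = bisect_right(sx, t) / len(sx)
        fy = bisect_right(sy, t) / len(sy)
        gap = max(gap, abs(fx - fy))
    return gap


# Illustrative values only.
x = [0.1, 0.4, 0.7, 1.0]
y = [0.6, 0.9, 1.2, 1.5]
print(wasserstein_1d(x, y))
print(ks_statistic(x, y))
```

Both statistics compare the two empirical distribution functions: the Wasserstein distance integrates the gap between them, while the Kolmogorov-Smirnov statistic takes its maximum.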

  10. Stochastic semi-nonparametric frontier estimation of electricity distribution networks: Application of the StoNED method in the Finnish regulatory model

    International Nuclear Information System (INIS)

    Kuosmanen, Timo

    2012-01-01

    Electricity distribution network is a prime example of a natural local monopoly. In many countries, electricity distribution is regulated by the government. Many regulators apply frontier estimation techniques such as data envelopment analysis (DEA) or stochastic frontier analysis (SFA) as an integral part of their regulatory framework. While more advanced methods that combine nonparametric frontier with stochastic error term are known in the literature, in practice, regulators continue to apply simplistic methods. This paper reports the main results of the project commissioned by the Finnish regulator for further development of the cost frontier estimation in their regulatory framework. The key objectives of the project were to integrate a stochastic SFA-style noise term to the nonparametric, axiomatic DEA-style cost frontier, and to take the heterogeneity of firms and their operating environments better into account. To achieve these objectives, a new method called stochastic nonparametric envelopment of data (StoNED) was examined. Based on the insights and experiences gained in the empirical analysis using the real data of the regulated networks, the Finnish regulator adopted the StoNED method in use from 2012 onwards.

  11. Estimation and variable selection for generalized additive partial linear models

    KAUST Repository

    Wang, Li

    2011-08-01

    We study generalized additive partial linear models, proposing the use of polynomial spline smoothing for estimation of nonparametric functions, and deriving quasi-likelihood based estimators for the linear parameters. We establish asymptotic normality for the estimators of the parametric components. The procedure avoids solving large systems of equations as in kernel-based procedures and thus results in gains in computational simplicity. We further develop a class of variable selection procedures for the linear parameters by employing a nonconcave penalized quasi-likelihood, which is shown to have an asymptotic oracle property. Monte Carlo simulations and an empirical example are presented for illustration. © Institute of Mathematical Statistics, 2011.

  12. Calculation of solar irradiation prediction intervals combining volatility and kernel density estimates

    International Nuclear Information System (INIS)

    Trapero, Juan R.

    2016-01-01

    In order to integrate solar energy into the grid it is important to predict the solar radiation accurately, since forecast errors can lead to significant costs. Recently, the growing number of statistical approaches that address this problem has yielded a prolific literature. In general terms, the main research discussion is centred on selecting the “best” forecasting technique in accuracy terms. However, users of such forecasts need, apart from point forecasts, information about the variability of the forecasts in order to compute prediction intervals. In this work, we analyze kernel density estimation approaches, volatility forecasting models and combinations of both in order to improve prediction interval performance. The results show that an optimal combination in terms of prediction interval statistical tests can achieve the desired confidence level with a lower average interval width. Data from a facility located in Spain are used to illustrate our methodology. - Highlights: • This work explores uncertainty forecasting models to build prediction intervals. • Kernel density estimators, exponential smoothing and GARCH models are compared. • An optimal combination of methods provides the best results. • A good compromise between coverage and average interval width is shown.

  13. The Influence of Reconstruction Kernel on Bone Mineral and Strength Estimates Using Quantitative Computed Tomography and Finite Element Analysis.

    Science.gov (United States)

    Michalski, Andrew S; Edwards, W Brent; Boyd, Steven K

    2017-10-17

    Quantitative computed tomography has been posed as an alternative imaging modality to investigate osteoporosis. We examined the influence of computed tomography convolution back-projection reconstruction kernels on the analysis of bone quantity and estimated mechanical properties in the proximal femur. Eighteen computed tomography scans of the proximal femur were reconstructed using both a standard smoothing reconstruction kernel and a bone-sharpening reconstruction kernel. Following phantom-based density calibration, we calculated typical bone quantity outcomes of integral volumetric bone mineral density, bone volume, and bone mineral content. Additionally, we performed finite element analysis in a standard sideways fall on the hip loading configuration. Significant differences for all outcome measures, except integral bone volume, were observed between the 2 reconstruction kernels. Volumetric bone mineral density measured using images reconstructed by the standard kernel was significantly lower (6.7%, p kernel. Furthermore, the whole-bone stiffness and the failure load measured in images reconstructed by the standard kernel were significantly lower (16.5%, p kernel. These data suggest that for future quantitative computed tomography studies, a standardized reconstruction kernel will maximize reproducibility, independent of the use of a quantitative calibration phantom. Copyright © 2017 The International Society for Clinical Densitometry. Published by Elsevier Inc. All rights reserved.

  14. Nonparametric Identification and Estimation of Finite Mixture Models of Dynamic Discrete Choices

    OpenAIRE

    Hiroyuki Kasahara; Katsumi Shimotsu

    2006-01-01

    In dynamic discrete choice analysis, controlling for unobserved heterogeneity is an important issue, and finite mixture models provide flexible ways to account for unobserved heterogeneity. This paper studies nonparametric identifiability of type probabilities and type-specific component distributions in finite mixture models of dynamic discrete choices. We derive sufficient conditions for nonparametric identification for various finite mixture models of dynamic discrete choices used in appli...

  15. Flat-Top Realized Kernel Estimation of Quadratic Covariation with Non-Synchronous and Noisy Asset Prices

    DEFF Research Database (Denmark)

    Varneskov, Rasmus T.

    . Lastly, two small empirical applications to high frequency stock market data illustrate the bias reduction relative to competing estimators in estimating correlations, realized betas, and mean-variance frontiers, as well as the use of the new estimators in the dynamics of hedging....... problems. These transformations are all shown to inherit the desirable asymptotic properties of the generalized flat-top realized kernels. A simulation study shows that the class of estimators has a superior finite sample tradeoff between bias and root mean squared error relative to competing estimators...

  16. On convergence of kernel learning estimators

    NARCIS (Netherlands)

    Norkin, V.I.; Keyzer, M.A.

    2009-01-01

    The paper studies convex stochastic optimization problems in a reproducing kernel Hilbert space (RKHS). The objective (risk) functional depends on functions from this RKHS and takes the form of a mathematical expectation (integral) of a nonnegative integrand (loss function) over a probability

  17. Permissible Home Range Estimation (PHRE) in Restricted Habitats: A New Algorithm and an Evaluation for Sea Otters.

    Directory of Open Access Journals (Sweden)

    L Max Tarjan

    Full Text Available Parametric and nonparametric kernel methods dominate studies of animal home ranges and space use. Most existing methods are unable to incorporate information about the underlying physical environment, leading to poor performance in excluding areas that are not used. Using radio-telemetry data from sea otters, we developed and evaluated a new algorithm for estimating home ranges (hereafter Permissible Home Range Estimation, or "PHRE") that reflects habitat suitability. We began by transforming sighting locations into relevant landscape features (for sea otters, coastal position and distance from shore). Then, we generated a bivariate kernel probability density function in landscape space and back-transformed this to geographic space in order to define a permissible home range. Compared to two commonly used home range estimation methods, kernel densities and local convex hulls, PHRE better excluded unused areas and required a smaller sample size. Our PHRE method is applicable to species whose ranges are restricted by complex physical boundaries or environmental gradients and will improve understanding of habitat-use requirements and, ultimately, aid in conservation efforts.

  18. Non-parametric adaptive importance sampling for the probability estimation of a launcher impact position

    International Nuclear Information System (INIS)

    Morio, Jerome

    2011-01-01

    Importance sampling (IS) is a useful simulation technique to estimate critical probability with a better accuracy than Monte Carlo methods. It consists in generating random weighted samples from an auxiliary distribution rather than the distribution of interest. The crucial part of this algorithm is the choice of an efficient auxiliary PDF that has to be able to simulate more rare random events. The optimisation of this auxiliary distribution is often in practice very difficult. In this article, we propose to approach the IS optimal auxiliary density with non-parametric adaptive importance sampling (NAIS). We apply this technique for the probability estimation of spatial launcher impact position since it has currently become a more and more important issue in the field of aeronautics.
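
The weighting idea described in this abstract can be sketched with plain importance sampling. The example below is an illustration under stated assumptions, not the NAIS algorithm from the article: it uses one fixed auxiliary density, a unit-variance normal centred on the threshold, where NAIS would adapt the auxiliary density non-parametrically. The target (a standard normal tail probability), the function names and the sample size are all hypothetical choices.

```python
import math
import random


def normal_pdf(x, mean=0.0):
    """Density of a unit-variance normal distribution with the given mean."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def tail_prob_is(threshold, n, rng):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Samples are drawn from the auxiliary density N(threshold, 1) and each
    hit is weighted by the likelihood ratio target_pdf / auxiliary_pdf.
    """
    total = 0.0
    for _ in range(n):
        z = rng.gauss(threshold, 1.0)      # draw from the auxiliary density
        if z > threshold:                  # indicator of the rare event
            total += normal_pdf(z) / normal_pdf(z, threshold)
    return total / n


rng = random.Random(0)
estimate = tail_prob_is(4.0, 20000, rng)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))   # closed-form tail probability
print(estimate, exact)
```

A crude Monte Carlo run of the same size would rarely observe a single exceedance of the threshold, which is why shifting the sampling density toward the rare region reduces the variance of the estimate.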

  19. Predictive analysis and mapping of indoor radon concentrations in a complex environment using kernel estimation: An application to Switzerland

    Energy Technology Data Exchange (ETDEWEB)

    Kropat, Georg, E-mail: georg.kropat@chuv.ch [Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne (Switzerland); Bochud, Francois [Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne (Switzerland); Jaboyedoff, Michel [Faculty of Geosciences and Environment, University of Lausanne, GEOPOLIS — 3793, 1015 Lausanne (Switzerland); Laedermann, Jean-Pascal [Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne (Switzerland); Murith, Christophe; Palacios, Martha [Swiss Federal Office of Public Health, Schwarzenburgstrasse 165, 3003 Berne (Switzerland); Baechler, Sébastien [Institute of Radiation Physics, Lausanne University Hospital, Rue du Grand-Pré 1, 1007 Lausanne (Switzerland); Swiss Federal Office of Public Health, Schwarzenburgstrasse 165, 3003 Berne (Switzerland)

    2015-02-01

    Purpose: The aim of this study was to develop models based on kernel regression and probability estimation in order to predict and map IRC in Switzerland by taking into account all of the following: architectural factors, spatial relationships between the measurements, as well as geological information. Methods: We looked at about 240 000 IRC measurements carried out in about 150 000 houses. As predictor variables we included: building type, foundation type, year of construction, detector type, geographical coordinates, altitude, temperature and lithology into the kernel estimation models. We developed predictive maps as well as a map of the local probability to exceed 300 Bq/m³. Additionally, we developed a map of a confidence index in order to estimate the reliability of the probability map. Results: Our models were able to explain 28% of the variations of IRC data. All variables added information to the model. The model estimation revealed a bandwidth for each variable, making it possible to characterize the influence of each variable on the IRC estimation. Furthermore, we assessed the mapping characteristics of kernel estimation overall as well as by municipality. Overall, our model reproduces spatial IRC patterns which were already obtained earlier. On the municipal level, we could show that our model accounts well for IRC trends within municipal boundaries. Finally, we found that different building characteristics result in different IRC maps. Maps corresponding to detached houses with concrete foundations indicate systematically smaller IRC than maps corresponding to farms with earth foundation. Conclusions: IRC mapping based on kernel estimation is a powerful tool to predict and analyze IRC on a large-scale as well as on a local level. This approach makes it possible to develop tailor-made maps for different architectural elements and measurement conditions and to account at the same time for geological information and spatial relations between IRC measurements.

  20. Nonparametric estimation of stochastic differential equations with sparse Gaussian processes.

    Science.gov (United States)

    García, Constantino A; Otero, Abraham; Félix, Paulo; Presedo, Jesús; Márquez, David G

    2017-08-01

    The application of stochastic differential equations (SDEs) to the analysis of temporal data has attracted increasing attention, due to their ability to describe complex dynamics with physically interpretable equations. In this paper, we introduce a nonparametric method for estimating the drift and diffusion terms of SDEs from a densely observed discrete time series. The use of Gaussian processes as priors permits working directly in a function-space view and thus the inference takes place directly in this space. To cope with the computational complexity that requires the use of Gaussian processes, a sparse Gaussian process approximation is provided. This approximation permits the efficient computation of predictions for the drift and diffusion terms by using a distribution over a small subset of pseudosamples. The proposed method has been validated using both simulated data and real data from economy and paleoclimatology. The application of the method to real data demonstrates its ability to capture the behavior of complex systems.

  1. Multivariate realised kernels

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole; Hansen, Peter Reinhard; Lunde, Asger

    We propose a multivariate realised kernel to estimate the ex-post covariation of log-prices. We show this new consistent estimator is guaranteed to be positive semi-definite and is robust to measurement noise of certain types and can also handle non-synchronous trading. It is the first estimator...

  2. Assessing Goodness of Fit in Item Response Theory with Nonparametric Models: A Comparison of Posterior Probabilities and Kernel-Smoothing Approaches

    Science.gov (United States)

    Sueiro, Manuel J.; Abad, Francisco J.

    2011-01-01

    The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…

  3. Scatter kernel estimation with an edge-spread function method for cone-beam computed tomography imaging

    International Nuclear Information System (INIS)

    Li Heng; Mohan, Radhe; Zhu, X Ronald

    2008-01-01

    The clinical applications of kilovoltage x-ray cone-beam computed tomography (CBCT) have been compromised by the limited quality of CBCT images, which typically is due to a substantial scatter component in the projection data. In this paper, we describe an experimental method of deriving the scatter kernel of a CBCT imaging system. The estimated scatter kernel can be used to remove the scatter component from the CBCT projection images, thus improving the quality of the reconstructed image. The scattered radiation was approximated as depth-dependent, pencil-beam kernels, which were derived using an edge-spread function (ESF) method. The ESF geometry was achieved with a half-beam block created by a 3 mm thick lead sheet placed on a stack of slab solid-water phantoms. Measurements for ten water-equivalent thicknesses (WET) ranging from 0 cm to 41 cm were taken with (half-blocked) and without (unblocked) the lead sheet, and corresponding pencil-beam scatter kernels or point-spread functions (PSFs) were then derived without assuming any empirical trial function. The derived scatter kernels were verified with phantom studies. Scatter correction was then incorporated into the reconstruction process to improve image quality. For a 32 cm diameter cylinder phantom, the flatness of the reconstructed image was improved from 22% to 5%. When the method was applied to CBCT images for patients undergoing image-guided therapy of the pelvis and lung, the variation in selected regions of interest (ROIs) was reduced from >300 HU to <100 HU. We conclude that the scatter reduction technique utilizing the scatter kernel effectively suppresses the artifact caused by scatter in CBCT.

  4. Adaptive Kernel in Meshsize Boosting Algorithm in KDE ...

    African Journals Online (AJOL)

    This paper proposes the use of adaptive kernel in a meshsize boosting algorithm in kernel density estimation. The algorithm is a bias reduction scheme like other existing schemes but uses adaptive kernel instead of the regular fixed kernels. An empirical study for this scheme is conducted and the findings are comparatively ...

  5. Genomic similarity and kernel methods I: advancements by building on mathematical and statistical foundations.

    Science.gov (United States)

    Schaid, Daniel J

    2010-01-01

    Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.

  6. Testing and Estimating Shape-Constrained Nonparametric Density and Regression in the Presence of Measurement Error

    KAUST Repository

    Carroll, Raymond J.

    2011-03-01

    In many applications we can expect that, or are interested to know if, a density function or a regression curve satisfies some specific shape constraints. For example, when the explanatory variable, X, represents the value taken by a treatment or dosage, the conditional mean of the response, Y, is often anticipated to be a monotone function of X. Indeed, if this regression mean is not monotone (in the appropriate direction) then the medical or commercial value of the treatment is likely to be significantly curtailed, at least for values of X that lie beyond the point at which monotonicity fails. In the case of a density, common shape constraints include log-concavity and unimodality. If we can correctly guess the shape of a curve, then nonparametric estimators can be improved by taking this information into account. Addressing such problems requires a method for testing the hypothesis that the curve of interest satisfies a shape constraint, and, if the conclusion of the test is positive, a technique for estimating the curve subject to the constraint. Nonparametric methodology for solving these problems already exists, but only in cases where the covariates are observed precisely. However, in many problems, data can only be observed with measurement errors, and the methods employed in the error-free case typically do not carry over to this error context. In this paper we develop a novel approach to hypothesis testing and function estimation under shape constraints, which is valid in the context of measurement errors. Our method is based on tilting an estimator of the density or the regression mean until it satisfies the shape constraint, and we take as our test statistic the distance through which it is tilted. Bootstrap methods are used to calibrate the test. The constrained curve estimators that we develop are also based on tilting, and in that context our work has points of contact with methodology in the error-free case.

  7. Adaptive kernel regression for freehand 3D ultrasound reconstruction

    Science.gov (United States)

    Alshalalfah, Abdel-Latif; Daoud, Mohammad I.; Al-Najar, Mahasen

    2017-03-01

    Freehand three-dimensional (3D) ultrasound imaging enables low-cost and flexible 3D scanning of arbitrary-shaped organs, where the operator can freely move a two-dimensional (2D) ultrasound probe to acquire a sequence of tracked cross-sectional images of the anatomy. Often, the acquired 2D ultrasound images are irregularly and sparsely distributed in the 3D space. Several 3D reconstruction algorithms have been proposed to synthesize 3D ultrasound volumes based on the acquired 2D images. A challenging task during the reconstruction process is to preserve the texture patterns in the synthesized volume and ensure that all gaps in the volume are correctly filled. This paper presents an adaptive kernel regression algorithm that can effectively reconstruct high-quality freehand 3D ultrasound volumes. The algorithm employs a kernel regression model that enables nonparametric interpolation of the voxel gray-level values. The kernel size of the regression model is adaptively adjusted based on the characteristics of the voxel that is being interpolated. In particular, when the algorithm is employed to interpolate a voxel located in a region with dense ultrasound data samples, the size of the kernel is reduced to preserve the texture patterns. On the other hand, the size of the kernel is increased in areas that include large gaps to enable effective gap filling. The performance of the proposed algorithm was compared with seven previous interpolation approaches by synthesizing freehand 3D ultrasound volumes of a benign breast tumor. The experimental results show that the proposed algorithm outperforms the other interpolation approaches.
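
The behaviour described here, a kernel that narrows in densely sampled regions and widens across gaps, can be illustrated in one dimension. The sketch below is not the authors' algorithm: it is a Nadaraya-Watson estimator with a Gaussian kernel whose bandwidth is set to the distance to the k-th nearest sample. That bandwidth rule is a hypothetical stand-in for the paper's adaptation scheme, and the function name and data are made up for illustration.

```python
import math


def adaptive_kernel_regression(x0, xs, ys, k=3):
    """Nadaraya-Watson estimate at x0 with a locally adapted Gaussian bandwidth.

    The bandwidth is the distance from x0 to its k-th nearest sample, so it
    shrinks where samples are dense and grows where they are sparse.
    """
    dists = sorted(abs(x - x0) for x in xs)
    h = max(dists[min(k, len(dists)) - 1], 1e-9)   # guard against h == 0
    weights = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


# Illustrative samples: a dense cluster near 0, then a gap before x = 2.
xs = [0.0, 0.1, 0.2, 0.3, 2.0]
ys = [1.0, 1.2, 0.9, 1.1, 3.0]
inside_cluster = adaptive_kernel_regression(0.15, xs, ys)  # bandwidth 0.15
across_gap = adaptive_kernel_regression(1.2, xs, ys)       # bandwidth 1.0
print(inside_cluster, across_gap)
```

Inside the cluster the narrow kernel ignores the distant sample at x = 2, so local detail is preserved. In the gap the wider kernel still reaches samples on both sides, which is the one-dimensional analogue of gap filling.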

  8. Nonparametric statistical inference

    CERN Document Server

    Gibbons, Jean Dickinson

    2010-01-01

    Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference. -Eugenia Stoimenova, Journal of Applied Statistics, June 2012 … one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students. -Biometrics, 67, September 2011 This excellently presente

  9. Genomic outlier profile analysis: mixture models, null hypotheses, and nonparametric estimation.

    Science.gov (United States)

    Ghosh, Debashis; Chinnaiyan, Arul M

    2009-01-01

    In most analyses of large-scale genomic data sets, differential expression analysis is typically assessed by testing for differences in the mean of the distributions between 2 groups. A recent finding by Tomlins and others (2005) is of a different type of pattern of differential expression in which a fraction of samples in one group have overexpression relative to samples in the other group. In this work, we describe a general mixture model framework for the assessment of this type of expression, called outlier profile analysis. We start by considering the single-gene situation and establishing results on identifiability. We propose 2 nonparametric estimation procedures that have natural links to familiar multiple testing procedures. We then develop multivariate extensions of this methodology to handle genome-wide measurements. The proposed methodologies are compared using simulation studies as well as data from a prostate cancer gene expression study.

  10. Analytical Plug-In Method for Kernel Density Estimator Applied to Genetic Neutrality Study

    Directory of Open Access Journals (Sweden)

    Samir Saoudi

    2008-07-01

    Full Text Available The plug-in method enables optimization of the bandwidth of the kernel density estimator in order to estimate probability density functions (pdfs). Here, a faster procedure than that of the common plug-in method is proposed. The mean integrated square error (MISE) depends directly upon J(f), which is linked to the second-order derivative of the pdf. As we intend to introduce an analytical approximation of J(f), the pdf is estimated only once, at the end of iterations. These two kinds of algorithm are tested on different random variables having distributions known for their difficult estimation. Finally, they are applied to genetic data in order to provide a better characterisation in the mean of neutrality of Tunisian Berber populations.
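
The dependence of the MISE on J(f) mentioned in this abstract can be made explicit with the standard asymptotic expansion for a kernel density estimator. The notation below is an assumption rather than the article's own: K is the kernel, n the sample size, h the bandwidth, R(K) the integral of K squared and \mu_2(K) the second moment of K.

```latex
% Asymptotic MISE of a kernel density estimator with bandwidth h
\mathrm{AMISE}(h) = \frac{R(K)}{n h} + \frac{1}{4}\, h^{4}\, \mu_2(K)^{2}\, J(f),
\qquad
J(f) = \int \bigl(f''(x)\bigr)^{2}\, dx,
\quad
R(K) = \int K(u)^{2}\, du,
\quad
\mu_2(K) = \int u^{2} K(u)\, du .

% Setting the derivative with respect to h to zero gives the optimal bandwidth
h^{*} = \left( \frac{R(K)}{\mu_2(K)^{2}\, J(f)\, n} \right)^{1/5} .
```

Because h* depends on the unknown pdf only through J(f), a plug-in method replaces J(f) by an estimate or an approximation, which is the quantity the abstract proposes to approximate analytically.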

  11. Realized kernels in practice

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Hansen, P. Reinhard; Lunde, Asger

    2009-01-01

    and find a remarkable level of agreement. We identify some features of the high-frequency data, which are challenging for realized kernels. They are when there are local trends in the data, over periods of around 10 minutes, where the prices and quotes are driven up or down. These can be associated......Realized kernels use high-frequency data to estimate daily volatility of individual stock prices. They can be applied to either trade or quote data. Here we provide the details of how we suggest implementing them in practice. We compare the estimates based on trade and quote data for the same stock...

  12. Oscillometric blood pressure estimation by combining nonparametric bootstrap with Gaussian mixture model.

    Science.gov (United States)

    Lee, Soojeong; Rajan, Sreeraman; Jeon, Gwanggil; Chang, Joon-Hyuk; Dajani, Hilmi R; Groza, Voicu Z

    2017-06-01

    Blood pressure (BP) is one of the most important vital indicators and plays a key role in determining the cardiovascular activity of patients. This paper proposes a hybrid approach consisting of nonparametric bootstrap (NPB) and machine learning techniques to obtain the characteristic ratios (CR) used in the blood pressure estimation algorithm to improve the accuracy of systolic blood pressure (SBP) and diastolic blood pressure (DBP) estimates and obtain confidence intervals (CI). The NPB technique is used to circumvent the requirement for a large sample set for obtaining the CI. A mixture of Gaussian densities is assumed for the CRs and a Gaussian mixture model (GMM) is chosen to estimate the SBP and DBP ratios. The K-means clustering technique is used to obtain the mixture order of the Gaussian densities. The proposed approach achieves grade "A" under the British Society of Hypertension testing protocol and is superior to the conventional approach based on the maximum amplitude algorithm (MAA) that uses fixed CR ratios. The proposed approach also yields a lower mean error (ME) and standard deviation of the error (SDE) in the estimates when compared to the conventional MAA method. In addition, CIs obtained through the proposed hybrid approach are also narrower with a lower SDE. The proposed approach combining the NPB technique with the GMM provides a methodology to derive individualized characteristic ratios. The results show that the proposed approach enhances the accuracy of SBP and DBP estimation and provides narrower confidence intervals for the estimates. Copyright © 2015 Elsevier Ltd. All rights reserved.
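
The nonparametric bootstrap step, which obtains a confidence interval from a small sample by resampling it with replacement, can be sketched as follows. This is a generic percentile bootstrap for the mean, not the paper's pipeline (no Gaussian mixture model and no characteristic ratios). The function names are hypothetical and the readings are made-up values used only to exercise the code.

```python
import random


def bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, rng=None):
    """Percentile bootstrap confidence interval for stat(data)."""
    rng = rng or random.Random()
    n = len(data)
    reps = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)])  # resample with replacement
        for _ in range(n_boot)
    )
    lower = reps[int((alpha / 2) * n_boot)]
    upper = reps[min(int((1 - alpha / 2) * n_boot), n_boot - 1)]
    return lower, upper


def mean(values):
    return sum(values) / len(values)


# Made-up readings for illustration only; not data from the study.
readings = [118.0, 121.5, 119.2, 124.8, 117.6, 122.3, 120.1, 123.4]
low, high = bootstrap_ci(readings, mean, rng=random.Random(1))
print(low, high)
```

The interval is read directly from the empirical quantiles of the resampled statistic, so no distributional assumption is needed, which is the property the abstract relies on when only a few measurements are available.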

  13. Robust Kernel (Cross-) Covariance Operators in Reproducing Kernel Hilbert Space toward Kernel Methods

    OpenAIRE

    Alam, Md. Ashad; Fukumizu, Kenji; Wang, Yu-Ping

    2016-01-01

    To the best of our knowledge, there are no general well-founded robust methods for statistical unsupervised learning. Most of the unsupervised methods explicitly or implicitly depend on the kernel covariance operator (kernel CO) or kernel cross-covariance operator (kernel CCO). They are sensitive to contaminated data, even when using bounded positive definite kernels. First, we propose robust kernel covariance operator (robust kernel CO) and robust kernel crosscovariance operator (robust kern...

  14. 2nd Conference of the International Society for Nonparametric Statistics

    CERN Document Server

    Manteiga, Wenceslao; Romo, Juan

    2016-01-01

    This volume collects selected, peer-reviewed contributions from the 2nd Conference of the International Society for Nonparametric Statistics (ISNPS), held in Cádiz (Spain) between June 11–16 2014, and sponsored by the American Statistical Association, the Institute of Mathematical Statistics, the Bernoulli Society for Mathematical Statistics and Probability, the Journal of Nonparametric Statistics and Universidad Carlos III de Madrid. The 15 articles are a representative sample of the 336 contributed papers presented at the conference. They cover topics such as high-dimensional data modelling, inference for stochastic processes and for dependent data, nonparametric and goodness-of-fit testing, nonparametric curve estimation, object-oriented data analysis, and semiparametric inference. The aim of the ISNPS 2014 conference was to bring together recent advances and trends in several areas of nonparametric statistics in order to facilitate the exchange of research ideas, promote collaboration among researchers...

  15. A Design-Adaptive Local Polynomial Estimator for the Errors-in-Variables Problem

    KAUST Repository

    Delaigle, Aurore

    2009-03-01

    Local polynomial estimators are popular techniques for nonparametric regression estimation and have received great attention in the literature. Their simplest version, the local constant estimator, can be easily extended to the errors-in-variables context by exploiting its similarity with the deconvolution kernel density estimator. The generalization of the higher order versions of the estimator, however, is not straightforward and has remained an open problem for the last 15 years. We propose an innovative local polynomial estimator of any order in the errors-in-variables context, derive its design-adaptive asymptotic properties and study its finite sample performance on simulated examples. We provide not only a solution to a long-standing open problem, but also methodological contributions to errors-in-variables regression, including local polynomial estimation of derivative functions.

  16. Higher-Order Hybrid Gaussian Kernel in Meshsize Boosting Algorithm

    African Journals Online (AJOL)

    In this paper, we shall use higher-order hybrid Gaussian kernel in a meshsize boosting algorithm in kernel density estimation. Bias reduction is guaranteed in this scheme like other existing schemes but uses the higher-order hybrid Gaussian kernel instead of the regular fixed kernels. A numerical verification of this scheme ...

  17. Adaptive Kernel In The Bootstrap Boosting Algorithm In KDE ...

    African Journals Online (AJOL)

    This paper proposes the use of adaptive kernel in a bootstrap boosting algorithm in kernel density estimation. The algorithm is a bias reduction scheme like other existing schemes but uses adaptive kernel instead of the regular fixed kernels. An empirical study for this scheme is conducted and the findings are comparatively ...

  18. Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration

    Directory of Open Access Journals (Sweden)

    Bo Liu

    2012-02-01

    In this paper a new framework, called Compressive Kernelized Reinforcement Learning (CKRL), for computing near-optimal policies in sequential decision making with uncertainty is proposed via incorporating the non-adaptive data-independent Random Projections and nonparametric Kernelized Least-squares Policy Iteration (KLSPI). Random Projections are a fast, non-adaptive dimensionality reduction framework in which high-dimensionality data is projected onto a random lower-dimension subspace via spherically random rotation and coordination sampling. KLSPI introduces the kernel trick into the LSPI framework for Reinforcement Learning, often achieving faster convergence and providing automatic feature selection via various kernel sparsification approaches. In this approach, policies are computed in a low-dimensional subspace generated by projecting the high-dimensional features onto a random basis. We first show how Random Projections constitute an efficient sparsification technique and how our method often converges faster than regular LSPI, while at lower computational costs. The theoretical foundation underlying this approach is a fast approximation of Singular Value Decomposition (SVD). Finally, simulation results are exhibited on benchmark MDP domains, which confirm gains both in computation time and in performance in large feature spaces.
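
    The data-independent projection step described above can be illustrated with a plain Gaussian random projection. This is a swapped-in, simpler construction than the "spherically random rotation and coordination sampling" named in the abstract, and the function name `random_projection` and the dimensions are assumptions chosen for illustration.

```python
import numpy as np

def random_projection(features, out_dim, seed=0):
    """Project rows of `features` onto a random lower-dimensional subspace.

    Uses i.i.d. Gaussian entries with variance 1/out_dim, so squared norms
    are preserved in expectation (an assumption of this sketch, not the
    construction used in the paper).
    """
    rng = np.random.default_rng(seed)
    in_dim = features.shape[1]
    proj = rng.normal(0.0, 1.0 / np.sqrt(out_dim), size=(in_dim, out_dim))
    return features @ proj

# Hypothetical high-dimensional feature matrix: 50 states, 1000 features each.
rng = np.random.default_rng(1)
high_dim = rng.normal(size=(50, 1000))
low_dim = random_projection(high_dim, out_dim=100)

# Average ratio of squared norms after/before projection; close to 1 on average.
norm_ratio = float(np.mean((low_dim ** 2).sum(axis=1) / (high_dim ** 2).sum(axis=1)))
print(low_dim.shape, round(norm_ratio, 3))
```

    The projection matrix does not depend on the data, which is what makes the approach non-adaptive: it can be drawn once and reused for every feature vector.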

  19. Analytical Plug-In Method for Kernel Density Estimator Applied to Genetic Neutrality Study

    Science.gov (United States)

    Troudi, Molka; Alimi, Adel M.; Saoudi, Samir

    2008-12-01

    The plug-in method enables optimization of the bandwidth of the kernel density estimator in order to estimate probability density functions (pdfs). Here, a faster procedure than that of the common plug-in method is proposed. The mean integrated square error (MISE) depends directly upon [InlineEquation not available: see fulltext.] which is linked to the second-order derivative of the pdf. As we intend to introduce an analytical approximation of [InlineEquation not available: see fulltext.], the pdf is estimated only once, at the end of iterations. These two kinds of algorithm are tested on different random variables having distributions known for their difficult estimation. Finally, they are applied to genetic data in order to provide a better characterisation in the mean of neutrality of Tunisian Berber populations.

  20. Using non-parametric methods in econometric production analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    2012-01-01

    by investigating the relationship between the elasticity of scale and the farm size. We use a balanced panel data set of 371 specialised crop farms for the years 2004-2007. A non-parametric specification test shows that neither the Cobb-Douglas function nor the Translog function are consistent with the "true......Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify a functional form of the production function of which the Cobb...... parameter estimates, but also in biased measures which are derived from the parameters, such as elasticities. Therefore, we propose to use non-parametric econometric methods. First, these can be applied to verify the functional form used in parametric production analysis. Second, they can be directly used...

  1. Estimating the shadow prices of SO2 and NOx for U.S. coal power plants: A convex nonparametric least squares approach

    International Nuclear Information System (INIS)

    Mekaroonreung, Maethee; Johnson, Andrew L.

    2012-01-01

    Weak disposability between outputs and pollutants, defined as a simultaneous proportional reduction of both outputs and pollutants, assumes that pollutants are byproducts of the output generation process and that a firm can “freely dispose” of both by scaling down production levels, leaving some inputs idle. Based on the production axioms of monotonicity, convexity and weak disposability, we formulate a convex nonparametric least squares (CNLS) quadratic optimization problem to estimate a frontier production function assuming either a deterministic disturbance term consisting only of inefficiency, or a composite disturbance term composed of both inefficiency and noise. The suggested methodology extends the stochastic semi-nonparametric envelopment of data (StoNED) described in Kuosmanen and Kortelainen (2011). Applying the method to estimate the shadow prices of SO2 and NOx generated by U.S. coal power plants, we conclude that the weak disposability StoNED method provides more consistent estimates of market prices. - Highlights: ► Develops methodology to estimate shadow prices for SO2 and NOx in the U.S. coal power plants. ► Extends CNLS and StoNED methods to include the weak disposability assumption. ► Estimates the range of SO2 and NOx shadow prices as 201–343 $/ton and 409–1352 $/ton. ► StoNED method provides more accurate estimates of shadow prices than deterministic frontier.

  2. Soft Sensor of Vehicle State Estimation Based on the Kernel Principal Component and Improved Neural Network

    Directory of Open Access Journals (Sweden)

    Haorui Liu

    2016-01-01

    In car control systems, it is hard to measure some key vehicle states directly and accurately when running on the road, and the cost of the measurement is high as well. To address these problems, a vehicle state estimation method based on kernel principal component analysis and the improved Elman neural network is proposed. Combining with a nonlinear vehicle model of three degrees of freedom (3 DOF: longitudinal, lateral, and yaw motion), this paper applies the method to the soft sensor of the vehicle states. The simulation results of the double lane change tested by Matlab/SIMULINK cosimulation prove the KPCA-IENN algorithm (kernel principal component algorithm and improved Elman neural network) to be quick and precise when tracking the vehicle states within the nonlinear area. This algorithm can meet the software performance requirements of vehicle state estimation in precision, tracking speed, noise suppression, and other aspects.

  3. Optimal kernel shape and bandwidth for atomistic support of continuum stress

    International Nuclear Information System (INIS)

    Ulz, Manfred H; Moran, Sean J

    2013-01-01

    The treatment of atomistic scale interactions via molecular dynamics simulations has recently found favour for multiscale modelling within engineering. The estimation of stress at a continuum point on the atomistic scale requires a pre-defined kernel function. This kernel function derives the stress at a continuum point by averaging the contribution from atoms within a region surrounding the continuum point. This averaging volume, and therefore the associated stress at a continuum point, is highly dependent on the bandwidth and shape of the kernel. In this paper we propose an effective and entirely data-driven strategy for simultaneously computing the optimal shape and bandwidth for the kernel. We thoroughly evaluate our proposed approach on copper using three classical elasticity problems. Our evaluation yields three key findings: firstly, our technique can provide a physically meaningful estimation of kernel bandwidth; secondly, we show that a uniform kernel is preferred, thereby justifying the default selection of this kernel shape in future work; and thirdly, we can reliably estimate both of these attributes in a data-driven manner, obtaining values that lead to an accurate estimation of the stress at a continuum point. (paper)

  4. Nonparametric trend estimation in the presence of fractal noise: application to fMRI time-series analysis.

    Science.gov (United States)

    Afshinpour, Babak; Hossein-Zadeh, Gholam-Ali; Soltanian-Zadeh, Hamid

    2008-06-30

    Unknown low frequency fluctuations called "trend" are observed in noisy time-series measured for different applications. In some disciplines, they carry primary information while in other fields such as functional magnetic resonance imaging (fMRI) they carry nuisance effects. In all cases, however, it is necessary to estimate them accurately. In this paper, a method for estimating trend in the presence of fractal noise is proposed and applied to fMRI time-series. To this end, a partly linear model (PLM) is fitted to each time-series. The parametric and nonparametric parts of PLM are considered as contributions of hemodynamic response and trend, respectively. Using the whitening property of wavelet transform, the unknown components of the model are estimated in the wavelet domain. The results of the proposed method are compared to those of other parametric trend-removal approaches such as spline and polynomial models. It is shown that the proposed method improves activation detection and decreases variance of the estimated parameters relative to the other methods.

  5. Viscozyme L pretreatment on palm kernels improved the aroma of palm kernel oil after kernel roasting.

    Science.gov (United States)

    Zhang, Wencan; Leong, Siew Mun; Zhao, Feifei; Zhao, Fangju; Yang, Tiankui; Liu, Shaoquan

    2018-05-01

    With an interest to enhance the aroma of palm kernel oil (PKO), Viscozyme L, an enzyme complex containing a wide range of carbohydrases, was applied to alter the carbohydrates in palm kernels (PK) to modulate the formation of volatiles upon kernel roasting. After Viscozyme treatment, the content of simple sugars and free amino acids in PK increased by 4.4-fold and 4.5-fold, respectively. After kernel roasting and oil extraction, significantly more 2,5-dimethylfuran, 2-[(methylthio)methyl]-furan, 1-(2-furanyl)-ethanone, 1-(2-furyl)-2-propanone, 5-methyl-2-furancarboxaldehyde and 2-acetyl-5-methylfuran but less 2-furanmethanol and 2-furanmethanol acetate were found in treated PKO; the correlation between their formation and simple sugar profile was estimated by using partial least square regression (PLS1). Obvious differences in pyrroles and Strecker aldehydes were also found between the control and treated PKOs. Principal component analysis (PCA) clearly discriminated the treated PKOs from that of control PKOs on the basis of all volatile compounds. Such changes in volatiles translated into distinct sensory attributes, whereby treated PKO was more caramelic and burnt after aqueous extraction and more nutty, roasty, caramelic and smoky after solvent extraction. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. Comparing estimates of genetic variance across different relationship models.

    Science.gov (United States)

    Legarra, Andres

    2016-02-01

    Use of relationships between individuals to estimate genetic variances and heritabilities via mixed models is standard practice in human, plant and livestock genetics. Different models or information for relationships may give different estimates of genetic variances. However, comparing these estimates across different relationship models is not straightforward as the implied base populations differ between relationship models. In this work, I present a method to compare estimates of variance components across different relationship models. I suggest referring genetic variances obtained using different relationship models to the same reference population, usually a set of individuals in the population. Expected genetic variance of this population is the estimated variance component from the mixed model times a statistic, Dk, which is the average self-relationship minus the average (self- and across-) relationship. For most typical models of relationships, Dk is close to 1. However, this is not true for very deep pedigrees, for identity-by-state relationships, or for non-parametric kernels, which tend to overestimate the genetic variance and the heritability. Using mice data, I show that heritabilities from identity-by-state and kernel-based relationships are overestimated. Weighting these estimates by Dk scales them to a base comparable to genomic or pedigree relationships, avoiding wrong comparisons, for instance, "missing heritabilities". Copyright © 2015 Elsevier Inc. All rights reserved.
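
    The statistic Dk is defined above as the average self-relationship minus the average (self- and across-) relationship, with the reference-population variance given by the estimated variance component times Dk. A minimal sketch of that arithmetic follows; the function names and the toy relationship matrix are hypothetical.

```python
import numpy as np

def dk_statistic(K):
    """Dk = mean of the diagonal of K minus the mean of all elements of K."""
    K = np.asarray(K, dtype=float)
    return float(np.mean(np.diag(K)) - np.mean(K))

def reference_genetic_variance(sigma2_hat, K):
    """Scale an estimated variance component to the reference population."""
    return sigma2_hat * dk_statistic(K)

# Toy relationship matrix: four unrelated, non-inbred individuals (identity).
K_identity = np.eye(4)
dk = dk_statistic(K_identity)  # 1 - 4/16 = 0.75
print(dk, reference_genetic_variance(2.0, K_identity))
```

    For the identity matrix the value differs from 1 only because the sample is tiny; as the number of unrelated individuals grows, the mean of all elements shrinks toward zero and Dk approaches 1.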

  7. Speaker Linking and Applications using Non-Parametric Hashing Methods

    Science.gov (United States)

    2016-09-08

    nonparametric estimate of a multivariate density function,” The Annals of Mathematical Statistics, vol. 36, no. 3, pp. 1049–1051, 1965. [9] E. A. Patrick...Speaker Linking and Applications using Non-Parametric Hashing Methods† Douglas Sturim and William M. Campbell MIT Lincoln Laboratory, Lexington, MA...with many approaches [1, 2]. For this paper, we focus on using i-vectors [2], but the methods apply to any embedding. For the task of speaker QBE and

  8. Partially linear varying coefficient models stratified by a functional covariate

    KAUST Repository

    Maity, Arnab

    2012-10-01

    We consider the problem of estimation in semiparametric varying coefficient models where the covariate modifying the varying coefficients is functional and is modeled nonparametrically. We develop a kernel-based estimator of the nonparametric component and a profiling estimator of the parametric component of the model and derive their asymptotic properties. Specifically, we show the consistency of the nonparametric functional estimates and derive the asymptotic expansion of the estimates of the parametric component. We illustrate the performance of our methodology using a simulation study and a real data application.

  9. Calculation of the time resolution of the J-PET tomograph using kernel density estimation

    Science.gov (United States)

    Raczyński, L.; Wiślicki, W.; Krzemień, W.; Kowalski, P.; Alfs, D.; Bednarski, T.; Białas, P.; Curceanu, C.; Czerwiński, E.; Dulski, K.; Gajos, A.; Głowacz, B.; Gorgol, M.; Hiesmayr, B.; Jasińska, B.; Kamińska, D.; Korcyl, G.; Kozik, T.; Krawczyk, N.; Kubicz, E.; Mohammed, M.; Pawlik-Niedźwiecka, M.; Niedźwiecki, S.; Pałka, M.; Rudy, Z.; Rundel, O.; Sharma, N. G.; Silarski, M.; Smyrski, J.; Strzelecki, A.; Wieczorek, A.; Zgardzińska, B.; Zieliński, M.; Moskal, P.

    2017-06-01

    In this paper we estimate the time resolution of the J-PET scanner built from plastic scintillators. We incorporate the method of signal processing using the Tikhonov regularization framework and the kernel density estimation method. We obtain simple, closed-form analytical formulae for time resolution. The proposed method is validated using signals registered by means of the single detection unit of the J-PET tomograph built from a 30 cm long plastic scintillator strip. It is shown that the experimental and theoretical results obtained for the J-PET scanner equipped with vacuum tube photomultipliers are consistent.

  10. Nonparametric method for genomics-based prediction of performance of quantitative traits involving epistasis in plant breeding.

    Directory of Open Access Journals (Sweden)

    Xiaochun Sun

    Genomic selection (GS) procedures have proven useful in estimating breeding value and predicting phenotype with genome-wide molecular marker information. However, issues of high dimensionality, multicollinearity, and the inability to deal effectively with epistasis can jeopardize accuracy and predictive ability. We, therefore, propose a new nonparametric method, pRKHS, which combines the features of supervised principal component analysis (SPCA) and reproducing kernel Hilbert spaces (RKHS) regression, with versions for traits with no/low epistasis, pRKHS-NE, to high epistasis, pRKHS-E. Instead of assigning a specific relationship to represent the underlying epistasis, the method maps genotype to phenotype in a nonparametric way, thus requiring fewer genetic assumptions. SPCA decreases the number of markers needed for prediction by filtering out low-signal markers with the optimal marker set determined by cross-validation. Principal components are computed from reduced marker matrix (called supervised principal components, SPC) and included in the smoothing spline ANOVA model as independent variables to fit the data. The new method was evaluated in comparison with current popular methods for practicing GS, specifically RR-BLUP, BayesA, BayesB, as well as a newer method by Crossa et al., RKHS-M, using both simulated and real data. Results demonstrate that pRKHS generally delivers greater predictive ability, particularly when epistasis impacts trait expression. Beyond prediction, the new method also facilitates inferences about the extent to which epistasis influences trait expression.

  11. Nonparametric method for genomics-based prediction of performance of quantitative traits involving epistasis in plant breeding.

    Science.gov (United States)

    Sun, Xiaochun; Ma, Ping; Mumm, Rita H

    2012-01-01

    Genomic selection (GS) procedures have proven useful in estimating breeding value and predicting phenotype with genome-wide molecular marker information. However, issues of high dimensionality, multicollinearity, and the inability to deal effectively with epistasis can jeopardize accuracy and predictive ability. We, therefore, propose a new nonparametric method, pRKHS, which combines the features of supervised principal component analysis (SPCA) and reproducing kernel Hilbert spaces (RKHS) regression, with versions for traits with no/low epistasis, pRKHS-NE, to high epistasis, pRKHS-E. Instead of assigning a specific relationship to represent the underlying epistasis, the method maps genotype to phenotype in a nonparametric way, thus requiring fewer genetic assumptions. SPCA decreases the number of markers needed for prediction by filtering out low-signal markers with the optimal marker set determined by cross-validation. Principal components are computed from reduced marker matrix (called supervised principal components, SPC) and included in the smoothing spline ANOVA model as independent variables to fit the data. The new method was evaluated in comparison with current popular methods for practicing GS, specifically RR-BLUP, BayesA, BayesB, as well as a newer method by Crossa et al., RKHS-M, using both simulated and real data. Results demonstrate that pRKHS generally delivers greater predictive ability, particularly when epistasis impacts trait expression. Beyond prediction, the new method also facilitates inferences about the extent to which epistasis influences trait expression.

  12. Firing rate estimation using infinite mixture models and its application to neural decoding.

    Science.gov (United States)

    Shibue, Ryohei; Komaki, Fumiyasu

    2017-11-01

    Neural decoding is a framework for reconstructing external stimuli from spike trains recorded by various neural recordings. Kloosterman et al. proposed a new decoding method using marked point processes (Kloosterman F, Layton SP, Chen Z, Wilson MA. J Neurophysiol 111: 217-227, 2014). This method does not require spike sorting and thereby improves decoding accuracy dramatically. In this method, they used kernel density estimation to estimate intensity functions of marked point processes. However, the use of kernel density estimation causes problems such as low decoding accuracy and high computational costs. To overcome these problems, we propose a new decoding method using infinite mixture models to estimate intensity. The proposed method improves decoding performance in terms of accuracy and computational speed. We apply the proposed method to simulation and experimental data to verify its performance. NEW & NOTEWORTHY We propose a new neural decoding method using infinite mixture models and nonparametric Bayesian statistics. The proposed method improves decoding performance in terms of accuracy and computation speed. We have successfully applied the proposed method to position decoding from spike trains recorded in a rat hippocampus. Copyright © 2017 the American Physiological Society.

  13. A note on Nonparametric Confidence Interval for a Shift Parameter ...

    African Journals Online (AJOL)

    The method is illustrated using the Cauchy distribution as a location model. The kernel-based method is found to have a shorter interval for the shift parameter between two Cauchy distributions than the one based on the Mann-Whitney test statistic. Keywords: Best Asymptotic Normal; Cauchy distribution; Kernel estimates; ...

  14. Pencil kernel correction and residual error estimation for quality-index-based dose calculations

    International Nuclear Information System (INIS)

    Nyholm, Tufve; Olofsson, Joergen; Ahnesjoe, Anders; Georg, Dietmar; Karlsson, Mikael

    2006-01-01

    Experimental data from 593 photon beams were used to quantify the errors in dose calculations using a previously published pencil kernel model. A correction of the kernel was derived in order to remove the observed systematic errors. The remaining residual error for individual beams was modelled through uncertainty associated with the kernel model. The methods were tested against an independent set of measurements. No significant systematic error was observed in the calculations using the derived correction of the kernel and the remaining random errors were found to be adequately predicted by the proposed method

  15. Predictive Model Equations for Palm Kernel (Elaeis guneensis J ...

    African Journals Online (AJOL)

    Estimated error of ± 0.18 and ± 0.2 are envisaged while applying the models for predicting palm kernel and sesame oil colours respectively. Keywords: Palm kernel, Sesame, Palm kernel, Oil Colour, Process Parameters, Model. Journal of Applied Science, Engineering and Technology Vol. 6 (1) 2006 pp. 34-38 ...

  16. Nonparametric statistical inference

    CERN Document Server

    Gibbons, Jean Dickinson

    2014-01-01

    Thoroughly revised and reorganized, the fourth edition presents in-depth coverage of the theory and methods of the most widely used nonparametric procedures in statistical analysis and offers example applications appropriate for all areas of the social, behavioral, and life sciences. The book presents new material on the quantiles, the calculation of exact and simulated power, multiple comparisons, additional goodness-of-fit tests, methods of analysis of count data, and modern computer applications using MINITAB, SAS, and STATXACT. It includes tabular guides for simplified applications of tests and finding P values and confidence interval estimates.

  17. Efficient 3D movement-based kernel density estimator and application to wildlife ecology

    Science.gov (United States)

    Tracey-PR, Jeff; Sheppard, James K.; Lockwood, Glenn K.; Chourasia, Amit; Tatineni, Mahidhar; Fisher, Robert N.; Sinkovits, Robert S.

    2014-01-01

    We describe an efficient implementation of a 3D movement-based kernel density estimator for determining animal space use from discrete GPS measurements. This new method provides more accurate results, particularly for species that make large excursions in the vertical dimension. The downside of this approach is that it is much more computationally expensive than simpler, lower-dimensional models. Through a combination of code restructuring, parallelization and performance optimization, we were able to reduce the time to solution by up to a factor of 1000x, thereby greatly improving the applicability of the method.

  18. Probabilistic wind power forecasting based on logarithmic transformation and boundary kernel

    International Nuclear Information System (INIS)

    Zhang, Yao; Wang, Jianxue; Luo, Xu

    2015-01-01

    Highlights: • Quantitative information on the uncertainty of wind power generation. • Kernel density estimator provides non-Gaussian predictive distributions. • Logarithmic transformation reduces the skewness of wind power density. • Boundary kernel method eliminates the density leakage near the boundary. - Abstracts: Probabilistic wind power forecasting not only produces the expectation of wind power output, but also gives quantitative information on the associated uncertainty, which is essential for making better decisions about power system and market operations with the increasing penetration of wind power generation. This paper presents a novel kernel density estimator for probabilistic wind power forecasting, addressing two characteristics of wind power which have adverse impacts on the forecast accuracy, namely, the heavily skewed and double-bounded nature of wind power density. Logarithmic transformation is used to reduce the skewness of wind power density, which improves the effectiveness of the kernel density estimator in a transformed scale. Transformations partially relieve the boundary effect problem of the kernel density estimator caused by the double-bounded nature of wind power density. However, the case study shows that there are still some serious problems of density leakage after the transformation. In order to solve this problem in the transformed scale, a boundary kernel method is employed to eliminate the density leak at the bounds of wind power distribution. The improvement of the proposed method over the standard kernel density estimator is demonstrated by short-term probabilistic forecasting results based on the data from an actual wind farm. Then, a detailed comparison is carried out of the proposed method and some existing probabilistic forecasting methods
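
    The logarithmic-transformation step described above can be sketched as a kernel density estimate computed on the log scale and mapped back with the change-of-variables factor 1/x. This covers only the transformation idea, not the paper's boundary kernel; the Gaussian kernel, the rule-of-thumb bandwidth, the function name `log_kde` and the synthetic data are all assumptions of this sketch.

```python
import numpy as np

def log_kde(data, x, bandwidth=None):
    """Density of positive data at points x via a Gaussian KDE on log(data)."""
    data = np.asarray(data, dtype=float)
    x = np.asarray(x, dtype=float)
    y_data = np.log(data)
    n = y_data.size
    if bandwidth is None:
        # Rule-of-thumb bandwidth on the log scale (an assumption of this sketch).
        bandwidth = 1.06 * y_data.std(ddof=1) * n ** (-0.2)
    # Gaussian KDE evaluated at y = log(x).
    z = (np.log(x)[:, None] - y_data[None, :]) / bandwidth
    f_y = np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))
    # Change of variables: f_X(x) = f_Y(log x) / x.
    return f_y / x

# Synthetic right-skewed positive data standing in for wind power output.
rng = np.random.default_rng(0)
power = rng.lognormal(mean=0.0, sigma=0.5, size=200)
grid = np.exp(np.linspace(-6.0, 6.0, 2000))
density = log_kde(power, grid)

# Trapezoidal check that the back-transformed density integrates to about 1.
total = float(np.sum(0.5 * (density[1:] + density[:-1]) * np.diff(grid)))
print(round(total, 3))
```

    Because the estimate is built on the log scale, no probability mass leaks below zero; handling an upper bound as well would need an additional device such as the boundary kernel the abstract refers to.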

  19. A nonparametric empirical Bayes framework for large-scale multiple testing.

    Science.gov (United States)

    Martin, Ryan; Tokdar, Surya T

    2012-07-01

    We propose a flexible and identifiable version of the 2-groups model, motivated by hierarchical Bayes considerations, that features an empirical null and a semiparametric mixture model for the nonnull cases. We use a computationally efficient predictive recursion (PR) marginal likelihood procedure to estimate the model parameters, even the nonparametric mixing distribution. This leads to a nonparametric empirical Bayes testing procedure, which we call PRtest, based on thresholding the estimated local false discovery rates. Simulations and real data examples demonstrate that, compared to existing approaches, PRtest's careful handling of the nonnull density can give a much better fit in the tails of the mixture distribution which, in turn, can lead to more realistic conclusions.

  20. [Nonparametric method of estimating survival functions containing right-censored and interval-censored data].

    Science.gov (United States)

    Xu, Yonghong; Gao, Xiaohuan; Wang, Zhengxi

    2014-04-01

    Missing data represent a general problem in many scientific fields, especially in medical survival analysis. Interpolation is one of the important methods for dealing with censored data. However, most interpolation methods replace the censored data with exact data, which distorts the real distribution of the censored data and reduces the probability of the real data falling into the interpolation data. To solve this problem, we propose a nonparametric method of estimating the survival function of right-censored and interval-censored data and compare its performance to the SC (self-consistent) algorithm. In contrast to the average interpolation and nearest neighbor interpolation methods, the proposed method replaces the right-censored data with interval-censored data, which greatly improves the probability of the real data falling into the imputation interval. It then uses empirical distribution theory to estimate the survival function of right-censored and interval-censored data. The results of numerical examples and a real breast cancer data set demonstrate that the proposed method has higher accuracy and better robustness for different proportions of censored data. This paper provides a good method for comparing the performance of clinical treatments by estimating the survival data of the patients, which offers some help to medical survival data analysis.

  1. Predicting complex traits using a diffusion kernel on genetic markers with an application to dairy cattle and wheat data

    Science.gov (United States)

    2013-01-01

    Background Arguably, genotypes and phenotypes may be linked in functional forms that are not well addressed by the linear additive models that are standard in quantitative genetics. Therefore, developing statistical learning models for predicting phenotypic values from all available molecular information that are capable of capturing complex genetic network architectures is of great importance. Bayesian kernel ridge regression is a non-parametric prediction model proposed for this purpose. Its essence is to create a spatial distance-based relationship matrix called a kernel. Although the set of all single nucleotide polymorphism genotype configurations on which a model is built is finite, past research has mainly used a Gaussian kernel. Results We sought to investigate the performance of a diffusion kernel, which was specifically developed to model discrete marker inputs, using Holstein cattle and wheat data. This kernel can be viewed as a discretization of the Gaussian kernel. The predictive ability of the diffusion kernel was similar to that of non-spatial distance-based additive genomic relationship kernels in the Holstein data, but outperformed the latter in the wheat data. However, the difference in performance between the diffusion and Gaussian kernels was negligible. Conclusions It is concluded that the ability of a diffusion kernel to capture the total genetic variance is not better than that of a Gaussian kernel, at least for these data. Although the diffusion kernel as a choice of basis function may have potential for use in whole-genome prediction, our results imply that embedding genetic markers into a non-Euclidean metric space has very small impact on prediction. Our results suggest that use of the black box Gaussian kernel is justified, given its connection to the diffusion kernel and its similar predictive performance. PMID:23763755

  2. Multivariate realised kernels

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Hansen, Peter Reinhard; Lunde, Asger

    2011-01-01

    We propose a multivariate realised kernel to estimate the ex-post covariation of log-prices. We show this new consistent estimator is guaranteed to be positive semi-definite and is robust to measurement error of certain types and can also handle non-synchronous trading. It is the first estimator...... which has these three properties which are all essential for empirical work in this area. We derive the large sample asymptotics of this estimator and assess its accuracy using a Monte Carlo study. We implement the estimator on some US equity data, comparing our results to previous work which has used...

  3. Impulse response identification with deterministic inputs using non-parametric methods

    International Nuclear Information System (INIS)

    Bhargava, U.K.; Kashyap, R.L.; Goodman, D.M.

    1985-01-01

    This paper addresses the problem of impulse response identification using non-parametric methods. Although the techniques developed herein apply to the truncated, untruncated, and the circulant models, we focus on the truncated model which is useful in certain applications. Two methods of impulse response identification will be presented. The first is based on the minimization of the C_L statistic, which is an estimate of the mean-square prediction error; the second is a Bayesian approach. For both of these methods, we consider the effects of using both the identity matrix and the Laplacian matrix as weights on the energy in the impulse response. In addition, we present a method for estimating the effective length of the impulse response. Estimating the length is particularly important in the truncated case. Finally, we develop a method for estimating the noise variance at the output. Often, prior information on the noise variance is not available, and a good estimate is crucial to the success of estimating the impulse response with a nonparametric technique.

  4. Input Space Regularization Stabilizes Pre-images for Kernel PCA De-noising

    DEFF Research Database (Denmark)

    Abrahamsen, Trine Julie; Hansen, Lars Kai

    2009-01-01

    Solution of the pre-image problem is key to efficient nonlinear de-noising using kernel Principal Component Analysis. Pre-image estimation is inherently ill-posed for typical kernels used in applications and consequently the most widely used estimation schemes lack stability. For de...

  5. Scientific opinion on the acute health risks related to the presence of cyanogenic glycosides in raw apricot kernels and products derived from raw apricot kernels

    DEFF Research Database (Denmark)

    Petersen, Annette

    of kernels promoted (10 and 60 kernels/day for the general population and cancer patients, respectively), exposures exceeded the ARfD 17–413 and 3–71 times in toddlers and adults, respectively. The estimated maximum quantity of apricot kernels (or raw apricot material) that can be consumed without exceeding...

  6. The nonparametric bootstrap for the current status model

    NARCIS (Netherlands)

    Groeneboom, P.; Hendrickx, K.

    2017-01-01

    It has been proved that direct bootstrapping of the nonparametric maximum likelihood estimator (MLE) of the distribution function in the current status model leads to inconsistent confidence intervals. We show that bootstrapping of functionals of the MLE can however be used to produce valid

  7. Explicit signal to noise ratio in reproducing kernel Hilbert spaces

    DEFF Research Database (Denmark)

    Gomez-Chova, Luis; Nielsen, Allan Aasbjerg; Camps-Valls, Gustavo

    2011-01-01

    This paper introduces a nonlinear feature extraction method based on kernels for remote sensing data analysis. The proposed approach is based on the minimum noise fraction (MNF) transform, which maximizes the signal variance while also minimizing the estimated noise variance. We here propose...... an alternative kernel MNF (KMNF) in which the noise is explicitly estimated in the reproducing kernel Hilbert space. This enables KMNF dealing with non-linear relations between the noise and the signal features jointly. Results show that the proposed KMNF provides the most noise-free features when confronted...

  8. A generalized L1-approach for a kernel estimator of conditional quantile with functional regressors: Consistency and asymptotic normality

    OpenAIRE

    2009-01-01

    Abstract A kernel estimator of the conditional quantile is defined for a scalar response variable given a covariate taking values in a semi-metric space. The approach generalizes the median's L1-norm estimator. The almost complete consistency and asymptotic normality are stated. correspondance: Corresponding author. Tel: +33 320 964 933; fax: +33 320 964 704. (Lemdani, Mohamed) (Laksaci, Ali) mohamed.lemdani@univ-lill...

  9. Bayesian nonparametric estimation of continuous monotone functions with applications to dose-response analysis.

    Science.gov (United States)

    Bornkamp, Björn; Ickstadt, Katja

    2009-03-01

    In this article, we consider monotone nonparametric regression in a Bayesian framework. The monotone function is modeled as a mixture of shifted and scaled parametric probability distribution functions, and a general random probability measure is assumed as the prior for the mixing distribution. We investigate the choice of the underlying parametric distribution function and find that the two-sided power distribution function is well suited both from a computational and mathematical point of view. The model is motivated by traditional nonlinear models for dose-response analysis, and provides possibilities to elicit informative prior distributions on different aspects of the curve. The method is compared with other recent approaches to monotone nonparametric regression in a simulation study and is illustrated on a data set from dose-response analysis.

  10. Examining Potential Boundary Bias Effects in Kernel Smoothing on Equating: An Introduction for the Adaptive and Epanechnikov Kernels.

    Science.gov (United States)

    Cid, Jaime A; von Davier, Alina A

    2015-05-01

    Test equating is a method of making the test scores from different test forms of the same assessment comparable. In the equating process, an important step involves continuizing the discrete score distributions. In traditional observed-score equating, this step is achieved using linear interpolation (or an unscaled uniform kernel). In the kernel equating (KE) process, this continuization process involves Gaussian kernel smoothing. It has been suggested that the choice of bandwidth in kernel smoothing controls the trade-off between variance and bias. In the literature on estimating density functions using kernels, it has also been suggested that the weight of the kernel depends on the sample size, and therefore, the resulting continuous distribution exhibits bias at the endpoints, where the samples are usually smaller. The purpose of this article is (a) to explore the potential effects of atypical scores (spikes) at the extreme ends (high and low) on the KE method in distributions with different degrees of asymmetry using the randomly equivalent groups equating design (Study I), and (b) to introduce the Epanechnikov and adaptive kernels as potential alternative approaches to reducing boundary bias in smoothing (Study II). The beta-binomial model is used to simulate observed scores reflecting a range of different skewed shapes.
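
    The boundary effect mentioned in this abstract can be made concrete with a small sketch. The code below smooths a discrete score distribution with a Gaussian and an Epanechnikov kernel and measures how much probability mass stays inside the score range. It is a plain kernel density estimate, not the kernel equating continuization itself, and the uniform score distribution, the bandwidth of 1 and all function names are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def epanechnikov_kernel(u):
    # Compact support: zero outside [-1, 1].
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def continuize(scores, probs, x, bandwidth, kernel):
    """Kernel-smoothed density at x for a discrete score distribution."""
    return sum(p * kernel((x - s) / bandwidth) for s, p in zip(scores, probs)) / bandwidth

def mass_inside(scores, probs, bandwidth, kernel, lo, hi, steps=2000):
    """Midpoint-rule integral of the smoothed density over [lo, hi]."""
    width = (hi - lo) / steps
    total = sum(continuize(scores, probs, lo + (i + 0.5) * width, bandwidth, kernel)
                for i in range(steps))
    return total * width

# Hypothetical test with integer scores 0..10, all equally likely.
scores = list(range(11))
probs = [1.0 / 11] * 11
g = mass_inside(scores, probs, 1.0, gaussian_kernel, 0.0, 10.0)
e = mass_inside(scores, probs, 1.0, epanechnikov_kernel, 0.0, 10.0)
print("mass kept inside [0, 10]: gaussian", g, "epanechnikov", e)
```

    Under these assumptions both kernels push some mass past the endpoints, and the compactly supported Epanechnikov kernel loses it only from the two extreme scores.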

  11. Using non-parametric methods in econometric production analysis

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    Econometric estimation of production functions is one of the most common methods in applied economic production analysis. These studies usually apply parametric estimation techniques, which obligate the researcher to specify the functional form of the production function. Most often, the Cobb...... results—including measures that are of interest of applied economists, such as elasticities. Therefore, we propose to use nonparametric econometric methods. First, they can be applied to verify the functional form used in parametric estimations of production functions. Second, they can be directly used...

  12. A menu-driven software package of Bayesian nonparametric (and parametric) mixed models for regression analysis and density estimation.

    Science.gov (United States)

    Karabatsos, George

    2017-02-01

    Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected

  13. Non-Parametric Analysis of Rating Transition and Default Data

    DEFF Research Database (Denmark)

    Fledelius, Peter; Lando, David; Perch Nielsen, Jens

    2004-01-01

    We demonstrate the use of non-parametric intensity estimation - including construction of pointwise confidence sets - for analyzing rating transition data. We find that transition intensities away from the class studied here for illustration strongly depend on the direction of the previous move b...

  14. Adaptive nonparametric Bayesian inference using location-scale mixture priors

    NARCIS (Netherlands)

    Jonge, de R.; Zanten, van J.H.

    2010-01-01

    We study location-scale mixture priors for nonparametric statistical problems, including multivariate regression, density estimation and classification. We show that a rate-adaptive procedure can be obtained if the prior is properly constructed. In particular, we show that adaptation is achieved if

  15. Nonlinear Forecasting With Many Predictors Using Kernel Ridge Regression

    DEFF Research Database (Denmark)

    Exterkate, Peter; Groenen, Patrick J.F.; Heij, Christiaan

    This paper puts forward kernel ridge regression as an approach for forecasting with many predictors that are related nonlinearly to the target variable. In kernel ridge regression, the observed predictor variables are mapped nonlinearly into a high-dimensional space, where estimation of the predi...

  16. Geostatistical radar-raingauge combination with nonparametric correlograms: methodological considerations and application in Switzerland

    Science.gov (United States)

    Schiemann, R.; Erdin, R.; Willi, M.; Frei, C.; Berenguer, M.; Sempere-Torres, D.

    2011-05-01

    Modelling spatial covariance is an essential part of all geostatistical methods. Traditionally, parametric semivariogram models are fit from available data. More recently, it has been suggested to use nonparametric correlograms obtained from spatially complete data fields. Here, both estimation techniques are compared. Nonparametric correlograms are shown to have a substantial negative bias. Nonetheless, when combined with the sample variance of the spatial field under consideration, they yield an estimate of the semivariogram that is unbiased for small lag distances. This justifies the use of this estimation technique in geostatistical applications. Various formulations of geostatistical combination (Kriging) methods are used here for the construction of hourly precipitation grids for Switzerland based on data from a sparse realtime network of raingauges and from a spatially complete radar composite. Two variants of Ordinary Kriging (OK) are used to interpolate the sparse gauge observations. In both OK variants, the radar data are only used to determine the semivariogram model. One variant relies on a traditional parametric semivariogram estimate, whereas the other variant uses the nonparametric correlogram. The variants are tested for three cases and the impact of the semivariogram model on the Kriging prediction is illustrated. For the three test cases, the method using nonparametric correlograms performs equally well or better than the traditional method, and at the same time offers great practical advantages. Furthermore, two variants of Kriging with external drift (KED) are tested, both of which use the radar data to estimate nonparametric correlograms, and as the external drift variable. The first KED variant has been used previously for geostatistical radar-raingauge merging in Catalonia (Spain). The second variant is newly proposed here and is an extension of the first. 
Both variants are evaluated for the three test cases as well as an extended evaluation

  17. Non-parametric system identification from non-linear stochastic response

    DEFF Research Database (Denmark)

    Rüdinger, Finn; Krenk, Steen

    2001-01-01

    An estimation method is proposed for identification of non-linear stiffness and damping of single-degree-of-freedom systems under stationary white noise excitation. Non-parametric estimates of the stiffness and damping along with an estimate of the white noise intensity are obtained by suitable...... of the energy at mean-level crossings, which yields the damping relative to white noise intensity. Finally, an estimate of the noise intensity is extracted by estimating the absolute damping from the autocovariance functions of a set of modified phase plane variables at different energy levels. The method...

  18. Kernel-based noise filtering of neutron detector signals

    International Nuclear Information System (INIS)

    Park, Moon Ghu; Shin, Ho Cheol; Lee, Eun Ki

    2007-01-01

    This paper describes recently developed techniques for effective filtering of neutron detector signal noise. In this paper, three kinds of noise filters are proposed and their performance is demonstrated for the estimation of reactivity. The tested filters are based on the unilateral kernel filter, unilateral kernel filter with adaptive bandwidth and bilateral filter to show their effectiveness in edge preservation. Filtering performance is compared with conventional low-pass and wavelet filters. The bilateral filter shows a remarkable improvement compared with unilateral kernel and wavelet filters. The effectiveness and simplicity of the unilateral kernel filter with adaptive bandwidth is also demonstrated by applying it to the reactivity measurement performed during reactor start-up physics tests
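
    To show why a bilateral filter can preserve edges better than a kernel filter that weights by time distance only, here is a minimal one-dimensional sketch. It assumes that "unilateral" means weighting on the time axis alone, uses Gaussian weights and a symmetric window, and applies them to a noise-free step signal; none of these choices are taken from the paper.

```python
import math

def kernel_filter(signal, sigma_t, sigma_v=None, radius=5):
    """Gaussian kernel smoother; adds a value-difference weight when sigma_v is given."""
    n = len(signal)
    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            # Time-domain weight: nearby samples count more.
            w = math.exp(-0.5 * ((i - j) / sigma_t) ** 2)
            if sigma_v is not None:
                # Range weight (bilateral part): samples with similar values count more.
                w *= math.exp(-0.5 * ((signal[i] - signal[j]) / sigma_v) ** 2)
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out

# Hypothetical step signal, standing in for an abrupt change in a detector reading.
step = [0.0] * 20 + [1.0] * 20
time_only = kernel_filter(step, sigma_t=2.0)
bilateral = kernel_filter(step, sigma_t=2.0, sigma_v=0.1)
print("sample just before the step:", time_only[19], bilateral[19])
```

    In this sketch the time-only filter smears the step across neighbouring samples, while the range weight keeps the sample just before the step close to its original value.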

  19. Locally linear approximation for Kernel methods : the Railway Kernel

    OpenAIRE

    Muñoz, Alberto; González, Javier

    2008-01-01

    In this paper we present a new kernel, the Railway Kernel, that works properly for general (nonlinear) classification problems, with the interesting property that it acts locally as a linear kernel. In this way, we avoid potential problems due to the use of a general-purpose kernel such as the RBF kernel, for example the high dimension of the induced feature space. As a consequence, following our methodology the number of support vectors is much lower and, therefore, the generalization capab...

  20. Dispersal kernel estimation: A comparison of empirical and modelled particle dispersion in a coastal marine system

    Science.gov (United States)

    Hrycik, Janelle M.; Chassé, Joël; Ruddick, Barry R.; Taggart, Christopher T.

    2013-11-01

    Early life-stage dispersal influences recruitment and is of significance in explaining the distribution and connectivity of marine species. Motivations for quantifying dispersal range from biodiversity conservation to the design of marine reserves and the mitigation of species invasions. Here we compare estimates of real particle dispersion in a coastal marine environment with similar estimates provided by hydrodynamic modelling. We do so by using a system of magnetically attractive particles (MAPs) and a magnetic-collector array that provides measures of Lagrangian dispersion based on the time-integration of MAPs dispersing through the array. MAPs released as a point source in a coastal marine location dispersed through the collector array over a 5-7 d period. A virtual release and observed (real-time) environmental conditions were used in a high-resolution three-dimensional hydrodynamic model to estimate the dispersal of virtual particles (VPs). The number of MAPs captured throughout the collector array and the number of VPs that passed through each corresponding model location were enumerated and compared. Although VP dispersal reflected several aspects of the observed MAP dispersal, the comparisons demonstrated model sensitivity to the small-scale (random-walk) particle diffusivity parameter (Kp). The one-dimensional dispersal kernel for the MAPs had an e-folding scale estimate in the range of 5.19-11.44 km, while those from the model simulations were comparable at 1.89-6.52 km, and also demonstrated sensitivity to Kp. Variations among comparisons are related to the value of Kp used in modelling and are postulated to be related to MAP losses from the water column and (or) shear dispersion acting on the MAPs; a process that is constrained in the model. Our demonstration indicates a promising new way of 1) quantitatively and empirically estimating the dispersal kernel in aquatic systems, and 2) quantitatively assessing and (or) improving regional hydrodynamic

  1. Multiple Kernel Learning with Random Effects for Predicting Longitudinal Outcomes and Data Integration

    Science.gov (United States)

    Chen, Tianle; Zeng, Donglin

    2015-01-01

    Summary Predicting disease risk and progression is one of the main goals in many clinical research studies. Cohort studies on the natural history and etiology of chronic diseases span years and data are collected at multiple visits. Although kernel-based statistical learning methods are proven to be powerful for a wide range of disease prediction problems, these methods are only well studied for independent data but not for longitudinal data. It is thus important to develop time-sensitive prediction rules that make use of the longitudinal nature of the data. In this paper, we develop a novel statistical learning method for longitudinal data by introducing subject-specific short-term and long-term latent effects through a designed kernel to account for within-subject correlation of longitudinal measurements. Since the presence of multiple sources of data is increasingly common, we embed our method in a multiple kernel learning framework and propose a regularized multiple kernel statistical learning with random effects to construct effective nonparametric prediction rules. Our method allows easy integration of various heterogeneous data sources and takes advantage of correlation among longitudinal measures to increase prediction power. We use different kernels for each data source taking advantage of the distinctive feature of each data modality, and then optimally combine data across modalities. We apply the developed methods to two large epidemiological studies, one on Huntington's disease and the other on Alzheimer's Disease (Alzheimer's Disease Neuroimaging Initiative, ADNI) where we explore a unique opportunity to combine imaging and genetic data to study prediction of mild cognitive impairment, and show a substantial gain in performance while accounting for the longitudinal aspect of the data. PMID:26177419

  2. An Extreme Learning Machine Based on the Mixed Kernel Function of Triangular Kernel and Generalized Hermite Dirichlet Kernel

    Directory of Open Access Journals (Sweden)

    Senyue Zhang

    2016-01-01

    Because the performance of an extreme learning machine (ELM) correlates strongly with its kernel function, a novel extreme learning machine based on a generalized triangle Hermitian kernel function was proposed in this paper. First, the generalized triangle Hermitian kernel function was constructed as the product of a triangular kernel and a generalized Hermite Dirichlet kernel, and the proposed kernel function was proved to be a valid kernel function for the extreme learning machine. Then, the learning methodology of the extreme learning machine based on the proposed kernel function was presented. The biggest advantage of the proposed kernel is that its kernel parameter takes values only in the natural numbers, which can greatly shorten the computational time of parameter optimization and retain more of the structural information in the sample data. Experiments were performed on a number of binary classification, multiclassification, and regression datasets from the UCI benchmark repository. The experimental results demonstrated that the proposed method outperforms other extreme learning machines with different kernels in robustness and generalization performance. Furthermore, the learning speed of the proposed method is faster than that of support vector machine (SVM) methods.

  3. Nonparametric Bayesian inference for mean residual life functions in survival analysis.

    Science.gov (United States)

    Poynor, Valerie; Kottas, Athanasios

    2018-01-19

    Modeling and inference for survival analysis problems typically revolves around different functions related to the survival distribution. Here, we focus on the mean residual life (MRL) function, which provides the expected remaining lifetime given that a subject has survived (i.e. is event-free) up to a particular time. This function is of direct interest in reliability, medical, and actuarial fields. In addition to its practical interpretation, the MRL function characterizes the survival distribution. We develop general Bayesian nonparametric inference for MRL functions built from a Dirichlet process mixture model for the associated survival distribution. The resulting model for the MRL function admits a representation as a mixture of the kernel MRL functions with time-dependent mixture weights. This model structure allows for a wide range of shapes for the MRL function. Particular emphasis is placed on the selection of the mixture kernel, taken to be a gamma distribution, to obtain desirable properties for the MRL function arising from the mixture model. The inference method is illustrated with a data set of two experimental groups and a data set involving right censoring. The supplementary material available at Biostatistics online provides further results on empirical performance of the model, using simulated data examples. © The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
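
    For readers unfamiliar with the mean residual life function defined in this abstract, the sketch below computes its simplest empirical version: the average remaining lifetime among subjects still event-free at time t. It assumes fully observed lifetimes (no censoring) and hypothetical data, and it is unrelated to the Dirichlet process mixture model the authors develop.

```python
def empirical_mrl(lifetimes, t):
    """Average remaining lifetime among subjects still event-free at time t."""
    remaining = [x - t for x in lifetimes if x > t]
    if not remaining:
        return None  # nobody survives past t, so the MRL is undefined here
    return sum(remaining) / len(remaining)

# Hypothetical, fully observed lifetimes.
lifetimes = [1.0, 2.0, 3.0, 4.0, 5.0]
for t in (0.0, 2.0, 4.5):
    print(t, empirical_mrl(lifetimes, t))
```

    At t = 0 this reduces to the ordinary sample mean of the lifetimes.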

  4. Kernel density estimation and transition maps of Moldavian Neolithic and Eneolithic settlement

    Directory of Open Access Journals (Sweden)

    Robin Brigand

    2018-04-01

    The data presented in this article are related to the research article entitled “Neo-Eneolithic settlement pattern and salt exploitation in Romanian Moldavia” (Brigand and Weller, 2018 [1]). Kernel density estimation (KDE) is used in order to move beyond the discrete distribution of sites and to enable us to work on a continuous surface that reflects the intensity of the occupation in the space. Maps of density per period – Neolithic I (Cris), Neolithic II (LBK), Eneolithic I (Precucuteni), Eneolithic II (Cucuteni A), Eneolithic III-IV (Cucuteni A-B and B) – are used to create maps of density difference (Figs. 1–4) in order to analyse the dynamic (either non-existent, negative or positive) between two chronological sequences.
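
    The density-difference idea in this record can be sketched in a few lines. The code below evaluates a two-dimensional Gaussian kernel density estimate for two sets of site coordinates on a coarse grid and subtracts one surface from the other. The coordinates, the bandwidth, the grid and the function names are hypothetical; the article's actual data and GIS workflow are not reproduced here.

```python
import math

def kde2d(points, x, y, h):
    """Gaussian product-kernel density estimate at (x, y) with bandwidth h."""
    norm = 1.0 / (2.0 * math.pi * h * h * len(points))
    return norm * sum(
        math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * h * h))
        for px, py in points
    )

def density_difference(earlier, later, grid, h):
    """Rows index y, columns index x; positive cells gained density over time."""
    return [[kde2d(later, x, y, h) - kde2d(earlier, x, y, h) for x in grid]
            for y in grid]

# Hypothetical site coordinates for two consecutive periods.
earlier = [(1.0, 1.0), (1.5, 1.2), (2.0, 1.0)]
later = [(4.0, 4.0), (4.2, 3.8), (1.0, 1.0)]
grid = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
diff = density_difference(earlier, later, grid, h=1.0)
print("change near (1, 1):", diff[1][1], "change near (4, 4):", diff[4][4])
```

    Negative cells mark areas where the estimated occupation intensity dropped between the two periods, positive cells mark areas where it rose, and values near zero indicate no change.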

  5. Essays on parametric and nonparametric modeling and estimation with applications to energy economics

    Science.gov (United States)

    Gao, Weiyu

    My dissertation research is composed of two parts: a theoretical part on semiparametric efficient estimation and an applied part in energy economics under different dynamic settings. The essays are related in terms of their applications as well as the way in which models are constructed and estimated. In the first essay, efficient estimation of the partially linear model is studied. We work out the efficient score functions and efficiency bounds under four stochastic restrictions: independence, conditional symmetry, conditional zero mean, and partially conditional zero mean. A feasible efficient estimation method for the linear part of the model is developed based on the efficient score. A battery of specification tests that allows for choosing between the alternative assumptions is provided. A Monte Carlo simulation is also conducted. The second essay presents a dynamic optimization model for a stylized oilfield resembling the largest developed light oil field in Saudi Arabia, Ghawar. We use data from different sources to estimate the oil production cost function and the revenue function. We pay particular attention to the dynamic aspect of the oil production by employing petroleum-engineering software to simulate the interaction between control variables and reservoir state variables. Optimal solutions are studied under different scenarios to account for the possible changes in the exogenous variables and the uncertainty about the forecasts. The third essay examines the effect of oil price volatility on the level of innovation displayed by the U.S. economy. A measure of innovation is calculated by decomposing an output-based Malmquist index. We also construct a nonparametric measure for oil price volatility. Technical change and oil price volatility are then placed in a VAR system with oil price and a variable indicative of monetary policy. The system is estimated and analyzed for significant relationships. We find that oil price volatility displays a significant

  6. Estimation with Right-Censored Observations Under A Semi-Markov Model.

    Science.gov (United States)

    Zhao, Lihui; Hu, X Joan

    2013-06-01

    The semi-Markov process often provides a better framework than the classical Markov process for the analysis of events with multiple states. The purpose of this paper is twofold. First, we show that in the presence of right censoring, when the right end-point of the support of the censoring time is strictly less than the right end-point of the support of the semi-Markov kernel, the transition probability of the semi-Markov process is nonidentifiable, and the estimators proposed in the literature are inconsistent in general. We derive the set of all attainable values for the transition probability based on the censored data, and we propose a nonparametric inference procedure for the transition probability using this set. Second, the conventional approach to constructing confidence bands is not applicable for the semi-Markov kernel and the sojourn time distribution. We propose new perturbation resampling methods to construct these confidence bands. Different weights and transformations are explored in the construction. We use simulation to examine our proposals and illustrate them with hospitalization data from a recent cancer survivor study.

  7. A Nonparametric Bayesian Approach For Emission Tomography Reconstruction

    International Nuclear Information System (INIS)

    Barat, Eric; Dautremer, Thomas

    2007-01-01

    We introduce a PET reconstruction algorithm following a nonparametric Bayesian (NPB) approach. In contrast with Expectation Maximization (EM), the proposed technique does not rely on any space discretization. Namely, the activity distribution (the normalized emission intensity of the spatial Poisson process) is considered as a spatial probability density and observations are the projections of random emissions whose distribution has to be estimated. This approach is nonparametric in the sense that the quantity of interest belongs to the set of probability measures on R^k (for reconstruction in k dimensions) and it is Bayesian in the sense that we define a prior directly on this spatial measure. In this context, we propose to model the nonparametric probability density as an infinite mixture of multivariate normal distributions. As a prior for this mixture we consider a Dirichlet Process Mixture (DPM) with a Normal-Inverse Wishart (NIW) model as base distribution of the Dirichlet Process. As in EM-family reconstruction, we use a data augmentation scheme where the set of hidden variables are the emission locations for each observed line of response in the continuous object space. Thanks to the data augmentation, we propose a Markov Chain Monte Carlo (MCMC) algorithm (Gibbs sampler) which is able to generate draws from the posterior distribution of the spatial intensity. A difference with EM is that one step of the Gibbs sampler corresponds to the generation of emission locations while only the expected number of emissions per pixel/voxel is used in EM. Another key difference is that the estimated spatial intensity is a continuous function such that there is no need to compute a projection matrix. Finally, draws from the intensity posterior distribution allow the estimation of posterior functionals like the variance or confidence intervals. Results are presented for simulated data based on a 2D brain phantom and compared to Bayesian MAP-EM.

  8. Promotion time cure rate model with nonparametric form of covariate effects.

    Science.gov (United States)

    Chen, Tianlei; Du, Pang

    2018-05-10

    Survival data with a cured portion are commonly seen in clinical trials. Motivated by a biological interpretation of cancer metastasis, the promotion time cure model is a popular alternative to the mixture cure rate model for analyzing such data. The existing promotion time cure models all assume a restrictive parametric form of covariate effects, which can be incorrectly specified, especially at the exploratory stage. In this paper, we propose a nonparametric approach to modeling the covariate effects under the framework of the promotion time cure model. The covariate effect function is estimated by smoothing splines via the optimization of a penalized profile likelihood. Point-wise interval estimates are also derived from the Bayesian interpretation of the penalized profile likelihood. Asymptotic convergence rates are established for the proposed estimates. Simulations show excellent performance of the proposed nonparametric method, which is then applied to a melanoma study. Copyright © 2018 John Wiley & Sons, Ltd.

  9. Prior processes and their applications nonparametric Bayesian estimation

    CERN Document Server

    Phadia, Eswar G

    2016-01-01

    This book presents a systematic and comprehensive treatment of various prior processes that have been developed over the past four decades for dealing with Bayesian approach to solving selected nonparametric inference problems. This revised edition has been substantially expanded to reflect the current interest in this area. After an overview of different prior processes, it examines the now pre-eminent Dirichlet process and its variants including hierarchical processes, then addresses new processes such as dependent Dirichlet, local Dirichlet, time-varying and spatial processes, all of which exploit the countable mixture representation of the Dirichlet process. It subsequently discusses various neutral to right type processes, including gamma and extended gamma, beta and beta-Stacy processes, and then describes the Chinese Restaurant, Indian Buffet and infinite gamma-Poisson processes, which prove to be very useful in areas such as machine learning, information retrieval and featural modeling. Tailfree and P...

  10. Network Kernel Density Estimation for the Analysis of Facility POI Hotspots

    Directory of Open Access Journals (Sweden)

    YU Wenhao

    2015-12-01

    Full Text Available The distribution pattern of urban facility POIs (points of interest) usually forms clusters (i.e. "hotspots") in urban geographic space. To detect this type of hotspot, existing methods mostly employ spatial density estimation based on Euclidean distance, ignoring the fact that the service functions and interrelations of urban facilities operate along network path distance rather than conventional Euclidean distance. With these methods, it is difficult to delimit the shape and size of a hotspot exactly and objectively. Therefore, this research adopts kernel density estimation based on network distance to compute the density of hotspots and proposes a simple and efficient algorithm. The algorithm extends the 2D dilation operator to the 1D morphological operator, thus computing the density of each network unit. Evaluation experiments suggest that the algorithm is more efficient and scalable than existing algorithms. In a case study on real POI data, the extent of the hotspots highlights the spatial characteristics of urban functions along traffic routes, providing valuable spatial knowledge and information services for applications in regional planning, navigation and geographic information inquiry.

  11. Lévy matters IV estimation for discretely observed Lévy processes

    CERN Document Server

    Belomestny, Denis; Genon-Catalot, Valentine; Masuda, Hiroki; Reiß, Markus

    2015-01-01

    The aim of this volume is to provide an extensive account of the most recent advances in statistics for discretely observed Lévy processes. These days, statistics for stochastic processes is a lively topic, driven by the needs of various fields of application, such as finance, the biosciences, and telecommunication. The three chapters of this volume are completely dedicated to the estimation of Lévy processes, and are written by experts in the field. The first chapter by Denis Belomestny and Markus Reiß treats the low-frequency situation, and estimation methods are based on the empirical characteristic function. The second chapter by Fabienne Comte and Valentine Genon-Catalot is dedicated to non-parametric estimation, mainly covering the high-frequency data case. A distinctive feature of this part is the construction of adaptive estimators, based on deconvolution or projection or kernel methods. The last chapter by Hiroki Masuda considers the parametric situation. The chapters cover the main aspects of the est...

  12. System identification via sparse multiple kernel-based regularization using sequential convex optimization techniques

    DEFF Research Database (Denmark)

    Chen, Tianshi; Andersen, Martin Skovgaard; Ljung, Lennart

    2014-01-01

    Model estimation and structure detection with short data records are two issues that receive increasing interest in System Identification. In this paper, a multiple kernel-based regularization method is proposed to handle those issues. Multiple kernels are conic combinations of fixed kernels...

  13. Kernel Machine SNP-set Testing under Multiple Candidate Kernels

    Science.gov (United States)

    Wu, Michael C.; Maity, Arnab; Lee, Seunggeun; Simmons, Elizabeth M.; Harmon, Quaker E.; Lin, Xinyi; Engel, Stephanie M.; Molldrem, Jeffrey J.; Armistead, Paul M.

    2013-01-01

    Joint testing for the cumulative effect of multiple single nucleotide polymorphisms grouped on the basis of prior biological knowledge has become a popular and powerful strategy for the analysis of large scale genetic association studies. The kernel machine (KM) testing framework is a useful approach that has been proposed for testing associations between multiple genetic variants and many different types of complex traits by comparing pairwise similarity in phenotype between subjects to pairwise similarity in genotype, with similarity in genotype defined via a kernel function. An advantage of the KM framework is its flexibility: choosing different kernel functions allows for different assumptions concerning the underlying model and can allow for improved power. In practice, it is difficult to know which kernel to use a priori since this depends on the unknown underlying trait architecture and selecting the kernel which gives the lowest p-value can lead to inflated type I error. Therefore, we propose practical strategies for KM testing when multiple candidate kernels are present based on constructing composite kernels and based on efficient perturbation procedures. We demonstrate through simulations and real data applications that the procedures protect the type I error rate and can lead to substantially improved power over poor choices of kernels and only modest differences in power versus using the best candidate kernel. PMID:23471868

  14. On the expected value and variance for an estimator of the spatio-temporal product density function

    DEFF Research Database (Denmark)

    Rodríguez-Corté, Francisco J.; Ghorbani, Mohammad; Mateu, Jorge

    Second-order characteristics are used to analyse the spatio-temporal structure of the underlying point process, and thus these methods provide a natural starting point for the analysis of spatio-temporal point process data. We restrict our attention to the spatio-temporal product density function......, and develop a non-parametric edge-corrected kernel estimate of the product density under the second-order intensity-reweighted stationary hypothesis. The expectation and variance of the estimator are obtained, and closed-form expressions derived under the Poisson case. A detailed simulation study is presented...... to compare our closed-form expression for the variance with estimated ones for Poisson cases. The simulation experiments show that the theoretical form for the variance gives acceptable values, which can be used in practice. Finally, we apply the resulting estimator to data on the spatio-temporal distribution...

  15. Nonparametric instrumental regression with non-convex constraints

    International Nuclear Information System (INIS)

    Grasmair, M; Scherzer, O; Vanhems, A

    2013-01-01

    This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition. (paper)

  16. Nonparametric instrumental regression with non-convex constraints

    Science.gov (United States)

    Grasmair, M.; Scherzer, O.; Vanhems, A.

    2013-03-01

    This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition.

  17. Adaptive Kernel Canonical Correlation Analysis Algorithms for Nonparametric Identification of Wiener and Hammerstein Systems

    Directory of Open Access Journals (Sweden)

    Ignacio Santamaría

    2008-04-01

    Full Text Available This paper treats the identification of nonlinear systems that consist of a cascade of a linear channel and a nonlinearity, such as the well-known Wiener and Hammerstein systems. In particular, we follow a supervised identification approach that simultaneously identifies both parts of the nonlinear system. Given the correct restrictions on the identification problem, we show how kernel canonical correlation analysis (KCCA) emerges as the logical solution to this problem. We then extend the proposed identification algorithm to an adaptive version that can deal with time-varying systems. In order to avoid overfitting problems, we discuss and compare three possible regularization techniques for both the batch and the adaptive versions of the proposed algorithm. Simulations are included to demonstrate the effectiveness of the presented algorithm.

  18. On Convergence of Kernel Density Estimates in Particle Filtering

    Czech Academy of Sciences Publication Activity Database

    Coufal, David

    2016-01-01

    Vol. 52, No. 5 (2016), pp. 735-756 ISSN 0023-5954 Grant - others: GA ČR(CZ) GA16-03708S; SVV(CZ) 260334/2016 Institutional support: RVO:67985807 Keywords: Fourier analysis * kernel methods * particle filter Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.379, year: 2016

  19. Seasonal adjustment methods and real time trend-cycle estimation

    CERN Document Server

    Bee Dagum, Estela

    2016-01-01

    This book explores widely used seasonal adjustment methods and recent developments in real time trend-cycle estimation. It discusses in detail the properties and limitations of X12ARIMA, TRAMO-SEATS and STAMP - the main seasonal adjustment methods used by statistical agencies. Several real-world cases illustrate each method and real data examples can be followed throughout the text. The trend-cycle estimation is presented using nonparametric techniques based on moving averages, linear filters and reproducing kernel Hilbert spaces, taking recent advances into account. The book provides a systematic treatment of results that to date have been scattered throughout the literature. Seasonal adjustment and real time trend-cycle prediction play an essential part at all levels of activity in modern economies. They are used by governments to counteract cyclical recessions, by central banks to control inflation, by decision makers for better modeling and planning and by hospitals, manufacturers, builders, transportat...

  20. Kernel Temporal Differences for Neural Decoding

    Science.gov (United States)

    Bae, Jihye; Sanchez Giraldo, Luis G.; Pohlmeyer, Eric A.; Francis, Joseph T.; Sanchez, Justin C.; Príncipe, José C.

    2015-01-01

    We study the feasibility and capability of the kernel temporal difference (KTD)(λ) algorithm for neural decoding. KTD(λ) is an online, kernel-based learning algorithm, which has been introduced to estimate value functions in reinforcement learning. This algorithm combines kernel-based representations with the temporal difference approach to learning. One of our key observations is that by using strictly positive definite kernels, the algorithm's convergence can be guaranteed for policy evaluation. The algorithm's nonlinear functional approximation capabilities are shown in both simulations of policy evaluation and neural decoding problems (policy improvement). KTD can handle high-dimensional neural states containing spatial-temporal information at a reasonable computational complexity, allowing real-time applications. When the algorithm seeks a proper mapping between a monkey's neural states and desired positions of a computer cursor or a robot arm, in both open-loop and closed-loop experiments, it can effectively learn the neural state to action mapping. Finally, a visualization of the coadaptation process between the decoder and the subject shows the algorithm's capabilities in reinforcement learning brain machine interfaces. PMID:25866504

  1. CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions

    Science.gov (United States)

    Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.

  2. RTOS kernel in portable electrocardiograph

    Science.gov (United States)

    Centeno, C. A.; Voos, J. A.; Riva, G. G.; Zerbini, C.; Gonzalez, E. A.

    2011-12-01

    This paper presents the use of a Real Time Operating System (RTOS) on a portable electrocardiograph based on a microcontroller platform. All medical device digital functions are performed by the microcontroller. The electrocardiograph CPU is based on the 18F4550 microcontroller, in which a uCOS-II RTOS can be embedded. The decision to use the kernel is based on its benefits, its license for educational use, and its intrinsic time control and peripherals management. The feasibility of its use on the electrocardiograph is evaluated based on the minimum memory requirements due to the kernel structure. The kernel's own tools were used for time estimation and evaluation of resources used by each process. After this feasibility analysis, the cyclic code is migrated to a structure based on separate processes or tasks able to synchronize events, resulting in an electrocardiograph running on one Central Processing Unit (CPU) based on an RTOS.

  3. RTOS kernel in portable electrocardiograph

    International Nuclear Information System (INIS)

    Centeno, C A; Voos, J A; Riva, G G; Zerbini, C; Gonzalez, E A

    2011-01-01

    This paper presents the use of a Real Time Operating System (RTOS) on a portable electrocardiograph based on a microcontroller platform. All medical device digital functions are performed by the microcontroller. The electrocardiograph CPU is based on the 18F4550 microcontroller, in which a uCOS-II RTOS can be embedded. The decision to use the kernel is based on its benefits, its license for educational use, and its intrinsic time control and peripherals management. The feasibility of its use on the electrocardiograph is evaluated based on the minimum memory requirements due to the kernel structure. The kernel's own tools were used for time estimation and evaluation of resources used by each process. After this feasibility analysis, the cyclic code is migrated to a structure based on separate processes or tasks able to synchronize events, resulting in an electrocardiograph running on one Central Processing Unit (CPU) based on an RTOS.

  4. A kernel principal component analysis–based degradation model and remaining useful life estimation for the turbofan engine

    Directory of Open Access Journals (Sweden)

    Delong Feng

    2016-05-01

    Full Text Available Remaining useful life estimation within prognostics and health management is a complicated and difficult research question for maintenance. In this article, we consider the problem of prognostics modeling and estimation of the turbofan engine under complicated circumstances and propose a kernel principal component analysis–based degradation model and remaining useful life estimation method for such aircraft engines. We first analyze the output data created by the turbofan engine thermodynamic simulation that is based on the kernel principal component analysis method and then distinguish the qualitative and quantitative relationships between the key factors. Next, we build a degradation model for the engine fault based on the following assumptions: the engine has only had constant failure (i.e. no sudden failure is included), and the engine has a Wiener process, which is a covariate that stands for the engine system drift. To predict the remaining useful life of the turbofan engine, we built a health index based on the degradation model and used the method of maximum likelihood and the data from the thermodynamic simulation model to estimate the parameters of this degradation model. Through the data analysis, we obtained a trend model of the regression curve that fits the actual statistical data. Based on the predicted health index model and the data trend model, we estimate the remaining useful life of the aircraft engine as the time at which the index reaches zero. Finally, a case study involving engine simulation data demonstrates the precision and performance advantages of the proposed method: its precision can reach 98.9%, and the average precision is 95.8%.

  5. Nonparametric statistics for social and behavioral sciences

    CERN Document Server

    Kraska-Miller, M

    2013-01-01

    Introduction to Research in Social and Behavioral Sciences; Basic Principles of Research; Planning for Research; Types of Research Designs; Sampling Procedures; Validity and Reliability of Measurement Instruments; Steps of the Research Process; Introduction to Nonparametric Statistics; Data Analysis; Overview of Nonparametric Statistics and Parametric Statistics; Overview of Parametric Statistics; Overview of Nonparametric Statistics; Importance of Nonparametric Methods; Measurement Instruments; Analysis of Data to Determine Association and Agreement; Pearson Chi-Square Test of Association and Independence; Contingency

  6. Evaluation of Nonparametric Probabilistic Forecasts of Wind Power

    DEFF Research Database (Denmark)

    Pinson, Pierre; Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg

    Predictions of wind power production for horizons up to 48-72 hours ahead comprise a highly valuable input to the methods for the daily management or trading of wind generation. Today, users of wind power predictions are not only provided with point predictions, which are estimates of the most...... likely outcome for each look-ahead time, but also with uncertainty estimates given by probabilistic forecasts. In order to avoid assumptions on the shape of predictive distributions, these probabilistic predictions are produced from nonparametric methods, and then take the form of a single or a set...

  7. An obstructive sleep apnea detection approach using kernel density classification based on single-lead electrocardiogram.

    Science.gov (United States)

    Chen, Lili; Zhang, Xi; Wang, Hui

    2015-05-01

    Obstructive sleep apnea (OSA) is a common sleep disorder that often remains undiagnosed, leading to an increased risk of developing cardiovascular diseases. Polysomnogram (PSG) is currently used as a gold standard for screening OSA. However, because it is time consuming, expensive and causes discomfort, alternative techniques based on a reduced set of physiological signals are proposed to solve this problem. This study proposes a convenient non-parametric kernel density-based approach for detection of OSA using single-lead electrocardiogram (ECG) recordings. Selected physiologically interpretable features are extracted from segmented RR intervals, which are obtained from ECG signals. These features are fed into the kernel density classifier to detect apnea events, and bandwidths for the density of each class (normal or apnea) are automatically chosen through an iterative bandwidth selection algorithm. To validate the proposed approach, RR intervals are extracted from ECG signals of 35 subjects obtained from a sleep apnea database ( http://physionet.org/cgi-bin/atm/ATM ). The results indicate that the kernel density classifier, with two features for apnea event detection, achieves a mean accuracy of 82.07%, with mean sensitivity of 83.23% and mean specificity of 80.24%. Compared with other existing methods, the proposed kernel density approach achieves comparably good performance while using fewer features and without significantly losing discriminant power, which indicates that it could be widely used for home-based screening or diagnosis of OSA.
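
    As a rough illustration of the kernel density classification idea described in this record, the sketch below estimates one Gaussian kernel density per class and assigns a sample to the class with the largest prior-weighted density. It is not the authors' implementation: the function names, the toy feature values and the fixed bandwidths are assumptions made for the example, and the iterative bandwidth selection mentioned in the abstract is not reproduced.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function x -> Gaussian product-kernel density estimate."""
    dim = len(samples[0])
    norm = len(samples) * (bandwidth * math.sqrt(2 * math.pi)) ** dim

    def density(x):
        total = 0.0
        for s in samples:
            sq_dist = sum((xi - si) ** 2 for xi, si in zip(x, s))
            total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
        return total / norm

    return density

def kde_classify(x, class_samples, bandwidths):
    """Pick the class that maximizes prior * estimated class density."""
    n_total = sum(len(s) for s in class_samples.values())
    best_label, best_score = None, -1.0
    for label, samples in class_samples.items():
        prior = len(samples) / n_total
        score = prior * gaussian_kde(samples, bandwidths[label])(x)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy two-feature data, invented for illustration (not from the study).
train = {
    "normal": [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0)],
    "apnea": [(2.0, 2.1), (1.8, 1.9), (2.2, 2.0)],
}
# Fixed bandwidths are an assumption; the study selects them iteratively.
h = {"normal": 0.5, "apnea": 0.5}
print(kde_classify((0.1, 0.0), train, h))
print(kde_classify((1.9, 2.0), train, h))
```

    Using one bandwidth per class mirrors the abstract's statement that bandwidths are chosen separately for the normal and apnea densities.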

  8. Parameter Selection Method for Support Vector Regression Based on Adaptive Fusion of the Mixed Kernel Function

    Directory of Open Access Journals (Sweden)

    Hailun Wang

    2017-01-01

    Full Text Available The support vector regression algorithm is widely used in fault diagnosis of rolling bearings. A new model parameter selection method for support vector regression based on adaptive fusion of the mixed kernel function is proposed in this paper. We choose the mixed kernel function as the kernel function of support vector regression. The fusion coefficients of the mixed kernel function, the kernel function parameters, and the regression parameters are combined as the components of the state vector. Thus, the model selection problem is transformed into a nonlinear system state estimation problem. We use a 5th-degree cubature Kalman filter to estimate the parameters. In this way, we realize the adaptive selection of the mixed kernel function weighting coefficients, the kernel parameters, and the regression parameters. Compared with a single kernel function, unscented Kalman filter (UKF) support vector regression algorithms, and genetic algorithms, the decision regression function obtained by the proposed method has better generalization ability and higher prediction accuracy.
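
    A mixed kernel of the kind this record refers to can be sketched as a weighted combination of two base kernels. The abstract does not say which base kernels are fused, so the choice of a Gaussian (RBF) and a polynomial kernel below, together with all function and parameter names, is an assumption for illustration only; the Kalman-filter-based parameter estimation is not shown.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def poly_kernel(x, y, degree, coef0=1.0):
    """Polynomial kernel: (<x, y> + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree

def mixed_kernel(x, y, weight, gamma, degree):
    """Convex combination of a local (RBF) and a global (polynomial) kernel."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (weight * rbf_kernel(x, y, gamma)
            + (1.0 - weight) * poly_kernel(x, y, degree))

# weight, gamma and degree play the role of the fusion coefficient and
# kernel parameters that the abstract says are estimated jointly.
print(mixed_kernel((1.0, 0.0), (1.0, 0.0), weight=0.5, gamma=0.5, degree=2))
```

    Restricting the weight to [0, 1] keeps the combination a valid positive semi-definite kernel, since a non-negative sum of such kernels is again one.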

  9. A contingency table approach to nonparametric testing

    CERN Document Server

    Rayner, JCW

    2000-01-01

    Most texts on nonparametric techniques concentrate on location and linear-linear (correlation) tests, with less emphasis on dispersion effects and linear-quadratic tests. Tests for higher moment effects are virtually ignored. Using a fresh approach, A Contingency Table Approach to Nonparametric Testing unifies and extends the popular, standard tests by linking them to tests based on models for data that can be presented in contingency tables. This approach unifies popular nonparametric statistical inference and makes the traditional, most commonly performed nonparametric analyses much more comp

  10. Bayesian Nonparametric Estimation of Targeted Agent Effects on Biomarker Change to Predict Clinical Outcome

    Science.gov (United States)

    Graziani, Rebecca; Guindani, Michele; Thall, Peter F.

    2015-01-01

    Summary The effect of a targeted agent on a cancer patient's clinical outcome putatively is mediated through the agent's effect on one or more early biological events. This is motivated by pre-clinical experiments with cells or animals that identify such events, represented by binary or quantitative biomarkers. When evaluating targeted agents in humans, central questions are whether the distribution of a targeted biomarker changes following treatment, the nature and magnitude of this change, and whether it is associated with clinical outcome. Major difficulties in estimating these effects are that a biomarker's distribution may be complex, vary substantially between patients, and have complicated relationships with clinical outcomes. We present a probabilistically coherent framework for modeling and estimation in this setting, including a hierarchical Bayesian nonparametric mixture model for biomarkers that we use to define a functional profile of pre-versus-post treatment biomarker distribution change. The functional is similar to the receiver operating characteristic used in diagnostic testing. The hierarchical model yields clusters of individual patient biomarker profile functionals, and we use the profile as a covariate in a regression model for clinical outcome. The methodology is illustrated by analysis of a dataset from a clinical trial in prostate cancer using imatinib to target platelet-derived growth factor, with the clinical aim to improve progression-free survival time. PMID:25319212

  11. Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks

    Science.gov (United States)

    Gray-Davies, Tristan; Holmes, Chris C.; Caron, François

    2018-01-01

    We present a novel Bayesian nonparametric regression model for covariates X and continuous response variable Y ∈ ℝ. The model is parametrized in terms of marginal distributions for Y and X and a regression function which tunes the stochastic ordering of the conditional distributions F (y|x). By adopting an approximate composite likelihood approach, we show that the resulting posterior inference can be decoupled for the separate components of the model. This procedure can scale to very large datasets and allows for the use of standard, existing, software from Bayesian nonparametric density estimation and Plackett-Luce ranking estimation to be applied. As an illustration, we show an application of our approach to a US Census dataset, with over 1,300,000 data points and more than 100 covariates. PMID:29623150

  12. Shielding calculations and collective dose estimations with the point-kernel-code VISIPLAN® for the example of the project ZENT

    International Nuclear Information System (INIS)

    Boehlke, S.; Niegoth, H.

    2012-01-01

    At the Leibstadt nuclear power plant (KKL), large components will be dismantled during the next year and stored for final disposal within the interim storage facility ZENT at the NPP site. Before construction of ZENT, appropriate estimations of the local dose rate inside and outside the building and of the collective dose for normal operation have to be performed. The shielding calculations are based on the properties of the stored components and radiation sources and on the concepts for working place requirements. The installation of control and monitoring areas will depend on these calculations. For the determination of the shielding potential of concrete walls and steel doors with the defined boundary conditions, point-kernel codes like MicroShield® are used. Complex problems cannot be modeled with this code. Therefore the point-kernel code VISIPLAN® was developed for the determination of the local dose distribution functions in 3D models. The possibility of motion sequence inputs allows an optimization of collective dose estimations for the operational phases of a nuclear facility.

  13. Kernel Multivariate Analysis Framework for Supervised Subspace Learning: A Tutorial on Linear and Kernel Multivariate Methods

    DEFF Research Database (Denmark)

    Arenas-Garcia, J.; Petersen, K.; Camps-Valls, G.

    2013-01-01

    correlation analysis (CCA), and orthonormalized PLS (OPLS), as well as their nonlinear extensions derived by means of the theory of reproducing kernel Hilbert spaces (RKHSs). We also review their connections to other methods for classification and statistical dependence estimation and introduce some recent...

  14. Data-variant kernel analysis

    CERN Document Server

    Motai, Yuichi

    2015-01-01

    Describes and discusses the variants of kernel analysis methods for data types that have been intensely studied in recent years. This book covers kernel analysis topics ranging from the fundamental theory of kernel functions to its applications. The book surveys the current status, popular trends, and developments in kernel analysis studies. The author discusses multiple kernel learning algorithms and how to choose the appropriate kernels during the learning phase. Data-Variant Kernel Analysis is a new pattern analysis framework for different types of data configurations. The chapters include

  15. Kernel regression with functional response

    OpenAIRE

    Ferraty, Frédéric; Laksaci, Ali; Tadj, Amel; Vieu, Philippe

    2011-01-01

    We consider a kernel regression estimate when both the response variable and the explanatory one are functional. The rates of uniform almost complete convergence are stated as a function of the small ball probability of the predictor and as a function of the entropy of the set on which uniformity is obtained.
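
    To make the setting concrete, the sketch below shows a Nadaraya-Watson style kernel estimate in which both the predictor and the response are curves sampled on a common grid. It is only an illustration under stated assumptions: the root-mean-square distance used as semi-metric, the Gaussian weighting, and the names functional_nw and l2_distance are choices made for the example and are not taken from the paper.

```python
import math

def l2_distance(curve_a, curve_b):
    """Root-mean-square distance between two curves on the same grid."""
    sq = sum((a - b) ** 2 for a, b in zip(curve_a, curve_b))
    return math.sqrt(sq / len(curve_a))

def functional_nw(x_new, xs, ys, bandwidth):
    """Kernel-weighted average of response curves ys, weighted by how
    close each predictor curve in xs is to x_new."""
    weights = [
        math.exp(-0.5 * (l2_distance(x_new, x) / bandwidth) ** 2) for x in xs
    ]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("bandwidth too small: all kernel weights vanished")
    grid_size = len(ys[0])
    return [
        sum(w * y[t] for w, y in zip(weights, ys)) / total
        for t in range(grid_size)
    ]

# Two training pairs of discretized curves (invented for illustration).
xs = [[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]]
ys = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]]
print(functional_nw([5.0, 5.0, 5.0], xs, ys, bandwidth=1.0))
```

    A predictor curve equidistant from both training curves receives equal weights, so the prediction is the pointwise average of the two response curves.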

  16. Kernel and divergence techniques in high energy physics separations

    Science.gov (United States)

    Bouř, Petr; Kůs, Václav; Franc, Jiří

    2017-10-01

    Binary decision trees under the Bayesian decision technique are used for supervised classification of high-dimensional data. We present the great potential of adaptive kernel density estimation as the nested separation method of the supervised binary divergence decision tree. We also provide a proof of an alternative computational approach for kernel estimates utilizing the Fourier transform. Further, we apply our method to a Monte Carlo data set from the DØ experiment at the Tevatron particle accelerator at Fermilab and provide final top-antitop signal separation results. We have achieved up to 82% AUC while using the restricted feature selection entering the signal separation procedure.

  17. Lévy matters VI Lévy-type processes moments, construction and heat kernel estimates

    CERN Document Server

    Kühn, Franziska

    2017-01-01

    Presenting some recent results on the construction and the moments of Lévy-type processes, the focus of this volume is on a new existence theorem, which is proved using a parametrix construction. Applications range from heat kernel estimates for a class of Lévy-type processes to existence and uniqueness theorems for Lévy-driven stochastic differential equations with Hölder continuous coefficients. Moreover, necessary and sufficient conditions for the existence of moments of Lévy-type processes are studied and some estimates on moments are derived. Lévy-type processes behave locally like Lévy processes but, in contrast to Lévy processes, they are not homogeneous in space. Typical examples are processes with varying index of stability and solutions of Lévy-driven stochastic differential equations. This is the sixth volume in a subseries of the Lecture Notes in Mathematics called Lévy Matters. Each volume describes a number of important topics in the theory or applications of Lévy processes and pays ...

  18. The Photoplethismographic Signal Processed with Nonlinear Time Series Analysis Tools

    International Nuclear Information System (INIS)

    Hernandez Caceres, Jose Luis; Hong, Rolando; Garcia Lanz, Abel; Garcia Dominguez, Luis; Cabannas, Karelia

    2001-01-01

    Finger photoplethysmography (PPG) signals were submitted to nonlinear time series analysis. The applied analytical techniques were: (i) high-degree polynomial fitting for baseline estimation; (ii) FFT analysis for estimating power spectra; (iii) fractal dimension estimation via Higuchi's time-domain method; and (iv) kernel nonparametric estimation for reconstructing noise-free attractors and also for estimating the signal's stochastic components.

  19. Ensemble-based forecasting at Horns Rev: Ensemble conversion and kernel dressing

    DEFF Research Database (Denmark)

    Pinson, Pierre; Madsen, Henrik

    . The obtained ensemble forecasts of wind power are then converted into predictive distributions with an original adaptive kernel dressing method. The shape of the kernels is driven by a mean-variance model, the parameters of which are recursively estimated in order to maximize the overall skill of obtained...

  20. Nonparametric factor analysis of time series

    OpenAIRE

    Rodríguez-Poo, Juan M.; Linton, Oliver Bruce

    1998-01-01

    We introduce a nonparametric smoothing procedure for nonparametric factor analysis of multivariate time series. The asymptotic properties of the proposed procedures are derived. We present an application based on the residuals from the Fair macromodel.

  1. A structural nonparametric reappraisal of the CO2 emissions-income relationship

    NARCIS (Netherlands)

    Azomahou, T.T.; Goedhuys - Degelin, Micheline; Nguyen-Van, P.

    Relying on a structural nonparametric estimation, we show that CO2 emissions clearly increase with income at low income levels. For higher income levels, we observe a decreasing relationship, though not significant. We also find that CO2 emissions monotonically increase with energy use at a

  2. Nonparametric tests for censored data

    CERN Document Server

    Bagdonavicus, Vilijandas; Nikulin, Mikhail

    2013-01-01

    This book concerns testing hypotheses in non-parametric models. Generalizations of many non-parametric tests to the case of censored and truncated data are considered. Most of the test results are proved and real applications are illustrated using examples. Theories and exercises are provided. The incorrect use of many tests in most statistical software is highlighted and discussed.

  3. Determining the multi-scale hedge ratios of stock index futures using the lower partial moments method

    Science.gov (United States)

    Dai, Jun; Zhou, Haigang; Zhao, Shaoquan

    2017-01-01

    This paper considers a multi-scale future hedge strategy that minimizes lower partial moments (LPM). To do this, wavelet analysis is adopted to decompose time series data into different components. Next, different parametric estimation methods with known distributions are applied to calculate the LPM of hedged portfolios, which is the key to determining multi-scale hedge ratios over different time scales. Then these parametric methods are compared with the prevailing nonparametric kernel metric method. Empirical results indicate that in the China Securities Index 300 (CSI 300) index futures and spot markets, hedge ratios and hedge efficiency estimated by the nonparametric kernel metric method are inferior to those estimated by parametric hedging model based on the features of sequence distributions. In addition, if minimum-LPM is selected as a hedge target, the hedging periods, degree of risk aversion, and target returns can affect the multi-scale hedge ratios and hedge efficiency, respectively.

  4. Nonparametric Bayesian Modeling of Complex Networks

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Mørup, Morten

    2013-01-01

    Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using an infinite mixture model as running example, we go through the steps of deriving the model as an infinite limit of a finite parametric model, inferring the model parameters by Markov chain Monte Carlo, and checking the model's fit and predictive performance. We explain how advanced nonparametric models...

  5. Developing an immigration policy for Germany on the basis of a nonparametric labor market classification

    OpenAIRE

    Froelich, Markus; Puhani, Patrick

    2004-01-01

    Based on a nonparametrically estimated model of labor market classifications, this paper makes suggestions for immigration policy using data from western Germany in the 1990s. It is demonstrated that nonparametric regression is feasible in higher dimensions with only a few thousand observations. In sum, labor markets able to absorb immigrants are characterized by above average age and by professional occupations. On the other hand, labor markets for young workers in service occupations are id...

  6. An iterative kernel based method for fourth order nonlinear equation with nonlinear boundary condition

    Science.gov (United States)

    Azarnavid, Babak; Parand, Kourosh; Abbasbandy, Saeid

    2018-06-01

    This article discusses an iterative reproducing kernel method with respect to its effectiveness and capability of solving a fourth-order boundary value problem with nonlinear boundary conditions modeling beams on elastic foundations. Since there is no method of obtaining a reproducing kernel which satisfies nonlinear boundary conditions, the standard reproducing kernel methods cannot be used directly to solve boundary value problems with nonlinear boundary conditions, as there is no knowledge about the existence and uniqueness of the solution. The aim of this paper is, therefore, to construct an iterative method by combining the reproducing kernel Hilbert space method with a shooting-like technique to solve the mentioned problems. Error estimation for reproducing kernel Hilbert space methods for nonlinear boundary value problems has yet to be discussed in the literature. In this paper, we present, probably for the first time, error estimation for the reproducing kernel method applied to nonlinear boundary value problems. Some numerical results are given to demonstrate the applicability of the method.

  7. Modeling reactive transport with particle tracking and kernel estimators

    Science.gov (United States)

    Rahbaralam, Maryam; Fernandez-Garcia, Daniel; Sanchez-Vila, Xavier

    2015-04-01

    Groundwater reactive transport models are useful to assess and quantify the fate and transport of contaminants in subsurface media and are an essential tool for the analysis of coupled physical, chemical, and biological processes in Earth Systems. The Particle Tracking Method (PTM) provides a computationally efficient and adaptable approach to solve the solute transport partial differential equation. On a molecular level, chemical reactions are the result of collisions, combinations, and/or decay of different species. For a well-mixed system, the chemical reactions are controlled by the classical thermodynamic rate coefficient. Each of these actions occurs with some probability that is a function of solute concentrations. PTM is based on considering that each particle actually represents a group of molecules. To properly simulate this system, an infinite number of particles is required, which is computationally unfeasible. On the other hand, a finite number of particles leads to a poorly mixed system which is limited by diffusion. Recent works have used this effect to actually model incomplete mixing in naturally occurring porous media. In this work, we demonstrate that this effect in most cases should be attributed to a deficient estimation of the concentrations and not to the occurrence of true incomplete mixing processes in porous media. To illustrate this, we show that a Kernel Density Estimation (KDE) of the concentrations can approach the well-mixed solution with a limited number of particles. KDEs provide weighting functions of each particle mass that expand its region of influence, hence providing a wider region for chemical reactions with time. Simulation results show that KDEs are powerful tools to improve state-of-the-art simulations of chemical reactions and indicate that incomplete mixing in diluted systems should be modeled based on alternative conceptual models and not on a limited number of particles.
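
    The idea of spreading each particle's mass with a kernel can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the one-dimensional domain, and the names `kde_concentration`, `positions`, and `masses` are all assumptions made for the example.

```python
import math
import random

def kde_concentration(x, positions, masses, bandwidth):
    """Gaussian-kernel estimate c(x) = sum_i m_i * K_h(x - x_i)."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    return sum(
        m * math.exp(-0.5 * ((x - p) / bandwidth) ** 2) / norm
        for p, m in zip(positions, masses)
    )

random.seed(0)
# Hypothetical 1-D particle cloud: 200 particles carrying equal mass.
positions = [random.gauss(0.0, 1.0) for _ in range(200)]
masses = [1.0 / len(positions)] * len(positions)  # total mass = 1

# Evaluate the smoothed concentration on a regular grid.
dx = 0.02
grid = [-8.0 + i * dx for i in range(801)]
conc = [kde_concentration(x, positions, masses, bandwidth=0.4) for x in grid]

# Riemann sum of c(x); smoothing should conserve the total particle mass.
total_mass = sum(conc) * dx
```

    Unlike counting particles per grid cell, every particle here contributes to the concentration at nearby locations, which is the "region of influence" the abstract refers to. How the bandwidth should be chosen or evolve in time is not specified in the abstract and is left fixed in this sketch.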

  8. Approximate kernel competitive learning.

    Science.gov (United States)

    Wu, Jian-Sheng; Zheng, Wei-Shi; Lai, Jian-Huang

    2015-03-01

    Kernel competitive learning has been successfully used to achieve robust clustering. However, kernel competitive learning (KCL) is not scalable for large-scale data processing, because (1) it has to calculate and store the full kernel matrix, which is too large to be calculated and kept in memory, and (2) it cannot be computed in parallel. In this paper we develop a framework of approximate kernel competitive learning for processing large-scale datasets. The proposed framework consists of two parts. First, it derives an approximate kernel competitive learning (AKCL), which learns kernel competitive learning in a subspace via sampling. We provide solid theoretical analysis on why the proposed approximation modelling would work for kernel competitive learning, and furthermore, we show that the computational complexity of AKCL is largely reduced. Second, we propose a pseudo-parallelled approximate kernel competitive learning (PAKCL) based on a set-based kernel competitive learning strategy, which overcomes the obstacle of using parallel programming in kernel competitive learning and significantly accelerates the approximate kernel competitive learning for large-scale clustering. The empirical evaluation on publicly available datasets shows that the proposed AKCL and PAKCL can perform comparably to KCL, with a large reduction in computational cost. Also, the proposed methods achieve more effective clustering performance in terms of clustering precision than related approximate clustering approaches. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Assessing pupil and school performance by non-parametric and parametric techniques

    NARCIS (Netherlands)

    de Witte, K.; Thanassoulis, E.; Simpson, G.; Battisti, G.; Charlesworth-May, A.

    2010-01-01

    This paper discusses the use of the non-parametric free disposal hull (FDH) and the parametric multi-level model (MLM) as alternative methods for measuring pupil and school attainment where hierarchical structured data are available. Using robust FDH estimates, we show how to decompose the overall

  10. Supremum Norm Posterior Contraction and Credible Sets for Nonparametric Multivariate Regression

    NARCIS (Netherlands)

    Yoo, W.W.; Ghosal, S

    2016-01-01

    In the setting of nonparametric multivariate regression with unknown error variance, we study asymptotic properties of a Bayesian method for estimating a regression function f and its mixed partial derivatives. We use a random series of tensor product of B-splines with normal basis coefficients as a

  11. Bayesian Non-Parametric Mixtures of GARCH(1,1) Models

    Directory of Open Access Journals (Sweden)

    John W. Lau

    2012-01-01

    Full Text Available Traditional GARCH models describe volatility levels that evolve smoothly over time, generated by a single GARCH regime. However, nonstationary time series data may exhibit abrupt changes in volatility, suggesting changes in the underlying GARCH regimes. Further, the number and times of regime changes are not always obvious. This article outlines a nonparametric mixture of GARCH models that is able to estimate the number and time of volatility regime changes by mixing over the Poisson-Kingman process. The process, a generalisation of the Dirichlet process typically used in nonparametric models for time-dependent data, provides a richer clustering structure, and its application to time series data is novel. Inference is Bayesian, and a Markov chain Monte Carlo algorithm to explore the posterior distribution is described. The methodology is illustrated on the Standard and Poor's 500 financial index.

  12. Single versus mixture Weibull distributions for nonparametric satellite reliability

    International Nuclear Information System (INIS)

    Castet, Jean-Francois; Saleh, Joseph H.

    2010-01-01

    Long recognized as a critical design attribute for space systems, satellite reliability has not yet received the proper attention as limited on-orbit failure data and statistical analyses can be found in the technical literature. To fill this gap, we recently conducted a nonparametric analysis of satellite reliability for 1584 Earth-orbiting satellites launched between January 1990 and October 2008. In this paper, we provide an advanced parametric fit, based on mixture of Weibull distributions, and compare it with the single Weibull distribution model obtained with the Maximum Likelihood Estimation (MLE) method. We demonstrate that both parametric fits are good approximations of the nonparametric satellite reliability, but that the mixture Weibull distribution provides significant accuracy in capturing all the failure trends in the failure data, as evidenced by the analysis of the residuals and their quasi-normal dispersion.

  13. Omnibus risk assessment via accelerated failure time kernel machine modeling.

    Science.gov (United States)

    Sinnott, Jennifer A; Cai, Tianxi

    2013-12-01

    Integrating genomic information with traditional clinical risk factors to improve the prediction of disease outcomes could profoundly change the practice of medicine. However, the large number of potential markers and possible complexity of the relationship between markers and disease make it difficult to construct accurate risk prediction models. Standard approaches for identifying important markers often rely on marginal associations or linearity assumptions and may not capture non-linear or interactive effects. In recent years, much work has been done to group genes into pathways and networks. Integrating such biological knowledge into statistical learning could potentially improve model interpretability and reliability. One effective approach is to employ a kernel machine (KM) framework, which can capture nonlinear effects if nonlinear kernels are used (Scholkopf and Smola, 2002; Liu et al., 2007, 2008). For survival outcomes, KM regression modeling and testing procedures have been derived under a proportional hazards (PH) assumption (Li and Luan, 2003; Cai, Tonini, and Lin, 2011). In this article, we derive testing and prediction methods for KM regression under the accelerated failure time (AFT) model, a useful alternative to the PH model. We approximate the null distribution of our test statistic using resampling procedures. When multiple kernels are of potential interest, it may be unclear in advance which kernel to use for testing and estimation. We propose a robust Omnibus Test that combines information across kernels, and an approach for selecting the best kernel for estimation. The methods are illustrated with an application in breast cancer. © 2013, The International Biometric Society.

  14. Classification With Truncated Distance Kernel.

    Science.gov (United States)

    Huang, Xiaolin; Suykens, Johan A K; Wang, Shuning; Hornegger, Joachim; Maier, Andreas

    2018-05-01

    This brief proposes a truncated distance (TL1) kernel, which results in a classifier that is nonlinear in the global region but is linear in each subregion. With this kernel, the subregion structure can be trained using all the training data and local linear classifiers can be established simultaneously. The TL1 kernel has good adaptiveness to nonlinearity and is suitable for problems which require different nonlinearities in different areas. Though the TL1 kernel is not positive semidefinite, some classical kernel learning methods are still applicable, which means that the TL1 kernel can be directly used in standard toolboxes by replacing the kernel evaluation. In numerical experiments, the TL1 kernel with a pregiven parameter achieves similar or better performance than the radial basis function kernel with the parameter tuned by cross validation, implying that the TL1 kernel is a promising nonlinear kernel for classification tasks.
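
    The abstract does not spell out the kernel formula. The sketch below assumes the form usually given for a truncated L1-distance kernel, k(x, y) = max(rho - ||x - y||_1, 0); the parameter name `rho`, the helper `gram_matrix`, and the sample points are illustrative assumptions, not taken from the record.

```python
def tl1_kernel(x, y, rho):
    """Truncated L1-distance kernel (assumed form): max(rho - ||x - y||_1, 0)."""
    l1 = sum(abs(a - b) for a, b in zip(x, y))
    return max(rho - l1, 0.0)

def gram_matrix(points, rho):
    """Pairwise kernel evaluations, the object a kernel method would consume."""
    return [[tl1_kernel(p, q, rho) for q in points] for p in points]

# Hypothetical 2-D points: two close together, one far away.
points = [(0.0, 0.0), (0.5, 0.25), (3.0, 3.0)]
K = gram_matrix(points, rho=2.0)
# K[0][1] is positive (points within L1 distance rho);
# K[0][2] is zero (distance 6 exceeds rho, so the kernel is truncated).
```

    Under this assumed form, the kernel is piecewise linear in its arguments and vanishes beyond distance rho, which is consistent with the "linear in each subregion" behaviour described above. Because the kernel is not positive semidefinite, a solver that accepts a precomputed Gram matrix may or may not converge to a global optimum; that behaviour depends on the toolbox.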

  15. Approximation of the breast height diameter distribution of two-cohort stands by mixture models III Kernel density estimators vs mixture models

    Science.gov (United States)

    Rafal Podlaski; Francis A. Roesch

    2014-01-01

    Two-component mixtures of either the Weibull distribution or the gamma distribution and the kernel density estimator were used for describing the diameter at breast height (dbh) empirical distributions of two-cohort stands. The data consisted of study plots from the Świętokrzyski National Park (central Poland) and areas close to and including the North Carolina section...

  16. A nonparametric approach to medical survival data: Uncertainty in the context of risk in mortality analysis

    International Nuclear Information System (INIS)

    Janurová, Kateřina; Briš, Radim

    2014-01-01

    Right-censored medical survival data from about 850 patients are evaluated to analyze the uncertainty related to the risk of mortality and to compare two basic surgery techniques in the context of that risk. The colorectal data come from patients who underwent colectomy at the University Hospital of Ostrava. Two basic operating techniques are used for the colectomy: either traditional (open) or minimally invasive (laparoscopic). The basic question arising for a colectomy is which type of operation to choose to guarantee longer overall survival time. Two non-parametric approaches have been used to quantify the probability of mortality with uncertainties. In fact, the complement of this probability to one, i.e. the survival function with corresponding confidence levels, is calculated and evaluated. The first approach considers standard nonparametric estimators: the Kaplan–Meier estimator of the survival function in connection with Greenwood's formula, and the Nelson–Aalen estimator of the cumulative hazard function, including a confidence interval for the survival function. The second, innovative approach, represented by Nonparametric Predictive Inference (NPI), uses lower and upper probabilities for quantifying uncertainty and provides a model of the predictive survival function instead of the population survival function. The traditional log-rank test on the one hand and the nonparametric predictive comparison of two groups of lifetime data on the other have been compared to evaluate the risk of mortality in the context of the mentioned surgery techniques. The size of the difference between the two groups of lifetime data has been considered and analyzed as well. Both nonparametric approaches led to the same conclusion, that the minimally invasive operating technique guarantees the patient significantly longer survival time in comparison with the traditional operating technique.
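
    For readers unfamiliar with the first approach, the Kaplan–Meier estimator with Greenwood's variance formula can be sketched as below. This is a generic textbook version, not the study's code; the function name `kaplan_meier` and the seven follow-up records are made up for illustration.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t), se)] at each distinct time with an observed death.

    times  : follow-up time of each patient
    events : 1 if the death was observed, 0 if the record is right-censored
    se     : Greenwood standard error, S(t) * sqrt(sum d_j / (n_j * (n_j - d_j)))
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, greenwood_sum, curve = 1.0, 0.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        removed = sum(1 for tt, _ in data if tt == t)  # deaths + censored at t
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            # Greenwood's term is undefined when everyone at risk dies;
            # S(t) is 0 there, so the standard error is reported as 0.
            if n_at_risk > deaths:
                greenwood_sum += deaths / (n_at_risk * (n_at_risk - deaths))
            curve.append((t, surv, surv * math.sqrt(greenwood_sum)))
        n_at_risk -= removed
        i += removed
    return curve

# Hypothetical follow-up times with censoring flags (not data from the study).
times = [2, 3, 3, 5, 8, 8, 9]
events = [1, 1, 0, 1, 0, 1, 1]
curve = kaplan_meier(times, events)
```

    Censored patients leave the risk set without lowering the survival estimate, which is how right-censoring enters the calculation. An approximate pointwise confidence interval can then be formed from each S(t) and its standard error.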

  17. A Comparison of Kernel Equating and Traditional Equipercentile Equating Methods and the Parametric Bootstrap Methods for Estimating Standard Errors in Equipercentile Equating

    Science.gov (United States)

    Choi, Sae Il

    2009-01-01

    This study used simulation (a) to compare the kernel equating method to traditional equipercentile equating methods under the equivalent-groups (EG) design and the nonequivalent-groups with anchor test (NEAT) design and (b) to apply the parametric bootstrap method for estimating standard errors of equating. A two-parameter logistic item response…

  18. Exact Heat Kernel on a Hypersphere and Its Applications in Kernel SVM

    Directory of Open Access Journals (Sweden)

    Chenchao Zhao

    2018-01-01

    Full Text Available Many contemporary statistical learning methods assume a Euclidean feature space. This paper presents a method for defining similarity based on hyperspherical geometry and shows that it often improves the performance of support vector machine compared to other competing similarity measures. Specifically, the idea of using heat diffusion on a hypersphere to measure similarity has been previously proposed and tested by Lafferty and Lebanon [1], demonstrating promising results based on a heuristic heat kernel obtained from the zeroth order parametrix expansion; however, how well this heuristic kernel agrees with the exact hyperspherical heat kernel remains unknown. This paper presents a higher order parametrix expansion of the heat kernel on a unit hypersphere and discusses several problems associated with this expansion method. We then compare the heuristic kernel with an exact form of the heat kernel expressed in terms of a uniformly and absolutely convergent series in high-dimensional angular momentum eigenmodes. Being a natural measure of similarity between sample points dwelling on a hypersphere, the exact kernel often shows superior performance in kernel SVM classifications applied to text mining, tumor somatic mutation imputation, and stock market analysis.

  19. Copula Based Factorization in Bayesian Multivariate Infinite Mixture Models

    OpenAIRE

    Martin Burda; Artem Prokhorov

    2012-01-01

    Bayesian nonparametric models based on infinite mixtures of density kernels have been recently gaining in popularity due to their flexibility and feasibility of implementation even in complicated modeling scenarios. In economics, they have been particularly useful in estimating nonparametric distributions of latent variables. However, these models have been rarely applied in more than one dimension. Indeed, the multivariate case suffers from the curse of dimensionality, with a rapidly increas...

  20. Decision support using nonparametric statistics

    CERN Document Server

    Beatty, Warren

    2018-01-01

    This concise volume covers nonparametric statistics topics that are most likely to be seen and used from a practical decision support perspective. While many degree programs require a course in parametric statistics, these methods are often inadequate for real-world decision making in business environments. Much of the data collected today by business executives (for example, customer satisfaction opinions) requires nonparametric statistics for valid analysis, and this book provides the reader with a set of tools that can be used to validly analyze all data, regardless of type. Through numerous examples and exercises, this book explains why nonparametric statistics will lead to better decisions and how they are used to reach a decision, with a wide array of business applications. Online resources include exercise data, spreadsheets, and solutions.

  1. A new discrete dipole kernel for quantitative susceptibility mapping.

    Science.gov (United States)

    Milovic, Carlos; Acosta-Cabronero, Julio; Pinto, José Miguel; Mattern, Hendrik; Andia, Marcelo; Uribe, Sergio; Tejos, Cristian

    2018-09-01

    Most approaches for quantitative susceptibility mapping (QSM) are based on a forward model approximation that employs a continuous Fourier transform operator to solve a differential equation system. Such a formulation, however, is prone to high-frequency aliasing. The aim of this study was to reduce such errors using an alternative dipole kernel formulation based on the discrete Fourier transform and discrete operators. The impact of such an approach on forward model calculation and susceptibility inversion was evaluated in contrast to the continuous formulation both with synthetic phantoms and in vivo MRI data. The discrete kernel demonstrated systematically better fits to analytic field solutions, and showed less over-oscillations and aliasing artifacts while preserving low- and medium-frequency responses relative to those obtained with the continuous kernel. In the context of QSM estimation, the use of the proposed discrete kernel resulted in error reduction and increased sharpness. This proof-of-concept study demonstrated that discretizing the dipole kernel is advantageous for QSM. The impact on small or narrow structures such as the venous vasculature might be particularly relevant to high-resolution QSM applications with ultra-high field MRI - a topic for future investigations. The proposed dipole kernel can be implemented straightforwardly in existing QSM routines. Copyright © 2018 Elsevier Inc. All rights reserved.

  2. A shortest-path graph kernel for estimating gene product semantic similarity

    Directory of Open Access Journals (Sweden)

    Alvarez Marco A

    2011-07-01

    Full Text Available Background: Existing methods for calculating semantic similarity between gene products using the Gene Ontology (GO) often rely on external resources, which are not part of the ontology. Consequently, changes in these external resources, like biased term distribution caused by shifting of hot research topics, will affect the calculation of semantic similarity. One way to avoid this problem is to use semantic methods that are "intrinsic" to the ontology, i.e. independent of external knowledge. Results: We present a shortest-path graph kernel (spgk) method that relies exclusively on the GO and its structure. In spgk, a gene product is represented by an induced subgraph of the GO, which consists of all the GO terms annotating it. Then a shortest-path graph kernel is used to compute the similarity between two graphs. In a comprehensive evaluation using a benchmark dataset, spgk compares favorably with other methods that depend on external resources. Compared with simUI, a method that is also intrinsic to GO, spgk achieves slightly better results on the benchmark dataset. Statistical tests show that the improvement is significant when the resolution and EC similarity correlation coefficient are used to measure the performance, but is insignificant when the Pfam similarity correlation coefficient is used. Conclusions: Spgk uses a graph kernel method in polynomial time to exploit the structure of the GO to calculate semantic similarity between gene products. It provides an alternative to both methods that use external resources and "intrinsic" methods with comparable performance.

  3. Using multinomial and imprecise probability for non-parametric modelling of rainfall in Manizales (Colombia)

    Directory of Open Access Journals (Sweden)

    Ibsen Chivatá Cárdenas

    2008-05-01

    Full Text Available This article presents a rainfall model constructed by applying non-parametric modelling and imprecise probabilities; these tools were used because there was not enough homogeneous information in the study area. The area’s hydrological information regarding rainfall was scarce and existing hydrological time series were not uniform. A distributed extended rainfall model was constructed from so-called probability boxes (p-boxes), multinomial probability distribution and confidence intervals (a friendly algorithm was constructed for non-parametric modelling by combining the last two tools). This model confirmed the high level of uncertainty involved in local rainfall modelling. Uncertainty encompassed the whole range (domain) of probability values, thereby showing the severe limitations on information, leading to the conclusion that a detailed estimation of probability would lead to significant error. Nevertheless, relevant information was extracted; it was estimated that the maximum daily rainfall threshold (70 mm) would be surpassed at least once every three years, and the magnitude of uncertainty affecting hydrological parameter estimation was also estimated. This paper’s conclusions may be of interest to non-parametric modellers and decision-makers, as such modelling and imprecise probability represent an alternative for hydrological variable assessment and may be an obligatory procedure in the future. Its potential lies in treating scarce information, and it represents a robust modelling strategy for non-seasonal stochastic modelling conditions.

  4. Experimental Sentinel-2 LAI estimation using parametric, non-parametric and physical retrieval methods - A comparison

    NARCIS (Netherlands)

    Verrelst, Jochem; Rivera, Juan Pablo; Veroustraete, Frank; Muñoz-Marí, Jordi; Clevers, J.G.P.W.; Camps-Valls, Gustau; Moreno, José

    2015-01-01

    Given the forthcoming availability of Sentinel-2 (S2) images, this paper provides a systematic comparison of retrieval accuracy and processing speed of a multitude of parametric, non-parametric and physically-based retrieval methods using simulated S2 data. An experimental field dataset (SPARC),

  5. Theory of nonparametric tests

    CERN Document Server

    Dickhaus, Thorsten

    2018-01-01

    This textbook provides a self-contained presentation of the main concepts and methods of nonparametric statistical testing, with a particular focus on the theoretical foundations of goodness-of-fit tests, rank tests, resampling tests, and projection tests. The substitution principle is employed as a unified approach to the nonparametric test problems discussed. In addition to mathematical theory, it also includes numerous examples and computer implementations. The book is intended for advanced undergraduate, graduate, and postdoc students as well as young researchers. Readers should be familiar with the basic concepts of mathematical statistics typically covered in introductory statistics courses.

  6. Coupling individual kernel-filling processes with source-sink interactions into GREENLAB-Maize.

    Science.gov (United States)

    Ma, Yuntao; Chen, Youjia; Zhu, Jinyu; Meng, Lei; Guo, Yan; Li, Baoguo; Hoogenboom, Gerrit

    2018-02-13

    Failure to account for the variation of kernel growth in a cereal crop simulation model may cause serious deviations in the estimates of crop yield. The goal of this research was to revise the GREENLAB-Maize model to incorporate source- and sink-limited allocation approaches to simulate the dry matter accumulation of individual kernels of an ear (GREENLAB-Maize-Kernel). The model used potential individual kernel growth rates to characterize the individual potential sink demand. The remobilization of non-structural carbohydrates from reserve organs to kernels was also incorporated. Two years of field experiments were conducted to determine the model parameter values and to evaluate the model using two maize hybrids with different plant densities and pollination treatments. Detailed observations were made on the dimensions and dry weights of individual kernels and other above-ground plant organs throughout the seasons. Three basic traits characterizing an individual kernel were compared on simulated and measured individual kernels: (1) final kernel size; (2) kernel growth rate; and (3) duration of kernel filling. Simulations of individual kernel growth closely corresponded to experimental data. The model was able to reproduce the observed dry weight of plant organs well. Then, the source-sink dynamics and the remobilization of carbohydrates for kernel growth were quantified to show that remobilization processes accompanied source-sink dynamics during the kernel-filling process. We conclude that the model may be used to explore options for optimizing plant kernel yield by matching maize management to the environment, taking into account responses at the level of individual kernels. © The Author(s) 2018. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  7. Nonparametric statistics with applications to science and engineering

    CERN Document Server

    Kvam, Paul H

    2007-01-01

    A thorough and definitive book that fully addresses traditional and modern-day topics of nonparametric statistics. This book presents a practical approach to nonparametric statistical analysis and provides comprehensive coverage of both established and newly developed methods. With the use of MATLAB, the authors present information on theorems and rank tests in an applied fashion, with an emphasis on modern methods in regression and curve fitting, bootstrap confidence intervals, splines, wavelets, empirical likelihood, and goodness-of-fit testing. Nonparametric Statistics with Applications to Science and Engineering begins with succinct coverage of basic results for order statistics, methods of categorical data analysis, nonparametric regression, and curve fitting methods. The authors then focus on nonparametric procedures that are becoming more relevant to engineering researchers and practitioners. The important fundamental materials needed to effectively learn and apply the discussed methods are also provide...

  8. Phylodynamic Inference with Kernel ABC and Its Application to HIV Epidemiology.

    Science.gov (United States)

    Poon, Art F Y

    2015-09-01

    The shapes of phylogenetic trees relating virus populations are determined by the adaptation of viruses within each host, and by the transmission of viruses among hosts. Phylodynamic inference attempts to reverse this flow of information, estimating parameters of these processes from the shape of a virus phylogeny reconstructed from a sample of genetic sequences from the epidemic. A key challenge to phylodynamic inference is quantifying the similarity between two trees in an efficient and comprehensive way. In this study, I demonstrate that a new distance measure, based on a subset tree kernel function from computational linguistics, confers a significant improvement over previous measures of tree shape for classifying trees generated under different epidemiological scenarios. Next, I incorporate this kernel-based distance measure into an approximate Bayesian computation (ABC) framework for phylodynamic inference. ABC bypasses the need for an analytical solution of model likelihood, as it only requires the ability to simulate data from the model. I validate this "kernel-ABC" method for phylodynamic inference by estimating parameters from data simulated under a simple epidemiological model. Results indicate that kernel-ABC attained greater accuracy for parameters associated with virus transmission than leading software on the same data sets. Finally, I apply the kernel-ABC framework to study a recent outbreak of a recombinant HIV subtype in China. Kernel-ABC provides a versatile framework for phylodynamic inference because it can fit a broader range of models than methods that rely on the computation of exact likelihoods. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  9. Kernel abortion in maize. II. Distribution of 14C among kernel carbohydrates

    International Nuclear Information System (INIS)

    Hanft, J.M.; Jones, R.J.

    1986-01-01

    This study was designed to compare the uptake and distribution of 14C among fructose, glucose, sucrose, and starch in the cob, pedicel, and endosperm tissues of maize (Zea mays L.) kernels induced to abort by high temperature with those that develop normally. Kernels cultured in vitro at 30 and 35 °C were transferred to [14C]sucrose media 10 days after pollination. Kernels cultured at 35 °C aborted prior to the onset of linear dry matter accumulation. Significant uptake into the cob, pedicel, and endosperm of radioactivity associated with the soluble and starch fractions of the tissues was detected after 24 hours in culture on labeled media. After 8 days in culture on [14C]sucrose media, 48 and 40% of the radioactivity associated with the cob carbohydrates was found in the reducing sugars at 30 and 35 °C, respectively. Of the total carbohydrates, a higher percentage of label was associated with sucrose and a lower percentage with fructose and glucose in pedicel tissue of kernels cultured at 35 °C compared to kernels cultured at 30 °C. These results indicate that sucrose was not cleaved to fructose and glucose as rapidly during the unloading process in the pedicel of kernels induced to abort by high temperature. Kernels cultured at 35 °C had a much lower proportion of label associated with endosperm starch (29%) than did kernels cultured at 30 °C (89%). Kernels cultured at 35 °C had a correspondingly higher proportion of 14C in endosperm fructose, glucose, and sucrose.

  10. Estimating technical efficiency in the hospital sector with panel data: a comparison of parametric and non-parametric techniques.

    Science.gov (United States)

    Siciliani, Luigi

    2006-01-01

    Policy makers are increasingly interested in developing performance indicators that measure hospital efficiency. These indicators may give the purchasers of health services an additional regulatory tool to contain health expenditure. Using panel data, this study compares different parametric (econometric) and non-parametric (linear programming) techniques for the measurement of a hospital's technical efficiency. This comparison was made using a sample of 17 Italian hospitals in the years 1996-9. Highest correlations are found in the efficiency scores between the non-parametric data envelopment analysis under the constant returns to scale assumption (DEA-CRS) and several parametric models. Correlation reduces markedly when using more flexible non-parametric specifications such as data envelopment analysis under the variable returns to scale assumption (DEA-VRS) and the free disposal hull (FDH) model. Correlation also generally reduces when moving from one output to two-output specifications. This analysis suggests that there is scope for developing performance indicators at hospital level using panel data, but it is important that extensive sensitivity analysis is carried out if purchasers wish to make use of these indicators in practice.

  11. Dose-response curve estimation: a semiparametric mixture approach.

    Science.gov (United States)

    Yuan, Ying; Yin, Guosheng

    2011-12-01

    In the estimation of a dose-response curve, parametric models are straightforward and efficient but subject to model misspecifications; nonparametric methods are robust but less efficient. As a compromise, we propose a semiparametric approach that combines the advantages of parametric and nonparametric curve estimates. In a mixture form, our estimator takes a weighted average of the parametric and nonparametric curve estimates, in which a higher weight is assigned to the estimate with a better model fit. When the parametric model assumption holds, the semiparametric curve estimate converges to the parametric estimate and thus achieves high efficiency; when the parametric model is misspecified, the semiparametric estimate converges to the nonparametric estimate and remains consistent. We also consider an adaptive weighting scheme to allow the weight to vary according to the local fit of the models. We conduct extensive simulation studies to investigate the performance of the proposed methods and illustrate them with two real examples. © 2011, The International Biometric Society.

  12. A non-parametric Bayesian approach to decompounding from high frequency data

    NARCIS (Netherlands)

    Gugushvili, Shota; van der Meulen, F.H.; Spreij, Peter

    2016-01-01

    Given a sample from a discretely observed compound Poisson process, we consider non-parametric estimation of the density f0 of its jump sizes, as well as of its intensity λ0. We take a Bayesian approach to the problem and specify the prior on f0 as the Dirichlet location mixture of normal densities.

  13. Mixed kernel function support vector regression for global sensitivity analysis

    Science.gov (United States)

    Cheng, Kai; Lu, Zhenzhou; Wei, Yuhao; Shi, Yan; Zhou, Yicheng

    2017-11-01

    Global sensitivity analysis (GSA) plays an important role in exploring the respective effects of input variables on an assigned output response. Amongst the wide sensitivity analyses in literature, the Sobol indices have attracted much attention since they can provide accurate information for most models. In this paper, a mixed kernel function (MKF) based support vector regression (SVR) model is employed to evaluate the Sobol indices at low computational cost. By the proposed derivation, the estimation of the Sobol indices can be obtained by post-processing the coefficients of the SVR meta-model. The MKF is constituted by the orthogonal polynomials kernel function and Gaussian radial basis kernel function, thus the MKF possesses both the global characteristic advantage of the polynomials kernel function and the local characteristic advantage of the Gaussian radial basis kernel function. The proposed approach is suitable for high-dimensional and non-linear problems. Performance of the proposed approach is validated by various analytical functions and compared with the popular polynomial chaos expansion (PCE). Results demonstrate that the proposed approach is an efficient method for global sensitivity analysis.

  14. Feature Selection and Kernel Learning for Local Learning-Based Clustering.

    Science.gov (United States)

    Zeng, Hong; Cheung, Yiu-ming

    2011-08-01

    The performance of the most clustering algorithms highly relies on the representation of data in the input space or the Hilbert space of kernel methods. This paper is to obtain an appropriate data representation through feature selection or kernel learning within the framework of the Local Learning-Based Clustering (LLC) (Wu and Schölkopf 2006) method, which can outperform the global learning-based ones when dealing with the high-dimensional data lying on manifold. Specifically, we associate a weight to each feature or kernel and incorporate it into the built-in regularization of the LLC algorithm to take into account the relevance of each feature or kernel for the clustering. Accordingly, the weights are estimated iteratively in the clustering process. We show that the resulting weighted regularization with an additional constraint on the weights is equivalent to a known sparse-promoting penalty. Hence, the weights of those irrelevant features or kernels can be shrunk toward zero. Extensive experiments show the efficacy of the proposed methods on the benchmark data sets.

  15. Bayesian nonparametric data analysis

    CERN Document Server

    Müller, Peter; Jara, Alejandro; Hanson, Tim

    2015-01-01

    This book reviews nonparametric Bayesian methods and models that have proven useful in the context of data analysis. Rather than providing an encyclopedic review of probability models, the book’s structure follows a data analysis perspective. As such, the chapters are organized by traditional data analysis problems. In selecting specific nonparametric models, simpler and more traditional models are favored over specialized ones. The discussed methods are illustrated with a wealth of examples, including applications ranging from stylized examples to case studies from recent literature. The book also includes an extensive discussion of computational methods and details on their implementation. R code for many examples is included in on-line software pages.

  16. Estimation of the limit of detection with a bootstrap-derived standard error by a partly non-parametric approach. Application to HPLC drug assays

    DEFF Research Database (Denmark)

    Linnet, Kristian

    2005-01-01

    Bootstrap, HPLC, limit of blank, limit of detection, non-parametric statistics, type I and II errors

  17. Nonparametric additive regression for repeatedly measured data

    KAUST Repository

    Carroll, R. J.

    2009-05-20

    We develop an easily computed smooth backfitting algorithm for additive model fitting in repeated measures problems. Our methodology easily copes with various settings, such as when some covariates are the same over repeated response measurements. We allow for a working covariance matrix for the regression errors, showing that our method is most efficient when the correct covariance matrix is used. The component functions achieve the known asymptotic variance lower bound for the scalar argument case. Smooth backfitting also leads directly to design-independent biases in the local linear case. Simulations show our estimator has smaller variance than the usual kernel estimator. This is also illustrated by an example from nutritional epidemiology. © 2009 Biometrika Trust.

  18. Output-Only Modal Parameter Recursive Estimation of Time-Varying Structures via a Kernel Ridge Regression FS-TARMA Approach

    Directory of Open Access Journals (Sweden)

    Zhi-Sai Ma

    2017-01-01

    Full Text Available Modal parameter estimation plays an important role in vibration-based damage detection and is worth more attention and investigation, as changes in modal parameters are commonly used as damage indicators. This paper focuses on the problem of output-only modal parameter recursive estimation of time-varying structures based upon parameterized representations of the time-dependent autoregressive moving average (TARMA). A kernel ridge regression functional series TARMA (FS-TARMA) recursive identification scheme is proposed and subsequently employed for the modal parameter estimation of a numerical three-degree-of-freedom time-varying structural system and a laboratory time-varying structure consisting of a simply supported beam and a moving mass sliding on it. The proposed method is comparatively assessed against an existing recursive pseudolinear regression FS-TARMA approach via Monte Carlo experiments and shown to be capable of accurately tracking the time-varying dynamics in a recursive manner.

  19. Surface and top-of-atmosphere radiative feedback kernels for CESM-CAM5

    Science.gov (United States)

    Pendergrass, Angeline G.; Conley, Andrew; Vitt, Francis M.

    2018-02-01

    Radiative kernels at the top of the atmosphere are useful for decomposing changes in atmospheric radiative fluxes due to feedbacks from atmosphere and surface temperature, water vapor, and surface albedo. Here we describe and validate radiative kernels calculated with the large-ensemble version of CAM5, CESM1.1.2, at the top of the atmosphere and the surface. Estimates of the radiative forcing from greenhouse gases and aerosols in RCP8.5 in the CESM large-ensemble simulations are also diagnosed. As an application, feedbacks are calculated for the CESM large ensemble. The kernels are freely available at https://doi.org/10.5065/D6F47MT6, and accompanying software can be downloaded from https://github.com/apendergrass/cam5-kernels.

  20. NONPARAMETRIC IDENTIFICATION FOR NONLINEAR AUTOREGRESSIVE TIME SERIES MODELS: CONVERGENCE RATES

    Institute of Scientific and Technical Information of China (English)

    LUZUDI; CHENGPING

    1999-01-01

    In this paper the optimal convergence rates of estimators based on the kernel approach for the nonlinear AR model are investigated in the sense of Stone[17'1a]. By combining the mixing property of the stationary solution with the characteristics of the model itself, the restrictive conditions in the literature, which are not easily satisfied by the nonlinear AR model, are removed, and mild conditions are obtained to guarantee the optimal rates of the estimator of the autoregression function. In addition, a strongly consistent estimator of the variance of the white noise is also constructed.

  1. Kendall-Theil Robust Line (KTRLine--version 1.0)-A Visual Basic Program for Calculating and Graphing Robust Nonparametric Estimates of Linear-Regression Coefficients Between Two Continuous Variables

    Science.gov (United States)

    Granato, Gregory E.

    2006-01-01

    The Kendall-Theil Robust Line software (KTRLine-version 1.0) is a Visual Basic program that may be used with the Microsoft Windows operating system to calculate parameters for robust, nonparametric estimates of linear-regression coefficients between two continuous variables. The KTRLine software was developed by the U.S. Geological Survey, in cooperation with the Federal Highway Administration, for use in stochastic data modeling with local, regional, and national hydrologic data sets to develop planning-level estimates of potential effects of highway runoff on the quality of receiving waters. The Kendall-Theil robust line was selected because this robust nonparametric method is resistant to the effects of outliers and nonnormality in residuals that commonly characterize hydrologic data sets. The slope of the line is calculated as the median of all possible pairwise slopes between points. The intercept is calculated so that the line will run through the median of input data. A single-line model or a multisegment model may be specified. The program was developed to provide regression equations with an error component for stochastic data generation because nonparametric multisegment regression tools are not available with the software that is commonly used to develop regression models. The Kendall-Theil robust line is a median line and, therefore, may underestimate total mass, volume, or loads unless the error component or a bias correction factor is incorporated into the estimate. Regression statistics such as the median error, the median absolute deviation, the prediction error sum of squares, the root mean square error, the confidence interval for the slope, and the bias correction factor for median estimates are calculated by use of nonparametric methods. These statistics, however, may be used to formulate estimates of mass, volume, or total loads. 
The program is used to read a two- or three-column tab-delimited input file with variable names in the first row and
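
    The slope and intercept rules described in this record can be sketched in a few lines of Python. This is only an illustration of the single-line model as the abstract states it (median of pairwise slopes, line through the medians of the data); it is not the KTRLine program, and the function name and the sample data are assumptions made for the example.

```python
import numpy as np

def kendall_theil_line(x, y):
    """Return (slope, intercept) of a Kendall-Theil style robust line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(x), k=1)   # index pairs with i < j
    dx = x[j] - x[i]
    dy = y[j] - y[i]
    keep = dx != 0                         # pairs with tied x have no defined slope
    slope = np.median(dy[keep] / dx[keep]) # median of all pairwise slopes
    # Choose the intercept so the line passes through (median x, median y).
    intercept = np.median(y) - slope * np.median(x)
    return slope, intercept

# Hypothetical data: roughly y = 2x with one gross outlier at the end.
x_demo = [1, 2, 3, 4, 5, 6]
y_demo = [2.1, 3.9, 6.2, 8.0, 9.9, 40.0]
slope_demo, intercept_demo = kendall_theil_line(x_demo, y_demo)
print(slope_demo, intercept_demo)
```

    Because both quantities are medians, a single extreme point moves the fitted line far less than it would move an ordinary least-squares fit.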

  2. A tool for the estimation of the distribution of landslide area in R

    Science.gov (United States)

    Rossi, M.; Cardinali, M.; Fiorucci, F.; Marchesini, I.; Mondini, A. C.; Santangelo, M.; Ghosh, S.; Riguer, D. E. L.; Lahousse, T.; Chang, K. T.; Guzzetti, F.

    2012-04-01

    We have developed a tool in R (the free software environment for statistical computing, http://www.r-project.org/) to estimate the probability density and the frequency density of landslide area. The tool implements parametric and non-parametric approaches to the estimation of the probability density and the frequency density of landslide area, including: (i) Histogram Density Estimation (HDE), (ii) Kernel Density Estimation (KDE), and (iii) Maximum Likelihood Estimation (MLE). The tool is available as a standard Open Geospatial Consortium (OGC) Web Processing Service (WPS), and is accessible through the web using different GIS software clients. We tested the tool to compare Double Pareto and Inverse Gamma models for the probability density of landslide area in different geological, morphological and climatological settings, and to compare landslides shown in inventory maps prepared using different mapping techniques, including (i) field mapping, (ii) visual interpretation of monoscopic and stereoscopic aerial photographs, (iii) visual interpretation of monoscopic and stereoscopic VHR satellite images and (iv) semi-automatic detection and mapping from VHR satellite images. Results show that both models are applicable in different geomorphological settings. In most cases the two models provided very similar results. Non-parametric estimation methods (i.e., HDE and KDE) provided reasonable results for all the tested landslide datasets. For some of the datasets, MLE failed to provide a result, for convergence problems. The two tested models (Double Pareto and Inverse Gamma) resulted in very similar results for large and very large datasets (> 150 samples). Differences in the modeling results were observed for small datasets affected by systematic biases. 
A distinct rollover was observed in all analyzed landslide datasets, except for a few datasets obtained from landslide inventories prepared through field mapping or by semi-automatic mapping from VHR satellite imagery

  3. Introduction to nonparametric statistics for the biological sciences using R

    CERN Document Server

    MacFarland, Thomas W

    2016-01-01

    This book contains a rich set of tools for nonparametric analyses, and the purpose of this supplemental text is to provide guidance to students and professional researchers on how R is used for nonparametric data analysis in the biological sciences: to introduce when nonparametric approaches to data analysis are appropriate; to introduce the leading nonparametric tests commonly used in biostatistics and how R is used to generate appropriate statistics for each test; and to introduce common figures typically associated with nonparametric data analysis and how R is used to generate appropriate figures in support of each data set. The book focuses on how R is used to distinguish between data that could be classified as nonparametric as opposed to data that could be classified as parametric, with both approaches to data classification covered extensively. Following an introductory lesson on nonparametric statistics for the biological sciences, the book is organized into eight self-contained lessons on various analyses a...

  4. A novel adaptive kernel method with kernel centers determined by a support vector regression approach

    NARCIS (Netherlands)

    Sun, L.G.; De Visser, C.C.; Chu, Q.P.; Mulder, J.A.

    2012-01-01

    The optimality of the kernel number and kernel centers plays a significant role in determining the approximation power of nearly all kernel methods. However, the process of choosing optimal kernels is always formulated as a global optimization task, which is hard to accomplish. Recently, an

  5. High throughput nonparametric probability density estimation.

    Science.gov (United States)

    Farmer, Jenny; Jacobs, Donald

    2018-01-01

    In high throughput applications, such as those found in bioinformatics and finance, it is important to determine accurate probability distribution functions despite only minimal information about data characteristics, and without using human subjectivity. Such an automated process for univariate data is implemented to achieve this goal by merging the maximum entropy method with single order statistics and maximum likelihood. The only required properties of the random variables are that they are continuous and that they are, or can be approximated as, independent and identically distributed. A quasi-log-likelihood function based on single order statistics for sampled uniform random data is used to empirically construct a sample size invariant universal scoring function. Then a probability density estimate is determined by iteratively improving trial cumulative distribution functions, where better estimates are quantified by the scoring function that identifies atypical fluctuations. This criterion resists under and over fitting data as an alternative to employing the Bayesian or Akaike information criterion. Multiple estimates for the probability density reflect uncertainties due to statistical fluctuations in random samples. Scaled quantile residual plots are also introduced as an effective diagnostic to visualize the quality of the estimated probability densities. Benchmark tests show that estimates for the probability density function (PDF) converge to the true PDF as sample size increases on particularly difficult test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails, and singularities. These results indicate the method has general applicability for high throughput statistical inference.

  6. Protein Subcellular Localization with Gaussian Kernel Discriminant Analysis and Its Kernel Parameter Selection.

    Science.gov (United States)

    Wang, Shunfang; Nie, Bing; Yue, Kun; Fei, Yu; Li, Wenjia; Xu, Dongshu

    2017-12-15

    Kernel discriminant analysis (KDA) is a dimension reduction and classification algorithm based on nonlinear kernel trick, which can be novelly used to treat high-dimensional and complex biological data before undergoing classification processes such as protein subcellular localization. Kernel parameters make a great impact on the performance of the KDA model. Specifically, for KDA with the popular Gaussian kernel, to select the scale parameter is still a challenging problem. Thus, this paper introduces the KDA method and proposes a new method for Gaussian kernel parameter selection depending on the fact that the differences between reconstruction errors of edge normal samples and those of interior normal samples should be maximized for certain suitable kernel parameters. Experiments with various standard data sets of protein subcellular localization show that the overall accuracy of protein classification prediction with KDA is much higher than that without KDA. Meanwhile, the kernel parameter of KDA has a great impact on the efficiency, and the proposed method can produce an optimum parameter, which makes the new algorithm not only perform as effectively as the traditional ones, but also reduce the computational time and thus improve efficiency.

  7. Application of nonparametric regression methods to study the relationship between NO2 concentrations and local wind direction and speed at background sites.

    Science.gov (United States)

    Donnelly, Aoife; Misstear, Bruce; Broderick, Brian

    2011-02-15

    Background concentrations of nitrogen dioxide (NO2) are not constant but vary temporally and spatially. The current paper presents a powerful tool for the quantification of the effects of wind direction and wind speed on background NO2 concentrations, particularly in cases where monitoring data are limited. In contrast to previous studies which applied similar methods to sites directly affected by local pollution sources, the current study focuses on background sites with the aim of improving methods for predicting background concentrations adopted in air quality modelling studies. The relationship between measured NO2 concentration in air at three such sites in Ireland and locally measured wind direction has been quantified using nonparametric regression methods. The major aim was to analyse a method for quantifying the effects of local wind direction on background levels of NO2 in Ireland. The method was expanded to include wind speed as an added predictor variable. A Gaussian kernel function is used in the analysis and circular statistics employed for the wind direction variable. Wind direction and wind speed were both found to have a statistically significant effect on background levels of NO2 at all three sites. Frequently, environmental impact assessments are based on short term baseline monitoring producing a limited dataset. The presented non-parametric regression methods, in contrast to the frequently used methods such as binning of the data, allow concentrations for missing data pairs to be estimated and distinction between spurious and true peaks in concentrations to be made. The methods were found to provide a realistic estimation of long term concentration variation with wind direction and speed, even for cases where the data set is limited. 
Accurate identification of the actual variation at each location and causative factors could be made, thus supporting the improved definition of background concentrations for use in air quality modelling
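
    A Gaussian-kernel regression on a circular predictor, as mentioned in this record, might look like the following Python sketch. The abstract does not give the estimator, so this assumes a Nadaraya-Watson weighted average with angular differences wrapped to [-180, 180) degrees; the function name, the bandwidth value and the sample readings are hypothetical, and wind speed is left out for brevity.

```python
import numpy as np

def circular_kernel_regression(theta_eval, theta_obs, conc_obs, bandwidth_deg):
    """Kernel-weighted mean concentration as a function of wind direction (degrees)."""
    theta_eval = np.atleast_1d(np.asarray(theta_eval, dtype=float))
    theta_obs = np.asarray(theta_obs, dtype=float)
    conc_obs = np.asarray(conc_obs, dtype=float)
    diff = theta_eval[:, None] - theta_obs[None, :]
    diff = (diff + 180.0) % 360.0 - 180.0          # wrap so 350 deg and 10 deg are 20 deg apart
    w = np.exp(-0.5 * (diff / bandwidth_deg) ** 2)  # Gaussian kernel weights
    return (w @ conc_obs) / w.sum(axis=1)

# Hypothetical observations: wind direction in degrees, concentration in arbitrary units.
directions = [350.0, 10.0, 90.0, 180.0, 270.0]
concentrations = [4.0, 6.0, 12.0, 9.0, 5.0]
curve = circular_kernel_regression(np.arange(0, 360, 30), directions, concentrations, 25.0)
print(curve)
```

    Unlike binning, the weighted average returns a value for directions with no observations, which is the property the abstract highlights for limited datasets.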

  8. Propagation of Uncertainty in Bayesian Kernel Models - Application to Multiple-Step Ahead Forecasting

    DEFF Research Database (Denmark)

    Quinonero, Joaquin; Girard, Agathe; Larsen, Jan

    2003-01-01

    The object of Bayesian modelling is predictive distribution, which, in a forecasting scenario, enables evaluation of forecasted values and their uncertainties. We focus on reliably estimating the predictive mean and variance of forecasted values using Bayesian kernel based models such as the Gaussian process and the relevance vector machine. We derive novel analytic expressions for the predictive mean and variance for Gaussian kernel shapes under the assumption of a Gaussian input distribution in the static case, and of a recursive Gaussian predictive density in iterative forecasting...

  9. DBKGrad: An R Package for Mortality Rates Graduation by Discrete Beta Kernel Techniques

    Directory of Open Access Journals (Sweden)

    Angelo Mazza

    2014-04-01

    Full Text Available We introduce the R package DBKGrad, conceived to facilitate the use of kernel smoothing in graduating mortality rates. The package implements univariate and bivariate adaptive discrete beta kernel estimators. Discrete kernels have been preferred because, in this context, variables such as age, calendar year and duration, are pragmatically considered as discrete and the use of beta kernels is motivated since it reduces boundary bias. Furthermore, when data on exposures to the risk of death are available, the use of adaptive bandwidth, that may be selected by cross-validation, can provide additional benefits. To exemplify the use of the package, an application to Italian mortality rates, for different ages and calendar years, is presented.

  10. 7 CFR 981.7 - Edible kernel.

    Science.gov (United States)

    2010-01-01

    § 981.7 Edible kernel. Edible kernel means a kernel, piece, or particle of almond kernel that is not inedible. [41 FR 26852, June 30, 1976]

  11. Bit Error-Rate Minimizing Detector for Amplify-and-Forward Relaying Systems Using Generalized Gaussian Kernel

    KAUST Repository

    Ahmed, Qasim Zeeshan

    2013-01-01

    In this letter, a new detector is proposed for amplify-and-forward (AF) relaying systems when communicating with the assistance of relays. The major goal of this detector is to improve the bit error rate (BER) performance of the receiver. The probability density function is estimated with the help of the kernel density technique. A generalized Gaussian kernel is proposed. This new kernel provides more flexibility and encompasses Gaussian and uniform kernels as special cases. The optimal window width of the kernel is calculated. Simulation results show that a gain of more than 1 dB can be achieved in terms of BER performance as compared to the minimum mean square error (MMSE) receiver when communicating over Rayleigh fading channels.
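
    To illustrate how one kernel family can contain both the Gaussian and the uniform kernel, the Python sketch below uses the standard generalized Gaussian density with shape parameter beta. The abstract does not give the letter's exact kernel or its optimal window width, so this parameterization, the function names and the sample values are assumptions for illustration only.

```python
import math
import numpy as np

def gg_kernel(u, beta, alpha=1.0):
    """Generalized Gaussian density: beta=2 is Gaussian-shaped, large beta approaches uniform on [-alpha, alpha]."""
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * np.exp(-(np.abs(u) / alpha) ** beta)

def kde(x_eval, samples, h, beta):
    """Kernel density estimate with window width h and a generalized Gaussian kernel."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    samples = np.asarray(samples, dtype=float)
    u = (x_eval[:, None] - samples[None, :]) / h
    return gg_kernel(u, beta).mean(axis=1) / h

# Hypothetical received samples clustered around -1 and +1.
samples_demo = np.array([-1.2, -0.9, -1.1, 0.8, 1.0, 1.3])
grid_demo = np.linspace(-3.0, 3.0, 7)
print(kde(grid_demo, samples_demo, h=0.4, beta=2.0))   # Gaussian-shaped kernel
print(kde(grid_demo, samples_demo, h=0.4, beta=8.0))   # flatter, closer to a box kernel
```

    With alpha equal to the square root of 2 and beta equal to 2, the kernel is exactly the standard normal density; as beta grows, its peak value tends to 1/(2 alpha), the height of a uniform kernel.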

  12. Kernel versions of some orthogonal transformations

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg

    Kernel versions of orthogonal transformations such as principal components are based on a dual formulation also termed Q-mode analysis in which the data enter into the analysis via inner products in the Gram matrix only. In the kernel version the inner products of the original data are replaced by inner products between nonlinear mappings into higher dimensional feature space. Via kernel substitution also known as the kernel trick these inner products between the mappings are in turn replaced by a kernel function and all quantities needed in the analysis are expressed in terms of this kernel function. This means that we need not know the nonlinear mappings explicitly. Kernel principal component analysis (PCA) and kernel minimum noise fraction (MNF) analyses handle nonlinearities by implicitly transforming data into high (even infinite) dimensional feature space via the kernel function...
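
    The dual (Q-mode) formulation described in this record can be sketched as follows in Python: the analysis touches the data only through a Gram matrix, so swapping the linear Gram matrix for a kernel matrix gives kernel PCA. This is a generic textbook sketch, not the author's implementation; the function names, the Gaussian kernel choice and the random data are assumptions, and kernel MNF is not covered.

```python
import numpy as np

def linear_gram(X):
    """Gram matrix of ordinary inner products (gives standard PCA in dual form)."""
    return X @ X.T

def gaussian_gram(X, sigma):
    """Gram matrix with inner products replaced by a Gaussian kernel function."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_pca_scores(K, n_components):
    """Scores of the training points on the leading components, from the Gram matrix only."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering in feature space
    Kc = J @ K @ J
    vals, vecs = np.linalg.eigh(Kc)            # eigenvalues in ascending order
    vals = vals[::-1][:n_components]
    vecs = vecs[:, ::-1][:, :n_components]
    return vecs * np.sqrt(np.maximum(vals, 0.0)), vals

rng = np.random.default_rng(0)
X_demo = rng.standard_normal((20, 3))          # hypothetical data, 20 observations
scores_lin, _ = kernel_pca_scores(linear_gram(X_demo), 2)
scores_rbf, _ = kernel_pca_scores(gaussian_gram(X_demo, sigma=1.0), 2)
print(scores_lin[:3], scores_rbf[:3])
```

    With the linear Gram matrix the scores agree with ordinary PCA scores up to sign, which is a convenient check that the dual formulation is set up correctly before a nonlinear kernel is substituted.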

  13. Model Selection in Kernel Ridge Regression

    DEFF Research Database (Denmark)

    Exterkate, Peter

    Kernel ridge regression is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts. This paper investigates the influence of the choice of kernel and the setting of tuning parameters on forecast accuracy. We review several popular kernels......, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. We interpret the latter two kernels in terms of their smoothing properties, and we relate the tuning parameters associated to all these kernels to smoothness measures of the prediction function and to the signal-to-noise ratio. Based...... on these interpretations, we provide guidelines for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study confirms the practical usefulness of these rules of thumb. Finally, the flexible and smooth functional forms provided by the Gaussian and Sinc kernels make them widely...

  14. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat.

    Science.gov (United States)

    Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne

    2012-12-01

    In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.

  15. Determination of Iodine Value in Hydrogenated Palm Kernel Oil (HPKO) and Refined Bleached Deodorized Palm Kernel Oil (RBDPKO)

    OpenAIRE

    Sitompul, Monica Angelina

    2015-01-01

    The iodine value of several samples of Hydrogenated Palm Kernel Oil (HPKO) and Refined Bleached Deodorized Palm Kernel Oil (RBDPKO) was determined by titration. The analysis gave the following iodine values: Hydrogenated Palm Kernel Oil (A) = 0.16 g I2/100 g, Hydrogenated Palm Kernel Oil (B) = 0.20 g I2/100 g, Hydrogenated Palm Kernel Oil (C) = 0.24 g I2/100 g; and Refined Bleached Deodorized Palm Kernel Oil (A) = 17.51 g I2/100 g, Refined Bleached Deodorized Palm Kernel ...

  16. 7 CFR 981.8 - Inedible kernel.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Inedible kernel. 981.8 Section 981.8 Agriculture... Regulating Handling Definitions § 981.8 Inedible kernel. Inedible kernel means a kernel, piece, or particle of almond kernel with any defect scored as serious damage, or damage due to mold, gum, shrivel, or...

  17. Bayesian nonparametric inference on quantile residual life function: Application to breast cancer data.

    Science.gov (United States)

    Park, Taeyoung; Jeong, Jong-Hyeon; Lee, Jae Won

    2012-08-15

    There is often an interest in estimating a residual life function as a summary measure of survival data. For ease in presentation of the potential therapeutic effect of a new drug, investigators may summarize survival data in terms of the remaining life years of patients. Under heavy right censoring, however, some reasonably high quantiles (e.g., median) of a residual lifetime distribution cannot be always estimated via a popular nonparametric approach on the basis of the Kaplan-Meier estimator. To overcome the difficulties in dealing with heavily censored survival data, this paper develops a Bayesian nonparametric approach that takes advantage of a fully model-based but highly flexible probabilistic framework. We use a Dirichlet process mixture of Weibull distributions to avoid strong parametric assumptions on the unknown failure time distribution, making it possible to estimate any quantile residual life function under heavy censoring. Posterior computation through Markov chain Monte Carlo is straightforward and efficient because of conjugacy properties and partial collapse. We illustrate the proposed methods by using both simulated data and heavily censored survival data from a recent breast cancer clinical trial conducted by the National Surgical Adjuvant Breast and Bowel Project. Copyright © 2012 John Wiley & Sons, Ltd.
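
    The limitation mentioned in this abstract, that the median cannot always be estimated from the Kaplan-Meier estimator under heavy right censoring, can be seen in a few lines. The sketch below is a plain product-limit estimator written for illustration; it is not the Bayesian method of the paper, and the data values and function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (t, S(t)) at each distinct failure time.

    times  : observed follow-up times
    events : 1 if the failure was observed, 0 if the subject was right-censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that leave the risk set at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def median_survival(curve):
    """First time at which the estimated survival is at or below 0.5, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical heavily censored sample: only two of eight failures observed.
heavy = kaplan_meier([2, 3, 4, 5, 6, 7, 8, 9], [1, 0, 1, 0, 0, 0, 0, 0])
# Hypothetical uncensored sample.
full = kaplan_meier([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])

print(heavy, median_survival(heavy))
print(full, median_survival(full))
```

    In the censored sample the estimated curve never drops to 0.5, so `median_survival` returns `None`; this is the situation the model-based approach is meant to handle.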

  18. 7 CFR 981.408 - Inedible kernel.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Inedible kernel. 981.408 Section 981.408 Agriculture... Administrative Rules and Regulations § 981.408 Inedible kernel. Pursuant to § 981.8, the definition of inedible kernel is modified to mean a kernel, piece, or particle of almond kernel with any defect scored as...

  19. Digital spectral analysis parametric, non-parametric and advanced methods

    CERN Document Server

    Castanié, Francis

    2013-01-01

    Digital Spectral Analysis provides a single source that offers complete coverage of the spectral analysis domain. This self-contained work includes details on advanced topics that are usually presented in scattered sources throughout the literature. The theoretical principles necessary for the understanding of spectral analysis are discussed in the first four chapters: fundamentals, digital signal processing, estimation in spectral analysis, and time-series models. An entire chapter is devoted to the non-parametric methods most widely used in industry. High resolution methods a

  20. A field operational test on valve-regulated lead-acid absorbent-glass-mat batteries in micro-hybrid electric vehicles. Part I. Results based on kernel density estimation

    Science.gov (United States)

    Schaeck, S.; Karspeck, T.; Ott, C.; Weckler, M.; Stoermer, A. O.

    2011-03-01

    In March 2007 the BMW Group launched the micro-hybrid functions brake energy regeneration (BER) and automatic start and stop function (ASSF). Valve-regulated lead-acid (VRLA) batteries in absorbent glass mat (AGM) technology are applied in vehicles with micro-hybrid power system (MHPS). In both part I and part II of this publication vehicles with MHPS and AGM batteries are subject to a field operational test (FOT). Test vehicles with conventional power system (CPS) and flooded batteries were used as a reference. In the FOT sample batteries were mounted several times and electrically tested in the laboratory intermediately. Vehicle- and battery-related diagnosis data were read out for each test run and were matched with laboratory data in a database. The FOT data were analyzed by the use of two-dimensional, nonparametric kernel estimation for clear data presentation. The data show that capacity loss in the MHPS is comparable to the CPS. However, the influence of mileage performance, which cannot be separated, suggests that battery stress is enhanced in the MHPS although a battery refresh function is applied. Anyway, the FOT demonstrates the unsuitability of flooded batteries for the MHPS because of high early capacity loss due to acid stratification and because of vanishing cranking performance due to increasing internal resistance. Furthermore, the lack of dynamic charge acceptance for high energy regeneration efficiency is illustrated. Under the presented FOT conditions charge acceptance of lead-acid (LA) batteries decreases to less than one third for about half of the sample batteries compared to new battery condition. In part II of this publication FOT data are presented by multiple regression analysis (Schaeck et al., submitted for publication [1]).

  1. Biochemical and seminal parameters of lambs fed palm kernel cake under grazing system

    Directory of Open Access Journals (Sweden)

    Lopes César Mugabe

    This study aimed to assess the effects of palm kernel cake on semen quality and biochemical parameters of Santa Inês lambs. A total of 40 animals with 24.10±2.72 kg body weight and five months old were assigned in a completely randomized design into four groups and 10 replicates. The animals were subjected to four levels of palm kernel cake (0, 15, 30, and 45% based on dry matter). The trial lasted 90 days, preceded by 15 days for adaptation. Blood samples were collected every 45 days from the jugular vein using vacuum tubes without anticoagulant. Total serum cholesterol, triglycerides, high-density lipoprotein, low-density lipoprotein, and very-low-density lipoprotein were assessed. Once the animals reached puberty at a mean age of 225 days, semen samples were collected by electroejaculator once a week for three consecutive weeks and assessed for volume, color, aspect, wave motion, motility, sperm concentration, sperm vigor, total spermatozoa per ejaculate, viable spermatozoa per mL, and sperm morphology. The data were subjected to analysis of variance followed by regression analysis. Non-parametric data were analysed by the Kruskal-Wallis test. Total cholesterol, high-density lipoprotein, triglycerides, and very-low-density lipoprotein increased linearly. There was no difference for low-density lipoprotein. Diets did not affect mass motility, sperm motility, vigor, total spermatozoa per ejaculate, viable sperm per mL, and minor and total sperm defects. Sperm concentration increased linearly. Negative quadratic effects were observed for major sperm defects. Supplementation of diets with palm kernel cake up to 45% on dry matter enhances biochemical parameters and does not impair the qualitative variables of lamb sperm.

  2. Model selection in kernel ridge regression

    DEFF Research Database (Denmark)

    Exterkate, Peter

    2013-01-01

    Kernel ridge regression is a technique to perform ridge regression with a potentially infinite number of nonlinear transformations of the independent variables as regressors. This method is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts....... The influence of the choice of kernel and the setting of tuning parameters on forecast accuracy is investigated. Several popular kernels are reviewed, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. The latter two kernels are interpreted in terms of their smoothing properties......, and the tuning parameters associated to all these kernels are related to smoothness measures of the prediction function and to the signal-to-noise ratio. Based on these interpretations, guidelines are provided for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study...
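
    To make the technique in this abstract concrete, here is a minimal sketch of kernel ridge regression with a Gaussian kernel in one dimension. It solves (K + λI)α = y and predicts with f(x) = Σ α_i k(x, x_i). The data, the ridge penalty `lam`, the kernel width `sigma` and all function names are assumptions for illustration; in particular the tuning parameters are set by hand here rather than selected by cross-validation as the paper recommends.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel k(x, y) = exp(-(x - y)^2 / (2 sigma^2))."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z

def fit(xs, ys, lam, sigma):
    """Dual coefficients alpha solving (K + lam * I) alpha = y."""
    K = [[gaussian_kernel(a, b, sigma) for b in xs] for a in xs]
    for i in range(len(xs)):
        K[i][i] += lam
    return solve(K, ys)

def predict(x, xs, alpha, sigma):
    """Prediction f(x) = sum_i alpha_i * k(x, x_i)."""
    return sum(a * gaussian_kernel(x, xi, sigma) for a, xi in zip(alpha, xs))

# Hypothetical training data: samples of sin(x).
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [math.sin(x) for x in xs]
alpha = fit(xs, ys, lam=1e-3, sigma=0.7)
print(predict(1.25, xs, alpha, sigma=0.7), math.sin(1.25))
```

    The regressors never appear explicitly: only kernel evaluations between data points are needed, which is why the number of implied nonlinear transformations can be infinite.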

  3. On Cooper's Nonparametric Test.

    Science.gov (United States)

    Schmeidler, James

    1978-01-01

    The basic assumption of Cooper's nonparametric test for trend (EJ 125 069) is questioned. It is contended that the proper assumption alters the distribution of the statistic and reduces its usefulness. (JKS)

  4. LZW-Kernel: fast kernel utilizing variable length code blocks from LZW compressors for protein sequence classification.

    Science.gov (United States)

    Filatov, Gleb; Bauwens, Bruno; Kertész-Farkas, Attila

    2018-05-07

    Bioinformatics studies often rely on similarity measures between sequence pairs, which often pose a bottleneck in large-scale sequence analysis. Here, we present a new convolutional kernel function for protein sequences called the LZW-Kernel. It is based on code words identified with the Lempel-Ziv-Welch (LZW) universal text compressor. The LZW-Kernel is an alignment-free method, it is always symmetric, is positive, always provides 1.0 for self-similarity and it can directly be used with Support Vector Machines (SVMs) in classification problems, contrary to normalized compression distance (NCD), which often violates the distance metric properties in practice and requires further techniques to be used with SVMs. The LZW-Kernel is a one-pass algorithm, which makes it particularly plausible for big data applications. Our experimental studies on remote protein homology detection and protein classification tasks reveal that the LZW-Kernel closely approaches the performance of the Local Alignment Kernel (LAK) and the SVM-pairwise method combined with Smith-Waterman (SW) scoring at a fraction of the time. Moreover, the LZW-Kernel outperforms the SVM-pairwise method when combined with BLAST scores, which indicates that the LZW code words might be a better basis for similarity measures than local alignment approximations found with BLAST. In addition, the LZW-Kernel outperforms n-gram based mismatch kernels, hidden Markov model based SAM and Fisher kernel, and protein family based PSI-BLAST, among others. Further advantages include the LZW-Kernel's reliance on a simple idea, its ease of implementation, and its high speed, three times faster than BLAST and several magnitudes faster than SW or LAK in our tests. LZW-Kernel is implemented as a standalone C code and is a free open-source program distributed under GPLv3 license and can be downloaded from https://github.com/kfattila/LZW-Kernel. akerteszfarkas@hse.ru. Supplementary data are available at Bioinformatics Online.
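
    The abstract does not give the kernel formula, so the sketch below only illustrates the ingredient it names: the code words that a single LZW pass adds to its dictionary. The similarity score built on top of them (shared code words divided by the geometric mean of the two set sizes) is a stand-in chosen for illustration and is not the LZW-Kernel itself; it merely shares the stated properties of being symmetric, non-negative and equal to 1.0 for self-similarity. All names and the example strings are hypothetical.

```python
import math

def lzw_code_words(sequence):
    """Code words (multi-symbol dictionary entries) added during one LZW pass."""
    dictionary = set(sequence)  # start from the single-symbol alphabet
    added = set()
    current = ""
    for symbol in sequence:
        candidate = current + symbol
        if candidate in dictionary:
            current = candidate          # keep extending the current phrase
        else:
            dictionary.add(candidate)    # new code word
            added.add(candidate)
            current = symbol
    return added

def shared_code_word_similarity(a, b):
    """Illustrative similarity (NOT the published kernel): |A & B| / sqrt(|A| |B|)."""
    words_a, words_b = lzw_code_words(a), lzw_code_words(b)
    if not words_a or not words_b:
        return 1.0 if words_a == words_b else 0.0
    return len(words_a & words_b) / math.sqrt(len(words_a) * len(words_b))

print(sorted(lzw_code_words("ABABABA")))
print(shared_code_word_similarity("MKVLAAGIVGL", "MKVLAAGIVGL"))
print(shared_code_word_similarity("MKVLAAGIVGL", "MKVIAAGLVGL"))
```

    Each sequence is scanned once to build its code-word set, which reflects the one-pass character that the abstract highlights.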

  5. Viscosity kernel of molecular fluids

    DEFF Research Database (Denmark)

    Puscasu, Ruslan; Todd, Billy; Daivis, Peter

    2010-01-01

    , temperature, and chain length dependencies of the reciprocal and real-space viscosity kernels are presented. We find that the density has a major effect on the shape of the kernel. The temperature range and chain lengths considered here have by contrast less impact on the overall normalized shape. Functional...... forms that fit the wave-vector-dependent kernel data over a large density and wave-vector range have also been tested. Finally, a structural normalization of the kernels in physical space is considered. Overall, the real-space viscosity kernel has a width of roughly 3–6 atomic diameters, which means...

  6. Kernel learning algorithms for face recognition

    CERN Document Server

    Li, Jun-Bao; Pan, Jeng-Shyang

    2013-01-01

    Kernel Learning Algorithms for Face Recognition covers the framework of kernel based face recognition. This book discusses advanced kernel learning algorithms and their application to face recognition. This book also focuses on the theoretical deviation, the system framework and experiments involving kernel based face recognition. Included within are algorithms of kernel based face recognition, and also the feasibility of the kernel based face recognition method. This book provides researchers in pattern recognition and machine learning area with advanced face recognition methods and its new

  7. A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems

    Science.gov (United States)

    Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J.

    2017-06-01

    We propose a Bayesian nonparametric mixture model for the reconstruction and prediction from observed time series data, of discretized stochastic dynamical systems, based on Markov Chain Monte Carlo methods. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities, is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.

  8. A Bayesian nonparametric approach to reconstruction and prediction of random dynamical systems.

    Science.gov (United States)

    Merkatas, Christos; Kaloudis, Konstantinos; Hatjispyros, Spyridon J

    2017-06-01

    We propose a Bayesian nonparametric mixture model for the reconstruction and prediction from observed time series data, of discretized stochastic dynamical systems, based on Markov Chain Monte Carlo methods. Our results can be used by researchers in physical modeling interested in a fast and accurate estimation of low dimensional stochastic models when the size of the observed time series is small and the noise process (perhaps) is non-Gaussian. The inference procedure is demonstrated specifically in the case of polynomial maps of an arbitrary degree and when a Geometric Stick Breaking mixture process prior over the space of densities, is applied to the additive errors. Our method is parsimonious compared to Bayesian nonparametric techniques based on Dirichlet process mixtures, flexible and general. Simulations based on synthetic time series are presented.

  9. Exact nonparametric confidence bands for the survivor function.

    Science.gov (United States)

    Matthews, David

    2013-10-12

    A method to produce exact simultaneous confidence bands for the empirical cumulative distribution function that was first described by Owen, and subsequently corrected by Jager and Wellner, is the starting point for deriving exact nonparametric confidence bands for the survivor function of any positive random variable. We invert a nonparametric likelihood test of uniformity, constructed from the Kaplan-Meier estimator of the survivor function, to obtain simultaneous lower and upper bands for the function of interest with specified global confidence level. The method involves calculating a null distribution and associated critical value for each observed sample configuration. However, Noe recursions and the Van Wijngaarden-Decker-Brent root-finding algorithm provide the necessary tools for efficient computation of these exact bounds. Various aspects of the effect of right censoring on these exact bands are investigated, using as illustrations two observational studies of survival experience among non-Hodgkin's lymphoma patients and a much larger group of subjects with advanced lung cancer enrolled in trials within the North Central Cancer Treatment Group. Monte Carlo simulations confirm the merits of the proposed method of deriving simultaneous interval estimates of the survivor function across the entire range of the observed sample. This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada. It was begun while the author was visiting the Department of Statistics, University of Auckland, and completed during a subsequent sojourn at the Medical Research Council Biostatistics Unit in Cambridge. The support of both institutions, in addition to that of NSERC and the University of Waterloo, is greatly appreciated.

  10. Kernel methods for deep learning

    OpenAIRE

    Cho, Youngmin

    2012-01-01

    We introduce a new family of positive-definite kernels that mimic the computation in large neural networks. We derive the different members of this family by considering neural networks with different activation functions. Using these kernels as building blocks, we also show how to construct other positive-definite kernels by operations such as composition, multiplication, and averaging. We explore the use of these kernels in standard models of supervised learning, such as support vector mach...

  11. Tremor Detection Using Parametric and Non-Parametric Spectral Estimation Methods: A Comparison with Clinical Assessment

    Science.gov (United States)

    Martinez Manzanera, Octavio; Elting, Jan Willem; van der Hoeven, Johannes H.; Maurits, Natasha M.

    2016-01-01

    In the clinic, tremor is diagnosed during a time-limited process in which patients are observed and the characteristics of tremor are visually assessed. For some tremor disorders, a more detailed analysis of these characteristics is needed. Accelerometry and electromyography can be used to obtain a better insight into tremor. Typically, routine clinical assessment of accelerometry and electromyography data involves visual inspection by clinicians and occasionally computational analysis to obtain objective characteristics of tremor. However, for some tremor disorders these characteristics may be different during daily activity. This variability in presentation between the clinic and daily life makes a differential diagnosis more difficult. A long-term recording of tremor by accelerometry and/or electromyography in the home environment could help to give a better insight into the tremor disorder. However, an evaluation of such recordings using routine clinical standards would take too much time. We evaluated a range of techniques that automatically detect tremor segments in accelerometer data, as accelerometer data is more easily obtained in the home environment than electromyography data. Time can be saved if clinicians only have to evaluate the tremor characteristics of segments that have been automatically detected in longer daily activity recordings. We tested four non-parametric methods and five parametric methods on clinical accelerometer data from 14 patients with different tremor disorders. The consensus between two clinicians regarding the presence or absence of tremor on 3943 segments of accelerometer data was employed as reference. The nine methods were tested against this reference to identify their optimal parameters. Non-parametric methods generally performed better than parametric methods on our dataset when optimal parameters were used. However, one parametric method, employing the high frequency content of the tremor bandwidth under consideration

  12. Robust estimation for ordinary differential equation models.

    Science.gov (United States)

    Cao, J; Wang, L; Xu, J

    2011-12-01

    Applied scientists often like to use ordinary differential equations (ODEs) to model complex dynamic processes that arise in biology, engineering, medicine, and many other areas. It is interesting but challenging to estimate ODE parameters from noisy data, especially when the data have some outliers. We propose a robust method to address this problem. The dynamic process is represented with a nonparametric function, which is a linear combination of basis functions. The nonparametric function is estimated by a robust penalized smoothing method. The penalty term is defined with the parametric ODE model, which controls the roughness of the nonparametric function and maintains the fidelity of the nonparametric function to the ODE model. The basis coefficients and ODE parameters are estimated in two nested levels of optimization. The coefficient estimates are treated as an implicit function of ODE parameters, which enables one to derive the analytic gradients for optimization using the implicit function theorem. Simulation studies show that the robust method gives satisfactory estimates for the ODE parameters from noisy data with outliers. The robust method is demonstrated by estimating a predator-prey ODE model from real ecological data. © 2011, The International Biometric Society.

  13. Dose calculation methods in photon beam therapy using energy deposition kernels

    International Nuclear Information System (INIS)

    Ahnesjoe, A.

    1991-01-01

    The problem of calculating accurate dose distributions in treatment planning of megavoltage photon radiation therapy has been studied. New dose calculation algorithms using energy deposition kernels have been developed. The kernels describe the transfer of energy by secondary particles from a primary photon interaction site to its surroundings. Monte Carlo simulations of particle transport have been used for derivation of kernels for primary photon energies from 0.1 MeV to 50 MeV. The trade-off between accuracy and calculational speed has been addressed by the development of two algorithms: one point oriented with low computational overhead for interactive use and one for fast and accurate calculation of dose distributions in a 3-dimensional lattice. The latter algorithm models secondary particle transport in heterogeneous tissue by scaling energy deposition kernels with the electron density of the tissue. The accuracy of the methods has been tested using full Monte Carlo simulations for different geometries, and found to be superior to conventional algorithms based on scaling of broad beam dose distributions. Methods have also been developed for characterization of clinical photon beams in entities appropriate for kernel based calculation models. By approximating the spectrum as laterally invariant, an effective spectrum and dose distribution for contaminating charged particles are derived from depth dose distributions measured in water, using analytical constraints. The spectrum is used to calculate kernels by superposition of monoenergetic kernels. The lateral energy fluence distribution is determined by deconvolving measured lateral dose distributions by a corresponding pencil beam kernel. Dose distributions for contaminating photons are described using two different methods, one for estimation of the dose outside of the collimated beam, and the other for calibration of output factors derived from kernel based dose calculations. (au)

  14. Nonparametric Collective Spectral Density Estimation and Clustering

    KAUST Repository

    Maadooliat, Mehdi; Sun, Ying; Chen, Tianbo

    2017-01-01

    In this paper, we develop a method for the simultaneous estimation of spectral density functions (SDFs) for a collection of stationary time series that share some common features. Due to the similarities among the SDFs, the log-SDF can be represented using a common set of basis functions. The basis shared by the collection of the log-SDFs is estimated as a low-dimensional manifold of a large space spanned by a pre-specified rich basis. A collective estimation approach pools information and borrows strength across the SDFs to achieve better estimation efficiency. Also, each estimated spectral density has a concise representation using the coefficients of the basis expansion, and these coefficients can be used for visualization, clustering, and classification purposes. The Whittle pseudo-maximum likelihood approach is used to fit the model and an alternating blockwise Newton-type algorithm is developed for the computation. A web-based shiny App found at

  15. Nonparametric Collective Spectral Density Estimation and Clustering

    KAUST Repository

    Maadooliat, Mehdi

    2017-04-12

    In this paper, we develop a method for the simultaneous estimation of spectral density functions (SDFs) for a collection of stationary time series that share some common features. Due to the similarities among the SDFs, the log-SDF can be represented using a common set of basis functions. The basis shared by the collection of the log-SDFs is estimated as a low-dimensional manifold of a large space spanned by a pre-specified rich basis. A collective estimation approach pools information and borrows strength across the SDFs to achieve better estimation efficiency. Also, each estimated spectral density has a concise representation using the coefficients of the basis expansion, and these coefficients can be used for visualization, clustering, and classification purposes. The Whittle pseudo-maximum likelihood approach is used to fit the model and an alternating blockwise Newton-type algorithm is developed for the computation. A web-based shiny App found at

  16. 7 CFR 981.9 - Kernel weight.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Kernel weight. 981.9 Section 981.9 Agriculture Regulations of the Department of Agriculture (Continued) AGRICULTURAL MARKETING SERVICE (Marketing Agreements... Regulating Handling Definitions § 981.9 Kernel weight. Kernel weight means the weight of kernels, including...

  17. Efficient Kernel-Based Ensemble Gaussian Mixture Filtering

    KAUST Repository

    Liu, Bo

    2015-11-11

    We consider the Bayesian filtering problem for data assimilation following the kernel-based ensemble Gaussian-mixture filtering (EnGMF) approach introduced by Anderson and Anderson (1999). In this approach, the posterior distribution of the system state is propagated with the model using the ensemble Monte Carlo method, providing a forecast ensemble that is then used to construct a prior Gaussian-mixture (GM) based on the kernel density estimator. This results in two update steps: a Kalman filter (KF)-like update of the ensemble members and a particle filter (PF)-like update of the weights, followed by a resampling step to start a new forecast cycle. After formulating EnGMF for any observational operator, we analyze the influence of the bandwidth parameter of the kernel function on the covariance of the posterior distribution. We then focus on two aspects: i) the efficient implementation of EnGMF with (relatively) small ensembles, where we propose a new deterministic resampling strategy preserving the first two moments of the posterior GM to limit the sampling error; and ii) the analysis of the effect of the bandwidth parameter on contributions of KF and PF updates and on the weights variance. Numerical results using the Lorenz-96 model are presented to assess the behavior of EnGMF with deterministic resampling, study its sensitivity to different parameters and settings, and evaluate its performance against ensemble KFs. The proposed EnGMF approach with deterministic resampling suggests improved estimates in all tested scenarios, and is shown to require less localization and to be less sensitive to the choice of filtering parameters.

  18. Veto-Consensus Multiple Kernel Learning

    NARCIS (Netherlands)

    Zhou, Y.; Hu, N.; Spanos, C.J.

    2016-01-01

    We propose Veto-Consensus Multiple Kernel Learning (VCMKL), a novel way of combining multiple kernels such that one class of samples is described by the logical intersection (consensus) of base kernelized decision rules, whereas the other classes by the union (veto) of their complements. The

  19. 7 CFR 51.2295 - Half kernel.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Half kernel. 51.2295 Section 51.2295 Agriculture... Standards for Shelled English Walnuts (Juglans Regia) Definitions § 51.2295 Half kernel. Half kernel means the separated half of a kernel with not more than one-eighth broken off. ...

  20. Non-parametric correlative uncertainty quantification and sensitivity analysis: Application to a Langmuir bimolecular adsorption model

    Science.gov (United States)

    Feng, Jinchao; Lansford, Joshua; Mironenko, Alexander; Pourkargar, Davood Babaei; Vlachos, Dionisios G.; Katsoulakis, Markos A.

    2018-03-01

    We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.

  1. Non-parametric correlative uncertainty quantification and sensitivity analysis: Application to a Langmuir bimolecular adsorption model

    Directory of Open Access Journals (Sweden)

    Jinchao Feng

    2018-03-01

    Full Text Available We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.

  2. Bayesian Nonparametric Measurement of Factor Betas and Clustering with Application to Hedge Fund Returns

    Directory of Open Access Journals (Sweden)

    Urbi Garay

    2016-03-01

    Full Text Available We define a dynamic and self-adjusting mixture of Gaussian Graphical Models to cluster financial returns, and provide a new method for extraction of nonparametric estimates of dynamic alphas (excess return) and betas (to a choice set of explanatory factors) in a multivariate setting. This approach, as well as the outputs, has a dynamic, nonstationary and nonparametric form, which circumvents the problem of model risk and parametric assumptions that the Kalman filter and other widely used approaches rely on. The by-product of clusters, used for shrinkage and information borrowing, can be of use to determine relationships around specific events. This approach exhibits a smaller Root Mean Squared Error than traditionally used benchmarks in financial settings, which we illustrate through simulation. As an illustration, we use hedge fund index data, and find that our estimated alphas are, on average, 0.13% per month higher (1.6% per year) than alphas estimated through Ordinary Least Squares. The approach exhibits fast adaptation to abrupt changes in the parameters, as seen in our estimated alphas and betas, which exhibit high volatility, especially in periods which can be identified as times of stressful market events, a reflection of the dynamic positioning of hedge fund portfolio managers.

  3. A mixture model for robust registration in Kinect sensor

    Science.gov (United States)

    Peng, Li; Zhou, Huabing; Zhu, Shengguo

    2018-03-01

    The Microsoft Kinect sensor has been widely used in many applications, but it suffers from the drawback of low registration precision between color image and depth image. In this paper, we present a robust method to improve the registration precision by a mixture model that can handle multiple images with the nonparametric model. We impose non-parametric geometrical constraints on the correspondence, as a prior distribution, in a reproducing kernel Hilbert space (RKHS). The estimation is performed by the EM algorithm, which, by also estimating the variance of the prior model, is able to obtain good estimates. We illustrate the proposed method on a publicly available dataset. The experimental results show that our approach outperforms the baseline methods.

  4. An Approximate Approach to Automatic Kernel Selection.

    Science.gov (United States)

    Ding, Lizhong; Liao, Shizhong

    2016-02-02

    Kernel selection is a fundamental problem of kernel-based learning algorithms. In this paper, we propose an approximate approach to automatic kernel selection for regression from the perspective of kernel matrix approximation. We first introduce multilevel circulant matrices into automatic kernel selection, and develop two approximate kernel selection algorithms by exploiting the computational virtues of multilevel circulant matrices. The complexity of the proposed algorithms is quasi-linear in the number of data points. Then, we prove an approximation error bound to measure the effect of the approximation in kernel matrices by multilevel circulant matrices on the hypothesis and further show that the approximate hypothesis produced with multilevel circulant matrices converges to the accurate hypothesis produced with kernel matrices. Experimental evaluations on benchmark datasets demonstrate the effectiveness of approximate kernel selection.

  5. Iterative software kernels

    Energy Technology Data Exchange (ETDEWEB)

    Duff, I.

    1994-12-31

    This workshop focuses on kernels for iterative software packages. Specifically, the three speakers discuss various aspects of sparse BLAS kernels. Their topics are: "Current status of user level sparse BLAS"; "Current status of the sparse BLAS toolkit"; and "Adding matrix-matrix and matrix-matrix-matrix multiply to the sparse BLAS toolkit".

  6. Multiscale Support Vector Learning With Projection Operator Wavelet Kernel for Nonlinear Dynamical System Identification.

    Science.gov (United States)

    Lu, Zhao; Sun, Jing; Butts, Kenneth

    2016-02-03

    A giant leap has been made in the past couple of decades with the introduction of kernel-based learning as a mainstay for designing effective nonlinear computational learning algorithms. In view of the geometric interpretation of conditional expectation and the ubiquity of multiscale characteristics in highly complex nonlinear dynamic systems [1]-[3], this paper presents a new orthogonal projection operator wavelet kernel, aiming at developing an efficient computational learning approach for nonlinear dynamical system identification. In the framework of multiresolution analysis, the proposed projection operator wavelet kernel can fulfill the multiscale, multidimensional learning to estimate complex dependencies. The special advantage of the projection operator wavelet kernel developed in this paper lies in the fact that it has a closed-form expression, which greatly facilitates its application in kernel learning. To the best of our knowledge, it is the first closed-form orthogonal projection wavelet kernel reported in the literature. It provides a link between grid-based wavelets and mesh-free kernel-based methods. Simulation studies for identifying the parallel models of two benchmark nonlinear dynamical systems confirm its superiority in model accuracy and sparsity.

  7. A kernel version of spatial factor analysis

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg

    2009-01-01

    . Schölkopf et al. introduce kernel PCA. Shawe-Taylor and Cristianini is an excellent reference for kernel methods in general. Bishop and Press et al. describe kernel methods among many other subjects. Nielsen and Canty use kernel PCA to detect change in univariate airborne digital camera images. The kernel...... version of PCA handles nonlinearities by implicitly transforming data into high (even infinite) dimensional feature space via the kernel function and then performing a linear analysis in that space. In this paper we shall apply kernel versions of PCA, maximum autocorrelation factor (MAF) analysis...

  8. Model-based estimation of finite population total in stratified sampling

    African Journals Online (AJOL)

    The work presented in this paper concerns the estimation of a finite population total under a model-based framework. A nonparametric regression approach as a method of estimating the finite population total is explored. The asymptotic properties of the estimators based on nonparametric regression are also developed under ...

  9. Detoxification of Jatropha curcas kernel cake by a novel Streptomyces fimicarius strain.

    Science.gov (United States)

    Wang, Xing-Hong; Ou, Lingcheng; Fu, Liang-Liang; Zheng, Shui; Lou, Ji-Dong; Gomes-Laranjo, José; Li, Jiao; Zhang, Changhe

    2013-09-15

    A huge amount of kernel cake, which contains a variety of toxins including phorbol esters (tumor promoters), is projected to be generated yearly in the near future by the Jatropha biodiesel industry. We showed that the kernel cake strongly inhibited plant seed germination and root growth and was highly toxic to carp fingerlings, even though phorbol esters were undetectable by HPLC. Therefore it must be detoxified before disposal to the environment. A mathematical model was established to estimate the general toxicity of the kernel cake by determining the survival time of carp fingerlings. A new strain (Streptomyces fimicarius YUCM 310038) capable of degrading the total toxicity by more than 97% in a 9-day solid state fermentation was screened out from 578 strains including 198 known strains and 380 strains isolated from air and soil. The kernel cake fermented by YUCM 310038 was nontoxic to plants and carp fingerlings and significantly promoted tobacco plant growth, indicating its potential to transform the toxic kernel cake to bio-safe animal feed or organic fertilizer to remove the environmental concern and to reduce the cost of the Jatropha biodiesel industry. The microbial strain profile essential for the kernel cake detoxification was discussed. Copyright © 2013 Elsevier B.V. All rights reserved.

  10. 7 CFR 51.1441 - Half-kernel.

    Science.gov (United States)

    2010-01-01

    ... Standards for Grades of Shelled Pecans, Definitions, § 51.1441 Half-kernel. Half-kernel means one of the separated halves of an entire pecan kernel with not more than one-eighth of its original volume missing...

  11. Gaussian process-based Bayesian nonparametric inference of population size trajectories from gene genealogies.

    Science.gov (United States)

    Palacios, Julia A; Minin, Vladimir N

    2013-03-01

    Changes in population size influence genetic diversity of the population and, as a result, leave a signature of these changes in individual genomes in the population. We are interested in the inverse problem of reconstructing past population dynamics from genomic data. We start with a standard framework based on the coalescent, a stochastic process that generates genealogies connecting randomly sampled individuals from the population of interest. These genealogies serve as a glue between the population demographic history and genomic sequences. It turns out that only the times of genealogical lineage coalescences contain information about population size dynamics. Viewing these coalescent times as a point process, estimating population size trajectories is equivalent to estimating a conditional intensity of this point process. Therefore, our inverse problem is similar to estimating an inhomogeneous Poisson process intensity function. We demonstrate how recent advances in Gaussian process-based nonparametric inference for Poisson processes can be extended to Bayesian nonparametric estimation of population size dynamics under the coalescent. We compare our Gaussian process (GP) approach to one of the state-of-the-art Gaussian Markov random field (GMRF) methods for estimating population trajectories. Using simulated data, we demonstrate that our method has better accuracy and precision. Next, we analyze two genealogies reconstructed from real sequences of hepatitis C and human Influenza A viruses. In both cases, we recover more believed aspects of the viral demographic histories than the GMRF approach. We also find that our GP method produces more reasonable uncertainty estimates than the GMRF method. Copyright © 2013, The International Biometric Society.

  12. Application of nonparametric statistics to material strength/reliability assessment

    International Nuclear Information System (INIS)

    Arai, Taketoshi

    1992-01-01

    Advanced material technology requires a database covering a wide variety of material behavior, which needs to be established experimentally. Experiments are often limited in practice in terms of reproducibility or the range of test parameters. Statistical methods can be applied to quantify uncertainties as required from the reliability point of view. Statistical assessment involves determining a most probable value and the maximum and/or minimum value as a one-sided or two-sided confidence limit. A scatter of test data can be approximated by a theoretical distribution only if the goodness of fit satisfies a test criterion. Alternatively, nonparametric statistics (NPS) or distribution-free statistics can be applied. Mathematical procedures by NPS are well established for dealing with most reliability problems. They handle only order statistics of a sample. Mathematical formulas and some applications to engineering assessments are described. They include confidence limits of the median, population coverage of a sample, the required minimum sample size, and confidence limits of fracture probability. These applications demonstrate that nonparametric statistical estimation is useful in logical decision making when large uncertainty exists. (author)
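
    Two of the distribution-free quantities named in this abstract can be sketched with textbook order-statistic formulas. The sketch below is an illustration only, not the report's own procedure: the function names `min_sample_size` and `median_ci_ranks` are hypothetical, and the formulas assume independent samples from a continuous distribution.

```python
import math


def min_sample_size(coverage, confidence):
    """Smallest n such that the sample minimum lies below at least a
    fraction `coverage` of the population with probability `confidence`,
    i.e. the smallest n with 1 - coverage**n >= confidence."""
    n = math.ceil(math.log(1.0 - confidence) / math.log(coverage))
    while 1.0 - coverage ** n < confidence:  # guard against rounding
        n += 1
    return n


def median_ci_ranks(n, confidence):
    """Largest rank k such that [x_(k), x_(n-k+1)] covers the population
    median with probability >= confidence. Returns (k, achieved level),
    or None when n is too small for the requested confidence."""
    best = None
    tail = 0.0
    for k in range(1, n // 2 + 1):
        tail += math.comb(n, k - 1) * 0.5 ** n  # P(Binomial(n, 1/2) <= k - 1)
        achieved = 1.0 - 2.0 * tail
        if achieved < confidence:
            break
        best = (k, achieved)
    return best
```

    For example, `min_sample_size(0.95, 0.95)` gives 59, and for ten ordered observations `median_ci_ranks(10, 0.95)` selects the second smallest and second largest values as confidence limits of the median.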

  13. Bayesian nonparametric hierarchical modeling.

    Science.gov (United States)

    Dunson, David B

    2009-04-01

    In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.

  14. Local Observed-Score Kernel Equating

    Science.gov (United States)

    Wiberg, Marie; van der Linden, Wim J.; von Davier, Alina A.

    2014-01-01

    Three local observed-score kernel equating methods that integrate methods from the local equating and kernel equating frameworks are proposed. The new methods were compared with their earlier counterparts with respect to such measures as bias--as defined by Lord's criterion of equity--and percent relative error. The local kernel item response…

  15. Credit scoring analysis using kernel discriminant

    Science.gov (United States)

    Widiharih, T.; Mukid, M. A.; Mustafid

    2018-05-01

    A credit scoring model is an important tool for reducing the risk of wrong decisions when granting credit facilities to applicants. This paper investigates the performance of the kernel discriminant model in assessing customer credit risk. Kernel discriminant analysis is a non-parametric method, which means that it does not require any assumptions about the probability distribution of the input. The main ingredient is a kernel that allows an efficient computation of the Fisher discriminant. We use several kernels, such as normal, Epanechnikov, biweight, and triweight. The accuracies of the models were compared with each other using data from a financial institution in Indonesia. The results show that the kernel discriminant can be an alternative method for determining who is eligible for a credit loan. For the data we use, a normal kernel is a suitable choice for credit scoring with the kernel discriminant model. Sensitivity and specificity reach 0.5556 and 0.5488, respectively.
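
    The four kernels named in this abstract have standard closed forms. The sketch below plugs them into a density-based discriminant rule (prior times kernel density estimate per class). Whether the paper uses exactly this rule is an assumption; the univariate feature, the fixed bandwidth `h`, and the names `kde` and `classify` are hypothetical choices made for illustration.

```python
import math


def _normal(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def _compact(constant, power):
    """Kernel of the form constant * (1 - u^2)^power on [-1, 1], else 0."""
    def kernel(u):
        return constant * (1.0 - u * u) ** power if abs(u) <= 1.0 else 0.0
    return kernel


KERNELS = {
    "normal": _normal,
    "epanechnikov": _compact(3.0 / 4.0, 1),
    "biweight": _compact(15.0 / 16.0, 2),
    "triweight": _compact(35.0 / 32.0, 3),
}


def kde(x, sample, h, kernel="normal"):
    """Kernel density estimate at x from a univariate sample, bandwidth h."""
    k = KERNELS[kernel]
    return sum(k((x - xi) / h) for xi in sample) / (len(sample) * h)


def classify(x, samples_by_class, h, kernel="normal"):
    """Assign x to the class with the largest prior * estimated density."""
    n_total = sum(len(s) for s in samples_by_class.values())

    def score(label):
        sample = samples_by_class[label]
        return len(sample) / n_total * kde(x, sample, h, kernel)

    return max(samples_by_class, key=score)
```

    With the compactly supported kernels, a point farther than `h` from every training observation has zero estimated density in all classes, so the bandwidth matters more for them than for the normal kernel.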

  16. Nonparametric identification of nonlinear dynamic systems using a synchronisation-based method

    Science.gov (United States)

    Kenderi, Gábor; Fidlin, Alexander

    2014-12-01

    The present study proposes an identification method for highly nonlinear mechanical systems that does not require a priori knowledge of the underlying nonlinearities to reconstruct arbitrary restoring force surfaces between degrees of freedom. This approach is based on the master-slave synchronisation between a dynamic model of the system as the slave and the real system as the master using measurements of the latter. As the model synchronises to the measurements, it becomes an observer of the real system. The optimal observer algorithm in a least-squares sense is given by the Kalman filter. Using the well-known state augmentation technique, the Kalman filter can be turned into a dual state and parameter estimator to identify parameters of a priori characterised nonlinearities. The paper proposes an extension of this technique towards nonparametric identification. A general system model is introduced by describing the restoring forces as bilateral spring-dampers with time-variant coefficients, which are estimated as augmented states. The estimation procedure is followed by an a posteriori statistical analysis to reconstruct noise-free restoring force characteristics using the estimated states and their estimated variances. Observability is provided using only one measured mechanical quantity per degree of freedom, which makes this approach less demanding in the number of necessary measurement signals compared with truly nonparametric solutions, which typically require displacement, velocity and acceleration signals. Additionally, due to the statistical rigour of the procedure, it successfully addresses signals corrupted by significant measurement noise. In the present paper, the method is described in detail, which is followed by numerical examples of one degree of freedom (1DoF) and 2DoF mechanical systems with strong nonlinearities of vibro-impact type to demonstrate the effectiveness of the proposed technique.

  17. Effects of dating errors on nonparametric trend analyses of speleothem time series

    Directory of Open Access Journals (Sweden)

    M. Mudelsee

    2012-10-01

    Full Text Available A fundamental problem in paleoclimatology is to take fully into account the various error sources when examining proxy records with quantitative methods of statistical time series analysis. Records from dated climate archives such as speleothems add extra uncertainty from the age determination to the other sources that consist in measurement and proxy errors. This paper examines three stalagmite time series of oxygen isotopic composition (δ¹⁸O) from two caves in western Germany, the series AH-1 from the Atta Cave and the series Bu1 and Bu4 from the Bunker Cave. These records carry regional information about past changes in winter precipitation and temperature. U/Th and radiocarbon dating reveals that they cover the later part of the Holocene, the past 8.6 thousand years (ka). We analyse centennial- to millennial-scale climate trends by means of nonparametric Gasser–Müller kernel regression. Error bands around fitted trend curves are determined by combining (1) block bootstrap resampling to preserve noise properties (shape, autocorrelation) of the δ¹⁸O residuals and (2) timescale simulations (models StalAge and iscam). The timescale error influences on centennial- to millennial-scale trend estimation are not excessively large. We find a "mid-Holocene climate double-swing", from warm to cold to warm winter conditions (6.5 ka to 6.0 ka to 5.1 ka), with warm–cold amplitudes of around 0.5‰ δ¹⁸O; this finding is documented by all three records with high confidence. We also quantify the Medieval Warm Period (MWP), the Little Ice Age (LIA) and the current warmth. Our analyses cannot unequivocally support the conclusion that current regional winter climate is warmer than that during the MWP.

  18. Quantized kernel least mean square algorithm.

    Science.gov (United States)

    Chen, Badong; Zhao, Songlin; Zhu, Pingping; Príncipe, José C

    2012-01-01

    In this paper, we propose a quantization approach, as an alternative to sparsification, to curb the growth of the radial basis function structure in kernel adaptive filtering. The basic idea behind this method is to quantize and hence compress the input (or feature) space. Different from sparsification, the new approach uses the "redundant" data to update the coefficient of the closest center. In particular, a quantized kernel least mean square (QKLMS) algorithm is developed, which is based on a simple online vector quantization method. The analytical study of the mean square convergence has been carried out. The energy conservation relation for QKLMS is established, and on this basis we arrive at a sufficient condition for mean square convergence, and a lower and upper bound on the theoretical value of the steady-state excess mean square error. Static function estimation and short-term chaotic time-series prediction examples are presented to demonstrate the excellent performance.
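
    The update rule described in this abstract (redundant inputs adjust the coefficient of the closest center instead of adding a new one) can be sketched as follows. This is a minimal illustration assuming a Gaussian kernel and Euclidean distance in the input space; the class name `QKLMS` and the parameter names `step_size`, `quant_radius`, and `kernel_width` are hypothetical and not taken from the paper.

```python
import math


class QKLMS:
    """Sketch of a quantized kernel least mean square filter."""

    def __init__(self, step_size=0.5, quant_radius=0.1, kernel_width=1.0):
        self.step_size = step_size
        self.quant_radius = quant_radius
        self.kernel_width = kernel_width
        self.centers = []  # stored input vectors (the codebook)
        self.coeffs = []   # one expansion coefficient per center

    def _kernel(self, a, b):
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        return math.exp(-d2 / (2.0 * self.kernel_width ** 2))

    def predict(self, u):
        return sum(a * self._kernel(c, u)
                   for a, c in zip(self.coeffs, self.centers))

    def update(self, u, d):
        """Process one (input, desired output) pair; return the prior error."""
        error = d - self.predict(u)
        if self.centers:
            dists = [math.dist(c, u) for c in self.centers]
            j = min(range(len(dists)), key=dists.__getitem__)
            if dists[j] <= self.quant_radius:
                # Redundant input: merge its update into the closest center.
                self.coeffs[j] += self.step_size * error
                return error
        # Novel input: grow the network by one center.
        self.centers.append(tuple(u))
        self.coeffs.append(self.step_size * error)
        return error
```

    Because a new center is added only when the nearest existing one is farther than `quant_radius`, the number of centers is bounded by how many points that far apart fit into the input domain, regardless of how many samples are processed.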

  19. A comparison of dependence function estimators in multivariate extremes

    KAUST Repository

    Vettori, Sabrina; Huser, Raphaël; Genton, Marc G.

    2017-01-01

    Various nonparametric and parametric estimators of extremal dependence have been proposed in the literature. Nonparametric methods commonly suffer from the curse of dimensionality and have been mostly implemented in extreme-value studies up to three dimensions, whereas parametric models can tackle higher-dimensional settings. In this paper, we assess, through a vast and systematic simulation study, the performance of classical and recently proposed estimators in multivariate settings. In particular, we first investigate the performance of nonparametric methods and then compare them with classical parametric approaches under symmetric and asymmetric dependence structures within the commonly used logistic family. We also explore two different ways to make nonparametric estimators satisfy the necessary dependence function shape constraints, finding a general improvement in estimator performance either (i) by substituting the estimator with its greatest convex minorant, developing a computational tool to implement this method for dimensions $D \ge 2$, or (ii) by projecting the estimator onto a subspace of dependence functions satisfying such constraints and taking advantage of Bernstein–Bézier polynomials. Implementing the convex minorant method leads to better estimator performance as the dimensionality increases.
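
    To illustrate option (i), the greatest convex minorant of a function known only on a one-dimensional grid can be computed as the lower convex hull of the grid points. The sketch below covers only that simplest setting and is not the authors' computational tool for general dimensions; the function name `greatest_convex_minorant` and the input layout are assumptions made for illustration.

```python
def greatest_convex_minorant(t, a):
    """Greatest convex minorant of the points (t[i], a[i]), evaluated at t.

    `t` must be strictly increasing. The lower convex hull is built with a
    monotone-chain scan, then linearly interpolated back onto the grid."""
    hull = []  # indices of the lower-hull vertices, left to right
    for i in range(len(t)):
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            cross = ((t[i1] - t[i0]) * (a[i] - a[i0])
                     - (a[i1] - a[i0]) * (t[i] - t[i0]))
            if cross <= 0:  # vertex i1 lies on or above the chord i0 -> i
                hull.pop()
            else:
                break
        hull.append(i)

    g = list(a)
    for j0, j1 in zip(hull, hull[1:]):
        for i in range(j0, j1 + 1):
            w = (t[i] - t[j0]) / (t[j1] - t[j0])
            g[i] = (1.0 - w) * a[j0] + w * a[j1]
    return g
```

    The result never exceeds the input values and is convex on the grid, so a raw estimate with a local bump is flattened down to the chord beneath it.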

  20. A comparison of dependence function estimators in multivariate extremes

    KAUST Repository

    Vettori, Sabrina

    2017-05-11

    Various nonparametric and parametric estimators of extremal dependence have been proposed in the literature. Nonparametric methods commonly suffer from the curse of dimensionality and have been mostly implemented in extreme-value studies up to three dimensions, whereas parametric models can tackle higher-dimensional settings. In this paper, we assess, through a vast and systematic simulation study, the performance of classical and recently proposed estimators in multivariate settings. In particular, we first investigate the performance of nonparametric methods and then compare them with classical parametric approaches under symmetric and asymmetric dependence structures within the commonly used logistic family. We also explore two different ways to make nonparametric estimators satisfy the necessary dependence function shape constraints, finding a general improvement in estimator performance either (i) by substituting the estimator with its greatest convex minorant, developing a computational tool to implement this method for dimensions $D \ge 2$, or (ii) by projecting the estimator onto a subspace of dependence functions satisfying such constraints and taking advantage of Bernstein–Bézier polynomials. Implementing the convex minorant method leads to better estimator performance as the dimensionality increases.

  1. Bayesian Frequency Domain Identification of LTI Systems with OBFs kernels

    NARCIS (Netherlands)

    Darwish, M.A.H.; Lataire, J.P.G.; Tóth, R.

    2017-01-01

    Regularised Frequency Response Function (FRF) estimation based on Gaussian process regression formulated directly in the frequency-domain has been introduced recently. The underlying approach largely depends on the utilised kernel function, which encodes the relevant prior knowledge on the system

  2. Kernel parameter dependence in spatial factor analysis

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg

    2010-01-01

    kernel PCA. Shawe-Taylor and Cristianini [4] is an excellent reference for kernel methods in general. Bishop [5] and Press et al. [6] describe kernel methods among many other subjects. The kernel version of PCA handles nonlinearities by implicitly transforming data into high (even infinite) dimensional...... feature space via the kernel function and then performing a linear analysis in that space. In this paper we shall apply a kernel version of maximum autocorrelation factor (MAF) [7, 8] analysis to irregularly sampled stream sediment geochemistry data from South Greenland and illustrate the dependence...... of the kernel width. The 2,097 samples, each covering on average 5 km², are analyzed chemically for the content of 41 elements....

  3. Spatiotemporal characteristics of elderly population's traffic accidents in Seoul using space-time cube and space-time kernel density estimation.

    Science.gov (United States)

    Kang, Youngok; Cho, Nahye; Son, Serin

    2018-01-01

    The purpose of this study is to analyze how the spatiotemporal characteristics of traffic accidents involving the elderly population in Seoul are changing by time period. We applied kernel density estimation and hotspot analyses to analyze the spatial characteristics of elderly people's traffic accidents, and the space-time cube, emerging hotspot, and space-time kernel density estimation analyses to analyze the spatiotemporal characteristics. In addition, we analyzed elderly people's traffic accidents by dividing cases into those in which the drivers were elderly people and those in which elderly people were victims of traffic accidents, and used the traffic accidents data in Seoul for 2013 for analysis. The main findings were as follows: (1) the hotspots for elderly people's traffic accidents differed according to whether they were drivers or victims. (2) The hourly analysis showed that the hotspots for elderly drivers' traffic accidents are in specific areas north of the Han River during the period from morning to afternoon, whereas the hotspots for elderly victims are distributed over a wide area from daytime to evening. (3) Monthly analysis showed that the hotspots are weak during winter and summer, whereas they are strong in the hiking and climbing areas in Seoul during spring and fall. Further, elderly victims' hotspots are more sporadic than elderly drivers' hotspots. (4) The analysis for the entire period of 2013 indicates that traffic accidents involving elderly people are increasing in specific areas on the north side of the Han River. We expect the results of this study to aid in reducing the number of traffic accidents involving elderly people in the future.

  4. Characterisation and final disposal behaviour of thoria-based fuel kernels in aqueous phases

    International Nuclear Information System (INIS)

    Titov, M.

    2005-08-01

    Two high-temperature reactors (AVR and THTR) operated in Germany have produced about 1 million spent fuel elements. The nuclear fuel in these reactors consists mainly of thorium-uranium mixed oxides, but pure uranium dioxide and carbide fuels were also tested. One of the possible solutions for utilising spent HTR fuel is direct disposal in deep geological formations. Under such circumstances, the properties of fuel kernels, and especially their leaching behaviour in aqueous phases, have to be investigated for safety assessments of the final repository. In the present work, unirradiated ThO2, (Th0.906,U0.094)O2, (Th0.834,U0.166)O2 and UO2 fuel kernels were investigated. The composition, crystal structure and surface of the kernels were investigated by traditional methods. Furthermore, a new method was developed for testing the mechanical properties of ceramic kernels. The method was successfully used for the examination of mechanical properties of oxide kernels and for monitoring their evolution during contact with aqueous phases. The leaching behaviour of thoria-based oxide kernels and powders was investigated in repository-relevant salt solutions, as well as in artificial leachates. The influence of different experimental parameters on the kernel leaching stability was investigated. It was shown that thoria-based fuel kernels possess high chemical stability and are indifferent to the presence of oxidative and radiolytic species in solution. The dissolution rate of thoria-based materials is typically several orders of magnitude lower than that of conventional UO2 fuel kernels. The lifetime of a single intact (Th,U)O2 kernel under the aggressive conditions of a salt repository was estimated as about one hundred thousand years. The importance of grain boundary quality for the leaching stability was demonstrated. Numerical Monte Carlo simulations were performed in order to explain the results of leaching experiments. (orig.)

  5. An integrated approach to estimate storage reliability with initial failures based on E-Bayesian estimates

    International Nuclear Information System (INIS)

    Zhang, Yongjin; Zhao, Ming; Zhang, Shitao; Wang, Jiamei; Zhang, Yanjun

    2017-01-01

    Storage reliability, which measures the ability of products in a dormant state to keep their required functions, is studied in this paper. For certain types of products, storage reliability may not be 100% at the beginning of storage, unlike operational reliability: there may be initial failures, which are normally neglected in models of storage reliability. In this paper, a new integrated technique is proposed to estimate and predict the storage reliability of products with possible initial failures. It combines a non-parametric measure based on the E-Bayesian estimates of current failure probabilities with a parametric measure based on the exponential reliability function: the non-parametric method is used to estimate the number of failed products and the reliability at each testing time, and the parametric method is used to estimate the initial reliability and the failure rate of the stored product. The proposed method takes into consideration that the reliability test data of storage products, including those unexamined before and during the storage process, are available for providing more accurate estimates of both the initial failure probability and the storage failure probability. When storage reliability must be predicted, which is the main concern in this field, the non-parametric estimates of failure numbers can be fed into the parametric models of the failure process in storage. The assessment and prediction method for storage reliability is presented for the case of exponential models. Finally, a numerical example is given to illustrate the method. Furthermore, a detailed comparison between the proposed and traditional methods is carried out to examine the rationality of the assessment and prediction of storage reliability. The results should be useful for planning a storage environment, decision-making concerning the maximum length of storage, and identifying the production quality.
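
    The parametric step can be illustrated with an exponential reliability function that allows for initial failures, written here as R(t) = R0 * exp(-lambda * t). This functional form, the log-linear least-squares fit, and the function name `fit_exponential_reliability` are assumptions made for this sketch. The paper's E-Bayesian estimation of the reliabilities at each testing time is not reproduced; those estimates are simply taken as inputs.

```python
import math


def fit_exponential_reliability(times, reliabilities):
    """Fit R(t) = R0 * exp(-lam * t) by ordinary least squares on log R.

    `reliabilities` are reliability estimates in (0, 1] at the testing
    times. Returns (R0, lam): initial reliability and failure rate."""
    n = len(times)
    logs = [math.log(r) for r in reliabilities]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope


def predict_reliability(r0, lam, t):
    """Predicted storage reliability at time t under the fitted model."""
    return r0 * math.exp(-lam * t)
```

    A fitted R0 below 1 corresponds to a nonzero initial failure probability 1 - R0, which is the quantity that models assuming perfect reliability at the start of storage leave out.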

  6. A Fast Multiple-Kernel Method With Applications to Detect Gene-Environment Interaction.

    Science.gov (United States)

    Marceau, Rachel; Lu, Wenbin; Holloway, Shannon; Sale, Michèle M; Worrall, Bradford B; Williams, Stephen R; Hsu, Fang-Chi; Tzeng, Jung-Ying

    2015-09-01

    Kernel machine (KM) models are a powerful tool for exploring associations between sets of genetic variants and complex traits. Although most KM methods use a single kernel function to assess the marginal effect of a variable set, KM analyses involving multiple kernels have become increasingly popular. Multikernel analysis allows researchers to study more complex problems, such as assessing gene-gene or gene-environment interactions, incorporating variance-component based methods for population substructure into rare-variant association testing, and assessing the conditional effects of a variable set adjusting for other variable sets. The KM framework is robust, powerful, and provides efficient dimension reduction for multifactor analyses, but requires the estimation of high dimensional nuisance parameters. Traditional estimation techniques, including regularization and the "expectation-maximization (EM)" algorithm, have a large computational cost and are not scalable to large sample sizes needed for rare variant analysis. Therefore, under the context of gene-environment interaction, we propose a computationally efficient and statistically rigorous "fastKM" algorithm for multikernel analysis that is based on a low-rank approximation to the nuisance effect kernel matrices. Our algorithm is applicable to various trait types (e.g., continuous, binary, and survival traits) and can be implemented using any existing single-kernel analysis software. Through extensive simulation studies, we show that our algorithm has similar performance to an EM-based KM approach for quantitative traits while running much faster. We also apply our method to the Vitamin Intervention for Stroke Prevention (VISP) clinical trial, examining gene-by-vitamin effects on recurrent stroke risk and gene-by-age effects on change in homocysteine level. © 2015 WILEY PERIODICALS, INC.

  7. Two-Phase Iteration for Value Function Approximation and Hyperparameter Optimization in Gaussian-Kernel-Based Adaptive Critic Design

    Directory of Open Access Journals (Sweden)

    Xin Chen

    2015-01-01

    Full Text Available Adaptive Dynamic Programming (ADP) with critic-actor architecture is an effective way to perform online learning control. To avoid the subjectivity in the design of a neural network that serves as a critic network, kernel-based adaptive critic design (ACD) was developed recently. There are two essential issues for a static kernel-based model: how to determine proper hyperparameters in advance and how to select the right samples to describe the value function. Both rely on the assessment of sample values. Based on theoretical analysis, this paper presents a two-phase simultaneous learning method for a Gaussian-kernel-based critic network. It is able to estimate the values of samples without revisiting them infinitely, and the hyperparameters of the kernel model are optimized simultaneously. Based on the estimated sample values, the sample set can be refined by adding alternatives or deleting redundancies. Combining this critic design with an actor network, we present a Gaussian-kernel-based Adaptive Dynamic Programming (GK-ADP) approach. Simulations are used to verify its feasibility, particularly the necessity of two-phase learning, the convergence characteristics, and the improvement of the system performance by using a varying sample set.

  8. Bayesian Nonparametric Longitudinal Data Analysis.

    Science.gov (United States)

    Quintana, Fernando A; Johnson, Wesley O; Waetjen, Elaine; Gold, Ellen

    2016-01-01

    Practical Bayesian nonparametric methods have been developed across a wide variety of contexts. Here, we develop a novel statistical model that generalizes standard mixed models for longitudinal data that include flexible mean functions as well as combined compound symmetry (CS) and autoregressive (AR) covariance structures. AR structure is often specified through the use of a Gaussian process (GP) with covariance functions that allow longitudinal data to be more correlated if they are observed closer in time than if they are observed farther apart. We allow for AR structure by considering a broader class of models that incorporates a Dirichlet Process Mixture (DPM) over the covariance parameters of the GP. We are able to take advantage of modern Bayesian statistical methods in making full predictive inferences and inferences about characteristics of longitudinal profiles and their differences across covariate combinations. We also take advantage of the generality of our model, which provides for estimation of a variety of covariance structures. We observe that models that fail to incorporate CS or AR structure can result in very poor estimation of a covariance or correlation matrix. In our illustration using hormone data observed on women through the menopausal transition, biology dictates the use of a generalized family of sigmoid functions as a model for time trends across subpopulation categories.

  9. An adaptive distance measure for use with nonparametric models

    International Nuclear Information System (INIS)

    Garvey, D. R.; Hines, J. W.

    2006-01-01

    Distance measures perform a critical task in nonparametric, locally weighted regression. Locally weighted regression (LWR) models are a form of 'lazy learning' which construct a local model 'on the fly' by comparing a query vector to historical, exemplar vectors according to a three step process. First, the distance of the query vector to each of the exemplar vectors is calculated. Next, these distances are passed to a kernel function, which converts the distances to similarities or weights. Finally, the model output or response is calculated by performing locally weighted polynomial regression. To date, traditional distance measures, such as the Euclidean, weighted Euclidean, and L1-norm have been used as the first step in the prediction process. Since these measures do not take into consideration sensor failures and drift, they are inherently ill-suited for application to 'real world' systems. This paper describes one such LWR model, namely auto associative kernel regression (AAKR), and describes a new, Adaptive Euclidean distance measure that can be used to dynamically compensate for faulty sensor inputs. In this new distance measure, the query observations that lie outside of the training range (i.e. outside the minimum and maximum input exemplars) are dropped from the distance calculation. This allows for the distance calculation to be robust to sensor drifts and failures, in addition to providing a method for managing inputs that exceed the training range. In this paper, AAKR models using the standard and Adaptive Euclidean distance are developed and compared for the pressure system of an operating nuclear power plant. It is shown that using the standard Euclidean distance for data with failed inputs, significant errors in the AAKR predictions can result. By using the Adaptive Euclidean distance it is shown that high fidelity predictions are possible, in spite of the input failure. In fact, it is shown that with the Adaptive Euclidean distance prediction
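
    The three-step process described in this record (distance, kernel weighting, locally weighted regression) can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function name `aakr_predict`, the Gaussian kernel, the `bandwidth` parameter, and the use of a plain weighted average (a degree-zero local polynomial) are all assumptions. The `adaptive` flag follows the abstract's description by dropping query inputs that fall outside the training range from the distance calculation.

```python
import math

def aakr_predict(query, exemplars, bandwidth=1.0, adaptive=True):
    """Auto-associative kernel regression: return a corrected copy of `query`."""
    n_vars = len(query)
    lo = [min(row[j] for row in exemplars) for j in range(n_vars)]
    hi = [max(row[j] for row in exemplars) for j in range(n_vars)]
    # Adaptive variant: only inputs inside the training range enter the distance.
    used = [j for j in range(n_vars)
            if not adaptive or lo[j] <= query[j] <= hi[j]]

    # Step 1: Euclidean distance from the query to each exemplar.
    dists = [math.sqrt(sum((query[j] - row[j]) ** 2 for j in used))
             for row in exemplars]
    # Step 2: a Gaussian kernel converts distances into weights.
    weights = [math.exp(-d * d / (2.0 * bandwidth ** 2)) for d in dists]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase bandwidth")
    # Step 3: locally weighted average of the exemplars.
    return [sum(w * row[j] for w, row in zip(weights, exemplars)) / total
            for j in range(n_vars)]

# Hypothetical two-sensor data in which the second reading is twice the first.
exemplars = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0), (5.0, 10.0)]
query = (3.0, 50.0)  # the second input has failed high
standard = aakr_predict(query, exemplars, bandwidth=10.0, adaptive=False)
adaptive = aakr_predict(query, exemplars, bandwidth=10.0, adaptive=True)
```

    With the failed input excluded, the weights depend only on the healthy sensor, so the adaptive estimate stays centred on the matching exemplar, while the standard distance pulls the estimate toward the largest exemplars.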

  10. Multiple Kernel Learning with Data Augmentation

    Science.gov (United States)

    2016-11-22

    JMLR: Workshop and Conference Proceedings 63:49–64, 2016 ACML 2016 Multiple Kernel Learning with Data Augmentation Khanh Nguyen nkhanh@deakin.edu.au...University, Australia Editors: Robert J. Durrant and Kee-Eung Kim Abstract The motivations of multiple kernel learning (MKL) approach are to increase... kernel expressiveness capacity and to avoid the expensive grid search over a wide spectrum of kernels. A large amount of work has been proposed to

  11. OS X and iOS Kernel Programming

    CERN Document Server

    Halvorsen, Ole Henry

    2011-01-01

    OS X and iOS Kernel Programming combines essential operating system and kernel architecture knowledge with a highly practical approach that will help you write effective kernel-level code. You'll learn fundamental concepts such as memory management and thread synchronization, as well as the I/O Kit framework. You'll also learn how to write your own kernel-level extensions, such as device drivers for USB and Thunderbolt devices, including networking, storage and audio drivers. OS X and iOS Kernel Programming provides an incisive and complete introduction to the XNU kernel, which runs iPhones, i

  12. Model selection for Gaussian kernel PCA denoising

    DEFF Research Database (Denmark)

    Jørgensen, Kasper Winther; Hansen, Lars Kai

    2012-01-01

    We propose kernel Parallel Analysis (kPA) for automatic kernel scale and model order selection in Gaussian kernel PCA. Parallel Analysis [1] is based on a permutation test for covariance and has previously been applied for model order selection in linear PCA; here we augment the procedure to also...... tune the Gaussian kernel scale of radial basis function based kernel PCA. We evaluate kPA for denoising of simulated data and the US Postal data set of handwritten digits. We find that kPA outperforms other heuristics to choose the model order and kernel scale in terms of signal-to-noise ratio (SNR...

  13. Online Capacity Estimation of Lithium-Ion Batteries Based on Novel Feature Extraction and Adaptive Multi-Kernel Relevance Vector Machine

    Directory of Open Access Journals (Sweden)

    Yang Zhang

    2015-11-01

    Full Text Available Prognostics is necessary to ensure the reliability and safety of lithium-ion batteries for hybrid electric vehicles or satellites. This process can be achieved by capacity estimation, which is a direct fading indicator for assessing the state of health of a battery. However, the capacity of a lithium-ion battery onboard is difficult to monitor. This paper presents a data-driven approach for online capacity estimation. First, six novel features are extracted from cyclic charge/discharge cycles and used as indirect health indicators. An adaptive multi-kernel relevance vector machine (MKRVM), with its optimal parameters determined by an accelerated particle swarm optimization algorithm, is used to characterize the relationship between extracted features and battery capacity. The overall estimation process comprises offline and online stages. A supervised learning step in the offline stage is established for model verification to ensure the generalizability of MKRVM for online application. Cross-validation is further conducted to validate the performance of the proposed model. Experiment and comparison results show the effectiveness, accuracy, efficiency, and robustness of the proposed approach for online capacity estimation of lithium-ion batteries.

  14. Paramecium: An Extensible Object-Based Kernel

    NARCIS (Netherlands)

    van Doorn, L.; Homburg, P.; Tanenbaum, A.S.

    1995-01-01

    In this paper we describe the design of an extensible kernel, called Paramecium. This kernel uses an object-based software architecture which together with instance naming, late binding and explicit overrides enables easy reconfiguration. Determining which components reside in the kernel protection

  15. Theory of reproducing kernels and applications

    CERN Document Server

    Saitoh, Saburou

    2016-01-01

    This book provides a large extension of the general theory of reproducing kernels published by N. Aronszajn in 1950, with many concrete applications. In Chapter 1, many concrete reproducing kernels are first introduced with detailed information. Chapter 2 presents a general and global theory of reproducing kernels with basic applications in a self-contained way. Many fundamental operations among reproducing kernel Hilbert spaces are dealt with. Chapter 2 is the heart of this book. Chapter 3 is devoted to the Tikhonov regularization using the theory of reproducing kernels with applications to numerical and practical solutions of bounded linear operator equations. In Chapter 4, the numerical real inversion formulas of the Laplace transform are presented by applying the Tikhonov regularization, where the reproducing kernels play a key role in the results. Chapter 5 deals with ordinary differential equations; Chapter 6 includes many concrete results for various fundamental partial differential equations. In Chapt...

  16. Nonparametric identification of copula structures

    KAUST Repository

    Li, Bo; Genton, Marc G.

    2013-01-01

    We propose a unified framework for testing a variety of assumptions commonly made about the structure of copulas, including symmetry, radial symmetry, joint symmetry, associativity and Archimedeanity, and max-stability. Our test is nonparametric

  17. Kernels for structured data

    CERN Document Server

    Gärtner, Thomas

    2009-01-01

    This book provides a unique treatment of an important area of machine learning and answers the question of how kernel methods can be applied to structured data. Kernel methods are a class of state-of-the-art learning algorithms that exhibit excellent learning results in several application domains. Originally, kernel methods were developed with data in mind that can easily be embedded in a Euclidean vector space. Much real-world data does not have this property but is inherently structured. An example of such data, often consulted in the book, is the (2D) graph structure of molecules formed by

  18. Simple nonparametric checks for model data fit in CAT

    NARCIS (Netherlands)

    Meijer, R.R.

    2005-01-01

    In this paper, the usefulness of several nonparametric checks is discussed in a computerized adaptive testing (CAT) context. Although there is no tradition of nonparametric scalability in CAT, it can be argued that scalability checks can be useful to investigate, for example, the quality of item

  19. 7 CFR 981.401 - Adjusted kernel weight.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Adjusted kernel weight. 981.401 Section 981.401... Administrative Rules and Regulations § 981.401 Adjusted kernel weight. (a) Definition. Adjusted kernel weight... kernels in excess of five percent; less shells, if applicable; less processing loss of one percent for...

  20. Development of a Modified Kernel Regression Model for a Robust Signal Reconstruction

    Energy Technology Data Exchange (ETDEWEB)

    Ahmed, Ibrahim; Heo, Gyunyoung [Kyung Hee University, Yongin (Korea, Republic of)

    2016-10-15

    The demand for robust and resilient performance has led to the use of online-monitoring techniques to monitor the process parameters and signal validation. On-line monitoring and signal validation are two important terms in process and equipment monitoring. These techniques are automated methods of monitoring instrument performance while the plant is operating. To implement these techniques, several empirical models are used. One of these models is the nonparametric regression model, otherwise known as kernel regression (KR). Unlike parametric models, KR is an algorithmic estimation procedure which assumes no significant parameters, and it needs no training process after its development when new observations are prepared, which is good for a system whose characteristics change due to ageing. Although KR performs excellently when applied to steady-state or normal operating data, it has limitations with time-varying data that contain several repetitions of the same signal, especially if those signals are used to infer other signals. Conventional KR has difficulty in correctly estimating the dependent variable when time-varying data with repeated values are used to estimate it, especially in signal validation and monitoring. Therefore, we present in this work a modified KR that can resolve this issue and is also feasible in the time domain. Data are first transformed prior to the Euclidean distance evaluation, considering their slopes/changes with respect to time. The performance of the developed model is evaluated and compared with that of conventional KR using both lab experimental data and real-time data from CNS provided by KAERI. The results show that the proposed model, having demonstrated higher accuracy than conventional KR, is capable of resolving the identified limitation of conventional KR. We also discovered that there is still need to further
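
    The limitation described in this record can be reproduced with a small sketch. The code below is an illustration under stated assumptions, not the authors' transformation: it uses a Nadaraya-Watson estimator with a Gaussian kernel as the conventional KR, and the hypothetical helper `with_slopes` stands in for the slope-based transform by pairing each sample with its change since the previous sample. The signal values and the bandwidth are made up for the example.

```python
import math

def with_slopes(signal):
    """Pair each sample with its change since the previous sample."""
    slopes = [0.0] + [b - a for a, b in zip(signal, signal[1:])]
    return list(zip(signal, slopes))

def kernel_regression(query, inputs, outputs, bandwidth=0.5):
    """Nadaraya-Watson estimate with a Gaussian kernel on Euclidean distance."""
    weights = []
    for row in inputs:
        d2 = sum((q - r) ** 2 for q, r in zip(query, row))
        weights.append(math.exp(-d2 / (2.0 * bandwidth ** 2)))
    return sum(w * y for w, y in zip(weights, outputs)) / sum(weights)

# Hypothetical time-varying predictor that revisits the same values,
# while the response keeps increasing.
x = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Conventional KR sees only the value x = 1 and cannot tell the rising
# sample (y = 1) from the falling one (y = 5).
conventional = kernel_regression((1.0,), [(v,) for v in x], y)

# With the slope attached, a rising query at x = 1 matches the rising sample.
modified = kernel_regression((1.0, 1.0), with_slopes(x), y)
```

    Here the conventional estimate averages the two repeated occurrences, whereas the slope-augmented estimate stays close to the response recorded on the rising part of the signal.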

  1. Testing Infrastructure for Operating System Kernel Development

    DEFF Research Database (Denmark)

    Walter, Maxwell; Karlsson, Sven

    2014-01-01

    Testing is an important part of system development, and to test effectively we require knowledge of the internal state of the system under test. Testing an operating system kernel is a challenge as it is the operating system that typically provides access to this internal state information. Multi......-core kernels pose an even greater challenge due to concurrency and their shared kernel state. In this paper, we present a testing framework that addresses these challenges by running the operating system in a virtual machine, and using virtual machine introspection to both communicate with the kernel...... and obtain information about the system. We have also developed an in-kernel testing API that we can use to develop a suite of unit tests in the kernel. We are using our framework for the development of our own multi-core research kernel....

  2. Analysing the length of care episode after hip fracture: a nonparametric and a parametric Bayesian approach.

    Science.gov (United States)

    Riihimäki, Jaakko; Sund, Reijo; Vehtari, Aki

    2010-06-01

    Effective utilisation of limited resources is a challenge for health care providers. Accurate and relevant information extracted from the length of stay distributions is useful for management purposes. Patient care episodes can be reconstructed from the comprehensive health registers, and in this paper we develop a Bayesian approach to analyse the length of care episode after a fractured hip. We model the large scale data with a flexible nonparametric multilayer perceptron network and with a parametric Weibull mixture model. To assess the performances of the models, we estimate expected utilities using predictive density as a utility measure. Since the model parameters cannot be directly compared, we focus on observables, and estimate the relevances of patient explanatory variables in predicting the length of stay. To demonstrate how the use of the nonparametric flexible model is advantageous for this complex health care data, we also study joint effects of variables in predictions, and visualise nonlinearities and interactions found in the data.

  3. 7 CFR 51.1403 - Kernel color classification.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Kernel color classification. 51.1403 Section 51.1403... STANDARDS) United States Standards for Grades of Pecans in the Shell 1 Kernel Color Classification § 51.1403 Kernel color classification. (a) The skin color of pecan kernels may be described in terms of the color...

  4. Recent Advances and Trends in Nonparametric Statistics

    CERN Document Server

    Akritas, MG

    2003-01-01

    The advent of high-speed, affordable computers in the last two decades has given a new boost to the nonparametric way of thinking. Classical nonparametric procedures, such as function smoothing, suddenly lost their abstract flavour as they became practically implementable. In addition, many previously unthinkable possibilities became mainstream; prime examples include the bootstrap and resampling methods, wavelets and nonlinear smoothers, graphical methods, data mining, bioinformatics, as well as the more recent algorithmic approaches such as bagging and boosting. This volume is a collection o

  5. A nonparametric spatial scan statistic for continuous data.

    Science.gov (United States)

    Jung, Inkyung; Cho, Ho Jin

    2015-10-20

    Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compare the performance of the method with parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases under consideration in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
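
    The core idea of scanning with a rank-based statistic can be sketched as follows. This is a simplified illustration, not the authors' procedure: the function names are hypothetical, locations are assumed to lie on a line so that candidate windows are runs of neighbouring indices, the variance omits the correction for ties, and the Monte Carlo step usually used to assess the significance of the best window is left out.

```python
import math

def rank_sum_z(values, inside):
    """Standardized Wilcoxon rank-sum statistic for the observations in `inside`."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0  # average rank shared by tied values
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    m = len(inside)  # must satisfy 0 < m < n
    w = sum(ranks[i] for i in inside)
    mean = m * (n + 1) / 2.0
    var = m * (n - m) * (n + 1) / 12.0  # no tie correction in this sketch
    return (w - mean) / math.sqrt(var)

def most_likely_cluster(values, windows):
    """Return the candidate window with the largest standardized rank sum."""
    return max(windows, key=lambda win: rank_sum_z(values, win))

# Hypothetical measurements at eight locations along a line.
values = [1.0, 2.0, 1.5, 9.0, 8.0, 10.0, 2.5, 1.2]
windows = [list(range(s, s + k))
           for k in (2, 3) for s in range(len(values) - k + 1)]
best = most_likely_cluster(values, windows)
```

    Because only ranks enter the statistic, a single extreme value inside a window cannot dominate the result, which is the property that makes the approach attractive for skewed or heavy-tailed data.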

  6. Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification.

    Science.gov (United States)

    Fan, Jianqing; Feng, Yang; Jiang, Jiancheng; Tong, Xin

    We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing.
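
    The feature augmentation step described in this record can be sketched as below. This is a minimal illustration under stated assumptions and not the FANS implementation: the Gaussian kernel density estimator, the fixed `bandwidth`, the `floor` used to avoid taking the logarithm of zero, and the function names are all hypothetical. The penalized logistic regression that takes the transformed features as input is omitted.

```python
import math

def kde(sample, bandwidth):
    """Return a Gaussian kernel density estimate built from `sample`."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                   for s in sample) / norm

    return density

def fit_log_ratio_features(X, y, bandwidth=0.5, floor=1e-12):
    """Fit per-feature class densities; return a map from x to log density ratios."""
    n_features = len(X[0])
    pairs = []
    for j in range(n_features):
        f1 = kde([row[j] for row, label in zip(X, y) if label == 1], bandwidth)
        f0 = kde([row[j] for row, label in zip(X, y) if label == 0], bandwidth)
        pairs.append((f1, f0))

    def transform(x):
        # One augmented feature per original feature: log f1(x_j) - log f0(x_j).
        return [math.log(max(f1(v), floor)) - math.log(max(f0(v), floor))
                for v, (f1, f0) in zip(x, pairs)]

    return transform

# Hypothetical data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 1.0], [0.2, 2.0], [-0.1, 1.5],
     [3.0, 1.0], [3.2, 2.0], [2.9, 1.5]]
y = [0, 0, 0, 1, 1, 1]
transform = fit_log_ratio_features(X, y)
z_pos = transform([3.0, 1.5])
z_neg = transform([0.0, 1.5])
```

    Summing the transformed features with unit weights would give a Naive Bayes style score; fitting a penalized logistic regression on them instead lets the weights be learned and uninformative features be dropped.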

  7. A nonparametric statistical method for determination of a confidence interval for the mean of a set of results obtained in a laboratory intercomparison

    International Nuclear Information System (INIS)

    Veglia, A.

    1981-08-01

    In cases where sets of data are obviously not normally distributed, the application of a nonparametric method for the estimation of a confidence interval for the mean seems to be more suitable than some other methods because such a method requires few assumptions about the population of data. A two-step statistical method is proposed which can be applied to any set of analytical results: elimination of outliers by a nonparametric method based on Tchebycheff's inequality, and determination of a confidence interval for the mean by a nonparametric method based on the binomial distribution. The method is appropriate only for samples of size n>=10.

  8. Performance Recognition for Sulphur Flotation Process Based on Froth Texture Unit Distribution

    Directory of Open Access Journals (Sweden)

    Mingfang He

    2013-01-01

    Full Text Available As an important indicator of flotation performance, froth texture is believed to be related to the operational condition in the sulphur flotation process. A novel fault detection method based on froth texture unit distribution (TUD) is proposed to recognize the fault condition of sulphur flotation in real time. The froth texture unit number is calculated based on texture spectrum, and the probability density function (PDF) of froth texture unit number is defined as texture unit distribution, which can describe the actual textural feature more accurately than the grey level dependence matrix approach. As the type of the froth TUD is unknown, a nonparametric kernel estimation method based on the fixed kernel basis is proposed, which overcomes the difficulty that different TUDs under various conditions cannot be compared using the traditional varying kernel basis. Through transforming nonparametric description into dynamic kernel weight vectors, a principal component analysis (PCA) model is established to reduce the dimensionality of the vectors. Then a threshold criterion determined by the TQ statistic based on the PCA model is proposed to realize the performance recognition. The industrial application results show that the accurate performance recognition of froth flotation can be achieved by using the proposed method.

  9. The definition of kernel Oz

    OpenAIRE

    Smolka, Gert

    1994-01-01

    Oz is a concurrent language providing for functional, object-oriented, and constraint programming. This paper defines Kernel Oz, a semantically complete sublanguage of Oz. It was an important design requirement that Oz be definable by reduction to a lean kernel language. The definition of Kernel Oz introduces three essential abstractions: the Oz universe, the Oz calculus, and the actor model. The Oz universe is a first-order structure defining the values and constraints Oz computes with. The ...

  10. Fabrication of Uranium Oxycarbide Kernels for HTR Fuel

    International Nuclear Information System (INIS)

    Barnes, Charles; Richardson, Clay; Nagley, Scott; Hunn, John; Shaber, Eric

    2010-01-01

    Babcock and Wilcox (B&W) has been producing high quality uranium oxycarbide (UCO) kernels for Advanced Gas Reactor (AGR) fuel tests at the Idaho National Laboratory. In 2005, 350-µm, 19.7% 235U-enriched UCO kernels were produced for the AGR-1 test fuel. Following coating of these kernels and forming the coated particles into compacts, this fuel was irradiated in the Advanced Test Reactor (ATR) from December 2006 until November 2009. B&W produced 425-µm, 14% enriched UCO kernels in 2008, and these kernels were used to produce fuel for the AGR-2 experiment that was inserted in ATR in 2010. B&W also produced 500-µm, 9.6% enriched UO2 kernels for the AGR-2 experiments. Kernels of the same size and enrichment as AGR-1 were also produced for the AGR-3/4 experiment. In addition to fabricating enriched UCO and UO2 kernels, B&W has produced more than 100 kg of natural uranium UCO kernels which are being used in coating development tests. Successive lots of kernels have demonstrated consistent high quality and also allowed for fabrication process improvements. Improvements in kernel forming were made subsequent to AGR-1 kernel production. Following fabrication of AGR-2 kernels, incremental increases in sintering furnace charge size have been demonstrated. Recently, small-scale sintering tests using a small development furnace equipped with a residual gas analyzer (RGA) have increased understanding of how kernel sintering parameters affect sintered kernel properties. The steps taken to increase throughput and process knowledge have reduced kernel production costs. Studies have been performed of additional modifications toward the goal of increasing capacity of the current fabrication line to use for production of first core fuel for the Next Generation Nuclear Plant (NGNP) and providing a basis for the design of a full scale fuel fabrication facility.

  11. Spatiotemporal characteristics of elderly population’s traffic accidents in Seoul using space-time cube and space-time kernel density estimation

    Science.gov (United States)

    Cho, Nahye; Son, Serin

    2018-01-01

    The purpose of this study is to analyze how the spatiotemporal characteristics of traffic accidents involving the elderly population in Seoul are changing by time period. We applied kernel density estimation and hotspot analyses to analyze the spatial characteristics of elderly people’s traffic accidents, and the space-time cube, emerging hotspot, and space-time kernel density estimation analyses to analyze the spatiotemporal characteristics. In addition, we analyzed elderly people’s traffic accidents by dividing cases into those in which the drivers were elderly people and those in which elderly people were victims of traffic accidents, and used the traffic accidents data in Seoul for 2013 for analysis. The main findings were as follows: (1) the hotspots for elderly people’s traffic accidents differed according to whether they were drivers or victims. (2) The hourly analysis showed that the hotspots for elderly drivers’ traffic accidents are in specific areas north of the Han River during the period from morning to afternoon, whereas the hotspots for elderly victims are distributed over a wide area from daytime to evening. (3) Monthly analysis showed that the hotspots are weak during winter and summer, whereas they are strong in the hiking and climbing areas in Seoul during spring and fall. Further, elderly victims’ hotspots are more sporadic than elderly drivers’ hotspots. (4) The analysis for the entire period of 2013 indicates that traffic accidents involving elderly people are increasing in specific areas on the north side of the Han River. We expect the results of this study to aid in reducing the number of traffic accidents involving elderly people in the future. PMID:29768453

  12. NONLINEAR PROPERTIES OF MEASLES EPIDEMIC DATA ASSESSED WITH A KERNEL NONPARAMETRIC IDENTIFICATION APPROACH

    Directory of Open Access Journals (Sweden)

    Luis García Domínguez

    2006-03-01

    Full Text Available ABSTRACT Kernel nonparametric nonlinear autoregression was applied to measles data from the pre-vaccination era (1944-1966). A slowly sliding time window covered 20 overlapping segments of the series. In the case of data from Birmingham the order of the model was higher than 22 for all windows and the reconstructed noise free realizations were periodic with the most probable period being equal to 3 years, though values of 2, 4 and 6 years were also obtained. For London data 6 windows had low orders (below 5). Low order noise free realizations were chaotic. The rest presented periodic solutions corresponding to 1, 2, and 3-year cycles. Our results are consistent with views about dynamical transitions among measles data. The method is reliable and puts practically no restrictions regarding data properties. We recommend its use for further exploration of epidemic data from different origins. SUMMARY NONLINEAR PROPERTIES OF MEASLES EPIDEMIOLOGICAL DATA EVALUATED USING A KERNEL NONLINEAR IDENTIFICATION APPROACH. A kernel nonlinear autoregression method was applied to measles incidence data from the period before vaccination (1944-1966). A slowly moving time window covered 20 overlapping segments of the time series. For the Birmingham data the model order was greater than 22 for all windows and the reconstructed noise free realizations were periodic, with the most probable period equal to 3 years, although values of 2, 4 and 6 years were also obtained. For the London data, 6 windows with orders below 5 were observed. The low order noise free realizations were chaotic. The remaining windows showed cycles of 1, 2 and 3 years. Our results agree with the idea that phase transitions are present in measles series. The method is reliable and does not

  13. Anisotropic hydrodynamics with a scalar collisional kernel

    Science.gov (United States)

    Almaalol, Dekrayat; Strickland, Michael

    2018-04-01

    Prior studies of nonequilibrium dynamics using anisotropic hydrodynamics have used the relativistic Anderson-Witting scattering kernel or some variant thereof. In this paper, we make the first study of the impact of using a more realistic scattering kernel. For this purpose, we consider a conformal system undergoing transversally homogeneous and boost-invariant Bjorken expansion and take the collisional kernel to be given by the leading order 2 ↔ 2 scattering kernel in scalar λϕ⁴. We consider both classical and quantum statistics to assess the impact of Bose enhancement on the dynamics. We also determine the anisotropic nonequilibrium attractor of a system subject to this collisional kernel. We find that, when the near-equilibrium relaxation-times in the Anderson-Witting and scalar collisional kernels are matched, the scalar kernel results in a higher degree of momentum-space anisotropy during the system's evolution, given the same initial conditions. Additionally, we find that taking into account Bose enhancement further increases the dynamically generated momentum-space anisotropy.

  14. Object classification and detection with context kernel descriptors

    DEFF Research Database (Denmark)

    Pan, Hong; Olsen, Søren Ingvor; Zhu, Yaping

    2014-01-01

    Context information is important in object representation. By embedding context cues of image attributes into kernel descriptors, we propose a set of novel kernel descriptors called Context Kernel Descriptors (CKD) for object classification and detection. The motivation of CKD is to use spatial consistency of image attributes or features defined within a neighboring region to improve the robustness of descriptor matching in kernel space. For feature selection, Kernel Entropy Component Analysis (KECA) is exploited to learn a subset of discriminative CKD. Different from Kernel Principal Component...

  15. Unstable volatility

    DEFF Research Database (Denmark)

    Casas, Isabel; Gijbels, Irène

    2012-01-01

    The objective of this paper is to introduce the break-preserving local linear (BPLL) estimator for the estimation of unstable volatility functions for independent and asymptotically independent processes. Breaks in the structure of the conditional mean and/or the volatility functions are common in finance. Nonparametric estimators are well suited for these events due to the flexibility of their functional form and their good asymptotic properties. However, the local polynomial kernel estimators are not consistent at points where the volatility function has a break. The estimator presented...

  16. Ranking Support Vector Machine with Kernel Approximation.

    Science.gov (United States)

    Chen, Kai; Li, Rongchun; Dou, Yong; Liang, Zhengfa; Lv, Qi

    2017-01-01

    Learning-to-rank algorithms have become important in recent years due to their successful application in information retrieval, recommender systems, computational biology, and so forth. The ranking support vector machine (RankSVM) is one of the state-of-the-art ranking models and has been widely used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problems. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of the kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. A primal truncated Newton method is used to optimize the pairwise L2-loss (squared hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method trains much faster than kernel RankSVM and achieves comparable or better performance than state-of-the-art ranking algorithms.
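
    To illustrate the random Fourier features idea mentioned in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes an RBF kernel of the form exp(-gamma * ||x - y||^2); the function names, the seed, and the number of features are hypothetical choices made for illustration only.

```python
import math
import random

def rbf_kernel(x, y, gamma):
    """Exact RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def make_rff_map(dim, n_features, gamma, seed=0):
    """Return a map z(.) whose inner products approximate the RBF kernel.

    Assumption: frequencies are drawn from N(0, 2 * gamma * I), which is the
    spectral density matching exp(-gamma * ||x - y||^2).
    """
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    norm = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return feature_map

def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Compare the exact kernel value with its random-feature approximation.
z = make_rff_map(dim=3, n_features=2000, gamma=0.5, seed=1)
x, y = [0.2, -0.1, 0.4], [0.0, 0.3, 0.1]
exact = rbf_kernel(x, y, 0.5)
approx = dot(z(x), z(y))
```

    Once samples are mapped through such an explicit feature map, a linear ranking model can be trained on the mapped features, so the full kernel matrix never has to be formed; this is the general motivation for kernel approximation rather than a description of the specific algorithm in the paper.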

  17. Ranking Support Vector Machine with Kernel Approximation

    Directory of Open Access Journals (Sweden)

    Kai Chen

    2017-01-01

    Full Text Available Learning-to-rank algorithms have become important in recent years due to their successful application in information retrieval, recommender systems, computational biology, and so forth. The ranking support vector machine (RankSVM) is one of the state-of-the-art ranking models and has been widely used. Nonlinear RankSVM (RankSVM with nonlinear kernels) can give higher accuracy than linear RankSVM (RankSVM with a linear kernel) for complex nonlinear ranking problems. However, the learning methods for nonlinear RankSVM are still time-consuming because of the calculation of the kernel matrix. In this paper, we propose a fast ranking algorithm based on kernel approximation to avoid computing the kernel matrix. We explore two types of kernel approximation methods, namely, the Nyström method and random Fourier features. A primal truncated Newton method is used to optimize the pairwise L2-loss (squared hinge-loss) objective function of the ranking model after the nonlinear kernel approximation. Experimental results demonstrate that our proposed method trains much faster than kernel RankSVM and achieves comparable or better performance than state-of-the-art ranking algorithms.

  18. An improved nonparametric lower bound of species richness via a modified Good-Turing frequency formula.

    Science.gov (United States)

    Chiu, Chun-Huo; Wang, Yi-Ting; Walther, Bruno A; Chao, Anne

    2014-09-01

    It is difficult to accurately estimate species richness if there are many almost undetectable species in a hyper-diverse community. Practically, an accurate lower bound for species richness is preferable to an inaccurate point estimator. The traditional nonparametric lower bound developed by Chao (1984, Scandinavian Journal of Statistics 11, 265-270) for individual-based abundance data uses only the information on the rarest species (the numbers of singletons and doubletons) to estimate the number of undetected species in samples. Applying a modified Good-Turing frequency formula, we derive an approximate formula for the first-order bias of this traditional lower bound. The approximate bias is estimated by using additional information (namely, the numbers of tripletons and quadrupletons). This approximate bias can be corrected, and an improved lower bound is thus obtained. The proposed lower bound is nonparametric in the sense that it is universally valid for any species abundance distribution. A similar type of improved lower bound can be derived for incidence data. We test our proposed lower bounds on simulated data sets generated from various species abundance models. Simulation results show that the proposed lower bounds always reduce bias over the traditional lower bounds and improve accuracy (as measured by mean squared error) when the heterogeneity of species abundances is relatively high. We also apply the proposed new lower bounds to real data for illustration and for comparisons with previously developed estimators. © 2014, The International Biometric Society.
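
    As a point of reference for the traditional lower bound described above, here is a small sketch of the classic Chao (1984) estimator, which uses only the numbers of singletons (f1) and doubletons (f2). The function name is hypothetical, and the fallback used when there are no doubletons is the commonly used bias-corrected form, an assumption not stated in the abstract. The improved bound proposed in the paper additionally uses tripletons and quadrupletons and is not reproduced here.

```python
from collections import Counter

def chao1_lower_bound(abundances):
    """Classic Chao (1984) lower bound on species richness.

    abundances: iterable of per-species individual counts in the sample.
    Uses S_obs + f1^2 / (2 * f2); when f2 == 0 it falls back to the
    bias-corrected form S_obs + f1 * (f1 - 1) / 2 (an assumption here).
    """
    counts = [c for c in abundances if c > 0]
    s_obs = len(counts)            # number of observed species
    freq = Counter(counts)         # freq[k] = number of species seen k times
    f1, f2 = freq[1], freq[2]
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    return s_obs + f1 * (f1 - 1) / 2.0

# Eight observed species: four singletons, two doubletons, two common ones.
estimate = chao1_lower_bound([1, 1, 1, 1, 2, 2, 5, 10])
```

    With four singletons and two doubletons among eight observed species, the estimated number of undetected species is 16 / 4 = 4, giving a lower bound of 12.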

  19. Dose point kernels for beta-emitting radioisotopes

    International Nuclear Information System (INIS)

    Prestwich, W.V.; Chan, L.B.; Kwok, C.S.; Wilson, B.

    1986-01-01

    Knowledge of the dose point kernel corresponding to a specific radionuclide is required to calculate the spatial dose distribution produced in a homogeneous medium by a distributed source. Dose point kernels for commonly used radionuclides have been calculated previously using as a basis monoenergetic dose point kernels derived by numerical integration of a model transport equation. That treatment neglects fluctuations in energy deposition, an effect which was later incorporated in dose point kernels calculated using Monte Carlo methods. This work describes new calculations of dose point kernels using the Monte Carlo results as a basis. An analytic representation of the monoenergetic dose point kernels has been developed. This provides a convenient method both for calculating the dose point kernel associated with a given beta spectrum and for incorporating the effect of internal conversion. An algebraic expression for allowed beta spectra has been obtained through an extension of the Bethe-Bacher approximation, and tested against the exact expression. Simplified expressions for first-forbidden shape factors have also been developed. A comparison of the calculated dose point kernel for ³²P with experimental data indicates good agreement, with a significant improvement over the earlier results in this respect. An analytic representation of the dose point kernel associated with the spectrum of a single beta group has been formulated. 9 references, 16 figures, 3 tables

  20. Rare variant testing across methods and thresholds using the multi-kernel sequence kernel association test (MK-SKAT).

    Science.gov (United States)

    Urrutia, Eugene; Lee, Seunggeun; Maity, Arnab; Zhao, Ni; Shen, Judong; Li, Yun; Wu, Michael C

    Analysis of rare genetic variants has focused on region-based analysis wherein a subset of the variants within a genomic region is tested for association with a complex trait. Two important practical challenges have emerged. First, it is difficult to choose which test to use. Second, it is unclear which group of variants within a region should be tested. Both depend on the unknown true state of nature. Therefore, we develop the Multi-Kernel SKAT (MK-SKAT) which tests across a range of rare variant tests and groupings. Specifically, we demonstrate that several popular rare variant tests are special cases of the sequence kernel association test which compares pair-wise similarity in trait value to similarity in the rare variant genotypes between subjects as measured through a kernel function. Choosing a particular test is equivalent to choosing a kernel. Similarly, choosing which group of variants to test also reduces to choosing a kernel. Thus, MK-SKAT uses perturbation to test across a range of kernels. Simulations and real data analyses show that our framework controls type I error while maintaining high power across settings: MK-SKAT loses power when compared to the kernel for a particular scenario but has much greater power than poor choices.

  1. Nonparametric analysis of blocked ordered categories data: some examples revisited

    Directory of Open Access Journals (Sweden)

    O. Thas

    2006-08-01

    Full Text Available Nonparametric analysis for general block designs can be carried out using the Cochran-Mantel-Haenszel (CMH) statistics. We demonstrate this with four examples and note that several well-known nonparametric statistics are special cases of CMH statistics.

  2. Wigner functions defined with Laplace transform kernels.

    Science.gov (United States)

    Oh, Se Baek; Petruccelli, Jonathan C; Tian, Lei; Barbastathis, George

    2011-10-24

    We propose a new Wigner-type phase-space function using Laplace transform kernels, the Laplace kernel Wigner function. Whereas momentum variables are real in the traditional Wigner function, the Laplace kernel Wigner function may have complex momentum variables. Due to the properties of the Laplace transform, a broader range of signals can be represented in complex phase-space. We show that the marginals of the Laplace kernel Wigner function exhibit properties similar to those of the traditional Wigner function. As an example, we use the Laplace kernel Wigner function to analyze evanescent waves supported by surface plasmon polaritons. © 2011 Optical Society of America

  3. Metabolic network prediction through pairwise rational kernels.

    Science.gov (United States)

    Roche-Lima, Abiel; Domaratzki, Michael; Fristensky, Brian

    2014-09-26

    Metabolic networks are represented by the set of metabolic pathways. Metabolic pathways are a series of biochemical reactions, in which the product (output) from one reaction serves as the substrate (input) to another reaction. Many pathways remain incompletely characterized. One of the major challenges of computational biology is to obtain better models of metabolic pathways. Existing models are dependent on the annotation of the genes. This leads to error accumulation when the pathways are predicted from incorrectly annotated genes. Pairwise classification methods are supervised learning methods used to classify new pairs of entities. Some of these classification methods, e.g., Pairwise Support Vector Machines (SVMs), use pairwise kernels. Pairwise kernels describe similarity measures between two pairs of entities. Using pairwise kernels to handle sequence data requires long processing times and large storage. Rational kernels are kernels based on weighted finite-state transducers that represent similarity measures between sequences or automata. They have been effectively used in problems that handle large amounts of sequence information, such as protein essentiality, natural language processing and machine translation. We create a new family of pairwise kernels using weighted finite-state transducers (called Pairwise Rational Kernels (PRKs)) to predict metabolic pathways from a variety of biological data. PRKs take advantage of the simpler representations and faster algorithms of transducers. Because raw sequence data can be used, the predictor model avoids the errors introduced by incorrect gene annotations. We then developed several experiments with PRKs and Pairwise SVM to validate our methods using the metabolic network of Saccharomyces cerevisiae. As a result, when PRKs are used, our method executes faster in comparison with other pairwise kernels. Also, when we use PRKs combined with other simple kernels that include evolutionary information, the accuracy

  4. Nonparametric Change Point Diagnosis Method of Concrete Dam Crack Behavior Abnormality

    Directory of Open Access Journals (Sweden)

    Zhanchao Li

    2013-01-01

    Full Text Available Diagnosing abnormal crack behavior in concrete dams has long been both a focus and a difficulty in the safety monitoring of hydraulic structures. Based on how crack behavior abnormality manifests itself in parametric and nonparametric statistical models, the internal relation between such abnormality and statistical change point theory is analyzed in terms of the structural instability of the parametric model and the change in the distribution law of the sequence in the nonparametric model. On this basis, through reduction of the change point problem, establishment of a basic nonparametric change point model, and asymptotic analysis of the test method for the basic change point problem, a nonparametric change point diagnosis method for concrete dam crack behavior abnormality is developed that allows for the practical situation in which crack behavior may have multiple abnormality points. The method is applied to an actual project, demonstrating its effectiveness and scientific soundness. The method also has a complete theoretical basis and strong practicality, with broad application prospects in actual projects.

  5. Moisture Adsorption Isotherm and Storability of Hazelnut Inshells and Kernels Produced in Oregon, USA.

    Science.gov (United States)

    Jung, Jooyeoun; Wang, Wenjie; McGorrin, Robert J; Zhao, Yanyun

    2018-02-01

    Moisture adsorption isotherms and storability of dried hazelnut inshells and kernels produced in Oregon were evaluated and compared among cultivars, including Barcelona, Yamhill, and Jefferson. Experimental moisture adsorption data were fitted to the Guggenheim-Anderson-de Boer (GAB) model, showing less hygroscopic properties in Yamhill than in the other cultivars of inshells and kernels, due to its lower content of carbohydrate and protein but higher content of fat. The safe levels of moisture content (MC, dry basis) of dried inshells and kernels for reaching kernel water activity (aw) ≤0.65 were estimated using the GAB model as 11.3% and 5.0% for Barcelona, 9.4% and 4.2% for Yamhill, and 10.7% and 4.9% for Jefferson, respectively. Storage conditions (2 °C at 85% to 95% relative humidity [RH], 10 °C at 65% to 75% RH, and 27 °C at 35% to 45% RH), times (0, 4, 8, or 12 mo), and packaging methods (atmosphere vs. vacuum) affected MC, aw, bioactive compounds, lipid oxidation, and enzyme activity of dried hazelnut inshells or kernels. For inshells packaged in woven polypropylene bags, MC and aw of inshells and kernels (inside shells) increased at 2 and 10 °C, but decreased at 27 °C during storage. For kernels, lipid oxidation and polyphenol oxidase activity also increased with extended storage time (P adsorption and physicochemical and enzymatic stability during storage. The moisture adsorption isotherm of hazelnut inshells and kernels is useful for predicting the storability of nuts. This study found that water adsorption and storability varied among the different cultivars of nuts, in which Yamhill was less hygroscopic than Barcelona and Jefferson, thus more stable during storage. To ensure food safety and quality of nuts during storage, each cultivar of kernels should be dried to a certain level of MC. Lipid oxidation and enzyme activity of kernels could increase with extended storage time. Vacuum packaging was recommended for kernels to reduce moisture adsorption
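
    The GAB model referred to above relates equilibrium moisture content to water activity. The sketch below shows the standard GAB equation and a numerical inversion of it. The parameter values (m0, c, k) are purely hypothetical placeholders and are not the fitted values from this study, and the function names are illustrative.

```python
def gab_moisture(aw, m0, c, k):
    """GAB isotherm: equilibrium moisture content (dry basis) at water activity aw.

    M = m0 * c * k * aw / ((1 - k * aw) * (1 - k * aw + c * k * aw))
    m0 is the monolayer moisture content; c and k are GAB constants.
    """
    x = k * aw
    return m0 * c * x / ((1.0 - x) * (1.0 - x + c * x))

def water_activity_for(moisture, m0, c, k, tol=1e-10):
    """Invert the isotherm by bisection (moisture increases with aw for c > 0)."""
    lo, hi = 0.0, min(1.0, 0.999999 / k)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gab_moisture(mid, m0, c, k) < moisture:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters, chosen only to make the example runnable.
M0, C, K = 0.03, 10.0, 0.8
safe_mc = gab_moisture(0.65, M0, C, K)      # moisture content at aw = 0.65
aw_back = water_activity_for(safe_mc, M0, C, K)
```

    Evaluating the fitted isotherm at a target water activity, as in the first call, is one way a "safe" moisture content at aw ≤ 0.65 could be read off once the GAB constants for a cultivar are known.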

  6. Nonparametric Mixture Models for Supervised Image Parcellation.

    Science.gov (United States)

    Sabuncu, Mert R; Yeo, B T Thomas; Van Leemput, Koen; Fischl, Bruce; Golland, Polina

    2009-09-01

    We present a nonparametric, probabilistic mixture model for the supervised parcellation of images. The proposed model yields segmentation algorithms conceptually similar to the recently developed label fusion methods, which register a new image with each training image separately. Segmentation is achieved via the fusion of transferred manual labels. We show that in our framework various settings of a model parameter yield algorithms that use image intensity information differently in determining the weight of a training subject during fusion. One particular setting computes a single, global weight per training subject, whereas another setting uses locally varying weights when fusing the training data. The proposed nonparametric parcellation approach capitalizes on recently developed fast and robust pairwise image alignment tools. The use of multiple registrations allows the algorithm to be robust to occasional registration failures. We report experiments on 39 volumetric brain MRI scans with expert manual labels for the white matter, cerebral cortex, ventricles and subcortical structures. The results demonstrate that the proposed nonparametric segmentation framework yields significantly better segmentation than state-of-the-art algorithms.

  7. Performance of non-parametric algorithms for spatial mapping of tropical forest structure

    Directory of Open Access Journals (Sweden)

    Liang Xu

    2016-08-01

    Full Text Available Background: Mapping tropical forest structure is a critical requirement for accurate estimation of emissions and removals from land use activities. With the availability of a wide range of remote sensing imagery of vegetation characteristics from space, development of finer resolution and more accurate maps has advanced in recent years. However, the mapping accuracy relies heavily on the quality of input layers, the algorithm chosen, and the size and quality of inventory samples for calibration and validation. Results: By using airborne lidar data as the “truth” and focusing on the mean canopy height (MCH) as a key structural parameter, we test two commonly used non-parametric techniques, maximum entropy (ME) and random forest (RF), for developing maps over a study site in Central Gabon. Results of mapping show that both approaches have improved accuracy with more input layers in mapping canopy height at 100 m (1-ha) pixels. The bias-corrected spatial models further improve estimates for small and large trees across the tails of height distributions, with a trade-off in increasing overall mean squared error that can be readily compensated by increasing the sample size. Conclusions: A significant improvement in tropical forest mapping can be achieved by weighting the number of inventory samples against the choice of image layers and the non-parametric algorithms. Without future satellite observations with better sensitivity to forest biomass, the maps based on existing data will remain slightly biased towards the mean of the distribution, underestimating the upper tail and overestimating the lower tail of the distribution.

  8. The Linux kernel as flexible product-line architecture

    NARCIS (Netherlands)

    M. de Jonge (Merijn)

    2002-01-01

    The Linux kernel source tree is huge (> 125 MB) and inflexible (because it is difficult to add new kernel components). We propose to make this architecture more flexible by assembling kernel source trees dynamically from individual kernel components. Users, then, can select what

  9. Exploiting graph kernels for high performance biomedical relation extraction.

    Science.gov (United States)

    Panyam, Nagesh C; Verspoor, Karin; Cohn, Trevor; Ramamohanarao, Kotagiri

    2018-01-30

    Relation extraction from biomedical publications is an important task in the area of semantic mining of text. Kernel methods for supervised relation extraction are often preferred over manual feature engineering methods when classifying highly ordered structures such as trees and graphs obtained from syntactic parsing of a sentence. Tree kernels such as the Subset Tree Kernel and Partial Tree Kernel have been shown to be effective for classifying constituency parse trees and basic dependency parse graphs of a sentence. Graph kernels such as the All Path Graph (APG) kernel and Approximate Subgraph Matching (ASM) kernel have been shown to be suitable for classifying general graphs with cycles, such as the enhanced dependency parse graph of a sentence. In this work, we present a high-performance Chemical-Induced Disease (CID) relation extraction system. We present a comparative study of kernel methods for the CID task and also extend our study to the Protein-Protein Interaction (PPI) extraction task, an important biomedical relation extraction task. We discuss novel modifications to the ASM kernel to boost its performance and a method to apply graph kernels for extracting relations expressed in multiple sentences. Our system for CID relation extraction attains an F-score of 60%, without using external knowledge sources or task-specific heuristics or rules. In comparison, the state-of-the-art Chemical-Disease Relation Extraction system achieves an F-score of 56% using an ensemble of multiple machine learning methods, which is then boosted to 61% with a rule-based system employing task-specific post-processing rules. For the CID task, graph kernels outperform tree kernels substantially, and the best performance is obtained with the APG kernel, which attains an F-score of 60%, followed by the ASM kernel at 57%. The performance difference between the ASM and APG kernels for CID sentence-level relation extraction is not significant. In our evaluation of ASM for the PPI task, ASM

  10. GRIM: Leveraging GPUs for Kernel integrity monitoring

    NARCIS (Netherlands)

    Koromilas, Lazaros; Vasiliadis, Giorgos; Athanasopoulos, Ilias; Ioannidis, Sotiris

    2016-01-01

    Kernel rootkits can exploit an operating system and enable future accessibility and control, despite all recent advances in software protection. A promising defense mechanism against rootkits is Kernel Integrity Monitor (KIM) systems, which inspect the kernel text and data to discover any malicious

  11. Non-parametric tests of productive efficiency with errors-in-variables

    NARCIS (Netherlands)

    Kuosmanen, T.K.; Post, T.; Scholtes, S.

    2007-01-01

    We develop a non-parametric test of productive efficiency that accounts for errors-in-variables, following the approach of Varian [1985. Nonparametric analysis of optimizing behavior with measurement error. Journal of Econometrics 30(1/2), 445-458]. The test is based on the general Pareto-Koopmans

  12. 7 CFR 51.2296 - Three-fourths half kernel.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture. Section 51.2296, Three-fourths half kernel. Regulations of the Department of Agriculture, AGRICULTURAL MARKETING SERVICE (Standards...-fourths half kernel. Three-fourths half kernel means a portion of a half of a kernel which has more than...

  13. Music recommendation according to human motion based on kernel CCA-based relationship

    Science.gov (United States)

    Ohkushi, Hiroyuki; Ogawa, Takahiro; Haseyama, Miki

    2011-12-01

    In this article, a method for recommendation of music pieces according to human motions based on their kernel canonical correlation analysis (CCA)-based relationship is proposed. In order to perform the recommendation between different types of multimedia data, i.e., recommendation of music pieces from human motions, the proposed method tries to estimate their relationship. Specifically, the correlation based on kernel CCA is calculated as the relationship in our method. Since human motions and music pieces have various time lengths, it is necessary to calculate the correlation between time series having different lengths. Therefore, new kernel functions for human motions and music pieces, which can provide similarities between data that have different time lengths, are introduced into the calculation of the kernel CCA-based correlation. This approach effectively provides a solution to the conventional problem of not being able to calculate the correlation from multimedia data that have various time lengths. Therefore, the proposed method can perform accurate recommendation of best matched music pieces according to a target human motion from the obtained correlation. Experimental results are shown to verify the performance of the proposed method.

  14. Nonparametric Monitoring for Geotechnical Structures Subject to Long-Term Environmental Change

    Directory of Open Access Journals (Sweden)

    Hae-Bum Yun

    2011-01-01

    Full Text Available A nonparametric, data-driven methodology for monitoring geotechnical structures subject to long-term environmental change is discussed. By avoiding physical assumptions and excessive simplification of the monitored structures, the nonparametric monitoring methodology presented in this paper provides reliable performance-related information, particularly when the collection of sensor data is limited. To validate the nonparametric methodology, a field case study was performed using a full-scale retaining wall, which had been monitored for three years using three tilt gauges. Using the very limited sensor data, it is demonstrated that important performance-related information, such as drainage performance and sensor damage, could be disentangled from significant daily, seasonal and multiyear environmental variations. An extensive literature review of recent developments in parametric and nonparametric data processing techniques for geotechnical applications is also presented.

  15. Estimation of Esfarayen Farmers Risk Aversion Coefficient and Its Influencing Factors (Nonparametric Approach)

    Directory of Open Access Journals (Sweden)

    Z. Nematollahi

    2016-03-01

    Full Text Available Introduction: Because of risk and uncertainty in agriculture, risk management is crucial for farm management. The present study was therefore designed to determine the risk aversion coefficient of Esfarayen farmers. Materials and Methods: The following approaches have been used to assess risk attitudes: (1) direct elicitation of utility functions, (2) experimental procedures in which individuals are presented with hypothetical questionnaires regarding risky alternatives with or without real payments, and (3) inference from observation of economic behavior. In this paper, we focused on approach (3), which assumes a relationship between the actual behavior of a decision maker and the behavior predicted by empirically specified models. A new non-parametric method and the quadratic programming (QP) method were used to calculate the coefficient of risk aversion. We maximized the decision maker's expected utility with the E-V formulation (Freund, 1956). Ideally, in constructing a QP model, the variance-covariance matrix should be formed for each individual farmer. For this purpose, a sample of 100 farmers was selected using random sampling, and their data on 14 products for the years 2008-2012 were assembled. The lowlands of Esfarayen were used, since within this area production possibilities are rather homogeneous. Results and Discussion: The results of this study showed that there was low correlation between some of the activities, which implies opportunities for income stabilization through diversification. With respect to transitory income, the estimated Ra values vary from 0.000006 to 0.000361, and the absolute coefficient of risk aversion in our sample was 0.00005. The estimated Ra values vary considerably from farm to farm. The results showed that the estimated Ra for the subsample consisting of 'non-wealthy' farmers was 0.00010. The subsample with farmers in the 'wealthy' group had an

  16. A kernel version of multivariate alteration detection

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg; Vestergaard, Jacob Schack

    2013-01-01

    Based on the established methods kernel canonical correlation analysis and multivariate alteration detection we introduce a kernel version of multivariate alteration detection. A case study with SPOT HRV data shows that the kMAD variates focus on extreme change observations.

  17. Implementing Kernel Methods Incrementally by Incremental Nonlinear Projection Trick.

    Science.gov (United States)

    Kwak, Nojun

    2016-05-20

    Recently, the nonlinear projection trick (NPT) was introduced, enabling direct computation of coordinates of samples in a reproducing kernel Hilbert space. With NPT, any machine learning algorithm can be extended to a kernel version without relying on the so-called kernel trick. However, NPT is inherently difficult to implement incrementally because an ever-increasing kernel matrix must be handled as additional training samples are introduced. In this paper, an incremental version of the NPT (INPT) is proposed based on the observation that the centerization step in NPT is unnecessary. Because the proposed INPT does not change the coordinates of the old data, the coordinates obtained by INPT can be used directly in any incremental method to implement a kernel version of that method. The effectiveness of the INPT is shown by applying it to implement incremental versions of kernel methods such as kernel singular value decomposition, kernel principal component analysis, and kernel discriminant analysis, which are utilized for problems of kernel matrix reconstruction, letter classification, and face image retrieval, respectively.

  18. A Comparison Study of Different Kernel Functions for SVM-Based Classification of Multi-Temporal Polarimetry SAR Data

    Science.gov (United States)

    Yekkehkhany, B.; Safari, A.; Homayouni, S.; Hasanlou, M.

    2014-10-01

    In this paper, a framework is developed based on Support Vector Machines (SVM) for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imagery. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates with respect to single-date data. Several kernel functions are employed and compared in this study for mapping the input space to a higher-dimensional Hilbert space. These kernel functions include linear, polynomial and Radial Basis Function (RBF) kernels. The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of the H/A/α decomposition method are used in classification. The experimental tests show that an SVM classifier with an RBF kernel for three dates of data increases the Overall Accuracy (OA) by up to 3% in comparison to using a linear kernel function, and by up to 1% in comparison to a 3rd degree polynomial kernel function.
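
    For readers unfamiliar with the three kernel families compared in this record, a minimal sketch of each follows. The parameter names and default values (`degree`, `gamma`, `coef0`) are illustrative assumptions, not the settings used in the study.

```python
import numpy as np

def linear_kernel(X, Z):
    """k(x, z) = <x, z>"""
    return X @ Z.T

def polynomial_kernel(X, Z, degree=3, gamma=1.0, coef0=1.0):
    """k(x, z) = (gamma * <x, z> + coef0) ** degree"""
    return (gamma * (X @ Z.T) + coef0) ** degree

def rbf_kernel(X, Z, gamma=1.0):
    """k(x, z) = exp(-gamma * ||x - z||^2)"""
    sq = (X**2).sum(1)[:, None] + (Z**2).sum(1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

X = np.array([[1.0, 0.0], [0.0, 1.0]])    # two toy feature vectors
K_lin, K_poly, K_rbf = linear_kernel(X, X), polynomial_kernel(X, X), rbf_kernel(X, X)
```

    Any of these Gram matrices could be handed to an SVM solver that accepts a precomputed kernel.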

  19. Robust anti-synchronization of uncertain chaotic systems based on multiple-kernel least squares support vector machine modeling

    International Nuclear Information System (INIS)

    Chen Qiang; Ren Xuemei; Na Jing

    2011-01-01

    Highlights: Model uncertainty of the system is approximated by multiple-kernel LSSVM. Approximation errors and disturbances are compensated in the controller design. Asymptotical anti-synchronization is achieved with model uncertainty and disturbances. Abstract: In this paper, we propose a robust anti-synchronization scheme based on multiple-kernel least squares support vector machine (MK-LSSVM) modeling for two uncertain chaotic systems. The multiple-kernel regression, which is a linear combination of basic kernels, is designed to approximate system uncertainties by constructing a multiple-kernel Lagrangian function and computing the corresponding regression parameters. Then, a robust feedback control based on MK-LSSVM modeling is presented and an improved update law is employed to estimate the unknown bound of the approximation error. The proposed control scheme can guarantee the asymptotic convergence of the anti-synchronization errors in the presence of system uncertainties and external disturbances. Numerical examples are provided to show the effectiveness of the proposed method.

  20. Uranium kernel formation via internal gelation

    International Nuclear Information System (INIS)

    Hunt, R.D.; Collins, J.L.

    2004-01-01

    In the 1970s and 1980s, U.S. Department of Energy (DOE) conducted numerous studies on the fabrication of nuclear fuel particles using the internal gelation process. These amorphous kernels were prone to flaking or breaking when gases tried to escape from the kernels during calcination and sintering. These earlier kernels would not meet today's proposed specifications for reactor fuel. In the interim, the internal gelation process has been used to create hydrous metal oxide microspheres for the treatment of nuclear waste. With the renewed interest in advanced nuclear fuel by the DOE, the lessons learned from the nuclear waste studies were recently applied to the fabrication of uranium kernels, which will become tri-isotropic (TRISO) fuel particles. These process improvements included equipment modifications, small changes to the feed formulations, and a new temperature profile for the calcination and sintering. The modifications to the laboratory-scale equipment and its operation as well as small changes to the feed composition increased the product yield from 60% to 80%-99%. The new kernels were substantially less glassy, and no evidence of flaking was found. Finally, key process parameters were identified, and their effects on the uranium microspheres and kernels are discussed. (orig.)

  1. Comparing parametric and nonparametric regression methods for panel data

    DEFF Research Database (Denmark)

    Czekaj, Tomasz Gerard; Henningsen, Arne

    We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb-Douglas and Translog, are unsuitable for analysing the optimal firm size. We show that the Translog functional form implies an implausible linear relationship between the (logarithmic) firm size and the elasticity of scale, where the slope is artificially related to the substitutability between the inputs. The practical applicability of the parametric and non-parametric regression methods is scrutinised and compared by an empirical example: we analyse the production technology and investigate the optimal size of Polish crop farms based on a firm-level balanced panel data set. A nonparametric specification test...

  2. A multitemporal and non-parametric approach for assessing the impacts of drought on vegetation greenness

    DEFF Research Database (Denmark)

    Carrao, Hugo; Sepulcre, Guadalupe; Horion, Stéphanie Marie Anne F

    2013-01-01

    This study evaluates the relationship between the frequency and duration of meteorological droughts and the subsequent temporal changes in the quantity of actively photosynthesizing biomass (greenness) estimated from satellite imagery on rainfed croplands in Latin America. An innovative non-parametric and non-supervised approach, based on the Fisher-Jenks optimal classification algorithm, is used to identify multi-scale meteorological droughts on the basis of empirical cumulative distributions of 1, 3, 6, and 12-monthly precipitation totals. As input data for the classifier, we use the gridded GPCC...... for the period between 1998 and 2010. The time-series analysis of vegetation greenness is performed during the growing season with a non-parametric method, namely the seasonal Relative Greenness (RG) of spatially accumulated fAPAR. The Global Land Cover map of 2000 and the GlobCover maps of 2005/2006 and 2009...

  3. Kernel learning at the first level of inference.

    Science.gov (United States)

    Cawley, Gavin C; Talbot, Nicola L C

    2014-05-01

    Kernel learning methods, whether Bayesian or frequentist, typically involve multiple levels of inference, with the coefficients of the kernel expansion being determined at the first level and the kernel and regularisation parameters carefully tuned at the second level, a process known as model selection. Model selection for kernel machines is commonly performed via optimisation of a suitable model selection criterion, often based on cross-validation or theoretical performance bounds. However, if there are a large number of kernel parameters, as for instance in the case of automatic relevance determination (ARD), there is a substantial risk of over-fitting the model selection criterion, resulting in poor generalisation performance. In this paper we investigate the possibility of learning the kernel, for the Least-Squares Support Vector Machine (LS-SVM) classifier, at the first level of inference, i.e. parameter optimisation. The kernel parameters and the coefficients of the kernel expansion are jointly optimised at the first level of inference, minimising a training criterion with an additional regularisation term acting on the kernel parameters. The key advantage of this approach is that the values of only two regularisation parameters need be determined in model selection, substantially alleviating the problem of over-fitting the model selection criterion. The benefits of this approach are demonstrated using a suite of synthetic and real-world binary classification benchmark problems, where kernel learning at the first level of inference is shown to be statistically superior to the conventional approach, improves on our previous work (Cawley and Talbot, 2007) and is competitive with Multiple Kernel Learning approaches, but with reduced computational expense. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. Quantum tomography, phase-space observables and generalized Markov kernels

    International Nuclear Information System (INIS)

    Pellonpaeae, Juha-Pekka

    2009-01-01

    We construct a generalized Markov kernel which transforms the observable associated with the homodyne tomography into a covariant phase-space observable with a regular kernel state. Illustrative examples are given in the cases of a 'Schroedinger cat' kernel state and the Cahill-Glauber s-parametrized distributions. Also we consider an example of a kernel state when the generalized Markov kernel cannot be constructed.

  5. Analysis of small sample size studies using nonparametric bootstrap test with pooled resampling method.

    Science.gov (United States)

    Dwivedi, Alok Kumar; Mallawaarachchi, Indika; Alvarado, Luis A

    2017-06-30

    Experimental studies in biomedical research frequently pose analytical problems related to small sample size. In such studies, there are conflicting findings regarding the choice of parametric and nonparametric analysis, especially with non-normal data. In such instances, some methodologists questioned the validity of parametric tests and suggested nonparametric tests. In contrast, other methodologists found nonparametric tests to be too conservative and less powerful and thus preferred using parametric tests. Some researchers have recommended using a bootstrap test; however, this method also has a small sample size limitation. We used a pooled method in the nonparametric bootstrap test that may overcome the problem related to small samples in hypothesis testing. The present study compared the nonparametric bootstrap test with pooled resampling method to the corresponding parametric, nonparametric, and permutation tests through extensive simulations under various conditions and using real data examples. The nonparametric pooled bootstrap t-test provided equal or greater power for comparing two means as compared with the unpaired t-test, Welch t-test, Wilcoxon rank sum test, and permutation test while maintaining type I error probability for any conditions except for Cauchy and extreme variable lognormal distributions. In such cases, we suggest using an exact Wilcoxon rank sum test. The nonparametric bootstrap paired t-test also provided better performance than other alternatives. The nonparametric bootstrap test provided benefit over the exact Kruskal-Wallis test. We suggest using the nonparametric bootstrap test with pooled resampling method for comparing paired or unpaired means and for validating the one way analysis of variance test results for non-normal data in small sample size studies. Copyright © 2017 John Wiley & Sons, Ltd.
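
    One plausible reading of a pooled-resampling bootstrap t-test for two unpaired means is sketched below: both samples are pooled, bootstrap groups of the original sizes are drawn with replacement from the pool, and the p-value is the share of bootstrap t statistics at least as extreme as the observed one. The function names, the pooled-variance t statistic, and the handling of degenerate resamples are assumptions for illustration and may differ from the authors' procedure.

```python
import numpy as np

def t_statistic(x, y):
    """Two-sample t statistic with pooled variance."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * np.var(x, ddof=1) + (n2 - 1) * np.var(y, ddof=1)) / (n1 + n2 - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))

def pooled_bootstrap_test(x, y, n_boot=5000, seed=0):
    """Return (observed t, two-sided bootstrap p-value) using pooled resampling."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])       # resampling pool under the null hypothesis
    t_obs = t_statistic(x, y)
    extreme, valid = 0, 0
    for _ in range(n_boot):
        xb = rng.choice(pooled, size=len(x), replace=True)
        yb = rng.choice(pooled, size=len(y), replace=True)
        with np.errstate(divide="ignore", invalid="ignore"):
            tb = t_statistic(xb, yb)
        if np.isfinite(tb):               # skip resamples with zero variance
            valid += 1
            extreme += int(abs(tb) >= abs(t_obs))
    return t_obs, extreme / valid
```

    A call such as `pooled_bootstrap_test([1, 2, 3, 4, 5], [11, 12, 13, 14, 15])` returns the observed statistic together with the bootstrap p-value.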

  6. Single pass kernel k-means clustering method

    Indian Academy of Sciences (India)

    paper proposes a simple and faster version of the kernel k-means clustering ... It has been considered as an important tool ... On the other hand, kernel-based clustering methods, like kernel k-means clus- ..... able at the UCI machine learning repository (Murphy 1994). ... All the data sets have only numeric valued features.

  7. Relationship between attenuation coefficients and dose-spread kernels

    International Nuclear Information System (INIS)

    Boyer, A.L.

    1988-01-01

    Dose-spread kernels can be used to calculate the dose distribution in a photon beam by convolving the kernel with the primary fluence distribution. The theoretical relationships between various types and components of dose-spread kernels relative to photon attenuation coefficients are explored. These relations can be valuable as checks on the conservation of energy by dose-spread kernels calculated by analytic or Monte Carlo methods.
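
    The convolution step mentioned in this abstract can be illustrated with a one-dimensional toy example. The fluence profile and kernel values below are made-up numbers in arbitrary units; a real calculation would be three-dimensional and use physically derived kernels.

```python
import numpy as np

def dose_from_fluence(fluence, kernel):
    """Dose profile as the discrete convolution of primary fluence with a spread kernel."""
    return np.convolve(fluence, kernel, mode="full")

fluence = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0])   # hypothetical beam profile
kernel = np.array([0.1, 0.2, 0.4, 0.2, 0.1])              # hypothetical kernel, sums to 1
dose = dose_from_fluence(fluence, kernel)

# Simple conservation check: the total deposited quantity equals the total
# fluence times the kernel integral.
total_in = fluence.sum() * kernel.sum()
total_out = dose.sum()
```

    With a kernel normalised to one, `total_out` matches the total fluence, which is the kind of consistency check the abstract alludes to.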

  8. Mixture Density Mercer Kernels: A Method to Learn Kernels

    Data.gov (United States)

    National Aeronautics and Space Administration — This paper presents a method of generating Mercer Kernels from an ensemble of probabilistic mixture models, where each mixture model is generated from a Bayesian...

  9. A comprehensive benchmark of kernel methods to extract protein-protein interactions from literature.

    Directory of Open Access Journals (Sweden)

    Domonkos Tikk

    Full Text Available The most important way of conveying new findings in biomedical research is scientific publication. Extraction of protein-protein interactions (PPIs) reported in scientific publications is one of the core topics of text mining in the life sciences. Recently, a new class of such methods has been proposed - convolution kernels that identify PPIs using deep parses of sentences. However, comparing published results of different PPI extraction methods is impossible due to the use of different evaluation corpora, different evaluation metrics, different tuning procedures, etc. In this paper, we study whether the reported performance metrics are robust across different corpora and learning settings and whether the use of deep parsing actually leads to an increase in extraction quality. Our ultimate goal is to identify the one method that performs best in real-life scenarios, where information extraction is performed on unseen text and not on specifically prepared evaluation data. We performed a comprehensive benchmarking of nine different methods for PPI extraction that use convolution kernels on rich linguistic information. Methods were evaluated on five different public corpora using cross-validation, cross-learning, and cross-corpus evaluation. Our study confirms that kernels using dependency trees generally outperform kernels based on syntax trees. However, our study also shows that only the best kernel methods can compete with a simple rule-based approach when the evaluation prevents information leakage between training and test corpora. Our results further reveal that the F-score of many approaches drops significantly if no corpus-specific parameter optimization is applied and that methods reaching a good AUC score often perform much worse in terms of F-score. We conclude that for most kernels no sensible estimation of PPI extraction performance on new text is possible, given the current heterogeneity in evaluation data. Nevertheless, our study

  10. Integral equations with contrasting kernels

    Directory of Open Access Journals (Sweden)

    Theodore Burton

    2008-01-01

    Full Text Available In this paper we study integral equations of the form $x(t)=a(t)-\int^t_0 C(t,s)x(s)\,ds$ with sharply contrasting kernels typified by $C^*(t,s)=\ln(e+(t-s))$ and $D^*(t,s)=[1+(t-s)]^{-1}$. The kernel assigns a weight to $x(s)$ and these kernels have exactly opposite effects of weighting. Each type is well represented in the literature. Our first project is to show that for $a\in L^2[0,\infty)$, solutions are largely indistinguishable regardless of which kernel is used. This is a surprise and it leads us to study the essential differences. In fact, those differences become large as the magnitude of $a(t)$ increases. The form of the kernel alone projects necessary conditions concerning the magnitude of $a(t)$ which could result in bounded solutions. Thus, the next project is to determine how close we can come to proving that the necessary conditions are also sufficient. The third project is to show that solutions will be bounded for given conditions on $C$ regardless of whether $a$ is chosen large or small; this is important in real-world problems since we would like to have $a(t)$ as the sum of a bounded, but badly behaved function, and a large well behaved function.
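
    To get a feel for the equation studied in this record, the following sketch solves a Volterra equation of this form numerically with the trapezoidal rule and applies it to the two kernels named in the abstract. The forcing function $a(t)=1/(1+t)$, the time horizon, and the step count are assumptions chosen only for illustration; the paper's results are analytical, not numerical.

```python
import numpy as np

def solve_volterra(a, C, T=10.0, n=1000):
    """Trapezoidal solution of x(t) = a(t) - int_0^t C(t, s) x(s) ds on [0, T]."""
    t = np.linspace(0.0, T, n + 1)
    h = t[1] - t[0]
    x = np.empty(n + 1)
    x[0] = a(t[0])                        # the integral vanishes at t = 0
    for i in range(1, n + 1):
        w = np.full(i, h)
        w[0] = 0.5 * h                    # trapezoid end weight at s = 0
        known = np.dot(w * C(t[i], t[:i]), x[:i])
        # The unknown x[i] enters through the end weight at s = t[i]; solve for it.
        x[i] = (a(t[i]) - known) / (1.0 + 0.5 * h * C(t[i], t[i]))
    return t, x

def a(t):
    return 1.0 / (1.0 + t)                # assumed forcing function in L^2[0, inf)

def C_log(t, s):
    return np.log(np.e + (t - s))         # kernel growing with t - s

def D_inv(t, s):
    return 1.0 / (1.0 + (t - s))          # kernel decaying with t - s

t, x_log = solve_volterra(a, C_log)
_, x_inv = solve_volterra(a, D_inv)
```

    Plotting `x_log` and `x_inv` against `t` gives a quick visual comparison of the two solutions for a chosen $a(t)$.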

  11. Kernel methods in orthogonalization of multi- and hypervariate data

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg

    2009-01-01

    A kernel version of maximum autocorrelation factor (MAF) analysis is described very briefly and applied to change detection in remotely sensed hyperspectral image (HyMap) data. The kernel version is based on a dual formulation also termed Q-mode analysis in which the data enter into the analysis via inner products in the Gram matrix only. In the kernel version the inner products are replaced by inner products between nonlinear mappings into higher dimensional feature space of the original data. Via kernel substitution also known as the kernel trick these inner products between the mappings are in turn replaced by a kernel function and all quantities needed in the analysis are expressed in terms of this kernel function. This means that we need not know the nonlinear mappings explicitly. Kernel PCA and MAF analysis handle nonlinearities by implicitly transforming data into high (even infinite...

  12. Kernel based subspace projection of near infrared hyperspectral images of maize kernels

    DEFF Research Database (Denmark)

    Larsen, Rasmus; Arngren, Morten; Hansen, Per Waaben

    2009-01-01

    In this paper we present an exploratory analysis of hyperspectral 900-1700 nm images of maize kernels. The imaging device is a line scanning hyperspectral camera using a broadband NIR illumination. In order to explore the hyperspectral data we compare a series of subspace projection methods including principal component analysis and maximum autocorrelation factor analysis. The latter utilizes the fact that interesting phenomena in images exhibit spatial autocorrelation. However, linear projections often fail to grasp the underlying variability on the data. Therefore we propose to use so... ...tor transform outperform the linear methods as well as kernel principal components in producing interesting projections of the data.

  13. Kernel-density estimation and approximate Bayesian computation for flexible epidemiological model fitting in Python.

    Science.gov (United States)

    Irvine, Michael A; Hollingsworth, T Déirdre

    2018-05-26

    Fitting complex models to epidemiological data is a challenging problem: methodologies can be inaccessible to all but specialists, there may be challenges in adequately describing uncertainty in model fitting, the complex models may take a long time to run, and it can be difficult to fully capture the heterogeneity in the data. We develop an adaptive approximate Bayesian computation scheme to fit a variety of epidemiologically relevant data with minimal hyper-parameter tuning by using an adaptive tolerance scheme. We implement a novel kernel density estimation scheme to capture both dispersed and multi-dimensional data, and directly compare this technique to standard Bayesian approaches. We then apply the procedure to a complex individual-based simulation of lymphatic filariasis, a human parasitic disease. The procedure and examples are released alongside this article as an open access library, with examples to aid researchers to rapidly fit models to data. This demonstrates that an adaptive ABC scheme with a general summary and distance metric is capable of performing model fitting for a variety of epidemiological data. It also does not require significant theoretical background to use and can be made accessible to the diverse epidemiological research community. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.

  14. Sparse Event Modeling with Hierarchical Bayesian Kernel Methods

    Science.gov (United States)

    2016-01-05

    The research objective of this proposal was to develop a predictive Bayesian kernel approach to model count data based on...several predictive variables. Such an approach, which we refer to as the Poisson Bayesian kernel model, is able to model the rate of occurrence of... kernel methods made use of: (i) the Bayesian property of improving predictive accuracy as data are dynamically obtained, and (ii) the kernel function

  15. The Classification of Diabetes Mellitus Using Kernel k-means

    Science.gov (United States)

    Alamsyah, M.; Nafisah, Z.; Prayitno, E.; Afida, A. M.; Imah, E. M.

    2018-01-01

    Diabetes mellitus is a metabolic disorder characterized by chronic hyperglycemia. Automatic detection of diabetes mellitus is still challenging. This study detected diabetes mellitus using the kernel k-means algorithm, which was developed from the k-means algorithm. Kernel k-means uses kernel learning that can handle data that are not linearly separable, which is where it differs from ordinary k-means. The performance of kernel k-means in detecting diabetes mellitus is also compared with the SOM algorithm. The experimental results show that kernel k-means performs well, and much better than SOM.
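
    The difference from ordinary k-means described above comes down to computing distances to cluster means in feature space using only kernel values. A minimal sketch of that assignment loop follows; the function name, the optional `labels` argument for a fixed initialisation, and the toy data are assumptions for illustration, not the study's implementation.

```python
import numpy as np

def kernel_kmeans(K, k, labels=None, n_iter=50, seed=0):
    """Cluster n samples given their n x n kernel matrix K; returns integer labels."""
    n = K.shape[0]
    if labels is None:
        labels = np.random.default_rng(seed).integers(k, size=n)
    labels = np.asarray(labels).copy()
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)    # empty clusters keep infinite distance
        for c in range(k):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue
            # ||phi(x_i) - mu_c||^2 = K_ii - (2/m) sum_j K_ij + (1/m^2) sum_jl K_jl
            dist[:, c] = (diag - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m**2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                         # assignments stopped changing
        labels = new_labels
    return labels

# Toy data: two well-separated groups on a line, RBF kernel with unit width.
x = np.array([0.0, 0.1, 0.2, 10.0, 10.1, 10.2])
K = np.exp(-(x[:, None] - x[None, :]) ** 2)
result = kernel_kmeans(K, 2, labels=[0, 0, 1, 1, 1, 1])
```

    Like ordinary k-means, this loop only finds a local optimum, so in practice several random initialisations are usually tried.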

  16. Evaluating the Application of Tissue-Specific Dose Kernels Instead of Water Dose Kernels in Internal Dosimetry: A Monte Carlo Study

    NARCIS (Netherlands)

    Moghadam, Maryam Khazaee; Asl, Alireza Kamali; Geramifar, Parham; Zaidi, Habib

    2016-01-01

    Purpose: The aim of this work is to evaluate the application of tissue-specific dose kernels instead of water dose kernels to improve the accuracy of patient-specific dosimetry by taking tissue heterogeneities into consideration. Materials and Methods: Tissue-specific dose point kernels (DPKs) and

  17. Parsimonious Wavelet Kernel Extreme Learning Machine

    Directory of Open Access Journals (Sweden)

    Wang Qin

    2015-11-01

    Full Text Available In this study, a parsimonious scheme for wavelet kernel extreme learning machine (named PWKELM) was introduced by combining wavelet theory and a parsimonious algorithm into kernel extreme learning machine (KELM). In the wavelet analysis, bases that were localized in time and frequency to represent various signals effectively were used. Wavelet kernel extreme learning machine (WELM) maximized its capability to capture the essential features in “frequency-rich” signals. The proposed parsimonious algorithm also incorporated significant wavelet kernel functions via iteration by virtue of the Householder matrix, thus producing a sparse solution that eased the computational burden and improved numerical stability. The experimental results achieved from the synthetic dataset and a gas furnace instance demonstrated that the proposed PWKELM is efficient and feasible in terms of improving generalization accuracy and real time performance.

  18. Estimating Mutual Information by Local Gaussian Approximation

    Science.gov (United States)

    2015-07-13

    proposed a variety of methods to overcome the bias, such as the reflection method (Schuster, 1985), (Silverman, 1986); the boundary kernel method...Stephen Marron and David Ruppert. Transformations to reduce boundary bias in kernel density estimation. Journal of the Royal Statistical Society. Series B...estimation with applications to machine learning on distributions. In Proceedings of Uncertainty in Artificial Intelligence (UAI), 2011. David N Reshef

  19. Designing realised kernels to measure the ex-post variation of equity prices in the presence of noise

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Hansen, Peter Reinhard; Lunde, Asger

    This paper shows how to use realised kernels to carry out efficient feasible inference on the ex-post variation of underlying equity prices in the presence of simple models of market frictions. The issue is subtle with only estimators which have symmetric weights delivering consistent estimators with mixed Gaussian limit theorems. The weights can be chosen to achieve the best possible rate of convergence and to have an asymptotic variance which is close to that of the maximum likelihood estimator in the parametric version of this problem. Realised kernels can also be selected to (i) be analysed using endogenously spaced data such as that in databases on transactions, (ii) allow for market frictions which are endogenous, (iii) allow for temporally dependent noise. The finite sample performance of our estimators is studied using simulation, while empirical work illustrates their use in practice.
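
    A realised kernel can be thought of as the realised variance plus symmetrically weighted realised autocovariances of high-frequency returns. The sketch below shows one common form of that construction. The Parzen weight function, the `h / (H + 1)` weighting convention, and the function names are assumptions for illustration; the paper's own estimator may index or choose its weights differently.

```python
import numpy as np

def parzen(u):
    """Parzen weight function on [0, 1]; zero outside."""
    u = abs(u)
    if u <= 0.5:
        return 1.0 - 6.0 * u**2 + 6.0 * u**3
    if u <= 1.0:
        return 2.0 * (1.0 - u) ** 3
    return 0.0

def realised_kernel(returns, H, weight=parzen):
    """Realised variance plus weighted realised autocovariances up to lag H."""
    x = np.asarray(returns, float)

    def gamma(h):
        # h-th realised autocovariance: sum_j x_j * x_{j-h}
        return float(np.dot(x[h:], x[:len(x) - h]))

    rk = gamma(0)                         # lag 0 is the plain realised variance
    for h in range(1, H + 1):
        # Symmetric weights: lags +h and -h get the same weight, and
        # gamma(h) equals gamma(-h), hence the factor 2.
        rk += 2.0 * weight(h / (H + 1)) * gamma(h)
    return rk

toy_returns = [1.0, -1.0, 1.0, -1.0]      # strongly negatively autocorrelated toy series
rv = realised_kernel(toy_returns, H=0)
rk1 = realised_kernel(toy_returns, H=1)
```

    With `H=0` the estimator reduces to realised variance; adding lags corrects for the autocorrelation that market microstructure noise induces in observed returns.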

  20. Network structure exploration via Bayesian nonparametric models

    International Nuclear Information System (INIS)

    Chen, Y; Wang, X L; Xiang, X; Tang, B Z; Bu, J Z

    2015-01-01

    Complex networks provide a powerful mathematical representation of complex systems in nature and society. To understand complex networks, it is crucial to explore their internal structures, also called structural regularities. The task of network structure exploration is to determine how many groups there are in a complex network and how to group the nodes of the network. Most existing structure exploration methods need to specify either a group number or a certain type of structure when they are applied to a network. In the real world, however, the group number and also the certain type of structure that a network has are usually unknown in advance. To explore structural regularities in complex networks automatically, without any prior knowledge of the group number or the certain type of structure, we extend a probabilistic mixture model that can handle networks with any type of structure but needs to specify a group number using Bayesian nonparametric theory. We also propose a novel Bayesian nonparametric model, called the Bayesian nonparametric mixture (BNPM) model. Experiments conducted on a large number of networks with different structures show that the BNPM model is able to explore structural regularities in networks automatically with a stable, state-of-the-art performance. (paper)

  1. Bioprocess iterative batch-to-batch optimization based on hybrid parametric/nonparametric models.

    Science.gov (United States)

    Teixeira, Ana P; Clemente, João J; Cunha, António E; Carrondo, Manuel J T; Oliveira, Rui

    2006-01-01

    This paper presents a novel method for iterative batch-to-batch dynamic optimization of bioprocesses. The relationship between process performance and control inputs is established by means of hybrid grey-box models combining parametric and nonparametric structures. The bioreactor dynamics are defined by material balance equations, whereas the cell population subsystem is represented by an adjustable mixture of nonparametric and parametric models. Thus optimizations are possible without detailed mechanistic knowledge concerning the biological system. A clustering technique is used to supervise the reliability of the nonparametric subsystem during the optimization. Whenever the nonparametric outputs are unreliable, the objective function is penalized. The technique was evaluated with three simulation case studies. The overall results suggest that the convergence to the optimal process performance may be achieved after a small number of batches. The model unreliability risk constraint along with sampling scheduling are crucial to minimize the experimental effort required to attain a given process performance. In general terms, it may be concluded that the proposed method broadens the application of the hybrid parametric/nonparametric modeling technique to "newer" processes with higher potential for optimization.

  2. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin

    2017-01-19

    In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100

  3. Testing discontinuities in nonparametric regression

    KAUST Repository

    Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun

    2017-01-01

    In nonparametric regression, it is often necessary to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100

  4. Difference between standard and quasi-conformal BFKL kernels

    International Nuclear Information System (INIS)

    Fadin, V.S.; Fiore, R.; Papa, A.

    2012-01-01

    As was recently shown, the colour singlet BFKL kernel, taken in Möbius representation in the space of impact parameters, can be written in quasi-conformal shape, which is unbelievably simple compared with the conventional form of the BFKL kernel in momentum space. It was also proved that the total kernel is completely defined by its Möbius representation. In this paper we calculated the difference between standard and quasi-conformal BFKL kernels in momentum space and discovered that it is rather simple. Therefore we come to the conclusion that the simplicity of the quasi-conformal kernel is caused mainly by using the impact parameter space.

  5. Quantal Response: Nonparametric Modeling

    Science.gov (United States)

    2017-01-01

    capture the behavior of observed phenomena. Higher-order polynomial and finite-dimensional spline basis models allow for more complicated responses as the...flexibility as these are nonparametric (not constrained to any particular functional form). These should be useful in identifying nonstandard behavior via... deviance ∆ = −2 log(Lreduced/Lfull) is defined in terms of the likelihood function L. For normal error, Lfull = 1, and based on Eq. A-2, we have log

  6. A laser optical method for detecting corn kernel defects

    Energy Technology Data Exchange (ETDEWEB)

    Gunasekaran, S.; Paulsen, M. R.; Shove, G. C.

    1984-01-01

    An opto-electronic instrument was developed to examine individual corn kernels and detect various kernel defects according to reflectance differences. A low power helium-neon (He-Ne) laser (632.8 nm, red light) was used as the light source in the instrument. Reflectance from good and defective parts of corn kernel surfaces differed by approximately 40%. Broken, chipped, and starch-cracked kernels were detected with nearly 100% accuracy; while surface-split kernels were detected with about 80% accuracy. (author)

  7. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    Science.gov (United States)

    Park, Saerom; Lee, Jaewook; Son, Youngdoo

    2016-01-01

    Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models (neural networks, Bayesian neural network, Gaussian process, and support vector regression) to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single-transaction data from the US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model, the I-star model, in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving prediction performance.

  8. Kernel maximum autocorrelation factor and minimum noise fraction transformations

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg

    2010-01-01

    in hyperspectral HyMap scanner data covering a small agricultural area, and 3) maize kernel inspection. In the cases shown, the kernel MAF/MNF transformation performs better than its linear counterpart as well as linear and kernel PCA. The leading kernel MAF/MNF variates seem to possess the ability to adapt...

  9. Testing for constant nonparametric effects in general semiparametric regression models with interactions

    KAUST Repository

    Wei, Jiawei; Carroll, Raymond J.; Maity, Arnab

    2011-01-01

    We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work

  10. Identification of Fusarium damaged wheat kernels using image analysis

    Directory of Open Access Journals (Sweden)

    Ondřej Jirsa

    2011-01-01

    Full Text Available Visual evaluation of kernels damaged by Fusarium spp. pathogens is labour intensive and, due to a subjective approach, can lead to inconsistencies. Digital imaging technology combined with appropriate statistical methods can provide much faster and more accurate evaluation of the proportion of visually scabby kernels. The aim of the present study was to develop a discrimination model to identify wheat kernels infected by Fusarium spp. using digital image analysis and statistical methods. Winter wheat kernels from field experiments were evaluated visually as healthy or damaged. Deoxynivalenol (DON) content was determined in individual kernels using an ELISA method. Images of individual kernels were produced using a digital camera on a dark background. Colour and shape descriptors were obtained by image analysis from the area representing the kernel. Healthy and damaged kernels differed significantly in DON content and kernel weight. Various combinations of individual shape and colour descriptors were examined during the development of the model using linear discriminant analysis. In addition to basic descriptors of the RGB colour model (red, green, blue), very good classification was also obtained using hue from the HSL colour model (hue, saturation, luminance). The accuracy of classification using the developed discrimination model based on RGBH descriptors was 85 %. The shape descriptors themselves were not specific enough to distinguish individual kernels.
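
    The RGBH descriptors mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' pipeline: the function name `rgbh_descriptor`, the use of mean channel values, and taking hue from the mean colour are all assumptions made for illustration. Only the standard-library `colorsys` module is used.

```python
import colorsys


def rgbh_descriptor(pixels):
    """Return (mean R, mean G, mean B, hue) for a list of 8-bit RGB pixels.

    Hypothetical descriptor: hue is computed from the mean colour with the
    HLS model and lies in [0, 1). The pixel list stands in for the image
    area that represents one kernel.
    """
    n = len(pixels)
    mean_r = sum(p[0] for p in pixels) / n
    mean_g = sum(p[1] for p in pixels) / n
    mean_b = sum(p[2] for p in pixels) / n
    # colorsys expects channel values scaled to [0, 1] and returns (h, l, s).
    hue, _lightness, _saturation = colorsys.rgb_to_hls(
        mean_r / 255.0, mean_g / 255.0, mean_b / 255.0
    )
    return (mean_r, mean_g, mean_b, hue)
```

    A vector like this would be one input row for a classifier such as linear discriminant analysis; the classification step itself is not sketched here.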

  11. Application of the weighted spline estimator (APLIKASI SPLINE ESTIMATOR TERBOBOT)

    Directory of Open Access Journals (Sweden)

    I Nyoman Budiantara

    2001-01-01

    Full Text Available We consider the nonparametric regression model Z_j = X(t_j) + e_j, j = 1,2,…,n, where X(t_j) is the regression curve. The random errors e_j are independently normally distributed with zero mean and variance σ^2/b_j, b_j > 0. The estimate of X is obtained by minimizing a weighted least squares criterion. The solution of this optimization problem is a weighted polynomial spline. Further, we give an application of the weighted spline estimator in nonparametric regression. Abstract in Bahasa Indonesia (translated): Given the nonparametric regression model Z_j = X(t_j) + e_j, j = 1,2,…,n, with X(t_j) the regression curve and e_j random errors assumed to be normally distributed with mean zero and variance σ^2/b_j, b_j > 0. The estimate of the regression curve X that minimizes a weighted penalized least squares criterion is a weighted natural polynomial spline estimator. An application of the weighted spline estimator in nonparametric regression is then given. Keywords: weighted spline, nonparametric regression, penalized least squares.

  12. Unified heat kernel regression for diffusion, kernel smoothing and wavelets on manifolds and its application to mandible growth modeling in CT images.

    Science.gov (United States)

    Chung, Moo K; Qiu, Anqi; Seo, Seongho; Vorperian, Houri K

    2015-05-01

    We present a novel kernel regression framework for smoothing scalar surface data using the Laplace-Beltrami eigenfunctions. Starting with the heat kernel constructed from the eigenfunctions, we formulate a new bivariate kernel regression framework as a weighted eigenfunction expansion with the heat kernel as the weights. The new kernel method is mathematically equivalent to isotropic heat diffusion, kernel smoothing and recently popular diffusion wavelets. The numerical implementation is validated on a unit sphere using spherical harmonics. As an illustration, the method is applied to characterize the localized growth pattern of mandible surfaces obtained in CT images between ages 0 and 20 by regressing the length of displacement vectors with respect to a surface template. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Digital signal processing with kernel methods

    CERN Document Server

    Rojo-Alvarez, José Luis; Muñoz-Marí, Jordi; Camps-Valls, Gustavo

    2018-01-01

    A realistic and comprehensive review of joint approaches to machine learning and signal processing algorithms, with application to communications, multimedia, and biomedical engineering systems. Digital Signal Processing with Kernel Methods reviews the milestones in the mixing of classical digital signal processing models and advanced kernel machines statistical learning tools. It explains the fundamental concepts from both fields of machine learning and signal processing so that readers can quickly get up to speed in order to begin developing the concepts and application software in their own research. Digital Signal Processing with Kernel Methods provides a comprehensive overview of kernel methods in signal processing, without restriction to any application field. It also offers example applications and detailed benchmarking experiments with real and synthetic datasets throughout. Readers can find further worked examples with Matlab source code on a website developed by the authors. * Presents the necess...

  14. Windows Vista Kernel-Mode: Functions, Security Enhancements and Flaws

    Directory of Open Access Journals (Sweden)

    Mohammed D. ABDULMALIK

    2008-06-01

    Full Text Available Microsoft has made substantial enhancements to the kernel of the Microsoft Windows Vista operating system. Kernel improvements are significant because the kernel provides low-level operating system functions, including thread scheduling, interrupt and exception dispatching, multiprocessor synchronization, and a set of routines and basic objects. This paper describes some of the kernel security enhancements for the 64-bit edition of Windows Vista. We also point out some areas of weakness (flaws) that can be attacked maliciously, leading to compromise of the kernel.

  15. Generalization Performance of Regularized Ranking With Multiscale Kernels.

    Science.gov (United States)

    Zhou, Yicong; Chen, Hong; Lan, Rushi; Pan, Zhibin

    2016-05-01

    The regularized kernel method for the ranking problem has attracted increasing attention in machine learning. Previous regularized ranking algorithms are usually based on reproducing kernel Hilbert spaces with a single kernel. In this paper, we go beyond this framework by investigating the generalization performance of regularized ranking with multiscale kernels. A novel ranking algorithm with multiscale kernels is proposed and its representer theorem is proved. We establish an upper bound on the generalization error in terms of the complexity of the hypothesis spaces. It shows that the multiscale ranking algorithm can achieve satisfactory learning rates under mild conditions. Experiments demonstrate the effectiveness of the proposed method for drug discovery and recommendation tasks.

  16. Multineuron spike train analysis with R-convolution linear combination kernel.

    Science.gov (United States)

    Tezuka, Taro

    2018-06-01

    A spike train kernel provides an effective way of decoding information represented by a spike train. Some spike train kernels have been extended to multineuron spike trains, which are simultaneously recorded spike trains obtained from multiple neurons. However, most of these multineuron extensions were carried out in a kernel-specific manner. In this paper, a general framework is proposed for extending any single-neuron spike train kernel to multineuron spike trains, based on the R-convolution kernel. Special subclasses of the proposed R-convolution linear combination kernel are explored. These subclasses have a smaller number of parameters and make optimization tractable when the size of data is limited. The proposed kernel was evaluated using Gaussian process regression for multineuron spike trains recorded from an animal brain. It was compared with the sum kernel and the population Spikernel, which are existing ways of decoding multineuron spike trains using kernels. The results showed that the proposed approach performs better than these kernels and also other commonly used neural decoding methods. Copyright © 2018 Elsevier Ltd. All rights reserved.

  17. Semi-Nonparametric Estimation and Misspecification Testing of Diffusion Models

    DEFF Research Database (Denmark)

    Kristensen, Dennis

    of the estimators and tests under the null are derived, and the power properties are analyzed by considering contiguous alternatives. Tests directly comparing the drift and diffusion estimators under the relevant null and alternative are also analyzed. Markov Bootstrap versions of the test statistics are proposed...... to improve on the finite-sample approximations. The finite sample properties of the estimators are examined in a simulation study....

  18. An analysis of 1-D smoothed particle hydrodynamics kernels

    International Nuclear Information System (INIS)

    Fulk, D.A.; Quinn, D.W.

    1996-01-01

    In this paper, the smoothed particle hydrodynamics (SPH) kernel is analyzed, resulting in measures of merit for one-dimensional SPH. Various methods of obtaining an objective measure of the quality and accuracy of the SPH kernel are addressed. Since the kernel is the key element in the SPH methodology, this should be of primary concern to any user of SPH. The results of this work are two measures of merit, one for smooth data and one near shocks. The measure of merit for smooth data is shown to be quite accurate and a useful delineator of better and poorer kernels. The measure of merit for non-smooth data is not quite as accurate, but results indicate the kernel is much less important for these types of problems. In addition to the theory, 20 kernels are analyzed using the measure of merit demonstrating the general usefulness of the measure of merit and the individual kernels. In general, it was decided that bell-shaped kernels perform better than other shapes. 12 refs., 16 figs., 7 tabs
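
    As an illustration of what a bell-shaped 1-D SPH kernel looks like, the sketch below implements the widely used cubic B-spline kernel and checks its normalisation numerically. The abstract above does not say which 20 kernels were analysed, so the choice of this kernel is an assumption, and the function names are hypothetical.

```python
def cubic_spline_kernel_1d(x, h):
    """Cubic B-spline SPH kernel in 1-D with smoothing length h.

    The kernel is bell-shaped, symmetric, and has compact support |x| < 2h.
    """
    q = abs(x) / h
    norm = 2.0 / (3.0 * h)  # 1-D normalisation so the kernel integrates to 1
    if q < 1.0:
        return norm * (1.0 - 1.5 * q**2 + 0.75 * q**3)
    if q < 2.0:
        return norm * 0.25 * (2.0 - q) ** 3
    return 0.0


def integrate_kernel(h, n=4000):
    """Midpoint-rule integral of the kernel over its support [-2h, 2h]."""
    dx = 4.0 * h / n
    total = 0.0
    for i in range(n):
        x = -2.0 * h + (i + 0.5) * dx
        total += cubic_spline_kernel_1d(x, h)
    return total * dx
```

    Unit normalisation and compact support are the basic properties any candidate kernel would be expected to satisfy before a measure of merit is applied to it.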

  19. Putting Priors in Mixture Density Mercer Kernels

    Science.gov (United States)

    Srivastava, Ashok N.; Schumann, Johann; Fischer, Bernd

    2004-01-01

    This paper presents a new methodology for automatic knowledge driven data mining based on the theory of Mercer Kernels, which are highly nonlinear symmetric positive definite mappings from the original image space to a very high, possibly infinite dimensional feature space. We describe a new method called Mixture Density Mercer Kernels to learn the kernel function directly from data, rather than using predefined kernels. These data adaptive kernels can encode prior knowledge in the kernel using a Bayesian formulation, thus allowing for physical information to be encoded in the model. We compare the results with existing algorithms on data from the Sloan Digital Sky Survey (SDSS). The code for these experiments has been generated with the AUTOBAYES tool, which automatically generates efficient and documented C/C++ code from abstract statistical model specifications. The core of the system is a schema library which contains templates for learning and knowledge discovery algorithms like different versions of EM, or numeric optimization methods like conjugate gradient methods. The template instantiation is supported by symbolic-algebraic computations, which allows AUTOBAYES to find closed-form solutions and, where possible, to integrate them into the code. The results show that the Mixture Density Mercer-Kernel described here outperforms tree-based classification in distinguishing high-redshift galaxies from low-redshift galaxies by approximately 16% on test data, bagged trees by approximately 7%, and bagged trees built on a much larger sample of data by approximately 2%.

  20. NLO corrections to the Kernel of the BKP-equations

    Energy Technology Data Exchange (ETDEWEB)

    Bartels, J. [Hamburg Univ. (Germany). 2. Inst. fuer Theoretische Physik; Fadin, V.S. [Budker Institute of Nuclear Physics, Novosibirsk (Russian Federation); Novosibirskij Gosudarstvennyj Univ., Novosibirsk (Russian Federation); Lipatov, L.N. [Hamburg Univ. (Germany). 2. Inst. fuer Theoretische Physik; Petersburg Nuclear Physics Institute, Gatchina, St. Petersburg (Russian Federation); Vacca, G.P. [INFN, Sezione di Bologna (Italy)

    2012-10-02

    We present results for the NLO kernel of the BKP equations for composite states of three reggeized gluons in the Odderon channel, both in QCD and in N=4 SYM. The NLO kernel consists of the NLO BFKL kernel in the color octet representation and the connected 3→3 kernel, computed in the tree approximation.

  1. A Fast and Simple Graph Kernel for RDF

    NARCIS (Netherlands)

    de Vries, G.K.D.; de Rooij, S.

    2013-01-01

    In this paper we study a graph kernel for RDF based on constructing a tree for each instance and counting the number of paths in that tree. In our experiments this kernel shows comparable classification performance to the previously introduced intersection subtree kernel, but is significantly faster

  2. An SVM model with hybrid kernels for hydrological time series

    Science.gov (United States)

    Wang, C.; Wang, H.; Zhao, X.; Xie, Q.

    2017-12-01

    Support Vector Machine (SVM) models have been widely applied to the forecast of climate/weather and its impact on other environmental variables such as hydrologic response to climate/weather. When using SVM, the choice of the kernel function plays the key role. Conventional SVM models mostly use one single type of kernel function, e.g., radial basis kernel function. Provided that there are several featured kernel functions available, each having its own advantages and drawbacks, a combination of these kernel functions may give more flexibility and robustness to SVM approach, making it suitable for a wide range of application scenarios. This paper presents such a linear combination of radial basis kernel and polynomial kernel for the forecast of monthly flowrate in two gaging stations using SVM approach. The results indicate significant improvement in the accuracy of predicted series compared to the approach with either individual kernel function, thus demonstrating the feasibility and advantages of such hybrid kernel approach for SVM applications.
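
    The idea of a linear combination of a radial basis kernel and a polynomial kernel can be sketched as follows. The mixing weight, `gamma`, `degree`, and `coef0` values are illustrative assumptions; the abstract above does not report the parameters or the weighting scheme the authors used.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis kernel exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (<x, y> + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def hybrid_kernel(x, y, weight=0.7, gamma=0.5, degree=2, coef0=1.0):
    """Convex combination: weight * RBF + (1 - weight) * polynomial.

    With 0 <= weight <= 1 and coef0 >= 0 both terms are valid kernels, so
    their non-negative combination is again a valid kernel.
    """
    return (weight * rbf_kernel(x, y, gamma)
            + (1.0 - weight) * poly_kernel(x, y, degree, coef0))


def gram_matrix(points, kernel):
    """Kernel (Gram) matrix K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]
```

    A Gram matrix built this way could be handed to any SVM solver that accepts a precomputed kernel; the weight would normally be tuned by cross-validation alongside the other hyperparameters.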

  3. Kernel based eigenvalue-decomposition methods for analysing ham

    DEFF Research Database (Denmark)

    Christiansen, Asger Nyman; Nielsen, Allan Aasbjerg; Møller, Flemming

    2010-01-01

    methods, such as PCA, MAF or MNF. We therefore investigated the applicability of kernel based versions of these transformations. This meant implementing the kernel based methods and developing new theory, since kernel based MAF and MNF are not described in the literature yet. The traditional methods only...... have two factors that are useful for segmentation and none of them can be used to segment the two types of meat. The kernel based methods have a lot of useful factors and they are able to capture the subtle differences in the images. This is illustrated in Figure 1. You can see a comparison of the most...... useful factor of PCA and kernel based PCA respectively in Figure 2. The factor of the kernel based PCA turned out to be able to segment the two types of meat and in general that factor is much more distinct, compared to the traditional factor. After the orthogonal transformation a simple thresholding...

  4. Novel applications of the temporal kernel method: Historical and future radiative forcing

    Science.gov (United States)

    Portmann, R. W.; Larson, E.; Solomon, S.; Murphy, D. M.

    2017-12-01

    We present a new estimate of the historical radiative forcing derived from the observed global mean surface temperature and a model derived kernel function. Current estimates of historical radiative forcing are usually derived from climate models. Despite large variability in these models, the multi-model mean tends to do a reasonable job of representing the Earth system and climate. One method of diagnosing the transient radiative forcing in these models requires model output of top of the atmosphere radiative imbalance and global mean temperature anomaly. It is difficult to apply this method to historical observations due to the lack of TOA radiative measurements before CERES. We apply the temporal kernel method (TKM) of calculating radiative forcing to the historical global mean temperature anomaly. This novel approach is compared against the current regression based methods using model outputs and shown to produce consistent forcing estimates giving confidence in the forcing derived from the historical temperature record. The derived TKM radiative forcing provides an estimate of the forcing time series that the average climate model needs to produce the observed temperature record. This forcing time series is found to be in good overall agreement with previous estimates but includes significant differences that will be discussed. The historical anthropogenic aerosol forcing is estimated as a residual from the TKM and found to be consistent with earlier moderate forcing estimates. In addition, this method is applied to future temperature projections to estimate the radiative forcing required to achieve those temperature goals, such as those set in the Paris agreement.

  5. Reduced multiple empirical kernel learning machine.

    Science.gov (United States)

    Wang, Zhe; Lu, MingZhe; Gao, Daqi

    2015-02-01

    Multiple kernel learning (MKL) has been shown to be flexible and effective in depicting heterogeneous data sources, since MKL can introduce multiple kernels rather than a single fixed kernel into applications. However, MKL incurs a high time and space complexity compared with single kernel learning, which is undesirable in real-world applications. Meanwhile, it is known that the kernel mapping of MKL generally takes two forms, implicit kernel mapping and empirical kernel mapping (EKM), of which the latter has attracted less attention. In this paper, we focus on MKL with the EKM and propose a reduced multiple empirical kernel learning machine, named RMEKLM for short. To the best of our knowledge, it is the first to reduce both the time and space complexity of MKL with EKM. Unlike existing MKL, the proposed RMEKLM adopts the Gauss elimination technique to extract a set of feature vectors, and it is validated that doing so does not lose much information of the original feature space. RMEKLM then uses the extracted feature vectors to span a reduced orthonormal subspace of the feature space, which is visualized in terms of its geometric structure. It can be demonstrated that the spanned subspace is isomorphic to the original feature space, which means that the dot product of two vectors in the original feature space is equal to that of the two corresponding vectors in the generated orthonormal subspace. More importantly, the proposed RMEKLM brings simpler computation and needs less storage space, especially in the testing phase. Finally, the experimental results show that RMEKLM achieves efficient and effective performance in terms of both complexity and classification. The contributions of this paper are as follows: (1) by mapping the input space into an orthonormal subspace, the geometry of the generated subspace is visualized; (2) this paper is the first to reduce both the time and space complexity of the EKM-based MKL; (3

  6. Kernel principal component analysis for change detection

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg; Morton, J.C.

    2008-01-01

    region acquired at two different time points. If change over time does not dominate the scene, the projection of the original two bands onto the second eigenvector will show change over time. In this paper a kernel version of PCA is used to carry out the analysis. Unlike ordinary PCA, kernel PCA...... with a Gaussian kernel successfully finds the change observations in a case where nonlinearities are introduced artificially....

  7. Application of nonparametric statistic method for DNBR limit calculation

    International Nuclear Information System (INIS)

    Dong Bo; Kuang Bo; Zhu Xuenong

    2013-01-01

    Background: The nonparametric statistical method is a kind of statistical inference method that does not depend on a particular distribution; it calculates tolerance limits at a given probability level and confidence level through sampling. The DNBR margin is an important parameter in NPP design, which represents the safety level of the NPP. Purpose and Methods: This paper uses a nonparametric statistical method based on the Wilks formula and the VIPER-01 subchannel analysis code to calculate the DNBR design limit (DL) of a 300 MW NPP (nuclear power plant) during a complete loss of flow accident, and compares it with the DNBR DL obtained by means of ITDP to determine the DNBR margin. Results: The results indicate that this method can gain 2.96% more DNBR margin than that obtained by the ITDP methodology. Conclusions: Because of the reduction of conservatism in the analysis process, the nonparametric statistical method can provide a greater DNBR margin, and the increase in DNBR margin benefits the upgrading of the core refuelling scheme. (authors)
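
    The Wilks formula referred to above fixes how many code runs are needed for a distribution-free tolerance limit. A minimal sketch for the first-order, one-sided case follows; the 95 % coverage and 95 % confidence defaults are a common convention and an assumption here, since the abstract does not state the levels used, and the function name is hypothetical.

```python
def wilks_min_samples(coverage=0.95, confidence=0.95):
    """Smallest sample size n for a first-order one-sided tolerance limit.

    With n independent runs, the extreme observed value bounds at least the
    `coverage` quantile with probability 1 - coverage**n. The function
    returns the smallest n for which that probability reaches `confidence`.
    """
    n = 1
    while 1.0 - coverage**n < confidence:
        n += 1
    return n


print(wilks_min_samples())            # 59
print(wilks_min_samples(0.95, 0.99))  # 90
```

    In a DNBR application the smallest DNBR value among those n randomly sampled subchannel calculations would serve as the one-sided lower tolerance limit.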

  8. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    Directory of Open Access Journals (Sweden)

    Saerom Park

    Full Text Available Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models (neural networks, Bayesian neural network, Gaussian process, and support vector regression) to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single-transaction data from the US stock market via Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model, such as the I-star model, in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives for reducing transaction costs by considerably improving prediction performance.

  9. Bootstrap-Based Inference for Cube Root Consistent Estimators

    DEFF Research Database (Denmark)

    Cattaneo, Matias D.; Jansson, Michael; Nagasawa, Kenichi

    This note proposes a consistent bootstrap-based distributional approximation for cube root consistent estimators such as the maximum score estimator of Manski (1975) and the isotonic density estimator of Grenander (1956). In both cases, the standard nonparametric bootstrap is known...... to be inconsistent. Our method restores consistency of the nonparametric bootstrap by altering the shape of the criterion function defining the estimator whose distribution we seek to approximate. This modification leads to a generic and easy-to-implement resampling method for inference that is conceptually distinct...... from other available distributional approximations based on some form of modified bootstrap. We offer simulation evidence showcasing the performance of our inference method in finite samples. An extension of our methodology to general M-estimation problems is also discussed....

  10. Enhanced gluten properties in soft kernel durum wheat

    Science.gov (United States)

    Soft kernel durum wheat is a relatively recent development (Morris et al. 2011 Crop Sci. 51:114). The soft kernel trait exerts profound effects on kernel texture, flour milling including break flour yield, milling energy, and starch damage, and dough water absorption (DWA). With the caveat of reduce...

  11. 7 CFR 981.61 - Redetermination of kernel weight.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Redetermination of kernel weight. 981.61 Section 981... GROWN IN CALIFORNIA Order Regulating Handling Volume Regulation § 981.61 Redetermination of kernel weight. The Board, on the basis of reports by handlers, shall redetermine the kernel weight of almonds...

  12. Stable Kernel Representations as Nonlinear Left Coprime Factorizations

    NARCIS (Netherlands)

    Paice, A.D.B.; Schaft, A.J. van der

    1994-01-01

    A representation of nonlinear systems based on the idea of representing the input-output pairs of the system as elements of the kernel of a stable operator has been recently introduced. This has been denoted the kernel representation of the system. In this paper it is demonstrated that the kernel

  13. 7 CFR 981.60 - Determination of kernel weight.

    Science.gov (United States)

    2010-01-01

    ... 7 Agriculture 8 2010-01-01 2010-01-01 false Determination of kernel weight. 981.60 Section 981.60... Regulating Handling Volume Regulation § 981.60 Determination of kernel weight. (a) Almonds for which settlement is made on kernel weight. All lots of almonds, whether shelled or unshelled, for which settlement...

  14. End-use quality of soft kernel durum wheat

    Science.gov (United States)

    Kernel texture is a major determinant of end-use quality of wheat. Durum wheat has very hard kernels. We developed soft kernel durum wheat via Ph1b-mediated homoeologous recombination. The Hardness locus was transferred from Chinese Spring to Svevo durum wheat via back-crossing. ‘Soft Svevo’ had SKC...

  15. Long-distance wind-dispersal of spores in a fungal plant pathogen: estimation of anisotropic dispersal kernels from an extensive field experiment.

    Directory of Open Access Journals (Sweden)

    Adrien Rieux

    Full Text Available Given its biological significance, determining the dispersal kernel (i.e., the distribution of dispersal distances) of spore-producing pathogens is essential. Here, we report two field experiments designed to measure disease gradients caused by sexually- and asexually-produced spores of the wind-dispersed banana plant fungus Mycosphaerella fijiensis. Gradients were measured during a single generation and over 272 traps installed up to 1000 m along eight directions radiating from a traceable source of inoculum composed of fungicide-resistant strains. We fitted several kernels differing in the shape of their tail and tested for two types of anisotropy. Contrasting dispersal kernels were observed between the two types of spores. For sexual spores (ascospores), we characterized both a steep gradient in the first few metres in all directions and rare long-distance dispersal (LDD) events up to 1000 m from the source in two directions. A heavy-tailed kernel best fitted the disease gradient. Although ascospores distributed evenly in all directions, average dispersal distance was greater in two different directions without obvious correlation with wind patterns. For asexual spores (conidia), few dispersal events occurred outside of the source plot. A gradient up to 12.5 m from the source was observed in one direction only. Accordingly, a thin-tailed kernel best fitted the disease gradient, and anisotropy in both density and distance was correlated with averaged daily wind gust. We discuss the validity of our results as well as their implications in terms of disease diffusion and management strategy.

  16. Multi-sample nonparametric treatments comparison in medical ...

    African Journals Online (AJOL)

    Multi-sample nonparametric treatments comparison in medical follow-up study with unequal observation processes through simulation and bladder tumour case study. P. L. Tan, N.A. Ibrahim, M.B. Adam, J. Arasan ...

  17. Per-Sample Multiple Kernel Approach for Visual Concept Learning

    Directory of Open Access Journals (Sweden)

    Ling-Yu Duan

    2010-01-01

    Full Text Available Learning visual concepts from images is an important yet challenging problem in computer vision and multimedia research areas. Multiple kernel learning (MKL) methods have shown great advantages in visual concept learning. As a visual concept often exhibits great appearance variance, a canonical MKL approach may not generate satisfactory results when a uniform kernel combination is applied over the input space. In this paper, we propose a per-sample multiple kernel learning (PS-MKL) approach to take into account intraclass diversity for improving discrimination. PS-MKL determines sample-wise kernel weights according to kernel functions and training samples. Kernel weights as well as kernel-based classifiers are jointly learned. For efficient learning, PS-MKL employs a sample selection strategy. Extensive experiments are carried out over three benchmarking datasets of different characteristics including Caltech101, WikipediaMM, and Pascal VOC'07. PS-MKL has achieved encouraging performance, comparable to the state of the art, and has outperformed a canonical MKL.

  18. Per-Sample Multiple Kernel Approach for Visual Concept Learning

    Directory of Open Access Journals (Sweden)

    Tian Yonghong

    2010-01-01

    Full Text Available Learning visual concepts from images is an important yet challenging problem in computer vision and multimedia research areas. Multiple kernel learning (MKL) methods have shown great advantages in visual concept learning. As a visual concept often exhibits great appearance variance, a canonical MKL approach may not generate satisfactory results when a uniform kernel combination is applied over the input space. In this paper, we propose a per-sample multiple kernel learning (PS-MKL) approach to take into account intraclass diversity for improving discrimination. PS-MKL determines sample-wise kernel weights according to kernel functions and training samples. Kernel weights as well as kernel-based classifiers are jointly learned. For efficient learning, PS-MKL employs a sample selection strategy. Extensive experiments are carried out over three benchmarking datasets of different characteristics including Caltech101, WikipediaMM, and Pascal VOC'07. PS-MKL has achieved encouraging performance, comparable to the state of the art, and has outperformed a canonical MKL.

  19. Segmentation of 3D microPET images of the rat brain via the hybrid gaussian mixture method with kernel density estimation.

    Science.gov (United States)

    Chen, Tai-Been; Chen, Jyh-Cheng; Lu, Henry Horng-Shing

    2012-01-01

    Segmentation of positron emission tomography (PET) images is typically achieved using the K-Means method or other approaches. In preclinical and clinical applications, the K-Means method needs a prior estimation of parameters such as the number of clusters and appropriate initial values. This work segments microPET images using a hybrid method combining the Gaussian mixture model (GMM) with kernel density estimation. Segmentation is crucial to registration of disordered 2-deoxy-2-fluoro-D-glucose (FDG) accumulation locations with functional diagnosis and to estimating standardized uptake values (SUVs) of regions of interest (ROIs) in PET images. Therefore, simulation studies are conducted that apply spherical targets to evaluate segmentation accuracy based on Tanimoto's definition of similarity. The proposed method generates a higher degree of similarity than the K-Means method. The PET images of a rat brain are used to compare the segmented shape and area of the cerebral cortex by the K-Means method and the proposed method by volume rendering. The proposed method provides clearer and more detailed activity structures of an FDG accumulation location in the cerebral cortex than those by the K-Means method.
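
    Two building blocks named in the abstract above, kernel density estimation and Tanimoto similarity, can be sketched in a few lines. This is only an illustration: the Gaussian kernel, the fixed bandwidth, and the representation of a segmentation as a set of voxel indices are assumptions, and the hybrid GMM step of the paper is not reproduced.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate as a callable.

    The estimate is the average of Gaussian bumps of width `bandwidth`
    centred on each sample (for example, voxel intensities).
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def tanimoto(a, b):
    """Tanimoto similarity |A & B| / |A | B| between two sets of voxels."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # two empty segmentations are treated as identical
    return len(a & b) / len(union)
```

    In a segmentation study the density estimate could suggest how many intensity modes exist before a mixture model is fitted, while the Tanimoto score compares a segmented region against a known target.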

  20. A nonparametric mean-variance smoothing method to assess Arabidopsis cold stress transcriptional regulator CBF2 overexpression microarray data.

    Science.gov (United States)

    Hu, Pingsha; Maiti, Tapabrata

    2011-01-01

    Microarray is a powerful tool for genome-wide gene expression analysis. In microarray expression data, often mean and variance have certain relationships. We present a non-parametric mean-variance smoothing method (NPMVS) to analyze differentially expressed genes. In this method, a nonlinear smoothing curve is fitted to estimate the relationship between mean and variance. Inference is then made upon shrinkage estimation of posterior means assuming variances are known. Different methods have been applied to simulated datasets, in which a variety of mean and variance relationships were imposed. The simulation study showed that NPMVS outperformed the other two popular shrinkage estimation methods in some mean-variance relationships; and NPMVS was competitive with the two methods in other relationships. A real biological dataset, in which a cold stress transcription factor gene, CBF2, was overexpressed, has also been analyzed with the three methods. Gene ontology and cis-element analysis showed that NPMVS identified more cold and stress responsive genes than the other two methods did. The good performance of NPMVS is mainly due to its shrinkage estimation for both means and variances. In addition, NPMVS exploits a non-parametric regression between mean and variance, instead of assuming a specific parametric relationship between mean and variance. The source code written in R is available from the authors on request.

  1. Nonparametric regression using the concept of minimum energy

    International Nuclear Information System (INIS)

    Williams, Mike

    2011-01-01

    It has recently been shown that an unbinned distance-based statistic, the energy, can be used to construct an extremely powerful nonparametric multivariate two sample goodness-of-fit test. An extension to this method that makes it possible to perform nonparametric regression using multiple multivariate data sets is presented in this paper. The technique, which is based on the concept of minimizing the energy of the system, permits determination of parameters of interest without the need for parametric expressions of the parent distributions of the data sets. The application and performance of this new method is discussed in the context of some simple example analyses.
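
    As a rough illustration of an unbinned, distance-based two-sample statistic, the sketch below computes the Euclidean energy distance between two multivariate samples. This is an assumption made for illustration: the abstract does not give the weighting function used in the paper, and the regression step (minimizing the energy over parameters) is not shown.

```python
import math
from itertools import product


def mean_dist(xs, ys):
    """Average Euclidean distance over all pairs (x, y)."""
    return sum(math.dist(x, y) for x, y in product(xs, ys)) / (len(xs) * len(ys))


def energy_distance(xs, ys):
    """2 E|X - Y| - E|X - X'| - E|Y - Y'|, estimated from the two samples."""
    return 2.0 * mean_dist(xs, ys) - mean_dist(xs, xs) - mean_dist(ys, ys)


sample_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
sample_b = [(x + 5.0, y + 5.0) for x, y in sample_a]  # same shape, shifted

print(energy_distance(sample_a, sample_a))  # 0 for identical samples
print(energy_distance(sample_a, sample_b))  # positive for shifted samples
```

    The statistic needs no binning and works in any dimension, which is the property the abstract relies on.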

  2. Delimiting areas of endemism through kernel interpolation.

    Science.gov (United States)

    Oliveira, Ubirajara; Brescovit, Antonio D; Santos, Adalberto J

    2015-01-01

    We propose a new approach for identification of areas of endemism, the Geographical Interpolation of Endemism (GIE), based on kernel spatial interpolation. This method differs from others in being independent of grid cells. This new approach is based on estimating the overlap between the distribution of species through a kernel interpolation of centroids of species distribution and areas of influence defined from the distance between the centroid and the farthest point of occurrence of each species. We used this method to delimit areas of endemism of spiders from Brazil. To assess the effectiveness of GIE, we analyzed the same data using Parsimony Analysis of Endemism and NDM and compared the areas identified through each method. The analyses using GIE identified 101 areas of endemism of spiders in Brazil. GIE proved effective in identifying areas of endemism at multiple scales, with fuzzy edges and supported by more synendemic species than in the other methods. The areas of endemism identified with GIE were generally congruent with those identified for other taxonomic groups, suggesting that common processes can be responsible for the origin and maintenance of these biogeographic units.

  3. Delimiting areas of endemism through kernel interpolation.

    Directory of Open Access Journals (Sweden)

    Ubirajara Oliveira

    Full Text Available We propose a new approach for identification of areas of endemism, the Geographical Interpolation of Endemism (GIE), based on kernel spatial interpolation. This method differs from others in being independent of grid cells. This new approach is based on estimating the overlap between the distribution of species through a kernel interpolation of centroids of species distribution and areas of influence defined from the distance between the centroid and the farthest point of occurrence of each species. We used this method to delimit areas of endemism of spiders from Brazil. To assess the effectiveness of GIE, we analyzed the same data using Parsimony Analysis of Endemism and NDM and compared the areas identified through each method. The analyses using GIE identified 101 areas of endemism of spiders in Brazil. GIE proved effective in identifying areas of endemism at multiple scales, with fuzzy edges and supported by more synendemic species than in the other methods. The areas of endemism identified with GIE were generally congruent with those identified for other taxonomic groups, suggesting that common processes can be responsible for the origin and maintenance of these biogeographic units.

  4. Deep Restricted Kernel Machines Using Conjugate Feature Duality.

    Science.gov (United States)

    Suykens, Johan A K

    2017-08-01

    The aim of this letter is to propose a theory of deep restricted kernel machines offering new foundations for deep learning with kernel machines. From the viewpoint of deep learning, it is partially related to restricted Boltzmann machines, which are characterized by visible and hidden units in a bipartite graph without hidden-to-hidden connections and deep learning extensions as deep belief networks and deep Boltzmann machines. From the viewpoint of kernel machines, it includes least squares support vector machines for classification and regression, kernel principal component analysis (PCA), matrix singular value decomposition, and Parzen-type models. A key element is to first characterize these kernel machines in terms of so-called conjugate feature duality, yielding a representation with visible and hidden units. It is shown how this is related to the energy form in restricted Boltzmann machines, with continuous variables in a nonprobabilistic setting. In this new framework of so-called restricted kernel machine (RKM) representations, the dual variables correspond to hidden features. Deep RKM are obtained by coupling the RKMs. The method is illustrated for deep RKM, consisting of three levels with a least squares support vector machine regression level and two kernel PCA levels. In its primal form also deep feedforward neural networks can be trained within this framework.

  5. A kernel plus method for quantifying wind turbine performance upgrades

    KAUST Repository

    Lee, Giwhyun

    2014-04-21

    Power curves are commonly estimated using the binning method recommended by the International Electrotechnical Commission, which primarily incorporates wind speed information. When such power curves are used to quantify a turbine's upgrade, the results may not be accurate because many other environmental factors in addition to wind speed, such as temperature, air pressure, turbulence intensity, wind shear and humidity, all potentially affect the turbine's power output. Wind industry practitioners are aware of the need to filter out effects from environmental conditions. Toward that objective, we developed a kernel plus method that allows incorporation of multivariate environmental factors in a power curve model, thereby controlling the effects from environmental factors while comparing power outputs. We demonstrate that the kernel plus method can serve as a useful tool for quantifying a turbine's upgrade because it is sensitive to small and moderate changes caused by certain turbine upgrades. Although we demonstrate the utility of the kernel plus method in this specific application, the resulting method is a general, multivariate model that can connect other physical factors, as long as their measurements are available, with a turbine's power output, which may allow us to explore new physical properties associated with wind turbine performance. © 2014 John Wiley & Sons, Ltd.
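
    The idea of a power curve conditioned on several environmental factors can be sketched with a plain multivariate kernel regression. The code below is a generic Nadaraya-Watson estimator with a product Gaussian kernel, not the authors' kernel plus method; the variable choice (wind speed, temperature), the bandwidths, and the toy readings are all assumptions.

```python
import math


def gaussian(u: float) -> float:
    """Unnormalized Gaussian kernel; the normalizing constant cancels in the ratio."""
    return math.exp(-0.5 * u * u)


def nw_predict(x_query, xs, ys, bandwidths):
    """Nadaraya-Watson estimate of power at x_query from observations (xs, ys)."""
    weights = [
        math.prod(gaussian((q - xi) / h) for q, xi, h in zip(x_query, x, bandwidths))
        for x in xs
    ]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query point is too far from all observations")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Toy observations: (wind speed in m/s, temperature in deg C) -> power in kW.
inputs = [(5.0, 10.0), (6.0, 10.0), (5.0, 20.0), (6.0, 20.0)]
power = [400.0, 650.0, 380.0, 620.0]

# The query sits symmetrically between the four observations,
# so all weights are equal and the estimate is their plain average.
print(nw_predict((5.5, 15.0), inputs, power, bandwidths=(1.0, 5.0)))
```

    Comparing such estimates before and after an upgrade at the same environmental conditions is the kind of controlled comparison the abstract describes.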

  6. Improved modeling of clinical data with kernel methods.

    Science.gov (United States)

    Daemen, Anneleen; Timmerman, Dirk; Van den Bosch, Thierry; Bottomley, Cecilia; Kirk, Emma; Van Holsbeke, Caroline; Valentin, Lil; Bourne, Tom; De Moor, Bart

    2012-02-01

    Despite the rise of high-throughput technologies, clinical data such as age, gender and medical history guide clinical management for most diseases and examinations. To improve clinical management, available patient information should be fully exploited. This requires appropriate modeling of relevant parameters. When kernel methods are used, traditional kernel functions such as the linear kernel are often applied to the set of clinical parameters. These kernel functions, however, have disadvantages due to the specific characteristics of clinical data, which are a mix of variable types, each variable having its own range. We propose a new kernel function specifically adapted to the characteristics of clinical data. The clinical kernel function provides a better representation of patients' similarity by equalizing the influence of all variables and taking into account the range r of the variables. Moreover, it is robust with respect to changes in r. Incorporated in a least squares support vector machine, the new kernel function results in significantly improved diagnosis, prognosis and prediction of therapy response. This is illustrated on four clinical data sets within gynecology, with an average increase in test area under the ROC curve (AUC) of 0.023, 0.021, 0.122 and 0.019, respectively. Moreover, when combining clinical parameters and expression data in three case studies on breast cancer, results improved overall with use of the new kernel function and when considering both data types in a weighted fashion, with a larger weight assigned to the clinical parameters. The maximum increase in AUC with respect to a standard kernel function and/or unweighted data combination was 0.127, 0.042 and 0.118 for the three case studies. For clinical data consisting of variables of different types, the proposed kernel function, which takes into account the type and range of each variable, has been shown to be a better alternative for linear and non-linear classification problems.
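
    A kernel that equalizes the influence of variables and uses each variable's range r can be sketched as follows. The exact definition should be checked against the paper; this sketch assumes a per-variable similarity of (r - |a - b|) / r for continuous and ordinal variables, an equality indicator for nominal variables, and a plain average over variables. The patient records are invented.

```python
def clinical_kernel(x, z, ranges):
    """Average per-variable similarity of two patient records.

    ranges[i] is the range r (max - min) of a continuous or ordinal
    variable, or None when variable i is nominal.
    """
    total = 0.0
    for xi, zi, r in zip(x, z, ranges):
        if r is None:
            total += 1.0 if xi == zi else 0.0  # nominal: match or no match
        else:
            total += (r - abs(xi - zi)) / r    # scaled to [0, 1] by the range
    return total / len(ranges)


# Hypothetical records: (age in years, smoker, ordinal pain score 0..4).
ranges = (50.0, None, 4.0)
patient_a = (35.0, "yes", 2)
patient_b = (45.0, "no", 2)

print(clinical_kernel(patient_a, patient_a, ranges))  # 1.0
print(clinical_kernel(patient_a, patient_b, ranges))  # (0.8 + 0 + 1) / 3
```

    Because every variable contributes a value between 0 and 1, a variable with a wide numeric range such as age cannot dominate the similarity the way it would in a linear kernel.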

  7. Linear and kernel methods for multi- and hypervariate change detection

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg; Canty, Morton J.

    2010-01-01

    . Principal component analysis (PCA) as well as maximum autocorrelation factor (MAF) and minimum noise fraction (MNF) analyses of IR-MAD images, both linear and kernel-based (which are nonlinear), may further enhance change signals relative to no-change background. The kernel versions are based on a dual...... formulation, also termed Q-mode analysis, in which the data enter into the analysis via inner products in the Gram matrix only. In the kernel version the inner products of the original data are replaced by inner products between nonlinear mappings into higher dimensional feature space. Via kernel substitution......, also known as the kernel trick, these inner products between the mappings are in turn replaced by a kernel function and all quantities needed in the analysis are expressed in terms of the kernel function. This means that we need not know the nonlinear mappings explicitly. Kernel principal component...
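
    The kernel substitution described in this abstract can be made concrete with a short sketch: the data enter only through a Gram matrix of kernel values, which is then centered in feature space without ever forming the nonlinear mapping. The Gaussian kernel, its width, and the toy points are assumptions; the eigen-decomposition that would complete a kernel PCA is left out.

```python
import math


def rbf(x, y, sigma=1.0):
    """Gaussian kernel: stands in for an inner product in feature space."""
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * sigma ** 2))


def gram(points, kernel):
    """Gram matrix K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def center(K):
    """Center the mapped data in feature space using kernel values only.

    K is symmetric, so its column means equal its row means.
    """
    n = len(K)
    row_means = [sum(row) / n for row in K]
    grand_mean = sum(row_means) / n
    return [
        [K[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
K = gram(points, rbf)
Kc = center(K)
print([round(sum(row), 12) for row in Kc])  # each row of the centered matrix sums to ~0
```

    Everything a kernel PCA, MAF or MNF analysis needs can be expressed through matrices like `Kc`, which is why the mappings themselves never have to be known.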

  8. Kernel based orthogonalization for change detection in hyperspectral images

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg

    function and all quantities needed in the analysis are expressed in terms of this kernel function. This means that we need not know the nonlinear mappings explicitly. Kernel PCA and MNF analyses handle nonlinearities by implicitly transforming data into high (even infinite) dimensional feature space via...... analysis all 126 spectral bands of the HyMap are included. Changes on the ground are most likely due to harvest having taken place between the two acquisitions and solar effects (both solar elevation and azimuth have changed). Both types of kernel analysis emphasize change and unlike kernel PCA, kernel MNF...

  9. Mitigation of artifacts in rtm with migration kernel decomposition

    KAUST Repository

    Zhan, Ge; Schuster, Gerard T.

    2012-01-01

    The migration kernel for reverse-time migration (RTM) can be decomposed into four component kernels using Born scattering and migration theory. Each component kernel has a unique physical interpretation and can be interpreted differently

  10. Semi-Supervised Kernel PCA

    DEFF Research Database (Denmark)

    Walder, Christian; Henao, Ricardo; Mørup, Morten

    We present three generalisations of Kernel Principal Components Analysis (KPCA) which incorporate knowledge of the class labels of a subset of the data points. The first, MV-KPCA, penalises within class variances similar to Fisher discriminant analysis. The second, LSKPCA is a hybrid of least...... squares regression and kernel PCA. The final LR-KPCA is an iteratively reweighted version of the previous which achieves a sigmoid loss function on the labeled points. We provide a theoretical risk bound as well as illustrative experiments on real and toy data sets....

  11. Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.

    Science.gov (United States)

    Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F

    2013-04-01

    In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference making as few assumptions about restrictive parametric models as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates. Limited attention, however, has been directed to the following two aspects. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models by one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in big differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.

  12. 21 CFR 176.350 - Tamarind seed kernel powder.

    Science.gov (United States)

    2010-04-01

    ... 21 Food and Drugs 3 2010-04-01 2009-04-01 true Tamarind seed kernel powder. 176.350 Section 176... Substances for Use Only as Components of Paper and Paperboard § 176.350 Tamarind seed kernel powder. Tamarind seed kernel powder may be safely used as a component of articles intended for use in producing...

  13. Designing realized kernels to measure the ex post variation of equity prices in the presence of noise

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Hansen, P.R.; Lunde, Asger

    2008-01-01

    This paper shows how to use realized kernels to carry out efficient feasible inference on the ex post variation of underlying equity prices in the presence of simple models of market frictions. The weights can be chosen to achieve the best possible rate of convergence and to have an asymptotic variance which equals that of the maximum likelihood estimator in the parametric version of this problem. Realized kernels can also be selected to (i) be analyzed using endogenously spaced data such as that in data bases on transactions, (ii) allow for market frictions which are endogenous, and (iii) allow for temporally dependent noise. The finite sample performance of our estimators is studied using simulation, while empirical work illustrates their use in practice.
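
    A realized kernel is a weighted sum of autocovariances of high-frequency returns. The sketch below uses a simplified, non-flat-top form with Parzen weights; the weight function, the bandwidth H, and the toy returns are assumptions for illustration and not the specific design studied in the paper.

```python
def parzen(x: float) -> float:
    """Parzen weight function on [0, 1]; zero outside."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x ** 2 + 6.0 * x ** 3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0


def autocov(returns, h: int) -> float:
    """Realized autocovariance at lag h: sum over j of r[j] * r[j - |h|]."""
    h = abs(h)
    return sum(returns[j] * returns[j - h] for j in range(h, len(returns)))


def realized_kernel(returns, H: int) -> float:
    """Sum of kernel-weighted autocovariances for lags -H..H."""
    return sum(parzen(h / (H + 1)) * autocov(returns, h) for h in range(-H, H + 1))


# Toy intraday returns with the alternating pattern typical of bid-ask bounce.
returns = [0.010, -0.008, 0.009, -0.011, 0.012, -0.007, 0.008, -0.010]

print(realized_kernel(returns, 0))  # H = 0 reduces to realized variance
print(realized_kernel(returns, 3))  # negative autocovariances pull the estimate down
```

    With H = 0 only the lag-zero term survives, so the estimator is the ordinary realized variance; larger H brings in the autocovariances that offset noise-induced bias.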

  14. Examining the Feasibility and Utility of Estimating Partial Expected Value of Perfect Information (via a Nonparametric Approach) as Part of the Reimbursement Decision-Making Process in Ireland: Application to Drugs for Cancer.

    Science.gov (United States)

    McCullagh, Laura; Schmitz, Susanne; Barry, Michael; Walsh, Cathal

    2017-11-01

    In Ireland, all new drugs for which reimbursement by the healthcare payer is sought undergo a health technology assessment by the National Centre for Pharmacoeconomics. The National Centre for Pharmacoeconomics estimate expected value of perfect information but not partial expected value of perfect information (owing to computational expense associated with typical methodologies). The objective of this study was to examine the feasibility and utility of estimating partial expected value of perfect information via a computationally efficient, non-parametric regression approach. This was a retrospective analysis of evaluations on drugs for cancer that had been submitted to the National Centre for Pharmacoeconomics (January 2010 to December 2014 inclusive). Drugs were excluded if cost effective at the submitted price. Drugs were excluded if concerns existed regarding the validity of the applicants' submission or if cost-effectiveness model functionality did not allow required modifications to be made. For each included drug (n = 14), value of information was estimated at the final reimbursement price, at a threshold equivalent to the incremental cost-effectiveness ratio at that price. The expected value of perfect information was estimated from probabilistic analysis. Partial expected value of perfect information was estimated via a non-parametric approach. Input parameters with a population value at least €1 million were identified as potential targets for research. All partial estimates were determined within minutes. Thirty parameters (across nine models) each had a value of at least €1 million. These were categorised. Collectively, survival analysis parameters were valued at €19.32 million, health state utility parameters at €15.81 million and parameters associated with the cost of treating adverse effects at €6.64 million. Those associated with drug acquisition costs and with the cost of care were valued at €6.51 million and €5.71

  15. Dense Medium Machine Processing Method for Palm Kernel/ Shell ...

    African Journals Online (AJOL)

    ADOWIE PERE

    Cracked palm kernel is a mixture of kernels, broken shells, dusts and other impurities. In ... machine processing method using dense medium, a separator, a shell collector and a kernel .... efficiency, ease of maintenance and uniformity of.

  16. Notes on the gamma kernel

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole E.

    The density function of the gamma distribution is used as shift kernel in Brownian semistationary processes modelling the timewise behaviour of the velocity in turbulent regimes. This report presents exact and asymptotic properties of the second order structure function under such a model......, and relates these to results of von Karmann and Horwath. But first it is shown that the gamma kernel is interpretable as a Green’s function....

  17. NParCov3: A SAS/IML Macro for Nonparametric Randomization-Based Analysis of Covariance

    Directory of Open Access Journals (Sweden)

    Richard C. Zink

    2012-07-01

    Full Text Available Analysis of covariance serves two important purposes in a randomized clinical trial. First, there is a reduction of variance for the treatment effect which provides more powerful statistical tests and more precise confidence intervals. Second, it provides estimates of the treatment effect which are adjusted for random imbalances of covariates between the treatment groups. The nonparametric analysis of covariance method of Koch, Tangen, Jung, and Amara (1998) defines a very general methodology using weighted least-squares to generate covariate-adjusted treatment effects with minimal assumptions. This methodology is general in its applicability to a variety of outcomes, whether continuous, binary, ordinal, incidence density or time-to-event. Further, its use has been illustrated in many clinical trial settings, such as multi-center, dose-response and non-inferiority trials. NParCov3 is a SAS/IML macro written to conduct the nonparametric randomization-based covariance analyses of Koch et al. (1998). The software can analyze a variety of outcomes and can account for stratification. Data from multiple clinical trials will be used for illustration.

  18. Calculation of the thermal neutron scattering kernel using the synthetic model. Pt. 2. Zero-order energy transfer kernel

    International Nuclear Information System (INIS)

    Drozdowicz, K.

    1995-01-01

    A comprehensive unified description of the application of Granada's Synthetic Model to slow-neutron scattering by molecular systems is continued. Detailed formulae for the zero-order energy transfer kernel are presented, based on the general formalism of the model. An explicit analytical formula for the total scattering cross section as a function of the incident neutron energy is also obtained. Expressions of the free gas model for the zero-order scattering kernel and for the total scattering kernel are considered as a sub-case of the Synthetic Model. (author). 10 refs

  19. Convergence of barycentric coordinates to barycentric kernels

    KAUST Repository

    Kosinka, Jiří

    2016-02-12

    We investigate the close correspondence between barycentric coordinates and barycentric kernels from the point of view of the limit process when finer and finer polygons converge to a smooth convex domain. We show that any barycentric kernel is the limit of a set of barycentric coordinates and prove that the convergence rate is quadratic. Our convergence analysis extends naturally to barycentric interpolants and mappings induced by barycentric coordinates and kernels. We verify our theoretical convergence results numerically on several examples.
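
    For readers unfamiliar with barycentric coordinates on polygons, the sketch below computes one well-known family, Wachspress coordinates, for a point strictly inside a convex polygon. The abstract does not say which coordinates the paper analyzes, so this choice is an assumption, and the limit process toward a smooth domain is not shown.

```python
def signed_area(a, b, c) -> float:
    """Signed area of triangle (a, b, c); positive when counter-clockwise."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def wachspress(x, polygon):
    """Wachspress coordinates of x inside a convex counter-clockwise polygon."""
    n = len(polygon)
    weights = []
    for i in range(n):
        prev, cur, nxt = polygon[i - 1], polygon[i], polygon[(i + 1) % n]
        weights.append(
            signed_area(prev, cur, nxt)
            / (signed_area(x, prev, cur) * signed_area(x, cur, nxt))
        )
    total = sum(weights)
    return [w / total for w in weights]


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
point = (0.25, 0.5)
coords = wachspress(point, square)

print(coords)  # non-negative and summing to 1
# Linear reproduction: the weighted vertex average recovers the point.
print(sum(l * v[0] for l, v in zip(coords, square)),
      sum(l * v[1] for l, v in zip(coords, square)))
```

    On the unit square these coordinates coincide with the bilinear weights, which gives a quick sanity check on the implementation.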

  20. Convergence of barycentric coordinates to barycentric kernels

    KAUST Repository

    Kosinka, Jiří ; Barton, Michael

    2016-01-01

    We investigate the close correspondence between barycentric coordinates and barycentric kernels from the point of view of the limit process when finer and finer polygons converge to a smooth convex domain. We show that any barycentric kernel is the limit of a set of barycentric coordinates and prove that the convergence rate is quadratic. Our convergence analysis extends naturally to barycentric interpolants and mappings induced by barycentric coordinates and kernels. We verify our theoretical convergence results numerically on several examples.